Merge pull request #20 from UBC-MDS/readme_update

LGTM
UBC-MDS · Jan 11, 2025 · commit 6bcce13
2 parents f6f365d + 8483d41

Showing 2 changed files with 22 additions and 3 deletions.
diff --git a/README.md b/README.md
@@ -1,6 +1,9 @@
 # textanalyzer
 
-This package includes powerful tools to perform natural language processing on English texts.
+`TextAnalyzer` includes powerful tools to perform natural language processing on English texts.
+
+`TextAnalyzer` is a Python package designed for performing comprehensive Natural Language Processing (NLP) tasks on English texts. 
+This package provides tools for sentiment analysis, keyword extraction, topic modeling, and the detection and visualization of language patterns, making it ideal for text mining and content analysis projects.
 
 ## Installation
 
@@ -10,15 +13,30 @@ $ pip install textanalyzer
 
 ## Usage
 
-
 - `analyze_sentiment(message, model="default")`: This function analyzes the sentiment of a given message and prints alert message if it's highly negative. 
 - `topic_modeling()`: This function performs topic extraction from a text or a list of texts by using Nonnegative Matrix Factorization. 
 - `extract_keywords(messages, method="tfidf", num_keywords=5)`: This function extracts the top keywords from a list of messages using specified methods like TF-IDF or RAKE.
+- `detect_language_patterns(messages, method="language", n=2, top_n=5)`: This function detects language patterns such as detected languages, common n-grams, or character usage patterns from a list of messages.
+- `visualize_language_patterns(patterns, method="language")`: This function visualizes the detected language patterns using bar charts for language frequency, n-grams, or character patterns.
+
+## Ecosystem Fit
+
+`TextAnalyzer` integrates into the Python NLP ecosystem by offering a simple yet powerful toolkit for analyzing text data. While other Python libraries like [NLTK](https://www.nltk.org/) and [spaCy](https://spacy.io/) provide extensive NLP functionalities, `TextAnalyzer` focuses on making sentiment analysis, keyword extraction, and language pattern visualization more accessible and user-friendly. 
+
+For keyword extraction, packages like [YAKE](https://github.com/LIAAD/yake) and [RAKE-NLTK](https://pypi.org/project/rake-nltk/) provide similar functionality. However, TextAnalyzer combines these tasks into a unified and streamlined workflow.
 
 
 ## Contributing
 
-Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
+Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.(TO DO - add link)
+
+## Dependencies
+
+- [`TextBlob`](https://textblob.readthedocs.io/): For sentiment analysis.
+- [`langdetect`](https://pypi.org/project/langdetect/): For language detection.
+- [`scikit-learn`](https://scikit-learn.org/): For keyword extraction and n-gram analysis (`CountVectorizer`).
+- [`collections.Counter`](https://docs.python.org/3/library/collections.html): For frequency analysis.
+
 
 ## License
 

diff --git a/src/textanalyzer/visualize_language_patterns.py b/src/textanalyzer/visualize_language_patterns.py
@@ -13,4 +13,5 @@ def visualize_language_patterns(patterns, method="language"):
         - "ngrams": Displays a bar chart of top n-grams.
         - "char_patterns": Displays a bar chart of common characters.
     """
+
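
Note on the README changes: the new `detect_language_patterns(messages, method="language", n=2, top_n=5)` entry mentions common n-grams, and the Dependencies section lists `collections.Counter` for frequency analysis. The sketch below is only an illustration of how word n-grams might be counted with `Counter`. It is not the package's implementation; the helper name `top_ngrams` and the whitespace tokenization are assumptions made for this example.

```python
from collections import Counter


def top_ngrams(messages, n=2, top_n=5):
    """Hypothetical helper: return the top_n most common word n-grams.

    Assumes simple lowercase, whitespace-based tokenization; the real
    package may tokenize differently (e.g. via CountVectorizer).
    """
    counts = Counter()
    for message in messages:
        tokens = message.lower().split()
        # Slide a window of n tokens across the message; messages shorter
        # than n tokens contribute nothing.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common(top_n)


messages = ["the cat sat", "the cat ran", "a dog sat"]
print(top_ngrams(messages, n=2, top_n=1))
```

The `n` and `top_n` parameters mirror the names in the README signature; the resulting list of `(ngram, count)` pairs is the kind of structure a bar chart of top n-grams could be drawn from.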