Merge pull request #20 from UBC-MDS/readme_update
LGTM
adrianl726 authored Jan 11, 2025
2 parents f6f365d + 8483d41 commit 6bcce13
Showing 2 changed files with 22 additions and 3 deletions.
24 changes: 21 additions & 3 deletions README.md
@@ -1,6 +1,9 @@
# textanalyzer

This package includes powerful tools to perform natural language processing on English texts.
`TextAnalyzer` includes powerful tools to perform natural language processing on English texts.

`TextAnalyzer` is a Python package designed for performing comprehensive Natural Language Processing (NLP) tasks on English texts.
This package provides tools for sentiment analysis, keyword extraction, topic modeling, and the detection and visualization of language patterns, making it ideal for text mining and content analysis projects.

## Installation

@@ -10,15 +13,30 @@ $ pip install textanalyzer

## Usage


- `analyze_sentiment(message, model="default")`: This function analyzes the sentiment of a given message and prints an alert message if it is highly negative.
- `topic_modeling()`: This function extracts topics from a text or a list of texts using Non-negative Matrix Factorization (NMF).
- `extract_keywords(messages, method="tfidf", num_keywords=5)`: This function extracts the top keywords from a list of messages using a specified method such as TF-IDF or RAKE.
- `detect_language_patterns(messages, method="language", n=2, top_n=5)`: This function detects language patterns in a list of messages, such as the languages used, common n-grams, or character usage patterns.
- `visualize_language_patterns(patterns, method="language")`: This function visualizes the detected language patterns using bar charts for language frequency, n-grams, or character patterns.
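To make the TF-IDF idea behind `extract_keywords` concrete, here is a minimal standard-library sketch. It is not the package's implementation: the helper names `tokenize` and `tfidf_keywords` are hypothetical, and the weighting used (relative term frequency times the log of inverse document frequency) is one common variant chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_keywords(messages, num_keywords=5):
    """Return the top-scoring terms for each message using plain TF-IDF.

    Hypothetical helper for illustration; not textanalyzer's implementation.
    """
    docs = [tokenize(m) for m in messages]
    n_docs = len(docs)
    # Document frequency: number of messages that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score, then alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:num_keywords])
    return results


if __name__ == "__main__":
    print(tfidf_keywords(["the cat sat", "the dog barked"], num_keywords=2))
```

Terms that occur in every message get a score of zero, which is why a shared word such as "the" ranks below words unique to one message.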

## Ecosystem Fit

`TextAnalyzer` integrates into the Python NLP ecosystem by offering a simple yet powerful toolkit for analyzing text data. While other Python libraries like [NLTK](https://www.nltk.org/) and [spaCy](https://spacy.io/) provide extensive NLP functionalities, `TextAnalyzer` focuses on making sentiment analysis, keyword extraction, and language pattern visualization more accessible and user-friendly.

For keyword extraction, packages like [YAKE](https://github.com/LIAAD/yake) and [RAKE-NLTK](https://pypi.org/project/rake-nltk/) provide similar functionality. However, `TextAnalyzer` combines these tasks into a unified and streamlined workflow.


## Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms. (TODO: add link)

## Dependencies

- [`TextBlob`](https://textblob.readthedocs.io/): For sentiment analysis.
- [`langdetect`](https://pypi.org/project/langdetect/): For language detection.
- [`scikit-learn`](https://scikit-learn.org/): For keyword extraction and n-gram analysis (`CountVectorizer`).
- [`collections.Counter`](https://docs.python.org/3/library/collections.html): For frequency analysis.
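
As a rough illustration of the frequency analysis that `collections.Counter` supports, the sketch below counts word n-grams across a list of messages. The helper name `top_ngrams` is hypothetical, its `n` and `top_n` parameters only mirror those of `detect_language_patterns`, and it splits on whitespace rather than using `CountVectorizer`, so it should not be read as the package's own code.

```python
from collections import Counter


def top_ngrams(messages, n=2, top_n=5):
    """Count word n-grams across messages and return the most common ones.

    Hypothetical helper for illustration; not textanalyzer's implementation.
    """
    counts = Counter()
    for message in messages:
        words = message.lower().split()
        # Slide a window of n words over the message.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    print(top_ngrams(["good morning team", "good morning all"], n=2, top_n=3))
```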


## License

1 change: 1 addition & 0 deletions src/textanalyzer/visualize_language_patterns.py
@@ -13,4 +13,5 @@ def visualize_language_patterns(patterns, method="language"):
- "ngrams": Displays a bar chart of top n-grams.
- "char_patterns": Displays a bar chart of common characters.
"""

