Predicting patient drug review sentiment and identifying ambiguous reviews

pratt-datar/Sentiment-ambiguity

Sentiment-ambiguity

The main objective of the project was to predict customer sentiment from drug reviews and to identify ambiguous reviews, to better serve drug manufacturers and new customers. The dataset came from the Kaggle 2018 University Club Hackathon and consisted of customer-provided ratings for a drug together with the written review.

To balance vocabulary size against model accuracy, we used a custom stop-word list and tuned various parameters of the vectorization process. We then compared several supervised machine learning classifiers, including SVM and Naive Bayes models, on the sentiment prediction task. Based on Precision, Recall, F-score, Accuracy, and the number of extreme misclassification errors, we concluded that LinearSVC classifiers performed better than Naive Bayes models for predicting sentiment on this dataset.

We hypothesized that the ambiguity of a review is directly proportional to the number of conjunctions it contains. Therefore, to identify ambiguous reviews, we flagged reviews that LinearSVC misclassified and that also contained a high number of conjunctions.
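The ambiguity heuristic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the conjunction list, the `min_conjunctions` threshold, the label names, and the function names `count_conjunctions` and `flag_ambiguous` are all hypothetical, and the predicted labels are assumed to come from an already trained classifier such as LinearSVC.

```python
import re

# Assumed conjunction list; the list used in the project is not given.
CONJUNCTIONS = {
    "and", "but", "or", "nor", "yet", "so",
    "although", "though", "because", "while", "whereas", "unless",
}


def count_conjunctions(review):
    """Count how many tokens in the review are in the conjunction list."""
    tokens = re.findall(r"[a-z']+", review.lower())
    return sum(1 for token in tokens if token in CONJUNCTIONS)


def flag_ambiguous(reviews, true_labels, predicted_labels, min_conjunctions=3):
    """Return (review, conjunction_count) pairs for reviews that were
    misclassified AND contain at least `min_conjunctions` conjunctions.

    `min_conjunctions` is a hypothetical threshold chosen for illustration.
    """
    flagged = []
    for review, truth, predicted in zip(reviews, true_labels, predicted_labels):
        n_conjunctions = count_conjunctions(review)
        if truth != predicted and n_conjunctions >= min_conjunctions:
            flagged.append((review, n_conjunctions))
    return flagged


# Toy data: labels and predictions are made up for this example.
reviews = [
    "It helped my pain but gave me headaches, and I felt dizzy, "
    "yet I kept taking it because nothing else worked.",
    "Worked great with no side effects.",
    "Terrible drug and I stopped after a day.",
]
true_labels = ["negative", "positive", "negative"]
predicted_labels = ["positive", "positive", "positive"]

ambiguous = flag_ambiguous(reviews, true_labels, predicted_labels)
for review, n_conjunctions in ambiguous:
    print(n_conjunctions, review)
```

In this toy run only the first review is flagged: it is misclassified and contains four conjunctions. The third review is also misclassified, but its single conjunction falls below the assumed threshold, so it is treated as an ordinary classifier error and not as an ambiguous review.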
