This project uses sentiment analysis, TF-IDF (term frequency-inverse document frequency), and Latent Dirichlet Allocation (LDA) topic modeling to evaluate political bias in 409 CNN Politics news articles.
I conduct sentiment analysis with three lexicons (AFINN sentiment scores, the Bing lexicon, and the NRC lexicon) to compare sentiment in Democrat-related and Republican-related news texts.
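
As an illustration of lexicon-based scoring, here is a minimal sketch in the style of AFINN, where each word carries an integer score and a text's sentiment is the sum of its word scores. The lexicon entries, the function names `score_text` and `mean_score_by_group`, and the example sentences are all assumptions made for this sketch; they are not the real AFINN list or the project's data.

```python
from collections import defaultdict

# Hypothetical AFINN-style lexicon: word -> integer score.
# Toy entries for illustration only, not the real AFINN list.
TOY_LEXICON = {"good": 3, "win": 4, "support": 2, "bad": -3, "attack": -1, "fail": -2}

def score_text(text, lexicon):
    """Sum the lexicon scores of the words in text; unknown words score 0."""
    tokens = text.lower().split()
    return sum(lexicon.get(tok.strip(".,!?;:\"'"), 0) for tok in tokens)

def mean_score_by_group(docs, lexicon):
    """docs: iterable of (group_label, text). Returns {group: mean score}."""
    scores = defaultdict(list)
    for group, text in docs:
        scores[group].append(score_text(text, lexicon))
    return {g: sum(s) / len(s) for g, s in scores.items()}

# Made-up sentences, not drawn from the CNN corpus.
docs = [
    ("Democrat", "Voters support the bill. A good win."),
    ("Democrat", "Critics attack the plan."),
    ("Republican", "The vote was a bad fail."),
]
means = mean_score_by_group(docs, TOY_LEXICON)
print(means)
```

The Bing and NRC lexicons label words with categories rather than numeric scores, so a comparable sketch for them would count words per category instead of summing scores.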
I use TF-IDF to identify the terms most distinctive of Democrat-related texts and of Republican-related texts.
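
To show why TF-IDF surfaces distinctive terms, here is a minimal sketch that treats each group of texts as one document. It assumes the common formulation tf = count / total tokens and idf = ln(number of documents / number of documents containing the term); the project may use a different variant. The function name `tf_idf` and the token lists are hypothetical.

```python
import math
from collections import Counter

def tf_idf(groups):
    """groups: {label: list of tokens}. Returns {label: {term: tf-idf}}."""
    counts = {label: Counter(tokens) for label, tokens in groups.items()}
    n_docs = len(counts)
    # Document frequency: number of groups in which each term appears.
    df = Counter(term for c in counts.values() for term in c)
    result = {}
    for label, c in counts.items():
        total = sum(c.values())
        result[label] = {
            term: (n / total) * math.log(n_docs / df[term])
            for term, n in c.items()
        }
    return result

# Made-up token lists, not drawn from the CNN corpus.
groups = {
    "Democrat": "senate vote healthcare healthcare bill".split(),
    "Republican": "senate vote tax tax bill".split(),
}
scores = tf_idf(groups)
top_dem = max(scores["Democrat"], key=scores["Democrat"].get)
print(top_dem, scores["Democrat"][top_dem])
```

With only two groups, any term that appears in both gets idf = ln(2/2) = 0, so only terms exclusive to one group receive a nonzero score.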
I use LDA topic modeling to estimate the topics in the corpus. The model identifies two topics, and I present the words associated with each.
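
For readers unfamiliar with how LDA assigns words to topics, the following is a minimal sketch of a collapsed Gibbs sampler for a two-topic model. It is not the project's implementation, which may rely on a library with a different inference method. The function names `lda_gibbs` and `top_words`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy documents are all assumptions made for this sketch.

```python
import random

def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA. docs: list of token lists.
    Returns (topic_word_counts, doc_topic_counts)."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(n_topics)]
    topic_total = [0] * n_topics
    doc_topic = [[0] * n_topics for _ in docs]
    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                # Conditional p(z = k | all other assignments), unnormalized.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + V * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1
    return topic_word, doc_topic

def top_words(topic_word, n=3):
    """Return the n highest-count words for each topic."""
    return [sorted(tw, key=tw.get, reverse=True)[:n] for tw in topic_word]

# Made-up documents, not drawn from the CNN corpus.
docs = [
    "tax tax budget senate".split(),
    "budget tax senate vote".split(),
    "court judge ruling court".split(),
    "judge court ruling appeal".split(),
]
topic_word, doc_topic = lda_gibbs(docs, n_topics=2)
print(top_words(topic_word))
```

The sampler is stochastic, so the words listed per topic depend on the seed and on the hyperparameters; the number of topics is an input to the model rather than something it discovers.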