This is the first project for the Data Mining Techniques course, developed by Ritsogianni Argyro and Triantafyllou Leonidas in the Spring semester of 2018. In this project we worked through several Data Mining steps: data collection, pre-processing and transformation. We also implemented classification with several classifiers (Random Forests, Naive Bayes, Support Vector Machines, and our own K-Nearest Neighbors implementation using Majority Voting) and evaluated them with 10-fold Cross Validation, measuring Precision, Recall, F-Measure and Accuracy. We used the tools and libraries suggested by the instructors: scikit-learn, pandas and gensim. The project is written in Python.
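As a rough illustration of the majority-voting idea behind the K-Nearest Neighbors classifier, here is a minimal sketch. It is not the project's actual code: the function names (`euclidean`, `knn_predict`), the choice of Euclidean distance, the two-dimensional toy vectors and the category labels are all assumptions made for this example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Sort the training points by distance to the query and keep the k closest.
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    # Count the labels of those neighbours; the most frequent label wins.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters with made-up labels.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["Politics", "Politics", "Politics", "Football", "Football", "Football"]

print(knn_predict(train_X, train_y, (0.15, 0.15), k=3))
print(knn_predict(train_X, train_y, (0.95, 1.0), k=3))
```

In a text-classification setting the vectors would typically come from a transformation step (for example a bag-of-words or TF-IDF representation) rather than hand-written coordinates. Ties between labels are resolved here by whichever tied label appears first among the nearest neighbours, which is a simplification.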
- WordCloud implementation
- Classification using Random Forests, Naive Bayes, Support Vector Machines and K-Nearest Neighbors (our own implementation using Majority Voting)
- 10-fold Cross Validation measuring Precision, Recall, F-Measure and Accuracy
- Testing to find the best classifier for our test set
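The 10-fold Cross Validation step listed above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the project's code: the helper names (`k_fold_indices`, `macro_scores`, `cross_validate`, `nearest_neighbour`) are hypothetical, the folds are contiguous and unshuffled, the per-class metrics are macro-averaged, and the data is a made-up one-dimensional toy set. In practice scikit-learn provides ready-made utilities for all of this.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F-measure."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f_measures = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(2 * precision * recall / total if total else 0.0)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
        "f_measure": sum(f_measures) / len(labels),
    }


def cross_validate(predict_fn, X, y, k=10):
    """Average the four metrics over k folds; each fold is the test set once."""
    totals = {"accuracy": 0.0, "precision": 0.0, "recall": 0.0, "f_measure": 0.0}
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        y_pred = [predict_fn(train_X, train_y, X[i]) for i in test_idx]
        fold_scores = macro_scores([y[i] for i in test_idx], y_pred)
        for name in totals:
            totals[name] += fold_scores[name]
    return {name: value / k for name, value in totals.items()}


def nearest_neighbour(train_X, train_y, query):
    """Stand-in classifier for the demo: label of the single closest point."""
    best = min(range(len(train_X)), key=lambda i: abs(train_X[i] - query))
    return train_y[best]


# Hypothetical toy data: class "A" near 0 and class "B" near 10, interleaved
# so that every fold contains one sample of each class.
X = [i * 0.1 if i % 2 == 0 else 10 + i * 0.1 for i in range(20)]
y = ["A" if i % 2 == 0 else "B" for i in range(20)]

print(cross_validate(nearest_neighbour, X, y, k=10))
```

Running the same `cross_validate` call with each classifier in turn and comparing the averaged scores is one straightforward way to decide which classifier to use on the test set.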
- Ritsogianni Argyro: [email protected]
- Triantafyllou Leonidas: [email protected]