Home
Shiva Sitaraman edited this page Dec 9, 2017 · 5 revisions
Welcome to the dBias wiki!
This project aims to expose potential biases in human-centric datasets.
A machine learning system is only as good as the data it is trained on. If a dataset carries an inherent bias, for example along a sensitive attribute such as sex or race, a system trained on it can pick up that bias.
The dBias framework visualizes a dataset to expose weaknesses in its distribution that could lead a system to learn bias unknowingly.
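As a rough illustration of the kind of distribution check described above, the following minimal sketch computes how a sensitive attribute is distributed in a dataset and how often each group receives the positive label, then prints both as text bar charts. It is not the dBias implementation: the toy records, the column names `sex` and `income`, and the helper functions `group_shares`, `positive_rate_by_group`, and `text_bar_chart` are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical toy records; the columns and values are invented for illustration
# and are not taken from any of the candidate datasets.
records = [
    {"sex": "Male", "income": ">50K"},
    {"sex": "Male", "income": ">50K"},
    {"sex": "Male", "income": ">50K"},
    {"sex": "Male", "income": "<=50K"},
    {"sex": "Male", "income": "<=50K"},
    {"sex": "Male", "income": "<=50K"},
    {"sex": "Female", "income": "<=50K"},
    {"sex": "Female", "income": "<=50K"},
]


def group_shares(rows, attribute):
    """Return the fraction of rows that fall into each value of `attribute`."""
    counts = Counter(row[attribute] for row in rows)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def positive_rate_by_group(rows, attribute, label, positive):
    """Return, per group, the fraction of rows whose `label` equals `positive`."""
    totals = Counter()
    positives = Counter()
    for row in rows:
        totals[row[attribute]] += 1
        if row[label] == positive:
            positives[row[attribute]] += 1
    return {group: positives[group] / totals[group] for group in totals}


def text_bar_chart(values, width=40):
    """Render a {name: fraction} mapping as a simple text bar chart."""
    lines = []
    for name, value in sorted(values.items()):
        bar = "#" * round(value * width)
        lines.append(f"{name:<8} {bar} {value:.2f}")
    return "\n".join(lines)


shares = group_shares(records, "sex")
rates = positive_rate_by_group(records, "sex", "income", ">50K")

print("Share of records per group")
print(text_bar_chart(shares))
print("Rate of '>50K' labels per group")
print(text_bar_chart(rates))
```

A large gap between groups in either chart does not prove that a trained model will be biased, but it flags a skew in the data that deserves a closer look before training.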
Some articles for further reading:
- https://enterprisersproject.com/article/2016/9/reduce-biases-machine-learning-start-openly-discussing-problem
- https://www.mckinsey.com/business-functions/risk/our-insights/controlling-machine-learning-algorithms-and-their-biases
- http://www.cs.virginia.edu/~vicente/files/bias.pdf
- https://www.graphcore.ai/posts/removing-bias-from-machine-learning
Some candidate datasets:
- Adult Dataset (UCI-ML) - http://archive.ics.uci.edu/ml/machine-learning-databases/adult/
- Titanic Survival Dataset - https://www.kaggle.com/c/titanic/data
- Bank Marketing - http://archive.ics.uci.edu/ml/datasets/Bank+Marketing
- H-1B Visa dataset - https://www.kaggle.com/nsharan/h-1b-visa
- FIFA 18 dataset - https://www.kaggle.com/skalskip/fifa-18-data-exploration-and-d3-js-visualization/