Apply 7 common machine learning algorithms to detect fraud while dealing with an imbalanced dataset

ireneliu521/Credit-Card-Fraud_J2D_Project_Python

Credit Card Fraud Detection

In this project, we analyze a Kaggle dataset (www.kaggle.com/mlg-ulb/creditcardfraud/data) that contains 492 frauds out of 284,807 transactions, so frauds make up only about 0.17% of the records. The transactions were made by European credit card holders in September 2013. Our objective is to train machine learning models that predict fraud accurately despite this severe class imbalance. The dataset has 28 variables that are the result of a principal component analysis (PCA) transformation, and their original meaning is not disclosed, so we drop the variables that have similar distributions. Next, we address the imbalance: we use the synthetic minority over-sampling technique (SMOTE) to resample the dataset so that the numbers of fraudulent and normal transactions are equal. Finally, we compare the machine learning methods; XGBoost returned the highest AUC score.
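The resampling and scoring steps described above can be illustrated with a minimal sketch. It is not the project's code: it uses only the Python standard library and made-up two-feature toy data, and the helper names `smote` and `auc` are hypothetical. In practice a library implementation such as `SMOTE` from imbalanced-learn would normally be used instead. The sketch shows the core idea of SMOTE (each synthetic point is interpolated between a minority sample and one of its nearest minority neighbors) and a rank-based way to compute AUC.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Pick one of the k nearest neighbors and a random position on the
        # line segment between base and that neighbor.
        neighbor = rng.choice(others[:k])
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A tie between a positive and a negative score counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (an assumption for illustration, not the Kaggle dataset):
# 200 "normal" points and 10 "fraud" points in two dimensions.
rng = random.Random(42)
normal = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
fraud = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(10)]

# Oversample the minority class until both classes have the same size.
fraud_balanced = fraud + smote(fraud, len(normal) - len(fraud))
print(len(normal), len(fraud_balanced))

# Score a tiny hand-made set of predictions.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A common design choice is to apply the oversampling to the training split only, so that the AUC is measured on transactions with the original class balance.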
