For Kaggle's first Playground Series competition of 2024, the task was binary classification: predicting whether a bank's customers were likely to leave. These competitions typically involve tree-based and ensemble techniques tuned to maximise an evaluation metric. However, given the competition's credit-risk-inspired theme, I thought I would develop a scorecard instead, as might be done in a financial institution.
https://www.kaggle.com/competitions/playground-series-s4e1
I have divided my original notebook into the following topics:
- Exploratory Data Analysis: A few visualisations showing how the target variable is distributed across the other fields.
- Model Development and Validation: A detailed walkthrough of developing a scorecard based on logistic regression (including weight of evidence transformations and score calibration).
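
To give a flavour of the two scorecard ingredients named above, here is a minimal sketch in plain Python. It is not the notebook's code. It assumes the common convention WoE = ln(share of non-events / share of events) per bin, with "event" meaning the customer left, and the usual "points to double the odds" scaling. The function names, the toy data and the scaling values (600 points at 50:1 odds, 20 points to double the odds) are all illustrative assumptions.

```python
import math
from collections import Counter


def weight_of_evidence(bins, targets):
    """Return {bin: WoE}, where WoE = ln(share of non-events / share of events).

    Assumes targets are 0/1 with 1 = event (customer left), and that every
    bin contains at least one event and one non-event.
    """
    events, non_events = Counter(), Counter()
    for b, t in zip(bins, targets):
        if t == 1:
            events[b] += 1
        else:
            non_events[b] += 1

    total_events = sum(events.values())
    total_non_events = sum(non_events.values())

    woe = {}
    for b in set(bins):
        if events[b] == 0 or non_events[b] == 0:
            raise ValueError(f"bin {b!r} needs both classes; merge it with a neighbour")
        share_non_events = non_events[b] / total_non_events
        share_events = events[b] / total_events
        woe[b] = math.log(share_non_events / share_events)
    return woe


def scaling_parameters(base_score=600, base_odds=50, pdo=20):
    """Return (factor, offset) so that score = offset + factor * ln(odds).

    base_score is awarded at base_odds (non-events : events), and the score
    rises by pdo points each time the odds double. Values are illustrative.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset


def score_from_log_odds(log_odds, factor, offset):
    """Map log-odds of staying (e.g. from a logistic regression) to points."""
    return offset + factor * log_odds


# Toy example with a hypothetical two-bin age feature (1 = customer left).
toy_bins = ["young"] * 4 + ["old"] * 4
toy_targets = [1, 0, 0, 0, 1, 1, 1, 0]
toy_woe = weight_of_evidence(toy_bins, toy_targets)

factor, offset = scaling_parameters()
score_at_base_odds = score_from_log_odds(math.log(50), factor, offset)
```

In a full scorecard, the WoE values replace the raw binned features as inputs to the logistic regression, and the fitted log-odds are then passed through the scaling step to produce the final score.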