The data was taken from Dr. Hans Hofmann, Institut für Statistik und Ökonometrie, Universität Hamburg. This is a transformed version of the data. Real-life data is messy and unstructured, with missing or unrecognised values, inconsistent date formats, regional value conventions, etc. In fact, data pre-processing for machine learning takes 60-70% of the effort in analytics work. This dataset classifies people, described by a set of attributes, as good or bad credit risks in terms of default.
Variables are binary, numerical and categorical.
Data split: 90% training dataset, 10% testing dataset
Software used: R and relevant libraries
Predictive models: Decision Tree and Random Forest classification
The analysis covered all stages of advanced analytics (Descriptive, Diagnostic, Predictive and Prescriptive).
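
The 90/10 split and the decision tree step described above can be sketched in R roughly as follows. This is a minimal illustration, not the original analysis code: the data frame `credit`, its column names, and its values are invented stand-ins for the real dataset, and the seed and row count are arbitrary. It assumes the `rpart` package, which normally ships with R.

```r
library(rpart)

# Toy stand-in for the credit data. Column names and values are
# hypothetical; the real dataset has its own attributes.
set.seed(42)
n <- 200
credit <- data.frame(
  duration = sample(4:72, n, replace = TRUE),        # numerical
  amount   = round(runif(n, 250, 18000)),            # numerical
  foreign  = sample(c(0L, 1L), n, replace = TRUE),   # binary
  purpose  = factor(sample(c("car", "furniture", "education"),
                           n, replace = TRUE)),      # categorical
  risk     = factor(sample(c("good", "bad"), n, replace = TRUE,
                           prob = c(0.7, 0.3)))      # target class
)

# 90% training / 10% testing split by random row index
train_idx <- sample(seq_len(n), size = floor(0.9 * n))
train <- credit[train_idx, ]
test  <- credit[-train_idx, ]

# Classification tree on the training rows
tree <- rpart(risk ~ ., data = train, method = "class")

# Predicted class for each held-out row, plus a confusion matrix
pred     <- predict(tree, newdata = test, type = "class")
conf_mat <- table(predicted = pred, actual = test$risk)
accuracy <- mean(pred == test$risk)

# With the randomForest package installed, a forest can be fit the same way:
# rf <- randomForest::randomForest(risk ~ ., data = train)
```

Because the toy labels are random, the resulting accuracy says nothing about the real data; the sketch only shows the shape of the workflow (split, fit, predict, tabulate).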