This ML classification project explores housing blight in Detroit, MI in an attempt to better understand evolving real estate trends in the city. The project consists of 3 parts:
- Exploratory Data Analysis on housing-related data from the Detroit Open Data Portal website
- Training and comparing models built with this data to predict a home's likelihood of having received at least one blight citation in Detroit
- A post-hoc analysis of the final model to better understand the problem and develop next steps
The charts below show a couple of findings that confirm two of my hypotheses:
The findings are further complicated when comparing homeowners with folks who put their homes up for rent. Future iterations of this project will include a more detailed breakdown of real estate trends across demographics.
The final Decision Tree model does well at binary classification, with the following metrics:
- Accuracy : 94.69%
- F1-Score : 0.60
- Precision Score : 0.589
- Recall Score : 0.621
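Accuracy sits well above the F1, precision, and recall scores here. That gap is what one would expect if homes with blight citations are a minority class, which is an assumption suggested by the sampling and null-accuracy steps planned later in this README. The sketch below shows how the four metrics relate to confusion-matrix counts. The counts are made up for illustration and are not the project's actual confusion matrix; `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 for the positive class
    from raw confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (NOT taken from this project):
# 100 positive parcels out of 1000, most negatives classified correctly.
example = classification_metrics(tp=60, fp=40, fn=40, tn=860)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With these toy counts the model is right on 92% of parcels while precision, recall, and F1 all stay at 0.6, because the large number of true negatives dominates accuracy but does not enter the other three metrics.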
Note: The majority of top indicators were features engineered from the raw data:
However, feature importance evaluation shows that the total due per parcel (the most important feature at 0.534 importance) is too highly correlated with the target variable (total tickets) to provide an independent prediction. As such, the next iteration of this project will apply methods that mitigate this dependence.
These methods include:
- Up and Down Sampling
- Various other modeling techniques such as grid search
- Look at a baseline model and null accuracy
- Deploy grid and random search to find optimal tree depth
- Observe evaluation metrics (including confusion matrix) for each class
- Try a more thorough EDA of each feature's distribution against the target and improve upon current visualizations
- Scale and normalize each variable more thoroughly
- Eventually try a multi-class target instead of binary, with categories of 0 tickets // 1-7 tickets // 8+ tickets
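A few of the steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: the function names (`null_accuracy`, `upsample_minority`, `ticket_bucket`) and the toy ticket counts are hypothetical, and the upsampling shown is simple random duplication of minority rows.

```python
import random
from collections import Counter


def null_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of the smaller classes until every class
    is as large as the biggest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def ticket_bucket(n_tickets):
    """Map a ticket count to the proposed classes: 0, 1-7, 8+."""
    if n_tickets == 0:
        return "0"
    if n_tickets <= 7:
        return "1-7"
    return "8+"


# Toy ticket counts per parcel (hypothetical, not project data).
ticket_counts = [0, 0, 0, 0, 0, 0, 0, 0, 2, 9]
binary_target = [int(n > 0) for n in ticket_counts]

print("null accuracy:", null_accuracy(binary_target))

balanced_rows, balanced_labels = upsample_minority(ticket_counts, binary_target)
print("class sizes after upsampling:", Counter(balanced_labels))

print("multi-class target:", [ticket_bucket(n) for n in ticket_counts])
```

Null accuracy gives the bar a model has to clear before its accuracy means anything. Resampling of this kind is normally applied to the training split only, so duplicated rows do not leak into the evaluation data.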
Full slides for this project's presentation: https://bit.ly/2MCz2Ob. All data is freely available from Detroit's Open Data Portal: https://data.detroitmi.gov/