In this Houston Data Science meetup we will cover how to apply machine learning algorithms with the Python library Scikit-Learn.
Follow the tutorial in the Intro to Python repository to install the Anaconda distribution with Python 3.
- What is machine learning?
- Types of machine learning
- A typical machine learning workflow
- Overview of Scikit-Learn
- Scikit-Learn gotchas
- The Ames Housing dataset from the Kaggle competition
- Remedying missing values
- Categorical vs. continuous features
- The importance of a dummy baseline model
- Building more complex models
- Cross-validation
- Parameter tuning and grid search
All work for this tutorial will take place within a Jupyter Notebook.
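As a rough preview of how several of the topics above fit together, here is a minimal sketch of one possible workflow: fill in missing values, encode a categorical feature, score a dummy baseline, cross-validate a more complex model, and tune it with grid search. It is not the meetup's actual notebook. The data is a small synthetic table standing in for the Ames Housing dataset, and the column names (`area`, `quality`), the choice of `Ridge`, and the `alpha` grid are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.dummy import DummyRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for a housing table (assumption: NOT the Ames data).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "area": rng.normal(1500, 300, n),                  # continuous feature
    "quality": rng.choice(["low", "mid", "high"], n),  # categorical feature
})
bonus = df["quality"].map({"low": 0, "mid": 20_000, "high": 50_000})
price = 100 * df["area"] + bonus + rng.normal(0, 5_000, n)

# Knock out some values so there is something to impute.
df.loc[rng.choice(n, 20, replace=False), "area"] = np.nan

# Remedy missing values and encode the categorical column.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), ["area"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["quality"]),
])

# Dummy baseline: always predicts the mean of the training target.
baseline = Pipeline([("prep", preprocess), ("model", DummyRegressor(strategy="mean"))])
baseline_scores = cross_val_score(baseline, df, price, cv=5, scoring="r2")

# A more complex model, evaluated with the same 5-fold cross-validation.
model = Pipeline([("prep", preprocess), ("model", Ridge())])
model_scores = cross_val_score(model, df, price, cv=5, scoring="r2")

# Parameter tuning: grid search over the Ridge regularization strength.
search = GridSearchCV(model, {"model__alpha": [0.1, 1.0, 10.0]}, cv=5, scoring="r2")
search.fit(df, price)

print("baseline mean R^2:", baseline_scores.mean())
print("ridge mean R^2:   ", model_scores.mean())
print("best params:      ", search.best_params_)
```

Keeping the preprocessing inside the `Pipeline` means the imputer and encoder are refit on each training fold, so no information from the held-out fold leaks into the model during cross-validation or grid search.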