major_leagues_soccer's Introduction

Major leagues - NFL, MLB, NBA and Soccer scores

For this project, I used the Soccer SPI dataset.

Steps to execute:

Download the files from the GitHub repository.
Extract soccer_spi.csv from the .rar file.
Place the CSV file in a datasets folder, and place the datasets folder inside the notebooks folder. The notebooks folder should also contain the .ipynb file.
Open a terminal and type "jupyter notebook".
In the Jupyter file browser, navigate to the folder that contains the notebook and open it.
From the Cell menu, click Run All to run the whole notebook from the first cell, then verify the results.

The project builds regression models that predict a binary outcome (yes/no or win/lose), using the other columns as features.
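As a rough illustration of the win/lose idea, the sketch below fits a logistic regression on a handful of made-up rows. The column names (spi1, spi2, score1, score2) are assumptions about what soccer_spi.csv contains, the numbers are toy values rather than real results, and logistic regression is only one possible model choice, not necessarily the one used in the notebook.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy rows standing in for soccer_spi.csv. Column names are assumed,
# and the values are invented for illustration only.
matches = pd.DataFrame({
    "spi1": [80.0, 55.0, 70.0, 40.0, 65.0, 90.0, 50.0, 60.0],
    "spi2": [60.0, 75.0, 50.0, 70.0, 45.0, 55.0, 85.0, 62.0],
    "score1": [2, 0, 3, 1, 2, 4, 0, 1],
    "score2": [1, 2, 0, 3, 0, 1, 2, 1],
})

# Binary target: 1 if team 1 won, 0 otherwise (a draw counts as "not a win").
matches["team1_win"] = (matches["score1"] > matches["score2"]).astype(int)

# Use the two rating columns as features.
features = matches[["spi1", "spi2"]]
model = LogisticRegression(max_iter=1000)
model.fit(features, matches["team1_win"])

# Predicted win (1) / not win (0) for each row.
predictions = model.predict(features)
```

With the real data, the same pattern would be applied after a train/test split so that the predictions are checked on matches the model has not seen.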

Steps to follow:

Set up a data science project structure in a new git repository in your GitHub account
Pick one of the game data sets depending on your sports preference: https://github.com/fivethirtyeight/nfl-elo-game, https://github.com/fivethirtyeight/data/tree/master/mlb-elo, https://github.com/fivethirtyeight/data/tree/master/nba-carmelo, or https://github.com/fivethirtyeight/data/tree/master/soccer-spi
Load the data set into pandas DataFrames
Formulate one or two ideas on how feature engineering would help the data set to establish additional value using exploratory data analysis
Build one or more regression models to determine the scores for each team using the other columns as features
Document your process and results
Commit your notebook, source code, visualizations and other supporting files to the git repository in GitHub
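The loading, feature engineering, and regression steps above can be sketched as follows. This is a minimal example under stated assumptions: the column names (team1, team2, spi1, spi2, score1, score2) are guesses at the layout of soccer_spi.csv, the inline CSV holds invented rows so the snippet runs without the real file, and the helper names add_features and fit_score_model are hypothetical.

```python
import io

import pandas as pd
from sklearn.linear_model import LinearRegression

# Engineered feature columns used by the model (hypothetical choice).
FEATURES = ["spi_diff", "spi_ratio"]


def add_features(df):
    """Return a copy of df with two engineered rating features."""
    out = df.copy()
    out["spi_diff"] = out["spi1"] - out["spi2"]   # rating gap
    out["spi_ratio"] = out["spi1"] / out["spi2"]  # relative strength
    return out


def fit_score_model(df, target):
    """Fit a linear regression that predicts one team's score."""
    return LinearRegression().fit(df[FEATURES], df[target])


# Invented sample rows; with the real file this would be
# pd.read_csv("datasets/soccer_spi.csv") instead.
sample_csv = io.StringIO(
    "team1,team2,spi1,spi2,score1,score2\n"
    "Team A,Team B,80.0,60.0,2,1\n"
    "Team C,Team D,55.0,75.0,0,2\n"
    "Team E,Team F,70.0,50.0,3,0\n"
    "Team G,Team H,40.0,70.0,1,3\n"
    "Team I,Team J,65.0,45.0,2,0\n"
)

matches = add_features(pd.read_csv(sample_csv))

# One model per team score.
model_score1 = fit_score_model(matches, "score1")
model_score2 = fit_score_model(matches, "score2")

predicted_score1 = model_score1.predict(matches[FEATURES])
predicted_score2 = model_score2.predict(matches[FEATURES])
```

Fitting one model per score column keeps the example simple; a single multi-output model would be an equally reasonable design.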
