
nba-data-wrangling-exploration's Introduction

NBA Teams Offensive Data Exploration (1980-2021)

by Daniel Chang

Date last updated: 3/16/2022

Summary of Project

For this project, I explore and analyze the offensive stats and characteristics of NBA teams grouped by Finals ranking, a new column I create that contains 4 values: Champion, Runner-Up, Knocked Out and Never Qualified. Knocked Out means a team was knocked out of the NBA playoffs, and Never Qualified means a team never qualified for them. Some of the stats I analyze and visualize are Margin of Victory (MOV), 3P%, Age and shot attempts.

Part I - NBA Web Scraping.ipynb - To begin, I scraped data from the Basketball Reference website, which contains each team's performances throughout the years. I scraped a total of 4 different stats tables from the website and stored them in 4 different datasets. In this notebook, I used 3 different packages: Pandas, BeautifulSoup and Requests.
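As a rough illustration of the scraping step, here is a minimal sketch of turning one HTML stats table into a Pandas DataFrame with BeautifulSoup. It is not the notebook's actual code: the function name `table_to_dataframe`, the table id `totals-team` and the sample HTML are assumptions made for the example, and the page download with Requests is only shown as a comment so the sketch runs offline.

```python
import pandas as pd
from bs4 import BeautifulSoup


def table_to_dataframe(html: str, table_id: str) -> pd.DataFrame:
    """Parse the <table> with the given id into a DataFrame of strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        raise ValueError(f"No table with id {table_id!r} found")

    # Column names come from the header cells.
    headers = [th.get_text(strip=True) for th in table.find("thead").find_all("th")]

    # Each body row becomes one list of cell texts.
    rows = []
    for tr in table.find("tbody").find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)

    return pd.DataFrame(rows, columns=headers)


# In a real run the HTML would come from the website, for example:
#   html = requests.get(url, timeout=30).text
# Here a small hand-written table (placeholder values) stands in for the page.
SAMPLE_HTML = """
<table id="totals-team">
  <thead><tr><th>Team</th><th>G</th><th>PTS</th></tr></thead>
  <tbody>
    <tr><th>Team A</th><td>82</td><td>9000</td></tr>
    <tr><th>Team B</th><td>82</td><td>8500</td></tr>
  </tbody>
</table>
"""

sample_df = table_to_dataframe(SAMPLE_HTML, "totals-team")
print(sample_df)
```

Every value is scraped as text, which is why datatype conversion is needed later during cleaning.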

Part II - NBA Data Cleaning.ipynb - In this notebook, I conducted the cleaning process. The steps include dealing with null values, dropping unneeded columns, converting datatypes and cleaning up the values. After cleaning the data, I merged the 4 datasets into 2: one consists of the teams' total stats per year and the other consists of the average stats per game for each year. I also created a new column in this notebook to indicate each NBA team's Finals ranking. In this notebook, I used 2 different packages: Pandas and Numpy.
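One possible way to build the Finals ranking column is sketched below with `DataFrame.merge` and `numpy.select`. This is an assumption about the approach, not the notebook's code: the column names `Made_Playoffs`, `Champion` and `Runner-Up`, the team names and the year are all placeholders; only `Finals_Rk` and its 4 values come from this README.

```python
import numpy as np
import pandas as pd

# Placeholder per-team rows for a single season (hypothetical columns).
teams = pd.DataFrame({
    "Year": [2000, 2000, 2000, 2000],
    "Team": ["Team A", "Team B", "Team C", "Team D"],
    "Made_Playoffs": [True, True, True, False],
})

# Placeholder season summary: one row per year with both Finals teams.
summary = pd.DataFrame({
    "Year": [2000],
    "Champion": ["Team A"],
    "Runner-Up": ["Team B"],
})

# Attach the Finals teams of each season to every team row of that season.
merged = teams.merge(summary, on="Year", how="left")

# Conditions are checked in order; the first match wins.
conditions = [
    merged["Team"] == merged["Champion"],
    merged["Team"] == merged["Runner-Up"],
    merged["Made_Playoffs"],
]
choices = ["Champion", "Runner-Up", "Knocked Out"]
merged["Finals_Rk"] = np.select(conditions, choices, default="Never Qualified")

print(merged[["Year", "Team", "Finals_Rk"]])
```

Because `numpy.select` takes the first matching condition, the champion and runner-up are labeled before the more general playoff check is reached.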

Part III - NBA Data Exploration.ipynb - In this notebook, you'll find my analysis and visualizations of the stats. I started with the total stats to get a broad picture, using univariate and bivariate exploration with visuals for each. Afterward, I moved on to the average stats per game for each year, where I conducted the same types of exploration along with multivariate exploration. I also created a couple of categorical variables for this analysis. You will find the majority of the multivariate exploration near the end. In this notebook, I used 5 different packages: Pandas, Numpy, Seaborn, Matplotlib and Warnings.
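To show what creating a categorical variable for exploration can look like, here is a small sketch that bins seasons into eras with `pandas.cut` and then compares a per-game stat across them. The `Era` variable, its bin edges and the numeric values are hypothetical and chosen only for illustration; the categorical variables actually used in the notebook may differ.

```python
import pandas as pd

# Placeholder per-game rows; the 3PA numbers are dummy values, not real stats.
stats = pd.DataFrame({
    "Year": [1980, 1995, 2005, 2021],
    "3PA": [1.0, 2.0, 3.0, 4.0],
})

# Bin edges are right-inclusive, so (1979, 1989] covers seasons 1980-1989.
bins = [1979, 1989, 1999, 2009, 2021]
labels = ["1980s", "1990s", "2000s", "2010-2021"]
stats["Era"] = pd.cut(stats["Year"], bins=bins, labels=labels)

# Average of the stat per era, ready for a bar plot or box plot.
era_means = stats.groupby("Era", observed=True)["3PA"].mean()
print(era_means)
```

A column like `Era` can then be passed as the `hue` or `x` argument of a Seaborn plot to split a bivariate view into a multivariate one.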

Installation

* BeautifulSoup
* Requests
* Matplotlib
* Seaborn
* Numpy
* Pandas
* Warnings (part of the Python standard library, no separate install needed)

Datasets

total_stats_df.csv - Uncleaned dataset that contains the total stats from 1980-2021 scraped from the Basketball Reference website.

avg_stats_df.csv - Uncleaned dataset that contains the average stats from 1980-2021 scraped from the Basketball Reference website.

advanced_stats_df.csv - Uncleaned dataset that contains the advanced stats from 1980-2021 scraped from the Basketball Reference website. This dataset includes variables such as MOV, ORtg and DRtg.

advanced_stats_df.csv - Uncleaned dataset that contains the season summary from 1980-2021 scraped from the Basketball Reference website. This dataset includes variables such as Champions and Runner-up.

cleaned_total_stats.csv - Cleaned dataset that contains the total stats from 1980-2021 scraped from the Basketball Reference website. This dataset contains all needed variables, such as Finals rankings (Finals_Rk).

cleaned_avg_stats.csv - Cleaned dataset that contains the average stats from 1980-2021 scraped from the Basketball Reference website. This dataset contains all needed variables, such as Finals rankings (Finals_Rk), Age and Wins.

Licensing, Authors, Acknowledgements

I would like to give special thanks to Basketball Reference for the data that I have collected. This couldn't have happened without them.

Further Works

I plan to create predictive models in the coming months with the updated datasets for the upcoming season.
