
ibm-data-science-capstone's Introduction

IBM Data Science

Capstone project work for the IBM Data Science certification at Coursera

This project shows an analysis of the publicly available data set on liquor retail sales in the state of Iowa, US. The goal is to predict the monthly sales per liquor type for a particular store based on historical sales and weather data. The work consists of a few steps, briefly described as:

  1. Exploratory data analysis:
     • Determine each attribute's type and range of allowed (or normal) values
     • Plot a few summary bar and line charts to become familiar with the attributes' value distributions and relationships (correlation)
       • Notice the (expected) cyclic trend of the sales
     • Remove uninformative or unneeded attributes (dimensionality reduction)
  2. Transform the data:
     • Determine the different sources of data (liquor sales come from the web site of the government of Iowa, weather information comes from NOAA's website)
     • Merge the disparate sources
     • Generate new features
       • Lag values for the weather and sales parameters
  3. Do (simple) machine learning:
     • The problem is a simple univariate regression
     • Try: linear, lasso and ridge regression
     • Evaluate the results using the R^2 value
  4. Summarize results
     • The models perform more or less the same
     • The evaluation period contains the outbreak of the COVID-19 pandemic and shows very interesting results (last plot in the notebook)
       • Total liquor retail sales in Iowa dropped by more than 90% during this period!
       • R^2 values become negative because the model is simply not able to predict this (to be fair, neither was any of us :))!
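
The lag-feature, regression and evaluation steps can be illustrated with a minimal, dependency-free Python sketch. It is not the notebook's code: the sales figures are synthetic, the helper names (add_lag, fit_line, r2_score) and the penalty strengths are assumptions made for illustration, and a single lag-1 sales feature stands in for the full feature set. With one feature, the three models have simple closed forms, so the effect of the lasso and ridge penalties is easy to see. The second series mimics a sudden collapse in sales to show how R^2 can turn negative.

```python
from statistics import mean


def add_lag(values, lag=1):
    """Pair each observation with the one seen `lag` steps earlier: (lagged, current)."""
    return [(values[i - lag], values[i]) for i in range(lag, len(values))]


def fit_line(x, y, l1=0.0, l2=0.0):
    """Fit y ~ intercept + slope * x on a single feature.

    l1 = l2 = 0 gives ordinary least squares; l1 > 0 is a lasso-style
    penalty and l2 > 0 a ridge-style penalty (the intercept is unpenalized).
    The penalty scaling used here is an assumption of this sketch.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    # Lasso: soft-threshold the covariance term; ridge: inflate the denominator.
    shrunk = max(abs(sxy) - l1, 0.0) * (1.0 if sxy >= 0 else -1.0)
    slope = shrunk / (sxx + l2)
    return y_bar - slope * x_bar, slope


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic monthly sales (made-up numbers, NOT the Iowa data).
train_sales = [100, 105, 112, 120, 126, 130, 128, 121, 113,
               106, 101, 99, 101, 106, 113, 121, 127, 131]
# Hypothetical evaluation window containing a sudden collapse in sales.
shock_sales = [131, 128, 12, 9, 11, 10]

# Feature = last month's sales, target = this month's sales.
x_train, y_train = zip(*add_lag(train_sales))
x_shock, y_shock = zip(*add_lag(shock_sales))

models = {
    "linear": fit_line(x_train, y_train),
    "lasso": fit_line(x_train, y_train, l1=500.0),  # assumed penalty strength
    "ridge": fit_line(x_train, y_train, l2=500.0),  # assumed penalty strength
}

scores = {}
for name, (intercept, slope) in models.items():
    train_pred = [intercept + slope * v for v in x_train]
    shock_pred = [intercept + slope * v for v in x_shock]
    scores[name] = (r2_score(y_train, train_pred), r2_score(y_shock, shock_pred))
    print(f"{name:6s} train R^2 = {scores[name][0]:.2f}, "
          f"shock R^2 = {scores[name][1]:.2f}")
```

R^2 is negative whenever the model's squared error exceeds that of simply predicting the mean of the evaluation window, which is what happens when sales collapse in a way the training data never showed.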

For more information, check out the report and presentation files. For the actual code, take a look at the Jupyter notebook.

ibm-data-science-capstone's People

Contributors

popovstefan
