
vithika-karan / retail-sales-prediction


Sales Prediction: predicting sales for Rossmann, a major store chain. Rossmann operates over 3,000 drug stores in 7 European countries. Currently, Rossmann store managers are tasked with predicting their daily sales for up to six weeks in advance. Store sales are influenced by many factors, including promotions, competition, school and state holidays, seasonality, and locality. With thousands of individual managers predicting sales based on their unique circumstances, the accuracy of results can be quite varied. You are provided with historical sales data for 1,115 Rossmann stores. The task is to forecast the "Sales" column for the test set. Note that some stores in the dataset were temporarily closed for refurbishment.

Jupyter Notebook 100.00%
machine-learning regression salesforecast eda python3 scikit-learn

retail-sales-prediction's Introduction


Retail Sales Prediction

AlmaBetter Verified Project - AlmaBetter School

Problem Statement and Project Description

Retail Sales Prediction is a regression machine learning project. Rossmann operates over 3,000 drug stores in 7 European countries. Currently, Rossmann store managers are tasked with predicting their daily sales for up to six weeks in advance. Store sales are influenced by many factors, including promotions, competition, school and state holidays, seasonality, and locality. With thousands of individual managers predicting sales based on their unique circumstances, the accuracy of results can be quite varied. You are provided with historical sales data for 1,115 Rossmann stores. The task is to forecast the "Sales" column for the test set. Note that some stores in the dataset were temporarily closed for refurbishment.

Businesses use sales forecasts to estimate the revenue they will generate over a given period and to build strategic business plans around it. Important decisions such as budgets, hiring, incentives, goals, acquisitions, and other growth plans depend on the revenue the company expects in the coming months, so these plans are only as effective as the forecasts behind them.

The work here forecasts the sales of the various Rossmann stores across Europe for the most recent six weeks and compares the predictions of the models developed with the actual sales values.
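One way to set up such a comparison is to hold out the final six weeks of the history and train only on earlier days. The snippet below is a minimal sketch of that idea, not the notebooks' actual code: the `Date` field name, the helper `split_last_six_weeks`, and the sample records are all made up for illustration.

```python
from datetime import date, timedelta


def split_last_six_weeks(rows, weeks=6):
    """Split rows into (train, holdout); holdout is the final `weeks` weeks."""
    last_day = max(r["Date"] for r in rows)
    cutoff = last_day - timedelta(weeks=weeks)
    train = [r for r in rows if r["Date"] <= cutoff]
    holdout = [r for r in rows if r["Date"] > cutoff]
    return train, holdout


# Made-up daily records for a single store: 100 consecutive days.
start = date(2015, 1, 1)
rows = [{"Date": start + timedelta(days=i), "Sales": 5000 + 10 * i} for i in range(100)]

train, holdout = split_last_six_weeks(rows)
print(len(train), len(holdout))  # 58 42
```

Splitting by date rather than at random keeps every training day strictly before the evaluation window, which mirrors how a forecast would be used in practice.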

💾 Project Files Description

This project contains two executable files, a technical document, and a presentation, as follows:

Executable Files:

  • Rossmann_Sales_Prediction_Vithika_Karan.ipynb - Google Colab notebook containing data summary, exploration, visualisations and modeling.
  • Rossmann_Sales_Prediction_Part_2_Vithika_Karan_ipynb.ipynb - Google Colab notebook containing model hyperparameter tuning, model performance, evaluation and conclusion.

Documentation:

  • Technical Documentation.pdf - Includes the complete documentation about the project.
  • Project Presentation.pdf - Presentation of the same.

Source Directory:

  • Data & Resources.zip - Includes sales data and store data for various Rossmann stores.

-----------------------------------------------------

📖 Random Forest

Random forest is a supervised learning algorithm. It creates a "forest" out of an ensemble of decision trees, which are commonly trained using the "bagging" method: each tree is fitted on a random sample of the training data drawn with replacement. The bagging method's basic premise is that combining different learning models improves the overall output. Simply said, random forest combines many decision trees to produce a more accurate and stable prediction.

Furthermore, random forest is efficient, can handle a large number of input variables, and usually gives accurate predictions. Because this project predicts a continuous value, it is used here as a regressor. It's a very strong tool that takes little code to implement with a library such as scikit-learn.
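To make the bagging idea concrete, here is a small standard-library sketch that averages one-split regression trees ("stumps"), each fitted on a bootstrap sample. It is an illustration under simplifying assumptions, not the model used in the notebooks: a real random forest grows deeper trees and also samples features at each split, and the helper names and toy data are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimising squared error."""
    best = None
    # Skip the largest x as a threshold so the right side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = mean(left), mean(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # every x in the sample is identical: predict the mean
        m = mean(ys)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def fit_bagged(xs, ys, n_trees=25, seed=0):
    """Average the predictions of stumps fitted on bootstrap samples."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(tree(x) for tree in trees)


# Toy data (made up): the target jumps from 0 to 10 once x reaches 5.
xs = list(range(10))
ys = [0.0 if x < 5 else 10.0 for x in xs]

predict = fit_bagged(xs, ys)
print(predict(1), predict(8))
```

Each stump sees a slightly different sample, so individual thresholds vary, while the averaged prediction is steadier than any single tree.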

-----------------------------------------------------

📈 Exploratory Data Analysis

Sales were highest on Mondays, probably because most shops remain closed on Sundays, which had the lowest sales of the week. Store type B, though few in number, had the highest average sales. Likely reasons are that type B stores offer all three kinds of assortment, especially assortment level b, which is only available at type B stores, and that they are open on Sundays as well. The outliers in the dataset showed justifiable behaviour: they either belonged to store type B or coincided with a promotion, which increased sales.


Store type B was open on all seven days of the week and had higher sales than any other store type, and promotions had a positive effect on sales across all store types.
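Comparisons like these come down to averaging sales within groups. The following sketch shows the pattern with plain Python; the field names `StoreType` and `Promo` are assumed, the helper `average_sales_by` is hypothetical, and the four records are invented purely to demonstrate the calculation, so they are not results from the dataset.

```python
from collections import defaultdict
from statistics import mean


def average_sales_by(rows, key):
    """Return the mean of 'Sales' for each distinct value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["Sales"])
    return {k: mean(v) for k, v in groups.items()}


# Invented records, for illustration only.
rows = [
    {"StoreType": "a", "Promo": 0, "Sales": 4000},
    {"StoreType": "a", "Promo": 1, "Sales": 6000},
    {"StoreType": "b", "Promo": 0, "Sales": 9000},
    {"StoreType": "b", "Promo": 1, "Sales": 11000},
]

by_type = average_sales_by(rows, "StoreType")
by_promo = average_sales_by(rows, "Promo")
print(by_type)   # {'a': 5000, 'b': 10000}
print(by_promo)  # {0: 6500, 1: 8500}
```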


📈 Results

The tuned Random Forest model gave the best results. It captured the patterns in the data without overfitting and achieved an R^2 of 0.95. Forecasts of this quality help the company allocate resources and plan properly for growth.
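For reference, R^2 is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the actual values. A minimal sketch of that calculation follows; the function name `r_squared` and the numbers are made up for illustration and are unrelated to the project's reported score.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Invented values: every prediction is off by 10.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

score = r_squared(actual, predicted)
print(round(score, 3))  # 0.992
```

A score of 1 means the predictions match the actual values exactly, while a score of 0 means the model does no better than always predicting the mean.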

-----------------------------------------------------

📜 Credits

< Vithika Karan > | Keen Learner | Business Analyst | Data Scientist | Machine Learning Enthusiast

Contact me for Data Science Project Collaborations

