

Web Traffic Forecasting

Web Traffic visual

This repository analyses the web traffic of a website over a given time interval using time series analysis and forecasts future traffic. The website analysed here is Wikipedia. The scripts are IPython Notebooks (.ipynb).

DSLC

This analysis follows the flow of the Data Science Life Cycle (DSLC).

Initial Setup

git clone https://github.com/San411/Time-Series-Analysis.git
cd Time-Series-Analysis
pip3 install -r requirements.txt

Data Mining

The training dataset consists of daily Wikipedia page views for 145k articles from 2015 to 2017. First, download the dataset provided by Google from Kaggle; this should not take long. Save it as 'data_2015-17.csv' in the 'Datasets' folder.

For the validation data, use the Wikipedia Pageview API to request daily article page views for 2018-19 (refer to the Pageview API documentation). The REST API returns JSON, which is stored in files for later use. Create a folder named 'json_files' inside 'Datasets', then run data_mining.ipynb. These files take a long time to download. Each JSON record looks like this:

{"project": "en.wikipedia",
  "article": "!vote", 
  "granularity": "daily",
  "timestamp": "2018010100",
  "access": "all-access", 
  "agent": "all-agents",
  "views": 9}

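As a rough illustration of the request step, the sketch below builds a per-article URL for the Wikimedia Pageviews REST API and extracts the view counts from a response. It is not the code in data_mining.ipynb: the function names and the User-Agent string are hypothetical, and the response is assumed to wrap records like the one above in an "items" list.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path of the per-article endpoint of the Wikimedia Pageviews REST API.
BASE_URL = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_pageviews_url(article, start, end, project="en.wikipedia",
                        access="all-access", agent="all-agents",
                        granularity="daily"):
    """Build the request URL; start and end are YYYYMMDD strings."""
    # Article titles use underscores for spaces and must be URL-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    return "/".join([BASE_URL, project, access, agent, title,
                     granularity, start, end])


def parse_views(payload):
    """Map each timestamp (YYYYMMDDHH) in a response to its view count."""
    return {item["timestamp"]: item["views"]
            for item in payload.get("items", [])}


def fetch_daily_views(article, start, end):
    """Request the data over the network (defined here, not called below)."""
    request = urllib.request.Request(
        build_pageviews_url(article, start, end),
        # Hypothetical User-Agent; replace it with your own contact details.
        headers={"User-Agent": "web-traffic-forecasting-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_views(json.load(response))


# Offline check using the sample record shown above.
sample = {"items": [{"project": "en.wikipedia", "article": "!vote",
                     "granularity": "daily", "timestamp": "2018010100",
                     "access": "all-access", "agent": "all-agents",
                     "views": 9}]}
print(parse_views(sample))
print(build_pageviews_url("!vote", "20180101", "20191231"))
```

Calling fetch_daily_views("!vote", "20180101", "20191231") would perform the actual request; the returned dictionary could then be written to a file in 'json_files' with json.dump.
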
Data Cleaning and Exploration

Run the two notebooks in the following order:

data_cleaning.ipynb
data_exploration.ipynb 

The cleaned data will be saved in the 'Datasets' folder.

View Plot for a sample dataset after exploration

View Plot

Feature Engineering

Results of Rolling Statistics

R-Stats
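
Rolling statistics are a quick visual check for stationarity: if the rolling mean and standard deviation stay roughly constant over time, the series is more likely to be stationary. The sketch below computes both with the standard library only. The function name, window size and sample values are assumptions for illustration, not taken from the repository's notebooks.

```python
from statistics import mean, pstdev


def rolling_stats(values, window):
    """Return (rolling_means, rolling_stds), one entry per full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    means, stds = [], []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        means.append(mean(chunk))
        # Population standard deviation; pandas' rolling std uses the
        # sample standard deviation (ddof=1) by default.
        stds.append(pstdev(chunk))
    return means, stds


# Made-up daily view counts, for illustration only.
views = [9, 12, 10, 15, 30, 28, 11]
means, stds = rolling_stats(views, window=3)
print(means)
print(stds)
```

In a notebook, pandas' `Series.rolling(window).mean()` and `.std()` give the same kind of result directly on a time-indexed series.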

After the Augmented Dickey-Fuller test is performed, the flattened data is stored in the 'Datasets' folder.

ARIMA


Mean Absolute Error : 28.318

Mean Squared Error : 2110.667

Root Mean Squared Error : 45.941
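
For reference, the three error metrics reported above can be computed as in the following minimal sketch. The function names and the sample values are illustrative assumptions; the notebooks may compute the metrics differently, for example with a library.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Squared Error: average of (actual - predicted) squared."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


# Made-up values, for illustration only.
actual = [10, 20, 30]
predicted = [12, 18, 33]
print(mae(actual, predicted), mse(actual, predicted), rmse(actual, predicted))
```

Because RMSE is the square root of MSE, the two reported values can be cross-checked: the square root of 2110.667 is about 45.94, which agrees with the reported RMSE up to rounding.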

FACEBOOK - PROPHET

PROPHET - FLATTENED PROPHET

Mean Absolute Error : 220.883

Mean Squared Error : 75844.554

Root Mean Squared Error : 275.398

SMAPE Score : 42.966
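
SMAPE (Symmetric Mean Absolute Percentage Error) has several variants, and the source does not state which one produced the score above. The sketch below assumes the common percentage form that ranges from 0 to 200, with a term counted as zero when both the actual and the predicted value are zero. The function name and sample values are illustrative.

```python
def smape(actual, predicted):
    """SMAPE in percent: 100/n * sum(2*|p - a| / (|a| + |p|))."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denominator = abs(a) + abs(p)
        # Assumption: a pair where both values are zero contributes no error.
        if denominator:
            total += 2.0 * abs(p - a) / denominator
    return 100.0 * total / len(actual)


# Made-up values, for illustration only.
print(smape([100, 0], [50, 0]))
```

Unlike MAE and RMSE, SMAPE is scale-free, so it allows comparison across articles with very different traffic levels.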
