shivakumarswamy / movieanalysis

Data Analysis of Movies

License: Apache License 2.0

Jupyter Notebook 97.30% Python 2.70%

dataanalysis python jupyter-notebook pandas beautifulsoup plotly movie-database dataset

MovieAnalysis

The following links are helpful for the project,

The dataset01.csv and dataset02.csv files consist of 27000 entries.

For the project, we filtered the dataset to the years 1990-2014, country USA, and language English, which leaves 10060 entries.
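The filter described above can be sketched in pandas as follows. This is a minimal illustration, not the code in filteringDataset.ipynb: the column names (`id`, `year`, `country`, `language`) and the sample rows are assumptions made up for the example, and the real dataset may label or encode these fields differently.

```python
import pandas as pd


def filter_movies(df: pd.DataFrame) -> pd.DataFrame:
    """Keep English-language USA entries from 1990 to 2014, one row per ID.

    Column names are hypothetical; adjust them to the real CSV headers.
    """
    mask = (
        df["year"].between(1990, 2014)  # inclusive on both ends
        & (df["country"] == "USA")
        & (df["language"] == "English")
    )
    # Drop repeated IDs, keeping the first occurrence of each.
    return df[mask].drop_duplicates(subset="id").reset_index(drop=True)


# Tiny made-up frame standing in for the real dataset.
sample = pd.DataFrame(
    {
        "id": ["a1", "a1", "b2", "c3", "d4"],
        "year": [1995, 1995, 1989, 2010, 2014],
        "country": ["USA", "USA", "USA", "UK", "USA"],
        "language": ["English"] * 5,
    }
)

filtered = filter_movies(sample)
print(filtered)
```

In this sample, the duplicate `a1` row, the 1989 entry, and the UK entry are dropped, so two rows remain.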

Project Implementation Steps:

Run filteringDataset.ipynb to filter the dataset and remove duplicate IDs. This produces datasetWithoutBoxOffice.csv.
Run extractBoxOffice.ipynb to extract box office figures using the WebCrawl class in webcrawl.py. This produces datasetWithBoxOffice.csv.

Optional (but suggested): We made 10 copies of extractBoxOffice.ipynb, each handling 1000 entries, and then used mergeCSV.ipynb to merge the resulting CSV files into datasetWithBoxOffice.csv.

Alternatively, you can run extractBoxOfficeAllEntries.ipynb to extract box office figures for all entries in one go, but this takes a long time (hours).
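Merging the per-chunk CSV files might look roughly like the sketch below. It is not the content of mergeCSV.ipynb: the `part_*.csv` naming, the `merge_csv_parts` helper, and the `id` and `box_office` columns are assumptions for illustration, and the demo writes two tiny made-up files to a temporary directory.

```python
import glob
import os
import tempfile

import pandas as pd


def merge_csv_parts(pattern: str, output_path: str) -> pd.DataFrame:
    """Concatenate every CSV matching `pattern` and write the result."""
    paths = sorted(glob.glob(pattern))  # sorted so the row order is stable
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")
    merged = pd.concat((pd.read_csv(p) for p in paths), ignore_index=True)
    merged.to_csv(output_path, index=False, encoding="utf-8")
    return merged


# Demo with two made-up chunk files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"id": ["a1"], "box_office": [100]}).to_csv(
        os.path.join(tmp, "part_01.csv"), index=False
    )
    pd.DataFrame({"id": ["b2"], "box_office": [250]}).to_csv(
        os.path.join(tmp, "part_02.csv"), index=False
    )
    merged = merge_csv_parts(
        os.path.join(tmp, "part_*.csv"),
        os.path.join(tmp, "datasetWithBoxOffice.csv"),
    )

print(merged)
```

Splitting the crawl into chunks means a failed run only costs one chunk, and the merge step stays a plain concatenation.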

Run extractTicketInflationPrice.ipynb to extract the table of ticket prices by year. This produces ticketPriceInflation.csv.
Run adjustTicketPriceInflation.ipynb. This produces finalDataset.csv.
Run plotDataset1.ipynb and plotDataset2.ipynb to visualise the dataset.
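As an illustration of the inflation-adjustment step, one common approach is to scale each film's box office by the ratio of a reference year's average ticket price to the average ticket price in the film's release year. The sketch below assumes that approach; it is not taken from adjustTicketPriceInflation.ipynb, and the column names, the helper name, and the round-number prices are all made up for the example.

```python
import pandas as pd


def adjust_for_ticket_inflation(
    movies: pd.DataFrame, prices: pd.DataFrame, reference_year: int
) -> pd.DataFrame:
    """Express box office in reference-year ticket prices.

    adjusted = box_office * (reference price / price in release year)
    Column names are hypothetical.
    """
    reference_price = prices.loc[
        prices["year"] == reference_year, "avg_ticket_price"
    ].iloc[0]
    # Attach each film's release-year ticket price.
    merged = movies.merge(prices, on="year", how="left")
    merged["adjusted_box_office"] = (
        merged["box_office"] * reference_price / merged["avg_ticket_price"]
    )
    return merged


# Made-up figures chosen so the arithmetic is easy to follow.
movies = pd.DataFrame(
    {"id": ["a1", "b2"], "year": [1990, 2014], "box_office": [100.0, 100.0]}
)
prices = pd.DataFrame({"year": [1990, 2014], "avg_ticket_price": [4.0, 8.0]})

adjusted = adjust_for_ticket_inflation(movies, prices, reference_year=2014)
print(adjusted[["id", "year", "adjusted_box_office"]])
```

With these sample prices, the 1990 film's figure doubles because the assumed ticket price doubled between 1990 and 2014, while the 2014 film is unchanged.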

On Windows, use UTF-8 encoding when converting to CSV.
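A minimal sketch of writing and reading a CSV with the encoding stated explicitly, so the result does not depend on platform defaults. The file name and the titles are made up for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical titles containing non-ASCII characters.
df = pd.DataFrame({"title": ["Café Story", "Señor Robot"], "year": [2001, 2008]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    # State the encoding on both the write and the read.
    df.to_csv(path, index=False, encoding="utf-8")
    round_trip = pd.read_csv(path, encoding="utf-8")

print(round_trip)
```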

Images

Snapshot of Final dataset

One of the plots of the dataset
