MovieAnalysis

The following resources are helpful for the project:

  1. 10 minutes to Pandas
  2. Beautiful Soup
  3. Requests
  4. plotly
  5. MovieLens Dataset
  6. OMDB API
  7. Markdown Quick Tutorial

The files dataset01.csv and dataset02.csv consist of 27000 entries.

For this project, we filtered the dataset to the years 1990-2014, country USA, and language English, which leaves 10060 entries.
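The filter described above could be sketched in pandas roughly as follows. This is a minimal illustration, not the code in filteringDataset.ipynb: the column names `id`, `year`, `country`, and `language`, the function name `filter_movies`, and the sample rows are all assumptions made for the example.

```python
import pandas as pd


def filter_movies(df: pd.DataFrame) -> pd.DataFrame:
    """Keep US, English-language titles from 1990-2014 and drop duplicate IDs.

    Column names are hypothetical; adjust them to match the real dataset.
    """
    mask = (
        df["year"].between(1990, 2014)  # inclusive on both ends
        & (df["country"] == "USA")
        & (df["language"] == "English")
    )
    # Keep the first row for each ID and renumber the index.
    return df[mask].drop_duplicates(subset="id").reset_index(drop=True)


# Made-up sample rows, only to show the behaviour of the filter.
sample = pd.DataFrame(
    {
        "id": [1, 1, 2, 3, 4],
        "year": [1995, 1995, 1985, 2010, 2014],
        "country": ["USA", "USA", "USA", "UK", "USA"],
        "language": ["English"] * 5,
    }
)
filtered = filter_movies(sample)
print(filtered)
```

In this sample, the duplicate row for ID 1, the 1985 title, and the UK title are dropped.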

Project Implementation Steps:

  1. Run filteringDataset.ipynb to filter the dataset and remove duplicate IDs. This produces datasetWithoutBoxOffice.csv.

  2. Run extractBoxOffice.ipynb to extract box office figures using the WebCrawl class in webcrawl.py. This produces datasetWithBoxOffice.csv.

Optional (but suggested): We made 10 copies of extractBoxOffice.ipynb, each handling 1000 entries, and then merged the resulting CSVs with mergeCSV.ipynb to get datasetWithBoxOffice.csv.

Alternatively, you can run extractBoxOfficeAllEntries.ipynb to extract box office figures for all entries in one pass, but this takes several hours.

  3. Run extractTicketInflationPrice.ipynb to extract the table of ticket price inflation by year. This produces ticketPriceInflation.csv.

  4. Run adjustTicketPriceInflation.ipynb. This produces finalDataset.csv.

  5. Run plotDataset1.ipynb and plotDataset2.ipynb to visualise the dataset.
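The README does not state how the inflation adjustment is computed. One common approach, shown here purely as an assumption, is to scale each film's box office by the ratio of a reference year's average ticket price to the average ticket price in the film's release year. The column names, the function name `adjust_for_ticket_inflation`, and the price values below are hypothetical and made up for illustration.

```python
import pandas as pd


def adjust_for_ticket_inflation(
    movies: pd.DataFrame, prices: pd.DataFrame, reference_year: int
) -> pd.DataFrame:
    """Scale box office to reference-year ticket prices (assumed method)."""
    # Average ticket price in the year we normalise to.
    ref_price = prices.loc[
        prices["year"] == reference_year, "avg_ticket_price"
    ].iloc[0]
    # Attach each film's release-year ticket price.
    merged = movies.merge(prices, on="year", how="left")
    merged["adjusted_box_office"] = (
        merged["box_office"] * ref_price / merged["avg_ticket_price"]
    )
    return merged


# Made-up numbers, not real ticket prices or grosses.
movies = pd.DataFrame(
    {"title": ["A", "B"], "year": [2000, 2010], "box_office": [100.0, 200.0]}
)
prices = pd.DataFrame({"year": [2000, 2010], "avg_ticket_price": [5.0, 8.0]})

adjusted = adjust_for_ticket_inflation(movies, prices, reference_year=2010)
print(adjusted[["title", "year", "adjusted_box_office"]])
```

With these sample values, film A is scaled from 100.0 to 160.0, while film B, released in the reference year, is unchanged.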

On Windows, set the encoding to UTF-8 when converting to CSV.
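As a rough sketch of the encoding advice, the snippet below merges several chunk CSVs (in the spirit of the optional mergeCSV.ipynb step) and passes `encoding="utf-8"` explicitly on both read and write. It is not the notebook's actual code: the function name `merge_csv_chunks`, the `chunk_*.csv` file names, and the sample rows are assumptions for the example.

```python
import glob
import os
import tempfile

import pandas as pd


def merge_csv_chunks(pattern: str, out_path: str) -> pd.DataFrame:
    """Concatenate all CSVs matching `pattern` and write one UTF-8 CSV."""
    paths = sorted(glob.glob(pattern))
    frames = [pd.read_csv(path, encoding="utf-8") for path in paths]
    merged = pd.concat(frames, ignore_index=True)
    # An explicit encoding makes the output independent of the system default.
    merged.to_csv(out_path, index=False, encoding="utf-8")
    return merged


# Demonstration with two made-up chunk files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"id": [1], "title": ["Café"]}).to_csv(
        os.path.join(tmp, "chunk_01.csv"), index=False, encoding="utf-8"
    )
    pd.DataFrame({"id": [2], "title": ["Señor"]}).to_csv(
        os.path.join(tmp, "chunk_02.csv"), index=False, encoding="utf-8"
    )
    merged = merge_csv_chunks(
        os.path.join(tmp, "chunk_*.csv"), os.path.join(tmp, "merged.csv")
    )
    # Read the merged file back to confirm non-ASCII titles survive.
    round_trip = pd.read_csv(os.path.join(tmp, "merged.csv"), encoding="utf-8")

print(round_trip)
```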

Images

  • Snapshot of the final dataset

  • One of the plots of the dataset

Contributors

geethanjalinivas, shivakumarswamy
