movie-industry-analysis

Module 1 Final Project

Team Members: Dipta Roy and Muriel Kosaka

Introduction

In this project, we were given a scenario in which Microsoft has expressed interest in creating movies, and our team was tasked with helping them better understand the movie industry. By exploring films from 2018 and 2019, we analyzed factors that contributed to their success at the box office. We then present our findings and make recommendations to help Microsoft create profitable films.

Questions that we aim to answer during this project

[Image: Pyramid]

Process

  1. See BoxOfficeMojo_DataScraping.ipynb: Here we scraped Box Office Mojo for movies from all seasons (Fall, Spring, Summer, Winter, and Holiday) for 2018 and 2019 and created the DataFrames needed for analysis.
  2. See Data_Visualization_years_correlation.ipynb: We then merged this DataFrame with the zipped data files that were provided to us to answer our first two questions.
  3. See Data_Visualization_Season_summer: We found that movies shown during the Summer and Holiday seasons had the highest average gross.
  4. See Data_Visualization_Season_summer and Data_visualization_holiday_season: For each of these seasons, we answered questions four and five.
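
As a rough illustration of steps 2 and 3, the merge and the per-season average could look something like the sketch below. This is not the notebooks' actual code: the column names (`title`, `season`, `gross`, `production_budget`), the helper functions, and the sample rows are all assumptions made up for the example.

```python
import pandas as pd

# Made-up stand-in for the scraped Box Office Mojo data (hypothetical schema).
scraped = pd.DataFrame({
    "title": ["Film A", "Film B", "Film C", "Film D"],
    "season": ["Summer", "Summer", "Holiday", "Fall"],
    "gross": [300.0, 100.0, 250.0, 50.0],
})

# Made-up stand-in for one of the provided data files (hypothetical schema).
provided = pd.DataFrame({
    "title": ["Film A", "Film B", "Film C", "Film E"],
    "production_budget": [150.0, 40.0, 90.0, 20.0],
})


def merge_sources(scraped_df, provided_df):
    """Keep only films that appear in both sources, joined on title."""
    return scraped_df.merge(provided_df, on="title", how="inner")


def average_gross_by_season(df):
    """Mean gross per season, highest first."""
    return df.groupby("season")["gross"].mean().sort_values(ascending=False)


merged = merge_sources(scraped, provided)
season_avg = average_gross_by_season(scraped)
print(merged)
print(season_avg)
```

An inner join drops films that are missing from either source, which is one way the number of observations can shrink after merging.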

Libraries

  • Data Collection:

    • Requests
    • Beautiful Soup
    • Pandas
  • Data Cleaning and Visualization:

    • Matplotlib
    • Seaborn
    • Pandas

Findings and Suggestions

Although films did not perform well in 2019, we believe this is a reason to investigate further how films can be improved to generate a high gross. There was a slight positive correlation between production budget and gross, indicating that films with larger production budgets tended to earn a higher gross, total gross, and worldwide gross. Films shown during the summer and holiday seasons generated the highest average gross, so we suggest that films be released in theaters during those times. Based on further analysis of genre and movie length, we recommend that films shown in the summer be of the Action and Adventure genres and have a runtime of at least 120 minutes. We also recommend that films shown during the holiday season be of the Animation and Comedy or Comedy and Family genres and have a runtime of at least 120 minutes to generate a high average gross.
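
The two kinds of comparison behind these findings, a budget-versus-gross correlation and an average gross split at the 120-minute mark, can be sketched roughly as follows. The column names, helper functions, and figures are hypothetical and are not taken from the project data; only the 120-minute threshold comes from the recommendation above.

```python
import pandas as pd

# Made-up sample; columns are assumed names, values are illustrative only.
films = pd.DataFrame({
    "production_budget": [10.0, 20.0, 30.0, 40.0, 50.0],
    "gross": [30.0, 20.0, 60.0, 50.0, 90.0],
    "runtime_minutes": [95, 100, 125, 110, 140],
})


def budget_gross_correlation(df):
    """Pearson correlation between production budget and gross."""
    return df["production_budget"].corr(df["gross"])


def average_gross_by_runtime(df, threshold=120):
    """Mean gross for films at or above the runtime threshold vs. below it."""
    bucket = (
        df["runtime_minutes"]
        .ge(threshold)
        .map({True: "120+ min", False: "under 120 min"})
        .rename("runtime_bucket")
    )
    return df.groupby(bucket)["gross"].mean()


r = budget_gross_correlation(films)
by_runtime = average_gross_by_runtime(films)
print(r)
print(by_runtime)
```

A positive coefficient only says that budget and gross tend to rise together in the sample; it does not by itself show that spending more causes a higher gross.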

Limitations and Further Analyses

One limitation is that the data files contained null values, which lowered the number of observations and weakened the strength of our findings. Another limitation is the accuracy of the data used for the analyses: when attempting to merge files, we found that their gross values did not match, which may be due to the data being collected at different times.
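
Both limitations can be checked in a few lines. The sketch below counts how many rows a null filter would remove and lists the films whose gross differs between two sources. The frames, column names, and helper functions are hypothetical and assume the two sources share a `title` key.

```python
import pandas as pd

# Made-up frames standing in for two sources that both report gross.
scraped = pd.DataFrame({
    "title": ["Film A", "Film B", "Film C"],
    "gross": [300.0, 100.0, None],
})
provided = pd.DataFrame({
    "title": ["Film A", "Film B", "Film D"],
    "gross": [300.0, 120.0, 80.0],
})


def count_dropped_rows(df, columns):
    """Number of rows that would be lost by dropping nulls in the given columns."""
    return len(df) - len(df.dropna(subset=columns))


def find_gross_mismatches(left, right, tolerance=0.0):
    """Films present in both sources whose gross differs by more than tolerance."""
    merged = left.merge(
        right, on="title", how="inner", suffixes=("_scraped", "_provided")
    )
    diff = (merged["gross_scraped"] - merged["gross_provided"]).abs()
    return merged[diff > tolerance]


dropped = count_dropped_rows(scraped, ["gross"])
mismatches = find_gross_mismatches(scraped, provided)
print(dropped)
print(mismatches)
```

A nonzero `tolerance` could absorb small differences caused by the sources being collected on different dates.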

Further analyses should look into how budget and profitability affect gross. Movies released on streaming platforms versus in theaters should also be explored, as streaming releases are a current trend. Examining how the length of a movie's theatrical run relates to its gross would also be interesting. Lastly, further analyses should look at how well a movie performs in theaters relative to other movies showing at the same time.
