
movierecommendationsystem's Introduction

Movie Recommendation System


Contents

  1. Description
  2. Project structure
  3. Datasets
  4. Project Overview
  5. Project roadmap
  6. Getting started
  7. Contributing
  8. Authors
  9. License
  10. Acknowledgments

Description

Who does not love movies? The dataset was taken from https://www.kaggle.com/datasets/tmdb/tmdb-movie-metadata and includes all the major movies and the information relevant to them.

Point System

  • Issues tagged Easy earn you 100 points.
  • Issues tagged Medium earn you 200 points.
  • Issues tagged Hard earn you 300 points.

Project structure

  ├── datasets/         Movie datasets.
  ├── notebooks/        Jupyter notebook for the movie analysis.

Datasets

  • movies.csv is the dataset for the Data Cleaning and Preprocessing and Recommendation Systems sections of the notebook.
  • tmdb_5000_credits.csv and tmdb_5000_movies.csv are the datasets for the Data Visualization and Revenue Prediction sections of the notebook.

Project Overview

  • Colab notebook: https://colab.research.google.com/github/Mangalam0512/Movie-Recomendation/blob/main/Notebook/MovieRecommendation.ipynb

Project roadmap

The project currently does the following things.

  • Cleans the tmdb_5000_movies.csv and tmdb_5000_credits.csv datasets.
  • Uses cosine similarity on that data to recommend movies.
  • Visualizes movies and their profit percent.
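
To make the cosine-similarity step concrete, here is a minimal, dependency-free sketch. It is not the notebook's code: the titles, the tag strings, and the helper names (`to_vector`, `cosine_similarity`, `recommend`) are all hypothetical, and the notebook may well build its vectors differently (for example with scikit-learn).

```python
import math
from collections import Counter

# Hypothetical toy catalogue: title -> free-text tags (not taken from the dataset).
movies = {
    "Space Saga": "space adventure action alien",
    "Galaxy Run": "space action alien chase",
    "Love Letters": "romance drama letters",
}


def to_vector(text):
    """Bag-of-words term counts for one movie."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, top_n=2):
    """Return the top_n other titles ranked by similarity to `title`."""
    query = to_vector(movies[title])
    scores = [
        (other, cosine_similarity(query, to_vector(tags)))
        for other, tags in movies.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scores[:top_n]]


print(recommend("Space Saga"))  # ['Galaxy Run', 'Love Letters']
```

"Space Saga" and "Galaxy Run" share three of their four tags, so their similarity is 3 / (2 × 2) = 0.75, while "Love Letters" shares none and scores 0.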

See below for our future steps.

  • Find other possible algorithms for the recommendation system.
  • Predict revenue for movies whose status != released.
  • Make more informative visualizations.
  • Clean movies.csv and build a recommendation system based on that data.

Getting started

Prerequisites

Software Needed

  1. A web browser.

    OR
    
  2. Anaconda software.

Knowledge Needed

  • Very basic understanding of Git and GitHub:

    1. What are repositories (local - remote - upstream), issues, pull requests
    2. How to clone a repository, how to fork a repository, how to set upstreams
    3. Adding, committing, pulling, pushing changes to remote repositories
  • For EDA and Visualisation

    1. Basic Python syntax and usage. (This is a must.)
    2. Basic knowledge of the pandas library. Reading this blog might help.
    3. Basic knowledge of the matplotlib library. Reading this blog might help.
    4. Basic knowledge of the seaborn library. Reading this blog might help.
    5. Basic knowledge of the scikit-learn library. Reading this blog might help.

    However, the code is well explained, so anyone who knows the basics of Python can get an idea of what's happening and contribute.

Installing

There are two ways of running the code.

  1. Running the code in a web browser (Google Colab). [Recommended]

    • Head over to Google Colab.
    • Click the Upload Notebook tab.
    • Upload the notebook that you got from this repo.
    • Connect to the runtime.
    • Upload your dataset.
    • Click Run All.
    • Start editing.
  2. You can also run the code locally on your computer by installing Anaconda.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated. Contributing is also a great way to learn more about social coding on GitHub, new technologies and their ecosystems, and how to make constructive, helpful bug reports, feature requests, and the noblest of all contributions: a good, clean pull request.

Guidelines

  • Before starting to work on any issue or feature, open an issue explaining the changes you want to make and wait for any of the project maintainers to assign it to you.
  • Use descriptive commit messages that explain the changes you made. See the example below:
    • Bad commit message: updated readme
    • Good commit message: updated contributors list in readme
  • You should not, in any case, use resources or code snippets from sources that do not allow their public use.

Steps to follow for Pull Request

  • When solving an issue or adding a feature, write your code after the end of the original code, and do not forget to add the issue name and number as a heading in the notebook.
  • Before submitting the PR, make sure to include a link to a Colab notebook with the solved issue or feature so that we can check it easily. This also applies to those working in Anaconda.

Authors

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

movierecommendationsystem's People

Contributors

mangalam0512


movierecommendationsystem's Issues

Update the Horizontal Bar Graph.

Is your feature request related to a problem? Please describe.
There is no heading and no proper alignment, there is extra padding on both sides, and the axes are not named.

Describe the solution you'd like
Fix all the things mentioned above, and also show the profit percent value beside the respective bars.
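
As a rough sketch of what the requested chart could look like, the following matplotlib snippet adds a title, axis names, tighter padding, and a value label beside each bar. The movie titles and profit percentages are made up, and the headless `Agg` backend is only there so the sketch runs without a display; in Colab you would drop that line and let the figure render inline.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line inside a notebook
import matplotlib.pyplot as plt

# Hypothetical titles and profit percentages, for illustration only.
titles = ["Movie A", "Movie B", "Movie C"]
profit_percent = [250.0, 120.5, 75.0]

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.barh(titles, profit_percent, color="steelblue")

# Heading and named axes, as requested in the issue.
ax.set_title("Profit percent by movie")
ax.set_xlabel("Profit (%)")
ax.set_ylabel("Movie")

ax.invert_yaxis()    # first movie in the list appears at the top
ax.margins(y=0.02)   # trim the extra padding above and below the bars

# Write each value just to the right of its bar, vertically centred.
for bar, value in zip(bars, profit_percent):
    ax.text(
        bar.get_width() + 2,
        bar.get_y() + bar.get_height() / 2,
        f"{value:.1f}%",
        va="center",
    )

# Leave room on the right so the labels are not clipped.
ax.set_xlim(0, max(profit_percent) * 1.15)
fig.tight_layout()
fig.savefig("profit_percent.png")
```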

Apply KNN Algorithm to make a Revenue Prediction Model

Describe the solution I'd like
Apply the K-Nearest Neighbors algorithm to build a revenue prediction system. Use genre_onehot, which is the one-hot encoded version of all movie genres, and only use the movies whose status == released. Also correlate it with the budget of the movie.

Describe alternatives you've considered
You can also use a linear regression model, or a logistic regression model if revenue is first binned into classes, since logistic regression is a classifier.
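
The idea behind the KNN request can be sketched in plain Python as follows. Everything here is an assumption made for illustration: the rows, the three-genre one-hot layout, the status string `"Released"`, and the helper names. In the notebook itself, `KNeighborsRegressor` from `sklearn.neighbors` would be the more usual tool for the same job.

```python
import math

# Hypothetical rows: (status, genre one-hot [action, comedy, drama], budget, revenue).
# The numbers are made up for illustration only.
rows = [
    ("Released", [1, 0, 0], 100.0, 300.0),
    ("Released", [1, 0, 0], 120.0, 340.0),
    ("Released", [0, 1, 0], 20.0, 50.0),
    ("Released", [0, 1, 0], 30.0, 70.0),
    ("Rumored", [0, 0, 1], 10.0, 0.0),  # excluded below: not released
]

# Train only on released movies, as the issue asks.
train = [(genres, budget, revenue) for status, genres, budget, revenue in rows
         if status == "Released"]

# Scale budget into [0, 1] so it does not swamp the 0/1 genre flags.
max_budget = max(budget for _, budget, _ in train)


def features(genre_onehot, budget):
    """Feature vector: genre flags followed by the scaled budget."""
    return list(genre_onehot) + [budget / max_budget]


def predict_revenue(genre_onehot, budget, k=2):
    """Average the revenue of the k nearest released movies (Euclidean distance)."""
    query = features(genre_onehot, budget)
    neighbours = sorted(
        (math.dist(query, features(g, b)), revenue) for g, b, revenue in train
    )[:k]
    return sum(revenue for _, revenue in neighbours) / len(neighbours)


print(predict_revenue([1, 0, 0], 110.0))  # 320.0
```

The two action movies are the nearest neighbours of the query, so the prediction is the mean of their revenues, (300 + 340) / 2 = 320.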

Preprocess the movies.csv

Describe the solution you'd like
Just as the data_movies dataframe is preprocessed, the features need to be picked from the movies_data dataframe.

Features to be picked
You need to pick these features from movies_data: 'title', 'tagline', 'genres', 'cast', 'director'.
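
A minimal pandas sketch of this step is shown below. The small `movies_data` frame is a hypothetical stand-in for the contents of movies.csv, and the `combined` column is an assumed convenience for later text vectorization rather than something the issue asks for.

```python
import pandas as pd

# Hypothetical stand-in for movies_data; the real frame would come from movies.csv.
movies_data = pd.DataFrame(
    {
        "title": ["Space Saga", "Love Letters"],
        "tagline": ["Beyond the stars", None],
        "genres": ["Action Adventure", "Romance Drama"],
        "cast": ["A. Actor B. Actor", "C. Actor"],
        "director": ["D. Director", None],
        "budget": [100, 20],  # extra column that is not selected
    }
)

selected_features = ["title", "tagline", "genres", "cast", "director"]

# Keep only the requested columns and replace missing values with empty strings.
features = movies_data[selected_features].fillna("")

# Join the non-empty text fields into one string per movie.
features["combined"] = features.apply(
    lambda row: " ".join(value for value in row if value), axis=1
)

print(features["combined"].tolist())
```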

Reported a bug

problem

Describe the bug
I got a warning that many columns have mixed types, along with the suggestion to specify the dtype option on import or set low_memory=False.
Setting low_memory=False does not solve the problem.
I have attached a screenshot; you can see the columns Unnamed: 3, 4, 5, 6 ... 1262.

Maybe the problem is with the dataset, or I don't know. Please look into this issue.

Expected behaviour
Based on the notebook in this repo, it should print this:
data_credits columns: Index(['movie_id', 'title', 'cast', 'crew'], dtype='object')

but I got the following:
data_credits columns: Index(['movie_id', 'title', 'cast', 'Unnamed: 3', 'Unnamed: 4', 'Unnamed: 5',
'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8', 'Unnamed: 9',
...
'Unnamed: 1255', 'Unnamed: 1256', 'Unnamed: 1257', 'Unnamed: 1258',
'Unnamed: 1259', 'Unnamed: 1260', 'Unnamed: 1261', 'Unnamed: 1262',
'Unnamed: 1263', 'Unnamed: 1264'],
dtype='object', length=1265)

Desktop

  • OS: [e.g. Windows]
  • Browser [e.g. chrome]
  • Version [e.g. 10]

Additional context
If this is my mistake, please point out the error I made or suggest a solution.
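
One way to narrow this down is to check the file itself before pandas reads it. The sketch below assumes, without confirmation from the report, that the extra columns come from a CSV whose header or quoting was altered (for example by re-saving it in a spreadsheet tool). It compares the header with the expected column names and counts how many fields each row has. The in-memory sample and the helper name `inspect_csv` are hypothetical.

```python
import csv
import io
from collections import Counter

EXPECTED_COLUMNS = ["movie_id", "title", "cast", "crew"]


def inspect_csv(handle):
    """Return the header row and a Counter of field counts over the data rows."""
    reader = csv.reader(handle)
    header = next(reader)
    widths = Counter(len(row) for row in reader)
    return header, widths


# Hypothetical miniature file: the second data row has lost its quoting,
# so the comma inside the JSON-like text splits it into an extra field.
sample = io.StringIO(
    'movie_id,title,cast,crew\n'
    '1,Movie A,"[{""name"": ""X"", ""order"": 0}]","[{""job"": ""Director""}]"\n'
    '2,Movie B,[{"name": "Y", "order": 1}],[{"job": "Editor"}]\n'
)

header, widths = inspect_csv(sample)
print(header == EXPECTED_COLUMNS)  # True
print(dict(widths))                # {4: 1, 5: 1}
```

For the real file, open it with `open("tmdb_5000_credits.csv", newline="", encoding="utf-8")` and pass the handle to `inspect_csv`. If the header is not exactly the four expected names, or rows have more than four fields, downloading a fresh copy of the dataset is probably the simplest fix.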
