This is a simple movie recommendation system built using Python and Streamlit. It recommends similar movies based on user input.
We developed Movie Match from a movie dataset collected on Kaggle: we merged the raw files, cleaned them, and preprocessed the text for analysis. We vectorized the text with techniques such as bag-of-words and used cosine similarity between the resulting movie vectors to identify the top 5 most similar movies for each recommendation. Our tech stack includes Python, Streamlit, NLTK (Natural Language Toolkit), Pandas, NumPy, and Scikit-learn.
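The bag-of-words and cosine similarity steps can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the `TAGS` data and the function names are hypothetical, and tokenization is a simple whitespace split. Scikit-learn's `CountVectorizer` and `cosine_similarity` perform the same steps efficiently on a full dataset.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count how often each lowercase token occurs in the text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, tags_by_title, top_n=5):
    """Return the top_n titles whose tags are most similar to the given title."""
    query = bag_of_words(tags_by_title[title])
    scored = [
        (cosine_similarity(query, bag_of_words(tags)), other)
        for other, tags in tags_by_title.items()
        if other != title  # never recommend the movie itself
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [other for _, other in scored[:top_n]]


# Hypothetical toy data standing in for the preprocessed movie text.
TAGS = {
    "Space Quest": "space adventure alien crew",
    "Alien Harbor": "alien space invasion crew",
    "Love in Paris": "romance paris love story",
    "Paris Nights": "romance paris drama",
    "Deep Ocean": "ocean adventure submarine crew",
}

if __name__ == "__main__":
    print(recommend("Space Quest", TAGS, top_n=2))
```

With this toy data, "Alien Harbor" shares three of four tokens with "Space Quest" and ranks first, while the romance titles share none and score zero.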
- To use the movie recommendation system, follow the instructions in the Prerequisites file.
- Movie data: The movie dataset used in this project is obtained from Kaggle. The data consists of two files: `movies.csv` and `credits.csv`.
- Movie posters: Movie posters are fetched from The Movie Database (TMDb) API using an API key.
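Fetching a poster might look roughly like the sketch below. It assumes the TMDb v3 movie details endpoint, a `poster_path` field in the JSON response, and the `w500` image size; these details, along with the helper names, are assumptions rather than a description of how this project's `app.py` does it, so check them against the TMDb documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumed TMDb v3 endpoint and image host; verify against the TMDb docs.
TMDB_API_BASE = "https://api.themoviedb.org/3/movie"
TMDB_IMAGE_BASE = "https://image.tmdb.org/t/p/w500"


def details_url(movie_id, api_key):
    """Build the movie details URL for a TMDb movie id."""
    query = urllib.parse.urlencode({"api_key": api_key})
    return f"{TMDB_API_BASE}/{movie_id}?{query}"


def poster_url(details):
    """Turn a movie details response (a dict) into a full poster URL, or None."""
    path = details.get("poster_path")  # assumed to look like "/abc.jpg"
    return f"{TMDB_IMAGE_BASE}{path}" if path else None


def fetch_poster(movie_id, api_key):
    """Request movie details over the network and return the poster URL."""
    with urllib.request.urlopen(details_url(movie_id, api_key), timeout=10) as resp:
        return poster_url(json.load(resp))
```

Keeping the URL construction separate from the network call makes the logic easy to check without an API key. In practice, the key is better read from an environment variable or a secrets file than hard-coded.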
- Python
- Streamlit
- NLTK (Natural Language Toolkit)
- Pandas
- Scikit-learn
- NumPy
- `app.py`: Main Streamlit application file.
- `model.ipynb`: Jupyter Notebook containing the code for training the recommendation model.
- `movies.csv`: Raw movie data.
- `credits.csv`: Additional movie data.
- `movie_list.pkl`: Serialized file containing processed movie data.
- `similarity.pkl`: Serialized file containing similarity scores between movies.
Contributions are welcome! Please feel free to open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.