starreco

starreco stands for State-of-The-Art Review Recommendation System.

starreco is a PyTorch Lightning implementation of a series of state-of-the-art (SOTA) deep learning rating-based recommendation systems. This repository also serves as part of the literature review for the author's master's thesis.

Features

  • More than 20 recommendation models drawn from 20 publications.
  • Built on top of PyTorch Lightning.
  • GPU-accelerated execution.
  • Reduced memory usage for large sparse matrices.
  • Simple and understandable code.
  • Easy extension and code reusability.

See Getting Started below for installation steps and a usage example.

Research Models

Research model | Description | Reference
MF | Matrix Factorization | [1]
GMF | Generalized Matrix Factorization | [2]
MLP | Multilayer Perceptrons | [2]
NeuMF | Neural Matrix Factorization | [2]
FM | Factorization Machine | [3]
NeuFM | Neural Factorization Machine | [4]
WDL | Wide & Deep Learning | [5]
DeepFM | Deep Factorization Machine | [6]
xDeepFM | Extreme Deep Factorization Machine | [7]
FGCNN | Feature Generation by Convolutional Neural Network | [8]
ONCF | Outer Product-based Neural Collaborative Filtering | [9]
CNNDCF | Convolutional Neural Network based Deep Collaborative Filtering | [10]
ConvMF | Convolutional Matrix Factorization | [11]
AutoRec | AutoRec | [12]
DeepRec | DeepRec | [13]
CFN | Collaborative Filtering Network | [14]
CDAE | Collaborative Denoising AutoEncoder | [15]
CCAE | Collaborative Convolutional AutoEncoder | [16]
SDAECF | Stacked Denoising AutoEncoder for Collaborative Filtering | [17]
mDACF | marginalized Denoising AutoEncoder Collaborative Filtering | [18]
GMF++ | Generalized Matrix Factorization ++ | [19]
MLP++ | Multilayer Perceptrons ++ | [19]
NeuMF++ | Neural Matrix Factorization ++ | [20]
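
To give a feel for the simplest entry in the table, MF [1] predicts a rating as the dot product of a user vector and an item vector. The block below is a minimal pure-Python sketch of that idea trained with plain SGD on squared error. It is not starreco's implementation; the function names (train_mf, predict, mse), the hyperparameters, and the toy ratings are all assumptions made for illustration.

```python
import random

def train_mf(ratings, num_users, num_items, dim=4, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Fit user and item factors with SGD on squared error (illustrative only)."""
    rng = random.Random(seed)
    # Small random initial factors: one row per user (P) and per item (Q).
    P = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(dim):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step with L2 regularisation on both factors.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def mse(P, Q, ratings):
    """Mean squared error over (user, item, rating) triples."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)

# Toy (user, item, rating) triples, made up for this example.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P0, Q0 = train_mf(ratings, 3, 3, epochs=0)   # untrained baseline
P, Q = train_mf(ratings, 3, 3, epochs=200)   # trained factors
print("MSE before:", mse(P0, Q0, ratings), "after:", mse(P, Q, ratings))
```

The neural variants in the table (GMF, MLP, NeuMF) replace the fixed dot product with learned interaction layers, as described in [2].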

Datasets

  • MovieLens Dataset: A movie rating dataset collected from the MovieLens website by the GroupLens Research Project at the University of Minnesota. The datasets were collected over various periods of time, depending on their size. The MovieLens 1M Dataset has been chosen. It contains 1,000,209 anonymous ratings of approximately 3,900 movies made by 6,040 MovieLens users who joined MovieLens in 2000.

  • BookCrossing Dataset: The BookCrossing (BX) dataset was collected by Cai-Nicolas Ziegler in a 4-week crawl (August / September 2004) from the Book-Crossing community with kind permission from Ron Hornbaker, CTO of Humankind Systems. It contains 278,858 users (anonymized but with demographic information) providing 1,149,780 ratings (explicit / implicit) about 271,379 books.
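
The MovieLens 1M figures quoted above show how sparse the rating matrix is. The sketch below parses rating lines in the `UserID::MovieID::Rating::Timestamp` layout used by the MovieLens 1M `ratings.dat` file and computes the matrix density from the counts in the description. The helper names (parse_ratings, density) and the sample lines are illustrative assumptions. This is not how StarDataModule loads the data.

```python
from typing import Iterable, List, Tuple

def parse_ratings(lines: Iterable[str]) -> List[Tuple[int, int, float]]:
    """Parse 'UserID::MovieID::Rating::Timestamp' lines into (user, movie, rating)."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, movie_id, rating, _timestamp = line.split("::")
        rows.append((int(user_id), int(movie_id), float(rating)))
    return rows

def density(num_ratings: int, num_users: int, num_items: int) -> float:
    """Fraction of the user-item matrix that holds an observed rating."""
    return num_ratings / (num_users * num_items)

# Illustrative lines in the ratings.dat layout.
sample = ["1::1193::5::978300760", "1::661::3::978302109", "2::1357::5::978298709"]
rows = parse_ratings(sample)

# Counts taken from the dataset description above (movie count is approximate).
ml1m_density = density(1_000_209, 6_040, 3_900)
print(rows)
print(f"MovieLens 1M density: {ml1m_density:.2%}")
```

With roughly 4% of the cells filled, storing only the observed ratings takes far less memory than a dense matrix.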

Getting Started

Installation

Create virtual environment

python3 -m virtualenv env # Python 3.6 and above

Activate virtual environment

source env/bin/activate # Linux
./env/Scripts/activate # Windows

Clone and install necessary python packages

git clone https://github.com/KyleOng/star-reco
cd star-reco
pip install -r requirements.txt

Example

import os

import torch
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger
from pytorch_lightning.callbacks import ModelCheckpoint

from starreco.modules import *
from starreco.data import *
    
# data module
data_module = StarDataModule("ml-1m")
data_module.setup()
    
# module
module = MF([data_module.dataset.rating.num_users, data_module.dataset.rating.num_items],
            lr = 0.007629571188584098,
            weight_decay = 1.0643056040513936e-05)

# setup
# checkpoint callback
current_version = max(0, len(list(os.walk("checkpoints/mf")))-1)
checkpoint_callback = ModelCheckpoint(dirpath = f"checkpoints/mf/version_{current_version}",
                                      monitor = "val_loss",
                                      filename = "mf-{epoch:02d}-{train_loss:.4f}-{val_loss:.4f}")
# logger
logger = TensorBoardLogger("training_logs", name = "mf")
# trainer
trainer = Trainer(logger = logger,
                  gpus = -1 if torch.cuda.is_available() else None, 
                  max_epochs = 100, 
                  progress_bar_refresh_rate = 2,
                  callbacks=[checkpoint_callback])
trainer.fit(module, data_module)

# evaluate
module_test = MF.load_from_checkpoint(checkpoint_callback.best_model_path)
trainer.test(module_test, datamodule = data_module)
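
The Trainer arguments `gpus` and `progress_bar_refresh_rate` used above come from older PyTorch Lightning releases and are not accepted by Lightning 2.x, which selects hardware through `accelerator` and `devices` and sets the refresh rate through the `TQDMProgressBar` callback. The sketch below shows one way the example might be adapted. `trainer_device_kwargs` is a hypothetical helper and not part of starreco; check it against the Lightning version pinned in requirements.txt before relying on it.

```python
def trainer_device_kwargs(cuda_available: bool) -> dict:
    """Hypothetical helper: mirror `gpus = -1 if cuda else None` with newer arguments."""
    if cuda_available:
        # devices=-1 asks Lightning to use every visible GPU.
        return {"accelerator": "gpu", "devices": -1}
    # Fall back to a single CPU process.
    return {"accelerator": "cpu", "devices": 1}

# Intended usage (needs torch and pytorch_lightning, so shown as comments):
#   from pytorch_lightning.callbacks import TQDMProgressBar
#   trainer = Trainer(logger = logger,
#                     max_epochs = 100,
#                     callbacks = [checkpoint_callback, TQDMProgressBar(refresh_rate = 2)],
#                     **trainer_device_kwargs(torch.cuda.is_available()))
print(trainer_device_kwargs(False))
```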

References

[1] Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30-37.

[2] He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).

[3] Rendle, S. (2010, December). Factorization machines. In 2010 IEEE International Conference on Data Mining (pp. 995-1000). IEEE.

[4] He, X., & Chua, T. S. (2017, August). Neural factorization machines for sparse predictive analytics. In Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval (pp. 355-364).

[5] Cheng, H. T., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., ... & Shah, H. (2016, September). Wide & deep learning for recommender systems. In Proceedings of the 1st workshop on deep learning for recommender systems (pp. 7-10).

[6] Guo, H., Tang, R., Ye, Y., Li, Z., & He, X. (2017). DeepFM: a factorization-machine based neural network for CTR prediction. arXiv preprint arXiv:1703.04247.

[7] Lian, J., Zhou, X., Zhang, F., Chen, Z., Xie, X., & Sun, G. (2018, July). xdeepfm: Combining explicit and implicit feature interactions for recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 1754-1763).

[8] Liu, B., Tang, R., Chen, Y., Yu, J., Guo, H., & Zhang, Y. (2019, May). Feature generation by convolutional neural network for click-through rate prediction. In The World Wide Web Conference (pp. 1119-1129).

[9] He, X., Du, X., Wang, X., Tian, F., Tang, J., & Chua, T. S. (2018). Outer product-based neural collaborative filtering. arXiv preprint arXiv:1808.03912.

[10] Wu, Y., Wei, J., Yin, J., Liu, X., & Zhang, J. (2020). Deep Collaborative Filtering Based on Outer Product. IEEE Access, 8, 85567-85574.

[11] Kim, D., Park, C., Oh, J., Lee, S., & Yu, H. (2016, September). Convolutional matrix factorization for document context-aware recommendation. In Proceedings of the 10th ACM conference on recommender systems (pp. 233-240).

[12] Sedhain, S., Menon, A. K., Sanner, S., & Xie, L. (2015, May). Autorec: Autoencoders meet collaborative filtering. In Proceedings of the 24th international conference on World Wide Web (pp. 111-112).

[13] Kuchaiev, O., & Ginsburg, B. (2017). Training deep autoencoders for collaborative filtering. arXiv preprint arXiv:1708.01715.

[14] Strub, F., Mary, J., & Gaudel, R. (2016). Hybrid collaborative filtering with autoencoders. arXiv preprint arXiv:1603.00806.

[15] Wu, Y., et al. (2016). Collaborative denoising auto-encoders for top-n recommender systems. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining. ACM.

[16] Zhang, S. Z., Li, P. H., & Chen, X. N. (2019, December). Collaborative Convolution AutoEncoder for Recommendation Systems. In Proceedings of the 2019 8th International Conference on Networks, Communication and Computing (pp. 202-207).

[17] Strub, F., & Mary, J. (2015, December). Collaborative filtering with stacked denoising autoencoders and sparse inputs. In NIPS workshop on machine learning for eCommerce.

[18] Li, S., Kawale, J., & Fu, Y. (2015, October). Deep collaborative filtering via marginalized denoising auto-encoder. In Proceedings of the 24th ACM international on conference on information and knowledge management (pp. 811-820).

[19] Liu, Y., Wang, S., Khan, M. S., & He, J. (2018). A novel deep hybrid recommender system based on auto-encoder with neural collaborative filtering. Big Data Mining and Analytics, 1(3), 211-221.

[20] To be published.
