Recommender_System

This task was part of an assignment in the course SDSC 3002 (Data Mining), taught by Dr. Yu Yang at City University of Hong Kong. The task was as follows:

Download the files "training.txt", "testing.txt" and "item_tag.txt". In the file "training.txt", each line is in the form

$u, i, r_{ui}$

which means that the rating of user $u$ on item $i$ is $r_{ui}$.

In the file "testing.txt", each line is represented by u,i,? which means you are required to predict the rating of user u on item i. Use the training dataset "training.txt" to build a recommender system and make predictions for the testing dataset "testing.txt" by replacing all the "?" with your predicted ratings. All the ratings are within the range [0,5].
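The file handling described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes fields are separated by plain commas (as the `u,i,?` form suggests), and the helper names `parse_training` and `fill_testing`, the three-decimal output format, and the toy rows are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Rating = Tuple[str, str, float]


def parse_training(lines: Iterable[str]) -> List[Rating]:
    """Parse 'u,i,r' lines into (user, item, rating) triples."""
    ratings = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, item, rating = line.split(",")
        ratings.append((user, item, float(rating)))
    return ratings


def fill_testing(lines: Iterable[str],
                 predict: Callable[[str, str], float]) -> List[str]:
    """Replace the '?' in each 'u,i,?' line with a prediction clipped to [0, 5]."""
    filled = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user, item, _ = line.split(",")
        estimate = min(5.0, max(0.0, predict(user, item)))
        filled.append(f"{user},{item},{estimate:.3f}")
    return filled


# Toy rows standing in for the real files (made-up values).
train = parse_training(["1,10,4.0", "1,11,3.5", "2,10,5.0"])
global_mean = sum(r for _, _, r in train) / len(train)

# Placeholder predictor: always answer with the global mean rating.
output = fill_testing(["2,11,?"], lambda user, item: global_mean)
print(output)
```

With the real data, the same functions could be fed `open("training.txt")` and `open("testing.txt")`, and the placeholder predictor swapped for a trained model.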

You may also want to use the file "item_tag.txt", where each line $i, t_1, t_2, \dots, t_k$ indicates that the item $i$ has tags $t_1, t_2, \dots, t_k$. Note that some items may not have any tags, so it is normal if you cannot find some items in the file "item_tag.txt".
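A small sketch of reading the tag file, under the assumption that each line holds the item id followed by its tags, all separated by commas (the exact delimiter is not stated here). The function name `parse_item_tags` and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, List


def parse_item_tags(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each item id to its list of tags (assumes comma-separated fields)."""
    tags_by_item: Dict[str, List[str]] = {}
    for line in lines:
        fields = [field.strip() for field in line.strip().split(",") if field.strip()]
        if not fields:
            continue  # skip blank lines
        item, *tags = fields
        tags_by_item[item] = tags
    return tags_by_item


# Made-up rows standing in for "item_tag.txt".
item_tags = parse_item_tags(["10,3,7,12", "11,7"])

# Items missing from the file simply have no tags.
tags_for_unknown = item_tags.get("99", [])
print(item_tags, tags_for_unknown)
```

Looking items up with `.get(item, [])` handles the case where an item does not appear in the file at all.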

Solution Approach:

This recommendation system assumes that users who liked an item in the past will continue to like similar items in the future, so similar items should receive similar ratings from the same user.

This recommendation system uses singular value decomposition (SVD), a collaborative filtering method. SVD works on a matrix with users as rows and items as columns, where each element is the corresponding user's rating of the item. It decomposes this matrix into three other matrices and extracts latent factors from the factorization of the high-dimensional rating matrix.
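To make the user-item matrix concrete, here is a minimal sketch that arranges rating triples into rows (users) and columns (items). The function name `build_rating_matrix` and the toy ratings are hypothetical. Unrated cells are stored as `None`; those are the entries the model is asked to predict.

```python
from typing import List, Optional, Tuple

Rating = Tuple[str, str, float]


def build_rating_matrix(
    ratings: List[Rating],
) -> Tuple[List[List[Optional[float]]], List[str], List[str]]:
    """Return (matrix, users, items) with users as rows and items as columns."""
    users = sorted({user for user, _, _ in ratings})
    items = sorted({item for _, item, _ in ratings})
    row = {user: k for k, user in enumerate(users)}
    col = {item: k for k, item in enumerate(items)}

    # None marks a missing rating, i.e. a user-item pair never observed.
    matrix: List[List[Optional[float]]] = [[None] * len(items) for _ in users]
    for user, item, rating in ratings:
        matrix[row[user]][col[item]] = rating
    return matrix, users, items


# Made-up ratings for illustration.
toy = [("u1", "a", 4.0), ("u1", "b", 3.0), ("u2", "a", 5.0)]
matrix, users, items = build_rating_matrix(toy)
observed = sum(cell is not None for r in matrix for cell in r)
print(matrix, f"{observed}/{len(users) * len(items)} cells observed")
```

In real rating data most cells are empty, which is why recommender-oriented factorization methods are usually fit on the observed entries only rather than on a fully populated matrix.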

The SVD model of this recommendation system is built from the users' past behavior, that is, the ratings each user gave to items. The model learns the association between users and items through latent features, and then uses those features to predict either the items a user may be interested in or the rating a user would give an item. In this task, the model is trained to predict the rating of a user on a specific item.

Each prediction takes a pair (u, i) as input, where u is the user_id and i is the item_id. The model is trained with the Surprise library (scikit-surprise). To reduce the error between the actual and predicted ratings, bias terms are included. The bias term is shown below.

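$b_{ui} = \mu + b_u + b_i$

$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$

where $\mu$ is the global mean rating, $b_u$ and $b_i$ are the user and item biases, and $p_u$ and $q_i$ are the user and item factor vectors.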
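As a worked illustration of a biased prediction, the sketch below assumes the standard biased-SVD rule: global mean plus user bias plus item bias plus the dot product of the user and item factor vectors, clipped to the rating range [0, 5]. The function name and all numeric values are made up for the example.

```python
from typing import Sequence


def predict(mu: float, b_u: float, b_i: float,
            p_u: Sequence[float], q_i: Sequence[float]) -> float:
    """Biased estimate mu + b_u + b_i + q_i . p_u, clipped to [0, 5]."""
    interaction = sum(p * q for p, q in zip(p_u, q_i))
    estimate = mu + b_u + b_i + interaction
    return min(5.0, max(0.0, estimate))


# Made-up parameters: a slightly generous user, a slightly unpopular item.
estimate = predict(mu=3.5, b_u=0.25, b_i=-0.5, p_u=[0.5, 1.0], q_i=[1.0, 0.25])
print(estimate)  # 3.5 + 0.25 - 0.5 + (0.5*1.0 + 1.0*0.25) = 4.0

# A user never seen in training can be handled with zero bias and zero factors.
cold_start = predict(mu=3.5, b_u=0.0, b_i=-0.5, p_u=[0.0, 0.0], q_i=[1.0, 0.25])
print(cold_start)  # 3.0
```

The bias terms absorb effects that do not depend on the user-item interaction, such as a user who rates everything high, so the factor vectors only need to explain what remains.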

To estimate the unknowns, we minimized the following regularized error:

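$\sum_{r_{ui} \in R_{train}} \left(r_{ui} - \hat{r}_{ui}\right)^2 + \lambda \left(b_i^2 + b_u^2 + \|q_i\|^2 + \|p_u\|^2\right)$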

Stochastic gradient descent (SGD) is used to minimize the error. n_epochs, the number of SGD iterations over the training set, is a tunable parameter, as is n_factors, the number of latent factors. By default, the learning rate is set to 0.005 and the regularization terms are set to 0.02.
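The SGD procedure can be sketched in plain Python as below. This is an illustrative re-implementation under stated assumptions, not the library's code: it assumes the biased prediction rule (global mean plus user bias plus item bias plus factor dot product), squared error with L2 regularization, and factors initialized from a normal distribution with standard deviation 0.1. The names `train_biased_mf` and `rmse` and the toy ratings are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Rating = Tuple[str, str, float]
Model = Tuple[float, Dict[str, float], Dict[str, float],
              Dict[str, List[float]], Dict[str, List[float]]]


def train_biased_mf(ratings: List[Rating], n_factors: int = 2, n_epochs: int = 20,
                    lr: float = 0.005, reg: float = 0.02, seed: int = 0) -> Model:
    """Fit biases and latent factors by SGD on the regularized squared error."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    b_u = {u: 0.0 for u, _, _ in ratings}
    b_i = {i: 0.0 for _, i, _ in ratings}
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in b_u}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in b_i}

    for _ in range(n_epochs):  # one epoch = one pass over all ratings
        for u, i, r in ratings:
            pred = mu + b_u[u] + b_i[i] + sum(a * b for a, b in zip(p[u], q[i]))
            err = r - pred
            # Gradient steps: move along the error, shrink by regularization.
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)
    return mu, b_u, b_i, p, q


def rmse(ratings: List[Rating], model: Model) -> float:
    """Root mean squared error of the model on the given ratings."""
    mu, b_u, b_i, p, q = model
    total = 0.0
    for u, i, r in ratings:
        pred = mu + b_u[u] + b_i[i] + sum(a * b for a, b in zip(p[u], q[i]))
        total += (r - pred) ** 2
    return (total / len(ratings)) ** 0.5


# Made-up ratings; compare the untrained model with one trained for 200 epochs.
toy = [("1", "10", 4.0), ("1", "11", 3.0), ("2", "10", 5.0), ("2", "11", 4.0)]
baseline_rmse = rmse(toy, train_biased_mf(toy, n_epochs=0))
trained_rmse = rmse(toy, train_biased_mf(toy, n_epochs=200))
print(baseline_rmse, trained_rmse)
```

Raising `n_epochs` lets SGD take more passes over the data, while `n_factors` controls how many latent dimensions are available to explain the ratings. Both are typically tuned against held-out data rather than training error.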

Contributors

mdanish99
