swat1563 / recommendation-system

This repository features a recommendation system and analytics engine using datasets on users, organizations, contents, contacts, events, and recommendations. It includes data preprocessing, building a recommendation system, and creating visual reports with Power BI.

Jupyter Notebook 100.00%
analytics data-analysis data-visualization engine kaggle numpy pandas powerbi python recommendation-engine recommendation-system recommender-systems scikit-learn scipy powerbi-dashboards powerbi-desktop powerbi-reports

Recommendation System and Analytics Engine

Overview

This repository contains the solution for the skill set assessment task. The task involves creating an analytics engine to produce various reports, insights, and analyses using the provided datasets. The primary goal was to develop a recommendation system based on system scores for multiple users across different categories such as content, contacts, and events.

Datasets

The datasets provided include:

  1. Users
  2. Organizations
  3. Contents
  4. Contacts
  5. Events
  6. Recommendations

Task Breakdown

1. Data Preprocessing

Tools and Libraries

  • Kaggle Platform
  • Python
    • NumPy
    • Pandas

Steps and Cleaning

  1. Users Dataset

    • Removed empty columns: city, country, state, phone_number, linkedin_url, description.
  2. Organizations Dataset

    • Removed empty columns: email, year_founded, phone_number, linkedin_url.
  3. Contents Dataset

    • Removed empty columns: organisation_id, creator_id.
    • Removed unhelpful column: content_type (all records had the same value).
    • Cast id column from float to integer.
    • Removed leading spaces from column names.
    • Dropped records without id values.
  4. Events Dataset

    • Removed empty column: organisation_id.

    • Corrected location value for id 854 from ',' to 'online'.

    • Repaired the row with id 438, whose title field held 68 concatenated records: identified the erroneous row, parsed the concatenated string into individual records, split them into 68 distinct rows, reinserted those rows into the dataset, and verified that all other fields remained consistent and accurate. [Figure: row 438, with all 68 records in a single title field]

    • Reformatted Price column: converted 'Free' to 0 and string values to float.

    • Reformatted location into three columns: meeting, state, city.

  5. Contacts Dataset

    • Removed empty columns: organisation_id, picture_name, position, gender, phone_number.
    • Removed unhelpful column: role_id (all records had the same value).
  6. Recommendations Dataset

    • Removed empty column: user_score.
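
Several of the cleaning steps listed above recur across datasets, so they can be sketched as small pandas helpers. This is a minimal illustration and not the notebook's actual code: the function names, the sample frame, and the assumption that non-free prices are plain numeric strings are all hypothetical. The steps for the Contents dataset are reordered so that rows without an id are dropped before the cast to integer, since a column containing missing values cannot be cast to a plain integer dtype.

```python
import pandas as pd


def drop_empty_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns in which every value is missing."""
    return df.dropna(axis=1, how="all")


def drop_constant_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns that hold a single distinct value across all rows."""
    keep = [col for col in df.columns if df[col].nunique(dropna=False) > 1]
    return df[keep]


def clean_contents(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the Contents steps: strip names, drop empties, fix the id column."""
    df = df.rename(columns=lambda name: name.strip())  # remove leading spaces
    df = drop_empty_columns(df)
    df = df.dropna(subset=["id"])                      # records without an id
    df = df.astype({"id": "int64"})                    # float -> integer
    return drop_constant_columns(df)                   # e.g. content_type


def parse_price(value) -> float:
    """Map 'Free' to 0.0 and numeric strings to float (assumed price format)."""
    text = str(value).strip()
    return 0.0 if text.lower() == "free" else float(text)


# Hypothetical sample data, not taken from the real Contents dataset.
raw = pd.DataFrame({
    " id": [1.0, 2.0, None],
    "title": ["a", "b", "c"],
    "content_type": ["article", "article", "article"],
    "creator_id": [None, None, None],
})
cleaned = clean_contents(raw)
```

For the Events dataset, something like `events["price"].map(parse_price)` would apply the price conversion column-wide, assuming the column is named `price` after the column names are normalized.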

2. Categorizing Recommendations

The recommendations table was divided into three categories based on asset_type:

  1. Content Recommendations
    • Renamed asset_id to content_id.
  2. Event Recommendations
    • Renamed asset_id to event_id.
  3. Contact Recommendations
    • Renamed asset_id to contact_id.
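
The split described above could look roughly like the following sketch. The `asset_type` labels (`"content"`, `"event"`, `"contact"`) and the function name are assumptions made for illustration; the actual labels in the dataset may differ, which is why the mapping is passed as a parameter.

```python
import pandas as pd

# Assumed asset_type labels mapped to the renamed id column.
DEFAULT_ID_COLUMNS = {
    "content": "content_id",
    "event": "event_id",
    "contact": "contact_id",
}


def split_recommendations(recs: pd.DataFrame, id_columns=None) -> dict:
    """Return one table per asset_type, with asset_id renamed accordingly."""
    id_columns = id_columns or DEFAULT_ID_COLUMNS
    tables = {}
    for asset_type, id_column in id_columns.items():
        subset = recs[recs["asset_type"] == asset_type]
        tables[asset_type] = (
            subset.drop(columns="asset_type")        # constant within a table
            .rename(columns={"asset_id": id_column})
            .reset_index(drop=True)
        )
    return tables
```

Dropping `asset_type` from each resulting table is a design choice of this sketch: once the table is split, the column carries a single value.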

3. Data Export Post-Cleanup

The cleaned datasets were exported for further use in building the recommendation system and analytics engine.

4. Recommendation System

Tools and Libraries

  • Kaggle Platform
  • Python
    • NumPy
    • Pandas
    • Scikit-learn
    • SciPy

Methodology

A collaborative filtering recommendation system was developed for content recommendations based on user_id, content_id, and system_score. The system_score is assumed to reflect user interactions like clicks or time spent on content. The same algorithm can be applied to events and contacts.
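
As a rough illustration of the data structure behind this approach, the sketch below pivots the long (user_id, content_id, system_score) table into a user-item matrix and computes cosine similarity between users. The function names are hypothetical, and treating a missing score as 0 and averaging duplicate pairs are assumptions of this sketch, not confirmed details of the notebook. scikit-learn's `cosine_similarity` computes the same quantity; plain NumPy is used here to keep the example dependency-light.

```python
import numpy as np
import pandas as pd


def build_user_item_matrix(recs: pd.DataFrame) -> pd.DataFrame:
    """Rows are users, columns are contents, cells are system_score."""
    return recs.pivot_table(
        index="user_id",
        columns="content_id",
        values="system_score",
        aggfunc="mean",   # assumption: average duplicate (user, content) pairs
        fill_value=0.0,   # assumption: no score is treated as 0
    )


def cosine_similarity_matrix(matrix: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a dense matrix."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = matrix / norms
    return unit @ unit.T


# Hypothetical sample scores, not taken from the real dataset.
sample = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "content_id": [10, 11, 10, 12, 12],
    "system_score": [0.9, 0.4, 0.8, 0.7, 0.6],
})
user_item = build_user_item_matrix(sample)
user_similarity = cosine_similarity_matrix(user_item.to_numpy(dtype=float))
```

Passing the transposed matrix instead yields content-to-content similarity, which is how the same structure can serve both kinds of lookup.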

Services Provided
  1. User-based Recommendations
    • Input: user_id
    • Output: Top 5 recommended contents for the user.
  2. Content-based Recommendations
    • Input: content_id
    • Output: Top 5 contents similar to the provided content.
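
The two services above might be sketched as follows, given a user-item score matrix (rows are user ids, columns are content ids, 0 meaning no score). This is one plausible reading and not the notebook's confirmed method: the function names are hypothetical, the user service weights other users' scores by cosine similarity and only ranks contents the user has no score for, and the content service ranks contents by cosine similarity of their score columns.

```python
import numpy as np
import pandas as pd


def _cosine(matrix: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a dense matrix."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = matrix / norms
    return unit @ unit.T


def recommend_for_user(scores: pd.DataFrame, user_id, top_n: int = 5) -> list:
    """Top-N contents for a user, weighted by similarity to other users."""
    values = scores.to_numpy(dtype=float)
    user_sim = pd.DataFrame(_cosine(values), index=scores.index, columns=scores.index)
    weights = user_sim.loc[user_id].drop(user_id)
    # Similarity-weighted sum of the other users' scores for each content.
    predicted = scores.drop(index=user_id).mul(weights, axis=0).sum(axis=0)
    unseen = predicted[scores.loc[user_id] == 0]  # assumption: skip scored items
    return unseen.sort_values(ascending=False).head(top_n).index.tolist()


def similar_contents(scores: pd.DataFrame, content_id, top_n: int = 5) -> list:
    """Top-N contents whose score columns are most similar to content_id."""
    values = scores.to_numpy(dtype=float).T
    item_sim = pd.DataFrame(_cosine(values), index=scores.columns, columns=scores.columns)
    ranked = item_sim[content_id].drop(content_id).sort_values(ascending=False)
    return ranked.head(top_n).index.tolist()


# Hypothetical score matrix, not taken from the real dataset.
demo_scores = pd.DataFrame(
    [[5, 4, 0, 0], [5, 4, 3, 0], [0, 0, 1, 5]],
    index=["u1", "u2", "u3"],
    columns=["c1", "c2", "c3", "c4"],
)
```

With this sample, `recommend_for_user(demo_scores, "u1", top_n=1)` ranks `c3` first, because the only user similar to `u1` scored it, and `similar_contents(demo_scores, "c1", top_n=2)` ranks `c2` ahead of `c3`.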

5. Analytics Engine and Visualization

Tool

  • Power BI Desktop


Reports

Various reports covering all datasets were created and visualized in Power BI:

  1. Users (figure: Users Report)
  2. Organizations (figure: Organizations Report)
  3. Contents (figure: Contents Report)
  4. Contacts (figure: Contacts Report)
  5. Events (figure: Events Report)
  6. Content Recommendations System (figure: Content Recommendations Report)
  7. Contact Recommendations System (figure: Contact Recommendations Report)
  8. Event Recommendations System (figure: Event Recommendations Report)

Repository Structure

  • Data Wrangling Code: Kaggle Notebook
  • Content Recommendations System Code: Kaggle Notebook
  • Power BI Reports PDF: Available in the reports directory.
  • Analytics Engine Files: Power BI files (.pbit, .pbix) for interaction and visualization, available in the repository.
