
Multi-level Cross-view Contrastive Learning for Knowledge-aware Recommender System

This is our Pytorch implementation for the paper:

Ding Zou, Wei Wei, Xian-Ling Mao, Ziyang Wang, Minghui Qiu, Feida Zhu, Xin Cao (2022). Multi-level Cross-view Contrastive Learning for Knowledge-aware Recommender System. In SIGIR'22 (paper available on arXiv).

Introduction

Multi-level Cross-view Contrastive Learning for Knowledge-aware Recommender System (MCCLK) is a knowledge-aware recommendation method based on GNNs and contrastive learning. It proposes a multi-level cross-view contrastive framework to enhance representation learning from multiple facets.

Requirement

The code has been tested running under Python 3.7.9. The required packages are as follows:

  • pytorch == 1.5.0
  • numpy == 1.15.4
  • scipy == 1.1.0
  • sklearn == 0.20.0
  • torch_scatter == 2.0.5
  • torch_sparse == 0.6.10
  • networkx == 2.5
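
If it helps with setup, the list above can be pinned in a requirements.txt along the lines of the sketch below (PyPI package names assumed: torch for pytorch, scikit-learn for sklearn). Note that, as raised in the issues further down, torch_sparse == 0.6.10 may require a newer PyTorch than 1.5.0, so these versions may need adjusting on your machine:

    torch==1.5.0
    numpy==1.15.4
    scipy==1.1.0
    scikit-learn==0.20.0
    torch-scatter==2.0.5
    torch-sparse==0.6.10
    networkx==2.5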

Usage

The hyper-parameter search ranges and optimal settings are documented in the code (see the parser function in utils/parser.py; a generic sketch of such a parser follows the usage example below).

  • Train and Test
python main.py 
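
For orientation, a hyper-parameter parser in the spirit of utils/parser.py might look like the sketch below; the flag names and defaults here are illustrative assumptions, so consult the actual parser function for the real search ranges and optimal settings.

    # Hypothetical sketch of a parser like utils/parser.py; the flag names
    # and defaults are illustrative assumptions, not the repository's values.
    import argparse

    def parse_args():
        parser = argparse.ArgumentParser(description="MCCLK")
        parser.add_argument("--dataset", type=str, default="music",
                            help="dataset folder under data/ to train on")
        parser.add_argument("--epoch", type=int, default=1000, help="number of epochs")
        parser.add_argument("--batch_size", type=int, default=2048, help="batch size")
        parser.add_argument("--lr", type=float, default=1e-4, help="learning rate")
        parser.add_argument("--l2", type=float, default=1e-5, help="l2 regularization weight")
        return parser.parse_args()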

Citation

If you use our code or datasets in your research, please cite:

@inproceedings{mcclk2022,
  author    = {Zou, Ding and
               Wei, Wei and
               Mao, Xian-Ling and
               Wang, Ziyang and
               Qiu, Minghui and
               Zhu, Feida and
               Cao, Xin},
  title     = {Multi-level Cross-view Contrastive Learning for Knowledge-aware Recommender System},
  booktitle = {Proceedings of the 45th International {ACM} {SIGIR} Conference on
               Research and Development in Information Retrieval, {SIGIR} 2022, Madrid,
               Spain, July 11-15, 2022},
  year      = {2022},
}

Dataset

We provide three processed datasets: Book-Crossing, MovieLens-1M, and Last.FM.

We follow the paper "RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems" to process the data.

                                           Book-Crossing   MovieLens-1M   Last.FM
    User-Item Interaction  #Users                 17,860          6,036     1,872
                           #Items                 14,967          2,445     3,846
                           #Interactions         139,746        753,772    42,346
    Knowledge Graph        #Entities              77,903        182,011     9,366
                           #Relations                 25             12        60
                           #Triplets             151,500      1,241,996    15,518

Reference

  • We partially reuse the code of KGIN.
  • All other baselines can be found on GitHub.


mcclk's Issues

Environment setup problem

Hello, PyTorch 1.5.0 does not seem to support torch_sparse == 0.6.10; it appears to require PyTorch 1.8 or above. After setting up the environment as described in the README, I got the following error, which seems to be caused by torch_sparse. How did you resolve this?
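
A frequent cause of errors like this is a mismatch between the installed PyTorch and the torch_scatter / torch_sparse binaries, which are compiled against one specific PyTorch/CUDA combination. A generic sanity check (not specific to this repository):

    # Verify that torch and its sparse extensions import cleanly and report
    # their versions; an undefined-symbol or ImportError from torch_sparse
    # usually means its wheel was built for a different torch/CUDA version.
    import torch
    print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)

    import torch_scatter
    import torch_sparse
    print("torch_scatter:", torch_scatter.__version__)
    print("torch_sparse:", torch_sparse.__version__)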

Metrics inconsistent with the paper

Is the music folder under data the Last.FM dataset from the paper? I ran your code and could not reach the reported metrics.

Computing the recall metric

Hello authors. While reproducing the released code, I followed your comments and extended train_res with ["Recall@5", "Recall@10", "Recall@20", "Recall@50", "Recall@100"], but the resulting recall values are NaN. I have spent a long time debugging without success and would like to ask you about the details.
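
A NaN recall is often a 0/0: if some user has no positive items in the test split, hits divided by zero positives is NaN, and averaging propagates it. A minimal guarded Recall@K sketch (generic code with hypothetical variable names, not the repository's evaluation function):

    import numpy as np

    def recall_at_k(ranked_items, ground_truth, k):
        # Return None for users with no test positives, so the caller can
        # skip them instead of averaging a NaN into the final metric.
        if len(ground_truth) == 0:
            return None
        hits = len(set(ranked_items[:k]) & set(ground_truth))
        return hits / len(ground_truth)

    # Hypothetical per-user data: ranked item lists and test ground truth.
    all_rankings = [[3, 1, 7], [2, 5, 9]]
    all_truths   = [[1, 8], []]          # second user has no test positives

    scores = [recall_at_k(r, t, 20) for r, t in zip(all_rankings, all_truths)]
    scores = [s for s in scores if s is not None]
    recall_20 = float(np.mean(scores)) if scores else 0.0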

Long training time

Why does training on the Last.FM dataset finish quickly, while MovieLens-20M and Book-Crossing take extremely long and run mostly on the CPU? Has anyone run into the same situation? Could we discuss it?

A problem about kNN sparsification

Hello authors. I have a question about the construction of the semantic view in Section 4.1 of the paper, where an aggregation process is first used to build representations of item and entity nodes (Equation 1). I am a bit confused about the relationship between Equation 1 and Equation 5, since e_i^{(k+1)} appears in both. Is Equation 1 involved in the later loss computation?

In addition, I noticed that the authors use the top-k function in the code to keep the highest-scoring edges:
knn_val, knn_ind = torch.topk(sim, topk, dim=-1)
But is the top-k operation differentiable? Wouldn't it break the gradient flow of the preceding computation?
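
For readers hitting the same question: torch.topk is differentiable with respect to the returned values (the gradient flows back to the selected entries of sim), while the index selection itself is discrete and carries no gradient. So the sparsified graph structure is not learned through top-k, but the kept edge weights still receive gradients, as this small check illustrates:

    import torch

    # Gradient flows through topk's values to the selected entries of sim;
    # the indices themselves are discrete and have no gradient.
    sim = torch.rand(5, 5, requires_grad=True)
    knn_val, knn_ind = torch.topk(sim, k=2, dim=-1)
    knn_val.sum().backward()
    print(sim.grad)  # 1.0 at the k selected positions per row, 0.0 elsewhere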

Visualization

Could you explain how the SVD decomposition in the visualization part is implemented?
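
For anyone with the same question: such visualizations are commonly implemented as a rank-2 truncated SVD of the learned embedding matrix, projecting each embedding onto the top two right singular vectors. A generic sketch (torch.svd is used for compatibility with older PyTorch; this is not the authors' actual script):

    import torch
    import matplotlib.pyplot as plt

    emb = torch.randn(1000, 64)        # hypothetical (n_nodes, dim) embeddings

    # Rank-2 truncated SVD: keep the top-2 right singular vectors and
    # project every embedding onto them to get 2-D coordinates.
    U, S, V = torch.svd(emb)
    coords = emb @ V[:, :2]            # (n_nodes, 2)

    plt.scatter(coords[:, 0], coords[:, 1], s=2)
    plt.savefig("embedding_svd.png")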

Requirements request

Good morning authors,

First, congratulations on the great idea and on the source code, which seems really interesting. Unfortunately, I was only able to read it, not run it: I'm currently trying to install and use your source code, but I'm running into problems with dependencies and libraries (I have tried, without success, to install the libraries listed in the readme.md file).

Could you please publish a requirements.txt file generated with pip freeze from a working environment?

Thank you in advance.

Dataset questions

Hello, I have a few questions:
1. I cannot find much detail online about how to construct a sub-knowledge-graph with Microsoft Satori. Could you share how you did it?
2. Your baselines include methods such as KGAT whose knowledge graphs are of the UIKG type, while your method uses an IKG. How did you apply methods like KGAT to your datasets for comparison?
3. The paper states that your data processing follows RippleNet, but after checking both datasets I found that your MovieLens-1M kg_final.txt and ratings_final.txt both differ from RippleNet's. Did you do any different preprocessing? (The data RippleNet uses matches the statistics in Table 1 of your paper.)
Thanks!

Dataset question

How were your datasets obtained? I noticed that, for example, the entity and relation counts of Last.FM differ from those in KGIN, which confuses me.

Model implementation vs. paper description

The released version of the code is inconsistent with many descriptions in the paper, especially in the global-level and local-level contrastive parts. I also noticed that many places in the model are commented out. How can the code be made to correspond exactly to the method in the paper?

Dataset loading question

When the dataset is loaded, why are interactions with a label of 0 in the ratings_final file also read into the training set?
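
For context, in RippleNet-style preprocessing the label-0 rows of ratings_final.txt are sampled negatives (unwatched items for that user), not observed dislikes, and CTR-style training needs both classes. A generic loading sketch (the file path is an assumption):

    import numpy as np

    # Each row of ratings_final.txt is "user item label": label 1 marks an
    # observed interaction, label 0 a sampled negative for the same user.
    data = np.loadtxt("data/music/ratings_final.txt", dtype=np.int64)
    pos = data[data[:, 2] == 1]   # observed user-item interactions
    neg = data[data[:, 2] == 0]   # sampled negatives, used as 0-labels in CTR training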

About the MovieLens-1M dataset

Hi, thank you for your work. The MovieLens-1M dataset you uploaded contains only 20,782 triples. RippleNet's preprocessed MovieLens-1M has 1,241,995 triples, while your paper reports 1,241,996. Could you upload the full set of triples for the MovieLens-1M dataset? Thanks.

Some questions about hyper-parameters

Hello. When reproducing the model I found a sizable gap in performance.
The paper uses alpha and beta to balance the weights of the loss terms, but this part does not seem to be reflected in the code. Am I misunderstanding something?
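
For reference, the weighting described in the paper would amount to something like the sketch below (the loss variables are hypothetical placeholders; whether the released code actually applies these weights is exactly what this issue asks about):

    import torch

    # Hypothetical loss terms; alpha and beta balance the local- and
    # global-level contrastive losses against the BPR recommendation loss.
    bpr_loss       = torch.tensor(0.5, requires_grad=True)
    local_cl_loss  = torch.tensor(0.2, requires_grad=True)
    global_cl_loss = torch.tensor(0.3, requires_grad=True)
    alpha, beta = 0.1, 0.01

    loss = bpr_loss + alpha * local_cl_loss + beta * global_cl_loss
    loss.backward()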

Is there a discrepancy between the model code and the paper?

Hi folks, nice job.
Taking the music dataset as an example, when I checked the code I found some differences between the model and the description in the paper. In the model I see the following line:

item_emb = self.all_embed[self.n_users:, :]

But the number of items is 3,846 while the number of entities is 9,366. Why are the entity embeddings assigned to the item embeddings? Later in the code, the item and entity embeddings are also used interchangeably in a confusing way. Can you explain this?
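
One plausible reading (an assumption based on KGIN-style conventions, not confirmed by the authors): the single embedding table stores users first and then all entities, and items occupy entity IDs 0 .. n_items-1, so all_embed[n_users:] is the entity table whose first n_items rows double as the item embeddings:

    import torch
    import torch.nn as nn

    # Hypothetical layout, common in KGIN-style code: one table holding
    # [users | entities], where items are the first n_items entity IDs.
    n_users, n_items, n_entities, dim = 1872, 3846, 9366, 64
    all_embed = nn.Parameter(torch.empty(n_users + n_entities, dim))
    nn.init.xavier_uniform_(all_embed)

    user_emb   = all_embed[:n_users, :]    # (n_users, dim)
    entity_emb = all_embed[n_users:, :]    # (n_entities, dim)
    item_emb   = entity_emb[:n_items, :]   # items = first n_items entities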

Hyper-parameter settings

Hello authors! I would like to set the hyper-parameters as described in the paper, but I cannot find where they are defined in the code.

Applying the code to my own data

Hello authors. I work in mechanical engineering and have my own text data. How can I convert it into the format of the files under your data directory?
What specifically confuses me is that in kg_final.txt both entities and relations are plain numbers, and I don't know what each number means. To convert my data into this format, do I need to build a mapping that numbers the entities and relations separately, and then use those IDs to generate my own kg_final.txt?
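
In case a sketch helps: the usual approach is exactly as described in the question, i.e. assign an integer ID to every distinct entity and relation string, then emit the triples as integers (the tab-separated "head relation tail" layout is assumed from RippleNet-style files):

    # Build integer ID maps for entities and relations from string triples,
    # then write a RippleNet-style kg_final.txt.
    triples = [
        ("bearing", "part_of", "gearbox"),       # your own (head, relation, tail) strings
        ("gearbox", "connected_to", "motor"),
    ]

    entity2id, relation2id = {}, {}

    def get_id(table, key):
        # Assign the next free integer the first time a string is seen.
        if key not in table:
            table[key] = len(table)
        return table[key]

    with open("kg_final.txt", "w") as f:
        for h, r, t in triples:
            f.write(f"{get_id(entity2id, h)}\t{get_id(relation2id, r)}\t{get_id(entity2id, t)}\n")

    # Keep the mapping files so the integer IDs stay interpretable.
    with open("entity2id.txt", "w") as f:
        for name, idx in entity2id.items():
            f.write(f"{name}\t{idx}\n")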
