I used the packages listed in the requirements.txt you provided, in a Python 3.7 environment.
However, I am encountering an issue where GPU memory usage gradually increases during model training until it eventually leads to a CUDA out-of-memory error.
Thanks for your great work! Could you provide example code showing how to deploy IntTower in a real-world scenario, e.g. how to run the multi-head faiss retrieval and the MaxSim scoring in parallel?
Thank you for your great work. Could you provide the training scripts for the Amazon and Alibaba datasets? We would like to learn more details so we can follow up on your work. Thank you very much!
First of all, thanks a lot for your great article and for open-sourcing the code base.
I have a question regarding the model serving:
I understand that you create faiss indices based on the multi-head latent representations of the items, but how do you query them? Do you use the multi-head latent representation of the last layer of the user tower? And after retrieving the top-K items, do you compute the FE score to rerank the candidates?
As far as I know, faiss does not support a 'max' operation.
For the i-th layer user representation, we compute a pairwise similarity score for each head, so do we need to retrieve H^2 times? If there are L layers, do we eventually need to retrieve L * H^2 times?
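For what it's worth, here is one way I imagine the serving could work; this is only a sketch under my own assumptions, not the authors' implementation. Each head gets its own index and contributes top-K candidates (so retrieval is H searches, not H^2), and the H^2 head-pair MaxSim is only computed as a rerank over the candidate union. I use brute-force inner product where a per-head `faiss.IndexFlatIP` would go in practice; all sizes and names are made up:

```python
import numpy as np

# Hypothetical sizes: H heads, d dims per head, N items, top-K.
H, d, N, K = 4, 8, 1000, 50
rng = np.random.default_rng(0)
item_heads = rng.normal(size=(H, N, d)).astype("float32")  # one "index" per head
user_heads = rng.normal(size=(H, d)).astype("float32")     # last-layer user heads

# Stage 1: per-head retrieval (brute-force here; faiss.IndexFlatIP per head
# in a real deployment, and the H searches can run in parallel).
candidates = set()
for h in range(H):
    scores = item_heads[h] @ user_heads[h]
    candidates.update(np.argpartition(-scores, K)[:K].tolist())

# Stage 2: MaxSim rerank over the candidate union only: score every
# user-head x item-head pair, then take the max per item.
cand = np.array(sorted(candidates))
pair_scores = np.einsum("hd,gnd->hgn", user_heads, item_heads[:, cand, :])
maxsim = pair_scores.max(axis=(0, 1))  # max over all H x H head pairs
topk = cand[np.argsort(-maxsim)[:K]]
```

This keeps the expensive H^2 (or L * H^2 across layers) interaction off the full corpus and confined to the small reranking set, which is how late-interaction retrievers like ColBERT are typically served.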
RuntimeError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.99 GiB total capacity; 37.27 GiB already allocated; 0 bytes free; 37.33 GiB reserved in total by PyTorch)
I run this code on a 24 GB GPU, and this error always happens after epoch 2 no matter what batch_size I set. Is there anything wrong with my environment?
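Since the growth is independent of batch size and spans epochs, one common culprit worth checking is accumulating the loss tensor itself (which keeps the whole autograd graph alive) instead of its scalar value. A minimal sketch of the safe pattern, with a toy model that is purely illustrative and not IntTower's code:

```python
import torch

# Toy model and optimizer, just to illustrate the accumulation pattern.
model = torch.nn.Linear(16, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

running_loss = 0.0
for step in range(100):
    x = torch.randn(32, 16)
    loss = (model(x) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    # .item() detaches the scalar from the graph; writing
    # `running_loss += loss` instead would retain every step's graph
    # and make (GPU) memory grow steadily across the epoch.
    running_loss += loss.item()
```

The same applies to any per-step metrics stored in a list: store `tensor.item()` or `tensor.detach().cpu()`, not the live tensor.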