chenxijun1029 / deepfm_with_pytorch

365.0 5.0 101.0 14 KB

A PyTorch implementation of DeepFM for CTR prediction problem.

Python 100.00%

deepfm pytorch ctr-prediction

deepfm_with_pytorch's Issues

Sigh, bugs everywhere. This wasted an hour of my time. Just great.

class ContinuousFeatureGenerator:
    """
    Clip continuous features.
    """

    def __init__(self, num_feature):
        self.num_feature = num_feature

    def build(self, datafile, continous_features):
        df = pd.read_csv(datafile, sep="\t", header=True)
        with open(datafile, 'r') as f:
            for line in f:
                features = line.rstrip('\n').split('\t')
                for i in range(0, self.num_feature):
                    val = features[continous_features[i]]
                    if val != '':
                        val = int(val)
                        if val > continous_clip[i]:
                            val = continous_clip[i]  # val is worked on here but never stored or assigned anywhere, so what is this processing for?

    def gen(self, idx, val):
        if val == '':
            return 0.0
        val = float(val)
        return val
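
For anyone hitting the same thing: the clipped value in `build` is discarded, so `gen` returns the raw value. One possible way to make the clipping take effect is to apply it inside `gen`. This is only a sketch, not the author's code: the class name `ClippingFeatureGenerator` and the clip thresholds below are made up for illustration.

```python
class ClippingFeatureGenerator:
    """Hypothetical variant: clip each continuous feature to an upper bound."""

    def __init__(self, clip_values):
        # clip_values[i] is the assumed upper bound for continuous feature i
        self.clip_values = list(clip_values)

    def gen(self, idx, val):
        # Missing values map to 0.0, as in the original gen()
        if val == '':
            return 0.0
        # Apply the clip at generation time so the result is actually used
        return min(float(val), float(self.clip_values[idx]))


# Illustrative thresholds only; they are not taken from the repository
generator = ClippingFeatureGenerator([20, 600])
clipped = [generator.gen(0, '35'), generator.gen(1, '12'), generator.gen(0, '')]
print(clipped)
```

With this layout `build` no longer needs to scan the file at all, since the bounds are passed in up front.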

Do the order-2 pairwise feature interactions exist in your model?

The code above is from model/DeepFM.py.

Hi, is the code above the same as the order-2 pairwise feature interactions proposed in the original DeepFM paper?
I didn't see any pairwise feature interaction result, e.g. a matrix of shape f*f.

Is that a problem?

Thanks
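
A possible explanation, assuming model/DeepFM.py uses the usual factorization machine reformulation (the quoted code is not reproduced here, so this is an assumption): the sum over all pairs of embedding dot products equals 0.5 * ((sum of embeddings)^2 - sum of squared embeddings), summed over the embedding dimension. With that identity no f*f matrix is ever built, yet the pairwise term is still computed. A small pure-Python sketch with made-up embeddings, where each vector is assumed to be already scaled by its feature value:

```python
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pairwise_explicit(embs):
    # Sum of <v_i, v_j> over every pair i < j (the f*f view, upper triangle)
    return sum(dot(u, v) for u, v in combinations(embs, 2))


def pairwise_sum_square(embs):
    # Same quantity via 0.5 * ((sum v)^2 - sum v^2), per embedding dimension
    total = 0.0
    for d in range(len(embs[0])):
        column = [e[d] for e in embs]
        s = sum(column)
        total += s * s - sum(c * c for c in column)
    return 0.5 * total


# Three fields, embedding size 2 (illustrative numbers)
embs = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
explicit = pairwise_explicit(embs)
fast = pairwise_sum_square(embs)
print(explicit, fast)
```

The second form costs O(f * k) instead of O(f^2 * k), which is why implementations tend to prefer it.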

Report a bug.

In dataset.py, lines 29 and 33, you use pd.read_csv to read the data that was generated from the original data. However, you forgot to add the parameter header = -1, because the new test.csv and train.csv actually don't have a header. pandas will then use the first line of data as the header, which causes an index error later.
The following is what the code should look like.

data = pd.read_csv(os.path.join(root, 'train.txt'), header = -1)
data = pd.read_csv(os.path.join(root, 'test.txt'), header = -1)
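
A note on the suggested fix: `header=None` is the documented way to tell `pd.read_csv` that a file has no header row, and as far as I know recent pandas releases reject negative values such as `header=-1`. A minimal sketch of the difference, using made-up in-memory data rather than the repository's files:

```python
import io

import pandas as pd

# Three data rows and no header line (illustrative values)
raw = "1,0.5,3\n0,1.5,7\n1,2.5,9\n"

# Default behaviour: the first data row is consumed as column names
with_default = pd.read_csv(io.StringIO(raw))

# header=None: every row is kept and columns are numbered 0, 1, 2
without_header = pd.read_csv(io.StringIO(raw), header=None)

print(with_default.shape)
print(without_header.shape)
```

With the default, one sample is silently lost and the column labels become strings taken from the data, which would match the index error described above.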

20

Hello, could you tell me why it is 20? What does it mean?
This is in DeepFM.forward:
"""
emb = self.fm_first_order_embeddings[20]
print(Xi.size())
for num in Xi[:, 20, :][0]:
    if num > self.feature_sizes[20]:
        print("index out")
"""

RuntimeError: index out of range at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:191

I am seeing the following error while running deep ctr on the GPU:

Traceback (most recent call last):
  File "main.py", line 31, in <module>
    model.fit(loader_train, loader_val, optimizer, epochs=5, verbose=True)
  File "/root/deepctr/DeepFM_with_PyTorch/model/DeepFM.py", line 153, in fit
    total = model(xi, xv)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/deepctr/DeepFM_with_PyTorch/model/DeepFM.py", line 98, in forward
    fm_first_order_emb_arr = [(torch.sum(emb(Xi[:, i, :]), 1).t() * Xv[:, i]).t() for i, emb in enumerate(self.fm_first_order_embeddings)]
  File "/root/deepctr/DeepFM_with_PyTorch/model/DeepFM.py", line 98, in <listcomp>
    fm_first_order_emb_arr = [(torch.sum(emb(Xi[:, i, :]), 1).t() * Xv[:, i]).t() for i, emb in enumerate(self.fm_first_order_embeddings)]
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/sparse.py", line 118, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py", line 1454, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: index out of range at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:191
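
An embedding lookup raises this error when some index falls outside 0 .. num_embeddings - 1 for its table, so it can help to check the index data against the feature sizes before training. Below is a minimal sketch in plain Python; the helper name `find_out_of_range` and the sample data are hypothetical, and it assumes one index per field per row.

```python
def find_out_of_range(xi_rows, feature_sizes):
    """Return (row, field, index, size) for every index its table cannot serve."""
    bad = []
    for r, row in enumerate(xi_rows):
        for f, idx in enumerate(row):
            size = feature_sizes[f]
            # Valid indices are 0 .. size - 1, so idx == size is already invalid
            if not 0 <= idx < size:
                bad.append((r, f, idx, size))
    return bad


# Illustrative data: two single-slot fields and one field with 4 categories
feature_sizes = [1, 1, 4]
xi_rows = [[0, 0, 3], [0, 1, 2], [0, 0, 4]]
problems = find_out_of_range(xi_rows, feature_sizes)
print(problems)
```

Note the comparison is `idx < size`, not `idx <= size`: an index equal to the table size is enough to trigger the error.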

About the index of continuous features

if self.train:
    # index of continous features are zero
    Xi_coutinous = np.zeros_like(dataI[:continous_features])
else:
    # index of continous features are one
    Xi_coutinous = np.ones_like(dataI[:continous_features])

Why are the indexes generated for the continuous variables zero in the training set and one in the test set? Shouldn't the index for each continuous variable be the same in both?

deepfm_with_pytorch's Issues

Sigh, bugs everywhere. This wasted an hour of my time. Just great.

Do the order-2 pairwise feature interactions exist in your model?

Report a bug.

20

RuntimeError: index out of range at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:191

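Hello, the original paper uses activation functions in the deep part, but this code does not. Is there some other consideration behind that?

Two questions about dataPreprocess.py

The data folder does not contain a feature_sizes.txt file

About the index of continuous features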