THUDM / KOBE
Towards Knowledge-Based Personalized Product Description Generation in E-commerce @ KDD 2019
License: MIT License
Hi, I have recently been studying related topics and found your project very interesting. Could you explain how to generate descriptions with this model, and could you provide a demo?
Hi Team,
Did you also try training your model on an English-language dataset? If yes, could you please provide a link to it?
Thanks
Manish
Hello, your model is very interesting, but I don't understand any Chinese, which is what the model outputs. So I'd like to ask: is there any English dataset available?
Hi @qibinc
I am using ROUGE to evaluate the model. I changed metrics: ['bleu'] to metrics: ['rouge'], but it does not seem to work. Is there anything else I need to do?
Best wishes,
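In case it helps, here is a minimal sketch of scoring predictions with ROUGE outside the training loop, using the third-party rouge package (pip install rouge). The file names are hypothetical, Chinese text must be whitespace-tokenized first, and KOBE's own metrics hook may expect a different interface.

```python
# Score hypotheses against references with the `rouge` pip package.
# File names are hypothetical; each line must be whitespace-tokenized.
from rouge import Rouge

with open("best_bleu_prediction.txt") as f:
    hyps = [line.strip() for line in f]
with open("reference.txt") as f:
    refs = [line.strip() for line in f]

scores = Rouge().get_scores(hyps, refs, avg=True)
print(scores["rouge-1"], scores["rouge-2"], scores["rouge-l"])
```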
I noticed that there are some pieces of code related to reinforcement learning. Did you try training with reinforcement learning, and does RL improve the results?
Hi,
Thank you so much for a great paper and for sharing the code!
I read the paper and have a small confusion.
https://arxiv.org/pdf/1903.12457.pdf
On page 5, you said "Formally, given a product title x = (x1, x2, . . . , xn), we match each word xi to a
named entity vi ∈ V, which should be a vertex in the knowledge graph."
Could you please explain what those named entities are?
What process did you use to match each word to its named entity vi?
For Chinese, you used CN-DBPedia. What about English? Which English knowledge graph do you recommend?
Thank you so much again!
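Not the authors' pipeline, but for intuition, here is a minimal sketch of linking title tokens to knowledge-graph vertices by longest-span string matching. The toy dictionary and all names are hypothetical; the paper's actual linking against CN-DBPedia may work differently.

```python
# Hypothetical longest-match entity linking: map surface spans of the
# title to knowledge-graph vertex ids. Toy data, not CN-DBPedia.
kg_entities = {"denim jacket": "e_123", "jacket": "e_45"}

def link_title(tokens):
    """Greedily match the longest spans that name a KG vertex."""
    matches, i = [], 0
    while i < len(tokens):
        for j in range(len(tokens), i, -1):  # try the longest span first
            span = " ".join(tokens[i:j])
            if span in kg_entities:
                matches.append((span, kg_entities[span]))
                i = j
                break
        else:
            i += 1  # no entity starts at token i
    return matches

print(link_title("loose denim jacket women".split()))
# -> [('denim jacket', 'e_123')]
```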
Hi @qibinc,
I found that the PyTorch version affects the code.
For example, torch.gt() in version 1.2 returns bool values.
Best wishes
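For reference, the change in question: since PyTorch 1.2, comparison ops return BoolTensor instead of the old uint8 ByteTensor, and some arithmetic that older mask code relied on no longer works on bool tensors. A small illustration:

```python
import torch

x = torch.tensor([0.1, 0.9])
mask = torch.gt(x, 0.5)   # dtype torch.bool on PyTorch >= 1.2 (uint8 before)
# e.g. `1 - mask` raises on bool tensors; cast explicitly instead:
inv = 1 - mask.long()
print(mask, inv)          # tensor([False,  True]) tensor([1, 0])
```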
Hi Qibin,
Thanks for the nice work. I have a question about the Transformer decoding behavior. I noticed that during training the whole ground truth is fed into the decoder, allowing the tokens at each step to attend over the ground-truth sequence:
https://github.com/THUDM/KOBE/blob/master/core/models/tensor2tensor.py#L369
I guess that the self-attention layer loses its effectiveness in this case. Why don't we instead feed the model's own outputs from step 0 up to the current step?
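For context, the usual answer to this question (not specific to KOBE's code) is that teacher forcing feeds the whole shifted ground truth at once, while a causal mask still prevents position t from attending to positions after t, so self-attention keeps doing real work; feeding the model's own step-by-step outputs would make training sequential and slow, and is what happens only at inference. A generic sketch of such a mask:

```python
import torch

def subsequent_mask(size: int) -> torch.Tensor:
    """Lower-triangular mask: True where attention is allowed,
    so position t can see positions 0..t but nothing later."""
    return torch.tril(torch.ones(size, size, dtype=torch.bool))

print(subsequent_mask(4))
```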
Sorry, I have a question about the title token sequence: do you wrap it with start/end markers such as BOS and EOS?
If so, then for the title embedding, where this paper adds an attr embedding to each token, is the attr embedding also added to the BOS and EOS token embeddings, or are those skipped so that only the actual title tokens get it? (See the sketch below.)
Thanks for your kindness!
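To make the question concrete, here is a minimal sketch of one possible reading: a single attribute embedding is broadcast-added to every token embedding of the title, BOS/EOS included. Sizes and ids are hypothetical, and whether KOBE actually skips the special tokens is exactly what is being asked.

```python
import torch
import torch.nn as nn

vocab_size, n_attrs, d_model = 1000, 24, 512   # illustrative sizes
tok_emb = nn.Embedding(vocab_size, d_model)
attr_emb = nn.Embedding(n_attrs, d_model)

tokens = torch.tensor([[1, 57, 92, 3, 2]])     # [BOS, w1, w2, w3, EOS], ids made up
attr = torch.tensor([10])                      # one attribute id per sequence

# Broadcast the attribute embedding over all positions, BOS/EOS included.
x = tok_emb(tokens) + attr_emb(attr).unsqueeze(1)
print(x.shape)  # torch.Size([1, 5, 512])
```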
It seems the latest README no longer has the generation and api sections? How should I run testing?
I trained your model with the default parameters. Is the result the same as yours? I can't tell an obvious difference between the generated sentences (to my eye) across different item aspects and user categories. Could the authors provide their result (best_bleu_prediction.txt) after training? Thanks!
Here is my best_bleu_prediction.txt after training (without adapting beam search).
Thanks to the authors for open-sourcing this! I trained aspect_user.yaml successfully and wanted to try the trained model on my own data, but I never got it to run; it gets stuck at makeVocabulary / makeData.
The TaoDescribe dataset downloaded from Tianchi is the source dataset; the one fetched by download_preprocessed_tao.py is the preprocessed one. In the source-dataset -> preprocessed-data step, in preprocess.py, the note says "In src files, <x> <y> means this product is intended to show with aspect <x> and user category <y>", but the definition of aspect <x> is never given?
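For what it's worth, my reading of that note is that each line of a src file simply prepends the aspect and user-category markers to the tokenized title. A purely hypothetical example (the marker syntax, values, and word segmentation are my guesses, not taken from the repo):

```
<1> <a> 牛仔外套 女 春秋装 新款 宽松 学生 韩版 外套
```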
Hi, Qibin
First, thanks to your open source codes.
I have some questions about your code.
I see that self.condition_context_attn = BiAttention(config.hidden_size, config.dropout) is defined, but self.condition_context_attn is never actually used. Could you share the intended setup? Best wishes!
Hi,
PS: I have data consisting of product titles, descriptions, and categories of a marketplace in English.
Could you explain what other data I need to build a dataset? If possible, I will try to create or obtain that data. Also, please help me understand what knowledge base I need. Do you think this link https://github.com/IBM/build-knowledge-base-with-domain-specific-documents/blob/master/README.md can help me extract a knowledge base from the data (product titles and descriptions) that I already have?
I would be glad if you help me create the dataset, and advise any changes that I need to make in code after changing the dataset.
Thanks in advance.
Hello, I have read the Generation section, but I still don't understand how to use api.py.
Hi, I am interested in this paper, but I am a little confused about the dataset for the experiments. I can easily get product titles from Taobao, but I cannot find the product descriptions. So I am unsure how to obtain descriptions, especially personalized descriptions, as ground truth like in your dataset.
Hello, and thank you for sharing your work. I have two questions.
Question 1:
For the title x, do you take "(10)(a)牛仔外套女2019春秋装新款宽松学生韩版bf原宿风外套牛仔衣潮" as x, convert it to embeddings, add the embedding of the (10)(a) attribute, and feed the sum into the encoder? Or do you take only "牛仔外套女2019春秋装新款宽松学生韩版bf原宿风外套牛仔衣潮" as x and add the (10)(a) attribute embedding to that before feeding it in?
Question 2:
For the final generated personalized product description, is the number of generated characters random? Is there a way to impose a length limit, or is the length determined by the description lengths in the training set?
I trained a baseline model using my own data and I found an interesting result. Different from your baseline setting, I use word-based encoding and char-based decoding. It seems that the model tends to generate words according to the input but in reversed order.
Could you explain this phenomenon? I also wonder whether word-based encoding combined with char-based decoding creates an information gap between the encoder and the decoder.
Thank you for sharing such a wonderful project; I find the code very inspiring. However, while reading the code, I could not find the script that builds "data.pkl", so I can hardly infer the data format. Could you please upload the corresponding code or document the format of data.pkl? Thank you again.
Version 2 has no bi-attention code, why?
How can I use the pretrained model for inference? api.py seems to provide an example, but it does not work.
How can I use the example dataset to train the baseline and achieve the total of 81 reported in the paper?
Hi @qibinc
Thanks for your patience. I noticed that your beam_sample function takes a parameter eval_. Is it actually used during training, or am I mistaken?
Best wishes,
It seems that the Transformer encoder is not more important than the decoder,
so why do you use 6 encoder layers but only 2 decoder layers?
Does it help improve performance?
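For concreteness, the asymmetry being asked about looks like this with stock PyTorch modules (illustrative hyperparameters, not KOBE's own tensor2tensor.py):

```python
import torch.nn as nn

d_model, n_heads = 512, 8
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, n_heads), num_layers=6)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model, n_heads), num_layers=2)
```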
Thanks for sharing the code.
In the config file, learning_rate is set to 2, but the paper says the learning rate is set to 10^-4.
I tried both settings for the baseline and found 10^-4 too small to improve the BLEU score. With learning_rate=2, the BLEU score was about 6.0 over the last 1M steps, not the 7.2 shown in the paper. Did I make any mistakes? Thanks.
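One possible reconciliation (my assumption, not confirmed by the authors): in tensor2tensor/OpenNMT-style configs, learning_rate is not the literal step size but a multiplier on the Noam warmup schedule, so learning_rate=2 and a much smaller peak effective rate are not contradictory. A sketch of the effective rate under that convention:

```python
# Noam schedule (an assumption about what learning_rate=2 means here):
#   lr(step) = factor * d_model**-0.5 * min(step**-0.5, step * warmup**-1.5)
def noam_lr(step: int, factor: float = 2.0,
            d_model: int = 512, warmup: int = 4000) -> float:
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

print(noam_lr(4000))  # peak: ~1.4e-3 with these illustrative values
```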
Thanks for the code and paper.
I am curious about the soft assignment of user categories. Is it possible to provide the implicit feedback data (click, dwell time) to the public?
Hi @qibinc ,
I am using your code and noticed that you do not use the CrossEntropy loss but LabelSmoothingLoss instead. Could you explain this choice? The paper does not explain it either.
Best wishes,
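For anyone with the same question, a generic label-smoothing loss (not KOBE's exact class) looks like this: the target distribution puts 1 - eps on the gold token and spreads eps uniformly over the vocabulary, which regularizes the model compared with plain cross entropy.

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, target, eps=0.1):
    """Cross entropy against a smoothed target distribution:
    (1 - eps) on the gold token, eps spread uniformly."""
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1)
    smooth = -log_probs.mean(dim=-1)
    return ((1 - eps) * nll + eps * smooth).mean()

logits = torch.randn(4, 100)            # (batch, vocab), illustrative
target = torch.randint(0, 100, (4,))
print(label_smoothing_loss(logits, target))
```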
I failed to download the processed training data. Can you help me?
How can I test a trained model?
Hi, could you please point out where this released code implements BiDAF to combine the knowledge encoding with the title representation? Thanks a lot.
Hello,
Thank you for providing this well-written and useful repository. After having trained a model, I try to evaluate the saved checkpoint model using beam search with a command similar to the one from the README:
python core/train.py --config configs/baseline.yaml --mode eval --restore experiments/finals-baseline/checkpoint.pt --expname eval-baseline --beam-size 10
However, I am getting an issue which produces a stack trace like this:
Traceback (most recent call last):
File 'KOBE/core/train.py', line 371, in <module>
score = eval_model(model, data, params, config, device, writer)
File 'KOBE/core/train.py', line 250, in eval_model
samples, alignment = model.beam_sample(
File 'KOBE/core/models/tensor2tensor.py', line 467, in beam_sample
b.advance(output[j, :], attn[j, :]) # batch index
File 'KOBE/core/models/beam.py', line 101, in advance
self.attn.append(attnOut.index_select(0, prevK))
RuntimeError: "index_select_out_cuda_impl" not implemented for 'Float'
Process finished with exit code 1
It seems to me that this failure actually makes sense, since we are indexing a tensor (attnOut) with a tensor of floats (prevK). Here is the code chunk from beam.py for reference:
prevK = bestScoresId / numWords
self.prevKs.append(prevK)
self.nextYs.append((bestScoresId - prevK * numWords))
self.attn.append(attnOut.index_select(0, prevK))
Am I doing something wrong here? Thanks.
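A likely fix (my assumption, not a maintainer-confirmed patch): on recent PyTorch, / on integer tensors performs true division and returns floats, so the division needs to be an explicit floor division before the result is used as an index:

```python
# Replacement for the chunk above (requires PyTorch >= 1.8):
prevK = torch.div(bestScoresId, numWords, rounding_mode="floor")
self.prevKs.append(prevK)
self.nextYs.append(bestScoresId - prevK * numWords)
self.attn.append(attnOut.index_select(0, prevK))
```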
Hi, I have data consisting of product titles, descriptions, and categories from an English-language marketplace. Would you be kind enough to explain how to prepare a proper dataset, and the required dataset format, for this data?
Thank you in advance.
I have similar Chinese product information and user information (images and text, plus the single tag each user is most interested in). How can I run inference with the checkpoint you released?
Also, is it possible to fine-tune your model? I cannot find instructions in your README on how to build a dataset myself.
Could you tell us the 24 detailed user categories?
I cannot find them in either your code or your paper.
I am following the README instructions to download and preprocess the provided data, but I am stuck at the preprocessing step.
I am trying to run python -m kobe.data.vocab --input saved/raw/train.cond --vocab-file saved/vocab.cond --vocab-size 31 --algo word
Traceback (most recent call last):
File "C:\Program Files\Python39\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Program Files\Python39\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\alien\Documents\PyCharm-Projects\KOBE\kobe\data\vocab.py", line 35, in <module>
spm.SentencePieceTrainer.Train(
File "C:\Program Files\Python39\lib\site-packages\sentencepiece\__init__.py", line 407, in Train
return SentencePieceTrainer._TrainFromString(arg)
File "C:\Program Files\Python39\lib\site-packages\sentencepiece\__init__.py", line 385, in _TrainFromString
return _sentencepiece.SentencePieceTrainer__TrainFromString(arg)
OSError: Not found: "C:\Users\alien\AppData\Local\Temp\tmpxplscd17": Permission denied Error #13
I have checked that the folder's permissions give my account full access, and I am already running cmd as admin.
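My guess at the cause (unverified): the tmpxplscd17 name looks like a Python tempfile, and on Windows a NamedTemporaryFile created with delete=True cannot be reopened by another reader (here, the SentencePiece trainer) while the Python handle is still open, which surfaces as Permission denied. If kobe/data/vocab.py does write its input that way, the usual workaround is:

```python
import os
import tempfile

# Create the temp file with delete=False and close it before handing
# the path to SentencePiece, so Windows allows it to be reopened.
tmp = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
tmp.write("example input line\n")
tmp.close()
# ... pass tmp.name to spm.SentencePieceTrainer.Train(...) here ...
os.unlink(tmp.name)  # clean up afterwards
```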
Hello, I am trying to build my own dataset, and I would like to know how you build the facts file in the raw data.
How can I train the model on multiple GPUs?
When doing knowledge encoding, the code uses tgt_vocab for the knowledge and changes src_vocab_size, but afterwards it sets config.src_vocab_size = config.src_vocab_size, which is a no-op; there may be an error here.
File "core/train.py", line 327, in
device, devices_id = misc_utils.set_cuda(config)
File "/home/shreya/KOBE/core/utils/misc_utils.py", line 19, in set_cuda
assert config.use_cuda == use_cuda
AssertionError
What are the real tags (string format) corresponding to the user category attributes (int format) in the src.str files?
P.S.
Great work! Really appreciate your efforts in open-sourcing the datasets.
Where is the dataset?