KGPT's Issues

torch.nn.modules.module.ModuleAttributeError: 'GraphGatedTransformerDecoder' object has no attribute 'beam_search'

Hi, I have a problem when trying to evaluate the fine-tuned models that use the graph encoder. Evaluation fails with the following error:
torch.nn.modules.module.ModuleAttributeError: 'GraphGatedTransformerDecoder' object has no attribute 'beam_search'
When I inspect Model.py, I find that there is indeed no 'beam_search' method on 'GraphGatedTransformerDecoder'. But all the eval_graph_*_all.sh files set '--beam_size 2', which implies that beam search is being used. I am really confused about this. Thanks!
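A quick diagnostic before launching the eval scripts is to check whether the class actually exposes the method. The import path below is an assumption based on the file name mentioned above; this is just a sketch, not the repository's intended usage:

    # Confirm whether the graph decoder class defines `beam_search`
    # (class name taken from the error message; import path assumed).
    from Model import GraphGatedTransformerDecoder

    print(hasattr(GraphGatedTransformerDecoder, "beam_search"))  # False if the method is missing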

Issues about decoding texts

When I run:

    bash scripts/webnlg/eval_sequence_webnlg.sh 0 challenge checkpoint_webnlg/checkpoint_finetune_sequence_head8_layer6_GPT2_maxfact12/model_ep20.pt

(we didn't change the code), it fails with:

    File "/home/xushoukai/KGPT-webnlg/code/Model.py", line 113, in get_hypothesis
      hyp.append(self.next_ys[j+1][k])
    IndexError: tensors used as indices must be long, byte or bool tensors
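This error usually means an index tensor ended up with a floating-point dtype, which can happen when integer division in the beam-search bookkeeping becomes true division on newer PyTorch releases. A minimal, self-contained reproduction of the symptom and the usual cast fix (not taken from the repository's code):

    import torch

    ys = torch.tensor([5, 6, 7])
    k = torch.tensor(1.0)   # a float index, e.g. produced by true division
    # ys[k]                 # IndexError: tensors used as indices must be long, byte or bool tensors
    print(ys[k.long()])     # casting back to long makes the lookup work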

Changing beam_size makes everything decode as "!"

When I use

    bash scripts/webnlg/eval_sequence_webnlg.sh 0 test checkpoint_webnlg/checkpoint_finetune_sequence_head8_layer6_GPT2_maxfact12_from_ep14/model_ep30.pt

to reproduce the results, I find that when I set beam_size > 1 (using beam_search), every word is decoded as "!", but when I use beam_size = 1 (using greedy_decode) the output is back to normal.
I want to know how I should use beam search for decoding.
[screenshot attached in the original issue]
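For what it's worth, token id 0 in the standard GPT-2 vocabulary decodes to "!", so an output made entirely of "!" usually means the beam is selecting index 0 at every step, which points at the same index/dtype problem reported in the other beam-search issues here. A quick check with the transformers tokenizer (assuming the stock gpt2 vocabulary is the one used):

    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    print(tokenizer.decode([0]))        # "!"  -- token id 0 in the GPT-2 BPE vocabulary
    print(tokenizer.decode([0, 0, 0]))  # "!!!"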

KeyError: 'module.embedding.word_embeddings.weight'

When I tried to run:
bash scripts/webnlg/finetune_graph_e2enlg_from_wikidata.sh 0 checkpoint_wikidata/checkpoint_sequence_head8_layer6_GPT2_maxfact12/model_ep14.pt

It gave me the following error.
Traceback (most recent call last):
  File "code/run.py", line 176, in <module>
    load_my_state_dict(model, reloaded)
  File "code/run.py", line 82, in load_my_state_dict
    own_state[name].copy_(param)
KeyError: 'module.embedding.word_embeddings.weight'

Do you know why this error happens? Thanks!
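A key that starts with module. usually means the checkpoint was saved from a model wrapped in nn.DataParallel while the model being loaded is not wrapped (or vice versa), so the names in the two state dicts don't line up. A minimal sketch of the usual workaround, stripping the prefix before loading; the checkpoint path and variable names are placeholders rather than the repository's exact code:

    import torch

    reloaded = torch.load("model_ep14.pt", map_location="cpu")  # placeholder path
    state_dict = reloaded if isinstance(reloaded, dict) else reloaded.state_dict()

    # Drop a leading "module." (added by nn.DataParallel) so the keys match the
    # unwrapped model; add the prefix instead if the target model *is* wrapped.
    cleaned = {k[len("module."):] if k.startswith("module.") else k: v
               for k, v in state_dict.items()}

    # model.load_state_dict(cleaned, strict=False)  # assumed target model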

segmentation fault

Hi, when I try to run run.py, I get the following error:
Segmentation fault (core dumped)
This happens with both the Sequence Encoder and the Graph Encoder.

If anyone has a solution to this error, please let me know.

IndexError: tensors used as indices must be long, byte or bool tensors

Hey, I downloaded your pretrained weights from the recent Google Drive link and ran the script specified for reproducing the results. I got this error:

Traceback (most recent call last):
  File "code/run.py", line 453, in <module>
    result = model.module.beam_search(data[:-2], tokenizer, n_bm=args.beam_size, banwords=banwords, max_token_seq_len=args.max_dec_len)
  File "/home/ilya/repos/KGPT/code/Model.py", line 646, in beam_search
    active_inst_idx_list = beam_decode_step(self, inst_dec_beams, len_dec_seq, active_inst_idx_list,
  File "/home/ilya/repos/KGPT/code/Model.py", line 883, in beam_decode_step
    dec_partial_seq = [inst_dec_beams[idx].get_current_state() 
  File "/home/ilya/repos/KGPT/code/Model.py", line 883, in <listcomp>
    dec_partial_seq = [inst_dec_beams[idx].get_current_state() 
  File "/home/ilya/repos/KGPT/code/Model.py", line 34, in get_current_state
    return self.get_tentative_hypothesis()
  File "/home/ilya/repos/KGPT/code/Model.py", line 103, in get_tentative_hypothesis
    hyps = [self.get_hypothesis(k) for k in keys]
  File "/home/ilya/repos/KGPT/code/Model.py", line 103, in <listcomp>
    hyps = [self.get_hypothesis(k) for k in keys]
  File "/home/ilya/repos/KGPT/code/Model.py", line 114, in get_hypothesis
    hyp.append(self.next_ys[j+1][k])
IndexError: tensors used as indices must be long, byte or bool tensors
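The usual root cause of this error in OpenNMT-style beam search is that the backpointer indices are computed with a plain division, which returns a float tensor on newer PyTorch releases instead of the integer tensor older versions produced. A small sketch of the pattern and the floor-division fix; the variable names below are modeled on common beam-search implementations and are not verified against KGPT's Model.py:

    import torch

    best_scores_id = torch.tensor([7, 12, 25])   # flat indices into (beam x vocab) scores
    num_words = 10                               # vocabulary size (placeholder)

    prev_k_broken = best_scores_id / num_words   # float tensor on recent PyTorch -> bad index
    prev_k_fixed = torch.div(best_scores_id, num_words, rounding_mode="floor")  # stays integer

    print(prev_k_broken.dtype, prev_k_fixed.dtype)  # torch.float32 torch.int64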

question about data use

Hi, thanks for publishing this great work!

I have some questions about data use:

  1. In the WebNLG test set, some examples do not have a text label; how did you evaluate them?
  2. WebNLG has multiple text labels per triple set (for example, triple_set1 can map to [text1, text2, text3]). During training, did you treat one example as multiple training instances (for the example above, three training pairs: [(triple_set1, text1), (triple_set1, text2), (triple_set1, text3)]), or did you randomly choose one of the text labels for each example? If a triple set has multiple text labels during training, will the model get confused? It learns that triple_set1 maps to text1, but when it meets the example where triple_set1 maps to text2 it will lose some BLEU score and swing between text1 and text2. What is the standard way to evaluate in this task (I am new to this area, sorry for the naive questions)? See the multi-reference scoring sketch after this question.

Looking forward to your reply; any help would be appreciated!
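On the last point, the common practice for WebNLG is to score each generated text against all of the references for its triple set with a multi-reference metric, so the model is not penalized for matching only one of several valid texts. A minimal sketch with sacrebleu; the sentences are made up purely for illustration:

    import sacrebleu

    # One hypothesis per triple set, and one reference list per "slot":
    # references[i][j] is the i-th reference for the j-th hypothesis.
    hypotheses = ["Alan Bean was born in Wheeler, Texas."]
    references = [
        ["Alan Bean was born in Wheeler, Texas."],
        ["Wheeler, Texas is the birthplace of Alan Bean."],
    ]

    print(sacrebleu.corpus_bleu(hypotheses, references).score)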

Differences between code and paper

Nice job! I have several questions, as follows:
(1) I found some differences between the code and the paper, e.g., the property embedding is not used, and various hyperparameter values (batch size / learning rate) differ across the experimental setups (training / challenge / few-shot). I wonder whether the SOTA results reported in the original paper can be reproduced by following the settings in the scripts.
(2) I ran the sequence-encoder few-shot 1% experiment on the WebNLG dataset. Only the batch size was changed to 16 due to insufficient resources, and the number of epochs was set to 4000 correspondingly to keep the total number of steps the same (see the sketch after this list). I achieved 19.9 BLEU (paper: 24.7). Is there any important information or trick I am missing?
(3) The proposed model achieves strong performance under zero-shot and few-shot settings. Could you share more details about how many training instances are needed to obtain competitive results against the other baselines (fully supervised results)?
Thanks a lot!
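Regarding point (2): a quick way to double-check that two (batch size, epochs) configurations really perform the same number of optimizer steps is to compute steps per epoch as ceil(num_examples / batch_size) and multiply by the epoch count. The numbers below are placeholders for illustration, not the paper's actual few-shot settings:

    import math

    def total_steps(num_examples: int, batch_size: int, epochs: int) -> int:
        # Total optimizer steps, assuming one update per batch.
        return math.ceil(num_examples / batch_size) * epochs

    num_examples = 180  # placeholder, e.g. a small few-shot split
    print(total_steps(num_examples, batch_size=32, epochs=2000))  # 12000
    print(total_steps(num_examples, batch_size=16, epochs=1000))  # 12000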
