
templatener's Introduction

Template-Based NER

Source Code For Template-Based Named Entity Recognition Using BART

Training

Training: train.py

Inference: inference.py

Corpus

ATIS (https://github.com/yvchen/JointSLU/tree/master/data)

MIT Restaurant Corpus (https://groups.csail.mit.edu/sls/downloads/)

MIT Movie Corpus (https://groups.csail.mit.edu/sls/downloads/)

Contact

If you have any questions, please feel free to contact Leyang Cui ([email protected]).

Citation

@inproceedings{cui-etal-2021-template,
    title = "Template-Based Named Entity Recognition Using {BART}",
    author = "Cui, Leyang  and
      Wu, Yu  and
      Liu, Jian  and
      Yang, Sen  and
      Zhang, Yue",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.161",
    doi = "10.18653/v1/2021.findings-acl.161",
    pages = "1835--1845",
}

templatener's People

Contributors

nealcly, parakalan


templatener's Issues

code

When will you make the code public?

Implementation for other language

Hi,

Thank you for your great contribution to this interesting template NER topic. I wonder if it is possible to adapt this code to another language. I have added the model and tokenizer to MODEL_CLASSES (and other parts), since the language uses a different tokenizer than English BART.

MODEL_CLASSES = {
    "auto": (AutoConfig, AutoModel, AutoTokenizer),
    "bart": (BartConfig, BartForConditionalGeneration, BartTokenizer),
    "bert": (BertConfig, BertModel, BertTokenizer),
    "roberta": (RobertaConfig, RobertaModel, RobertaTokenizer),
    "indobart": (MBartConfig, MBartForConditionalGeneration, IndoNLGTokenizer)
}
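For context, a minimal sketch of how such a registry is typically consumed; the model_type key and the checkpoint id below are illustrative assumptions, not taken from the repo. Beyond MODEL_CLASSES, the template strings and the hard-coded template count in inference.py (discussed in a later issue on this page) usually need the same adaptation.

# Hypothetical usage of the registry above; "indobart" and the checkpoint id
# are illustrative assumptions, not part of the original repo.
model_type = "indobart"
config_class, model_class, tokenizer_class = MODEL_CLASSES[model_type]

tokenizer = tokenizer_class.from_pretrained("indobenchmark/indobart-v2")  # assumed checkpoint id
model = model_class.from_pretrained("indobenchmark/indobart-v2")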

Could you share some hints on which part I should put attention to when adding other pre-trained models/language to the code?

Thank you so much for your help!

Best,
Oryza

Loss fluctuation

Hi, when I run train.py with data/train.csv and data/dev.csv, the loss fluctuates between 0.6 and 0.3 and doesn't seem to improve. Do you have any idea what might be the reason for this?

Further consideration of efficiency?

The paper considers efficiency, yet inference is still time-consuming, because each sample x of length n creates 8n * k templates (k being the number of entity types).

The source sequence of the model is an input text X = {x_1, ..., x_n}, and the target sequence T_{y_k, x_{i:j}} = {t_1, ..., t_m} is a template filled by the candidate text span x_{i:j} and the entity type y_k.

For efficiency, we restrict the number of n-grams for a span from one to eight, so 8n templates are created for each sentence.
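For concreteness, here is a minimal sketch (names are illustrative, not from the repo) of the span enumeration that produces this blow-up:

def enumerate_templates(tokens, templates, max_span_len=8):
    # Yield every (span, filled template) pair the decoder must score.
    # For a sentence of n tokens and k templates this yields roughly
    # max_span_len * n * k items, which is why inference is slow.
    n = len(tokens)
    for start in range(n):
        for length in range(1, max_span_len + 1):
            if start + length > n:
                break
            span = " ".join(tokens[start:start + length])
            for t in templates:
                yield span, span + t

templates = [" is a location entity .", " is not a named entity ."]  # shortened list
pairs = list(enumerate_templates("Japan began the defence".split(), templates))
print(len(pairs))  # about max_span_len * n * len(templates), truncated at the sentence end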

Evaluation Metrics

The accuracy computed in the project is only the exact-match accuracy of the generated sequence, which includes template tokens such as "is", "a", "an", and "entity".

I calculated the P, R, and F1 of the entity token content and entity class of the generated sequences (trained on CoNLL03). The evaluation results are inconsistent with the paper, with the 'organization' entity obtaining only 0.58 F1. Could you please publish the dataset used in the paper and the complete evaluation method?
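For reference, span-level precision/recall/F1 over (start, end, label) triples is the standard NER metric; a minimal sketch follows (this is not the repo's evaluation script):

def span_f1(pred_spans, gold_spans):
    # pred_spans / gold_spans: sets of (start, end, label) triples
    tp = len(pred_spans & gold_spans)
    p = tp / len(pred_spans) if pred_spans else 0.0
    r = tp / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

print(span_f1({(0, 0, "ORG"), (3, 4, "PER")}, {(0, 0, "ORG")}))  # (0.5, 1.0, 0.667)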

Error encountered when running the inference.py file

Hi,
Thank you for sharing. Could you please provide the versions of transformers and the other packages? I encountered a problem when running inference.py at the line output = model(input_ids=input_ids.to(device), decoder_input_ids=output_ids[:, :output_ids.shape[1] - 2].to(device))[0].
The error is as below:

(error screenshot not reproduced here)

Fine-tuning on few-shot datasets

Dear authors, I have already reproduced your results on CoNLL03 and obtained the corresponding model. How do I fine-tune on the MIT Movie few-shot dataset? Is it enough to change the dataset path in train.py and run it?

Final trained models

Hi,

I am working on using your paper for my research purposes. Would you be releasing the trained checkpoint anytime soon?

Thanks and Regards,

The answer sentence in csv files

The CSV file in the data directory seems to use only the gold entities as answer sentences. Should I add some negative samples, such as "by is not a named entity", as answer sentences in the CSV file for training? Thanks.
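For illustration, a minimal sketch of building training rows with such negatives, assuming the two-column (input_text, target_text) CSV convention that the seq2seq training setup implies; the sampling strategy is an assumption, not the authors' recipe:

import random

def make_pairs(tokens, entities, neg_per_sent=1):
    # entities: list of (start, end, type_name); returns (input_text, target_text) rows
    sent = " ".join(tokens)
    rows = [(sent, " ".join(tokens[s:e + 1]) + " is a %s entity ." % t) for s, e, t in entities]
    # negative sampling (assumed strategy): a random non-entity token mapped to the 'none' template
    inside = {i for s, e, _ in entities for i in range(s, e + 1)}
    candidates = [i for i in range(len(tokens)) if i not in inside]
    for i in random.sample(candidates, min(neg_per_sent, len(candidates))):
        rows.append((sent, tokens[i] + " is not a named entity ."))
    return rows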

CSV input files

Can you share the format of the input CSV files?
Thank you,
Viet

Seq2SeqModel predicts one entity at a time

Hi,
The Seq2SeqModel.predict function predicts one entity at a time, e.g.

predict("Tesla, IBM, and Amazon are the good tech companies") -> "Tesla is an Organization"

What about extracting IBM and Amazon as well, at the same time? (See the sketch below.)
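One workaround, sketched under the assumption that predict_span behaves like the template_entity function quoted later on this page (returns [start, end, label, score], with 'O' meaning no entity); the greedy skip-past-match loop is an illustration, not the repo's exact decoding:

def extract_all(tokens, predict_span, max_len=8):
    # Greedy multi-entity pass: keep every non-'O' prediction instead of a single one.
    text = " ".join(tokens)
    found, i = [], 0
    while i < len(tokens):
        # candidate n-grams starting at position i, up to max_len tokens long
        ngrams = [" ".join(tokens[i:i + j]) for j in range(1, max_len + 1) if i + j <= len(tokens)]
        start, end, label, score = predict_span(ngrams, text, i)
        if label != "O":
            found.append((start, end, label, score))
            i = end + 1  # continue after the matched span
        else:
            i += 1
    return found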

Could you please release the test data

Hi, thank you for your nice work.
Could you please release the test data for this code?
I have no idea what format the data should take for inference.py.

Hello, questions about the test set

Hello, I ran into two problems during test-set inference and would like to ask the authors.
(1) Why does this line need to set id[:, 0] == 2?

(screenshot not reproduced here)

(2) Why subtract 2 here? I don't understand the purpose of doing so.

(screenshot not reproduced here)

Training question

Hello, is the workflow simply: download the code, run train.py to train and obtain a model, download the datasets mentioned in the README, and then run inference.py directly? Also, which transformers version do you use? Some imports are flagged as errors in my environment.

Hard coded numbers in template_entity function of inference.py

Hi,

would you mind explaining some hard-coded numbers in the template_entity function from inference.py?

def template_entity(words, input_TXT, start):
    # input text -> template
    words_length = len(words)
    words_length_list = [len(i) for i in words]
    # one copy of the source sentence per (candidate span, template) pair; 5 == len(template_list)
    input_TXT = [input_TXT]*(5*words_length)

    input_ids = tokenizer(input_TXT, return_tensors='pt')['input_ids']
    model.to(device)
    template_list = [" is a location entity .", " is a person entity .", " is an organization entity .",
                     " is an other entity .", " is not a named entity ."]
    entity_dict = {0: 'LOC', 1: 'PER', 2: 'ORG', 3: 'MISC', 4: 'O'}
    temp_list = []
    for i in range(words_length):
        for j in range(len(template_list)):
            temp_list.append(words[i]+template_list[j])

    output_ids = tokenizer(temp_list, return_tensors='pt', padding=True, truncation=True)['input_ids']
    # 2 is BART's </s> id, which BART also uses as the decoder start token
    output_ids[:, 0] = 2
    output_length_list = [0]*5*words_length

    for i in range(len(temp_list)//5):
        # number of decoder positions to score for this span: tokenized length
        # minus the special tokens and the fixed template tail
        base_length = ((tokenizer(temp_list[i * 5], return_tensors='pt', padding=True, truncation=True)['input_ids']).shape)[1] - 4
        output_length_list[i*5:i*5+5] = [base_length]*5
        # " is not a named entity ." is one token longer than the other templates
        output_length_list[i*5+4] += 1

    score = [1]*5*words_length
    with torch.no_grad():
        # drop the last two decoder-input positions (the final " ." and </s>)
        output = model(input_ids=input_ids.to(device), decoder_input_ids=output_ids[:, :output_ids.shape[1] - 2].to(device))[0]
        for i in range(output_ids.shape[1] - 3):
            logits = output[:, i, :]
            logits = logits.softmax(dim=1)
            logits = logits.to('cpu').numpy()
            for j in range(0, 5*words_length):
                if i < output_length_list[j]:
                    # accumulate the probability of the gold next token along the template
                    score[j] = score[j] * logits[j][int(output_ids[j][i + 1])]

    # best (span, template) pair: row // 5 gives the span offset, row % 5 the entity type
    end = start + (score.index(max(score)) // 5)
    return [start, end, entity_dict[score.index(max(score)) % 5], max(score)]  # [start_index, end_index, label, score]

I learned from the open issues that the 5s are the length of template_list, but what about the other numbers?

It would be a great help if you could respond to this. Thank you in advance!
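Not an authoritative decoding of every constant, but for BART, id 0 is <s> and id 2 is </s> (which BART reuses as the decoder start token); that matches output_ids[:, 0] = 2 and the -2 trimming of the final " ." and </s>. A quick probe to count the positions yourself:

from transformers import BartTokenizer

tok = BartTokenizer.from_pretrained('facebook/bart-large')
ids = tok("Japan is a location entity .")['input_ids']
print(ids)                              # begins with 0 (<s>) and ends with 2 (</s>)
print(tok.convert_ids_to_tokens(ids))   # count which positions the scoring loop visits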

Repository not found

I tried to run inference.py but it gives the error

/configuration_utils.py", line 609, in _get_config_dict
    user_agent=user_agent,
  File "/usr/local/lib/python3.7/dist-packages/transformers/utils/hub.py", line 292, in cached_path
    local_files_only=local_files_only,
  File "/usr/local/lib/python3.7/dist-packages/transformers/utils/hub.py", line 495, in get_from_cache
    _raise_for_status(r)
  File "/usr/local/lib/python3.7/dist-packages/transformers/utils/hub.py", line 418, in _raise_for_status
    f"401 Client Error: Repository not found for url: {response.url}. "
transformers.utils.hub.RepositoryNotFoundError: 401 Client Error: Repository not found for url: https://huggingface.co/checkpoint-3060/resolve/main/config.json. If the repo is private, make sure you are authenticated.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "inference.py", line 104, in <module>
    model = BartForConditionalGeneration.from_pretrained('./checkpoint-3060')
  File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 1934, in from_pretrained
    **kwargs,
  File "/usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py", line 526, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py", line 553, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py", line 614, in _get_config_dict
    f"{pretrained_model_name_or_path} is not a local folder and is not a valid model identifier listed on "
OSError: ./checkpoint-3060 is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.
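The traceback suggests transformers fell back to treating './checkpoint-3060' as a Hub repo id because the local folder was not found; a minimal sketch of the usual check, assuming the checkpoint directory was produced by train.py (the path is an assumption):

import os
from transformers import BartForConditionalGeneration

ckpt = os.path.abspath('./checkpoint-3060')  # adjust to wherever train.py saved the model
assert os.path.isdir(ckpt), "checkpoint folder not found: %s" % ckpt
model = BartForConditionalGeneration.from_pretrained(ckpt)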

Cross-domain Few-Shot NER Result

With no source-domain data, for MIT Movie 10-shot the paper reports 37.3, but I get 51.06. Could you share some details about this setting?

Output duplication

Hi, I tried your code, and when I use your demo to generate output I expected ["original text", "predicted entity with prompt"]; however, it just outputs ['JapanJapanJapan JapanJapanJapan'].
Is there any layer you added during fine-tuning that is not shown in the code?

Custom Label Problem

When I train the model with custom labels, the training code works well. However, adapting the inference.py code to my custom-trained model does not work.

I changed the inference.ipynb code to adapt to my 11 labels as follows:

    LABELS=["Adjective","API","Core","GUI","Hardware","Language","Platform","Standard","User","Verb","O"]
    template_list=[" is a %s entity"%(e) for e in LABELS]
    entity_dict={i:e for i, e in enumerate(LABELS)}

Here is the checkpoint loading:

tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')
model = BartForConditionalGeneration.from_pretrained('./outputs/best_model')

Here is the inference call and the error:

prediction("As a user I should be able to use the attribute type User in my queries.")

RuntimeError
----> 2 prediction("As a user I should be able to use the attribute type User in my queries.")
/usr/local/lib/python3.7/dist-packages/transformers/models/bart/modeling_bart.py in _shape(self, tensor, seq_len, bsz)
157 def _shape(self, tensor: torch.Tensor, seq_len: int, bsz: int):
--> 158 return tensor.view(bsz, seq_len, self.num_heads, self.head_dim).transpose(1, 2).contiguous()
RuntimeError: shape '[88, -1, 16, 64]' is invalid for input of size 778240
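The failing batch dimension of 88 equals 11 templates * 8 candidate spans, which is consistent with the hard-coded 5s in template_entity (quoted above): input_TXT is replicated 5 * words_length times while 11 templates produce 11 * words_length decoder rows, so the encoder and decoder batches disagree. A hedged sketch of the parametrization (illustrative, not a drop-in patch):

# Parametrize what inference.py hard-codes as 5 (illustrative, not a drop-in patch)
LABELS = ["Adjective", "API", "Core", "GUI", "Hardware", "Language",
          "Platform", "Standard", "User", "Verb", "O"]
template_list = [" is a %s entity ." % e for e in LABELS[:-1]] + [" is not a named entity ."]
num_templates = len(template_list)        # 11, not the hard-coded 5

words_length = 8                          # illustrative number of candidate spans
input_TXT = "As a user I should be able to use the attribute type User in my queries."
batch = [input_TXT] * (num_templates * words_length)
print(len(batch))                         # 88 == the failing batch dimension in the traceback
# ...and every `* 5`, `// 5`, `% 5` on batch/score indices must become num_templates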

Cross-domain question

Hello, I ran into a key-value problem in the cross-domain experiments. I am using bert-softmax; how did you handle this when running the Sequence Labeling BERT baseline? Looking forward to your reply.
