
ner-chinese's Introduction

NER-Chinese

Comparison of Chinese Named Entity Recognition Models between NeuroNER and BertNER

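  1. Word Embedding-BiLSTM-CRF: Many experiments have shown that this architecture is the most mainstream model for named entity recognition; NeuroNER is a representative tool. It consists of an Embedding layer (mainly word vectors, character vectors, and some additional features), a bidirectional LSTM layer, and a final CRF layer.
  2. Bert-BiLSTM-CRF: Since the BERT language model achieved the best results on 11 NLP tasks, fine-tuning it for Chinese named entity recognition was bound to become a trend. It replaces the word2vec part of the original network with the BERT model to form the Embedding layer, and likewise uses a bidirectional LSTM layer and a final CRF layer for sequence prediction.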
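
Both architectures end in a CRF layer, which picks the best-scoring tag sequence as a whole instead of tagging each character independently. The following is a minimal pure-Python sketch of that decoding step (Viterbi), assuming the BiLSTM has already produced per-character emission scores. The tag set and every score below are made up for illustration and are not taken from NeuroNER or BertNER.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j at position t (e.g. from the BiLSTM)
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters)
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])   # best score of any path ending in each tag
    backpointers = []            # backpointers[t-1][j] = best previous tag for tag j at t
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Walk back from the best final tag to recover the full path.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative tag set and scores (assumptions, not real model output).
TAGS = ["O", "B-PER", "I-PER"]
FORBIDDEN = -1e4
TRANSITIONS = [
    [0.0, 0.0, FORBIDDEN],  # from O: O -> I-PER is not a valid BIO transition
    [0.0, 0.0, 0.5],        # from B-PER
    [0.0, 0.0, 0.5],        # from I-PER
]
EMISSIONS = [
    [1.0, 0.8, 0.1],  # per-position argmax would pick O
    [0.1, 0.2, 1.0],  # per-position argmax would pick I-PER
    [1.0, 0.1, 0.3],  # per-position argmax would pick O
]
decoded = [TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(decoded)
```

With these scores, tagging each position independently would give the invalid sequence O, I-PER, O, while the decoder returns B-PER, I-PER, O because the transition scores penalize an I-PER that follows O.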

ner-chinese's People

Contributors

eoa-ailab, jovenchu

ner-chinese's Issues

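New entity categories have a recognition rate of 0 after being added

I added two new entity categories (about 500 entities) to the training set. After training, the reported recognition rate for both new categories is 0, and the prediction results likewise contain none of the new entities. I added the new categories to label_list and also modified the training files. Does this model support adding new entity categories?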
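
One thing worth ruling out for a question like this is a mismatch between the tags that appear in the training file and the entries in the label list (for example, a `B-` tag added without its `I-` counterpart). The sketch below assumes a CoNLL-style file with one `character tag` pair per line and blank lines between sentences; that format, the sample data, and the helper names `count_tags` and `missing_labels` are assumptions for illustration, not part of this repository.

```python
from collections import Counter
from typing import Dict, Iterable


def count_tags(lines: Iterable[str]) -> Counter:
    """Count how often each tag occurs; the tag is the last column of a line."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:      # skip blank sentence separators
            counts[parts[-1]] += 1
    return counts


def missing_labels(tag_counts: Counter, label_list: Iterable[str]) -> Dict[str, int]:
    """Return tags found in the data that the label list does not contain."""
    known = set(label_list)
    return {tag: n for tag, n in tag_counts.items() if tag not in known}


# Made-up sample: DRUG is the hypothetical new category.
sample_lines = [
    "阿 B-DRUG", "司 I-DRUG", "匹 I-DRUG", "林 I-DRUG", "很 O", "常 O", "见 O",
    "",
    "华 B-ORG", "为 I-ORG",
]
label_list = ["O", "B-ORG", "I-ORG", "B-DRUG"]   # I-DRUG was forgotten

tag_counts = count_tags(sample_lines)
print(missing_labels(tag_counts, label_list))
```

In real use, `sample_lines` would be replaced by the lines of the training file, opened with `encoding="utf-8"`. An empty result only shows that the label list covers the data; it does not by itself explain a recognition rate of 0.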

Problem loading the BERT model

Hello! I ran into this problem while training the model:
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file C:\Users\Desktop\ner_zh\bert_ner\bert_ner\chinese_L-12_H-768_A-12\bert_model.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
How do I load the model correctly? I tried removing the suffix from "bert_model.ckpt.data-00000-of-00001", but that did not help. Looking forward to your reply!
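
A TensorFlow checkpoint in the current format is stored as `<prefix>.index` plus one or more `<prefix>.data-XXXXX-of-YYYYY` shards, and the restore call is given only the prefix (here `bert_model.ckpt`). If a file named exactly like the prefix exists, for instance after renaming the data shard, TensorFlow will likely try to read it as an older table format, which would fit the "bad magic number" message. The following is a small sketch that checks a checkpoint prefix for these conditions; `checkpoint_problems` is a hypothetical helper and does not call TensorFlow.

```python
import glob
import os
from typing import List


def checkpoint_problems(prefix: str) -> List[str]:
    """Return human-readable problems found for a checkpoint prefix."""
    problems = []
    if os.path.isfile(prefix):
        # A bare file at the prefix path is usually a renamed shard.
        problems.append("a file named exactly like the prefix exists: " + prefix)
    if not os.path.isfile(prefix + ".index"):
        problems.append("missing index file: " + prefix + ".index")
    # glob.escape keeps brackets or other special characters in the path literal.
    if not glob.glob(glob.escape(prefix) + ".data-*"):
        problems.append("no data shards matching: " + prefix + ".data-*")
    return problems


if __name__ == "__main__":
    # Path fragment taken from the report above; adjust to the local directory.
    for problem in checkpoint_problems(os.path.join("chinese_L-12_H-768_A-12",
                                                    "bert_model.ckpt")):
        print(problem)
```

If the data shard was renamed, restoring its original name (`bert_model.ckpt.data-00000-of-00001`) and passing only the prefix as `init_checkpoint` is the usual arrangement.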

What causes the following error when running?

I get the following error when running (using tensorflow 1.13.1):
No such file or directory: 'D:\Workproject\bert_ner\bert_ner\output\label2id.pkl'

How can this be solved?

The full output is as follows:

Namespace(batch_size=32, bert_config_file='D:\Workproject\bert_ner\bert_ner\chinese_L-12_H-768_A-12\bert_config.json', bert_path='D:\Workproject\bert_ner\bert_ner\chinese_L-12_H-768_A-12', cell='lstm', clean=True, clip=0.5, data_dir='D:\Workproject\bert_ner\bert_ner\NERdata', device_map='0', do_eval=True, do_lower_case=True, do_predict=True, do_train=True, dropout_rate=0.5, filter_adam_var=False, init_checkpoint='D:\Workproject\bert_ner\bert_ner\chinese_L-12_H-768_A-12\bert_model.ckpt', label_list=None, learning_rate=1e-05, lstm_size=128, max_seq_length=512, model_path='D:\Workproject\bert_ner\bert_ner\output', ner='ner', num_layers=1, num_train_epochs=30, output_dir='D:\Workproject\bert_ner\bert_ner\output2', save_checkpoints_steps=200, save_summary_steps=200, verbose=False, vocab_file='D:\Workproject\bert_ner\bert_ner\chinese_L-12_H-768_A-12\vocab.txt', warmup_proportion=0.1)
2020-12-05 15:45:05.235899: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
checkpoint path:D:\Workproject\bert_ner\bert_ner\output\checkpoint
Traceback (most recent call last):
  File "d:/Workproject/bert_ner/bert_ner_api.py", line 50, in <module>
    predict_cmd()
  File "d:/Workproject/bert_ner/bert_ner_api.py", line 28, in predict_cmd
    from src.bert_ner_predict import ner_predict
  File "d:\Workproject\bert_ner\src\bert_ner_predict.py", line 49, in <module>
    with codecs.open(os.path.join(model_dir, 'label2id.pkl'), 'rb') as rf:
  File "C:\Users\hxuwjj\AppData\Local\Programs\Python\Python36\lib\codecs.py", line 897, in open
    file = builtins.open(filename, mode, buffering)
FileNotFoundError: [Errno 2] No such file or directory: 'D:\Workproject\bert_ner\bert_ner\output\label2id.pkl'
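
The logged arguments show `model_path` ending in `output` while `output_dir` ends in `output2`, and the traceback looks for `label2id.pkl` under `output`. If training writes its artifacts to `output_dir` (an assumption about this code base that is not verified here), prediction would be reading from a directory that was never populated. The following is a small sketch that checks candidate directories for the file before predicting; `find_model_dir` is a hypothetical helper.

```python
import os
from typing import Iterable, Optional


def find_model_dir(candidates: Iterable[str],
                   required: str = "label2id.pkl") -> Optional[str]:
    """Return the first candidate directory that contains the required file."""
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, required)):
            return directory
    return None


if __name__ == "__main__":
    # Directory names follow the log above; adjust to the local layout.
    found = find_model_dir(["output", "output2"])
    print(found if found else "label2id.pkl not found; has training finished?")
```

If the file turns up only under `output2`, pointing `model_path` at that directory (or training into `output`) would be the thing to try.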

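Prediction speed

I read on the "AI前线" (AI Frontline) WeChat official account that BertNER's prediction time can reach 80 ms. How did you measure this, and did you use any model optimization techniques? With TensorFlow Serving on a P100, I can only get prediction times of around 400 ms. I hope you can offer some guidance.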
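
Latency figures such as 80 ms and 400 ms are only comparable when they are measured the same way: same sentence length, same batch size, and with model loading and warm-up excluded. How the 80 ms figure was obtained is not stated here. The following is a minimal timing sketch under those assumptions; `measure_latency_ms` is a hypothetical helper, and the `predict` callable stands in for whatever issues the real request.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def measure_latency_ms(predict: Callable[[str], object],
                       inputs: Sequence[str],
                       warmup: int = 3,
                       repeats: int = 10) -> Dict[str, float]:
    """Time single-item predictions, excluding warm-up calls, in milliseconds."""
    for _ in range(warmup):          # first calls often include one-off setup cost
        predict(inputs[0])
    samples = []
    for _ in range(repeats):
        for item in inputs:
            start = time.perf_counter()
            predict(item)
            samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "count": float(len(samples)),
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Stand-in predictor; replace with the real model or serving call.
    stats = measure_latency_ms(lambda text: len(text), ["短句", "一个稍微长一点的句子"])
    print(stats)
```

Reporting the median alongside the maximum helps separate steady-state latency from occasional slow requests.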

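Where can I find 《基于Bert-NER构建特定领域的中文信息抽取框架(下)》 (Building a Domain-Specific Chinese Information Extraction Framework Based on Bert-NER, Part 2)?

I came across Part 1 of this article (《基于Bert-NER构建特定领域的中文信息抽取框架(上)》) online, thought your work was very good, and came here to learn from it. Has the relation extraction code been released?
Also, the article says that relation extraction is covered in detail in Part 2, but I could not find that part. Could you provide it?
Thanks!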

Training process gets killed

When the program reaches this line, it gets Killed:
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
Do you know why?

AttributeError: module 'tensorflow.contrib.estimator' has no attribute 'stop_if_no_decrease_hook'

Hello, I am using tensorflow version 1.14.0, and running your code produces:
Traceback (most recent call last):
  File "bert_ner_api.py", line 47, in <module>
    train_api()
  File "bert_ner_api.py", line 21, in train_api
    train(args)
  File "/home/desktop/KG/NER-Chinese/bert_ner/bert_ner/bert_base/train/bert_lstm_ner.py", line 677, in train
    early_stopping_hook = tf.contrib.estimator.stop_if_no_decrease_hook(
AttributeError: module 'tensorflow.contrib.estimator' has no attribute 'stop_if_no_decrease_hook'
Please help me resolve this.
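
To my knowledge, the early-stopping hooks were moved out of `tf.contrib.estimator` around TensorFlow 1.14 and are available as `tf.estimator.experimental.stop_if_no_decrease_hook`; this is worth verifying against the installed version. The following is a sketch of a lookup that tries the newer location first and falls back to the older one. `resolve_first` is a hypothetical helper, and the demo uses stand-in objects so that it runs without TensorFlow.

```python
from types import SimpleNamespace
from typing import Any, Sequence


def resolve_first(root: Any, dotted_paths: Sequence[str]) -> Any:
    """Return the first attribute path under root that can be resolved."""
    for path in dotted_paths:
        obj = root
        try:
            for name in path.split("."):
                obj = getattr(obj, name)
        except AttributeError:
            continue                 # this location does not exist; try the next
        return obj
    raise AttributeError("none of these attributes exist: %s" % list(dotted_paths))


HOOK_LOCATIONS = [
    "estimator.experimental.stop_if_no_decrease_hook",  # assumed newer location
    "contrib.estimator.stop_if_no_decrease_hook",       # location used in the traceback
]

# Stand-in for a TensorFlow module that only has the newer location.
fake_tf = SimpleNamespace(
    estimator=SimpleNamespace(
        experimental=SimpleNamespace(stop_if_no_decrease_hook="new-location hook")),
    contrib=SimpleNamespace(estimator=SimpleNamespace()),
)
print(resolve_first(fake_tf, HOOK_LOCATIONS))

# Intended use with the real library (not executed here):
#   import tensorflow as tf
#   make_hook = resolve_first(tf, HOOK_LOCATIONS)
#   early_stopping_hook = make_hook(...)  # same arguments as in bert_lstm_ner.py
```

Alternatively, installing the TensorFlow version the code was written against avoids the lookup altogether.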

Question about running the model with NeuroNER

To use the CPU if you have installed tensorflow, or use the GPU if you have installed tensorflow-gpu:

python3.5 main.py

Which folder under Neuro-master contains this main.py file?
(NeroNER) zhao@cumt:~/anaconda3/envs/NeroNER/NeuroNER-master$ ls
data LICENSE MANIFEST.in neuroner parameters.ini README.md requirements.txt setup.py test
