Comments (3)
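Hello, and thank you for your question.
Generally speaking, the learning rate has little to do with whether training proceeds normally. On our tasks, the difference between 8e-6 and 2e-5 should only amount to 1-2 F1 points in the final result; it should not cause a failure to converge.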
from mrc-for-flat-nested-ner.
That is not normal. Something has probably gone wrong elsewhere.
from mrc-for-flat-nested-ner.
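I used a dense layer directly rather than MultiNonLinearClassifier, and the sample construction should be fine. I'm not sure whether the problem is in decoding: when decoding, I require the start and end probabilities to be greater than 0.5 and the value of span[start, end] to be greater than 0.5 as well. During training, the loss drops to a certain level and then stops changing, and the F1 stays close to 0 the whole time. Also, the lr for fine-tuning is usually between 2e-5 and 5e-5, but here it is set to 8e-6. Was that tuned specifically for the span loss?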
from mrc-for-flat-nested-ner.
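The decoding rule described in the question (start and end probabilities above 0.5, and span[start, end] above 0.5) can be sketched as follows. This is only an illustration under stated assumptions, not the repository's implementation: `decode_spans`, its argument names, and the `start <= end` constraint are hypothetical, and the inputs are assumed to be probabilities (for example, after a sigmoid) for a single sequence.

```python
import numpy as np


def decode_spans(start_prob, end_prob, span_prob, threshold=0.5):
    """Return (start, end) index pairs that pass all three thresholds.

    Hypothetical helper for illustration only.
    start_prob, end_prob: shape (seq_len,), per-token probabilities.
    span_prob: shape (seq_len, seq_len), span_prob[i, j] is the
    probability that tokens i..j form an entity.
    """
    start_prob = np.asarray(start_prob, dtype=float)
    end_prob = np.asarray(end_prob, dtype=float)
    span_prob = np.asarray(span_prob, dtype=float)

    start_ok = start_prob > threshold  # candidate start positions
    end_ok = end_prob > threshold      # candidate end positions

    # A pair (i, j) survives only if i is a start, j is an end,
    # and the span score itself also exceeds the threshold.
    keep = start_ok[:, None] & end_ok[None, :] & (span_prob > threshold)

    # Assumption: a span cannot end before it starts.
    keep = np.triu(keep)

    return [(int(s), int(e)) for s, e in zip(*np.nonzero(keep))]


if __name__ == "__main__":
    start_prob = [0.9, 0.1, 0.8, 0.2]
    end_prob = [0.2, 0.7, 0.1, 0.9]
    span_prob = np.zeros((4, 4))
    span_prob[0, 1] = 0.9   # kept
    span_prob[0, 3] = 0.3   # dropped: span score too low
    span_prob[2, 1] = 0.95  # dropped: start > end
    span_prob[2, 3] = 0.8   # kept
    print(decode_spans(start_prob, end_prob, span_prob))
```

With a rule like this, an F1 near 0 can simply mean that no position ever crosses a threshold, so printing the maximum of each probability array on a few training examples is one way to check whether decoding or training is at fault.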
Related Issues (20)
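- Question about the offset handling in the dataset code HOT 1
- For the Chinese NER datasets (msra, onto4), are the results in your paper on the validation set or the test set?
- On the MSRA dataset, start and end converge quickly, but span recall is < 50%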
- RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1607370128159/work/torch/lib/c10d/ProcessGroupNCCL.cpp:31, unhandled cuda error, NCCL version 2.7.8
- Error after switching to my own dataset; cannot train HOT 14
- train problem: training hangs at the final line "current training loss is : 0.01826123334467411"
- When reproduce zh_msra for both bert-tagger and bert-mrc, which bert pretrain model is used?
- conll03.sh can't reproduce the f1 score in paper
- The mrc-ner script only works on GPU 0
- Error encountered during inference HOT 3
- Inference stage complains BertQueryNER were not initialized from the model checkpoint
- Inference stage complains BertQueryNER were not initialized from the model checkpoint
- Queries for OntoNotes 5.0 HOT 1
- Project dependencies may have API risk issues
- the msra2src.py may have some problems HOT 4
- BERT DIR?
- Annotation Guideline Notes
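- When reproducing with the sh scripts, it still fails to run after modifying DATA_DIR, BERT_DIR, and OUTPUT_DIR HOT 1
- About converting NER into an MRC task
- How does everyone prepare their own datasets? HOT 3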