Comments (9)
Can you be more specific about which results you tried to replicate?
I used the data here: http://conll.cemantix.org/2012/data.html which is train-v4, dev-v4 and test-v9.
from dilated-cnn-ner.
We both used the same data.
I tried to replicate the results from your paper, Fast and Accurate Entity Recognition with Iterated Dilated Convolutions.
On the test set I got an F1 score of 64, but the paper reports an F1 score of around 86.
I tested the dilated-cnn config and it seems correct to me; the result is not that far off :)
Best dev F1: 84.55
Segment evaluation (test):
Type         F1     Prec   Recall
Micro (Seg)  85.37  85.60  85.14
Macro (Seg)  72.14  73.70  70.65
-------
ORDINAL      80.31  80.10  80.51
LOC          69.92  67.89  72.07
PRODUCT      52.70  54.17  51.32
NORP         92.38  91.78  92.98
WORK_OF_ART  48.52  47.67  49.40
LANGUAGE     44.44  57.14  36.36
MONEY        83.68  83.28  84.08
PERCENT      88.32  87.82  88.83
PERSON       90.85  91.57  90.14
ORG          83.37  83.79  82.95
CARDINAL     82.71  83.85  81.60
GPE          93.79  94.60  92.99
TIME         58.39  60.30  56.60
DATE         83.21  81.58  84.89
FAC          62.17  62.88  61.48
LAW          61.76  75.00  52.50
EVENT        45.87  54.35  39.68
QUANTITY     70.97  68.75  73.33
Processed 152728 tokens with 11257 phrases; found: 11196 phrases; correct: 9584.
Testing time: 44 seconds
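As a sanity check on the log above, the micro-averaged row can be recomputed from the three counts in the "Processed ..." line. The sketch below assumes the usual segment-level definitions (precision = correct / found, recall = correct / gold phrases); the function name `micro_prf` is made up for illustration and is not part of the repository's evaluation code.

```python
def micro_prf(correct, found, gold):
    """Return micro-averaged (precision, recall, F1) as percentages.

    Assumes the standard definitions: precision = correct / found,
    recall = correct / gold, F1 = harmonic mean of the two.
    """
    precision = correct / found
    recall = correct / gold
    f1 = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * value, 2) for value in (precision, recall, f1))


# Counts copied from the "Processed ... phrases" line of the log.
precision, recall, f1 = micro_prf(correct=9584, found=11196, gold=11257)
print(f"Prec {precision:.2f}  Recall {recall:.2f}  F1 {f1:.2f}")
# prints: Prec 85.60  Recall 85.14  F1 85.37
```

These values agree with the Micro (Seg) row, which suggests that row is pooled over all phrases rather than averaged per entity type.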
@strubell One last question: is the F1 score presented in the Fast and Accurate Entity Recognition with Iterated Dilated Convolutions paper F1_macro or F1_micro?
Micro
Thanks!
Hello, I have a question about the data split. From your earlier comment:
I used the data here: http://conll.cemantix.org/2012/data.html which is train-v4, dev-v4 and test-v9.
As far as I can tell, test-v9 contains 11,057 entities, while the test-v4 section contains 11,257, the number you mention in Table 8. Which test set did you use? From the preprocess.sh script it looks like it was v4, because the file names end in gold_conll rather than gold_parse_conll as in the v9 test set.
I used this to count the entities: find annotations/ -name '*.v4_gold_conll' | grep -v 'pt/nt' | xargs cat | sed 's/\s\s*/ /g' | cut -d' ' -f11 | sed -n '/^(/p' | wc -l
Thanks for your clarification.
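For anyone who wants to repeat the count without the shell pipeline, here is a rough Python equivalent. It is only a sketch: the function name `count_entity_starts` and its parameters are hypothetical, and it assumes, as the command above does, that the 11th whitespace-separated column holds the entity tags and that an entity begins wherever that column starts with "(".

```python
import os


def count_entity_starts(root, suffix=".v4_gold_conll", skip="pt/nt", column=10):
    """Count lines whose 11th column (index 10) opens an entity with '('.

    Mirrors the find | grep -v | cut -f11 | sed -n '/^(/p' | wc -l pipeline;
    the default suffix, skip pattern, and column are assumptions taken from it.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Keep only files with the requested suffix, outside the skipped subtree.
            if not name.endswith(suffix) or skip in path.replace(os.sep, "/"):
                continue
            with open(path, encoding="utf-8") as handle:
                for line in handle:
                    fields = line.split()
                    if len(fields) > column and fields[column].startswith("("):
                        total += 1
    return total
```

Calling `count_entity_starts("annotations/")` should then give a number comparable to the pipeline's; switching `suffix` to the v9 file ending makes it possible to compare the two test sets directly.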