
conlleval-python

NOTE: This repository is not actively maintained. Please check the Issues page for known bug reports; I will try to fix them later. Sorry for any inconvenience.

Intro

This repository contains two scripts:

  • conlleval_perl.py: the Python equivalent of the Perl script conlleval, which can be used for measuring the performance of a system that has processed the CoNLL-2000 shared task data.

  • conlleval.py: a slight modification of the above script so that it can be imported and used elsewhere. You will find from conlleval import evaluate useful.

For more information on the original Perl script, see http://www.cnts.ua.ac.be/conll2000/chunking/output.html.

Usage

Both scripts can be run from the command line to evaluate a file in the supported format (see Data format below).

Read from output.txt and print the results to the console:

  python conlleval.py < output.txt

or save the results in result.txt:

  python conlleval.py < output.txt > result.txt

And the result is:

   processed 961 tokens with 459 phrases; found: 539 phrases; correct: 371.
   accuracy:  84.08%; precision:  68.83%; recall:  80.83%; FB1:  74.35
                ADJP: precision:   0.00%; recall:   0.00%; FB1:   0.00
                ADVP: precision:  45.45%; recall:  62.50%; FB1:  52.63
                  NP: precision:  64.98%; recall:  78.63%; FB1:  71.16
                  PP: precision:  83.18%; recall:  98.89%; FB1:  90.36
                SBAR: precision:  66.67%; recall:  33.33%; FB1:  44.44
                  VP: precision:  69.00%; recall:  79.31%; FB1:  73.80

Options for conlleval_perl.py (the same as the original Perl script):

  • -l: Generate output as part of a LaTeX table. The definition of the table can be found in an example document on the original site (provided there in LaTeX, PS, and PDF formats).

  • -d char: On each line, use this character rather than whitespace (or tab) as the delimiter between tokens.

  • -r: Assume raw output tokens, that is, without the B- and I- prefixes. In this case each word is counted as one chunk.

  • -o token: Use token as output tag for items that are outside of chunks or other classes. This option only works when -r is used as well. The default value for the outside output tag is O.
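
For example, to evaluate a file of raw tags (no B-/I- prefixes) that uses OUT as the outside tag, the invocation would look like this (OUT and the file name are placeholders, not values from this repository):

  python conlleval_perl.py -r -o OUT < output.txt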

Usage for conlleval.py:

from conlleval import evaluate

# print out the table as above
evaluate(true_tags, pred_tags, verbose=True) 

# calculate overall metrics
prec, rec, f1 = evaluate(true_tags, pred_tags, verbose=False)
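
The tags are plain IOB2/IOBES strings in flat lists, one tag per token. A minimal sketch (the tag values here are invented for illustration):

from conlleval import evaluate

# flat lists of IOB2 tags, one tag per token
true_tags = ['B-NP', 'I-NP', 'O', 'B-VP', 'I-VP']
pred_tags = ['B-NP', 'B-NP', 'O', 'B-VP', 'I-VP']

prec, rec, f1 = evaluate(true_tags, pred_tags, verbose=False)
print(prec, rec, f1)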

Data format

NOTE: This script can be used with the IOB2 or IOBES tagging scheme. If you are using a different scheme, please convert it to IOB2 or IOBES first.

EDIT: There have been reports that IOB2 support is broken (see the Issues page). This has not been fixed yet.

For an example of data format to be used with this script, check out the accompanied output.txt file in this repository, or the original source at http://www.cnts.ua.ac.be/conll2000/chunking/output.txt.gz.

Sentences are separated by empty lines. Each line contains at least three columns, separated by whitespace (or the character specified with -d). The second-to-last column is the chunk tag according to the corpus, and the last column is the predicted chunk tag. All other columns are ignored in evaluation.

Example:

   Boeing NNP B-NP I-NP
   's POS B-NP B-NP
   747 CD I-NP I-NP
   jetliners NNS I-NP I-NP
   . . O O
   
   Rockwell NNP B-NP I-NP
   said VBD B-VP B-VP
   the DT B-NP B-NP
   agreement NN I-NP I-NP


conlleval's Issues

Is this evaluation a strict or a relaxed metric?

I have run the script on the following data:

(978) B-Phone I-Phone
934-3623 I-Phone I-Phone

In IOB2 mode, every entity should start with a B- tag, so the precision on the data above should be 0. However, this script reports 100% precision, and the original Perl conlleval script gives the same result.
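
For reference, the case above can be reproduced through the library interface (a sketch built from the two tag columns of that data):

from conlleval import evaluate

true_tags = ['B-Phone', 'I-Phone']  # gold column
pred_tags = ['I-Phone', 'I-Phone']  # predicted column
evaluate(true_tags, pred_tags, verbose=True)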

ValueError: not enough values to unpack (expected 2, got 1)

Getting this error when using arbitrary tags that do not follow the B-/I- naming scheme.

count_chunks(true_seqs, pred_seqs)
129
130 _, true_type = split_tag(true_tag)
--> 131 _, pred_type = split_tag(pred_tag)
132
133 if correct_chunk is not None:

ValueError: not enough values to unpack (expected 2, got 1)

I believe this happens if tags don't have a "-" in them and aren't "O".
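
A possible workaround until this is fixed (an assumption, not part of the repository): rewrite any tag that is neither "O" nor of the PREFIX-TYPE form into a B- tag before evaluating, so that split_tag always finds a dash:

from conlleval import evaluate

def normalize(tag):
    # treat bare tags (no '-', not 'O') as chunk-initial B- tags;
    # this is one possible convention, not the script's own behavior
    if tag == 'O' or '-' in tag:
        return tag
    return 'B-' + tag

# hypothetical bare tags that would otherwise crash split_tag
true_tags = ['PER', 'O', 'LOC']
pred_tags = ['PER', 'O', 'O']
evaluate([normalize(t) for t in true_tags],
         [normalize(t) for t in pred_tags])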

OSError: conlleval: unexpected number of features

My output.txt file is in the following format:
Sao NC B-LOC B-LOC
Paulo VMI I-LOC I-LOC
( Fpa O O
Brasil NC B-LOC B-LOC
) Fpt O O
, Fc O O
23 Z O O
may NC O O
( Fpa O O
EFECOM NP B-ORG B-ORG
) Fpt O O
. Fp O O

This should be okay according to the CoNLL format, but after running conlleval.py I am getting this error:
[screenshot of the error omitted]
Do you have any idea why? @sighsmile
Thanks in advance!

About evaluate(true_tags, pred_tags, verbose=True)

Should true_tags and pred_tags be two-dimensional nested lists, i.e. [[...], [...], [...]], with one inner list per sequence?

# This raises an error
from conlleval import evaluate
true_tags =[ ['O', 'B-Part', 'I-Part'],[ 'O', 'O', 'O']]
pred_tags = [ ['O', 'B-Part', 'I-Part'],[ 'O', 'O', 'O']]
evaluate(true_tags,pred_tags)

# This does not raise an error
from conlleval import evaluate
true_tags =[ 'O', 'B-Part', 'I-Part', 'O', 'O', 'O']
pred_tags = [ 'O', 'B-Part', 'I-Part', 'O', 'O', 'O']
evaluate(true_tags,pred_tags)

Does this mean that only one sequence can be evaluated at a time?
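
Judging from the two snippets above, evaluate expects flat lists. One possible workaround (an assumption, not part of the repository) is to flatten the nested lists yourself, inserting an 'O' between sentences so that chunks cannot run across sentence boundaries:

from conlleval import evaluate

def flatten(seqs):
    flat = []
    for seq in seqs:
        flat.extend(seq)
        flat.append('O')  # boundary tag so chunks cannot span sentences
    return flat

true_tags = [['O', 'B-Part', 'I-Part'], ['O', 'O', 'O']]
pred_tags = [['O', 'B-Part', 'I-Part'], ['O', 'O', 'O']]
evaluate(flatten(true_tags), flatten(pred_tags))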

IOB2 evaluation

IOB2 is supported as input, as the README says, but the evaluation is not done the IOB2 way; it follows IOB1. For example, under IOB1 the sequence O I-LOC I-LOC legitimately starts a LOC chunk at the first I-LOC, while under IOB2 a chunk must begin with B-LOC, so that sequence is malformed. So you can either provide an option for people to choose between IOB1 and IOB2 evaluation, or drop the claim of IOB2 support.

Why are the results different between conlleval.py and conlleval_perl.py?

My output file looks like this:

a  B-LOC  B-LOC
b  I-LOC  E-LOC
c  E-LOC  S-LOC

The results differ. For conlleval.py:

processed 3 tokens with 1 phrases; found: 2 phrases; correct: 0.
accuracy:  33.33%; (non-O)
accuracy:  33.33%; precision:   0.00%; recall:   0.00%; FB1:   0.00
              LOC: precision:   0.00%; recall:   0.00%; FB1:   0.00  2

and for conlleval_perl.py:

processed 3 tokens with 1 phrases; found: 1 phrases; correct: 1.
accuracy:  33.33%; precision: 100.00%; recall: 100.00%; FB1: 100.00
              LOC: precision: 100.00%; recall: 100.00%; FB1: 100.00  1

Which one is reliable?
