Scorch¹

This is an alternative implementation of the coreference scorer for the CoNLL-2011/2012 shared tasks on coreference resolution.

It aims to be more straightforward than the reference implementation, while maintaining as much compatibility with it as possible.

The implementations of the various scores are as close as possible to the formulas used by Pradhan et al. (2014), with the edge cases for BLANC taken from Recasens and Hovy (2011).


1. Scorer for coreference chains.

Usage

scorch gold.json sys.json out.txt

Install

From the cheeseshop

python3 -m pip install --user scorch

Or directly from git

python3 -m pip install --user git+https://github.com/LoicGrobol/scorch.git

Formats

Single document

The input files should be JSON files with a "type" key at the top level

  • If "type" is "graph", the top level should also have
    • A "mentions" key containing a list of all mention identifiers
    • A "links" key containing a list of pairs of coreferring mention identifiers
  • If "type" is "clusters", the top level should have a "clusters" key containing a mapping from cluster ids to cluster contents (as lists of mention identifiers).

Of course the system and gold files should use the same set of mention identifiers for the mentions they have in common.
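
As an illustration of the two layouts described above, here is a minimal Python sketch that writes one file of each type. The mention identifiers (`m1` to `m4`), the cluster ids (`c1`, `c2`), and the helper name `write_inputs` are made up for this example, and encoding each link as a two-element list is an assumption based on the "list of pairs" wording.

```python
import json
from pathlib import Path

# Hypothetical gold document in the "graph" layout: a flat list of mentions
# plus pairs of mentions that corefer. "m4" has no link, so it is a singleton.
gold_graph = {
    "type": "graph",
    "mentions": ["m1", "m2", "m3", "m4"],
    "links": [["m1", "m2"], ["m2", "m3"]],
}

# Hypothetical system document in the "clusters" layout: a mapping from
# cluster ids to the mentions each cluster contains. It reuses the same
# mention identifiers as the gold document.
sys_clusters = {
    "type": "clusters",
    "clusters": {"c1": ["m1", "m2"], "c2": ["m3", "m4"]},
}


def write_inputs(directory: Path) -> None:
    """Write the two example documents as gold.json and sys.json."""
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "gold.json").write_text(
        json.dumps(gold_graph, indent=2), encoding="utf-8"
    )
    (directory / "sys.json").write_text(
        json.dumps(sys_clusters, indent=2), encoding="utf-8"
    )
```

After calling `write_inputs(Path("."))`, the two files can be passed to the command shown in Usage: `scorch gold.json sys.json out.txt`.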

For convenience, the conll.py script converts CoNLL-2012 files to this format.

Multiple documents

If the inputs are directories, every file in the sys directory must have exactly one counterpart in the gold directory with the same base name (excluding extension). In that case, the output scores will be the micro-averages of the individual files' scores, i.e. their arithmetic means weighted by the relative numbers of

  • Gold mentions for Recall
  • System mentions for Precision
  • The sum of the previous two for F₁
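
The weighting above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not scorch's own code: the `DocScore` record, the function names, and the choice to return 0.0 when all weights are zero are assumptions made for the example.

```python
from typing import NamedTuple, Sequence, Tuple


class DocScore(NamedTuple):
    """Scores and mention counts for one document (hypothetical record)."""
    recall: float
    precision: float
    f1: float
    n_gold: int  # number of gold mentions
    n_sys: int   # number of system mentions


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Arithmetic mean of `values` weighted by `weights`."""
    total = sum(weights)
    if total == 0:
        return 0.0  # assumption: an empty corpus scores 0
    return sum(v * w for v, w in zip(values, weights)) / total


def micro_average(docs: Sequence[DocScore]) -> Tuple[float, float, float]:
    """Return (R, P, F1) averaged over documents with the weights listed above."""
    recall = weighted_mean([d.recall for d in docs], [d.n_gold for d in docs])
    precision = weighted_mean([d.precision for d in docs], [d.n_sys for d in docs])
    f1 = weighted_mean([d.f1 for d in docs], [d.n_gold + d.n_sys for d in docs])
    return recall, precision, f1
```

Note that with this scheme the averaged F₁ is a weighted mean of the per-document F₁ values, not the harmonic mean of the averaged Recall and Precision.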

This is different from the reference interpretation where

  • MUC weighting ignores mentions in singleton entities
    • This should not make any difference for the CoNLL-2012 dataset, since singleton entities are not annotated.
    • For datasets with singletons, the shortcomings of MUC are well known, so this score shouldn't matter much
  • BLANC is calculated by micro-averaging coreference and non-coreference separately, using the number of links as weights instead of the number of mentions.
    • This is roughly equivalent to weighting coreference scores per document by their number of non-singleton clusters and non-coreference scores by the square of their number of mentions. This gives disproportionate importance to large documents, which is not desirable in heterogeneous corpora

The CoNLL average score is the arithmetic mean of the global MUC, B³ and CEAFₑ F₁ scores.
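
Spelled out as code, the CoNLL average is a plain mean of three numbers. The function name `conll_average` is chosen for this sketch and is not taken from scorch.

```python
def conll_average(muc_f1: float, b_cubed_f1: float, ceaf_e_f1: float) -> float:
    """Arithmetic mean of the global MUC, B³ and CEAFₑ F₁ scores."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3
```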

Sources

License

Unless otherwise specified (see below), the following licence (the so-called “MIT License”) applies to all the files in this repository. See also LICENSE.md.

Copyright 2018 Loïc Grobol <[email protected]>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
associated documentation files (the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute,
sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT
OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

scorch's Issues

Fix tests on Windows

They are only failing because of EOL management in the multi-file smoke test, but instead of fixing just that we should definitely work on a more robust test.

Scorch fails on samples with no test clusters

I tried scorch and found that it fails on certain documents from the CoNLL-2012 test set, specifically:

  • bc_cnn_00_cnn_0008-008.json
  • nw_wsj_23_wsj_2390-000.json

It seems that these documents have no gold clusters, and scorch fails with a division-by-zero exception:
Traceback (most recent call last):
  File "/usr/local/bin/scorch", line 8, in <module>
    sys.exit(main_entry_point())
  File "/usr/local/lib/python3.8/dist-packages/scorch/main.py", line 240, in main_entry_point
    out_stream.writelines(process_dirs(gold_path, sys_path))
  File "/usr/local/lib/python3.8/dist-packages/scorch/main.py", line 200, in process_dirs
    r = {
  File "/usr/local/lib/python3.8/dist-packages/scorch/main.py", line 201, in <dictcomp>
    name: metric(gold_clusters, sys_clusters)
  File "/usr/local/lib/python3.8/dist-packages/scorch/scores.py", line 118, in b_cubed
    R = math.fsum(
ZeroDivisionError: float division by zero
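
To illustrate the failure mode, here is a minimal, self-contained sketch of a B³ recall that guards against an empty gold side. It is not the code from scorch/scores.py: the function name, the signature, and the convention of returning 0.0 when there are no gold mentions are assumptions made for this example, and other conventions (such as skipping the document) are possible.

```python
import math
from typing import Sequence, Set


def b_cubed_recall(gold: Sequence[Set[str]], system: Sequence[Set[str]]) -> float:
    """B³ recall over clusterings given as sequences of sets of mention ids."""
    n_gold = sum(len(k) for k in gold)
    if n_gold == 0:
        # Assumption: with no gold mentions there is nothing to recall,
        # so return 0.0 instead of dividing by zero.
        return 0.0
    return (
        math.fsum(
            len(k & r) ** 2 / len(k)
            for k in gold
            if k  # skip empty clusters, which would also divide by zero
            for r in system
        )
        / n_gold
    )
```

With a guard like this, a document without gold clusters yields a score instead of raising `ZeroDivisionError`.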
