Comments (6)

plison commented on June 12, 2024

Unfortunately, I can't reproduce the error. Which versions of spacy and skweak are you using? Could the train.spacy data have been generated by an older version that is no longer compatible?

from skweak.

AlineBornschein commented on June 12, 2024

Thanks for the quick reply! I'm using the latest versions.
skweak==0.3.1
spacy==3.2.3

plison commented on June 12, 2024

Could you let me know if the example on https://github.com/NorskRegnesentral/skweak/blob/main/examples/quick_start.ipynb works for you (including in particular the last part that runs the spacy training script)?

AlineBornschein commented on June 12, 2024

Hi, the example script works for me, apart from the last line of code, where the spacy model is trained. That part never finishes.

As for my classification model, I assume there are conflicts between my labelling functions, since some of them may overlap and label the same spans. That might be why writing to the spacy doc results in the error: "ValueError: [E1010] Unable to set entity information for token 10 which is included in more than one span in entities, blocked, missing or outside." I resolved this by filtering out the conflicting spans. The "unhashable list" error still persists, even when run on a different machine.

plison commented on June 12, 2024

I don't really know what might cause this problem, unfortunately. Could you send me a minimal example I could test?

nleguillarme commented on June 12, 2024

Hi, I don't know whether it's related, but I get the same error: TypeError: unhashable type: 'list'

Here is how I obtain the error:

  • annotate a bunch of spacy docs using a combined annotator
  • add annotated docs to a DocBin object
  • save the DocBin object to disk using DocBin.to_disk

Then, in another script:

  • load the DocBin using DocBin.from_disk
  • get the docs back using docs = list(db.get_docs(nlp.vocab)) -> Error

It seems that the error is caused by voting.MajorityVoter, since I do not have the error when removing the majority voter from my pipeline.
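
For reference, the save/load round trip described above can be sketched roughly as follows. This is a minimal sketch, not the actual failing script: it uses a blank English pipeline, a hand-made span stored under a hypothetical key ("my_annotator"), and a temporary file, and it leaves out the skweak annotators and the majority voter entirely. Since the error only appears with the majority voter in the pipeline, this plain round trip is not expected to reproduce it; it only shows the steps into which the real annotated docs would be plugged.

```python
import tempfile
from pathlib import Path

import spacy
from spacy.tokens import DocBin, Span

# Blank pipeline: tokenizer only, no trained components needed.
nlp = spacy.blank("en")

# Stand-in for "annotate a bunch of spacy docs": one doc with one span.
# The key "my_annotator" and the example text are hypothetical.
doc = nlp("Acme Corp hired Jane Doe.")
doc.spans["my_annotator"] = [Span(doc, 0, 2, label="ORG")]

# First script: add the annotated docs to a DocBin and save it to disk.
db = DocBin(store_user_data=True)
db.add(doc)

with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = Path(tmp_dir) / "train.spacy"
    db.to_disk(db_path)

    # Second script: load the DocBin back from disk.
    loaded_db = DocBin().from_disk(db_path)

# This is the call that raises the TypeError in the report above.
docs = list(loaded_db.get_docs(nlp.vocab))

for span in docs[0].spans["my_annotator"]:
    print(span.text, span.label_)
```

Replacing the hand-made span with the output of the real annotators, with and without the majority voter, would be one way to turn this into the minimal example asked for earlier in the thread.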

Here is the full traceback:

Traceback (most recent call last):
  File "fit_model.py", line 48, in <module>
    docs = get_docs(db_path)
  File "fit_model.py", line 29, in get_docs
    docs = list(db.get_docs(nlp.vocab))
  File "/home/leguilln/workspace/nlp/corpus_annotation/skweak-corpus-annot/src/skweak-env/lib/python3.8/site-packages/spacy/tokens/_serialize.py", line 152, in get_docs
    doc.spans.from_bytes(self.span_groups[i])
  File "/home/leguilln/workspace/nlp/corpus_annotation/skweak-corpus-annot/src/skweak-env/lib/python3.8/site-packages/spacy/tokens/_dict_proxies.py", line 96, in from_bytes
    group = SpanGroup(doc).from_bytes(value_bytes)
  File "spacy/tokens/span_group.pyx", line 223, in spacy.tokens.span_group.SpanGroup.from_bytes
  File "/home/leguilln/workspace/nlp/corpus_annotation/skweak-corpus-annot/src/skweak-env/lib/python3.8/site-packages/srsly/_msgpack_api.py", line 27, in msgpack_loads
    msg = msgpack.loads(data, raw=False, use_list=use_list)
  File "/home/leguilln/workspace/nlp/corpus_annotation/skweak-corpus-annot/src/skweak-env/lib/python3.8/site-packages/srsly/msgpack/__init__.py", line 79, in unpackb
    return _unpackb(packed, **kwargs)
  File "srsly/msgpack/_unpacker.pyx", line 191, in srsly.msgpack._unpacker.unpackb
TypeError: unhashable type: 'list'
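
As general background on the message itself: the traceback ends inside the msgpack unpacker, and msgpack has no tuple type, so a tuple is written as an array and, by default, read back as a Python list. A list cannot be used as a dictionary key, which is one common way to end up with this TypeError. The sketch below simulates that with plain Python only; the data is hypothetical, and whether this is actually what happens with the majority voter's output is an assumption that the thread does not confirm.

```python
# Hypothetical data: a mapping keyed by a (start, end) tuple.
original = {(0, 2): "PERSON"}

# Simulate a serialisation round trip in which tuples come back as lists
# (as happens with array-based formats that have no tuple type).
decoded_pairs = [(list(key), value) for key, value in original.items()]


def rebuild(pairs):
    """Rebuild a dict from (key, value) pairs, as a decoder would."""
    result = {}
    for key, value in pairs:
        result[key] = value  # fails if key is a list
    return result


try:
    rebuild(decoded_pairs)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Converting the keys back to tuples makes them hashable again.
restored = rebuild((tuple(key), value) for key, value in decoded_pairs)
print(restored == original)
```

If this is indeed the cause, the offending value would be something stored with a non-string key (for example a tuple) in the data that gets serialised along with the spans.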
