Comments (3)
Furthermore, in an application like this:
```python
import spacy
from spacy.language import Language

from ewiser.spacy.disambiguate import Disambiguator
import utils

nlp = spacy.load("en_core_web_sm", disable=['parser', 'ner'])


@Language.factory('wsd')
def wsd_engine(nlp, name):
    # Load the EWISER checkpoint as a spaCy pipeline component
    return Disambiguator('ewiser/ewiser.semcor+wngt.pt', lang="en")


nlp.add_pipe('wsd', last=True)

# example
doc = nlp("Have you ever wondered how you are able to remember things for a long time?")
for w in doc:
    print(w.text)
    # Only tokens that were disambiguated carry an offset
    if w._.offset:
        sensekey = utils.offsets2sensekeys(w._.offset, w.lemma_)
        print(sensekey)
```
Why isn't the model predicting WN offsets for all words? For "Have you ever wondered how you are able to remember things for a long time?", the model only gives output for:
ever%4:02:04::
wonder%2:32:01::
able%3:00:00::
remember%2:31:00::
thing%1:10:00::
long%3:00:02::
time%1:28:05::
from ewiser.
Hi,
- The offsets are the unique IDs assigned to synsets (groups of senses). EWISER performs classification at the synset level, which is why we output WN offsets. As you seem to have managed to do, judging from your code snippet, offsets can be mapped to sensekeys quite easily given the lemma. I am not planning to add sensekeys to the spaCy plugin for the foreseeable future, but we welcome PRs :)
- WN only covers nouns, verbs, adjectives, and adverbs, with some exceptions, e.g. interrogative adverbs. EWISER can't predict senses for anything else. In your example sentence, the only words that belong to one of those parts of speech but are (correctly) not disambiguated are "have" and "are", as they function as auxiliary verb and copula, respectively.
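To make the two points above concrete, here is a minimal sketch. It is not EWISER's code. It assumes that offsets are strings shaped like `wn:00001740n` (prefix, zero-padded number, POS letter), that tokens carry Universal POS tags as in spaCy's `token.pos_`, and that NLTK with its WordNet corpus is available for the lookup. The names `is_wsd_candidate`, `parse_offset`, and `offset_to_sensekeys` are hypothetical, the interrogative-adverb list is illustrative rather than exhaustive, and the POS tags in `TOKENS` are assigned by hand.

```python
# Sketch only: the offset format and the POS table are assumptions, not EWISER code.

# Universal POS tags (as in spaCy's token.pos_) mapped to WordNet POS letters.
WN_POS = {"NOUN": "n", "VERB": "v", "ADJ": "a", "ADV": "r"}

# One example of the exceptions mentioned above; illustrative, not exhaustive.
INTERROGATIVE_ADVERBS = {"how", "when", "where", "why"}


def is_wsd_candidate(lemma, pos):
    """Return True if a token with this lemma and coarse POS could get a WordNet sense."""
    return pos in WN_POS and lemma.lower() not in INTERROGATIVE_ADVERBS


def parse_offset(offset):
    """Split an offset such as 'wn:00001740n' into (pos_letter, integer_offset)."""
    body = offset.split(":", 1)[-1]  # drop the assumed 'wn:' prefix if present
    return body[-1], int(body[:-1])


def offset_to_sensekeys(offset, lemma):
    """Return the sense keys of `lemma` within the synset identified by `offset`.

    Requires NLTK and its WordNet corpus (nltk.download('wordnet')).
    """
    from nltk.corpus import wordnet as wn  # lazy import: optional dependency

    pos, number = parse_offset(offset)
    synset = wn.synset_from_pos_and_offset(pos, number)
    wanted = lemma.lower().replace(" ", "_")
    return [l.key() for l in synset.lemmas() if l.name().lower() == wanted]


# Hand-assigned (text, lemma, POS) triples for the sentence in the question.
TOKENS = [
    ("Have", "have", "AUX"), ("you", "you", "PRON"), ("ever", "ever", "ADV"),
    ("wondered", "wonder", "VERB"), ("how", "how", "ADV"), ("you", "you", "PRON"),
    ("are", "be", "AUX"), ("able", "able", "ADJ"), ("to", "to", "PART"),
    ("remember", "remember", "VERB"), ("things", "thing", "NOUN"),
    ("for", "for", "ADP"), ("a", "a", "DET"), ("long", "long", "ADJ"),
    ("time", "time", "NOUN"), ("?", "?", "PUNCT"),
]

candidates = [text for text, lemma, pos in TOKENS if is_wsd_candidate(lemma, pos)]
print(candidates)
# ['ever', 'wondered', 'able', 'remember', 'things', 'long', 'time']
```

Under these assumptions, the seven candidates are the same seven words that received a sense in the question. "Have" and "are" drop out because they are tagged `AUX` rather than `VERB`, and "how" drops out as an interrogative adverb.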
Thanks for the comment. I'll make the pull request with appropriate modifications.