
Comments (4)

jklaise commented on May 19, 2024

Hi, thanks for your interest in the library, I look forward to reading the blog post!

  1. I've seen the same issue and will investigate. The warning suggests that we are looking up synonyms for words with empty vectors; this could be an alibi bug, or it could be intended behaviour according to spaCy.

  2. This is interesting. First, I want to note that this is an example of an empty anchor, not "no anchor": no anchor would be a partial anchor that did not satisfy the precision threshold no matter how many features were added. I agree that it looks like the anchor algorithm is picking up on the model's decision to classify as "neither" if apples/oranges are absent, but it's not entirely clear. Can I ask a bit more about what the sentences in the "neither" category look like?
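To make the distinction in point 2 concrete, here is a minimal sketch. It is not alibi's implementation: it swaps in a plain greedy search for the real anchor search, and `greedy_anchor`, `AnchorResult`, `describe`, the toy precision functions, and the 0.95 threshold are all hypothetical names and values chosen for illustration.

```python
from typing import Callable, FrozenSet, List, NamedTuple


class AnchorResult(NamedTuple):
    anchor: FrozenSet[str]   # features (words) fixed by the anchor
    precision: float         # estimated precision of that anchor
    meets_threshold: bool    # whether the precision threshold was reached


def greedy_anchor(features: List[str],
                  precision: Callable[[FrozenSet[str]], float],
                  threshold: float = 0.95) -> AnchorResult:
    """Simplified greedy search (an assumption, not alibi's algorithm)."""
    anchor: FrozenSet[str] = frozenset()
    remaining = list(features)
    while True:
        current = precision(anchor)
        if current >= threshold:
            # Threshold met; the anchor may still be empty at this point.
            return AnchorResult(anchor, current, True)
        if not remaining:
            # Every feature was added and the threshold was never met.
            return AnchorResult(anchor, current, False)
        # Add the single feature that yields the highest precision.
        best = max(remaining, key=lambda f: precision(anchor | {f}))
        anchor = anchor | {best}
        remaining.remove(best)


def describe(result: AnchorResult) -> str:
    if result.meets_threshold and not result.anchor:
        return "empty anchor"
    if result.meets_threshold:
        return "anchor"
    return "no anchor"


# Toy precision functions standing in for sampling the model's predictions.
def always_neither(anchor: FrozenSet[str]) -> float:
    return 0.97  # the prediction holds regardless of which words are fixed


def needs_apple(anchor: FrozenSet[str]) -> float:
    return 0.99 if "apple" in anchor else 0.6


def never_enough(anchor: FrozenSet[str]) -> float:
    return 0.5 + 0.1 * len(anchor)  # stays below 0.95 for two features


features = ["apple", "juice"]
for fn in (always_neither, needs_apple, never_enough):
    print(fn.__name__, "->", describe(greedy_anchor(features, fn)))
```

In this sketch an empty anchor means the prediction already meets the threshold with nothing fixed, whereas "no anchor" means the search ran out of features without ever reaching it.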

from alibi.

jklaise commented on May 19, 2024

@mapmeld I believe the cause of 1. is using the small sm model: the spaCy docs say that the sm models don't ship with word vectors. I would suggest trying the lg model instead to see if the warnings go away (the quality of synonyms should also improve!).


jklaise commented on May 19, 2024

@mapmeld I was wrong: we use the md model by default, which does have word vectors. The issue arises because there are a lot of lexemes for which the word vector is identically zero. I think we should prune these from the corpus before finding the synonyms. Alternatively, we can bump up the default w_prob=-20 to something higher to exclude more words based on rarity.
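As a rough illustration of the two options above, the following sketch filters a toy vocabulary before ranking synonyms by cosine similarity. It does not use spaCy or alibi: `Lexeme` here is a hypothetical stand-in for a vocabulary entry, `prune` and `most_similar` are made-up helpers, and treating `w_prob` as a lower bound on a word's log probability is an assumption.

```python
import math
from typing import List, NamedTuple, Sequence


class Lexeme(NamedTuple):
    """Hypothetical vocabulary entry (not spaCy's Lexeme class)."""
    text: str
    vector: Sequence[float]
    log_prob: float  # rarer words have more negative values


def has_nonzero_vector(lex: Lexeme) -> bool:
    return any(v != 0.0 for v in lex.vector)


def prune(lexemes: List[Lexeme], w_prob: float = -20.0) -> List[Lexeme]:
    """Drop rare words and words whose vector is identically zero."""
    return [lex for lex in lexemes
            if lex.log_prob >= w_prob and has_nonzero_vector(lex)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Safe only for non-zero vectors, which is why pruning comes first.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def most_similar(query: Lexeme, lexemes: List[Lexeme],
                 top_n: int = 2, w_prob: float = -20.0) -> List[str]:
    candidates = [lex for lex in prune(lexemes, w_prob)
                  if lex.text != query.text]
    ranked = sorted(candidates,
                    key=lambda lex: cosine(query.vector, lex.vector),
                    reverse=True)
    return [lex.text for lex in ranked[:top_n]]


# Toy data; the vectors and log probabilities are invented.
vocab = [
    Lexeme("apple", (1.0, 0.0), -10.0),
    Lexeme("pear", (0.9, 0.1), -11.0),
    Lexeme("orange", (0.8, 0.3), -10.5),
    Lexeme("car", (0.0, 1.0), -9.0),
    Lexeme("xqzt", (0.0, 0.0), -19.0),       # zero vector: pruned
    Lexeme("pomelo", (0.95, 0.05), -21.0),   # below w_prob=-20: pruned
]

print(most_similar(vocab[0], vocab))
```

Raising `w_prob` (for example from -20 to -15) shrinks the candidate set further, while pruning zero vectors removes entries for which a similarity score is undefined, whatever their frequency.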

Edit: I've submitted a PR to fix this: #110

Edit2: This is now merged and fixed in v0.2.2


jklaise commented on May 19, 2024

@mapmeld I will close this issue now, as the warnings have been fixed and it's hard to debug the Anchor output without knowing the details of your model. Feel free to open a new issue if you have more details.

