log-odds-ratio's People

Contributors

kornosk

log-odds-ratio's Issues

Usage of `log_odds_ratio.py`

I came across this repository after reading your NAACL '21 paper. The method looks like a simple way to capture the important words in a text dataset, but I have questions about two aspects: theory and usage.

Theory

  • Does the log-odds-ratio method constrain the number of classes in the dataset? From what I can see, it would be difficult to apply this approach to a dataset with more than 3 classes (in other words, the method seems applicable only to 2- or 3-class datasets).
  • In your NAACL '21 paper, you wrote the following, but I could not see what the score is being compared against when you say "higher" and "lower". Is there a pre-specified threshold?

A higher score indicates more significance .... A lower score means ....

  • In your NAACL '21 paper, you also wrote the following. What does this "sensitivity analysis" consist of? Did you try increasing k to investigate its influence on classification performance?

Based on a sensitivity analysis, we set k=10 to extract the top-10 significant words...
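
To make the questions above concrete, here is a minimal sketch of how I understand the score. It assumes the method is the z-scored log-odds ratio with a Dirichlet prior taken from the background corpus; that is my reading, not something confirmed by `log_odds_ratio.py`. The names `log_odds_z_scores`, `top_k`, and `one_vs_rest` are hypothetical and do not come from the repository. Under this reading, the sign of the score says which corpus a word leans towards, and a one-vs-rest loop is one possible way to handle more than 3 classes.

```python
import math
from collections import Counter


def log_odds_z_scores(counts_i, counts_j, counts_bg):
    """Z-scored log-odds ratio of each background word between corpus i and j.

    Assumption: background counts act as a Dirichlet prior. Words that are
    absent from the background corpus are skipped.
    """
    n_i = sum(counts_i.values())
    n_j = sum(counts_j.values())
    a_0 = sum(counts_bg.values())
    scores = {}
    for word, a_w in counts_bg.items():
        y_i = counts_i.get(word, 0)
        y_j = counts_j.get(word, 0)
        # Difference of the prior-smoothed log-odds of the word in i and in j.
        delta = (math.log((y_i + a_w) / (n_i + a_0 - y_i - a_w))
                 - math.log((y_j + a_w) / (n_j + a_0 - y_j - a_w)))
        # Approximate variance of that difference.
        variance = 1.0 / (y_i + a_w) + 1.0 / (y_j + a_w)
        scores[word] = delta / math.sqrt(variance)
    return scores


def top_k(scores, k=10):
    """Return the k (word, score) pairs with the highest scores."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


def one_vs_rest(class_counts, k=10):
    """Hypothetical multi-class workaround: compare each class to all others.

    class_counts maps a label to a Counter of word counts for that class.
    """
    result = {}
    for label, counts_i in class_counts.items():
        counts_j = Counter()
        for other, counts in class_counts.items():
            if other != label:
                counts_j.update(counts)
        # Assumption: the background is simply the union of all classes.
        background = counts_i + counts_j
        scores = log_odds_z_scores(counts_i, counts_j, background)
        result[label] = top_k(scores, k)
    return result
```

In this sketch a positive score means the word is more associated with corpus i and a negative score means corpus j, so "higher" and "lower" would be relative to the other words rather than to a fixed threshold. Whether that matches the paper is exactly what I would like to confirm.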

Usage

  • The format of corpus i, corpus j, and the background corpus is not described in the README.md. Are they simply sentences separated by \n, or is a special format required?
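
For illustration, this is the input handling I would guess at if the answer is "one sentence per line". It is only a sketch under that assumption: the lowercasing and whitespace tokenization are my own choices, and `count_words` is a hypothetical helper, not a function from the repository.

```python
import io
from collections import Counter


def count_words(lines):
    """Count lowercased, whitespace-separated tokens over an iterable of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


# An in-memory stand-in for a corpus file with one sentence per line.
corpus_i = io.StringIO("I like this policy\nGreat policy\n")
print(count_words(corpus_i).most_common(1))
```

If the script expects something else (for example pre-tokenized text or a CSV with a label column), a short example file in the README.md would help.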
