Comments (3)

carschno commented on July 17, 2024

I was able to reproduce this and have isolated the issue to the trailing newline character. It appears to be rooted in FastText itself; however, the problem probably goes unnoticed there because the command-line tool operates on line-by-line input, whereas the Java API accepts arbitrary (multi-line) strings.

$ echo "Weak wifi otherwise all ok" | fasttext predict-prob model.bin - 5
__label__60 0.678738 __label__80 0.315212 __label__40 0.0055875 __label__100 0.000415088  __label__20 6.88466e-05

Now without trailing newline:

$ echo -n "Weak wifi otherwise all ok" | fasttext predict-prob model.bin - 5
__label__60 0.807072 __label__80 0.126261 __label__40 0.049052 __label__4 0.00411388 __label__5 0.00340998

Running on the command line, using the Java package (built with mvn clean package):

$ echo "Weak wifi otherwise all ok" | java -jar JFastText/target/jfasttext-0.4-jar-with-dependencies.jar  predict-prob model.bin - 5
__label__60 0.678737 __label__80 0.315212 __label__40 0.0055875 __label__100 0.000415088 __label__20 6.88467e-05

Again, without trailing newline:

$ echo -n "Weak wifi otherwise all ok" | java -jar JFastText/target/jfasttext-0.4-jar-with-dependencies.jar  predict-prob model.bin - 5
__label__60 0.807072 __label__80 0.126261 __label__40 0.049052 __label__4 0.00411388 __label__5 0.00340998

In the Java API, this is also reproducible. With trailing newline:

	JFastText jft = new JFastText();
	jft.loadModel("model.bin");
	List<ProbLabel> predictions = jft.predictProba("Weak wifi otherwise all ok\n", 5);

Without trailing newline:

	JFastText jft = new JFastText();
	jft.loadModel("model.bin");
	List<ProbLabel> predictions = jft.predictProba("Weak wifi otherwise all ok", 5);

The results are the same as above with echo and echo -n respectively.
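
Since the only difference between the two calls is the trailing newline, one option is to normalize the input before it reaches the model. Below is a minimal sketch using only the JDK. The class and method names (NewlineNormalizer, normalize) are hypothetical and not part of JFastText. Collapsing embedded newlines into spaces is an assumption about what a caller would want for multi-line strings, not documented library behavior.

```java
public class NewlineNormalizer {

    // Hypothetical helper (not part of JFastText): returns the text with
    // surrounding whitespace removed, internal whitespace runs collapsed to a
    // single space, and exactly one trailing newline appended.
    static String normalize(String text) {
        if (text == null) {
            return "\n";
        }
        // Assumption: embedded newlines should be treated like ordinary
        // whitespace so the whole string is predicted as one line.
        String collapsed = text.trim().replaceAll("\\s+", " ");
        return collapsed + "\n";
    }

    public static void main(String[] args) {
        // Both variants from the comment above end up identical after normalization.
        String withNewline = normalize("Weak wifi otherwise all ok\n");
        String withoutNewline = normalize("Weak wifi otherwise all ok");
        System.out.println(withNewline.equals(withoutNewline));
    }
}
```

The normalized string would then be passed to predictProba in place of the raw input.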


carschno commented on July 17, 2024

This is actually a known issue in FastText, see:
facebookresearch/fastText#435 and facebookresearch/fastText#165


kun368 commented on July 17, 2024

Based on what @carschno mentioned, I used this to get the right results:

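// Assumes StringUtils from Apache Commons Lang and CollectionUtils from Apache Commons Collections.
public Map<String, Double> predictTopLabel(String text, int k) {
    Map<String, Double> scoreMap = new LinkedHashMap<>();
    // Make sure the input ends with exactly one trailing newline.
    text = StringUtils.trimToEmpty(text) + "\n";
    final List<JFastText.ProbLabel> pl = model.predictProba(text, k);
    for (JFastText.ProbLabel i : CollectionUtils.emptyIfNull(pl)) {
        // The label carries a log-probability; convert it back to a probability.
        final double prob = Math.exp(i.logProb);
        // Round to 8 decimal places; the floating-point divisor avoids integer division.
        final double score = Math.round(prob * 1e8) / 1e8;
        scoreMap.put(i.label, score);
    }
    return scoreMap;
}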
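
One detail worth checking when rounding probabilities this way: Math.round(double) returns a long, so dividing its result by an integer constant performs integer division and yields 0 or 1 for any probability. The sketch below contrasts the two forms. The class and method names (ProbRounding, toRoundedProb, truncatingRound) are hypothetical and only serve as illustration.

```java
public class ProbRounding {

    // Converts a log-probability to a probability rounded to 8 decimal places.
    // The floating-point divisor keeps the fractional part.
    static double toRoundedProb(double logProb) {
        double prob = Math.exp(logProb);
        return Math.round(prob * 1e8) / 1e8;
    }

    // For comparison only: long / int is integer division, so the fraction is lost
    // before the result is widened to double.
    static double truncatingRound(double prob) {
        return Math.round(prob * 100000000) / 100000000;
    }

    public static void main(String[] args) {
        System.out.println(toRoundedProb(Math.log(0.5)));
        System.out.println(truncatingRound(0.678738));
    }
}
```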

