jfasttext's People

Contributors

vinhkhuc

jfasttext's Issues

Why does JFastText only accept models trained with JFastText?

Hello,

How can I read a pretrained model? I tried to load the pre-existing .vec and .bin files, but loadModel raises an exception. It looks like the format is incompatible and JFastText only accepts models trained with JFastText.

The results are different.

Both tests use the same model, weibo.bin.

Test one (fastText command line):

admindeiMac:fastText admin$ ./fasttext supervised -input weibo1.txt -output weibo  -lr 1.0 -epoch 35 -wordNgrams 2 -bucket 200000 -dim 50 -loss hs
admindeiMac:fastText admin$ ./fasttext test weibo.bin weibo2.txt 
N	101
P@1	0.891
R@1	0.452
admindeiMac:fastText admin$ ./fasttext  predict-prob  weibo.bin - 10
砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜
__label__新闻内容 0.699219 __label__知识性内容 0.10588 __label__娱乐新闻 0.0444504 __label__新闻评价 0.0419737 __label__行业分析 0.0186155 __label__营销内容 0.0144202 __label__行业文章 0.0118233 __label__用户主动提及 0.00708137 __label__用户评价 0.00687072 __label__新闻调查 0.00646674

Test two (JFastText):

    @Test
    public void test0() throws IOException, ParseException {
        JFastText jft = new JFastText();
        jft.loadModel("/Users/admin/pqy/github/fastText/weibo.bin");
        String text = "砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜";
        List<ProbLabel> probLabels = jft.predictProba(text, 10);
        System.out.println(probLabels);
        for (ProbLabel pro : probLabels) {
            System.out.println("处理之后的句子:" + text + " ,标签:" + pro.label + ",打分:" + Math.exp(pro.logProb));
        }

        System.out.printf("Text: '%s'\n", text);
        for (JFastText.ProbLabel predictedProbLabel : jft.predictProba(text, 2)) {
            System.out.printf("\tlabel: '%s', probability: %f\n",
                    predictedProbLabel.label, Math.exp(predictedProbLabel.logProb));
        }
    }

Result:

[logProb = -0.774493, label = __label__新闻内容, logProb = -1.801197, label = __label__营销内容, logProb = -2.293256, label = __label__品牌推广, logProb = -2.761014, label = __label__活动推广, logProb = -3.310276, label = __label__其他, logProb = -3.408686, label = __label__话题推广, logProb = -3.663944, label = __label__行业文章, logProb = -4.000737, label = __label__产品推广, logProb = -4.260805, label = __label__知识性内容, logProb = -4.331049, label = __label__求助上访]
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__新闻内容,打分:0.4609375365905082
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__营销内容,打分:0.16510113524043515
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__品牌推广,打分:0.1009372940469236
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__活动推广,打分:0.06322765415162428
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__其他,打分:0.03650609553075632
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__话题推广,打分:0.0330846476100821
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__行业文章,打分:0.025631217305801535
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__产品推广,打分:0.01830215048111813
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__知识性内容,打分:0.014110936696807902
处理之后的句子:砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜 ,标签:__label__求助上访,打分:0.013153742429429115
Text: '砒霜 堪比 过年 陌生 家中 负责 不会 买菜 食物 一定 一定不会 不会陌生 不能食用 买菜务必 以下常见 堪比砒霜 家中负责 常见食物 快报砒霜 砒霜蔬菜'
	label: '__label__新闻内容', probability: 0.460938
	label: '__label__营销内容', probability: 0.165101
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.298 sec - in com.stq.FastTextTest
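
To compare the two outputs above, it can help to bring them into the same form: the command line prints probabilities, while JFastText's ProbLabel carries a log-probability that the test converts with Math.exp. The following is a minimal sketch, assuming the command-line output alternates label and probability tokens separated by whitespace, as in the line shown above. The class and method names are hypothetical and are not part of JFastText or fastText.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for comparing outputs; not part of JFastText or fastText.
final class PredictProbParser {

    // Parses one line of `fasttext predict-prob` output, assumed to have the
    // form "label prob label prob ...", into an insertion-ordered map.
    static Map<String, Double> parseCliLine(String line) {
        Map<String, Double> result = new LinkedHashMap<>();
        String[] tokens = line.trim().split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            result.put(tokens[i], Double.parseDouble(tokens[i + 1]));
        }
        return result;
    }

    // Converts a natural-log probability (as in ProbLabel.logProb) to a probability.
    static double toProbability(double logProb) {
        return Math.exp(logProb);
    }
}
```

With both results in the same form, a difference such as 0.699219 versus 0.4609375 for the same label is easy to report per label rather than by eye.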

JFastText throws an exception while loading a model

When I try to load a fastText model, the following exception is thrown:

java.lang.RuntimeException: vector
at com.github.jfasttext.FastTextWrapper$FastTextApi.loadModel(Native Method)
at com.github.jfasttext.JFastText.loadModel(JFastText.java:25)

The model I am trying to load was generated with the latest fastText as of today.

Load model file from resource

With fastText's quantization feature, model files can be small enough to put in the resources folder of a jar. Is it possible to load these model files directly from the jar's resources folder?
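
In the other issues on this page loadModel is called with a file path, so one possible workaround is to copy the resource to a temporary file first and pass that file's path. This is a sketch under that assumption, not a confirmed feature of the library; the class name, method name, and resource path in the comment are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; not part of JFastText.
final class ResourceModelExtractor {

    // Copies a stream (for example one obtained from
    // SomeClass.class.getResourceAsStream("/models/model.ftz"), a placeholder
    // path) to a temporary file and returns the path of that file.
    static Path copyToTempFile(InputStream in, String suffix) throws IOException {
        if (in == null) {
            throw new IOException("resource not found on classpath");
        }
        Path tmp = Files.createTempFile("fasttext-model-", suffix);
        tmp.toFile().deleteOnExit(); // best-effort cleanup when the JVM exits
        try (InputStream src = in) {
            Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        return tmp;
    }
}
```

The returned path could then be passed on as jft.loadModel(path.toString()). The cost is one extra copy of the model on disk per JVM start.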

Error in checkModel API when loading pre-trained model from fastText

Hi,

I got the error below when loading a pre-trained model. Could you look into it at your convenience?

Many thanks,

///////////////////////////////////////////////////////////////////////////////////////////////////

Exception in thread "main" java.lang.UnsatisfiedLinkError: com.github.jfasttext.FastTextWrapper$FastTextApi.checkModel(Ljava/lang/String;)Z
at com.github.jfasttext.FastTextWrapper$FastTextApi.checkModel(Native Method)
at com.github.jfasttext.JFastText.loadModel(JFastText.java:29)
at com.github.jfasttext.JFastText.main(JFastText.java:203)

ngram embedding access

Is there any way to get the n-grams for a given word? Suppose the word is music and it belongs to the vocabulary (so we have the embedding of music). How can I get the embeddings of its n-grams, e.g. the embeddings of mus, usi, sic?
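
As background for the question, fastText wraps each word in the boundary markers < and > before extracting character n-grams, so for music the 3-grams also include <mu and ic>. The sketch below only enumerates the n-gram strings; it does not look up their vectors, and nothing on this page confirms that JFastText exposes such a lookup. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of JFastText.
final class CharNgrams {

    // Returns the character n-grams of "<word>" with lengths minn..maxn,
    // working on Unicode code points so non-ASCII words are split correctly.
    static List<String> of(String word, int minn, int maxn) {
        int[] cps = ("<" + word + ">").codePoints().toArray();
        List<String> ngrams = new ArrayList<>();
        for (int start = 0; start < cps.length; start++) {
            for (int n = minn; n <= maxn && start + n <= cps.length; n++) {
                ngrams.add(new String(cps, start, n));
            }
        }
        return ngrams;
    }
}
```

For example, CharNgrams.of("music", 3, 3) yields <mu, mus, usi, sic and ic>.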

JVM crash when calling .close()

Hi, when my Java code calls .close() before the bean is destroyed in the Spring container, the JVM crashes with the error below:

A fatal error has been detected by the Java Runtime Environment:

SIGSEGV (0xb) at pc=0x00007fe80add9d1c, pid=4700, tid=0x00007fe75f8ec700

JRE version: Java(TM) SE Runtime Environment (8.0_144-b01) (build 1.8.0_144-b01)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.144-b01 mixed mode linux-amd64 compressed oops)

Problematic frame:

C [libc.so.6+0x82d1c] cfree+0x1c

Core dump written. Default location: /home/peidong/Downloads/cuba/sts-bundle/sts-3.9.0.RELEASE/core or core.4700

An error report file with more information is saved as:

/home/peidong/Downloads/cuba/sts-bundle/sts-3.9.0.RELEASE/hs_err_pid4700.log

If you would like to submit a bug report, please visit:

http://bugreport.java.com/bugreport/crash.jsp

The crash happened outside the Java Virtual Machine in native code.

See problematic frame for where to report the bug.

hi exception: java.lang.IndexOutOfBoundsException

java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at com.github.jfasttext.JFastText.predictProba(JFastText.java:60)

update please?

I've done the work to update up to May 1st, and included it in #3

Would you mind updating?

On Linux: java.lang.UnsatisfiedLinkError: no jniFastTextWrapper in java.library.path (the environment already has clang-3.3, Python 2.6 or newer, numpy & scipy installed)

java.lang.UnsatisfiedLinkError: no jniFastTextWrapper in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867) ~[?:1.8.0_91]
at java.lang.Runtime.loadLibrary0(Runtime.java:870) ~[?:1.8.0_91]
at java.lang.System.loadLibrary(System.java:1122) ~[?:1.8.0_91]
at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:945) ~[stormjar.jar:?]
at org.bytedeco.javacpp.Loader.load(Loader.java:750) ~[stormjar.jar:?]
at org.bytedeco.javacpp.Loader.load(Loader.java:657) ~[stormjar.jar:?]
at com.github.jfasttext.FastTextWrapper.<clinit>(FastTextWrapper.java:10) ~[stormjar.jar:?]
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_91]
at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_91]
at org.bytedeco.javacpp.Loader.load(Loader.java:712) ~[stormjar.jar:?]
at org.bytedeco.javacpp.Loader.load(Loader.java:657) ~[stormjar.jar:?]
at com.github.jfasttext.FastTextWrapper$FastTextApi.<clinit>(FastTextWrapper.java:171) ~[stormjar.jar:?]
at com.github.jfasttext.JFastText.<init>(JFastText.java:13) ~[stormjar.jar:?]
at com.taidi.poas.process.topology.cluster.bolt.HotClusterBolt.prepare(HotClusterBolt.java:83) ~[stormjar.jar:?]
at org.apache.storm.daemon.executor$fn__4973$fn__4986.invoke(executor.clj:791) ~[storm-core-1.0.3.jar:1.0.3]
at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:482) [storm-core-1.0.3.jar:1.0.3]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
Caused by: java.lang.UnsatisfiedLinkError: /root/.javacpp/cache/stormjar.jar/com/github/jfasttext/linux-x86_64/libjniFastTextWrapper.so: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /root/.javacpp/cache/stormjar.jar/com/github/jfasttext/linux-x86_64/libjniFastTextWrapper.so)
at java.lang.ClassLoader$NativeLibrary.load(Native Method) ~[?:1.8.0_91]
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941) ~[?:1.8.0_91]
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824) ~[?:1.8.0_91]
at java.lang.Runtime.load0(Runtime.java:809) ~[?:1.8.0_91]
at java.lang.System.load(System.java:1086) ~[?:1.8.0_91]
at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:925) ~[stormjar.jar:?]
... 14 more

Predictions differ from those of the official C++ implementation

I tested this repo and the official implementation on the same set of samples, using the same model (trained with the official code, in .ftz format).

c++

fasttext predict-prob test-example.txt

java(this) - api call

("equal" means the predicted label is the same; the probability is ignored)
all samples: 21513
equal: 19236
not-equal: 2219
null(in this repo): 58

java-cmd(this)

java -jar jfasttext-0.4-jar-with-dependencies.jar predict-prob test-example.txt
all samples: 21513
equal: 18825
not-equal: 2688

So, what's wrong?

Another thing: the predictions from java-cmd are unstable and change on every run.

JFastText: jniFastTextWrapper in java.library.path

Hello Sir @vinhkhuc and everybody who can help,

I'm trying to install and run JFastText on Linux.

I did:

module load gcc/4.9.1, then module load Java/1.8.0_45, then load Maven/3.3.9

When I run the command:

1- mvn compile, I got:
[INFO] Scanning for projects...
[INFO] Inspecting build with total of 1 modules...
[INFO] Installing Nexus Staging features:
[INFO]   ... total of 1 executions of maven-deploy-plugin replaced with nexus-staging-maven-plugin
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Java interface for fastText 0.4-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce) @ jfasttext ---
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ jfasttext ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /gs/project/tws-462-aa/JFastText/src/main/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.0:compile (default-compile) @ jfasttext ---
[INFO] Nothing to compile - all classes are up to date
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6.560 s
[INFO] Finished at: 2017-11-17T11:56:58-05:00
[INFO] Final Memory: 30M/1930M

Then, when I execute the code using the command:

2- mvn exec:java, I got:
[INFO] Scanning for projects...
[INFO] Inspecting build with total of 1 modules...
[INFO] Installing Nexus Staging features:
[INFO]   ... total of 1 executions of maven-deploy-plugin replaced with nexus-staging-maven-plugin
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Java interface for fastText 0.4-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ jfasttext ---
[WARNING]
java.lang.UnsatisfiedLinkError: no jniFastTextWrapper in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1865)
        at java.lang.Runtime.loadLibrary0(Runtime.java:870)
        at java.lang.System.loadLibrary(System.java:1122)
        at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:945)
        at org.bytedeco.javacpp.Loader.load(Loader.java:750)
        at org.bytedeco.javacpp.Loader.load(Loader.java:657)
        at com.github.jfasttext.FastTextWrapper.<clinit>(FastTextWrapper.java:10)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.bytedeco.javacpp.Loader.load(Loader.java:712)
        at org.bytedeco.javacpp.Loader.load(Loader.java:657)
        at com.github.jfasttext.FastTextWrapper$FastTextApi.<clinit>(FastTextWrapper.java:171)
        at com.github.jfasttext.JFastText.<init>(JFastText.java:14)
        at com.github.jfasttext.JFastText.main(JFastText.java:201)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
        at java.lang.Thread.run(Thread.java:745)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.531 s
[INFO] Finished at: 2017-11-17T11:58:07-05:00
[INFO] Final Memory: 26M/1930M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project jfasttext: An exception occured while executing the Java class. no jniFastTextWrapper in java.library.path -> [Help 1]

My question is: where is my error? What should I do to make the code run?

Note that I added the following to the pom.xml file:

     <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>exec-maven-plugin</artifactId>
      <version>1.6.0</version>
      <configuration>
        <mainClass>com.github.jfasttext.JFastText</mainClass>
        <!-- <mainClass>de.dwslab.petar.walks.WalkGeneratorRand</mainClass>-->
      </configuration>
    </plugin>

Just to be able to run the command: mvn exec:java

Statically linked fasttext

Hi,

I'm trying to deploy to a system that doesn't support C++11 and I'd like to statically compile the C libs in jfasttext. I can get fasttext to compile fine with the -static flag for gcc, but when I add that flag to the maven file it gets overridden by some default options somewhere. Can you help me get this to compile a statically linked fasttext with JNI bindings?

Eric

Cannot rebuild jfasttext wrapper with fastText version 0.1.0

Because the structure of many classes in the fastText library (such as the dictionary) has changed, I cannot rebuild the wrapper against version 0.1.0. Could anyone help me update the fastText wrapper?
Thanks for your help!

Here is the log from rebuilding the jfasttext wrapper with fastText version 0.1.0:

In file included from /mnt/cuongpx/workspace/toolkits/jfasttext-custom/JFastText-master/src/main/java/../cpp/fasttext_wrapper_javacpp.h:13:0,
from /mnt/cuongpx/workspace/toolkits/jfasttext-custom/JFastText-master/target/classes/com/github/jfasttext/jniFastTextWrapper.cpp:102:
/mnt/cuongpx/workspace/toolkits/jfasttext-custom/JFastText-master/src/main/java/../cpp/fasttext_wrapper.cc:91:61: error: ‘class fasttext::Vector’ has no member named ‘m_’
return std::vector<real>(vec.data_, vec.data_ + vec.m_);
^

How to build for multiple platforms?

When I execute mvn package it creates a jar with only the library related to the building platform.

Is there a way to build fasttext for all platforms (macos, linux, windows) and have the libraries available inside the jar?

SIGSEGV on getWords after training

Using the example training data (and preprocessing it using the classification-example.sh script that comes with fasttext), I get a SIGSEGV when calling getWords after training.

Training: ft.runCmd("supervised -input dbpedia.train -output model.bin -dim 100 -lr 0.05 -wordNgrams 2 -minCount 5 -bucket 2000000 -epoch 5".split(" "))

model.bin is successfully generated; and if I load it instead of training, there is no crash. I suspect it's running out of memory; but calling unloadModel before getWords does not help. I tried discarding the trained JFastText object and then running loadModel, but it seems model.bin is generated asynchronously so there is no good way to know when to call loadModel.

Crash log: hs_err_pid28676.txt

EDIT: version 0.3 on Mac OSX

Method to unload instance

It seems there is no method to unload a model to free memory.
Can you please add one or give instruction on how to implement it?

Kind regards,
Michael

JVM GC JFastText

After loading a model:
JFastText jft = new JFastText();
jft.loadModel(modelFIle);
jft=null;

When the JVM garbage collector collects jft, the process crashes with this error:
java(53882,0x70000cfe0000) malloc: *** error for object 0x1000000000000000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug

java.lang.RuntimeException

I get this exception when loading a model:
Exception in thread "main" java.lang.RuntimeException: vector<T> too long
at com.github.jfasttext.FastTextWrapper$FastTextApi.loadModel(Native Method)
at com.github.jfasttext.JFastText.loadModel(JFastText.java:25)

JVM crashes when loading a model on Ubuntu

When I call JFastText.loadModel, the JVM crashes. The same code runs on Mac with no problem but crashes on Ubuntu.
See logs:

A fatal error has been detected by the Java Runtime Environment:

SIGSEGV (0xb) at pc=0x00007f9cb10c9412, pid=9720, tid=0x00007f9cb1803700

JRE version: Java(TM) SE Runtime Environment (8.0_191-b12) (build 1.8.0_191-b12)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.191-b12 mixed mode linux-amd64 compressed oops)

Problematic frame:

C [libjniFastTextWrapper.so+0x14412] fasttext::Dictionary::find(std::string const&, unsigned int) const+0x12

Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

Model file has wrong file format!

I have built a classifier model using the original FastText binary, compiled from the Git branch from August 8th, 2018.
When I try to run this model with JFastText (v0.3), it says:

Model file has wrong file format!

I guess that JFastText needs to be updated to match the most recent FastText version.
In case you no longer maintain this, could you say which version of FastText is supported by JFastText?
Perhaps you could also point out how to tackle the update.

print-word-vectors command does not work

I was not able to test the print-word-vectors command. While the original is ./fasttext print-word-vectors model.bin < queries.txt, according to the JFastText documentation it should be something like:

JFastText jft = new JFastText();
jft.runCmd(new String[] {
        "print-word-vectors",
        "src/test/resources/models/cbow.model.bin",
        "<",
        "src/test/resources/data/queries.txt"
});

I checked the compiled version and the command receives 3 arguments. The same command works from a Linux terminal:
cat queries.txt | java -jar /PATH/JFastText/target/jfasttext-0.4-SNAPSHOT-jar-with-dependencies.jar print-word-vectors ../models/cbow.model.bin > queries.txt
Is there something wrong with the number of parameters in the JFastText implementation?
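
One thing worth noting here: in a shell, < is input redirection handled by the shell itself, so the program never sees it as an argument. Passing "<" inside the runCmd array hands the command a literal "<" string instead. If the goal is to reproduce the working terminal command from Java, one option is to launch the jar as a separate process and redirect its standard input and output. This is a sketch under that assumption; the class name is hypothetical and the jar and file paths are placeholders.

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical helper; the jar path and file names are placeholders.
final class PrintWordVectorsLauncher {

    // Builds (but does not start) a process equivalent to:
    //   java -jar <jar> print-word-vectors <model> < queries > output
    static ProcessBuilder build(String jarPath, String modelPath,
                                File queries, File output) {
        ProcessBuilder pb = new ProcessBuilder(Arrays.asList(
                "java", "-jar", jarPath, "print-word-vectors", modelPath));
        pb.redirectInput(queries);   // plays the role of "< queries.txt"
        pb.redirectOutput(output);   // plays the role of "> vectors.txt"
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        return pb;
    }
}
```

Calling build(...).start().waitFor() runs the command and returns its exit code. Using a different file for the output than for the queries avoids overwriting the input.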

Is this library thread-safe?

Hi,
ArrayList in the class JFastText is not thread-safe, and maybe it should be replaced with CopyOnWriteArrayList or newArrayList in guava.

Kind regards,
Vincent

Perform unit test on JFastText

I'm currently working on supervised labelling with JFastText, and I want to write some unit tests that check whether the labelling result matches the expected label with a probability above 80%.

I'm having an issue loading the model, with this stack trace:
java.lang.RuntimeException: src/main/resources/interest.model.bin cannot be opened for saving.
at com.github.jfasttext.FastTextWrapper$FastTextApi.runCmd(Native Method)
at com.github.jfasttext.JFastText.runCmd(JFastText.java:22)
at com.kokatto.demo.FastTextUtilTest.testGet(FastTextUtilTest.java:18)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:628)
at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:117)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:184)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:180)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:127)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:135)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:125)
at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:135)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:123)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:122)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:80)
at java.util.ArrayList.forEach(ArrayList.java:1257)
at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:139)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:125)
at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:135)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:123)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:122)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:80)
at java.util.ArrayList.forEach(ArrayList.java:1257)
at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:139)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:125)
at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:135)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:123)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:122)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:80)
at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:32)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:51)
at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:220)
at org.junit.platform.launcher.core.DefaultLauncher.lambda$execute$6(DefaultLauncher.java:188)
at org.junit.platform.launcher.core.DefaultLauncher.withInterceptedStreams(DefaultLauncher.java:202)
at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:181)
at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:128)
at com.microsoft.java.test.runner.junit5.CustomizedConsoleTestExecutor.lambda$executeTests$0(CustomizedConsoleTestExecutor.java:42)
at com.microsoft.java.test.runner.junit5.CustomContextClassLoaderExecutor.invoke(CustomContextClassLoaderExecutor.java:29)
at com.microsoft.java.test.runner.junit5.CustomizedConsoleTestExecutor.executeTests(CustomizedConsoleTestExecutor.java:38)
at com.microsoft.java.test.runner.junit5.CustomizedConsoleLauncher.execute(CustomizedConsoleLauncher.java:29)
at com.microsoft.java.test.runner.Launcher.main(Launcher.java:56)

Has anyone got this issue?
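
The message says the file "cannot be opened for saving", and the trace goes through runCmd, so the failure seems to occur while writing an output file. One possible cause, which is only an assumption since the issue does not show the command, is that the relative path is resolved against a different working directory under the test runner, or that the target directory does not exist. A minimal sketch that turns the output path into an absolute one and creates its parent directories before it is passed on; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of JFastText.
final class OutputPathPreparer {

    // Resolves the output path against the current working directory and
    // creates any missing parent directories, then returns the absolute path.
    static Path prepare(String outputPath) throws IOException {
        Path absolute = Paths.get(outputPath).toAbsolutePath().normalize();
        Path parent = absolute.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return absolute;
    }
}
```

Printing the returned path in the failing test would also show which directory the test runner is actually using.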

JVM crash in DeallocatorReference::clear method

Stack trace:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fcc0b988512, pid=26585, tid=0x00007fcb5e870700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_111-b14) (build 1.8.0_111-b14)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.111-b14 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libc.so.6+0x84512]  cfree+0x22
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/prod/ch/hs_err_pid26585.log
Compiled method (c1)  242883 15227       2       org.bytedeco.javacpp.Pointer$DeallocatorReference::clear (60 bytes)
 total in heap  [0x00007fcbf7193c90,0x00007fcbf71945b0] = 2336
 relocation     [0x00007fcbf7193db8,0x00007fcbf7193e48] = 144
 main code      [0x00007fcbf7193e60,0x00007fcbf71941e0] = 896
 stub code      [0x00007fcbf71941e0,0x00007fcbf71942d8] = 248
 oops           [0x00007fcbf71942d8,0x00007fcbf71942f0] = 24
 metadata       [0x00007fcbf71942f0,0x00007fcbf7194360] = 112
 scopes data    [0x00007fcbf7194360,0x00007fcbf7194490] = 304
 scopes pcs     [0x00007fcbf7194490,0x00007fcbf7194580] = 240
 dependencies   [0x00007fcbf7194580,0x00007fcbf71945a0] = 32
 nul chk table  [0x00007fcbf71945a0,0x00007fcbf71945b0] = 16
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

hs_err_pid26585.log is attached.

This never happens after updating fastText cpp code to the latest version. Is there any chance to update fastText or at least merge PR #22? Thank you.

Prediction throughput

Just curious what throughput people are getting for prediction requests; I can get up to 380 requests per second under highly concurrent load with a Clojure-based HTTP server wrapper, using a non-quantized fastText classification model.

I crawled 2 GB of training data; after training finishes, the JRE reports an error saying minidumps are not supported

JFastText jft = new JFastText();
// Train supervised model
jft.runCmd(new String[]{
"supervised",
"-input", "/resource/trainseg2.txt",
"-output", "/resource/trainseg2.model",
"-bucket", "100",
"-minCount", "1"
});
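This is my training code. With 1 GB of data there is no error, but beyond a certain size it fails.
Are my parameters wrong, or is there another cause? Uploading the log file failed, so I have copied it below.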

A fatal error has been detected by the Java Runtime Environment:

EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007fff89122c38, pid=6256, tid=0x000000000000126c

JRE version: Java(TM) SE Runtime Environment (8.0_102-b14) (build 1.8.0_102-b14)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.102-b14 mixed mode windows-amd64 compressed oops)

Problematic frame:

C [jniFastTextWrapper.dll+0x12c38]

Failed to write core dump. Minidumps are not enabled by default on client versions of Windows

If you would like to submit a bug report, please visit:

http://bugreport.java.com/bugreport/crash.jsp

The crash happened outside the Java Virtual Machine in native code.

See problematic frame for where to report the bug.

Loading the model only one time!

I am thinking of using this brilliant library in my project.
But for production-level code, one thing is on my mind: loading the model.

Could you tell us how to load the model only once?

"Once" in the sense that we load the model and then keep it in memory while classifying text.

For instance,

  1. load the model
  2. get text and classify it
  3. retain the model on memory --> go back to 2

The only thing I know of is the call below.
// Load model from file
jft.loadModel("src/test/resources/models/supervised.model.bin");

If we can do this, it will save resources, because a new message can be classified without loading the model into memory again.

Best,
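
The steps above can be sketched as a small holder that runs the loading code on first use and hands back the same object afterwards. This is a generic pattern, not something provided by JFastText; the class name is hypothetical. It only guarantees that loading happens once. Whether concurrent prediction calls on one instance are safe is a separate question that this page does not answer.

```java
import java.util.function.Supplier;

// Hypothetical helper; not part of JFastText.
final class LoadOnce<T> {
    private final Supplier<T> loader;
    private volatile T instance;

    LoadOnce(Supplier<T> loader) {
        this.loader = loader;
    }

    // Runs the loader on first use only; later calls return the cached object.
    // Double-checked locking on a volatile field keeps this safe across threads.
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = loader.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

A holder could be created once at application start with a loader that constructs a JFastText, calls loadModel on it, and returns it; each classification request then calls get() and reuses the loaded model.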

Unable to build using mvn

Hi, I am trying to build using the steps here.

But I am getting the following error when I run mvn package

Results :

Tests run: 9, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ jfasttext ---
[INFO] Building jar: /home/JFastText/target/jfasttext-0.4.jar
[INFO] 
[INFO] --- maven-javadoc-plugin:2.9.1:jar (attach-javadocs) @ jfasttext ---
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  19.733 s
[INFO] Finished at: 2020-04-24T20:00:01+05:30
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:2.9.1:jar (attach-javadocs) on project jfasttext: MavenReportException: Error while creating archive: Unable to find javadoc command: The environment variable JAVA_HOME is not correctly set. -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

Is there a mistake in my installation? Please help.
Thanks.
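
The Maven error above says the javadoc command could not be found through JAVA_HOME. A JDK normally ships javadoc in its bin directory while a JRE does not, so one thing worth checking is whether JAVA_HOME points at a JDK. The sketch below does that check; the class and method names are made up for illustration, and it assumes the standard JDK directory layout.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class JavadocCheck {
    /** Path where javadoc is expected under a given Java home (standard JDK layout assumed). */
    static Path javadocUnder(String javaHome) {
        boolean windows = System.getProperty("os.name").toLowerCase().contains("win");
        return Paths.get(javaHome, "bin", windows ? "javadoc.exe" : "javadoc");
    }

    public static void main(String[] args) {
        String home = System.getenv("JAVA_HOME");
        if (home == null || home.isEmpty()) {
            System.out.println("JAVA_HOME is not set");
            return;
        }
        Path javadoc = javadocUnder(home);
        System.out.println(javadoc + (Files.isExecutable(javadoc)
                ? " found"
                : " not found; JAVA_HOME may point at a JRE or a wrong directory"));
    }
}
```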

Trying to load .ftz model and getting "wrong file format"

@lidalei can you please advise?

import com.github.jfasttext.JFastText;

public class ApiExample {
    public static void main(String[] args) {
        JFastText jft = new JFastText();

        jft.loadModel("nlpData/model.ftz");

        // Do label prediction
        String text = "What is the most popular game in the US ?";
        JFastText.ProbLabel probLabel = jft.predictProba(text);
        System.out.printf("\nThe label of '%s' is '%s' with probability %f\n",
                text, probLabel.label, Math.exp(probLabel.logProb));
    }
}

Output:
Model file has wrong file format!
Process finished with exit code 1

ERROR

ERROR ApplicationMaster: User class threw exception: java.lang.UnsupportedClassVersionError: com/github/jfasttext/JFastText : Unsupported major.minor version 52.0
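
In this error, "major.minor version 52.0" is the class-file version of the JFastText class. Major version 52 corresponds to Java 8, so the message generally means the JVM running the job is older than Java 8. The sketch below shows the mapping; the class name is made up for illustration.

```java
final class ClassFileVersion {
    /** Java release for a class-file major version (45 is JDK 1.1, 52 is Java 8). */
    static int javaRelease(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("unknown major version: " + major);
        }
        return major - 44;
    }

    public static void main(String[] args) {
        System.out.println("major 52 -> Java " + javaRelease(52));
        // Highest class-file version the current JVM accepts, e.g. "52.0" on Java 8.
        System.out.println("this JVM supports up to "
                + System.getProperty("java.class.version"));
    }
}
```

Running the main method on the cluster's JVM shows the highest class-file version it accepts; if that value is below 52.0, the runtime needs to be Java 8 or newer.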

How to get nearest neighbours using the API?

Hi!

I want to get the nearest neighbours inside my Java code using the JFastText API; however, I cannot find which method I should use.
In the original library the method is "void FastText::nn(int32_t k)", but so far I have only been able to call it through the JFastText "command".

Different results from command line tool

The predict-prob method returns different results in Java and in the native command-line tool.
For example, see the results from test05PredictProba in the JFastTextTest class (or test with your own model).
The probability returned by Java is: 0.500125
The probability returned by the native C++ tool is: 0.500075

True, this looks like a minor difference, but when I test the probabilities with large model files, I see a huge gap between the returned probabilities.

JFastText takes a long time

Excuse me, I'm new to fastText and really appreciate your work, but I'm facing latency when using JFastText. My dataset is around 170 MB, and running my code took more than 8 hours. Do you know how I can make it faster? Thanks in advance.

error C2664: cannot convert parameter from "std::istringstream" to "int32_t"

Hi,

I encountered the following error while building the project. Any advice would be greatly appreciated.

\jfasttext\src\main\cpp\fasttext_wrapper.cc(77): error C2664: “void fasttext::FastText::predict(std::istream &,int32_t,bool,fasttext::real)”: cannot convert parameter 1 from "std::istringstream" to "int32_t"

Dimension of pretrained vectors does not match -dim option

Hi,
I am trying to use pretrained vectors with the supervised command.
The vector file has this first line: 170830 100
And I get this message: Dimension of pretrained vectors does not match -dim option
I tried setting the option "-dim","100",
but it did not help.
Do you have any idea, please?
Gil
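
In fastText's text vector format, the first line holds the number of words followed by the vector dimension, so the header quoted above describes 170830 words of dimension 100. A minimal sketch for reading that header as a sanity check follows; the VecHeader class is hypothetical and not part of JFastText.

```java
final class VecHeader {
    final int words; // number of vectors in the file
    final int dim;   // dimension of each vector

    VecHeader(int words, int dim) {
        this.words = words;
        this.dim = dim;
    }

    /** Parses the first line of a .vec file, e.g. "170830 100". */
    static VecHeader parse(String firstLine) {
        String[] parts = firstLine.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected '<words> <dim>' but got: " + firstLine);
        }
        return new VecHeader(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    public static void main(String[] args) {
        VecHeader header = parse("170830 100");
        System.out.println("words=" + header.words + ", dim=" + header.dim);
    }
}
```

If the parsed dimension already equals the value passed as -dim, the mismatch comes from somewhere else, for example the option not reaching the trainer; that possibility is a guess and is not confirmed in this thread.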

NullPointerException :: Code Correction

The demo example will sooner or later lead to a NullPointerException.

JFastText.ProbLabel probLabel = jft.predictProba(entireData); // This is bound to throw NullPointerException

Replace it with:

List<JFastText.ProbLabel> predictionList = jft.predictProba(entireData, 1);
if (!predictionList.isEmpty()) { // check the size of the output before fetching an item
    JFastText.ProbLabel probLabel = predictionList.get(0);
    // use probLabel.label and probLabel.logProb here
}
