fnlp's Issues

Test exception (based on the version after Qiu's POM change of 2015-12-11)

Following the steps in the QuickTutorial (https://github.com/xpqiu/fnlp/wiki/quicktutorial), I built the project. Testing word segmentation with the command java -Xmx1024m -Dfile.encoding=UTF-8 -classpath "fnlp-core/target/fnlp-core-2.1-SNAPSHOT.jar:libs/trove4j-3.0.3.jar:libs/commons-cli-1.2.jar" org.fnlp.nlp.cn.tag.CWSTagger -s models/seg.m "自然语言是人类交流和思维的主要工具,是人类智慧的结晶。" fails with the message "Could not find or load main class org.fnlp.nlp.cn.tag.CWSTagger".

Testing word segmentation in Eclipse throws this exception:
java.io.FileNotFoundException: ..\tmp\ar-train.txt (系统找不到指定的路径。)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(Unknown Source)
at org.fnlp.data.reader.SimpleFileReader.init(SimpleFileReader.java:100)
at org.fnlp.data.reader.SimpleFileReader.<init>(SimpleFileReader.java:90)
at org.fnlp.nlp.cn.anaphora.train.ARClassifier.train(ARClassifier.java:144)
at org.fnlp.nlp.cn.anaphora.train.ARClassifier.main(ARClassifier.java:75)
Exception in thread "main" java.lang.NullPointerException
at org.fnlp.data.reader.SimpleFileReader.hasNext(SimpleFileReader.java:116)
at org.fnlp.ml.types.InstanceSet.loadThruStagePipes(InstanceSet.java:214)
at org.fnlp.nlp.cn.anaphora.train.ARClassifier.train(ARClassifier.java:144)
at org.fnlp.nlp.cn.anaphora.train.ARClassifier.main(ARClassifier.java:75)

NullPointerException when invoking CNFactory.parse2T

[java] Exception in thread "main" java.lang.NullPointerException
[java] at org.fnlp.nlp.parser.dep.JointParser._getBestParse(JointParser.java:128)
[java] at org.fnlp.nlp.parser.dep.JointParser.parse2T(JointParser.java:220)
[java] at org.fnlp.nlp.parser.dep.JointParser.parse2T(JointParser.java:230)
[java] at org.fnlp.nlp.cn.CNFactory.parse2T(CNFactory.java:306)

I believe the code that causes this error is in the function "private Predict estimateActions(JointParsingState state)" in the file "JointParser.java", at line 167:
for (int i = 0; i < 2; i++) {
    Integer guess = ret.getLabel(i);
    if (guess == null) // bug: may be null, to be fixed. xpqiu
        break;
    String action = la.lookupString(guess);
    result.add(action, ret.getScore(i));
}

Could you please help solve or kindly give some tips on solving this problem?

NullPointerException at this.labels = this.factory.DefaultLabelAlphabet();

CNFactory factory = CNFactory.getInstance("models"); throws an exception as soon as this statement runs. The directory path is correct.

Exception in thread "main" java.lang.NullPointerException
at org.fnlp.nlp.cn.tag.AbstractTagger.<init>(AbstractTagger.java:79)
at org.fnlp.nlp.cn.tag.CWSTagger.<init>(CWSTagger.java:75)
at org.fnlp.nlp.cn.CNFactory.loadSeg(CNFactory.java:219)
at org.fnlp.nlp.cn.CNFactory.getInstance(CNFactory.java:164)
at org.fnlp.nlp.cn.CNFactory.getInstance(CNFactory.java:144)

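Joining the group

Entering "Neuro Linguistic Programming" is reported as a wrong answer? Please advise.

CNFactory's .ner method should not be a static method

The example in the Quick Tutorial on the Wiki page: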

public static void main(String[] args) throws Exception {

    // Create the Chinese processing factory and initialize it with the model files in the "models" directory
    CNFactory factory = CNFactory.getInstance("models");

    // Tag a sentence containing entity names with the tagger and get the result
    HashMap result = factory.ner("詹姆斯·默多克和丽贝卡·布鲁克斯 鲁珀特·默多克旗下的美国小报《纽约邮报》的职员被公司律师告知,保存任何也许与电话窃听及贿赂有关的文件。");

    // Print the tagging result
    System.out.println(result);
}

Because .ner is a static method, the IDE shows a warning/suggestion that .ner should be called through the class name CNFactory, but calling it that way returns null.
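One possible explanation for the behaviour reported above is that the static method reads state that is only set up inside getInstance(). The sketch below illustrates that pattern with a hypothetical DemoFactory class; it is not fnlp's actual implementation, and every name in it is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a factory whose tagger is only set up by getInstance().
class DemoFactory {
    private static DemoFactory instance;
    private static Map<String, String> tagger; // stays null until getInstance() runs

    static DemoFactory getInstance(String modelDir) {
        if (instance == null) {
            instance = new DemoFactory();
            tagger = new HashMap<>();
            tagger.put("loadedFrom", modelDir);
        }
        return instance;
    }

    // Static method that depends on state initialised elsewhere.
    static Map<String, String> ner(String text) {
        if (tagger == null) {
            return null; // nothing has been loaded yet
        }
        Map<String, String> result = new HashMap<>(tagger);
        result.put("input", text);
        return result;
    }
}

class StaticCallDemo {
    public static void main(String[] args) {
        // Called through the class name before any initialisation: prints null.
        System.out.println(DemoFactory.ner("text"));
        DemoFactory.getInstance("models");
        // After getInstance() the same static call has state to work with: prints true.
        System.out.println(DemoFactory.ner("text") != null);
    }
}
```

If this is what happens, making the method an instance method (as the issue title proposes) would force callers to obtain an initialised factory first.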

之字结构: a typo?

In the class org.fnlp.nlp.corpus.ctbconvert.DependentTreeProducter, is the makeFirstClass method mistyped, or is it meant to be defined this way? At line 614:
else if(is之字结构(root.label))
    root.label.setDepClass("的字结构");

English speaker

Hello,

I'm a software developer in the US. I am interested in multilingual computer communication. Chinese would be very useful for an exploratory project; however, I don't speak Chinese. Could anyone there help me? I would think someone would find this project interesting, and I (potentially) have other talented collaborators (depending on whether they can be convinced of its worthiness, etc.).

[Anaphora] java.lang.NullPointerException in the resolve method of the class

public class AnaphoraResolution {
    public static void main(String args[]) throws Exception {
        String str2 = "复旦大学创建于1905年,它位于上海市,这个大学培育了好多优秀的学生。";

        String str3[] = {"复旦","大学","创建","于","1905年",",","它","位于","上海市",",","这个","大学","培育","了","好多","优秀","的","学生","。"};
        String str4[] = {"专有名","名词","动词","介词","时间短语","标点","代词","动词","专有名","标点","限定词","名词","动词","动态助词","数词","形容词","结构助词","名词","标点"};
        String str5[][][] = new String[1][2][str3.length];
        str5[0][0] = str3;
        str5[0][1] = str4;
        Anaphora aa2 = new Anaphora("../models/ar.m");
        LinkedList<EntityGroup> res3 = aa2.resolve(str5, str2);
        System.out.println(res3);
    }
}

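When using the anaphora resolver, the line LinkedList res3 = aa2.resolve(str5,str2);
always throws java.lang.NullPointerException. I have tried every method in the demo and they all have this problem.

Problem with entity recognition

Running the org.fnlp.demo.nlp.NamedEntityRecognition class under fnlp-demo fails. The cause seems to be that part-of-speech tagging produces the label 实体名 ("entity name"), but the org.fnlp.nlp.cn.PartOfSpeech enum has no constant with that name, so that class's isEntiry function fails.

app/lucene/demo/BuildIndex.java fails with the Lucene 5.0.0 package!

Calling addDocument or updateDocument on IndexWriter throws an exception in both cases. Does fnlp currently only support Lucene up to 4.7?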

Exception in thread "main" java.lang.AbstractMethodError: org.apache.lucene.analysis.Analyzer.createComponents(Ljava/lang/String;)Lorg/apache/lucene/analysis/Analyzer$TokenStreamComponents;
    at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:179)
    at org.apache.lucene.document.Field.tokenStream(Field.java:556)
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:606)
    at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:344)
    at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:300)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:231)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:449)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1349)

Tests fail

Running the command mvn clean package


Results :

Tests in error:
org.fnlp.nlp.cn.tag.POSTaggerTest: java.io.FileNotFoundException: ../models/pos.m (No such file or directory)
org.fnlp.nlp.cn.tag.CWSTaggerTest: java.io.FileNotFoundException: ../models/seg.m (No such file or directory)

Tests run: 33, Failures: 0, Errors: 2, Skipped: 0

Running on Windows requires changing the classpath to add .;

java -Xmx1024m -Dfile.encoding=UTF-8 -classpath ".;fnlp-core/target/fnlp-core-2.1-SNAPSHOT.jar;libs/trove4j-3.0.3.jar;libs/commons-cli-1.2.jar" org.fnlp.nlp.cn.tag.POSTagger -s models/seg.m models/pos.m "周杰伦出生于**,生日为79年1月18日,他曾经的绯闻女友是蔡依林。"
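The two commands on this page differ in the classpath separator: the Unix-style one uses ":" and the Windows one uses ";". A minimal sketch of building the same classpath string portably is shown below. ClasspathDemo and its join helper are hypothetical names introduced for illustration; File.pathSeparator is the standard Java constant for the platform's separator.

```java
import java.io.File;

class ClasspathDemo {
    // Joins classpath entries with the given separator
    // (";" on Windows, ":" on Unix-like systems).
    static String join(String separator, String... entries) {
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        // File.pathSeparator picks the right separator for the current platform.
        String cp = join(File.pathSeparator,
                ".",
                "fnlp-core/target/fnlp-core-2.1-SNAPSHOT.jar",
                "libs/trove4j-3.0.3.jar",
                "libs/commons-cli-1.2.jar");
        System.out.println("java -classpath \"" + cp + "\" ...");
    }
}
```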

Build error in Eclipse


Test set: org.fnlp.nlp.cn.tag.CWSTaggerTest

Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.209 sec <<< FAILURE!
testTagString2(org.fnlp.nlp.cn.tag.CWSTaggerTest) Time elapsed: 0.017 sec <<< FAILURE!
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at org.fnlp.nlp.cn.tag.CWSTaggerTest.testTagString2(CWSTaggerTest.java:65)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

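A question

Hello, how were the part-of-speech tagging and named-entity models obtained? If I want to add a new entity type, such as school names, what should I do?
Also, I would like to add a word to the dictionary together with its entity type. How should I do that? Thank you.

Trying to run FNLP in Hadoop, but models/pos.m cannot be found

I have put a copy in each of these places:
in the submitted program;
in hadoop/lib, as loose files;
in hadoop/lib, as a jar;
in hdfs /home//models;
in hdfs /models.
Where should it go?
Or should I try putting it in /tmp?

Inaccurate word segmentation

Take the following example: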

小明硕士毕业于**科学院计算所,后在日本京都大学深造
小明 硕士 毕业于 ** 科学院 计算 所 , 后 在 日本 京都 大学 深造

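The problem is the token 毕业于.

请勿发布广告信息或其他无关评论,否则将会删除评论并扣分,严重者给予封号处理。
请 勿 发布 广告 信息 或 其他 无关 评论 , 否则 将 会 删除 评论 并扣分 , 严重者 给予 封号 处理 。

The problem is the token 并扣分.

These two are very, very obvious errors. With segmentation this poor, downstream keyword extraction and part-of-speech tagging become hard to do well.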

Highlighter output is wrong when using FNLPAnalyzer

QueryStr: 太平洋
With SmartChineseAnalyzer, the result is:
沃克环流 在赤道附近的<font color='red'>太平洋</font>海区,信风驱使着赤道暖流自东向西流。
With FNLPAnalyzer, the result is:
沃克环流 在赤道附近<font color='red'>的太</font><font color='red'>在赤</font>平洋海<font color='red'>在赤</font><font color='red'> 在</font>区,信风驱使着赤道暖流自东向西流。

Problem installing the package hosted on Maven

After configuring pom.xml, I ran the following command:
mvn clean install
It produced the following error:
[INFO] ------------------------------------------------------------------------
Downloading: https://oss.sonatype.org/content/repositories/snapshots/org/fnlp/fnlp-core/2.0/fnlp-core-2.0.pom
Downloading: https://repo.maven.apache.org/maven2/org/fnlp/fnlp-core/2.0/fnlp-core-2.0.pom
Downloaded: https://repo.maven.apache.org/maven2/org/fnlp/fnlp-core/2.0/fnlp-core-2.0.pom (5 KB at 4.4 KB/sec)
Downloading: https://oss.sonatype.org/content/repositories/snapshots/org/fnlp/fnlp-all/2.0-SNAPSHOT/maven-metadata.xml
Downloading: https://oss.sonatype.org/content/repositories/snapshots/org/fnlp/fnlp-all/2.0-SNAPSHOT/fnlp-all-2.0-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.394 s
[INFO] Finished at: 2015-10-17T14:55:30-04:00
[INFO] Final Memory: 10M/305M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project cip-guuud-lib: Could not resolve dependencies for project batc-cip:cip-guuud-lib:pom:0.0.1-SNAPSHOT: Failed to collect dependencies at org.fnlp:fnlp-core:jar:2.0: Failed to read artifact descriptor for org.fnlp:fnlp-core:jar:2.0: Could not find artifact org.fnlp:fnlp-all:pom:2.0-SNAPSHOT in sonatype-snapshots (https://oss.sonatype.org/content/repositories/snapshots) -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project cip-guuud-lib: Could not resolve dependencies for project batc-cip:cip-guuud-lib:pom:0.0.1-SNAPSHOT: Failed to collect dependencies at org.fnlp:fnlp-core:jar:2.0

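The rest is omitted. This project uses fnlp, Stanford NLP core, and OpenNLP; fnlp is mainly used for analysing Chinese. The problem is that the artifact cannot be downloaded correctly from the project hosted on Maven. How can this be solved? Many thanks.

Downloaded code is missing classes

org.fnlp.wsytry.MultiCorpusClusterTagger
reports the following as missing: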
org.fnlp.corpus.transform.tree.RelationalTree;
org.fnlp.ml.classifier.struct.inf.MultiCorpusViterbi;
org.fnlp.ml.classifier.struct.update.MultiCorpusViterbiPAUpdate;

A small problem with part-of-speech tagging in version 2.1 (punctuation misrecognized)

From the quick start:
java -Xmx1024m -Dfile.encoding=UTF-8 -classpath "fnlp-core/target/fnlp-core-2.1-SNAPSHOT.jar:libs/trove4j-3.0.3.jar:libs/commons-cli-1.2.jar" org.fnlp.nlp.cn.tag.POSTagger -s models/seg.m models/pos.m "周杰伦出生于台 湾,生日为79年1月18日,他曾经的绯闻女友是蔡依林。"

Output:
周杰伦/人名 出生/动词 于/介词 **/地名 ,/动词 生日/名词 为/介词 79年/时间短语 1月/时间短语 18日/时间短语 ,/标点 他/人称代词 曾经/形容词 的/结构助词 绯闻/名词 女友/名词 是/动词 蔡依林/人名 。/标点

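The two commas are tagged inconsistently: the first as 动词 (verb), the second as 标点 (punctuation).

Exception when running the demo (based on the version after Qiu's POM change of 2015-12-11)

Exception when running the NLP test: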
模型文件读入错误: ../models/pos.m
java.io.EOFException: Unexpected end of ZLIB input stream
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2310)
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3063)
at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2864)
at java.io.ObjectInputStream.readString(ObjectInputStream.java:1638)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1341)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at gnu.trove.map.custom_hash.TObjectIntCustomHashMap.readExternal(TObjectIntCustomHashMap.java:1139)
at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.fnlp.nlp.cn.tag.AbstractTagger.loadFrom(AbstractTagger.java:205)
at org.fnlp.nlp.cn.tag.AbstractTagger.<init>(AbstractTagger.java:73)
at org.fnlp.nlp.cn.tag.POSTagger.<init>(POSTagger.java:114)
at org.fnlp.demo.nlp.PartsOfSpeechTag.main(PartsOfSpeechTag.java:45)
at org.fnlp.demo.NLPTest.test(NLPTest.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:78)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:212)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)

org.fnlp.util.exception.LoadModelException: java.io.EOFException: Unexpected end of ZLIB input stream
at org.fnlp.nlp.cn.tag.AbstractTagger.loadFrom(AbstractTagger.java:208)
at org.fnlp.nlp.cn.tag.AbstractTagger.<init>(AbstractTagger.java:73)
at org.fnlp.nlp.cn.tag.POSTagger.<init>(POSTagger.java:114)
at org.fnlp.demo.nlp.PartsOfSpeechTag.main(PartsOfSpeechTag.java:45)
at org.fnlp.demo.NLPTest.test(NLPTest.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:78)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:212)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.io.EOFException: Unexpected end of ZLIB input stream
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2310)
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3063)
at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2864)
at java.io.ObjectInputStream.readString(ObjectInputStream.java:1638)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1341)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at gnu.trove.map.custom_hash.TObjectIntCustomHashMap.readExternal(TObjectIntCustomHashMap.java:1139)
at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.fnlp.nlp.cn.tag.AbstractTagger.loadFrom(AbstractTagger.java:205)
... 32 more
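The trace above passes through GZIPInputStream, which suggests the model file is read as gzip data and that the data ends early, for example after an incomplete download or checkout. That is an inference from the stack trace, not something confirmed in the issue. Under that assumption, a small check like the following sketch can tell whether a gzip stream decompresses to its end; GzipCheck and its methods are hypothetical helpers, and the demo uses in-memory data in place of a real model file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipCheck {
    // Returns true if the stream can be decompressed to its end without an error.
    static boolean isCompleteGzip(InputStream raw) {
        try (GZIPInputStream in = new GZIPInputStream(raw)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // Discard the data; only completeness matters here.
            }
            return true;
        } catch (IOException e) { // covers EOFException and ZipException
            return false;
        }
    }

    // Compresses data in memory, used here only to build test input.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] full = gzip(new byte[100_000]);
        byte[] cut = Arrays.copyOf(full, full.length / 2); // simulate a truncated file
        System.out.println(isCompleteGzip(new ByteArrayInputStream(full)));
        System.out.println(isCompleteGzip(new ByteArrayInputStream(cut)));
    }
}
```

To check a file on disk, the same method can be given a java.io.FileInputStream opened on the model path.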

Test program fails

Running the following code produces an error:

	CNFactory factory = CNFactory.getInstance("models");
	HashMap<String, String> result = factory.ner("詹姆斯·默多克和丽贝卡·布鲁克斯 鲁珀特·默多克旗下的美国小报《纽约邮报》的职员被公司律师告知,保存任何也许与电话窃听及贿赂有关的文件。");

	// Print the tagging result
	System.out.println(result);

The error reported is:

Exception in thread "main" java.lang.NoClassDefFoundError: gnu/trove/map/hash/TCharCharHashMap
	at org.fnlp.nlp.cn.ChineseTrans.ensureST(ChineseTrans.java:54)
	at org.fnlp.nlp.cn.ChineseTrans.<init>(ChineseTrans.java:48)
	at org.fnlp.nlp.cn.CNFactory.<clinit>(CNFactory.java:54)
	at Test.main(Test.java:9)
Caused by: java.lang.ClassNotFoundException: gnu.trove.map.hash.TCharCharHashMap
	at java.net.URLClassLoader.findClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	... 4 more

I downloaded trove from the link you provided, but it does not contain the TCharCharHashMap class.
How can I solve this problem?
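A quick way to confirm whether the jar on the classpath actually provides a given class is to ask the class loader for it. The sketch below is a generic check; ClassPresence and isOnClasspath are hypothetical names, and the class name passed in main is the one from the stack trace above.

```java
class ClassPresence {
    // True if the named class can be found by this class's class loader.
    // The class is located but not initialised.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClassPresence.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "gnu.trove.map.hash.TCharCharHashMap";
        System.out.println(name + " present: " + isOnClasspath(name));
    }
}
```

Running this with the same -classpath as the failing program shows whether the trove jar in use contains the class. The commands elsewhere on this page use libs/trove4j-3.0.3.jar.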

Memory usage of the models

It seems that memory usage is quite a problem.
According to my test,
after loading the dep.m model, memory usage increased by over 450 MB;
after loading the pos.m model, memory usage increased by another 240 MB.

I also noticed that dep.m and pos.m are both smaller than 50 MB.

I'm not very familiar with Java, but I think this kind of memory usage can be problematic.

I used runtime.totalMemory() and runtime.freeMemory() to measure memory usage; I'm not sure whether this makes sense.

I'll keep investigating this memory usage issue, and I look forward to your help.

Thanks!
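The measurement described in this issue can be sketched as follows. Used heap is the allocated heap (totalMemory) minus the free part of it (freeMemory). The figure is approximate because garbage that has not yet been collected still counts as used, and System.gc() is only a hint to the JVM. MemoryProbe is a hypothetical name, and the int array stands in for loading a model.

```java
class MemoryProbe {
    // Heap currently in use, in bytes: allocated heap minus its free part.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.gc(); // a hint only; the JVM may ignore it
        long before = usedBytes();

        int[] payload = new int[5_000_000]; // stand-in for loading a model (about 20 MB)

        long after = usedBytes();
        System.out.printf("approx. delta: %d MB%n", (after - before) / (1024 * 1024));
        System.out.println(payload.length); // keeps the array reachable until here
    }
}
```

A deserialized model can occupy much more heap than its compressed file on disk, so a difference between file size and measured usage is not by itself a sign of a measurement error.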

Why use MurmurHash instead of String.hashCode()?

My profiling shows that String.getBytes() is quite slow, and MurmurHash calls getBytes() for every single hash because it is byte-oriented.

If the primary purpose is speeding up hashing, why not use String.hashCode() instead?
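To make the trade-off concrete, the sketch below hashes a string directly from its chars, so no byte array is allocated per call. It uses FNV-1a rather than MurmurHash, purely as a short illustration of a char-oriented hash; it is not fnlp's code, and CharHashDemo is a hypothetical name. String.hashCode() also works from the chars and caches its result per String instance.

```java
class CharHashDemo {
    // FNV-1a (32-bit) fed from chars directly, low byte first, then high byte.
    static int fnv1aChars(CharSequence s) {
        int h = 0x811C9DC5;                        // FNV offset basis
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            h = (h ^ (c & 0xFF)) * 0x01000193;     // mix low byte, multiply by FNV prime
            h = (h ^ (c >>> 8)) * 0x01000193;      // mix high byte
        }
        return h;
    }

    public static void main(String[] args) {
        String word = "自然语言";
        System.out.println("String.hashCode(): " + word.hashCode());
        System.out.println("fnv1aChars():      " + fnv1aChars(word));
    }
}
```

Whether either alternative is acceptable depends on why MurmurHash was chosen, for example for its distribution or for hash values that stay stable across implementations, which this issue does not state.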

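Introduce regex-based segmentation

Is regex-based segmentation already available? If so, please tell me how to call it; if not, could it be added?

NullPointerException when running the demo

Following the QuickTutorial, I set everything up step by step in Eclipse. Running the demo throws java.lang.NullPointerException. Looking at the source, it seems to be thrown at labels = factory.DefaultLabelAlphabet();, a statement in the constructor public AbstractTagger(String file) throws LoadModelException of the class org.fnlp.nlp.cn.tag.AbstractTagger.

Please advise.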
