fudannlp / fnlp
Toolkit for Chinese natural language processing
License: GNU Lesser General Public License v3.0
Following the steps in the QuickTutorial (https://github.com/xpqiu/fnlp/wiki/quicktutorial), I built the project. When I test word segmentation with the command java -Xmx1024m -Dfile.encoding=UTF-8 -classpath "fnlp-core/target/fnlp-core-2.1-SNAPSHOT.jar:libs/trove4j-3.0.3.jar:libs/commons-cli-1.2.jar" org.fnlp.nlp.cn.tag.CWSTagger -s models/seg.m "自然语言是人类交流和思维的主要工具,是人类智慧的结晶。", I get the error "Could not find or load main class org.fnlp.nlp.cn.tag.CWSTagger".
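One possible cause, assuming the command was run on Windows: the JVM expects `;` between classpath entries on Windows and `:` on Linux and macOS, so a classpath written with the wrong separator is treated as one nonexistent entry and the main class cannot be found. The sketch below builds the classpath string with `File.pathSeparator`. The class name `ClasspathBuilder` is hypothetical; the jar paths are the ones from the command above.

```java
import java.io.File;

class ClasspathBuilder {
    // Joins classpath entries with the separator of the current platform
    // (";" on Windows, ":" on Linux and macOS).
    static String join(String... entries) {
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        String cp = join(
            "fnlp-core/target/fnlp-core-2.1-SNAPSHOT.jar",
            "libs/trove4j-3.0.3.jar",
            "libs/commons-cli-1.2.jar");
        // Prints the -classpath argument to paste into the java command.
        System.out.println("-classpath \"" + cp + "\"");
    }
}
```

If the separator is already correct, the next things to check are that the jar was actually built at `fnlp-core/target/` and that the command is run from the project root.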
An exception is thrown when testing word segmentation in Eclipse:
java.io.FileNotFoundException: ..\tmp\ar-train.txt (The system cannot find the path specified.)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(Unknown Source)
at org.fnlp.data.reader.SimpleFileReader.init(SimpleFileReader.java:100)
at org.fnlp.data.reader.SimpleFileReader.<init>(SimpleFileReader.java:90)
at org.fnlp.nlp.cn.anaphora.train.ARClassifier.train(ARClassifier.java:144)
at org.fnlp.nlp.cn.anaphora.train.ARClassifier.main(ARClassifier.java:75)
Exception in thread "main" java.lang.NullPointerException
at org.fnlp.data.reader.SimpleFileReader.hasNext(SimpleFileReader.java:116)
at org.fnlp.ml.types.InstanceSet.loadThruStagePipes(InstanceSet.java:214)
at org.fnlp.nlp.cn.anaphora.train.ARClassifier.train(ARClassifier.java:144)
at org.fnlp.nlp.cn.anaphora.train.ARClassifier.main(ARClassifier.java:75)
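The two traces above suggest that the missing `..\tmp\ar-train.txt` is reported but not treated as fatal, so the reader is later used without an open stream and fails with a `NullPointerException`. A minimal sketch of failing fast instead, using only the JDK; `TrainingInputCheck` and `requireReadable` are hypothetical names, not part of FNLP:

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class TrainingInputCheck {
    // Returns the resolved path if it is a readable regular file; otherwise throws
    // with the absolute path, which makes a wrong working directory easy to spot.
    static Path requireReadable(String file) throws FileNotFoundException {
        Path p = Paths.get(file).toAbsolutePath().normalize();
        if (!Files.isRegularFile(p) || !Files.isReadable(p)) {
            throw new FileNotFoundException("Training file not found: " + p);
        }
        return p;
    }

    public static void main(String[] args) {
        try {
            System.out.println("Found " + requireReadable("../tmp/ar-train.txt"));
        } catch (FileNotFoundException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Because the path is relative, the result depends on the working directory that Eclipse uses for the run configuration.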
[java] Exception in thread "main" java.lang.NullPointerException
[java] at org.fnlp.nlp.parser.dep.JointParser._getBestParse(JointParser.java:128)
[java] at org.fnlp.nlp.parser.dep.JointParser.parse2T(JointParser.java:220)
[java] at org.fnlp.nlp.parser.dep.JointParser.parse2T(JointParser.java:230)
[java] at org.fnlp.nlp.cn.CNFactory.parse2T(CNFactory.java:306)
I believe the code that causes this error is the function "private Predict estimateActions(JointParsingState state)" in file "JointParser.java", on line 167
for (int i = 0; i < 2; i++) {
    Integer guess = ret.getLabel(i);
    if (guess == null) // bug: may be null, to be fixed. xpqiu
        break;
    String action = la.lookupString(guess);
    result.add(action, ret.getScore(i));
}
Could you please help solve or kindly give some tips on solving this problem?
...
CNFactory factory = CNFactory.getInstance("models"); This statement throws an exception as soon as it runs, even though the directory path is correct.
Exception in thread "main" java.lang.NullPointerException
at org.fnlp.nlp.cn.tag.AbstractTagger.<init>(AbstractTagger.java:79)
at org.fnlp.nlp.cn.tag.CWSTagger.<init>(CWSTagger.java:75)
at org.fnlp.nlp.cn.CNFactory.loadSeg(CNFactory.java:219)
at org.fnlp.nlp.cn.CNFactory.getInstance(CNFactory.java:164)
at org.fnlp.nlp.cn.CNFactory.getInstance(CNFactory.java:144)
Entering "Neuro Linguistic Programming" is reported as a wrong answer? Any advice would be appreciated.
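The example in the Quick Tutorial on the Wiki page:
public static void main(String[] args) throws Exception {
    // Create the Chinese processing factory and initialize it with the model files in the "models" directory
    CNFactory factory = CNFactory.getInstance("models");
    // Tag a sentence containing entity names and get the result
    HashMap result = factory.ner("詹姆斯·默多克和丽贝卡·布鲁克斯 鲁珀特·默多克旗下的美国小报《纽约邮报》的职员被公司律师告知,保存任何也许与电话窃听及贿赂有关的文件。");
    // Print the tagging result
    System.out.println(result);
}
Because .ner is a static method, the IDE gives a warning/suggestion that .ner should be called through the class name CNFactory, but doing so returns null.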
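One plausible explanation, assuming the static methods read shared state that only `getInstance` initializes: if the `getInstance` call is dropped when switching to the class-name form, the static method runs against an unloaded model and returns null. The sketch below reproduces that pattern in isolation. `LazyFactory`, `tag`, and the one-entry "model" are hypothetical stand-ins and do not reflect the actual `CNFactory` implementation.

```java
import java.util.HashMap;
import java.util.Map;

class LazyFactory {
    // Shared state that stays null until getInstance() has loaded it.
    private static Map<String, String> model;
    private static final LazyFactory INSTANCE = new LazyFactory();

    static LazyFactory getInstance(String modelDir) {
        if (model == null) {
            model = new HashMap<>();
            model.put("Murdoch", "PERSON"); // stand-in for loading files from modelDir
        }
        return INSTANCE;
    }

    // Static lookup: returns null when the model has not been loaded yet.
    static String tag(String word) {
        return model == null ? null : model.get(word);
    }

    public static void main(String[] args) {
        System.out.println(LazyFactory.tag("Murdoch")); // model not loaded yet
        LazyFactory.getInstance("models");
        System.out.println(LazyFactory.tag("Murdoch")); // model loaded
    }
}
```

Under that assumption, keeping the `getInstance("models")` call and then calling the static method through the class name would satisfy the IDE without changing the result.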
Is the link (https://github.com/xpqiu/fnlp/blob/master/Benchmark) to the benchmark outdated?
In the makeFirstClass method of the class org.fnlp.nlp.corpus.ctbconvert.DependentTreeProducter, is the following a typo, or is it defined this way on purpose?
614: else if(is之字结构(root.label))
root.label.setDepClass("的字结构");
Hello,
I'm a software developer in the US. I am interested in multilingual computer communication. The Chinese language would be very useful for an exploratory project; however, I don't speak Chinese. Could anyone there help me? I would think someone would find this project interesting, and I (potentially) have other talented collaborators (depending on whether they can be convinced of its worthiness, etc.).
public class AnaphoraResolution {
public static void main(String args[]) throws Exception{
String str2 = "复旦大学创建于1905年,它位于上海市,这个大学培育了好多优秀的学生。";
String str3[] = {"复旦","大学","创建","于","1905年",",","它","位于","上海市",",","这个","大学","培育","了","好多","优秀","的","学生","。"};
String str4[] = {"专有名","名词","动词","介词","时间短语","标点","代词","动词","专有名","标点","限定词","名词","动词","动态助词","数词","形容词","结构助词","名词","标点"};
String str5[][][] = new String[1][2][str3.length];
str5[0][0] = str3;
str5[0][1] = str4;
Anaphora aa2 = new Anaphora("../models/ar.m");
LinkedList<EntityGroup> res3 = aa2.resolve(str5,str2);
System.out.println(res3);
}
}
When using the anaphora resolver, the line LinkedList res3 = aa2.resolve(str5,str2);
always throws java.lang.NullPointerException. I have tried all the methods in the demo and they all have this problem.
As the title says. Thank you!
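Running the org.fnlp.demo.nlp.NamedEntityRecognition class in fnlp-demo throws an error. The cause seems to be that part-of-speech tagging produces the label 实体名 (entity name), but the org.fnlp.nlp.cn.PartOfSpeech enum has no constant with that name, so the class's isEntiry function fails.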
An exception occurs when calling either addDocument or updateDocument on IndexWriter. Does fnlp currently only support Lucene up to 4.7?
Exception in thread "main" java.lang.AbstractMethodError: org.apache.lucene.analysis.Analyzer.createComponents(Ljava/lang/String;)Lorg/apache/lucene/analysis/Analyzer$TokenStreamComponents;
at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:179)
at org.apache.lucene.document.Field.tokenStream(Field.java:556)
at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:606)
at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:344)
at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:300)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:231)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:449)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1349)
Running the command mvn clean package:
Results :
Tests in error:
org.fnlp.nlp.cn.tag.POSTaggerTest: java.io.FileNotFoundException: ../models/pos.m (No such file or directory)
org.fnlp.nlp.cn.tag.CWSTaggerTest: java.io.FileNotFoundException: ../models/seg.m (No such file or directory)
Tests run: 33, Failures: 0, Errors: 2, Skipped: 0
Are there any examples, existing commands, or code showing how to call it?
As the title says. Thank you.
I cannot import the project by following the tutorial. I am using IntelliJ IDEA.
java -Xmx1024m -Dfile.encoding=UTF-8 -classpath ".;fnlp-core/target/fnlp-core-2.1-SNAPSHOT.jar;libs/trove4j-3.0.3.jar;libs/commons-cli-1.2.jar" org.fnlp.nlp.cn.tag.POSTagger -s models/seg.m models/pos.m "周杰伦出生于**,生日为79年1月18日,他曾经的绯闻女友是蔡依林。"
Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.209 sec <<< FAILURE!
testTagString2(org.fnlp.nlp.cn.tag.CWSTaggerTest) Time elapsed: 0.017 sec <<< FAILURE!
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at org.fnlp.nlp.cn.tag.CWSTaggerTest.testTagString2(CWSTaggerTest.java:65)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
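Hello, how were the part-of-speech tagging and named entity models obtained? If I want to add a new entity type, such as school names, what should I do?
Also, I would like to add a word to the dictionary together with its entity type. How should I do that? Thanks.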
I have put a copy in each of the following places:
in the submitted program
in hadoop/lib as loose files
in hadoop/lib as a jar
in hdfs /home//models
in hdfs /models
Where should it actually go?
Or should I try putting it in /tmp?
For example:
小明硕士毕业于**科学院计算所,后在日本京都大学深造
小明 硕士 毕业于 ** 科学院 计算 所 , 后 在 日本 京都 大学 深造
Problem: the 于 in 毕业于
请勿发布广告信息或其他无关评论,否则将会删除评论并扣分,严重者给予封号处理。
请 勿 发布 广告 信息 或 其他 无关 评论 , 否则 将 会 删除 评论 并扣分 , 严重者 给予 封号 处理 。
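Problem: the 并 in 并扣分
These two are very, very obvious errors. With segmentation this poor, the downstream keyword extraction and part-of-speech tagging become hard to do well.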
QueryStr: 太平洋
With SmartChineseAnalyzer,
the result is: 沃克环流 在赤道附近的<font color='red'>太平洋</font>海区,信风驱使着赤道暖流自东向西流。
With FNLPAnalyzer,
the result is: 沃克环流 在赤道附近<font color='red'>的太</font><font color='red'>在赤</font>平洋海<font color='red'>在赤</font><font color='red'> 在</font>区,信风驱使着赤道暖流自东向西流。
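The misplaced and repeated fragments in the second result are what one would expect if the highlighter received token offsets that do not match positions in the original text. That diagnosis is an assumption, not something confirmed in this report. The sketch below shows, with plain JDK string operations, how a highlighter depends on correct start and end offsets; `OffsetHighlighter` is a hypothetical class and does not use the Lucene API.

```java
class OffsetHighlighter {
    // Wraps text[start, end) in a red <font> tag; offsets refer to the original text.
    static String highlight(String text, int start, int end) {
        if (start < 0 || end > text.length() || start > end) {
            throw new IllegalArgumentException("Bad offsets: " + start + ".." + end);
        }
        return text.substring(0, start)
            + "<font color='red'>" + text.substring(start, end) + "</font>"
            + text.substring(end);
    }

    public static void main(String[] args) {
        String text = "在赤道附近的太平洋海区";
        String query = "太平洋";
        int start = text.indexOf(query);
        // Correct offsets wrap exactly the query term.
        System.out.println(highlight(text, start, start + query.length()));
        // Offsets shifted by one, as a faulty tokenizer might report, wrap the wrong characters.
        System.out.println(highlight(text, start - 1, start + 1));
    }
}
```

Printing each token's term together with its start and end offsets from the analyzer, and comparing them with the source text, would confirm or rule out this explanation.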
I downloaded the project and ran it locally, but it fails at runtime because the file seg.m is missing from the models directory.
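I have confirmed that there is no fnlp package under the org/ subpath, yet the package can be found through the OSS search engine. After checking, the package is located at https://oss.sonatype.org/content/repositories/release/org/fnlp/. I hope this bug can be fixed soon; it causes a lot of unnecessary trouble for installation and deployment. Thanks!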
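Looking forward to the addition of an entity relation extraction model.
The sample training file example-data/structure no longer exists. Could you provide it? Thanks.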
After configuring pom.xml, running the command
mvn clean install produces the following error:
[INFO] ------------------------------------------------------------------------
Downloading: https://oss.sonatype.org/content/repositories/snapshots/org/fnlp/fnlp-core/2.0/fnlp-core-2.0.pom
Downloading: https://repo.maven.apache.org/maven2/org/fnlp/fnlp-core/2.0/fnlp-core-2.0.pom
Downloaded: https://repo.maven.apache.org/maven2/org/fnlp/fnlp-core/2.0/fnlp-core-2.0.pom (5 KB at 4.4 KB/sec)
Downloading: https://oss.sonatype.org/content/repositories/snapshots/org/fnlp/fnlp-all/2.0-SNAPSHOT/maven-metadata.xml
Downloading: https://oss.sonatype.org/content/repositories/snapshots/org/fnlp/fnlp-all/2.0-SNAPSHOT/fnlp-all-2.0-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.394 s
[INFO] Finished at: 2015-10-17T14:55:30-04:00
[INFO] Final Memory: 10M/305M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project cip-guuud-lib: Could not resolve dependencies for project batc-cip:cip-guuud-lib:pom:0.0.1-SNAPSHOT: Failed to collect dependencies at org.fnlp:fnlp-core:jar:2.0: Failed to read artifact descriptor for org.fnlp:fnlp-core:jar:2.0: Could not find artifact org.fnlp:fnlp-all:pom:2.0-SNAPSHOT in sonatype-snapshots (https://oss.sonatype.org/content/repositories/snapshots) -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project cip-guuud-lib: Could not resolve dependencies for project batc-cip:cip-guuud-lib:pom:0.0.1-SNAPSHOT: Failed to collect dependencies at org.fnlp:fnlp-core:jar:2.0
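The rest is omitted. This project uses fnlp, stanford nlp-core, and open nlp; fnlp is mainly used for analyzing Chinese. The problem is that it cannot be downloaded correctly from the repository hosted on Maven. How can this be resolved? Many thanks.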
org.fnlp.wsytry.MultiCorpusClusterTagger
reports that the following are missing:
org.fnlp.corpus.transform.tree.RelationalTree;
org.fnlp.ml.classifier.struct.inf.MultiCorpusViterbi;
org.fnlp.ml.classifier.struct.update.MultiCorpusViterbiPAUpdate;
How do I call the keyword extraction interface? The examples do not seem to show how to invoke keyword extraction.
In the quick start guide:
java -Xmx1024m -Dfile.encoding=UTF-8 -classpath "fnlp-core/target/fnlp-core-2.1-SNAPSHOT.jar:libs/trove4j-3.0.3.jar:libs/commons-cli-1.2.jar" org.fnlp.nlp.cn.tag.POSTagger -s models/seg.m models/pos.m "周杰伦出生于台 湾,生日为79年1月18日,他曾经的绯闻女友是蔡依林。"
Output:
周杰伦/人名 出生/动词 于/介词 **/地名 ,/动词 生日/名词 为/介词 79年/时间短语 1月/时间短语 18日/时间短语 ,/标点 他/人称代词 曾经/形容词 的/结构助词 绯闻/名词 女友/名词 是/动词 蔡依林/人名 。/标点
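The two commas are tagged inconsistently: the first as 动词 (verb), the second as 标点 (punctuation).
How do I add a dictionary?
Exception when running NLP test
Error reading model file: ../models/pos.m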
java.io.EOFException: Unexpected end of ZLIB input stream
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2310)
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3063)
at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2864)
at java.io.ObjectInputStream.readString(ObjectInputStream.java:1638)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1341)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at gnu.trove.map.custom_hash.TObjectIntCustomHashMap.readExternal(TObjectIntCustomHashMap.java:1139)
at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.fnlp.nlp.cn.tag.AbstractTagger.loadFrom(AbstractTagger.java:205)
at org.fnlp.nlp.cn.tag.AbstractTagger.<init>(AbstractTagger.java:73)
at org.fnlp.nlp.cn.tag.POSTagger.<init>(POSTagger.java:114)
at org.fnlp.demo.nlp.PartsOfSpeechTag.main(PartsOfSpeechTag.java:45)
at org.fnlp.demo.NLPTest.test(NLPTest.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:78)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:212)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
org.fnlp.util.exception.LoadModelException: java.io.EOFException: Unexpected end of ZLIB input stream
at org.fnlp.nlp.cn.tag.AbstractTagger.loadFrom(AbstractTagger.java:208)
at org.fnlp.nlp.cn.tag.AbstractTagger.<init>(AbstractTagger.java:73)
at org.fnlp.nlp.cn.tag.POSTagger.<init>(POSTagger.java:114)
at org.fnlp.demo.nlp.PartsOfSpeechTag.main(PartsOfSpeechTag.java:45)
at org.fnlp.demo.NLPTest.test(NLPTest.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:78)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:212)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.io.EOFException: Unexpected end of ZLIB input stream
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2310)
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3063)
at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2864)
at java.io.ObjectInputStream.readString(ObjectInputStream.java:1638)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1341)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at gnu.trove.map.custom_hash.TObjectIntCustomHashMap.readExternal(TObjectIntCustomHashMap.java:1139)
at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.fnlp.nlp.cn.tag.AbstractTagger.loadFrom(AbstractTagger.java:205)
... 32 more
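The trace shows the model being read through `GZIPInputStream`, and "Unexpected end of ZLIB input stream" usually means the compressed file ends earlier than it should, for example after an interrupted download. A minimal sketch for checking a model file on its own, using only the JDK; `GzipCheck` is a hypothetical helper, and the default path is the one from the error message above:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

class GzipCheck {
    // Decompresses the whole stream and returns the number of bytes produced.
    // A truncated or corrupt gzip stream makes this throw an IOException.
    static long countDecompressedBytes(InputStream raw) throws IOException {
        try (InputStream in = new GZIPInputStream(new BufferedInputStream(raw))) {
            byte[] buf = new byte[8192];
            long total = 0;
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) {
        String file = args.length > 0 ? args[0] : "../models/pos.m";
        try (InputStream raw = Files.newInputStream(Paths.get(file))) {
            System.out.println(file + ": OK, " + countDecompressedBytes(raw) + " bytes decompressed");
        } catch (IOException e) {
            System.out.println(file + ": unreadable or truncated (" + e + ")");
        }
    }
}
```

If the check fails, downloading the model file again and comparing its size with the published one is the natural next step.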
An error occurs when running the following code:
CNFactory factory = CNFactory.getInstance("models");
HashMap<String, String> result = factory.ner("詹姆斯·默多克和丽贝卡·布鲁克斯 鲁珀特·默多克旗下的美国小报《纽约邮报》的职员被公司律师告知,保存任何也许与电话窃听及贿赂有关的文件。");
// Print the tagging result
System.out.println(result);
The following error is reported:
Exception in thread "main" java.lang.NoClassDefFoundError: gnu/trove/map/hash/TCharCharHashMap
at org.fnlp.nlp.cn.ChineseTrans.ensureST(ChineseTrans.java:54)
at org.fnlp.nlp.cn.ChineseTrans.<init>(ChineseTrans.java:48)
at org.fnlp.nlp.cn.CNFactory.<clinit>(CNFactory.java:54)
at Test.main(Test.java:9)
Caused by: java.lang.ClassNotFoundException: gnu.trove.map.hash.TCharCharHashMap
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 4 more
I downloaded trove from the link you provided, but it does not contain the class TCharCharHashMap.
How can I solve this problem?
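A quick way to narrow this down is to ask the JVM whether the class is visible at all and, if so, which jar it came from. The sketch below uses only JDK reflection; `ClassLocator` is a hypothetical helper. Note that the tutorial command elsewhere on this page puts `libs/trove4j-3.0.3.jar` on the classpath, so a different Trove release may simply use a different package layout; that is a guess and worth verifying against the jar contents.

```java
import java.security.CodeSource;

class ClassLocator {
    // Returns the location a class was loaded from, or null if it is not on the classpath.
    static String locate(String className) {
        try {
            // 'false' avoids running static initializers; we only want to find the class.
            Class<?> c = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(JDK runtime)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String name = "gnu.trove.map.hash.TCharCharHashMap";
        String where = locate(name);
        System.out.println(where == null ? name + " is NOT on the classpath" : name + " loaded from " + where);
    }
}
```

Running it with the same classpath as the failing program shows whether the Trove jar is missing from the classpath or is present but lacks the class.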
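How is the time model built? If it is rule-based, can users add their own rules?
Have Chinese text clustering and the later planned items all been discontinued?
That way things would be organized!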
It seems that memory usage is quite a problem.
According to my test,
after loading the dep.m model, memory usage increased by over 450MB,
after loading the pos.m model, memory usage increased by another 240MB.
I also noticed that dep.m and pos.m are both less than 50MB.
I'm not very familiar with Java, but I think this kind of memory usage can be problematic.
I used runtime.totalMemory() and runtime.freeMemory() to measure memory usage; I'm not sure whether this makes sense.
I'll keep investigating this memory usage issue, and I look forward to your help.
Thanks!
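On the measurement question: `totalMemory() - freeMemory()` gives the heap currently in use, but that figure includes garbage that has not been collected yet, so readings taken right after loading can overstate what the model retains. A rough sketch of a steadier measurement, requesting a collection before each reading; `MemoryProbe` is a hypothetical helper, `System.gc()` is only a hint to the JVM, and the 50 MB allocation merely stands in for loading a model:

```java
class MemoryProbe {
    // Heap currently in use, in bytes (includes not-yet-collected garbage).
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Requests a collection first so that most garbage is excluded from the reading.
    static long settledUsedBytes() throws InterruptedException {
        System.gc();
        Thread.sleep(100);
        return usedBytes();
    }

    public static void main(String[] args) throws InterruptedException {
        long before = settledUsedBytes();
        byte[][] blocks = new byte[50][1024 * 1024]; // stand-in for loading a model
        long after = settledUsedBytes();
        System.out.println("Retained roughly " + (after - before) / (1024 * 1024) + " MB");
        System.out.println("Blocks still referenced: " + blocks.length);
    }
}
```

A model that is compressed on disk and expanded into Java objects can reasonably occupy several times its file size in memory, so the gap between file size and heap usage is not by itself proof of a leak.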
As the title says.
My profiling shows that String.getBytes() is quite slow, and MurmurHash calls getBytes() for every single hash because it is byte-oriented.
If the primary purpose is speeding up hashing, why not use String.hashCode() instead?
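To make the trade-off concrete, here is a small sketch of a hash that reads the string's chars directly and therefore never calls `getBytes()`. It is a 32-bit FNV-1a over the two bytes of each UTF-16 char, swapped in purely for illustration; it is not the MurmurHash used by the project, and `StringHashing` is a hypothetical class. `String.hashCode()` is shown alongside it for comparison.

```java
class StringHashing {
    private static final int FNV_OFFSET_BASIS = 0x811C9DC5;
    private static final int FNV_PRIME = 16777619;

    // 32-bit FNV-1a over the low and high byte of every char; no byte array is allocated.
    static int fnv1aChars(CharSequence s) {
        int h = FNV_OFFSET_BASIS;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            h ^= (c & 0xFF);
            h *= FNV_PRIME;
            h ^= (c >>> 8);
            h *= FNV_PRIME;
        }
        return h;
    }

    public static void main(String[] args) {
        String word = "自然语言";
        System.out.println("String.hashCode(): " + word.hashCode());
        System.out.println("fnv1aChars():      " + fnv1aChars(word));
    }
}
```

Whether either option is acceptable depends on how the hash values are used: if trained model files store feature indices derived from the current hash, changing the function would invalidate those models.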
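As the title says: I would like to try text clustering with the FNLP API. How should I go about it?
Is regex-based word segmentation already available? If so, please tell me how to call it; if not, could it be added?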
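Following the QuickTutorial, I set everything up step by step in Eclipse. Running the demo throws java.lang.NullPointerException. Looking at the source code, it seems to be thrown at labels = factory.DefaultLabelAlphabet(); which is in the constructor public AbstractTagger(String file) throws LoadModelException of the class org.fnlp.nlp.cn.tag.AbstractTagger.
Any guidance would be appreciated.
How can I download the seg.m and pos.m files? Without them, word segmentation and part-of-speech tagging cannot be run.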