liuhuanyong / TextGrapher
Text Content Grapher based on key-information extraction with NLP methods. Given an input document, it extracts the key information, structures it, and organizes the result as a graph, giving a graph-based view of the article's semantic content.
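As the title says; I'm a beginner and would like to discuss this.
The other functions in text_grapher.py run without problems, for example
handler.rel_entity_keyword()
Only handler.main() fails.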
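Why is everything in graph_show garbled after I load the model and run the analysis?
The model is loaded as follows:
LTP_DIR = r"C:\Users\ling\Downloads\semantics\ltp_data_v3.4.0\ltp_data_v3.4.0"
The output is garbled text.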
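Hello, why is the HTML file blank when I open it after running the code?
Is it possible to connect to this graph, run semantic computation for question answering, and then search for an answer?
Do you have the LTP project on your own computer? The code imports an LTP path, but that path does not exist anywhere in the project, so I wonder whether it works for you only because you already have LTP locally.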
Traceback (most recent call last):
  File "D:\github\text\TextGrapher-master\text_grapher.py", line 225, in <module>
    handler = CrimeMining()
  File "D:\github\text\TextGrapher-master\text_grapher.py", line 17, in __init__
    self.parser = LtpParser()
  File "D:\github\text\TextGrapher-master\sentence_parser.py", line 12, in __init__
    self.segmentor = Segmentor()
TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
    1. pyltp.Segmentor(model_path: str, lexicon_path: str = None, force_lexicon_path: str = None)
Invoked with:
What is the cause of this?
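The error message itself lists the constructor signature the installed pyltp expects: Segmentor(model_path, ...). A likely cause, then, is that sentence_parser.py was written for an older pyltp in which components were created without arguments and loaded afterwards with load(). Below is a minimal sketch of a loader that handles both styles. make_component and cws_model_path are hypothetical helpers, and the cws.model file name assumes the standard ltp_data_v3.4.0 layout.

```python
import os


def make_component(cls, model_path):
    """Create a pyltp-style component under either constructor convention.

    Hypothetical helper; not part of pyltp or TextGrapher.
    """
    try:
        # Newer style, as listed in the error message: cls(model_path)
        return cls(model_path)
    except TypeError:
        # Older style (assumed): no-argument constructor, then load()
        component = cls()
        component.load(model_path)
        return component


def cws_model_path(ltp_dir):
    # Assumes the segmentation model is named cws.model, as in ltp_data_v3.4.0
    return os.path.join(ltp_dir, "cws.model")
```

With pyltp installed, make_component(Segmentor, cws_model_path(LTP_DIR)) could stand in for the bare Segmentor() call in sentence_parser.py; the other components would need the same treatment with their own model files.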
The graph produced by the run shows garbled text. What is the cause?
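One possible cause of garbled Chinese text in the generated page is an encoding mismatch. This is an assumption, since the issue does not show how the file is written. On Windows, open() without an encoding argument typically uses the locale encoding rather than UTF-8. A minimal Python 3 sketch that writes the page as UTF-8 and declares the charset follows; write_html is a hypothetical helper.

```python
def write_html(path, body):
    """Write an HTML page as UTF-8 and declare the charset in the markup."""
    html = (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"></head>\n'
        "<body>" + body + "</body></html>\n"
    )
    # An explicit encoding avoids depending on the platform default
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
```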
TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
    1. pyltp.Segmentor(model_path: str, lexicon_path: str = None, force_lexicon_path: str = None)
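Hello, and thank you very much for this work. It took me a long time to find it.
I noticed that the corpora you use are mostly about some specific event, and that the graph is built from organizations, person names, place names, keywords, and high-frequency words.
I want to extract the knowledge contained in explanatory text and display it as a graph, but what I get is a double ring made up of "keywords" and "high-frequency words".
Is this the expected result, and is there a better way to extract the knowledge?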
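In text_grapher.py, the function that builds relations between entities and keywords returns events_entity_keyword as an empty list [].
Looking at the function's code, the problem seems to be this line: keyword = [i[0] for i in keyword].
For the keyword list
['谢雕', '凯旋', '同学', '家人', '北京', '南都', '记者', '高中', '警方', '告诉']
the line keyword = [i[0] for i in keyword] gives:
['谢', '凯', '同', '家', '北', '南', '记', '高', '警', '告']
As you can see, the result is the first character of each keyword, so events ends up as an empty list. To get a correct result, the line keyword = [i[0] for i in keyword] should be removed.
Is my understanding correct? I'd like to know whether this is a bug, or whether there is something I haven't considered.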
'''Build relations between entities and keywords, based on the article's keywords'''
def rel_entity_keyword(ners, keyword, subsent):
    events = []
    rels = []
    sents = []
    ners = [i.split('/')[0] for i in set(ners)]
    keyword = [i[0] for i in keyword]
    for sent in subsent:
        tmp = []
        for wd in sent:
            if wd in ners + keyword:
                tmp.append(wd)
        if len(tmp) > 1:
            sents.append(tmp)
    for ner in ners:
        for sent in sents:
            if ner in sent:
                tmp = ['->'.join([ner, wd]) for wd in sent if wd in keyword and wd != ner and len(wd) > 1]
                if tmp:
                    rels += tmp
    for e in set(rels):
        events.append([e.split('->')[0], e.split('->')[1]])
    return events

# Extract relations between keywords and entities
events_entity_keyword = rel_entity_keyword(ners, keywords, subsents_seg)
events += events_entity_keyword
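The behavior described in this issue is easy to reproduce. The sketch below shows that i[0] yields whole words only when each item is a (word, weight) pair, and yields first characters when the items are already strings. normalize_keywords is a hypothetical helper that accepts both shapes; whether the caller passes pairs or plain strings is an assumption to check against text_grapher.py. It assumes Python 3 strings, and the weights are made up.

```python
def normalize_keywords(keyword):
    """Return a list of keyword strings from strings or (word, weight) pairs."""
    words = []
    for item in keyword:
        if isinstance(item, str):
            words.append(item)      # already a word: keep it whole
        else:
            words.append(item[0])   # (word, weight) pair: take the word
    return words


plain = ['谢雕', '凯旋', '同学']
pairs = [('谢雕', 0.9), ('凯旋', 0.8), ('同学', 0.7)]  # illustrative weights

print([i[0] for i in plain])      # first characters only
print([i[0] for i in pairs])      # whole words
print(normalize_keywords(plain))  # whole words
```

Replacing the list comprehension with a call such as normalize_keywords(keyword) would keep the function working for either input shape, rather than relying on the caller's format.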
Does this work for any text, or is it limited to the content included in the code?
(py2.7) ➜ TextGrapher git:(master) ✗ python text_grapher.py
Traceback (most recent call last):
  File "text_grapher.py", line 407, in <module>
    handler.main(content9)
  File "text_grapher.py", line 169, in main
    words, postags = self.process_sent(sent)
  File "text_grapher.py", line 47, in process_sent
    words, postags = self.parser.basic_process(sent)
  File "/Users/wanghaisheng/workspace/TextGrapher/sentence_parser.py", line 161, in basic_process
    name_entity_dist = self.format_entity(words, netags, postags)
  File "/Users/wanghaisheng/workspace/TextGrapher/sentence_parser.py", line 83, in format_entity
    name_entity_dist['nhs'] = self.modify_entity(name_entity_list, words, postags, 'nh')
  File "/Users/wanghaisheng/workspace/TextGrapher/sentence_parser.py", line 100, in modify_entity
    consist = [words[int(start_index)] + '/' + postags[int(start_index)]]
ValueError: invalid literal for int() with base 10: ''
(py2.7) ➜ TextGrapher git:(master) ✗
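The last frame shows int(start_index) receiving an empty string, so an entity span reached modify_entity without a start index. Why the index is empty is not visible from the traceback. As a stopgap, the conversion can be guarded so that such entities are skipped instead of crashing. This is a minimal sketch; parse_index and first_token are hypothetical helpers, not code from sentence_parser.py.

```python
def parse_index(text):
    """Return text as an int, or None when it is empty or not a number."""
    try:
        return int(text)
    except ValueError:
        return None


def first_token(words, postags, start_index):
    """Build the 'word/postag' string for an entity start, or None to skip."""
    idx = parse_index(start_index)
    if idx is None or not 0 <= idx < len(words):
        return None
    return words[idx] + '/' + postags[idx]
```

A caller would then drop any entity for which first_token returns None.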
root@ai001:/home/ubuntu/zz/TextGrapher-master# python text_grapher.py
Segmentor: Model not loaded!
Postagger: Model not loaded!
NER: Model not loaded!
What is the cause of this?
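"Model not loaded!" most likely means the model files could not be opened at the configured path. A quick check before constructing the parser is to verify that the expected files exist under LTP_DIR. The sketch below assumes the standard ltp_data_v3.4.0 file names (cws.model, pos.model, ner.model, parser.model); missing_models is a hypothetical helper.

```python
import os

# File names assumed from the ltp_data_v3.4.0 distribution
MODEL_FILES = ["cws.model", "pos.model", "ner.model", "parser.model"]


def missing_models(ltp_dir, names=MODEL_FILES):
    """Return the model files that are absent from ltp_dir."""
    return [n for n in names if not os.path.isfile(os.path.join(ltp_dir, n))]
```

If missing_models(LTP_DIR) returns a non-empty list, the path in the code probably does not point at the unpacked model directory.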