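murray-z / text_analysis_tools
A Chinese text analysis toolkit (text classification, text clustering, text similarity, keyword extraction, key phrase extraction, sentiment analysis, text correction, text summarization, topic keywords, synonyms and near-synonyms, event triple extraction)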
License: Apache License 2.0
Shouldn't a triple be (noun, relation, attribute)?
weights = self.get_text_tfidf_matrix(corpus)
self.vectorizer.fit_transform(corpus)
File "D:\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py", line 1225, in fit_transform
max_features)
File "D:\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py", line 1095, in _limit_features
return X[:, kept_indices], removed_terms
File "D:\anaconda3\lib\site-packages\scipy\sparse\_index.py", line 35, in __getitem__
row, col = self._validate_indices(key)
File "D:\anaconda3\lib\site-packages\scipy\sparse\_index.py", line 148, in _validate_indices
col = self._asindices(col, N)
File "D:\anaconda3\lib\site-packages\scipy\sparse\_index.py", line 169, in _asindices
max_indx = x.max()
return umr_maximum(a, axis, None, out, keepdims, initial, where)
TypeError: int() argument must be a string, a bytes-like object or a number, not '_NoValueType'
Is the text summarization based on TextRank?
Lines 92 and 93 should be changed to:
92 word_pinyin = self.p.get_pinyin(word, splitter=' ')
93 candidate_pinyin = self.candidates(word_pinyin.split(" "))
Line 62 should be changed to:
62 return (self.known(word) or self.known(self.edits1(word)) or self.known(self.edits2(word)) or word)
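The proposed line follows the familiar fallback pattern of edit-distance spelling correctors: prefer the input if it is known, then known candidates one edit away, then two edits away, and finally the input itself. A minimal sketch of that chain, assuming plain English strings and a toy vocabulary; `VOCAB`, `known`, `edits1`, `edits2`, and `candidates` here are illustrative names and not the repository's code, which works on pinyin:

```python
# Hypothetical toy vocabulary; a real corrector would load word frequencies.
VOCAB = {"spelling", "correct", "word"}
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def known(words):
    """Return the subset of `words` that appears in the vocabulary."""
    return {w for w in words if w in VOCAB}


def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def edits2(word):
    """All strings two edits away."""
    return {e2 for e1 in edits1(word) for e2 in edits1(e1)}


def candidates(word):
    """First non-empty set wins; an empty set is falsy, so `or` falls through."""
    return (known([word]) or known(edits1(word))
            or known(edits2(word)) or {word})


print(candidates("speling"))
print(candidates("xyzzy"))
```

The chain relies on empty collections being falsy, so the last operand decides what callers receive when nothing in the vocabulary matches. The sketch wraps it as `{word}` to keep the return type uniform; the line proposed above returns `word` unwrapped instead.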
I ran into an error:
AttributeError: 'LTP' object has no attribute 'sent_split'