Comments (10)
Yes, you can. The sentiment.train call shown in the README is exactly for training on your own documents.
from snownlp.
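So the documents only need to be labeled, right? And it will use the built-in tools to segment the text and extract keywords, then train with Bayes? By the way, you reply really quickly!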
Yes, just follow the general format of those text files. I get email notifications, so I see new comments quickly.
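The workflow discussed above (labeled texts, word segmentation, then Bayes training) can be sketched roughly as follows. This is not snownlp's actual implementation, only a minimal multinomial naive Bayes with add-one smoothing. The names `TinyNB` and `segment` are hypothetical, and `segment` simply splits on whitespace as a stand-in for a real Chinese word segmenter.

```python
import math
from collections import Counter


def segment(text):
    # Hypothetical stand-in for a real word segmenter: split on whitespace.
    return text.split()


class TinyNB:
    """Minimal multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents

    def train(self, labeled_docs):
        for label, text in labeled_docs:
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(segment(text))

    def classify(self, text):
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, counts in self.word_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(vocab)
            for word in segment(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = TinyNB()
model.train([
    ("pos", "很 好 喜欢"),
    ("pos", "质量 很 好"),
    ("neg", "很 差 失望"),
    ("neg", "质量 太 差"),
])
```

With this toy data, `model.classify("很 好")` picks the positive label and `model.classify("太 差")` the negative one. A real setup would need far more training text and a proper segmenter.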
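I looked through the code. Setting aside more complex models and segmentation accuracy, the main difficulty of the analysis is feature extraction, right? Do you have any tricks from experience about what tends to work better?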
Take a look at http://52opencourse.com/222/%E6%96%AF%E5%9D%A6%E7%A6%8F%E5%A4%A7%E5%AD%A6%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86%E7%AC%AC%E5%85%AD%E8%AF%BE-%E6%96%87%E6%9C%AC%E5%88%86%E7%B1%BB%EF%BC%88text-classification%EF%BC%89 and http://www.cs.cornell.edu/home/llee/papers/sentiment.pdf, which cover some basic features. In this area people use n-grams, recognition of nouns and verbs, recognition of sentiment words, and special handling of negated sentences. Some also use models such as RNNs.
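Two of the features mentioned above, n-grams and negation handling, can be sketched like this. It is only an illustration of one commonly used heuristic (prefixing every token after a negation word with `NOT_` until the next punctuation mark), not something taken from snownlp. The `NEGATIONS` and `PUNCT` sets are assumed and far from complete, and the input is assumed to be already segmented.

```python
NEGATIONS = {"不", "没", "没有", "别"}               # assumed, incomplete list
PUNCT = {",", "。", "!", "?", ",", ".", "!", "?"}  # assumed clause boundaries


def mark_negation(tokens):
    """Prefix tokens that follow a negation word with NOT_ until the next punctuation."""
    out, negated = [], False
    for tok in tokens:
        if tok in PUNCT:
            negated = False          # a clause boundary ends the negation scope
            out.append(tok)
        elif tok in NEGATIONS:
            negated = True
            out.append(tok)
        else:
            out.append("NOT_" + tok if negated else tok)
    return out


def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(tokens):
    """Unigrams (strings) plus bigrams (tuples) over negation-marked tokens."""
    marked = [t for t in mark_negation(tokens) if t not in PUNCT]
    return marked + ngrams(marked, 2)


features = extract_features(["我", "不", "喜欢", "这个", ",", "太", "贵"])
```

Here "喜欢" becomes the feature `NOT_喜欢`, while "贵" after the comma is left unmarked, so a classifier can treat the negated and plain forms as different features.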
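By the way, after switching to n-grams, do you usually compute chi-square (CHI-2) or mutual information (MI) and then take the top K from each document to build the feature vector? With a large number of texts the dimensionality explodes. How do you usually deal with that?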
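For reference, the chi-square scoring mentioned in the question can be sketched as below. It uses the standard 2x2 contingency-table statistic over document frequencies and ranks terms once over the whole corpus rather than per document, which is an assumption about the setup. The function names `chi_square_scores` and `top_k_terms` and the toy data are hypothetical.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score each term by the chi-square statistic against the `target` label."""
    n = len(docs)
    n_target = sum(1 for y in labels if y == target)
    in_target, in_other = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        for term in set(tokens):            # document frequency, not term frequency
            (in_target if y == target else in_other)[term] += 1
    scores = {}
    for term in set(in_target) | set(in_other):
        a = in_target[term]                 # target docs containing the term
        b = in_other[term]                  # other docs containing the term
        c = n_target - a                    # target docs without the term
        d = (n - n_target) - b              # other docs without the term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties by term for determinism."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


docs = [["好", "喜欢"], ["好", "快"], ["差", "慢"], ["差", "快"]]
labels = ["pos", "pos", "neg", "neg"]
scores = chi_square_scores(docs, labels, "pos")
selected = top_k_terms(scores, 2)
```

In this toy corpus "好" and "差" separate the two classes perfectly and get the highest score, while "快" appears equally in both and scores zero.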
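Only the total dimensionality grows; the number of dimensions in any single sample stays manageable. If you think the total dimensionality is also too large, consider the hashing trick.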
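A minimal sketch of the hashing trick mentioned above: each feature string is hashed into one of a fixed number of buckets, so the vector width no longer depends on vocabulary size. The function name `hashed_vector` and the bucket count are hypothetical. `hashlib.md5` is used instead of the built-in `hash()` because Python randomizes string hashes per process. Signed hashing, which some implementations use to reduce collision bias, is left out.

```python
import hashlib


def hashed_vector(features, n_buckets=16):
    """Map string features to a sparse count vector with a fixed number of buckets."""
    vec = {}
    for feat in features:
        digest = hashlib.md5(feat.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:8], "big") % n_buckets  # stable bucket index
        vec[idx] = vec.get(idx, 0) + 1                       # colliding features share a bucket
    return vec


v = hashed_vector(["很", "好", "很 好", "好"], n_buckets=16)
```

The result is a dict from bucket index to count. Only non-zero buckets are stored, so each sample stays sparse even if `n_buckets` is set to something large such as 2**20.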
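I couldn't find where the stop-word feature is invoked, although I did see the stop-word dictionary. Will it be added in the future?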
Stop words are already supported. You can use them like this: https://github.com/isnowfy/snownlp/blob/master/snownlp/__init__.py#L61
Got it! I'll write my own sentiment-analysis demo and come back with questions once it's done.