eventextractbynovel's Introduction

[NLP] SVM-based event type recognition for web novels

For project details, see 项目报告书.pdf (the project report) in the repository directory.

Dependencies

  • sklearn==0.19.1
  • numpy==1.14.0
  • scipy==1.0.0

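File layout

common.py - holds shared variables and classes
generate_feature.py - builds the manually labeled feature vectors and saves them locally
train_model.py - trains an SVM on the saved feature vectors to produce the classifier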

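Approach

  1. Read the text sentence by sentence and find the sentences that contain a topic word (the topic word must occur as a noun, and at least one of the two words after it must be a verb).
  2. Compute the sentence's term frequencies as its feature vector; discard the vector if every entry is 0.
  3. Manually label the feature vector, i.e. record which event category the sentence belongs to.
  4. Save all labeled feature vectors and their labels locally.
  5. Train an SVM on the saved feature vectors to produce a classifier.
  6. Continue reading sentences, find those that contain a topic word, and classify each one with the classifier.
  7. Output the predictions.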
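
The filter in step 1 can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes each sentence is already tokenized and POS-tagged into (word, tag) pairs, that noun tags start with "n" and verb tags start with "v", and it uses 少年 as a hypothetical topic word. The function name `is_candidate` is also made up for this example.

```python
def is_candidate(tagged_sentence, topic_word):
    """Return True if topic_word occurs as a noun and at least one of the
    two tokens immediately after it is a verb.

    tagged_sentence is a list of (word, pos_tag) pairs. The tag scheme
    ("n..." = noun, "v..." = verb) is an assumption of this sketch.
    """
    for i, (word, pos) in enumerate(tagged_sentence):
        if word == topic_word and pos.startswith("n"):
            # Look only at the next two tokens, as the rule requires.
            following = tagged_sentence[i + 1:i + 3]
            if any(tag.startswith("v") for _, tag in following):
                return True
    return False


# Hypothetical, hand-tagged sentences for illustration only.
kept = [("少年", "n"), ("咧嘴", "v"), ("笑道", "v")]
dropped = [("少年", "n"), ("的", "u"), ("腰部", "n"), ("发酸", "v")]

print(is_candidate(kept, "少年"))
print(is_candidate(dropped, "少年"))
```

The second sentence is rejected because the verb 发酸 is the third token after the topic word, outside the two-token window.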

Event categories

1 Cultivation (修炼)
2 Dialogue (对话)
3 Inner thoughts (心理活动)
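
Training and prediction with these three labels (steps 5 to 7 of the approach) might look like the sketch below. It uses `sklearn.svm.SVC`, which is available in the pinned sklearn version; the linear kernel, the six-dimensional toy vectors, and the names `EVENT_TYPES` and `predict_event` are assumptions made for illustration and are not taken from train_model.py.

```python
from sklearn.svm import SVC

# Label ids follow the category list above.
EVENT_TYPES = {1: "cultivation", 2: "dialogue", 3: "inner thoughts"}

# Made-up term-frequency vectors: columns 0-1 stand for cultivation words,
# columns 2-3 for dialogue words, columns 4-5 for inner-thought words.
X_train = [
    [1, 1, 0, 0, 0, 0],
    [2, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 2, 0],
]
y_train = [1, 1, 2, 2, 3, 3]

# Kernel choice is an assumption; the source only says "SVM".
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)


def predict_event(vector):
    """Classify one feature vector and return (label id, label name)."""
    label = int(clf.predict([vector])[0])
    return label, EVENT_TYPES[label]


print(predict_event([0, 1, 0, 0, 0, 0]))
```

In a real run the training vectors would be the saved, manually labeled ones from step 4 rather than hard-coded lists.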

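Feature vector tags

腰部 (waist)
发酸 (aching)
麻麻 (numb)
眯起 (squint)
酸痛 (sore)
歇息 (rest)
打熬 (train hard)

咧嘴 (grin)
说道 (said)
笑道 (said with a laugh)
询问 (ask)

感到 (feel)
高兴 (happy)
心中 (in one's heart)
忐忑 (uneasy)
心底 (deep down)
坚定 (resolute)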
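
Assuming the tag words above serve as the vocabulary for the term-frequency feature vector (step 2 of the approach), a vector could be computed as in this sketch. The names `VOCABULARY`, `term_frequency_vector` and `build_features` are hypothetical, and the input sentences are assumed to be tokenized already.

```python
# Vocabulary in the order the tags are listed above.
VOCABULARY = [
    "腰部", "发酸", "麻麻", "眯起", "酸痛", "歇息", "打熬",
    "咧嘴", "说道", "笑道", "询问",
    "感到", "高兴", "心中", "忐忑", "心底", "坚定",
]


def term_frequency_vector(tokens, vocabulary=VOCABULARY):
    """Count how often each vocabulary word occurs in the token list."""
    return [tokens.count(word) for word in vocabulary]


def build_features(tokenized_sentences):
    """Vectorize each sentence and drop vectors whose entries are all 0."""
    vectors = []
    for tokens in tokenized_sentences:
        vector = term_frequency_vector(tokens)
        if any(vector):  # an all-zero vector carries no signal
            vectors.append(vector)
    return vectors


# Hypothetical tokenized sentences; the second contains no tag word.
sentences = [["少年", "咧嘴", "笑道"], ["今天", "天气"]]
print(build_features(sentences))
```

Only the first sentence survives, since the second yields an all-zero vector and is discarded.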
