graduation-project

My undergraduate graduation project: a WeChat robot for movies

Part 1: Data Acquisition

I. Crawling Douban movie reviews

1. Simulated login
2. Crawl the review data
3. Store the data in a sqlite3 database
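
The storage step above can be sketched as follows. This is a minimal illustration, not the project's actual code: the table name `reviews`, its columns, and the helper `save_reviews` are all assumptions made for the example, and an in-memory database stands in for the real file.

```python
import sqlite3


def save_reviews(conn, reviews):
    """Insert (movie, author, content) rows, skipping exact duplicates.

    The schema is hypothetical; the real project may store other fields.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS reviews (
               movie   TEXT NOT NULL,
               author  TEXT NOT NULL,
               content TEXT NOT NULL,
               UNIQUE (movie, author, content)
           )"""
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO reviews (movie, author, content) "
            "VALUES (?, ?, ?)",
            reviews,
        )
    # Return the number of rows now stored.
    return conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]


# Illustrative data only; the second row repeats the first and is ignored.
conn = sqlite3.connect(":memory:")
sample = [
    ("Movie A", "user1", "Great"),
    ("Movie A", "user1", "Great"),
    ("Movie B", "user2", "Okay"),
]
total = save_reviews(conn, sample)
```

The `UNIQUE` constraint together with `INSERT OR IGNORE` keeps repeated crawls from storing the same review twice.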

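II. Crawling Weibo movie reviews

1. Simulated login to the mobile site m.weibo.cn
2. Fetch the review data
3. "Fishing" feature (钓鱼功能)
4. Store the data in a sqlite3 database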

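III. Scraping Maoyan movie reviews

1. Fetch the review data from the mobile site m.maoyan.com
2. Store the data in a MySQL database and back it up as a .sql file
3. Store the data in a sqlite3 database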

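IV. Current data

1. Date: 2017.4.30
2. Counts: 24,058 from Weibo, 21,012 from Douban, 11,854 from Maoyan
3. Date: 2017.5.8. 80,000 Weibo comments, used as the retrieval data source
4. Evaluation data source: 2017.5.24. 2,700 newly crawled Weibo posts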

Part 2: Building the Retrieval System

Lucene retrieval tool

1. Build the index
2. Retrieve results
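
To make the two steps concrete, here is a toy inverted index in plain Python. It is not Lucene and does not use Lucene's API; it only illustrates the idea of indexing first and retrieving afterwards. The function names are hypothetical, and whitespace tokenization is an assumption (Chinese text would need a word segmenter before indexing).

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Illustrative documents only.
docs = [
    "the film was great",
    "great acting weak plot",
    "the plot was thin",
]
index = build_index(docs)
hits = search(index, "plot was")
```

Requiring every query term to match corresponds to a boolean AND query; a ranked model would score the candidates instead of only filtering them.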

Part 3: WeChat Robot Design

Using wxpy

Part 4: Evaluating the Retrieval Models

Retrieval models

1. BM25
2. Boolean model
3. Dirichlet language model
4. Jelinek-Mercer language model
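
As an illustration of the first model in the list, the sketch below scores documents with the common BM25 formula. It is a standalone approximation rather than the project's Lucene configuration: the function name `bm25_scores`, the whitespace tokenization, and the parameter values `k1=1.2` and `b=0.75` are assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document for a whitespace-tokenized query."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in tokenized:
        doc_freq.update(set(doc))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = term_freq[term]
            # Length normalization penalizes long documents.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Illustrative documents only.
docs = ["great film great cast", "weak plot", "great plot twist"]
scores = bm25_scores("great", docs)
```

Documents without any query term score zero, and repeated occurrences of a term raise the score with diminishing returns controlled by `k1`.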

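Evaluation data sources

1. Use the newly crawled data as evaluation data
2. Deduplicate the original data, divide it into 5 groups, and take one group as evaluation data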
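
The second option can be sketched as below. The source does not say how the 5 groups are formed, so the seeded random split here is an assumption, as are the function name `dedupe_and_split` and the sample data.

```python
import random


def dedupe_and_split(items, n_groups=5, seed=0):
    """Drop exact duplicates, shuffle, and deal the items into n_groups."""
    unique = list(dict.fromkeys(items))  # order-preserving deduplication
    random.Random(seed).shuffle(unique)  # seeded for reproducibility
    return [unique[i::n_groups] for i in range(n_groups)]


# Illustrative data: 30 items, only 12 of them distinct.
items = [f"review {i % 12}" for i in range(30)]
groups = dedupe_and_split(items)
evaluation_set = groups[0]                             # held out
retrieval_set = [x for g in groups[1:] for x in g]     # remaining four groups
```

Deduplicating before the split matters: otherwise the same text could land in both the retrieval data and the evaluation data and inflate the scores.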

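Evaluation method

1. BLEU score
2. Retrieve Weibo posts and compute the BLEU score of the associated comments
3. Retrieve comments and compute the BLEU score of the associated Weibo posts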
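
For reference, a simplified single-reference BLEU score can be computed as in the sketch below. This is an assumption about the metric's form, not the project's evaluation script: it uses clipped n-gram precision with uniform weights and the standard brevity penalty, applies no smoothing, and expects pre-tokenized input.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing: any empty precision gives zero
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty punishes candidates shorter than the reference.
    if len(candidate) > len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(reference) / len(candidate))
    return penalty * math.exp(sum(log_precisions) / max_n)


# Illustrative sentences only, scored with unigrams and bigrams.
candidate = "the cat sat on the mat".split()
reference = "the cat sat on a mat".split()
score = bleu(candidate, reference, max_n=2)
```

Short texts such as comments often share no 4-grams with the reference, so an unsmoothed score frequently drops to zero; lowering `max_n` or adding smoothing is a common workaround.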

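1. Proposal report
2. Proposal defense slides (ppt)
3. Thesis outline
4. Midterm report
5. Midterm defense slides (ppt)
6. Process management
7. Weekly report content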
