youth_recommendation's Introduction

Youth Article Recommendation and Classification System

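0. Approach:

(1) At the start, the user selects the topics they are interested in.
(2) Based on the selected topics, each article in the article library is pre-screened to decide whether it belongs to the set of topics of interest.
(3) A model is built that produces a topic probability distribution for each article. The class with the highest probability is taken as the article's class, and the article is recommended if that class is in the user's topic set.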
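The decision rule in step (3) can be sketched in a few lines of Python. This is only an illustration: the function name `recommend`, the dict-based probability format, and the topic names are assumptions made for the example, not part of the project.

```python
def recommend(topic_probs, interest_topics):
    """Return (predicted_topic, should_recommend) for one article.

    topic_probs: dict mapping topic -> probability (hypothetical format).
    interest_topics: set of topics the user selected at the start.
    """
    if not topic_probs:
        raise ValueError("topic_probs must not be empty")
    # The class with the highest probability is taken as the article's class.
    predicted = max(topic_probs, key=topic_probs.get)
    # Recommend only when that class is one of the user's chosen topics.
    return predicted, predicted in interest_topics


# Made-up example values, for illustration only.
interests = {"science", "sports"}
probs = {"science": 0.7, "finance": 0.2, "sports": 0.1}
print(recommend(probs, interests))
```

An article whose most likely class falls outside the interest set is not recommended, even if an interesting topic has the second-highest probability.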

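1. Data processing

Input: 60,000 samples (simulating a user who has defined 6 topics of interest and 6 topics of no interest)
Output: the X and Y vectors, the token dictionary, and the average sample length
(1) The input data covers 12 classes with 5,000 article samples each, 60,000 samples in total.
(2) Each sample is tokenized (jieba segmentation plus the Sogou word database) and stop words are removed (using an imported stop-word list). A token dictionary is built over all samples, every word in every sample is mapped to its ID in that dictionary, and each sample is represented as a vector of IDs, X.
(3) Each sample's label y is the index of the class it belongs to, numbered 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11; together these form the vector Y.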
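The preprocessing steps above can be sketched as follows. This is a minimal sketch rather than the project's actual script: the helper names (`preprocess`, `build_vocab`, `encode`) are hypothetical, ID 0 is reserved here for padding and unknown words as an assumption, and a whitespace tokenizer stands in for jieba so the example runs without extra dependencies. For Chinese text, a jieba tokenizer such as `jieba.lcut` could be passed as `tokenize` instead.

```python
from collections import Counter


def preprocess(texts, tokenize, stopwords):
    """Tokenize each text and drop stop words and empty tokens."""
    return [[w for w in tokenize(t) if w.strip() and w not in stopwords]
            for t in texts]


def build_vocab(token_lists):
    """Map each word to an integer ID, most frequent first.

    ID 0 is reserved for padding / unknown words (an assumption of this sketch).
    """
    counts = Counter(w for tokens in token_lists for w in tokens)
    return {w: i for i, (w, _) in enumerate(counts.most_common(), start=1)}


def encode(token_lists, vocab):
    """Represent each sample as a list of word IDs."""
    return [[vocab.get(w, 0) for w in tokens] for tokens in token_lists]


# Toy data: two samples from two classes (labels are class indices).
texts = ["the cat sat on the mat", "the dog ate the bone"]
Y = [0, 1]
tokens = preprocess(texts, str.split, stopwords={"on"})
vocab = build_vocab(tokens)
X = encode(tokens, vocab)
avg_len = sum(len(x) for x in X) / len(X)
print(X, Y, avg_len)
```

The average sample length is useful later as a guide for choosing the fixed sequence length the model expects.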

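2. Model building

Input: X, Y
Output: a bidirectional LSTM model
(1) Take X and Y as input and build the bidirectional LSTM layer.
(2) Model structure: embedding layer + LSTM layer + dense layer + softmax layer.
(3) The whole model is built with Keras.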
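A model with the structure listed above might look like the following in Keras. This is a sketch under assumptions: the function name `build_model` and all hyperparameters (`embed_dim`, `lstm_units`, `max_len`, the optimizer) are illustrative choices that the original does not specify. The loss `sparse_categorical_crossentropy` is chosen because the labels are integer class indices 0 to 11.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(vocab_size, max_len, num_classes=12, embed_dim=128, lstm_units=64):
    """Embedding -> bidirectional LSTM -> dense -> softmax."""
    model = keras.Sequential([
        keras.Input(shape=(max_len,)),
        # +1 because ID 0 is assumed to be reserved for padding.
        layers.Embedding(input_dim=vocab_size + 1, output_dim=embed_dim),
        layers.Bidirectional(layers.LSTM(lstm_units)),
        layers.Dense(num_classes),
        layers.Activation("softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer labels 0..11
        metrics=["accuracy"],
    )
    return model


# Illustrative sizes only.
model = build_model(vocab_size=5000, max_len=200)
model.summary()
```

Training would then be a call such as `model.fit(X_padded, Y, ...)`, where `X_padded` is X padded or truncated to `max_len`.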

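3. Model prediction

Input: one article and the saved model (a JSON file and an h5 file)
Output: the article's class probability distribution / the article's class
(1) Load the saved model.
(2) Tokenize the input sample, remove stop words, and vectorize it to obtain its X.
(3) Feed X into the loaded model to obtain the sample's topic probability distribution, and take the class with the highest probability as the article's class.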
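The prediction steps can be sketched as below. The helper names (`load_saved_model`, `pad_ids`, `predict_topic`) and the `max_len` parameter are hypothetical, and right-padding with 0 is an assumption: the padding side and length must match whatever was used in training. `keras.models.model_from_json` and `Model.load_weights` are the standard Keras calls for a model saved as a JSON architecture file plus an h5 weights file.

```python
import numpy as np


def load_saved_model(arch_path, weights_path):
    """Rebuild a Keras model from its JSON architecture and h5 weights."""
    from tensorflow import keras  # imported lazily; only needed for loading

    with open(arch_path, encoding="utf-8") as f:
        model = keras.models.model_from_json(f.read())
    model.load_weights(weights_path)
    return model


def pad_ids(ids, max_len):
    """Truncate or right-pad with 0 so the result has exactly max_len IDs."""
    return (list(ids)[:max_len] + [0] * max_len)[:max_len]


def predict_topic(model, ids, max_len):
    """Return (predicted class index, probability distribution) for one article."""
    batch = np.asarray([pad_ids(ids, max_len)])
    probs = model.predict(batch)[0]
    return int(np.argmax(probs)), probs


# Usage, assuming the two saved files are in the working directory and
# `ids` came from the same tokenizer, stop-word list, and dictionary as training:
#   model = load_saved_model("my_model_architecture.json", "my_model_weights.h5")
#   label, probs = predict_topic(model, ids, max_len=200)
```

The returned `probs` is the topic probability distribution, and `label` is the index of its largest entry.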

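4. Running the files

(1) Place the following files in the same directory: 模型预测.py (the model prediction script) / my_model_weights.h5 / my_model_architecture.json
(2) In 模型预测.py, change txtpath and stopwords_path (the path to the stop-word list).
(3) Run the script.
(4) Data: link: https://pan.baidu.com/s/1zZLK13ZkbIAYl-irQhzSLA password: uj5p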

youth_recommendation's People

Contributors

amhu
