kolahsh / word_classification

This project forked from liuyaku/word_classification

Uses a CNN to classify words

Python 100.00%

word_classification's Introduction

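Domain Word Classification

This code distinguishes domain words from non-domain words. The current pre-trained model only applies to the human resources domain.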

Requirements

python 3.6
tensorflow 1.4.0
jieba 0.39

Recommended Setup

Linux with the TensorFlow GPU edition and cuDNN

Usage

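# Train the model
python train.py
# Test the model
python test.py --test_data_path ./data/test.txt --threshold 0.999
# test_data_path is the path to the file to test.
# threshold is the cutoff, with a maximum of 1. A higher value selects higher-quality words but fewer of them; tune it to find the best value.
# Test results are written to the result folder: files ending in _result contain domain words, and files ending in _f_result contain non-domain words.
# More parameters can be configured in config.py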
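
The effect of the threshold can be illustrated with a small sketch. It is not taken from test.py: the function name `split_by_threshold`, the `>=` comparison, and the example scores are all assumptions made for illustration. It only shows why a higher threshold yields fewer, more confident domain words.

```python
from typing import Dict, List, Tuple


def split_by_threshold(
    scores: Dict[str, float], threshold: float
) -> Tuple[List[str], List[str]]:
    """Split words into (domain, non_domain) lists.

    Hypothetical helper: a word counts as a domain word when its score is
    at least `threshold`. Whether the real test.py uses >= or > is unknown.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    domain = [word for word, p in scores.items() if p >= threshold]
    non_domain = [word for word, p in scores.items() if p < threshold]
    return domain, non_domain


# Made-up scores for illustration only; not output from the real model.
example_scores = {
    "招聘": 0.9995,      # recruitment
    "绩效考核": 0.9991,  # performance review
    "薪酬": 0.97,        # compensation
    "天气": 0.02,        # weather
}

# A strict threshold keeps only the most confident words.
strict_domain, strict_rest = split_by_threshold(example_scores, 0.999)
# A looser threshold admits more words at the cost of precision.
loose_domain, loose_rest = split_by_threshold(example_scores, 0.9)

print(strict_domain)
print(loose_domain)
```

In this sketch, lowering the threshold from 0.999 to 0.9 moves one more word into the domain list, which mirrors the trade-off between quality and quantity described above.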
