
xlnet_embbeding's Introduction

XLNet Embedding

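A Keras wrapper that uses XLNet as an embedding layer. It extracts the output of one or more chosen layers as features and lets you build a custom network (such as Fasttext) on top.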

Usage:

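  1. Download an XLNet model: https://github.com/ymcui/Chinese-PreTrained-XLNet

  2. Download the code and extract the XLNet model into the code directory

  3. Prepare the training data and place it in the data directory

  4. Modify the configuration and the network

  5. Train / test / predict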

Code overview

The demo's default task is text classification. For any other task, modify demo.py yourself.

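Frequently modified functions:

get_config(): model and XLNet configuration

process_data(): text loading and preprocessing

create_model(): adds your own network structure on top of XLNet; the default is fasttext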
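As a rough illustration of what a fasttext-style head on top of XLNet features computes, the sketch below averages the token vectors and applies one dense softmax layer. It is plain Python arithmetic, not the code in demo.py. The names mean_pool and fasttext_head, the toy feature values, and the weights are all hypothetical. In Keras the same idea is usually expressed with GlobalAveragePooling1D followed by a Dense layer with softmax activation.

```python
import math


def mean_pool(token_vectors):
    """Average a sequence of token vectors into one fixed-size vector."""
    length = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / length for i in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fasttext_head(token_vectors, weights, biases):
    """Mean-pool token features, then apply one dense softmax layer.

    weights has shape (num_classes, dim); biases has length num_classes.
    """
    pooled = mean_pool(token_vectors)
    scores = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Toy stand-in for XLNet output: 3 tokens, 2 features each (made-up numbers).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical 2-class dense layer
biases = [0.0, 0.0]
probs = fasttext_head(features, weights, biases)
print(probs)
```

The pooling step is what makes the head independent of sequence length: any number of token vectors collapses to a single vector before classification.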

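Occasionally modified functions:

train(): trains the model; the optimizer, callbacks, and so on can be changed here

test(): loads the model saved during training and evaluates it with classification_report and accuracy_score; adapt it for other tasks

predict(): loads the model, runs prediction, and saves the results to a file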
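To make the evaluation step more concrete, here is a small pure-Python sketch of the quantities that accuracy_score and classification_report summarize: overall accuracy plus per-class precision, recall, and F1. The function names accuracy and per_class_report and the toy labels are hypothetical. This is an illustration of the metrics, not the code used in test().

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_report(y_true, y_pred):
    """Per-class precision, recall, F1, and support as a nested dict."""
    true_count = Counter(y_true)
    pred_count = Counter(y_pred)
    true_positive = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = true_positive[label]
        precision = tp / pred_count[label] if pred_count[label] else 0.0
        recall = tp / true_count[label] if true_count[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": true_count[label],
        }
    return report


# Made-up labels for a two-class task.
y_true = ["pos", "neg", "pos", "neg"]
y_pred = ["pos", "pos", "pos", "neg"]

acc = accuracy(y_true, y_pred)
report = per_class_report(y_true, y_pred)
print(acc)
print(report)
```

For a task other than classification, this is the part to swap out for a metric that fits the task.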

Functions not recommended for modification:

encode_data(): encodes the input

init(): initializes parameters

References / Acknowledgements

  1. Chinese-PreTrained-XLNet (ymcui) https://github.com/ymcui/Chinese-PreTrained-XLNet
  2. keras-xlnet (CyberZHG) https://github.com/CyberZHG/keras-xlnet
  3. Keras-TextClassification (yongzhuo) https://github.com/yongzhuo/Keras-TextClassification
  4. xlnet (zihangdai) https://github.com/zihangdai/xlnet

xlnet_embbeding's People

Contributors

zedom1

xlnet_embbeding's Issues

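Long text classification

Hi, I have a question. For short text classification, maxlen=128 is enough, but can XLNet handle long texts of several thousand characters? BERT's limit is 512, and from the XLNet source code the limit also appears to be 512, with a default of 128.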
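One common workaround for inputs longer than maxlen, not taken from this repository, is to split the token sequence into overlapping windows, classify each window, and average the per-window class probabilities. The sketch below assumes this approach. The function names split_into_windows and average_predictions, the stride value, and the example probabilities are all hypothetical.

```python
def split_into_windows(tokens, maxlen=128, stride=64):
    """Split a token list into overlapping windows of at most maxlen tokens.

    stride is the step between window starts; it must not exceed maxlen,
    otherwise tokens between windows would be skipped.
    """
    if maxlen <= 0 or stride <= 0 or stride > maxlen:
        raise ValueError("require 0 < stride <= maxlen")
    if len(tokens) <= maxlen:
        return [list(tokens)]

    windows = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + maxlen]))
        if start + maxlen >= len(tokens):
            break  # this window already reaches the end of the text
        start += stride
    return windows


def average_predictions(window_probs):
    """Average per-window class probabilities into one document-level vector."""
    count = len(window_probs)
    num_classes = len(window_probs[0])
    return [sum(p[i] for p in window_probs) / count for i in range(num_classes)]


# A made-up document of 300 token ids.
tokens = list(range(300))
windows = split_into_windows(tokens, maxlen=128, stride=64)

# Hypothetical per-window outputs from a 2-class model.
doc_probs = average_predictions([[0.9, 0.1], [0.7, 0.3]])
print(len(windows), doc_probs)
```

Averaging is only one way to combine windows; taking the maximum, or pooling the window features before the classifier, are alternatives with different trade-offs.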
