Radish

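Radish lets you use the same C++ codebase for your model all the way from training to deployment. Built on libtorch, it lets you focus on implementing the model and its data processing.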

How to build

  1. Install bazel 0.28+
  2. A compiler with C++17 support (7.3.2 and 8.3.0 verified)
  3. Run a build, for example: bazel build bert:train_albert_main

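Why reinvent this wheel

  1. Putting AI into real production requires solid engineering

  2. There are too many models; training, preprocessing, and the rest also need solid engineering

  3. Real-time training scenarios, such as some RL workloads, need true multithreading, which Python does not provide

  4. Training and inference share the same codebase, which narrows the gap to production

If you have run into any of the problems above, Radish is worth a try!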

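How to use

  1. Derive from the radish::LlbModel class and implement the forward pass and the loss computation

  2. Decide on your sample features and the corresponding target

  3. Implement radish::data::ExampleParser, providing the parsing methods you need

  4. Use radish::train::LlbTrainer, specifying the template parameters and function arguments, to train the model

  5. ....

    See the spanbert and albert examples under the bert directory.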
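
The shape of the steps above can be illustrated with a minimal sketch. It does not use the real Radish API: the signatures of radish::LlbModel, radish::data::ExampleParser and radish::train::LlbTrainer are not given here, so every name in the `sketch` namespace below is a hypothetical stand-in, and a one-weight linear model with a hand-written gradient replaces libtorch so the example compiles with the standard library alone.

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {  // hypothetical stand-ins, NOT the real radish classes

// Step 2: the sample features and the corresponding target.
struct Example {
  double feature;
  double target;
};

// Plays the role described for radish::data::ExampleParser.
class ExampleParser {
 public:
  virtual ~ExampleParser() = default;
  virtual std::optional<Example> Parse(const std::string& line) const = 0;
};

// Plays the role described for radish::LlbModel: forward pass plus loss.
class Model {
 public:
  virtual ~Model() = default;
  virtual double Forward(double feature) const = 0;
  virtual double Loss(double prediction, double target) const = 0;
  virtual void Step(const Example& ex, double learning_rate) = 0;
};

// Step 3: parse one "feature target" pair per text line.
class TsvParser : public ExampleParser {
 public:
  std::optional<Example> Parse(const std::string& line) const override {
    std::istringstream in(line);
    Example ex{};
    if (!(in >> ex.feature >> ex.target)) return std::nullopt;
    return ex;
  }
};

// Step 1: a model y = w * x with squared-error loss.
class LinearModel : public Model {
 public:
  double Forward(double x) const override { return w_ * x; }
  double Loss(double p, double t) const override { return (p - t) * (p - t); }
  void Step(const Example& ex, double learning_rate) override {
    // d/dw (w*x - t)^2 = 2 * (w*x - t) * x
    w_ -= learning_rate * 2.0 * (Forward(ex.feature) - ex.target) * ex.feature;
  }
  double weight() const { return w_; }

 private:
  double w_ = 0.0;
};

// Step 4: a trainer configured through template parameters.
template <typename ModelT, typename ParserT>
class Trainer {
 public:
  // Runs plain SGD and returns the mean loss over the final epoch.
  double Train(ModelT& model, const std::vector<std::string>& lines,
               int epochs, double learning_rate) const {
    ParserT parser;
    double mean_loss = 0.0;
    for (int epoch = 0; epoch < epochs; ++epoch) {
      double total = 0.0;
      std::size_t count = 0;
      for (const auto& line : lines) {
        const std::optional<Example> ex = parser.Parse(line);
        if (!ex) continue;  // skip lines that do not parse
        total += model.Loss(model.Forward(ex->feature), ex->target);
        model.Step(*ex, learning_rate);
        ++count;
      }
      mean_loss = count ? total / static_cast<double>(count) : 0.0;
    }
    return mean_loss;
  }
};

}  // namespace sketch
```

With these stand-ins, `sketch::Trainer<sketch::LinearModel, sketch::TsvParser>` is instantiated once and its `Train` method is called with the model, the raw sample lines, an epoch count, and a learning rate. For the actual class interfaces, the spanbert and albert examples in the bert directory are the reference.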

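Data loading

You can use two data formats: one based on leveldb and one based on plain text (one sample per line). The leveldb-based format supports fully random access. The text-based format supports multiple input files and reads data from a randomly chosen file each time.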
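
To make the text-based mode concrete, here is a minimal sketch of reading one sample at a time from a randomly chosen file. It is an illustration under assumptions, not Radish's loader: the class name `MultiFileLineReader`, the seed parameter, and the policy of dropping a file once it is exhausted are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <optional>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reader: each call to Next() picks one of the still-open files
// uniformly at random and returns its next line (one line = one sample).
class MultiFileLineReader {
 public:
  MultiFileLineReader(const std::vector<std::string>& paths, unsigned seed)
      : rng_(seed) {
    assert(!paths.empty());
    for (const auto& path : paths) {
      std::ifstream in(path);
      if (in.is_open()) streams_.push_back(std::move(in));
    }
  }

  // Returns the next sample, or std::nullopt once every file is exhausted.
  std::optional<std::string> Next() {
    std::string line;
    while (!streams_.empty()) {
      std::uniform_int_distribution<std::size_t> pick(0, streams_.size() - 1);
      const std::size_t i = pick(rng_);
      if (std::getline(streams_[i], line)) return line;
      // Assumed policy: a file that has no more lines leaves the pool.
      streams_.erase(streams_.begin() + static_cast<std::ptrdiff_t>(i));
    }
    return std::nullopt;
  }

 private:
  std::mt19937 rng_;
  std::vector<std::ifstream> streams_;
};
```

A caller would loop with `while (auto sample = reader.Next()) { ... }`. Samples from different files are interleaved at random, while the order inside each file is preserved; this is weaker than the fully random access described for the leveldb format.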

About ALBERT

Sample format: TXT, one sample per line, with any line breaks inside a sample replaced by \t or a space.
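
As a small illustration of this format, the helper below flattens a multi-line piece of text into a single-line sample. The function name `FlattenSample` and its handling of Windows-style line endings are assumptions made for this sketch; Radish may ship its own preprocessing.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces every line break in `text` with `sep`
// (a tab by default, or a space) so the sample fits on one line.
// "\r\n" counts as a single line break.
std::string FlattenSample(const std::string& text, char sep = '\t') {
  std::string out;
  out.reserve(text.size());
  for (std::size_t i = 0; i < text.size(); ++i) {
    char c = text[i];
    if (c == '\r') {
      // Skip the '\r' of a "\r\n" pair; the following '\n' emits the separator.
      if (i + 1 < text.size() && text[i + 1] == '\n') continue;
      c = '\n';  // a lone '\r' is treated as a line break too
    }
    out.push_back(c == '\n' ? sep : c);
  }
  return out;
}
```

Each flattened sample is then written as one line of the training file. A trailing line break in the input becomes a trailing separator, so trim the text first if that matters.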

Run training (example):

LD_LIBRARY_PATH=/data/chenyw/libtorch_gpu/lib ./train_albert_main --train_data_path /data/chenyw/albert/data/part0,/data/chenyw/albert/data/part1 --test_data_path /data/chenyw/albert/data/valid0 --warmup_steps 10000 --parser_conf_path parser_conf.json --eval_every 5000 --learning_rate 0.0003 --batch_size 460

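For more parameters, run with the --help flag to print them.

The experiments reported in the paper suggest that hidden size is the main factor, while parameter sharing actually hurts results. This example implementation therefore does not include parameter sharing. Feel free to change the code yourself; pull requests are also welcome.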

References

  1. PyTorch C++ Doc
  2. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  3. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
  4. SpanBERT: Improving Pre-training by Representing and Predicting Spans
