Code Monkey home page Code Monkey logo

exbert's Introduction

exBERT

The details of the model is in paper.

Pre-train an exBERT model (only the extension part)

In command line:

python Pretraining.py -e 1 
  -b 256 
  -sp path_to_storage
  -dv 0 1 2 3 -lr 1e-04 
  -str exBERT    
  -config path_to_config_file_of_the_OFF_THE_SHELF_MODEL ./config_and_vocab/exBERT/bert_config_ex_s3.json  
  -vocab ./config_and_vocab/exBERT/exBERT_vocab.txt 
  -pm_p path_to_state_dict_of_the_OFF_THE_SHELF_MODEL
  -dp path_to_your_training_data
  -ls 128 
  -p 1

You can replace the path_to_config_file_of_the_OFF_THE_SHELF_MODEL and path_to_state_dict_of_the_OFF_THE_SHELF_MODEL to any weel pre-trained model in BERT archietecture. ./config_and_vocab/exBERT/bert_config_ex_s3.json defines the size of extension module.

Pre-train an exBERT model (whole model)

python Pretraining.py -e 1 
  -b 256 
  -sp path_to_storage
  -dv 0 1 2 3 -lr 1e-04 
  -str exBERT    
  -config path_to_config_file_of_the_OFF_THE_SHELF_MODEL ./config_and_vocab/exBERT/bert_config_ex_s3.json  
  -vocab ./config_and_vocab/exBERT/exBERT_vocab.txt 
  -pm_p path_to_state_dict_of_the_OFF_THE_SHELF_MODEL
  -dp path_to_your_training_data
  -ls 128 
  -p 1
  -t_ex_only ""

-t_ex_only "" enable training the whole model

Pre-train an exBERT with no extension of vocab

python Pretraining.py -e 1 
  -b 256 
  -sp path_to_storage
  -dv 0 1 2 3 -lr 1e-04 
  -str exBERT    
  -config path_to_config_file_of_the_OFF_THE_SHELF_MODEL config_and_vocab/exBERT_no_ex_vocab/bert_config_ex_s3.json
  -vocab path_to_vocab_file_of_the_OFF_THE_SHELF_MODEL
  -pm_p path_to_state_dict_of_the_OFF_THE_SHELF_MODEL
  -dp path_to_your_training_data
  -ls 128 
  -p 1
  -t_ex_only ""

Data preparation

Input data for pre-training script should be a .pkl file which contains two a list with two elements, e.g. [list1, list2]. list1 and list2 should contains the sentences like [CLS] sentence A [SEP] sentence B [SEP]. The only differnece between list1 and list2 is the relationship between sentence A and sentence B is IsNext or NotNext. Please check example_data.pkl

We also provide a simple script to generate the data from raw text file. python data_preprocess.py -voc path_to_vocab_file -ls 128 -dp path_to_txt_file -n_c 5 -rd 1 -sp ./your_data.pkl replace 128 to the max length limit you want try python data_preprocess.py -voc ./exBERT_vocab.txt -ls 128 -dp ./example_raw_text.txt -n_c 5 -rd 1 -sp ./example_data.pkl

Or you can do your own data preparation and organize the data with the format metioned above.

exbert's People

Contributors

taiwen97 avatar lyasolis avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.