
Sinkhorn Distance Minimization for Knowledge Distillation (COLING 2024)
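SinKD distills knowledge by minimizing a Sinkhorn (entropy-regularized optimal transport) distance between teacher and student output distributions instead of the usual KL divergence. As a rough, self-contained illustration of the underlying iteration (not the paper's implementation; the cost matrix, `eps`, and iteration count below are arbitrary choices for the toy example):

```python
import math

def softmax(logits):
    """Convert a list of logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sinkhorn_distance(p, q, cost, eps=0.1, n_iter=100):
    """Entropy-regularized optimal transport cost between discrete
    distributions p and q, computed with Sinkhorn iterations."""
    n = len(p)
    # Gibbs kernel derived from the ground cost matrix
    K = [[math.exp(-cost[i][j] / eps) for j in range(n)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * n
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals p and q
        u = [p[i] / sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [q[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(n)]
    # Transport plan P[i][j] = u[i] * K[i][j] * v[j]; return <P, cost>
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(n))

# Toy example: teacher and student logits over 3 classes, with an
# absolute-difference ground cost between class indices.
teacher = softmax([2.0, 1.0, 0.0])
student = softmax([0.0, 1.0, 2.0])
cost = [[abs(i - j) for j in range(3)] for i in range(3)]
```

The distance is near zero between identical distributions and grows as the teacher and student outputs diverge, which is what makes it usable as a distillation loss.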

Installation

To install the environment, run:

sh ins.sh

Download GLUE and SuperGLUE Data

Download the GLUE data using this repository or from the GLUE benchmark website, unpack it into the datas/glue directory, and rename the CoLA folder to COLA.
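The rename step can be done as below (a sketch; it assumes the GLUE archives have already been extracted under datas/glue):

```shell
# Create the expected data directory (no-op if it already exists).
mkdir -p datas/glue
# The training code expects the CoLA task folder to be named COLA.
if [ -d datas/glue/CoLA ]; then
  mv datas/glue/CoLA datas/glue/COLA
fi
```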

Download the SuperGLUE data from the SuperGLUE benchmark website.

Download Pre-trained BERT

Download bert_uncased_L-12_H-768_A-12 (BERT-base) as the teacher model and bert_uncased_L-6_H-768_A-12 as the student model from this repository, then convert them to PyTorch checkpoints using the Hugging Face conversion API.
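One way to do the conversion is the TF-to-PyTorch script bundled with the transformers library; the paths below are hypothetical and should point at wherever the TensorFlow checkpoints were unpacked:

```shell
# Convert the TensorFlow teacher checkpoint to a PyTorch checkpoint.
# Repeat with the L-6 directory for the student model.
python -m transformers.models.bert.convert_bert_original_tf_checkpoint_to_pytorch \
  --tf_checkpoint_path bert_uncased_L-12_H-768_A-12/bert_model.ckpt \
  --bert_config_file bert_uncased_L-12_H-768_A-12/bert_config.json \
  --pytorch_dump_path bert_uncased_L-12_H-768_A-12/pytorch_model.bin
```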

Task-specific BERT Model Distillation

The training script for Task-specific Teacher Model Finetuning can be found in the script/teacher/ directory, where $TEACHER_PATH denotes the file path of the teacher model.

Similarly, the training script for Task-specific Student Model Distillation is located in the script/student/ directory. In this case, $STUDENT_PATH and $TEACHER_PATH represent the file paths of the student and teacher models, respectively.
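For example, the two paths can be supplied as environment variables before invoking a training script (the checkpoint directories below are placeholders, not paths shipped with the repository):

```shell
# Placeholder checkpoint directories; substitute your own paths.
export TEACHER_PATH=ckpt/bert_base_teacher
export STUDENT_PATH=ckpt/bert_6layer_student
# Then run the desired task's script from script/student/
# (script names vary by task).
```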

Task-specific T0 Model Distillation

To install the environment, run:

sh T0/ins.sh

To perform Task-specific Teacher Model Finetuning, run:

python3 T0/distillation_t.py --dataset_name super_glue --dataset_config_name DATASET_NAME --template_name "TEMPLATE_NAME" --model_name_or_path MODEL_DIR --output_dir ./debug --parallelize

To perform Task-specific Student Model Distillation, run:

python3 T0/distillation.py --dataset_name super_glue --dataset_config_name DATASET_NAME --template_name "TEMPLATE_NAME" --model_name_or_path MODEL_DIR --output_dir ./debug --parallelize

Task-specific GPT Model Distillation

To install the environment, run:

sh GPT-Neo/ins.sh

To perform Task-specific Teacher Model Finetuning, run:

python3 GPT-Neo/distillation_t.py --dataset_name super_glue --dataset_config_name DATASET_NAME --template_name "TEMPLATE_NAME" --model_name_or_path MODEL_DIR --output_dir ./debug --parallelize

To perform Task-specific Student Model Distillation, run:

python3 GPT-Neo/distillation.py --dataset_name super_glue --dataset_config_name DATASET_NAME --template_name "TEMPLATE_NAME" --model_name_or_path MODEL_DIR --output_dir ./debug --parallelize

Student Checkpoints

The distilled student model for each task reported in the paper can be downloaded using the following link: https://drive.google.com/drive/folders/1BsA0VHKSa_-Bp5I7dQ2Ftk2q7cIyPrdC

Teacher Checkpoints

The teacher model for each task reported in the paper can be downloaded using the following link: https://drive.google.com/file/d/1sBi35Dk8VJ7TU0warB6BL9QKx-in9Ww6/view?usp=drive_link

BibTeX

@article{cui2024sinkhorn,
  title={Sinkhorn Distance Minimization for Knowledge Distillation},
  author={Cui, Xiao and Qin, Yulei and Gao, Yuting and Zhang, Enwei and Xu, Zihan and Wu, Tong and Li, Ke and Sun, Xing and Zhou, Wengang and Li, Houqiang},
  journal={arXiv preprint arXiv:2402.17110},
  year={2024}
}
