
zero-few-shot-electra's Introduction

"ELECTRA is a Zero-Shot Learner, Too"

Overview

Recently, for few-shot and even zero-shot learning, the new “pre-train, prompt, and predict” paradigm has achieved remarkable results compared with the “pre-train, fine-tune” paradigm. A series of small language models (e.g., BERT, ALBERT, RoBERTa) pre-trained with the Masked Language Model (MLM) task have become popular and widely used. However, another efficient and powerful pre-trained language model, ELECTRA, has probably been neglected. This paper attempts to accomplish several NLP tasks in the zero-shot scenario using ELECTRA's original, sample-efficient pre-training task, Replaced Token Detection (RTD). Through extensive experiments on 15 NLP datasets, we find that ELECTRA performs surprisingly well as a zero-shot learner, which indicates that the ELECTRA model has more potential waiting to be tapped.
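To make the idea of prompting with Replaced Token Detection more concrete, here is a minimal sketch. It is not the code in this repository: the template, the label words, the function names, and the `toy_scorer` stand-in are all hypothetical, and the scoring rule (pick the label word that the discriminator is least likely to flag as replaced) is an assumption about how an RTD head could be used for zero-shot classification. In practice the scorer would be an ELECTRA discriminator.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical scorer interface: given a token list, return for each token the
# probability that it was "replaced" (what an RTD discriminator head outputs).
RtdScorer = Callable[[Sequence[str]], List[float]]

# Hypothetical prompt template and label words, chosen only for illustration.
TEMPLATE = "{text} It was {label} ."
LABEL_WORDS: Dict[str, str] = {"positive": "great", "negative": "terrible"}


def zero_shot_rtd_classify(
    text: str,
    template: str,
    label_words: Dict[str, str],
    scorer: RtdScorer,
) -> Tuple[str, Dict[str, float]]:
    """Fill the template with each label word and keep the label whose word
    looks most 'original' (lowest replaced-probability) to the scorer."""
    scores: Dict[str, float] = {}
    for label, word in label_words.items():
        tokens = template.format(text=text, label=word).split()
        replaced_probs = scorer(tokens)
        # Use the last occurrence, in case the word also appears in the input text.
        idx = len(tokens) - 1 - tokens[::-1].index(word)
        scores[label] = 1.0 - replaced_probs[idx]
    return max(scores, key=scores.get), scores


def toy_scorer(tokens: Sequence[str]) -> List[float]:
    """Hand-written stand-in for a discriminator (NOT a real model): a label
    word is flagged as replaced when it clashes with a cue word in the context."""
    positive_cues = {"delight", "wonderful"}
    negative_cues = {"boring", "awful"}
    has_positive = any(t in positive_cues for t in tokens)
    has_negative = any(t in negative_cues for t in tokens)
    probs = []
    for t in tokens:
        if t == "great":
            probs.append(0.9 if has_negative else 0.1)
        elif t == "terrible":
            probs.append(0.9 if has_positive else 0.1)
        else:
            probs.append(0.05)
    return probs


if __name__ == "__main__":
    label, _ = zero_shot_rtd_classify(
        "The movie is a delight .", TEMPLATE, LABEL_WORDS, toy_scorer
    )
    print(label)  # positive
```

Because the scorer is passed in as a plain callable, the same selection logic can be tried with any model that returns per-token replaced-probabilities.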

Use

python electra_classification.py

python electra_sentence_pair_classification.py
 
python electra_STS-B.py

Main experimental results

[Figure: main experimental results]

Environment

bert4keras>=0.10.8, tensorflow==1.15.0, keras==2.3.1
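The pins above can be checked before running the scripts. The following is a small sketch, not part of the repository: the helper names are hypothetical, the version parsing is deliberately simple (it compares only the first three numeric components), and it assumes each package exposes a `__version__` attribute after import.

```python
import importlib
import re
from typing import Dict, List, Tuple

# Pins copied from the Environment line; the operators are ">=" or "==".
REQUIREMENTS: Dict[str, Tuple[str, str]] = {
    "bert4keras": (">=", "0.10.8"),
    "tensorflow": ("==", "1.15.0"),
    "keras": ("==", "2.3.1"),
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Keep the first three numeric components, padding with zeros."""
    parts = [int(p) for p in re.findall(r"\d+", version)[:3]]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def satisfies(installed: str, op: str, required: str) -> bool:
    """Compare two version strings under '>=' or '=='."""
    a, b = parse_version(installed), parse_version(required)
    return a >= b if op == ">=" else a == b


def check_environment(requirements: Dict[str, Tuple[str, str]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all pins hold."""
    problems: List[str] = []
    for name, (op, required) in requirements.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append(f"{name}: not installed (need {op}{required})")
            continue
        # Assumption: the package exposes __version__ at top level.
        installed = getattr(module, "__version__", None)
        if installed is None:
            problems.append(f"{name}: version unknown (need {op}{required})")
        elif not satisfies(installed, op, required):
            problems.append(f"{name}: found {installed}, need {op}{required}")
    return problems


if __name__ == "__main__":
    for problem in check_environment(REQUIREMENTS):
        print(problem)
```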

Acknowledgements

Our code is based on Jianlin Su's bert4keras and Sun Yi's NSP-BERT. Thank you for your open-source spirit!

Citation

@article{ni2022electra,
  title={ELECTRA is a Zero-Shot Learner, Too},
  author={Ni, Shiwen and Kao, Hung-Yu},
  journal={arXiv preprint arXiv:2207.08141},
  year={2022}
}
