Bioformer: an efficient BERT model for biomedical text mining

Bioformer is a lightweight BERT model pretrained on biomedical literature. We pretrained two Bioformer models, Bioformer-8L and Bioformer-16L. Both models were pretrained on all PubMed abstracts (as of Jan 2021) and 1 million subsampled PubMed Central full-text articles. We used the original implementation of BERT to train the models. Bioformer models have the following features:

  • Accurate. Bioformer achieves comparable or even better performance than BioBERT/PubMedBERT on downstream NLP tasks. A detailed evaluation is here.
  • Smaller model size. Bioformer-8L and Bioformer-16L reduce the model size by 60% compared with BERT-Base/BioBERT-Base/PubMedBERT.
  • Fast and memory efficient. Bioformer-8L is 3X as fast as PubMedBERT, and Bioformer-16L is 2X as fast as PubMedBERT.
  • Biomedical vocabulary. Bioformer uses a biomedical vocabulary of 32768 tokens, which was trained on PubMed abstracts and PubMed Central full-text articles. Bioformer can encode some special Unicode symbols that are not in the original BERT vocabulary.

Download

PyTorch checkpoint

Pretrained model weights for Bioformer-8L and Bioformer-16L are available on HuggingFace (Bioformer-8L and Bioformer-16L).

You can easily use Bioformer with the transformers library.
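
A minimal sketch of loading a checkpoint with transformers and encoding one sentence is shown below. The HuggingFace model IDs (`bioformers/bioformer-8L`, `bioformers/bioformer-16L`) are assumptions, since the original links are not reproduced here; check them against the HuggingFace pages before use. The helper names `load_bioformer` and `embed_sentence` are hypothetical and exist only for this illustration.

```python
# Assumed HuggingFace model IDs; verify them on the HuggingFace hub.
MODEL_IDS = {
    "8L": "bioformers/bioformer-8L",
    "16L": "bioformers/bioformer-16L",
}


def load_bioformer(variant="8L"):
    """Return (tokenizer, model) for the requested Bioformer variant."""
    model_id = MODEL_IDS[variant]  # raises KeyError for an unknown variant
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def embed_sentence(text, tokenizer, model):
    """Return the final-layer [CLS] vector for one sentence."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Position 0 of the last hidden state is the [CLS] token.
    return outputs.last_hidden_state[:, 0]


if __name__ == "__main__":
    # Downloads the weights on first use (requires network access).
    tok, mdl = load_bioformer("8L")
    vec = embed_sentence("Aspirin inhibits platelet aggregation.", tok, mdl)
    print(vec.shape)
```

Using the [CLS] vector as a sentence representation is only one common choice; for downstream tasks such as classification or named entity recognition, the task-specific `AutoModelFor...` classes in transformers are usually a better starting point.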

Acknowledgment

Pretraining of Bioformer is partly supported by the Google TPU Research Cloud (TRC) program.

Citation

Fang L, Chen Q, Wei C-H, Lu Z, Wang K: Bioformer: an efficient transformer language model for biomedical text mining. arXiv preprint arXiv:2302.01588 (2023). DOI: https://doi.org/10.48550/arXiv.2302.01588

@ARTICLE{fangli2023bioformer,
       author = {{Fang}, Li and {Chen}, Qingyu and {Wei}, Chih-Hsuan and {Lu}, Zhiyong and {Wang}, Kai},
        title = "{Bioformer: an efficient transformer language model for biomedical text mining}",
      journal = {arXiv preprint arXiv:2302.01588},
         year = {2023}
}
