BertWithPretrained

This project is a from-scratch PyTorch implementation of the BERT model, together with examples of related downstream tasks.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

For more background on the Transformer, see the article This post is all you need(层层剥开Transformer), which covers it in roughly 40,000 characters, 50 figures, and 3 hands-on examples.

Project structure

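  • The bert_base_chinese directory holds the pretrained BERT base Chinese model and its configuration files

    Model download: https://huggingface.co/bert-base-chinese/tree/main

  • The bert_base_uncased_english directory holds the pretrained BERT base English (uncased) model and its configuration files

    Model download: https://huggingface.co/bert-base-uncased/tree/main

    Note: the parameter "pooler_type": "first_token_transform" must be added to config.json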

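  • The data directory holds the datasets used by the downstream tasks

    • SingleSentenceClassification is the 15-class Chinese text classification dataset from Toutiao (今日头条);
    • PairSentenceClassification is the MNLI (The Multi-Genre Natural Language Inference Corpus) dataset;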
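  • The model directory holds the implementation of each module

    • BasicBert contains the base BERT model implementation
      • MyTransformer.py implements the self-attention mechanism;
      • BertEmbedding.py implements the Input Embedding;
      • BertConfig.py loads the open-source config.json configuration file;
      • Bert.py implements the BERT model itself;
    • The DownstreamTasks directory holds the modules for the downstream tasks
      • BertForSentenceClassification implements single-label sentence classification;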
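  • The Tasks directory holds the training and inference code for each downstream task

    • TaskForSingleSentenceClassification implements training and inference for single-label, single-text classification, and can be used for ordinary text classification tasks;
    • TaskForPairSentenceClassification implements training and inference for text-pair classification, and can be used for entailment tasks (for example, the MNLI dataset);
  • The test directory holds test cases for each module

  • The utils directory holds the utility modules

    • data_helpers.py handles data preprocessing and dataset construction for each downstream task;
    • log_helper.py is the logging module;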
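
The config.json note in the list above can be applied by hand in a text editor, or with a few lines of Python. The following is a minimal sketch and is not part of the repository: the helper name `add_pooler_type` is hypothetical, and the path used in the example run is an assumption based on the directory names listed above. It only relies on the standard `json` and `pathlib` modules.

```python
import json
from pathlib import Path


def add_pooler_type(config_path, pooler_type="first_token_transform"):
    """Add the "pooler_type" key to a BERT config.json if it is missing.

    Hypothetical helper for illustration; not part of the repository.
    Returns the updated configuration as a dict.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Leave an existing value untouched; only fill in the missing key.
    config.setdefault("pooler_type", pooler_type)
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return config


if __name__ == "__main__":
    # Assumed location, following the directory names in the project structure.
    target = Path("bert_base_uncased_english") / "config.json"
    if target.exists():
        print(add_pooler_type(target)["pooler_type"])
    else:
        print(f"{target} not found; download the model files first")
```

Because the helper uses `setdefault`, running it a second time leaves the file's contents unchanged, so it is safe to call before every training run.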

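Usage

  1. Download each dataset and place it in the corresponding directory;

  2. Enter the Tasks directory and run the model for the task you want;
    2.1 Single-text classification task

    python TaskForSingleSentenceClassification.py

    Output: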

    -- INFO: Epoch: 0, Batch[0/4186], Train loss :2.862, Train acc: 0.125
    -- INFO: Epoch: 0, Batch[10/4186], Train loss :2.084, Train acc: 0.562
    -- INFO: Epoch: 0, Batch[20/4186], Train loss :1.136, Train acc: 0.812        
    -- INFO: Epoch: 0, Batch[30/4186], Train loss :1.000, Train acc: 0.734
    ...
    -- INFO: Epoch: 0, Batch[4180/4186], Train loss :0.418, Train acc: 0.875
    -- INFO: Epoch: 0, Train loss: 0.481, Epoch time = 1123.244s
    ...
    -- INFO: Epoch: 9, Batch[4180/4186], Train loss :0.102, Train acc: 0.984
    -- INFO: Epoch: 9, Train loss: 0.100, Epoch time = 1130.071s
    -- INFO: Accurcay on val 0.884
    -- INFO: Accurcay on val 0.888

    2.2 Textual entailment task

    python TaskForPairSentenceClassification.py

    Output:

    -- INFO: Epoch: 0, Batch[0/17181], Train loss :1.082, Train acc: 0.438
    -- INFO: Epoch: 0, Batch[10/17181], Train loss :1.104, Train acc: 0.438
    -- INFO: Epoch: 0, Batch[20/17181], Train loss :1.129, Train acc: 0.250     
    -- INFO: Epoch: 0, Batch[30/17181], Train loss :1.063, Train acc: 0.375
    ...
    -- INFO: Epoch: 0, Batch[17180/17181], Train loss :0.367, Train acc: 0.909
    -- INFO: Epoch: 0, Train loss: 0.589, Epoch time = 2610.604s
    ...
    -- INFO: Epoch: 9, Batch[0/17181], Train loss :0.064, Train acc: 1.000
    -- INFO: Epoch: 9, Train loss: 0.142, Epoch time = 2542.781s
    -- INFO: Accurcay on val 0.797
    -- INFO: Accurcay on val 0.810
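
Step 1 above depends on the model and dataset directories being populated before either script is started. As a rough pre-flight check, the sketch below reports which of the directories named in the project structure are missing or empty. It is not part of the repository: the function name `missing_or_empty` and the `REQUIRED_DIRS` tuple are hypothetical, and the assumption that the dataset folders sit directly under `data` is read off the structure list, not verified against the code.

```python
from pathlib import Path

# Directory names taken from the project structure section; their exact
# placement relative to the project root is an assumption.
REQUIRED_DIRS = (
    "bert_base_chinese",
    "bert_base_uncased_english",
    "data/SingleSentenceClassification",
    "data/PairSentenceClassification",
)


def missing_or_empty(project_root="."):
    """Return the required directories that do not exist or contain no entries."""
    root = Path(project_root)
    problems = []
    for rel in REQUIRED_DIRS:
        path = root / rel
        # A directory that exists but is empty still needs its files downloaded.
        if not path.is_dir() or not any(path.iterdir()):
            problems.append(rel)
    return problems


if __name__ == "__main__":
    # Run from the project root, one level above the Tasks directory.
    for rel in missing_or_empty("."):
        print(f"missing or empty: {rel}")
```

The check only looks at directory names, since the individual file names inside each dataset folder are not listed in this README.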

Detailed model walkthrough

Contributors

  • moon-hotel
