mvp's Introduction

MVP: Multi-task Supervised Pre-training for Natural Language Generation

This repository is the official implementation of our paper https://arxiv.org/abs/2206.12131. The implementation is built entirely on our text generation library TextBox 2.0.

Overview

  • MVP follows a standard Transformer encoder-decoder architecture.
  • MVP is pre-trained in a supervised manner on labeled datasets.
  • MVP also has task-specific soft prompts that stimulate the model's capacity for a specific task.
  • MVP is specially designed for natural language generation and can be adapted to a wide range of generation tasks. Our model can also be adapted to natural language understanding tasks.

Tips:

  • We have released a series of models on HuggingFace, including MVP, MVP with task-specific prompts, and multi-task pre-trained variants.
  • If you want to use a model without prompts, you can load it through MvpForConditionalGeneration.from_pretrained('RUCAIBox/mvp').
  • If you want to use a model with task-specific prompts, such as summarization, you can load it through MvpForConditionalGeneration.from_pretrained('RUCAIBox/mvp-summarization').
  • Our model supports lightweight prompt tuning following Prefix-tuning with config lightweight_tuning=True.
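
The loading tips above can be sketched as a small helper. This is a minimal sketch, not the authors' code: `checkpoint_name` and `load_mvp` are hypothetical names, the task list is taken from the fine-tuning section of this README, and loading the tokenizer from `RUCAIBox/mvp` for every checkpoint is an assumption.

```python
from typing import Optional

# Task names this README lists for the prompt-equipped checkpoints.
KNOWN_TASKS = {
    "summarization", "open-dialog", "data-to-text", "question-generation",
    "story", "question-answering", "task-dialog",
}


def checkpoint_name(task: Optional[str] = None) -> str:
    """Return the Hub id: plain MVP, or MVP with task-specific prompts."""
    if task is None:
        return "RUCAIBox/mvp"
    if task not in KNOWN_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return f"RUCAIBox/mvp-{task}"


def load_mvp(task: Optional[str] = None):
    """Load tokenizer and model; needs `transformers` and network access."""
    # Imported lazily so checkpoint_name() works without transformers installed.
    from transformers import MvpForConditionalGeneration, MvpTokenizer

    # Assumption: all checkpoints share the tokenizer stored under RUCAIBox/mvp.
    tokenizer = MvpTokenizer.from_pretrained("RUCAIBox/mvp")
    model = MvpForConditionalGeneration.from_pretrained(checkpoint_name(task))
    return tokenizer, model
```

Calling `load_mvp("summarization")` would then fetch `RUCAIBox/mvp-summarization`, while `load_mvp()` fetches the prompt-free model.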

Installation

You should clone the TextBox repository and follow its instructions.

git clone https://github.com/RUCAIBox/TextBox.git && cd TextBox
bash install.sh

Datasets

You can download our fine-tuning datasets from https://huggingface.co/RUCAIBox. Create a folder named dataset and download the datasets you need, such as cnndm, into it.

We currently support 11 generation tasks and their corresponding datasets:

  • Text summarization: CNN/Daily Mail (cnndm), XSum (xsum), SAMSum (samsum), and WLE (wle).
  • Open-ended dialogue system: PersonaChat (pc), DailyDialog (dd), DSTC7-AVSD (da), and SGD (sgd).
  • Data-to-text generation: WebNLG v2.1 (webnlg), WebNLG v3.0 (webnlg2), WikiBio (wikibio), E2E (e2e), DART (dart), and ToTTo (totto).
  • Question generation: SQuAD (squadqg) and CoQA (coqaqg).
  • Story generation: ROCStories (roc) and WritingPrompts (wp).
  • Question answering: SQuAD (squad) and CoQA (coqa).
  • Task-oriented dialogue system: MultiWOZ 2.0 (multiwoz).
  • Commonsense generation: CommonGen (cg).
  • Text simplification: WikiAuto + Turk/ASSET (wia).
  • Paraphrase generation: Quora (quora).
  • Text style transfer: GYAFC-E&M and F&R (gyafc_em, gyafc_fr).
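
For scripting, the list above can be written down as a lookup table. The sketch below is illustrative only: `TASK_DATASETS`, `task_of`, and `missing_datasets` are hypothetical names, and the check assumes each dataset sits in its own sub-folder of `dataset`, as described above.

```python
from pathlib import Path
from typing import Iterable, List

# Dataset short names grouped by task, copied from the list above.
TASK_DATASETS = {
    "Text summarization": ["cnndm", "xsum", "samsum", "wle"],
    "Open-ended dialogue system": ["pc", "dd", "da", "sgd"],
    "Data-to-text generation": ["webnlg", "webnlg2", "wikibio", "e2e", "dart", "totto"],
    "Question generation": ["squadqg", "coqaqg"],
    "Story generation": ["roc", "wp"],
    "Question answering": ["squad", "coqa"],
    "Task-oriented dialogue system": ["multiwoz"],
    "Commonsense generation": ["cg"],
    "Text simplification": ["wia"],
    "Paraphrase generation": ["quora"],
    "Text style transfer": ["gyafc_em", "gyafc_fr"],
}


def task_of(dataset_name: str) -> str:
    """Return the task a dataset short name belongs to."""
    for task, names in TASK_DATASETS.items():
        if dataset_name in names:
            return task
    raise KeyError(f"unknown dataset: {dataset_name!r}")


def missing_datasets(names: Iterable[str], root: str = "dataset") -> List[str]:
    """List the requested datasets that have no folder under `root` yet."""
    return [name for name in names if not (Path(root) / name).is_dir()]
```

Running `missing_datasets(["cnndm", "webnlg"])` from the TextBox root before fine-tuning gives a quick check that the downloads landed in the expected place.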

Fine-tuning, Inference and Evaluation

After downloading the dataset, our code can conduct fine-tuning, inference and evaluation in a pipeline.

Our paper covers MVP, MVP+S/M, Single, and BART; details can be found here.

Fine-tuning with MVP:

python run_textbox.py --model=MVP --dataset=[dataset_name] --model_path=RUCAIBox/mvp

dataset_name can be the name of any folder under the dataset folder, such as cnndm or webnlg.

Fine-tuning with MVP+S/M:

python run_textbox.py --model=MVP --dataset=[dataset_name] --model_path=RUCAIBox/mvp-[task_name]

task_name can be selected from summarization, open-dialog, data-to-text, question-generation, story, question-answering and task-dialog. If you want to fine-tune MVP+M, the task_name should be multi-task.

For example, to fine-tune on the squadqg dataset for question generation using MVP+S:

python run_textbox.py --model=MVP --dataset=squadqg --model_path=RUCAIBox/mvp-question-generation

Fine-tuning with Single and BART:

python run_textbox.py --model=MVP --dataset=[dataset_name] --model_path=RUCAIBox/mtl-[task_name]

task_name can be selected from summarization, open-dialog, data-to-text, question-generation, story, question-answering and task-dialog.

We also support fine-tuning with BART:

python run_textbox.py --model=BART --dataset=[dataset_name] --model_path=facebook/bart-large
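
The variants differ only in the `--model` and `--model_path` values. The following sketch collects the combinations shown in the commands above; `resolve_model` is a hypothetical helper written for illustration and is not part of TextBox.

```python
from typing import Tuple


def resolve_model(variant: str, task_name: str = "") -> Tuple[str, str]:
    """Return (--model, --model_path) for one of the fine-tuning variants."""
    if variant == "MVP":
        return "MVP", "RUCAIBox/mvp"
    if variant == "MVP+M":
        return "MVP", "RUCAIBox/mvp-multi-task"
    if variant == "BART":
        return "BART", "facebook/bart-large"
    if variant in ("MVP+S", "Single"):
        if not task_name:
            raise ValueError(f"{variant} needs a task_name")
        # MVP+S uses the mvp- prefix, Single uses the mtl- prefix.
        prefix = "mvp" if variant == "MVP+S" else "mtl"
        return "MVP", f"RUCAIBox/{prefix}-{task_name}"
    raise ValueError(f"unknown variant: {variant!r}")


model, model_path = resolve_model("MVP+S", "question-generation")
print(f"python run_textbox.py --model={model} --dataset=squadqg --model_path={model_path}")
```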

Lightweight Tuning:

If you want to conduct lightweight tuning of MVP+S/M, just add the option --lightweight_tuning=True in the script.

For example, to apply lightweight tuning on the roc dataset using MVP+M:

python run_textbox.py --model=MVP --dataset=roc --model_path=RUCAIBox/mvp-multi-task --lightweight_tuning=True

We also support lightweight tuning with BART+R (i.e., Prefix-tuning) here.
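
When launching several runs from a script, the command line can be assembled programmatically. This is a minimal sketch under the assumption that `run_textbox.py` is invoked from the TextBox root exactly as in the commands above; `build_command` is a hypothetical helper.

```python
import shlex
from typing import List


def build_command(dataset: str, model_path: str, model: str = "MVP",
                  lightweight: bool = False) -> List[str]:
    """Assemble the run_textbox.py invocation as an argument list."""
    cmd = [
        "python", "run_textbox.py",
        f"--model={model}",
        f"--dataset={dataset}",
        f"--model_path={model_path}",
    ]
    if lightweight:
        # Same flag as in the lightweight tuning example above.
        cmd.append("--lightweight_tuning=True")
    return cmd


cmd = build_command("roc", "RUCAIBox/mvp-multi-task", lightweight=True)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues.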

Citation

@article{tang2022mvp,
  title={MVP: Multi-task Supervised Pre-training for Natural Language Generation},
  author={Tang, Tianyi and Li, Junyi and Zhao, Wayne Xin and Wen, Ji-Rong},
  journal={arXiv preprint arXiv:2206.12131},
  year={2022},
  url={https://arxiv.org/abs/2206.12131},
}

mvp's People

Contributors

steventang1998

mvp's Issues

Does the model support FP16 inference?

Hello,

I was looking for ways to increase the inference speed, and one thing I thought would be useful was FP16. To try it, I called model.half() after loading the model. Unfortunately, this raised RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'. Is there a way to use FP16 during inference (or any other trick to accelerate inference)?

from transformers import MvpForConditionalGeneration, MvpTokenizer

# This works:
tokenizer = MvpTokenizer.from_pretrained('RUCAIBox/mvp')
model = MvpForConditionalGeneration.from_pretrained('RUCAIBox/mvp')
inputs = tokenizer(
    ["Describe the following data: Iron Man | instance of | Superhero [SEP] Stan Lee | creator | Iron Man",
     "Describe the following data: Batman | instance of | Superhero",
    ],
    padding=True,  # pad to a common length so the batch fits in one tensor
    return_tensors="pt",
)

generated_ids = model.generate(**inputs)

print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))
# Output reported in this issue:
# ['Iron Man is a fictional superhero appearing in American comic books published by Marvel Comics.',
#  "Batman is a superhero"]

# This doesn't (it raises the RuntimeError quoted above):
model = model.half()
generated_ids = model.generate(**inputs)
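
The quoted error comes from PyTorch's CPU LayerNorm kernel, which has no half-precision implementation in the PyTorch build that produced it. A common workaround is to use FP16 only when the model and inputs are on a CUDA device. The sketch below illustrates that idea and is not a fix confirmed by the maintainers: `pick_device_and_dtype` and `generate` are hypothetical helpers, and whether MVP's outputs stay stable in FP16 is not verified here.

```python
from typing import List, Tuple


def pick_device_and_dtype(cuda_available: bool) -> Tuple[str, str]:
    """Use FP16 only on a CUDA device; stay in FP32 on CPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def generate(texts: List[str], checkpoint: str = "RUCAIBox/mvp") -> List[str]:
    """Batched generation; needs `torch`, `transformers`, and network access."""
    import torch
    from transformers import MvpForConditionalGeneration, MvpTokenizer

    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    tokenizer = MvpTokenizer.from_pretrained("RUCAIBox/mvp")
    model = MvpForConditionalGeneration.from_pretrained(checkpoint).to(device)
    if dtype_name == "float16":
        model = model.half()  # safe here because the model is on the GPU
    model.eval()

    # Inputs must live on the same device as the model.
    inputs = tokenizer(texts, padding=True, return_tensors="pt").to(device)
    with torch.no_grad():
        ids = model.generate(**inputs)
    return tokenizer.batch_decode(ids, skip_special_tokens=True)
```

On a machine without a GPU the helper falls back to FP32, so the same call works in both environments.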

gpu

How can I use multiple GPUs?

Is batch processing possible during inference?

Hi, thank you for providing a pre-trained model. I am wondering whether it is possible to perform batched prediction with the model at https://huggingface.co/RUCAIBox/mtl-data-to-text. Something like the following?

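from transformers import MvpForConditionalGeneration, MvpTokenizer

tokenizer = MvpTokenizer.from_pretrained("RUCAIBox/mvp")
model = MvpForConditionalGeneration.from_pretrained("RUCAIBox/mtl-data-to-text")

inputs = tokenizer(
    ["Describe the following data: Iron Man | instance of | Superhero [SEP] Stan Lee | creator | Iron Man",
     "Describe the following data: Batman | instance of | Superhero",
    ],
    padding=True,  # pad to a common length so the batch fits in one tensor
    return_tensors="pt",
)

generated_ids = model.generate(**inputs)

print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))
# Desired result, one string per input:
# ['Iron Man is a fictional superhero appearing in American comic books published by Marvel Comics.',
#  "Batman is a superhero"]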

I'm trying to use MVP for a text classification project, could you please elaborate how?

MVP is specially designed for natural language generation and can be adapted to a wide range of generation tasks, including but not limited to summarization, data-to-text generation, open-ended dialogue system, story generation, question answering, question generation, task-oriented dialogue system, commonsense generation, paraphrase generation, text style transfer, and text simplification. Our model can also be adapted to natural language understanding tasks such as sequence classification and (extractive) question answering.

Currently having trouble adapting the model to sequence classification
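
One possible starting point is the sequence classification head that the Transformers library ships for MVP. The sketch below is an assumption-laden illustration, not guidance from the authors: `build_label_maps` and `load_classifier` are hypothetical helpers, the label names are placeholders, and the classification head is newly initialized, so the model still has to be fine-tuned on labeled data before its predictions mean anything.

```python
from typing import Dict, List, Tuple


def build_label_maps(labels: List[str]) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Create the id-to-label and label-to-id mappings stored in the model config."""
    id2label = dict(enumerate(labels))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_classifier(labels: List[str], checkpoint: str = "RUCAIBox/mvp"):
    """Load MVP with a classification head; needs `transformers` and network access."""
    from transformers import MvpForSequenceClassification, MvpTokenizer

    id2label, label2id = build_label_maps(labels)
    tokenizer = MvpTokenizer.from_pretrained("RUCAIBox/mvp")
    model = MvpForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

For example, `load_classifier(["negative", "positive"])` would return a tokenizer and a two-class model ready to be fine-tuned.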
