AESOP

This is the code base for AESOP: Paraphrase Generation with Adaptive Syntactic Control by Jiao Sun, Xuezhe Ma and Nanyun Peng. The paper was accepted at EMNLP 2021.

Please consider citing our work if you find either our code or data useful.

@inproceedings{sun2021aesop,
  title = {AESOP: Paraphrase Generation with Adaptive Syntactic Control},
  author = {Sun, Jiao and Ma, Xuezhe and Peng, Nanyun},
  booktitle = {The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year = {2021}
}

The code base is actively maintained. If you encounter any issues, please reach out to [email protected] or raise an issue here! We would like to thank the authors of SGCP and the huggingface library. Part of the evaluation script is adapted from the SGCP repository, and AESOP is implemented using huggingface.

Dependencies

pip install -r requirement.txt

Please download all the required tools, data (our preprocessed data) and software (e.g., Stanford CoreNLP) from https://drive.google.com/file/d/1MP9k48BuBCdAPhWXjfuq7Cl9b9ZIdAPB/view?usp=sharing

Unzipping this file should give you:

  1. evaluation: contains the script, software and dependencies that are necessary for evaluating the model performance.
  2. pretrained-models: contains pretrained models to replicate our results. h means height: h2 means trimming the parse tree at height 2. Please see Figure 2 in our paper for an example.
  3. raw-data: contains data for two datasets, QQP-Pos and ParaNMT-50M (small). We use the same split as SGCP.
  4. processed-data: the raw data after pre-processing, ready to feed into the huggingface transformer.

These unzipped directories should be put directly under the AESOP main directory.
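
To make the h2/h4 naming more concrete, here is a minimal Python sketch of trimming a bracketed constituency parse at a fixed depth. It is an illustration only, not the repository's preprocessing code: the function name `trim_parse` is hypothetical, and the convention that the root counts as depth 1 and that words are dropped is an assumption (the paper's Figure 2 defines the actual convention).

```python
def trim_parse(parse: str, max_depth: int) -> str:
    """Keep constituent labels down to max_depth (root = depth 1) and drop the words.

    Assumption: the parse is a Penn-Treebank-style bracketed string.
    """
    tokens = parse.replace("(", " ( ").replace(")", " ) ").split()
    kept = []
    depth = 0
    expect_label = False  # True right after "(", when the next token is a label
    for tok in tokens:
        if tok == "(":
            depth += 1
            expect_label = True
            if depth <= max_depth:
                kept.append("(")
        elif tok == ")":
            if depth <= max_depth:
                kept.append(")")
            depth -= 1
            expect_label = False
        elif expect_label:
            if depth <= max_depth:
                kept.append(tok)
            expect_label = False
        # any other token is a leaf word and is dropped
    # Re-join and tidy the spacing around the brackets.
    return " ".join(kept).replace("( ", "(").replace(" )", ")")


example = "(ROOT (S (NP (DT the) (NN cat)) (VP (VBD sat)) (. .)))"
trimmed = trim_parse(example, 3)
```

With this convention, a smaller depth keeps only the coarse sentence skeleton, while a larger depth exposes more of the phrase structure to the model.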

The two settings introduced below differ only at inference time.

Preprocessing

AESOP has two modes:

  • when the target syntactic parses are available from crowd-sourced exemplar sentences, we extract the syntactic parses from the exemplar sentences and use them as the target syntactic parses to guide the generation (set use_template to Y)
  • when the exemplar sentences are not available, we use the retrieval-based selection strategy to adaptively determine a set of target syntactic parses (set use_template to N)

python data-processing.py --input_dir raw-data/QQPPos --output_dir processed-data/QQPPos-hf-refine --use_template Y

python data-processing.py --input_dir raw-data/QQPPos --output_dir processed-data/QQPPos-hf-refine --use_template N

This generates all the necessary files for both datasets. Please see processed-data for what to expect, and move the directories to the proper locations if needed.

Table 1: target syntactic parse from exemplar sentences

  1. load the pretrained model and run inference: please fill in [output_file_path_...] based on your own development environment

    QQPPos

    python run_eval.py pretrained-models/qqppos-h4 processed-data/QQPPos-hf-refine/exemplar/level5/test.source [output_file_path_qqppos]

    ParaNMT

    python run_eval.py pretrained-models/paranmt-h4 processed-data/ParaNMT50-hf-refine/exemplar/level5/test.source [output_file_path_paranmt]

    ๐Ÿ“ this should give you two files with lines of {target syntactic parse} <sep> {paraphrase}

  2. we use a simple rule to extract generated paraphrases

    python extract_sentence.py --input_file [output_file_path_qqppos/paranmt]
    

    ๐Ÿ“ this should give you two files with {paraphrases}, and they should be the same as the ones in aesop-result/tab1-paranmt/qqppos-h4.txt

  3. last step to get the evaluation metrics as shown in our paper's Table 1

    • QQPPos
    python -m evaluation.eval -r raw-data/QQPPos/test/ref.txt -t raw-data/QQPPos/test/tgt.txt -i aesop-result/tab1-qqppos-h4.txt
    • ParaNMT
    python -m evaluation.eval -r raw-data/ParaNMT50m/test/ref.txt -t raw-data/ParaNMT50m/test/tgt.txt -i aesop-result/tab1-paranmt-h4.txt
  4. If you want to train those two models from scratch, please use these

    • QQPPos

      python finetune_trainer.py --data_dir processed-data/QQPPos-hf-refine/exemplar/level5 --learning_rate 3e-5 --warmup_steps 500  --num_train_epochs 25  --output_dir [output_dir] --max_source_length 512 --max_target_length 128 --do_train --overwrite_output --model_name_or_path facebook/bart-base --gradient_accumulation_steps 32 --save_total_limit 2
    • ParaNMT

      python finetune_trainer.py --data_dir processed-data/ParaNMT50-hf-refine/exemplar/level5 --learning_rate 3e-5 --warmup_steps 500  --num_train_epochs 25  --output_dir [output_dir] --max_source_length 512 --max_target_length 128 --do_train --overwrite_output --model_name_or_path facebook/bart-base --gradient_accumulation_steps 32 --save_total_limit 2
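
The extraction in step 2 can be pictured with a short Python sketch. It assumes each model output line has the form `{target syntactic parse} <sep> {paraphrase}` as described in step 1, and keeps the text after the last `<sep>`. The function name `extract_paraphrase` is hypothetical, and this is not necessarily the rule that extract_sentence.py implements.

```python
SEP = "<sep>"  # separator between the parse and the paraphrase, as shown in step 1


def extract_paraphrase(line: str) -> str:
    """Return the text after the last <sep>; if there is no <sep>, return the whole line."""
    # rpartition returns ('', '', line) when SEP is absent, so index 2 is safe either way.
    return line.rpartition(SEP)[2].strip()


sample = "(ROOT (S (NP ) (VP ) (. ))) <sep> how do i learn python ?"
paraphrase = extract_paraphrase(sample)
```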

Table 2: adaptive syntactic parse selection

  1. replicate the result when the ground-truth is not available: AESOP generates multiple paraphrases by adaptively selecting target syntactic parses. processed-data/QQPPos-hf-refine/diverse/level3.source is the file that AESOP generated at the time we tested. When you run the preprocessing script, it might give you a different file because of the randomness introduced by the sampling strategy.

    QQPPos

    python run_eval.py pretrained-models/qqppos-h2 processed-data/QQPPos-hf-refine/diverse/level3.source diverse-qqppos.txt --fp16
    

    ParaNMT

    python run_eval.py pretrained-models/paranmt-h2 processed-data/ParaNMT50-hf-refine/diverse/level3.source diverse-paranmt.txt --fp16
    

    ๐Ÿ“ "processed-data/ParaNMT50-hf-refine/diverse/level3.source"

    In the following, we use QQPPos as an illustration; for ParaNMT, just replace the QQPPos paths with the ParaNMT ones.

  2. as before, extract the paraphrases from the model output

    python extract_sentence.py --input_file diverse-qqppos.txt
    

    ๐Ÿ“ It should give you a file called diverse-qqppos_extract.txt, which should look

  3. This file contains 10 instances per example. We choose one among those using ROUGE scores in our work; you can also use other available metrics.

    python candidate_selection.py -gen_dir ./ -scbart_generate diverse-qqppos_extract.txt  -target processed-data/QQPPos-parse-hf-refine/diverse/level3.target -output_file diverse-l3-select
    
  4. after the selection, the file should look exactly the same as aesop-result/table2-qqppos-h2.txt

  5. Then you can get the metrics reported in Table 2, except TED@2, by running the evaluation script. Please note that TED@2 cannot be acquired from that evaluation script: the script compares the selected paraphrase with the ground-truth paraphrase, whereas the selected paraphrases come from our retrieved target syntactic parses, and TED@2 is defined as an average value between all target syntactic parses and generated paraphrases. It is computed separately:

    python ted2.py -i diverse-qqppos_extract.txt -s diverse-l3-select -t processed-data/QQPPos-hf-refine/diverse/level3.source
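
The selection in step 3 can be sketched as follows. This is a simplified stand-in, not candidate_selection.py: it scores candidates with a unigram-overlap F1 (a rough approximation of ROUGE-1) against one reference sentence per example, and it assumes the candidates are stored as consecutive groups of 10 lines. Which sentence the real script uses as the reference is not stated here, and the names `unigram_f1` and `select_best` are hypothetical.

```python
from collections import Counter
from typing import List


def unigram_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 on whitespace tokens (a rough stand-in for ROUGE-1)."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def select_best(candidates: List[str], references: List[str], group_size: int = 10) -> List[str]:
    """Pick the highest-scoring candidate from each consecutive group of group_size lines."""
    if len(candidates) != len(references) * group_size:
        raise ValueError("expected group_size candidates per reference")
    selected = []
    for i, ref in enumerate(references):
        group = candidates[i * group_size:(i + 1) * group_size]
        selected.append(max(group, key=lambda c: unigram_f1(c, ref)))
    return selected


# Tiny illustration with 2 candidates per example instead of 10.
picked = select_best(["a b c", "x y z", "q", "a b d"], ["a b c", "a b d"], group_size=2)
```

Swapping `unigram_f1` for another scoring function is enough to try a different selection metric.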
    

Use AESOP as a paraphrase tool in your project

If you are looking for a paraphrasing tool to generate paraphrases with diverse syntactic parses based on your input sentence only, please give AESOP a try!

# first parse the input sentences and generate the necessary file for running the model
python demo.py --output_dir demo_output
# run the generation model
python run_eval.py pretrained-models/paranmt-h2 demo_output/level3_paranmt.source demo_output/level3_result.txt --fp16
# extract sentences from the model output
python extract_sentence.py --input demo_output/level3_result.txt
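
If you prefer to drive these three steps from Python, a thin wrapper along the following lines may help. It only re-issues the commands shown above through `subprocess`. The helper names `demo_commands` and `run_demo` are hypothetical, and the sketch assumes that the file names inside the output directory stay the same when the directory is changed.

```python
import subprocess
from typing import List


def demo_commands(output_dir: str = "demo_output") -> List[List[str]]:
    """Build the three demo commands as argument lists (no shell quoting needed)."""
    source = f"{output_dir}/level3_paranmt.source"
    result = f"{output_dir}/level3_result.txt"
    return [
        ["python", "demo.py", "--output_dir", output_dir],
        ["python", "run_eval.py", "pretrained-models/paranmt-h2", source, result, "--fp16"],
        ["python", "extract_sentence.py", "--input", result],
    ]


def run_demo(output_dir: str = "demo_output") -> None:
    """Run the steps in order; check=True stops at the first failing step."""
    for cmd in demo_commands(output_dir):
        subprocess.run(cmd, check=True)


commands = demo_commands()
```

Call `run_demo()` from the AESOP main directory so that the relative paths resolve.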
