

Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation

This repository, a fork of the original tin-slt, has been created to support the study of intelligent models for sign language translation (SLT) as part of the Master's Thesis (TFM) carried out for the Master in Computer Engineering (MII) at the University of Seville.

Originally created by Yong Cao, Wei Li, Xianzhi Li, Min Chen, Guangyong Chen, Zhengdao Li, Long Hu, Kai Hwang.

1. Introduction

This repository is for our Findings of NAACL 2022 paper 'Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation'. In this paper, we propose a task-aware instruction network, namely TIN-SLT, for sign language translation, by introducing the instruction module and the learning-based feature fuse strategy into a Transformer network. In this way, the pre-trained model's language ability can be well explored and utilized to further boost the translation performance. Moreover, by exploring the representation space of sign language glosses and target spoken language, we propose a multi-level data augmentation scheme to adjust the data distribution of the training set. We conduct extensive experiments on two challenging benchmark datasets, PHOENIX-2014-T and ASLG-PC12, on which our method outperforms former best solutions by 1.65 and 1.42 in terms of BLEU-4.
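
As a rough, hypothetical illustration of the learning-based feature fusion idea (not the paper's actual implementation; the class and variable names below are ours), a gated fusion layer can blend a pre-trained embedding with a task-trained embedding:

# Hypothetical sketch of a learned (gated) fusion of pre-trained and task-trained
# token embeddings, in the spirit of TIN-SLT's feature-fuse strategy.
# Shapes and names are illustrative only.
import torch
import torch.nn as nn

class GatedEmbeddingFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # The gate decides, per dimension, how much of each source to keep.
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, pretrained_emb, task_emb):
        # pretrained_emb, task_emb: (batch, seq_len, dim)
        g = torch.sigmoid(self.gate(torch.cat([pretrained_emb, task_emb], dim=-1)))
        return g * pretrained_emb + (1.0 - g) * task_emb

# Toy usage: fuse two random embedding tensors of matching shape.
fusion = GatedEmbeddingFusion(dim=512)
fused = fusion(torch.randn(2, 10, 512), torch.randn(2, 10, 512))
print(fused.shape)  # torch.Size([2, 10, 512])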

2. Dataset and Trained models

  • The dataset can be downloaded from Google Drive.
  • The original trained model can be downloaded from Google Drive. If the trained model doesn't work or if there are any issues, please feel free to contact us.
  • The original pre-trained model can be downloaded from bert-base-german-dbmdz-uncased.
  • Our best model (different from the original), found after performing hyperparameter scanning: our best TIN-SLT.

3. Execute Steps

These steps may differ from the original ones in tin-slt:

Step 1 prepare environment

Clone the repository

git clone https://github.com/manromero/TIN-SLT
cd TIN-SLT

Create a new virtual environment using python 3.6

# create a new virtual environment using python 3.6
virtualenv --python=python3 venv
# If you have more than one Python installed, you can point to the interpreter explicitly.
# ex: `virtualenv --python=C:\Users\migu1\AppData\Local\Programs\Python\Python36\python.exe venv`
# Activate linux:
source venv/bin/activate
# Activate windows:
.\venv\Scripts\activate
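
Optionally, as a quick sanity check (not part of the original instructions), you can confirm that the interpreter inside the virtual environment is Python 3.6:

python
>>> import sys
>>> print(sys.version)
# Expect something like 3.6.x; otherwise recreate the environment with Python 3.6.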

Step 2 install dependencies

pip install --editable .      

Note: if the download speed is slow, try this:

pip install --editable . -i https://pypi.tuna.tsinghua.edu.cn/simple   

Verify that torch has been installed correctly

python
>>> import torch
>>> torch.cuda.is_available()
# True -> The installation completed successfully and the graphics card can be used for training.
# False -> Even though the installation completed, the graphics card cannot be used, which will cause errors during training.

If it returns False:

  1. Make sure you have configured CUDA and cuDNN correctly. An example configuration for Windows 11 is available here.
  2. Reinstall PyTorch using the commands available on the official PyTorch website, removing the installed version beforehand.

The code was originally implemented with Python 3.6.8 and PyTorch 1.5.0+cu101 (it has not been tested with other package versions).

pip uninstall torch
pip install torch==1.5.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html

Step 3 Prepare dataset

To facilitate organization, prepare the following folder structure:

  • Create a folder "dataset" in the root of the repository.
  • Create the folders "raw" and "processed" inside the "dataset" folder.

Extract the Google Drive dataset into these folders.
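
If you prefer to create the folders from Python (a small optional helper; the paths simply match the layout described above):

# Optional helper to create the expected folder layout from the repository root.
import os

for folder in ("dataset/raw", "dataset/processed"):
    os.makedirs(folder, exist_ok=True)
    print("created", folder)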

If you use our prepared dataset, you can skip this step. Otherwise, configure "preprocessing/prepare_data.py" with the path of the dataset you are going to use, and then run:

cd preprocessing      
python prepare_data.py

You will then find the generated files in your destination directory.
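
As a quick check (the exact file names depend on the dataset and on how you configured prepare_data.py), you can list the contents of the destination directory:

# Hypothetical quick check: list whatever prepare_data.py wrote to the destination
# directory. The path below assumes the folder layout described in Step 3.
import os

processed_dir = "dataset/processed"
for name in sorted(os.listdir(processed_dir)):
    print(name)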

Step 4 Train by AutoML

Choose one training method (with / without automl).

Important: "Train without AutoML" was the method used to train the model in the research work carried out for the Master's Thesis. We keep the remaining instructions from the original repository, although they have not been tested and may require additional steps.

(1) Without AutoML

For Linux users:

cd trainer
# please configure train.sh first, and then:
sh train.sh

For Windows users:

This script has been created, based on the original documentation, to make it easier for Windows users to train the model.

cd trainer
# please configure train_windows.py first, and then:
python train_windows.py

Runtime: approximately 2 hours 52 minutes (using an NVIDIA GeForce RTX 3070).

(2) With AutoML

Note: Not tested for the Master's Thesis.

Configure the automl/config.yml and automl/search_space.json files.
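
For reference, an NNI search space is a JSON file that maps each hyperparameter name to a sampling rule. The snippet below writes a minimal, hypothetical example (parameter names and ranges are placeholders; the repository's automl/search_space.json is the authoritative version):

# Minimal, hypothetical NNI search space written as JSON. Parameter names and
# ranges are placeholders only.
import json

search_space = {
    "lr": {"_type": "loguniform", "_value": [1e-5, 1e-3]},
    "dropout": {"_type": "uniform", "_value": [0.1, 0.5]},
    "batch_size": {"_type": "choice", "_value": [16, 32, 64]},
}

with open("automl/search_space_example.json", "w") as f:
    json.dump(search_space, f, indent=2)

Once both files are configured, run the following command in your terminal to start the training: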

nnictl create --config automl/config.yml -p 11111
# -p sets the port used to visualize the training process in the browser.

If it succeeds, you should see the corresponding logs in your terminal.

Go to your browser to see the training process.

Please refer to NNI Website for more instructions on NNI.

Step 5 Evaluate

For Linux users:

cd postprocessing       
sh get_bleu4.sh

For Windows users:

This script has been created, based on the original documentation, to make it easier for Windows users to evaluate the model.

cd postprocessing
# please configure get_bleu4_stmc_windows.py first, and then:
python get_bleu4_stmc_windows.py
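
If you want a rough, stand-alone BLEU-4 check outside the provided scripts (this is not the metric implementation used by the repository, and the file names below are placeholders), NLTK's corpus_bleu can be applied to plain-text hypothesis and reference files:

# Hypothetical stand-alone BLEU-4 check using NLTK; the get_bleu4 scripts above
# remain the reference implementation. File paths are placeholders.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

with open("hypotheses.txt", encoding="utf-8") as f:
    hyps = [line.strip().split() for line in f]
with open("references.txt", encoding="utf-8") as f:
    refs = [[line.strip().split()] for line in f]  # one reference per hypothesis

bleu4 = corpus_bleu(refs, hyps, weights=(0.25, 0.25, 0.25, 0.25),
                    smoothing_function=SmoothingFunction().method1)
print("BLEU-4: {:.2f}".format(100 * bleu4))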

4. Unified flow and hyperparameter scanning

To facilitate the execution of the flow in a unified way, while enabling hyperparameter scanning, the Python file "scan.py" has been created (not available in the original repository). In it, we can configure a grid of the hyperparameter combinations we want to test. Once configured, the complete training, inference and evaluation flow will be executed for each combination; a sketch of the idea is shown after the command below.

python scan.py
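
Purely as an illustration of what the scan does (scan.py in this repository is the real entry point; the parameter names and the run_flow helper below are hypothetical), a grid scan boils down to iterating over the Cartesian product of the configured hyperparameter values and launching the full flow for each combination:

# Illustrative sketch of a hyperparameter grid scan. Names (run_flow, grid keys)
# are hypothetical; the repository's scan.py is the actual implementation.
import itertools

grid = {
    "lr": [1e-4, 5e-4],
    "dropout": [0.1, 0.3],
}

def run_flow(params):
    # Placeholder: in scan.py this launches training, inference and BLEU-4
    # evaluation for one hyperparameter combination.
    print("running flow with", params)

keys = list(grid)
for values in itertools.product(*(grid[k] for k in keys)):
    run_flow(dict(zip(keys, values)))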

Once we have found the model with the best metrics, we can refine the selection of the "beam size" hyperparameter using the "beam_search.py" script.

python beam_search.py

5. Questions

Please contact [email protected].

6. Some problems you may encounter:

1. During dependency installation: "Cannot open include file: 'basetsd.h': No such file or directory"

Install the Windows SDK (winsdksetup).

2. 'ascii' codec can't decode byte 0xef

UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1622: ordinal not in range(128)

Please run these commands in your terminal:

export LC_ALL=C.UTF-8
source ~/.bashrc

3. Resource punkt not found. / Resource wordnet not found.

Please run the following in a Python interpreter:

python
  >>> import nltk
  >>> nltk.download('wordnet')
  >>> nltk.download('punkt')
