

Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation

This repository, a fork of the original tin-slt, has been created to support the study of intelligent models for sign language translation (SLT) as part of the Master's Thesis (TFM) carried out for the Master in Computer Engineering (MII) at the University of Seville.

Originally created by Yong Cao, Wei Li, Xianzhi Li, Min Chen, Guangyong Chen, Zhengdao Li, Long Hu, Kai Hwang.

1. Introduction

This repository is for our Findings of NAACL 2022 paper 'Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation'. In this paper, we propose a task-aware instruction network, namely TIN-SLT, for sign language translation, by introducing the instruction module and the learning-based feature fuse strategy into a Transformer network. In this way, the pre-trained model's language ability can be well explored and utilized to further boost the translation performance. Moreover, by exploring the representation space of sign language glosses and target spoken language, we propose a multi-level data augmentation scheme to adjust the data distribution of the training set. We conduct extensive experiments on two challenging benchmark datasets, PHOENIX-2014-T and ASLG-PC12, on which our method outperforms former best solutions by 1.65 and 1.42 in terms of BLEU-4.
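
As a rough, hypothetical illustration of the learning-based feature fusion idea (not the paper's actual implementation; the class and variable names below are ours), a gated fusion layer can blend a pre-trained embedding with a task-trained embedding:

# Hypothetical sketch of a learned (gated) fusion of pre-trained and task-trained
# token embeddings, in the spirit of TIN-SLT's feature-fuse strategy.
# Shapes and names are illustrative only.
import torch
import torch.nn as nn

class GatedEmbeddingFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # The gate decides, per dimension, how much of each source to keep.
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, pretrained_emb, task_emb):
        # pretrained_emb, task_emb: (batch, seq_len, dim)
        g = torch.sigmoid(self.gate(torch.cat([pretrained_emb, task_emb], dim=-1)))
        return g * pretrained_emb + (1.0 - g) * task_emb

# Toy usage: fuse two random embedding tensors of matching shape.
fusion = GatedEmbeddingFusion(dim=512)
fused = fusion(torch.randn(2, 10, 512), torch.randn(2, 10, 512))
print(fused.shape)  # torch.Size([2, 10, 512])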

2. Dataset and Trained models

  • The dataset can be downloaded from Google Drive.
  • The original trained model can be downloaded from Google Drive. If the trained model doesn't work or if there are any issues, please feel free to contact us.
  • The original pre-trained model can be downloaded from bert-base-german-dbmdz-uncased.
  • Our best model (different from the original), found after performing hyperparameter scanning: our best TIN-SLT.

3. Execute Steps

These steps may differ from the original ones in tin-slt:

Step 1 prepare environment

Clone the repository

git clone https://github.com/manromero/TIN-SLT
cd TIN-SLT

Create a new virtual environment using python 3.6

# create a new virtual environment using python 3.6
virtualenv --python=python3 venv
# If you have more than one Python installed, you can point to the interpreter explicitly.
# ex: `virtualenv --python=C:\Users\migu1\AppData\Local\Programs\Python\Python36\python.exe venv`
# Activate linux:
source venv/bin/activate
# Activate windows:
.\venv\Scripts\activate
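
Optionally, as a quick sanity check (not part of the original instructions), you can confirm that the interpreter inside the virtual environment is Python 3.6:

python
>>> import sys
>>> print(sys.version)
# Expect something like 3.6.x; otherwise recreate the environment with Python 3.6.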

Step 2 install dependencies

pip install --editable .      

Note: if the download speed is slow, try this:

pip install --editable . -i https://pypi.tuna.tsinghua.edu.cn/simple   

Verify that torch has been installed correctly

python
>>> import torch
>>> torch.cuda.is_available()
# True -> The installation completed successfully and the graphics card can be used for training.
# False -> Even though the installation completed, the graphics card cannot be used, which will cause errors during training.

If it returns False:

  1. Make sure you have configured CUDA and cuDNN correctly. An example configuration for Windows 11 is available here.
  2. Reinstall PyTorch using the commands available on the official PyTorch website, removing the installed version beforehand.

The code was originally implemented with Python 3.6.8 and PyTorch 1.5.0+cu101 (it has not been tested with other package versions).

pip uninstall torch
pip install torch==1.5.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html

Step 3 Prepare dataset

To facilitate organization, prepare the following folder structure:

  • Create a folder "dataset" in the root of the repository.
  • Create the folders "raw" and "processed" inside the "dataset" folder.

Extract the Google Drive dataset into these folders.
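
If you prefer to create the folders from Python (a small optional helper; the paths simply match the layout described above):

# Optional helper to create the expected folder layout from the repository root.
import os

for folder in ("dataset/raw", "dataset/processed"):
    os.makedirs(folder, exist_ok=True)
    print("created", folder)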

If you use our prepared dataset, you can skip this step. Otherwise, configure "preprocessing/prepare_data.py" with the path of the dataset you are going to use, and then run:

cd preprocessing      
python prepare_data.py

You will then find the generated files in your destination directory.
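
As a quick check (the exact file names depend on the dataset and on how you configured prepare_data.py), you can list the contents of the destination directory:

# Hypothetical quick check: list whatever prepare_data.py wrote to the destination
# directory. The path below assumes the folder layout described in Step 3.
import os

processed_dir = "dataset/processed"
for name in sorted(os.listdir(processed_dir)):
    print(name)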

Step 4 Train by AutoML

Choose one training method (with / without automl).

Important: "Train without AutoML" was the method used to train the model in the research work carried out for the Master's Thesis. We keep the remaining instructions from the original repository, although they have not been tested and may require additional steps.

(1) Without AutoML

For Linux users:

cd trainer
# please configure train.sh first, and then:
sh train.sh

For Windows users:

This script has been created, based on the original documentation, to make it easier for Windows users to train the model.

cd trainer
# please configure train_windows.py first, and then:
python train_windows.py

Runtime: approximately 2 hours 52 minutes (using an NVIDIA GeForce RTX 3070).

(2) With AutoML

Note: Not tested for the Master's Thesis.

Configure the automl/config.yml and automl/search_space.json files.
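
For reference, an NNI search space is a JSON file that maps each hyperparameter name to a sampling rule. The snippet below writes a minimal, hypothetical example (parameter names and ranges are placeholders; the repository's automl/search_space.json is the authoritative version):

# Minimal, hypothetical NNI search space written as JSON. Parameter names and
# ranges are placeholders only.
import json

search_space = {
    "lr": {"_type": "loguniform", "_value": [1e-5, 1e-3]},
    "dropout": {"_type": "uniform", "_value": [0.1, 0.5]},
    "batch_size": {"_type": "choice", "_value": [16, 32, 64]},
}

with open("automl/search_space_example.json", "w") as f:
    json.dump(search_space, f, indent=2)

Once both files are configured, run the following command in your terminal to start the training: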

nnictl create --config automl/config.yml -p 11111
# -p sets the port used to visualize the training process in the browser.

If it succeeds, you should see the corresponding logs in your terminal.

Go to your browser to see the training process.

Please refer to NNI Website for more instructions on NNI.

Step 5 Evaluate

For Linux users:

cd postprocessing       
sh get_bleu4.sh

For Windows users:

This script has been created, based on the original documentation, to make it easier for Windows users to evaluate the model.

cd postprocessing
# please configure get_bleu4_stmc_windows.py first, and then:
python get_bleu4_stmc_windows.py
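
If you want a rough, stand-alone BLEU-4 check outside the provided scripts (this is not the metric implementation used by the repository, and the file names below are placeholders), NLTK's corpus_bleu can be applied to plain-text hypothesis and reference files:

# Hypothetical stand-alone BLEU-4 check using NLTK; the get_bleu4 scripts above
# remain the reference implementation. File paths are placeholders.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

with open("hypotheses.txt", encoding="utf-8") as f:
    hyps = [line.strip().split() for line in f]
with open("references.txt", encoding="utf-8") as f:
    refs = [[line.strip().split()] for line in f]  # one reference per hypothesis

bleu4 = corpus_bleu(refs, hyps, weights=(0.25, 0.25, 0.25, 0.25),
                    smoothing_function=SmoothingFunction().method1)
print("BLEU-4: {:.2f}".format(100 * bleu4))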

4. Unified flow and hyperparameter scanning

To facilitate the execution of the flow in a unified way, while enabling hyperparameter scanning, the Python file "scan.py" has been created (not available in the original repository). In it, we can configure a grid of the hyperparameter combinations we want to test. Once configured, the complete training, inference and evaluation flow will be executed for each combination; a sketch of the idea is shown after the command below.

python scan.py
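
Purely as an illustration of what the scan does (scan.py in this repository is the real entry point; the parameter names and the run_flow helper below are hypothetical), a grid scan boils down to iterating over the Cartesian product of the configured hyperparameter values and launching the full flow for each combination:

# Illustrative sketch of a hyperparameter grid scan. Names (run_flow, grid keys)
# are hypothetical; the repository's scan.py is the actual implementation.
import itertools

grid = {
    "lr": [1e-4, 5e-4],
    "dropout": [0.1, 0.3],
}

def run_flow(params):
    # Placeholder: in scan.py this launches training, inference and BLEU-4
    # evaluation for one hyperparameter combination.
    print("running flow with", params)

keys = list(grid)
for values in itertools.product(*(grid[k] for k in keys)):
    run_flow(dict(zip(keys, values)))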

Once we have found the model with the best metrics, we can refine the selection of the "beam size" hyperparameter using the "beam_search.py" script.

python beam_search.py

5. Questions

Please contact [email protected].

6. Some problems you may encounter:

1. During dependency installation: "Cannot open include file: 'basetsd.h': No such file or directory"

Install the Windows SDK (winsdksetup).

2. 'ascii' codec can't decode byte 0xef

UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1622: ordinal not in range(128)

Please run these commands in your terminal:

export LC_ALL=C.UTF-8
source ~/.bashrc

3. Resource punkt not found. / Resource wordnet not found.

Please run the following in a Python interpreter:

python
  >>> import nltk
  >>> nltk.download('wordnet')
  >>> nltk.download('punkt')
