Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

[Paper] [Data] [Model]

This work extends the latest ASR generative error correction (GER) benchmark to noise-robust ASR with the Robust HyPoradise dataset, and proposes a language-space denoising approach for GER that improves error correction under noisy conditions.

Conda Environment Configuration

Our code is built on lit-gpt. Please refer to the official tutorial to set up a conda environment, then install the required packages with the following command:

pip install -r requirements.txt

Code

Model code: lit_gpt/robust_ger.py;
Training script: finetune.sh;
Inference script: infer.sh;

To run the training or inference script, open the scripts (both the .sh files and the .py files they call) and change all absolute paths for the data, model, and experiment directories to your own (hint: search for "~/RobustGER"). Then run the .sh script directly with bash.
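The path edit described above can also be scripted rather than done by hand. The following is a minimal sketch, not part of the released code: the helper name retarget_paths is hypothetical, and it assumes the placeholder appears literally as "~/RobustGER" and that your own root directory contains no "|" or "&" characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): replace the placeholder root
# "~/RobustGER" with your own directory in each file passed as an argument.
# A .bak copy of every edited file is left next to it for safety.
retarget_paths() {
    local new_root="$1"
    shift
    local f
    for f in "$@"; do
        # Use '|' as the sed delimiter so slashes in paths need no escaping.
        sed -i.bak "s|~/RobustGER|${new_root}|g" "$f"
    done
}
```

For example, running grep -rn "~/RobustGER" finetune.sh infer.sh first lists the lines that would change. Then retarget_paths /your/own/root finetune.sh infer.sh rewrites them, after which bash finetune.sh or bash infer.sh can be run as usual. The .py files called by the scripts need the same treatment.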

Models

For LLMs, please refer to the tutorial for configuration steps; it supports many mainstream LLMs such as LLaMA-2;
For trained adapter checkpoints, please refer to our HuggingFace repo.

Dataset

We have released our Robust HyPoradise dataset on HuggingFace.

References

@inproceedings{hu2024large,
  title={Large Language Models are Efficient Learners of Noise-Robust Speech Recognition},
  author={Hu, Yuchen and Chen, Chen and Yang, Chao-Han Huck and Li, Ruizhe and Zhang, Chao and Chen, Pin-Yu and Chng, Eng Siong},
  booktitle={International Conference on Learning Representations},
  year={2024}
}

@inproceedings{chen2023hyporadise,
  title={HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models},
  author={Chen, Chen and Hu, Yuchen and Yang, Chao-Han Huck and Siniscalchi, Sabato Marco and Chen, Pin-Yu and Chng, Eng Siong},
  booktitle={Advances in Neural Information Processing Systems},
  year={2023}
}
