joungheekim / k-wav2vec

82.0 82.0 15.0 493 KB

License: Apache License 2.0

Dockerfile 0.05% Python 95.69% Cython 0.49% Cuda 1.97% C++ 0.29% Shell 1.51%

k-wav2vec's Introduction

"TRUST me I'm an Engineer"

Jounghee Kim 👋

2020-2022 M.S. Student at Data Science & Business Analytics Lab., School of Industrial Management Engineering, Korea University.
2015-2020 Senior Engineer at SK holdings C&C., Korea.

k-wav2vec's People

Contributors

Stargazers

Watchers

Forkers

ishine wsr692 greatnoble jaehwlee theokjo seonwhee-genome donghwa-kim chlee10 rafle0 jackie-wx hololee seokjin1013 dongwon00kim kang7367 jaeyoon2250

k-wav2vec's Issues

How can I run multi-GPU training for both pretraining and fine-tuning?

python -W ignore fairseq_cli/hydra_train.py \
  task.data=path_to_data \
  checkpoint.save_dir=save_dir \
  task.del_silence=True \
  model.additional_layers=2 \
  model.w2v_path=path_to pretrain_weight \
  distributed_training.distributed_world_size=2 \
  --config-dir configs/finetune/add \
  --config-name 960h

With the command above, training gets stuck, without any error, at the following code

if not cfg.distributed_training.pipeline_model_parallel:
    self._criterion = self._criterion.to(device=self.device)
    self._model = self._model.to(device=self.device)

in fairseq's trainer.py, line 72.

What command should I use to train with multiple GPUs?

About ConfigKeyError

Hello,
I am writing to ask for help. I am trying to load a .pt file as shown below, but I get a ConfigKeyError.
Is this the right way to load it?

import fairseq
cp_path = "wav2vec/pre_models/checkpoint_best.pt"
model, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([cp_path])
model = model[0]
model.eval()

[Error]
ConfigKeyError: Key 'eval_wer' not in 'AudioPretrainingConfig'
full_key: eval_wer
reference_type=Optional[AudioPretrainingConfig]
object_type=AudioPretrainingConfig
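
The error message says that the configuration stored with the checkpoint contains a key (eval_wer) that the AudioPretrainingConfig schema of the installed fairseq does not declare. A minimal sketch of how such extra keys can be spotted follows. The dataclass and the dictionary here are hypothetical stand-ins written for illustration; the real schema comes from fairseq and the real saved config from the checkpoint file.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class StandInAudioPretrainingConfig:
    """Hypothetical stand-in for the task config schema of the installed fairseq."""
    data: Optional[str] = None
    sample_rate: int = 16000
    normalize: bool = False


def unknown_keys(saved_cfg: dict, schema) -> list:
    """Return the keys of the saved config that the schema dataclass does not declare."""
    declared = {f.name for f in fields(schema)}
    return sorted(k for k in saved_cfg if k not in declared)


# Hypothetical task section of a checkpoint's saved config.
saved_task_cfg = {"data": "path_to_data", "sample_rate": 16000, "eval_wer": True}

extra = unknown_keys(saved_task_cfg, StandInAudioPretrainingConfig)
print(extra)
```

If extra keys show up, one possible cause (an assumption, not confirmed in this thread) is that the checkpoint was saved with a fairseq version or task definition different from the one that is loading it.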

Pretrained Model Checkpoints

Hi. I was wondering whether the pretrained checkpoint for K-wav2vec 2.0 was going to be released (the one that was further pretrained on the English checkpoint).

Thanks in advance.

Implementation of Character-level Error Rate (CER) metric

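Hello. Thank you for sharing your great research.

I am asking because it is not clear to me which part of the code computes the CER scores reported in the paper.

My guess is that it is the following part of inference.py:

https://github.com/JoungheeKim/K-wav2vec/blob/main/inference/beam_search.py#L227

Is it correct that the editdistance library's editdistance.eval() function is used, as at that line, and that its inputs are two strings normalized at the syllable level?

Thank you.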
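
The computation described in this question can be sketched as follows. This is not the repository's implementation: the edit distance is a pure-Python stand-in for editdistance.eval(), and the normalization (NFC composition so that each Hangul syllable is one code point, plus removal of spaces) is an assumption made for illustration.

```python
import unicodedata


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance; a pure-Python stand-in for editdistance.eval()."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize_syllables(text: str) -> str:
    """Assumed normalization: compose to NFC and drop spaces."""
    return unicodedata.normalize("NFC", text).replace(" ", "")


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by the reference length."""
    ref_n, hyp_n = normalize_syllables(ref), normalize_syllables(hyp)
    return edit_distance(ref_n, hyp_n) / max(len(ref_n), 1)


score = cer("안녕하세요", "안녕하세오")
print(score)
```

Here one of the five reference syllables is wrong, so the score is 0.2. Whether spaces count as characters changes the result, so that choice should be checked against the linked code.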

KeyError: 'audio_multitraining' error

Hello, and first of all thank you for sharing this great code.
I have a question about an error that occurs when running bash script/inference/evaluate_multimodel.sh.

Traceback (most recent call last):
  File "inference/beam_search.py", line 902, in <module>
    wer, cer, swer = cli_main()
  File "inference/beam_search.py", line 835, in cli_main
    args = options.parse_args_and_arch(parser)
  File "/usr/local/lib/python3.8/dist-packages/fairseq/options.py", line 158, in parse_args_and_arch
    TASK_REGISTRY[args.task].add_args(parser)
KeyError: 'audio_multitraining'

Do you know what causes this error?
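
The traceback shows a dictionary lookup, TASK_REGISTRY[args.task], failing inside the fairseq copy under /usr/local/lib/python3.8/dist-packages. The toy registry below illustrates the mechanism: a task name exists in the registry only if the module that registers it has been imported. Everything in this sketch is a simplified stand-in rather than fairseq's own code, and the reading that the installed fairseq lacks the repository's audio_multitraining task is an assumption.

```python
TASK_REGISTRY = {}  # toy stand-in for fairseq's task registry


def register_task(name):
    """Toy decorator: store the class under `name` when its module is imported."""
    def wrapper(cls):
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper


@register_task("audio_pretraining")
class AudioPretrainingTask:
    """Stands in for a task that is registered in this process."""


def lookup(task_name):
    """Mimic the failing lookup, but return a readable message when the name is missing."""
    try:
        return TASK_REGISTRY[task_name]
    except KeyError:
        return f"task {task_name!r} is not registered; known tasks: {sorted(TASK_REGISTRY)}"


message = lookup("audio_multitraining")
print(message)
```

Printing the keys of the real registry in the failing environment would show in the same way whether audio_multitraining was ever registered.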

Huggingface FeatureExtractor load error

Hello.
Thank you very much for sharing your excellent code and pre-trained models.

I downloaded the Korean pre-trained Hugging Face model listed in the README onto a server and am trying it out.
The model itself loads fine from the local path,
but when I load the feature extractor with .from_pretrained(local_path), I keep getting the following OSError:

... make sure {local_path} is the correct path to a directory containing a preprocessor_config.json file

The model archive you shared contains only config.json and pytorch_model.bin.
Should the feature extractor simply be loaded from an English wav2vec 2.0 checkpoint on the Hugging Face Hub, or from another Korean wav2vec 2.0 checkpoint?

Thank you.
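
The OSError asks for a preprocessor_config.json in the model directory. One possible workaround, sketched below, is to write that file by hand. The values are assumptions: they mirror the default arguments of transformers' Wav2Vec2FeatureExtractor and are not taken from the K-wav2vec release, so do_normalize and return_attention_mask in particular should be checked against how the model was trained.

```python
import json
import tempfile
from pathlib import Path

# Assumed values, copied from the defaults of Wav2Vec2FeatureExtractor;
# they are NOT confirmed for the K-wav2vec checkpoints.
ASSUMED_PREPROCESSOR_CONFIG = {
    "feature_extractor_type": "Wav2Vec2FeatureExtractor",
    "feature_size": 1,
    "sampling_rate": 16000,
    "padding_value": 0.0,
    "padding_side": "right",
    "do_normalize": True,
    "return_attention_mask": False,
}


def write_preprocessor_config(model_dir) -> Path:
    """Write preprocessor_config.json next to config.json and pytorch_model.bin."""
    path = Path(model_dir) / "preprocessor_config.json"
    path.write_text(json.dumps(ASSUMED_PREPROCESSOR_CONFIG, indent=2))
    return path


# Demo on a temporary directory that stands in for local_path.
with tempfile.TemporaryDirectory() as local_path:
    written = write_preprocessor_config(local_path)
    loaded = json.loads(written.read_text())
```

With the file in place, .from_pretrained(local_path) has a preprocessor_config.json to read.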

Error when further pre-training the model

I am trying to run further pre-training with bash script/pretrain/run_further_pretrain.sh, but the following error occurs.

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.HalfTensor [8, 768, 180]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

How can I resolve this?

How to do ASR with this repo

Hi,
I downloaded the fairseq checkpoint linked in the README.

I believe inference/beam_search.py is for ASR, so I tried running bash script/inference/evaluate_multimodel.sh, but the files expected under MANIFEST_PATH are missing (does it also contain a dictionary?).

Would you be willing to share the MANIFEST files?

Thank you.
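
For reference, stock fairseq wav2vec manifests are .tsv files whose first line is the audio root directory and whose remaining lines are a relative path and a sample count separated by a tab. The sketch below builds such a file with the standard library only. It assumes this repository follows the stock layout, which the thread does not confirm; the evaluation script may also need a dictionary and transcript files that are not covered here.

```python
import tempfile
import wave
from pathlib import Path


def write_silent_wav(path: Path, n_frames: int, sample_rate: int = 16000) -> None:
    """Create a mono 16-bit WAV of silence, only so the demo has files to list."""
    with wave.open(str(path), "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * n_frames)


def build_manifest(root: Path, out_tsv: Path) -> list:
    """Write the root directory on line 1, then one 'relative path<TAB>frame count' line per WAV."""
    lines = [str(root.resolve())]
    for wav_path in sorted(root.rglob("*.wav")):
        with wave.open(str(wav_path), "rb") as w:
            n_frames = w.getnframes()
        lines.append(f"{wav_path.relative_to(root).as_posix()}\t{n_frames}")
    out_tsv.write_text("\n".join(lines) + "\n")
    return lines


# Demo: two silent files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "audio"
    root.mkdir()
    write_silent_wav(root / "a.wav", 16000)
    write_silent_wav(root / "b.wav", 8000)
    manifest = build_manifest(root, Path(tmp) / "test.tsv")
```

The file name test.tsv and the directory layout are hypothetical; the script's MANIFEST_PATH determines what is actually expected.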

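Question about fine-tuning

The Hugging Face model has a vocab_size of 32,
so it appears to use the vocab list provided by fairseq plus 4 special tokens.

If so, since KsponSpeech is written entirely in Hangul, did you convert the data for fine-tuning with a Hangul grapheme-to-alphabet transliteration scheme?

Also, Konglish words and numbers appear in dual forms such as (1/하나). Did you preprocess all of these so that one of the two forms is selected?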
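
The two preprocessing steps this question asks about can be sketched as follows. This is not the repository's preprocessing: the regular expression accepts both the (1/하나) form quoted above and a (1)/(하나) form, which is an assumption, and the grapheme step only splits syllables into jamo using Unicode arithmetic. No mapping from jamo to alphabet letters is shown because the thread does not specify one.

```python
import re

# Dual transcription as written in the question, "(1/하나)", plus the
# "(1)/(하나)" form; accepting both is an assumption.
DUAL = re.compile(r"\(([^()/]+)\)/\(([^()/]+)\)|\(([^()/]+)/([^()/]+)\)")

CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")            # 19 initial consonants
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")        # 21 vowels
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # none + 27 finals


def pick_side(text: str, use_pronunciation: bool = True) -> str:
    """Replace each dual transcription with its second (spoken) or first (written) side."""
    def repl(m):
        first = m.group(1) or m.group(3)
        second = m.group(2) or m.group(4)
        return second if use_pronunciation else first
    return DUAL.sub(repl, text)


def to_jamo(text: str) -> str:
    """Split precomposed Hangul syllables (U+AC00..U+D7A3) into compatibility jamo."""
    out = []
    for ch in text:
        code = ord(ch) - 0xAC00
        if 0 <= code < 11172:
            # index = (initial * 21 + vowel) * 28 + final
            out.append(CHOSEONG[code // 588]
                       + JUNGSEONG[(code % 588) // 28]
                       + JONGSEONG[code % 28])
        else:
            out.append(ch)  # leave non-Hangul characters unchanged
    return "".join(out)


spoken = pick_side("사과 (1/하나) 주세요")
jamo = to_jamo("하나")
```

Passing use_pronunciation=False keeps the written form instead, so the same function covers either choice.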

