Comments (2)
@agarwalchaitanya Could you try to comment out local/ami_prepare_dict.sh (line 120) in run.sh?
from neural_sp.
That removes the error, but it now fails somewhere within stage 0:
============================================================================
AMI
============================================================================
============================================================================
Data Preparation (stage:0)
============================================================================
+ dir=/home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads
+ mkdir -p /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads
+ echo 'Downloading annotations...'
Downloading annotations...
+ amiurl=http://groups.inf.ed.ac.uk/ami
+ annotver=ami_public_manual_1.6.1
+ annot=/home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/ami_public_manual_1.6.1
+ logdir=/home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads
+ mkdir -p /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/log
+ '[' '!' -f /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/ami_public_manual_1.6.1.zip ']'
+ '[' '!' -d /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/annotations ']'
+ '[' '!' -f /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/annotations/AMI-metadata.xml ']'
+ local/ami_xml2text.sh /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads
local/ami_xml2text.sh: line 19: [: openjdk version "11.0.10" 2021-01-19: integer expression expected
local/ami_xml2text.sh. Java not found. Will download exported version of transcripts.
--2021-02-11 17:16:05-- http://groups.inf.ed.ac.uk/ami/AMICorpusAnnotations/ami_manual_annotations_v1.6.1_export.gzip
Resolving groups.inf.ed.ac.uk (groups.inf.ed.ac.uk)... 129.215.202.26
Connecting to groups.inf.ed.ac.uk (groups.inf.ed.ac.uk)|129.215.202.26|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3725858 (3.6M) [application/x-troff-man]
Saving to: '/home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/ami_manual_annotations_v1.6.1_export.gzip'
/home/asr/neural_sp_assets/pr 100%[=================================================>] 3.55M 2.49MB/s in 1.4s
2021-02-11 17:16:07 (2.49 MB/s) - '/home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/ami_manual_annotations_v1.6.1_export.gzip' saved [3725858/3725858]
+ wdir=/home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations
+ '[' '!' -f /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts1 ']'
+ echo 'Preprocessing transcripts...'
Preprocessing transcripts...
+ local/ami_split_segments.pl /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts1 /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts2
+ for dset in train eval dev
+ grep -f local/split_train.orig /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts2
+ for dset in train eval dev
+ grep -f local/split_eval.orig /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts2
+ for dset in train eval dev
+ grep -f local/split_dev.orig /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts2
sdm
In total, 0 files were found.
Warning: expected 169 data data files, found 0
Usage: utils/validate_data_dir.sh [--no-feats] [--no-text] [--non-print] [--no-wav] [--no-spk-sort] <data-dir>
The --no-xxx options mean that the script does not require
xxx.scp to be present, but it will check it if it is present.
--no-spk-sort means that the script does not require the utt2spk to be
sorted by the speaker-id in addition to being sorted by utterance-id.
--non-print ignore the presence of non-printable characters.
By default, utt2spk is expected to be sorted by both, which can be
achieved by making the speaker-id prefixes of the utterance-ids
e.g.: utils/validate_data_dir.sh data/train
AMI sdm1 data preparation succeeded.
In total, 0 files were found.
local/ami_sdm_scoring_data_prep.sh. Applying following fixes to segments
s/^AMI_IB4004_SDM_MIO039_0036179_0036400 AMI_IB4004_SDM 361.79 364$/AMI_IB4004_SDM_MIO039_0036179_0036400 AMI_IB4004_SDM 362.28 364/;
convert2stm: Recording-id AMI_ES2011a_SDM not defined in reco2file_and_channel file /home/asr/neural_sp_assets/preprocessed_data/ami/sdm1/dev_orig/reco2file_and_channel at local/convert2stm.pl line 70.
Usage: utils/validate_data_dir.sh [--no-feats] [--no-text] [--non-print] [--no-wav] [--no-spk-sort] <data-dir>
The --no-xxx options mean that the script does not require
xxx.scp to be present, but it will check it if it is present.
--no-spk-sort means that the script does not require the utt2spk to be
sorted by the speaker-id in addition to being sorted by utterance-id.
--non-print ignore the presence of non-printable characters.
By default, utt2spk is expected to be sorted by both, which can be
achieved by making the speaker-id prefixes of the utterance-ids
e.g.: utils/validate_data_dir.sh data/train
AMI sdm1 scenario and dev set data preparation succeeded.
In total, 0 files were found.
convert2stm: Recording-id AMI_EN2002a_SDM not defined in reco2file_and_channel file /home/asr/neural_sp_assets/preprocessed_data/ami/sdm1/eval_orig/reco2file_and_channel at local/convert2stm.pl line 70.
Usage: utils/validate_data_dir.sh [--no-feats] [--no-text] [--non-print] [--no-wav] [--no-spk-sort] <data-dir>
The --no-xxx options mean that the script does not require
xxx.scp to be present, but it will check it if it is present.
--no-spk-sort means that the script does not require the utt2spk to be
sorted by the speaker-id in addition to being sorted by utterance-id.
--non-print ignore the presence of non-printable characters.
By default, utt2spk is expected to be sorted by both, which can be
achieved by making the speaker-id prefixes of the utterance-ids
e.g.: utils/validate_data_dir.sh data/train
AMI sdm1 scenario and eval set data preparation succeeded.
utils/data/get_utt2dur.sh: segments file does not exist so getting durations from wave files
utils/data/get_utt2dur.sh: successfully obtained utterance lengths from sphere-file headers
utils/data/get_utt2dur.sh: computed /home/asr/neural_sp_assets/preprocessed_data/ami/sdm1/train_orig/utt2dur
utils/data/modify_speaker_info.sh: copied data from /home/asr/neural_sp_assets/preprocessed_data/ami/sdm1/train_orig to /home/asr/neural_sp_assets/preprocessed_data/ami/train_sdm1, number of speakers changed from 0 to 0
Usage: utils/validate_data_dir.sh [--no-feats] [--no-text] [--non-print] [--no-wav] [--no-spk-sort] <data-dir>
The --no-xxx options mean that the script does not require
xxx.scp to be present, but it will check it if it is present.
--no-spk-sort means that the script does not require the utt2spk to be
sorted by the speaker-id in addition to being sorted by utterance-id.
--non-print ignore the presence of non-printable characters.
By default, utt2spk is expected to be sorted by both, which can be
achieved by making the speaker-id prefixes of the utterance-ids
e.g.: utils/validate_data_dir.sh data/train
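The first anomaly in the log above is the `integer expression expected` message from local/ami_xml2text.sh: the script appears to compare the whole `openjdk version "11.0.10" 2021-01-19` line as if it were a number, and then reports "Java not found" even though Java 11 is installed. It is non-fatal here, since the script falls back to downloading the exported transcripts, and it is probably not the cause of the later "0 files were found" messages. A minimal sketch of a more tolerant check follows. The function name `java_major_version` is hypothetical and is not part of neural_sp or Kaldi, and the sketch assumes the version line has the quoted form shown in the log.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of neural_sp or Kaldi): print the major Java
# version as a plain integer, given the version line of `java -version`.
java_major_version() {
    local line="$1" ver
    # Extract the quoted version string, e.g. 11.0.10 or 1.8.0_282.
    ver=$(printf '%s\n' "$line" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    # No quoted version found: signal failure instead of printing garbage.
    [ -n "$ver" ] || return 1
    case "$ver" in
        1.*)
            # Legacy 1.x scheme: 1.8.0_282 -> 8
            ver=${ver#1.}
            printf '%s\n' "${ver%%.*}"
            ;;
        *)
            # Modern scheme: 11.0.10 -> 11 (strip from the first non-digit on)
            printf '%s\n' "${ver%%[!0-9]*}"
            ;;
    esac
}

# Demo on the exact line from the log above.
java_major_version 'openjdk version "11.0.10" 2021-01-19'   # prints 11
```

On a live system the input would come from something like `java -version 2>&1 | head -n 1`, since `java -version` writes to stderr. The result can then be compared safely with `[ "$major" -ge 6 ]` or whatever minimum the script actually requires.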