nexdata-ai / 513-hours-japanese-conversational-speech-data-by-telephone

513-Hours-Japanese-Conversational-Speech-Data-by-Telephone

Description

The 513 Hours - Japanese Conversational Speech dataset consists of natural telephone conversations among more than 800 native speakers, with a balanced gender ratio. Speakers chose a few familiar topics from a given list and then started their conversations, which keeps the dialogues fluent and natural. The recording device is a telephony recording system. The audio format is 8 kHz, 8-bit, uncompressed WAV, and all speech data was recorded in quiet indoor environments. All audio was manually transcribed with the text content, the start and end time of each effective sentence, and speaker identification. The sentence accuracy rate is ≥ 95%.
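As a rough sanity check on download and storage planning, the figures in the description (513 hours, 8 kHz, 8-bit) imply a raw sample payload that can be estimated directly. The sketch below assumes a single mono stream and ignores WAV headers and annotation files; the function name is illustrative and not part of any vendor tooling.

```python
# Figures taken from the description: 513 hours, 8 kHz sampling, 8-bit samples.
HOURS = 513
SAMPLE_RATE_HZ = 8_000
BYTES_PER_SAMPLE = 1  # 8-bit samples
CHANNELS = 1          # assumption: one mono stream, as listed under Format


def audio_payload_bytes(hours, sample_rate_hz, bytes_per_sample, channels):
    """Return the raw sample payload size in bytes (file headers excluded)."""
    seconds = hours * 3600
    return seconds * sample_rate_hz * bytes_per_sample * channels


total = audio_payload_bytes(HOURS, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE, CHANNELS)
print(f"{total:,} bytes (~{total / 1e9:.1f} GB)")
# prints: 14,774,400,000 bytes (~14.8 GB)
```

The delivered size may differ if the stated duration is rounded or if the files are packaged differently.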

For more details, please refer to the link: https://www.nexdata.ai/datasets/1409?source=Github

Format

8 kHz, 8-bit, u-law/a-law WAV, mono channel;
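One way to confirm that a delivered file matches this format is to read the `fmt ` chunk of its RIFF/WAVE header. The sketch below parses the header by hand with `struct`, because Python's standard `wave` module reads only PCM-encoded files. The helper names `read_wav_format` and `matches_dataset_format` are illustrative, and the example header is built in memory rather than taken from the dataset.

```python
import struct

# Format tags defined for RIFF/WAVE files.
FORMAT_TAGS = {0x0001: "pcm", 0x0006: "a-law", 0x0007: "u-law"}


def read_wav_format(data: bytes) -> dict:
    """Parse the 'fmt ' chunk from the leading bytes of a RIFF/WAVE file."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, pos + 8
            )
            return {
                "encoding": FORMAT_TAGS.get(tag, f"unknown(0x{tag:04x})"),
                "channels": channels,
                "sample_rate": rate,
                "bits_per_sample": bits,
            }
        pos += 8 + size + (size & 1)  # chunks are padded to even length
    raise ValueError("no 'fmt ' chunk found")


def matches_dataset_format(fmt: dict) -> bool:
    """Check the parsed header against the format listed above."""
    return (
        fmt["encoding"] in ("u-law", "a-law")
        and fmt["channels"] == 1
        and fmt["sample_rate"] == 8000
        and fmt["bits_per_sample"] == 8
    )


# Build a minimal u-law header in memory to exercise the parser.
fmt_body = struct.pack("<HHIIHH", 0x0007, 1, 8000, 8000, 1, 8)
riff_body = (
    b"WAVE"
    + b"fmt " + struct.pack("<I", len(fmt_body)) + fmt_body
    + b"data" + struct.pack("<I", 0)
)
example = b"RIFF" + struct.pack("<I", len(riff_body)) + riff_body

print(read_wav_format(example))
print(matches_dataset_format(read_wav_format(example)))
```

For a file on disk, passing the first few kilobytes read in binary mode is usually enough, since the `fmt ` chunk normally appears near the start of the file.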

Recording Environment

quiet indoor environment, without echo;

Recording content

dozens of topics are specified, and the speakers converse on those topics while being recorded;

Demographics

878 Japanese speakers, 46% male and 54% female;

Annotation

transcription text, start and end time of each effective sentence, speaker identification, and gender;

Device

Telephony recording system;

Language

Japanese;

Application scenarios

speech recognition; voiceprint recognition;
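Speech recognition pipelines often expect 16-bit linear PCM rather than 8-bit companded audio. As a minimal sketch, assuming a file's data chunk is u-law encoded, the standard G.711 mu-law expansion can be written in pure Python as below. The function names are illustrative; A-law files need a different expansion rule that is not shown here, and this is not the vendor's own tooling.

```python
import struct

BIAS = 0x84  # G.711 mu-law bias (132)


def ulaw_byte_to_linear(u: int) -> int:
    """Expand one 8-bit mu-law code to a signed linear sample."""
    u = ~u & 0xFF                 # mu-law bytes are stored bit-inverted
    exponent = (u >> 4) & 0x07    # 3-bit segment number
    mantissa = u & 0x0F           # 4-bit step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if u & 0x80 else magnitude


def ulaw_to_pcm16(payload: bytes) -> bytes:
    """Convert a mu-law payload to little-endian 16-bit PCM bytes."""
    samples = [ulaw_byte_to_linear(b) for b in payload]
    return struct.pack(f"<{len(samples)}h", *samples)


# Silence and the two extreme codes.
print([ulaw_byte_to_linear(b) for b in (0xFF, 0x80, 0x00)])
# prints: [0, 32124, -32124]
```

The output keeps the original 8 kHz sample rate; any resampling required by a downstream model is a separate step.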

Accuracy rate

≥ 95% sentence accuracy

Licensing Information

Commercial License
