The 196 Hours - Urdu Conversational Speech Data was collected by telephone from 270 native speakers, with a balanced gender ratio. Speakers chose a few familiar topics from a given list and started conversations on them, which keeps the dialogues fluent and natural. The recording devices were various mobile phones. The audio format is 8 kHz, 8-bit WAV, and all speech data was recorded in quiet indoor environments. All audio was manually transcribed, with annotations for the text content, the start and end time of each effective sentence, and speaker identification.
For more details, please refer to the link: https://www.nexdata.ai/datasets/1242?source=Github
8 kHz, 8-bit, u-law/a-law PCM, mono channel;
quiet indoor environment, without echo;
dozens of topics are specified, and the speakers hold conversations on those topics while being recorded;
270 speakers in total, 56% male and 44% female;
annotations cover the transcription text, speaker identification, and gender;
Telephony recording system;
Urdu
speech recognition; voiceprint recognition;
the word accuracy rate is not less than 95%
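
The format line above lists 8 kHz, 8-bit u-law/a-law PCM, which most speech toolkits need expanded to 16-bit linear PCM before feature extraction. The following is a minimal sketch of the standard G.711 u-law expansion in pure Python. It assumes you already have the raw 8-bit sample bytes in memory (for example, the payload of a WAV data chunk) and that the file in question is u-law rather than a-law; this README does not say which companding law a given file uses. The function names `ulaw_to_pcm16`, `decode_ulaw`, and `duration_seconds` are illustrative and not part of any dataset tooling.

```python
# Sketch only: G.711 u-law byte -> 16-bit linear PCM sample.
# Assumes the input bytes are u-law encoded; a-law needs a different table.

ULAW_BIAS = 0x84      # bias added by the G.711 u-law encoder
SAMPLE_RATE_HZ = 8000  # sampling rate stated in this README


def ulaw_to_pcm16(code: int) -> int:
    """Expand one 8-bit u-law code (0..255) to a signed linear sample."""
    code = ~code & 0xFF                       # u-law codes are stored bit-inverted
    magnitude = ((code & 0x0F) << 3) + ULAW_BIAS
    magnitude <<= (code & 0x70) >> 4          # apply the 3-bit segment (exponent)
    if code & 0x80:                           # sign bit set -> negative sample
        return ULAW_BIAS - magnitude
    return magnitude - ULAW_BIAS


def decode_ulaw(payload: bytes) -> list[int]:
    """Expand a buffer of u-law bytes to a list of linear samples."""
    return [ulaw_to_pcm16(b) for b in payload]


def duration_seconds(payload: bytes, sample_rate: int = SAMPLE_RATE_HZ) -> float:
    """Duration of a mono buffer with one byte per sample."""
    return len(payload) / sample_rate
```

Because the audio is mono with one byte per sample, `duration_seconds` is simply the byte count divided by the 8000 Hz sampling rate; one second of audio therefore occupies 8000 bytes before any container overhead.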
Commercial License