Takes an audio recording and transcribes it into a text document so that a user can annotate it. The app can then process the annotated document into audio clips and a document with hyperlinks to the clips.
Text-from-Audio-for-Justice
Clean up audio file using Audacity
Make tools:
Clean up audio file using pydub and sox (see video below)
Record file locations
Make segments from transcription file
Make training files from transcription and other files:
text: utt_id word1 word2 word3 ...
segments: utt_id file_id start_time end_time
wav.scp: file_id path/file
utt2spk: utt_id spkr
spk2utt: spkr utt_id1 utt_id2 utt_id3
Transcribe speaker files
Spike - compare original transcription with speaker transcription
Make output files (rts, Word, etc.) to be used for training and communication
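The five training files listed above (text, segments, wav.scp, utt2spk, spk2utt) can be written from a list of utterance records. The following is a minimal sketch, not the project's actual code: the function name `write_kaldi_data_dir`, the record keys (`utt_id`, `file_id`, `spkr`, `start`, `end`, `words`) and the two-decimal time format are assumptions made for illustration. Lines are sorted by their first column because Kaldi generally expects these files to be sorted.

```python
from collections import defaultdict
from pathlib import Path


def write_kaldi_data_dir(utterances, wav_paths, out_dir):
    """Write text, segments, wav.scp, utt2spk and spk2utt into out_dir.

    utterances: list of dicts with the (assumed) keys
                utt_id, file_id, spkr, start, end, words
    wav_paths:  dict mapping file_id -> path of the audio file
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Sort by utterance id so every per-utterance file comes out ordered.
    utts = sorted(utterances, key=lambda u: u["utt_id"])

    text, segments, utt2spk = [], [], []
    spk2utt = defaultdict(list)
    for u in utts:
        text.append(f"{u['utt_id']} {u['words']}")
        segments.append(
            f"{u['utt_id']} {u['file_id']} {u['start']:.2f} {u['end']:.2f}"
        )
        utt2spk.append(f"{u['utt_id']} {u['spkr']}")
        spk2utt[u["spkr"]].append(u["utt_id"])

    files = {
        "text": text,
        "segments": segments,
        "utt2spk": utt2spk,
        "wav.scp": [f"{fid} {path}" for fid, path in sorted(wav_paths.items())],
        "spk2utt": [f"{spk} {' '.join(ids)}" for spk, ids in sorted(spk2utt.items())],
    }
    for name, lines in files.items():
        (out / name).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```

Calling it with two records for one speaker and one recording produces one line per utterance in text, segments and utt2spk, and a single line each in wav.scp and spk2utt.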
Download Kaldi docker image (message/email for link)
Load the image with docker load
List Docker images and containers
Give Docker the maximum available resources
Run taj
Tools
taj transcribe
--audio_input_folder
--output_folder
The output folder contains:
wav.scp: chunk file paths (files extracted from segments)
text: (line for each chunk)
segments: (links text to chunk file and start and end time)
taj chunk_speaker
--audio_input_path
--speech_segmentation_path
--output_folder
taj clean_up
--audio_input_folder (original recording)
--audio_output_folder
taj convert
--type (one of rts, pdf, doc)
--online_folder (url of online folder)
--chunks_text_path
--output_folder
taj create_test_data
--input_folder
--output_folder
--audio_input_folder (original recording(s))
taj retrain
--input_folder
--audio_input_folder (original recording(s))
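The subcommands and options listed above map naturally onto `argparse` subparsers. The sketch below only illustrates that layout and is not the project's implementation: the `build_parser` name is hypothetical, and leaving every option optional (defaulting to `None`) is an assumption, since the list does not say which options are required.

```python
import argparse


def build_parser():
    """Build a parser mirroring the taj subcommands listed in this README."""
    parser = argparse.ArgumentParser(prog="taj")
    sub = parser.add_subparsers(dest="command", required=True)

    p = sub.add_parser("transcribe")
    p.add_argument("--audio_input_folder")
    p.add_argument("--output_folder")

    p = sub.add_parser("chunk_speaker")
    p.add_argument("--audio_input_path")
    p.add_argument("--speech_segmentation_path")
    p.add_argument("--output_folder")

    p = sub.add_parser("clean_up")
    p.add_argument("--audio_input_folder")   # original recording
    p.add_argument("--audio_output_folder")

    p = sub.add_parser("convert")
    p.add_argument("--type", choices=["rts", "pdf", "doc"])
    p.add_argument("--online_folder")        # URL of the online folder
    p.add_argument("--chunks_text_path")
    p.add_argument("--output_folder")

    p = sub.add_parser("create_test_data")
    p.add_argument("--input_folder")
    p.add_argument("--output_folder")
    p.add_argument("--audio_input_folder")   # original recording(s)

    p = sub.add_parser("retrain")
    p.add_argument("--input_folder")
    p.add_argument("--audio_input_folder")   # original recording(s)

    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this layout, `taj convert --type pdf --output_folder out` would parse into a namespace whose `command` is `convert`, and an unknown `--type` value would be rejected by `argparse` before any processing starts.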