This repo houses code for Tarteel's machine learning tasks. 🔬
Specifically, things like:
- Model selection
- Preprocessing of data
- Model training, validation, and iteration
- Demos
Code here is mostly experimental so check back regularly for updates.
If you found this repo helpful, please keep its contributors in your duaa.
🔥 To see our technology live in action, visit tarteel.io. 🔥
We use Python 3.7 for our development.
However, any Python above 3.6 should work.
For audio pre-processing, we use ffmpeg and ffprobe.
Make sure you install these using your system package manager.
Mac OS
brew install ffmpeg
Linux
sudo apt install ffmpeg
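To confirm that both tools ended up on your PATH after installing, a quick check along these lines may help. This is a minimal sketch and not part of the repo; the helper name missing_tools is hypothetical.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names of the given tools that cannot be found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are both available.")
```

If anything is reported missing, re-run the install command for your platform and open a new shell before trying again.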
Then install the Python dependencies from requirements.txt.
pip3 install -r requirements.txt
Use the -h/--help flag for more info on how to use each script.
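As an illustration of where help text like this typically comes from, here is a small sketch that assumes a script built on Python's standard argparse module. The scripts in this repo may be structured differently, and the --output-dir option shown is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical example script."""
    parser = argparse.ArgumentParser(
        description="Hypothetical example script (not part of this repo)."
    )
    # argparse adds -h/--help automatically; each option's help string
    # is what appears in that output.
    parser.add_argument(
        "--output-dir",
        default="data",
        help="Directory to write results to (hypothetical option).",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("Writing to", args.output_dir)
```

Running such a script with -h prints the description and the per-option help strings, then exits without doing any work.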
This repo is structured as follows:
Root
- download.py: Download the Tarteel dataset.
- create_train_test_split.py: Create train/test/validation split CSV files.
- generate_alphabet|vocabulary.py: Generate all unique letters/ayahs in the Quran in a text file.
- generate_csv_deepspeech.py: Create a CSV file for training with DeepSpeech.
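For intuition about the split step, the sketch below shuffles a list of items and cuts it into train, test, and validation parts. It is not the logic of create_train_test_split.py; the function name, the 80/10/10 ratios, and the fixed seed are assumptions chosen for illustration.

```python
import random


def split_items(items, train_frac=0.8, test_frac=0.1, seed=0):
    """Shuffle items and return (train, test, validation) lists.

    Whatever is left after the train and test portions goes to validation.
    """
    if train_frac < 0 or test_frac < 0 or train_frac + test_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    shuffled = list(items)
    # A seeded generator makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)

    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


if __name__ == "__main__":
    # Hypothetical recording file names, used only for this demonstration.
    recordings = ["recording_{}.wav".format(i) for i in range(20)]
    train, test, validation = split_items(recordings)
    print(len(train), len(test), len(validation))
```

Fixing the seed keeps the three sets stable between runs, which matters when comparing models trained at different times.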
Check out the wiki for instructions on how to download and pre-process the data, as well as how to start training models.
Check out CONTRIBUTING.md to start contributing to Tarteel-ML!