river's Introduction

Project River

Automated Audio Content Analysis Using Convolutional Deep Neural Networks

River is a prototype built as part of my MSc thesis. It analyses pre-classified audio samples from radio broadcasts and uses them to train a Convolutional Neural Network that predicts these classes when presented with further samples from radio broadcasts. The aim of the research is to determine whether Convolutional Neural Networks are an appropriate method of classifying audio.

River uses Google's open-source Machine Learning library TensorFlow, together with Librosa to analyse audio signals.

Installation

virtualenv -p python3.6 env
source env/bin/activate
pip3 install -r requirements.txt

Creating the Dataset

River analyses audio files in WAVE (.wav) format with filenames like yyyymmdd-aaaa-bb.wav, where yyyymmdd is a date, aaaa is an integer index, and bb is the zero-indexed class number for the audio sample. Only the class number really matters, but the files should follow the format n-n-n.wav, with the last n being the class. The files should be split into a training set and a validation set, usually at an 80/20 ratio. This can be done by sampling the files:

Sampling the files

To sample at an 80/20 split, use a stride of 5, which will move every 5th file to the target directory. This creates an even distribution of files for the 20% split; the remainder is the 80%.

./tf sample_files.py --src=/path/to/wav/input/files/ --stride=5 --dest=/path/to/wav/output/files/
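
The stride-based split can be sketched in a few lines of Python. This is an illustration only, not the contents of sample_files.py: the function name split_by_stride, the sorted ordering, and the choice to select the 5th, 10th, ... files are all assumptions.

```python
from typing import List, Tuple


def split_by_stride(files: List[str], stride: int) -> Tuple[List[str], List[str]]:
    """Return (selected, remaining), where selected holds every stride-th file."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    ordered = sorted(files)  # assumption: files are processed in sorted order
    selected = [f for i, f in enumerate(ordered, start=1) if i % stride == 0]
    remaining = [f for i, f in enumerate(ordered, start=1) if i % stride != 0]
    return selected, remaining


# Ten hypothetical filenames in the n-n-n.wav format
files = [f"20170101-{i:04d}-0.wav" for i in range(10)]
valid, train = split_by_stride(files, stride=5)
print(len(valid), len(train))
```

With ten files and a stride of 5, two files (20%) are selected for the validation set and eight (80%) remain for training.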

Running

0) Plotting

./tf plot.py --type={wave|spec} --file=/path/to/file1.wav --file=/path/to/file2.wav

Waveform Plot of R&B Track

Spectrogram Plot of R&B Track

Waveform Plot of Speech

Spectrogram Plot of Speech

1) Encoding

Extract features from the audio files found in the train and valid subdirectories of the given directory. This creates the training and validation datasets in the data directory.

./tf extract.py --dir=/path/to/audio
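
Each example needs a label as well as features, and the label comes from the last field of the filename described under Creating the Dataset. A minimal sketch of how that field might be read follows. The regular expression and the function name class_from_filename are assumptions for illustration, not code taken from extract.py.

```python
import os
import re

# Assumed pattern for n-n-n.wav: three digit groups separated by hyphens
FILENAME_PATTERN = re.compile(r"^(\d+)-(\d+)-(\d+)\.wav$")


def class_from_filename(path: str) -> int:
    """Return the zero-indexed class number encoded as the last field."""
    name = os.path.basename(path)
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected filename format: {name}")
    return int(match.group(3))


# Hypothetical path; the class field "03" is read as the integer 3
print(class_from_filename("/path/to/audio/train/20170101-0001-03.wav"))
```

Rejecting filenames that do not match, rather than guessing a label, keeps mislabelled examples out of the dataset.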

2) Training

Use the training and validation datasets from the data directory to train a convolutional neural network. The trained model is saved in the model directory.

  • epochs - number of training iterations
  • batch - number of examples to supply in each training epoch
  • sample - how often the current cost is returned to the console

./tf train.py --epochs=2000 --batch=50 --sample=10
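
How the three options interact can be pictured with a schematic loop. This is a sketch under assumptions, not the logic of train.py: the helper names are hypothetical, batches are assumed to be drawn at random, and reporting is assumed to happen whenever the epoch number is a multiple of sample.

```python
import random
from typing import List, Sequence


def sample_batch(examples: Sequence, batch: int, rng: random.Random) -> List:
    """Pick up to `batch` examples at random, without replacement."""
    return rng.sample(list(examples), min(batch, len(examples)))


def reporting_epochs(epochs: int, sample: int) -> List[int]:
    """Epoch indices at which the cost would be printed."""
    return [epoch for epoch in range(epochs) if epoch % sample == 0]


rng = random.Random(0)
examples = list(range(120))  # stand-in for encoded training examples
for epoch in range(30):
    batch_x = sample_batch(examples, batch=50, rng=rng)
    # a real training step on batch_x would go here
    if epoch % 10 == 0:
        print(f"epoch {epoch}: batch of {len(batch_x)} examples")
```

Note that in this sketch a batch size larger than the dataset simply yields the whole dataset.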

Logs are written to the log subdirectory during training; these can be visualised with TensorBoard.

Evaluation

Run TensorBoard:

tensorboard --logdir=log/

View results at http://localhost:6006/

Accuracy Plot

Cross Entropy Plot

Auditioning Audio

To use the model built during the Training process to classify audio samples:

./tf audition.py --file=/path/to/filename.wav

...which should yield a class prediction with a confidence score. The filename does not need to follow any particular format, because the file is only encoded as a feature and does not need a label.
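
One common way to turn a network's raw outputs into a class prediction with a confidence score is to apply a softmax and take the largest probability. The sketch below assumes that approach; how audition.py actually derives its score is not stated here, and the function name predict and the example values are hypothetical.

```python
import math
from typing import List, Tuple


def predict(logits: List[float]) -> Tuple[int, float]:
    """Return (class index, softmax probability of that class)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


label, confidence = predict([1.0, 3.0, 0.5])  # hypothetical network outputs
print(f"class {label} with confidence {confidence:.2f}")
```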

Deactivating the Virtual Environment

deactivate

Notes

The ./tf wrapper is used to run the Python code so that TensorFlow logging is suppressed. You could just run python3 instead, if you like logging.

river's People

Contributors

babbc, betandr

river's Issues

Error ValueError: setting an array element with a sequence

Hi!
We are trying to run your example. Are you sure it is working?
We are getting this error:
loading training data
training example size: 12
loading validation data
validation example size: 4
found 5 unique labels in dataset
running training: epochs[1000] batch[50] sample[10]
Traceback (most recent call last):
File "train.py", line 211, in
train_accuracy = accuracy.eval(feed_dict={x:batch_x, y_:batch_y, keep_prob: 1.0})
File "/Users/hugosg/pythonts/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 606, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "/Users/hugosg/pythonts/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3928, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "/Users/hugosg/pythonts/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/Users/hugosg/pythonts/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 968, in _run
np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
File "/Users/hugosg/pythonts/lib/python3.6/site-packages/numpy/core/numeric.py", line 531, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
