nupic.audio's Introduction

nupic.audio

Gitter public chat channel

Auditory experiments using cortical learning algorithms (CLA) and hierarchical temporal memory (HTM).

Repositories of interest

Note: These repositories currently are all work-in-progress.

Online videos of interest

Taken from the collection gathered via Gitter channel https://gitter.im/rcrowder/EncodingSpecificityPrinciple -

Online books and references

Potential areas of investigation

  • Genre and style classification
  • Musical prediction and composition
  • Acoustic correlation using canonical correlation analysis (CCA)
  • Transient analysis (harmonic tracking)
  • Motion derivative encoding (similar to optical flow)
  • Echo location and spatial positioning (e.g. Anteroventral Cochlear Nucleus)
  • Stream segmentation and separation (includes selective attention)
  • Cortical pathways and projections, 'What' and 'Where' pathways (belts?)
  • Auditory nerve spike firing (e.g. IHC to CN GBC integrators)
  • Dendritic micro-circuits and synaptic placement (temporal smoothing)
  • Spike-timing dependent plasticity
  • Acetylcholine inhibition enhancing discharge frequency but decreasing synaptic adaptation
  • Acoustic related cell, and dendrite, membrane properties (cascading conductances, shunting)

An alternative for the encoding of audio signals is the modelling of spike firing of auditory-nerve fibers. A collection of models can be found in the EarLab @ Boston University (http://earlab.bu.edu/ See Modelling -> Downloadable Models). If you plan to use these models, beware of their history and limitations. For example, early models lack some necessary non-linearity in their responses.

nupic.audio's People

Contributors

breznak, passiweinberger, rcrowder, rhyolight

nupic.audio's Issues

Define initial work

Example route:

1 Encoders (MIDI file streamer, WAV/Mic streaming encoder)
2 First-order learning
3 TM capacity testing
4 Frequency domain transform investigation (e.g. Constant-Q, move to log based).
5 Spike-grams, similar to Spectrograms.
6 Short term recall
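
As a rough illustration of the "log based" frequency layout mentioned in step 4, the sketch below computes the geometrically spaced centre frequencies used by a Constant-Q transform. The function names, the 55 Hz starting frequency, and the 12 bins per octave are illustrative assumptions, not settings taken from this repository.

```python
def constant_q_frequencies(f_min, bins_per_octave, n_bins):
    """Centre frequencies f_k = f_min * 2**(k / bins_per_octave).

    Each octave is split into the same number of bins, so the spacing is
    logarithmic rather than linear as in a plain FFT.
    """
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def q_factor(bins_per_octave):
    """Ratio of centre frequency to bandwidth, identical for every bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


# Hypothetical example values: two octaves above 55 Hz, semitone resolution.
freqs = constant_q_frequencies(f_min=55.0, bins_per_octave=12, n_bins=25)
```

With these example values every twelfth bin doubles in frequency, which is what makes the layout a natural fit for musical pitch.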

make this repo pip installable

so it can be conveniently used along nupic.

It should be possible to install with:
python setup.py install
which will install content of nupic/audio/ and make it accessible from anywhere with:
python -c 'from nupic.audio.encoders.wave import WAVEncoder'

Create sound encoders

Greetings guys,

I'm happy I can hop on audio and play for some time with this repo!
My question is about encoders and the ease of use of this repo: would you be up for extracting the code from your examples and creating reusable encoders in NuPIC style? Then this repo could be pip-installed (like nupic.vision) and provide the encoders and examples for new users who need to use sound with HTM (currently me :P)

From what I saw in the codebase the encoders could be:

  • {MIDI/WAV/MP3}toScalar{Adaptive,RDSE,ScalarEncoder}
    • WAV (with scipy backend) encoder
    • MIDI encoder
  • SoundToSpiketrain(Cochlea)Encoder
  • any other state of the art models?
  • Experiments could reuse the encoders

Blocked by: pip #24 (partially blocked)

HTM for Musical Genre Classification

George Tzanetakis, Georg Essl, and Perry Cook produced the paper "Musical Genre Classification of Audio Signals" (http://webhome.csc.uvic.ca/~gtzan/output/tsap02gtzan.pdf and http://ismir2001.ismir.net/pdf/tzanetakis.pdf). They create feature vectors from a variety of statistics over short-time frame analysis windows, and longer texture windows containing groups of analysis windows (either whole-file or real-time streaming). The creation and description of these feature vectors I think is an important way of showing HTM SDR creation for audio signals.
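
A minimal sketch of the two-level windowing described above: per-frame features are computed over short analysis windows, then summarised as means and variances over longer texture windows. The two features used here (zero-crossing rate and RMS energy), the function names, and the window sizes are assumptions chosen to keep the example dependency-free; the paper's own feature set is larger.

```python
import math
from statistics import mean, pvariance


def frames(signal, size, hop):
    """Split a sample sequence into analysis windows of `size` samples."""
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of neighbouring sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root-mean-square amplitude of one analysis window."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def texture_features(signal, analysis_size=512, hop=512, texture_len=40):
    """One feature vector per texture window.

    Each vector is [mean ZCR, variance ZCR, mean RMS, variance RMS] over
    `texture_len` consecutive analysis windows (non-overlapping here).
    """
    per_frame = [(zero_crossing_rate(f), rms_energy(f))
                 for f in frames(signal, analysis_size, hop)]
    vectors = []
    for start in range(0, len(per_frame) - texture_len + 1, texture_len):
        group = per_frame[start:start + texture_len]
        zcr = [g[0] for g in group]
        rms = [g[1] for g in group]
        vectors.append([mean(zcr), pvariance(zcr), mean(rms), pvariance(rms)])
    return vectors
```

Each resulting vector would still need an encoder (for example one scalar encoder per statistic) before it becomes an SDR; that step is not shown here.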

Over time a large dataset has been collected; it can be found at http://marsyas.info/ and https://github.com/marsyas/marsyas. The main GTZAN dataset consists of 1000 audio tracks, each 30 seconds long. It contains 10 genres, each represented by 100 tracks. The tracks are all 22050Hz Mono 16-bit audio files in .wav format, contained in a 1.6GB tar.gz file. A separate dataset exists (~250MB) for music and speech.

As a comparison from supervised learning; in 2010 Philippe Hamel and Douglas Eck detailed "Learning features from music audio with deep belief networks" http://musicweb.ucsd.edu/~sdubnov/Mu270d/DeepLearning/FeaturesAudioEck.pdf

The idea for this new HTM showpiece is to implement the statistical analysis inside a fork of nupic.critic and from there, with the GTZAN dataset, investigate the unsupervised recognition of musical genres. An alternative is to train with musical styles rather than genres, with the potential to expand into composition later on. Another option is to adopt an Inner Hair Cell model that can reproduce the irregular spiking of auditory nerve fibers, and to investigate the integrative effects of bushy cells in the Cochlear Nucleus.

Adaptive Resonance Theory and HTM

Adaptive resonance theory (ART, [1][2]) is a theory developed by Stephen Grossberg and Gail Carpenter on aspects of how the brain processes information. It describes a number of neural network models which use supervised and unsupervised learning methods, and address problems such as pattern recognition and prediction. These models are capable of learning stable recognition categories.

https://en.wikipedia.org/wiki/Adaptive_resonance_theory

The primary intuition behind the ART model is that object identification and recognition generally occur as a result of the interaction of 'top-down' observer expectations with 'bottom-up' sensory information. The model postulates that 'top-down' expectations take the form of a memory template or prototype that is then compared with the actual features of an object as detected by the senses. This comparison gives rise to a measure of category belongingness. As long as this difference between sensation and expectation does not exceed a set threshold called the 'vigilance parameter', the sensed object will be considered a member of the expected class. The system thus offers a solution to the 'plasticity/stability' problem, i.e. the problem of acquiring new knowledge without disrupting existing knowledge.
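
The vigilance test can be sketched for binary patterns as below. This is a simplified ART1-style categoriser written for illustration only (it is not ART2A and not code from this repository): the class name, the set-based pattern representation, and the default vigilance of 0.7 are all assumptions. The "difference below a threshold" test is expressed here in its usual equivalent form, a match score that must reach the vigilance value.

```python
def match_score(pattern, template):
    """Fraction of the input's active features that the template expects."""
    return len(pattern & template) / len(pattern)


class SimpleART:
    """Toy ART1-style categoriser over sets of active feature indices."""

    def __init__(self, vigilance=0.7):
        self.vigilance = vigilance
        self.templates = []  # one frozenset of feature indices per category

    def present(self, pattern):
        """Return the index of the category that resonates with `pattern`."""
        pattern = frozenset(pattern)
        if not pattern:
            raise ValueError("pattern must contain at least one active feature")
        # Search existing categories, best bottom-up overlap first.
        order = sorted(range(len(self.templates)),
                       key=lambda j: len(pattern & self.templates[j]),
                       reverse=True)
        for j in order:
            if match_score(pattern, self.templates[j]) >= self.vigilance:
                # Resonance: keep only the features shared with the input,
                # so existing knowledge is refined rather than overwritten.
                self.templates[j] = self.templates[j] & pattern
                return j
        # No template is close enough: commit a new category instead of
        # disturbing the old ones (the stability side of the trade-off).
        self.templates.append(pattern)
        return len(self.templates) - 1
```

Raising the vigilance produces more, narrower categories; lowering it merges inputs into fewer, broader ones.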

In the context of NuPIC Audio, ART can be used (with sound localization) to separate auditory streams and dynamically manage categories, heading towards segmentation and identification of audio sources. An investigation can be conducted into the applicability of resonance theory (ART2A, [3][4][5]) to hierarchical temporal memory, as a potential method of emulating current-generation NN back propagation within HTM and of obtaining limited feedback and attention processing. A variety of changes will be required internal to NuPIC, such as a column burst mechanism and an expansion of the Connections class.

The ideas behind using ART with NuPIC came from reading:

  • Desimone and Duncan (1995) "Neural mechanisms of selective visual attention"
  • Spratling (1999) "Pre-synaptic lateral inhibition provides a better architecture for self-organizing neural networks"
  • Grossberg et al. (2004) "ARTSTREAM: A neural network model of auditory scene analysis and source segregation" [4]
  • Ballard and Jehee (2011) "Dual roles for spike signaling in cortical neural populations"
  • Ballard and Jehee (2012) "Dynamic coding of signed quantities in cortical feedback circuits"
  • Spratling (2014) "A single functional model of drivers and modulators in cortex"

[1] G. A. Carpenter. Neural network models for pattern recognition and associative memory. Neural Networks 2, 4 (June 1989), 243-257. DOI=10.1016/0893-6080(89)90035-X http://dx.doi.org/10.1016/0893-6080(89)90035-X

[2] Carpenter, G. A. and Grossberg, S., Adaptive Resonance Theory. The Handbook of Brain Theory and Neural Networks, Second Edition, MIT Press, 2003.

[3] Carpenter, G.A. and Grossberg, S., ART2: Self-organization of stable category recognition codes for analog input patterns, Applied Optics, 26 (23): 4919-4930, 1987.

[4] Grossberg, Govindarajan, Wyse, Cohen (2004), ARTSTREAM: A neural network model of auditory scene analysis and source segregation, http://dx.doi.org/10.1016/j.neunet.2003.10.002

[5] https://github.com/rcrowder/AdaptiveResonanceTheory

CinC/Physionet PCG/ECG challenge 2016

CinC challenge

https://physionet.org/challenge/2016/

A prestigious challenge/conference with nice data!

🔥 UPDATE: game's still ON! 🎸

Looking for hackers to help me set something up, if it's feasible. Then there will be the whole summer to tune the app.

Blocked by: Add encoders #22

Plan of attack

  • audio
    • for now use wav2vect from Matlab
    • implement wavEncoder - IN PROGRESS #26
    • evaluate if functionality of the WAVEncoder (internal scipy) is the same as matlab's
    • try Cochlea encoder
    • implement sound encoders for nupic.audio #22
  • training
    • records are Normal/Anomaly/Unknown
    • aggregate all NORMAL records to a 2 column file (reset, PCG)
      • how radical should the subsampling be? NuPIC is too slow to process the whole dataset, but we can only go down to 1000 Hz (from 2000 Hz) because of the sampling theorem (Fs >= 2*F)
    • commit the training data files (bcs the preprocessing takes long)
    • train a HTM model + serialize it
    • try param swarming
  • evaluation
    • load the model, disable learning
    • 2 tasks description.py?, OR other way to train/load/eval a model on datasets
    • compute average anomaly score for all datapoints of a record
    • implement the anomaly metric in nupic
    • create a model (for nupic?) that does this classification based on avg. anomaly scores?
    • threshold to Normal/Anomaly/Unknown
  • submission
    • modify examples sample2016*
    • nupic is installed, so setup will just source a virtualenv
    • each evaluation in next will call matlab (wav2csv), python(writes anomaly scores to CSV), matlab again(loads anomalies and decides classification)
    • this is problematic, better go full-python if possible!
  • improvements:
    • try bag (multi model) voting
      • model trained on full normal data
    • model on FHS parts
    • model on anomalous data
    • model pretrained on ECG data from other sources! https://github.com/breznak/nupic.biodat
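
The training-data step in the plan above (aggregating Normal records into a two-column reset/PCG file, subsampled by a factor of two) could look roughly like the sketch below. All function names are hypothetical, and pair averaging is only a crude stand-in for a proper anti-aliasing filter.

```python
import csv
import io


def min_sampling_rate(highest_freq_hz):
    """Sampling theorem: Fs must be at least twice the highest frequency."""
    return 2.0 * highest_freq_hz


def downsample_by_two(samples):
    """Halve the sampling rate by averaging neighbouring pairs.

    Averaging acts as a very rough low-pass filter; a real pipeline would
    filter properly before dropping samples. A trailing odd sample is dropped.
    """
    return [(samples[i] + samples[i + 1]) / 2.0
            for i in range(0, len(samples) - 1, 2)]


def write_reset_pcg_rows(records, out_file):
    """Write every record as (reset, PCG) rows.

    `reset` is 1 on the first row of each record so a sequence model can
    tell where one recording ends and the next begins.
    """
    writer = csv.writer(out_file)
    writer.writerow(["reset", "PCG"])
    for samples in records:
        for index, value in enumerate(downsample_by_two(samples)):
            writer.writerow([1 if index == 0 else 0, value])


def rows_as_text(records):
    """Convenience wrapper returning the CSV content as a string."""
    buffer = io.StringIO()
    write_reset_pcg_rows(records, buffer)
    return buffer.getvalue()
```

Keeping frequency content up to 500 Hz needs `min_sampling_rate(500) == 1000.0`, which is why the plan does not subsample below 1000 Hz.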

Working plan to get some validation results ASAP:

  • training data
    • will train only on Normal data and select (FHS) subsequences of it
    • data extracted from Matlab (@breznak will do that)
  • train HTM model
    • on the provided data
    • just one HTM model (with RDSE? encoder, what best settings? probably no time to swarm)
    • able to serialize the model and load to run on eval. data (learning off)
      • the approach with OPF is not reliably working, can someone post code to do that? (@rhyolight or someone..?)
  • write simple classification function: classify(anScores[])
    • should decide classification from the anomaly scores for the whole sequence/sample
    • can be something like the average score: Normal iff avg < 0.4; Unknown iff avg in [0.4, 0.7]; Anomaly iff avg > 0.7; ETA ~10 mins
  • score
    • process validation data (@breznak will commit a file)
    • classify & compute score -> submit!
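
A minimal sketch of the classification function from the working plan, using the average anomaly score and the 0.4 / 0.7 thresholds suggested there. The parameter names and the choice to return "Unknown" for an empty record are assumptions.

```python
def classify(an_scores, normal_below=0.4, anomaly_above=0.7):
    """Label one record from its per-datapoint anomaly scores.

    Returns "Normal" if the average score is below `normal_below`,
    "Anomaly" if it is above `anomaly_above`, and "Unknown" otherwise.
    An empty record is treated as "Unknown" (an assumption, since there
    is nothing to average).
    """
    if not an_scores:
        return "Unknown"
    average = sum(an_scores) / len(an_scores)
    if average < normal_below:
        return "Normal"
    if average > anomaly_above:
        return "Anomaly"
    return "Unknown"
```

The thresholds are defaults only; they would need tuning on the validation data before submission.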
