xception1d's Introduction

Xception-1-dimensional

Implementation of a neural network architecture for solving speech recognition tasks with limited vocabulary out of the box. This architecture is based on the Xception architecture, presented by François Chollet in 2017.

It achieved state-of-the-art results on the Google TensorFlow Speech Commands data set, surpassing human performance in the most complex tasks.
It always works in the time domain, without needing to compute tedious and computationally expensive Fourier transforms.
It can be easily adapted to variable-size audio clips and to different tasks.
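
For readers unfamiliar with Xception, its main building block is the depthwise separable convolution. The following is a minimal NumPy sketch of a one-dimensional version applied directly to a waveform-like array. It is only an illustration of the general technique; the function name, shapes, and padding choice are assumptions and are not taken from this repository's code.

```python
import numpy as np

def separable_conv1d(x, depthwise, pointwise):
    """Depthwise separable 1D convolution (illustrative sketch).

    x:         (time, c_in) input signal
    depthwise: (k, c_in) one length-k kernel per input channel
    pointwise: (c_in, c_out) 1x1 convolution that mixes channels
    Uses 'valid' padding and stride 1 (assumed for simplicity).
    """
    k, _ = depthwise.shape
    t_out = x.shape[0] - k + 1
    # Depthwise step: each channel is filtered independently with its own kernel.
    windows = np.stack([x[i:i + t_out] for i in range(k)], axis=0)  # (k, t_out, c_in)
    dw = np.einsum("ktc,kc->tc", windows, depthwise)
    # Pointwise step: mix the channels at every time step.
    return dw @ pointwise
```

Splitting the convolution this way uses k * c_in + c_in * c_out weights instead of the k * c_in * c_out weights of a regular convolution with the same shapes.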

We suggest this architecture as the de facto solution for voice command recognition tasks with a restricted vocabulary, provided that computing power is not a limiting factor.

Getting started

If you are interested in running the code, please follow these steps.

Clone the repository
Navigate with your terminal inside the folder of the project and install the required libraries using the following command: pip install -r requirements.txt
Use the settings_template.json file in the root of the project as a template for creating a settings.json file and fill it with your configuration.
The directory config contains the settings for reproducing the results submitted with the paper. Choose one, select a seed and run it using the following command: python main.py [config filepath] [seed]
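
The settings step above can also be scripted. This is a small sketch, assuming settings_template.json sits in the project root as described; the helper name create_settings is hypothetical, and the fields inside the file still have to be filled in by hand.

```python
import shutil
from pathlib import Path

def create_settings(project_root="."):
    """Copy settings_template.json to settings.json unless it already exists."""
    root = Path(project_root)
    template = root / "settings_template.json"
    target = root / "settings.json"
    if not target.exists():
        # Never overwrite an existing configuration.
        shutil.copyfile(template, target)
    return target
```

Calling create_settings() from the project folder leaves an existing settings.json untouched, so it is safe to run more than once.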

The seeds used to generate the current results are 655321, 655322, 655323, 655324 and 655325. Feel free to create new settings and store them in the config directory to try new parameters.
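
To repeat a configuration over all the listed seeds, a loop along these lines may help. It is a sketch built only on the command shown above (python main.py [config filepath] [seed]); the helper names and the path config/example.json are placeholders, not files or functions known to exist in the repository.

```python
import subprocess
import sys

# Seeds listed in this README for the published results.
SEEDS = [655321, 655322, 655323, 655324, 655325]

def build_commands(config_path, seeds=SEEDS):
    """Return one `python main.py [config filepath] [seed]` command per seed."""
    return [[sys.executable, "main.py", config_path, str(seed)] for seed in seeds]

def run_all(config_path, seeds=SEEDS, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(config_path, seeds):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stops at the first failing run

# Placeholder path: replace it with a real file from the config directory.
run_all("config/example.json")
```

With the default dry_run=True the commands are only printed; pass dry_run=False from the project folder to launch the runs one after another.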

Contribution

If you wish to contribute in any way, please submit a pull request.

xception1d's Issues

In file data_tools.py, does the function add_noise_to_wavfile have an error?

def add_noise_to_wavfile(wav, amplitude_factor, clip_to_original_range=False):
    max_wav = max(wav)
    min_wav = min(wav)
    noise = np.random.rand(*wav.shape) * (max_wav - min_wav) - min_wav
    wav = wav + amplitude_factor * noise
    if clip_to_original_range:
        wav = np.clip(wav, min_wav, max_wav)
    return wav

Should the calculation of noise be
noise = np.random.rand(*wav.shape) * (max_wav - min_wav) + min_wav?

The range of np.random.rand(*wav.shape) is [0, 1), and the range of noise is expected to be [min_wav, max_wav),
so should the third term be + min_wav?
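
A quick numerical check of the two variants, as a sketch. The synthetic signal in [-1, 1) is an assumption made for illustration and is not data from the repository; the variable names noise_minus and noise_plus are likewise made up here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a waveform, roughly symmetric around zero (assumption).
wav = rng.uniform(-1.0, 1.0, size=16000)
min_wav, max_wav = wav.min(), wav.max()

u = rng.random(wav.shape)  # uniform samples in [0, 1)

# Variant quoted from data_tools.py: lies in [-min_wav, max_wav - 2 * min_wav).
noise_minus = u * (max_wav - min_wav) - min_wav

# Variant proposed in this issue: lies in [min_wav, max_wav).
noise_plus = u * (max_wav - min_wav) + min_wav

print("- min_wav:", noise_minus.min(), noise_minus.max())
print("+ min_wav:", noise_plus.min(), noise_plus.max())
```

For a signal centred on zero, min_wav is negative, so the "- min_wav" variant shifts every noise sample above zero and adds a positive offset to the waveform, while the "+ min_wav" variant keeps the noise within the original range of the signal.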

Repository: ivallesp / xception1d