
jcvasquezc / phonet


Keras-based Python framework to compute phonological posterior probabilities from audio files

License: MIT License

Python 100.00%

speech-processing deep-learning deep-neural-networks phonetics linguistics linguistic-analysis

phonet's Introduction

Hi there, I'm Camilo 👋

For the past five years I have carried out research and development in signal processing and machine learning for health-care and biometric applications, with both academic and industrial partners. I am passionate about machine learning, deep learning, speech processing, and natural language processing. Technologies I enjoy working with and am familiar with include PyTorch, Transformers, scikit-learn, pandas, FastAPI, and Docker, among others.


Find me around the web 🌎:

Personal web: jcvasquezc.github.io
Social media: Twitter 🏓
My research: Scholar 💼


phonet's Issues

Accompanying paper?

Have you published a paper describing this tool?

Also, as I understand it, the toolkit computes probabilities for each phoneme in a given audio file. Does it work for free speech too, or just for specific utterances?

Thanks

Import error for Adam optimizer

Hi @jcvasquezc

While using the disvoice library I encountered this AttributeError when importing the Adam optimizer from keras:
AttributeError: module 'keras.optimizers' has no attribute 'Adam'

This occurs in phonet.py. What worked for me was replacing all the keras imports with their tensorflow.keras equivalents, for example changing from keras import optimizers to from tensorflow.keras import optimizers.
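The workaround above can also be written as a fallback, so the same code runs whichever package exposes the optimizer. This is only a minimal sketch: the helper names resolve_attr and first_available are hypothetical and not part of phonet or Keras, and the two dotted paths in the usage comment are simply the import locations mentioned in this issue.

```python
import importlib


def resolve_attr(dotted_path):
    """Import the top-level package, then walk the remaining names with getattr.

    This relies on the package exposing each name as an attribute, which is
    also what `from package import name` needs.
    """
    root, *rest = dotted_path.split(".")
    obj = importlib.import_module(root)
    for part in rest:
        obj = getattr(obj, part)
    return obj


def first_available(dotted_paths):
    """Return the first object that resolves; raise ImportError if none do."""
    failures = []
    for path in dotted_paths:
        try:
            return resolve_attr(path)
        except (ImportError, AttributeError) as exc:
            # AttributeError covers the case reported here, where the module
            # imports fine but does not expose the expected name.
            failures.append(f"{path}: {exc}")
    raise ImportError("no candidate could be resolved: " + "; ".join(failures))


# Hypothetical usage, preferring tensorflow.keras as in the workaround above:
# Adam = first_available([
#     "tensorflow.keras.optimizers.Adam",
#     "keras.optimizers.Adam",
# ])
```

Catching AttributeError as well as ImportError matters here because the reported failure is a missing attribute on a module that otherwise imports.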

I don't know whether I am the only one hitting this problem. If not, could you kindly make a release with these few changes?
Best regards.

Error?

I find the padding strategy in the get_feat function of phonet.py quite odd. First of all, you compute

fill=len(signal)%int(fs*self.size_frame*self.len_seq)

With the default parameters, this means that if a signal is strictly shorter than 16000=int(fs*self.size_frame*self.len_seq) samples, then fill equals len(signal), so you are simply doubling the signal's size regardless of its length (normally you would pad only enough to fill the remaining frames instead).

Moreover, in the next line you do:

fillv=0.05*np.random.randn(fill)-self.size_frame

so you subtract the value of size_frame from the random signal used for padding. I don't know why you would do that.
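To make the arithmetic concrete, here is a small sketch comparing the remainder used in the quoted line with the amount needed to reach the next multiple of the block size. The function names are hypothetical, the block size of 16000 samples is the value quoted above, and the second function is only the conventional alternative, not code from phonet.

```python
def fill_as_written(n_samples, block):
    """Padding length as computed in the quoted line: the remainder itself."""
    return n_samples % block


def fill_to_block(n_samples, block):
    """Conventional padding: samples missing to reach the next multiple of block."""
    return (-n_samples) % block


BLOCK = 16000  # value quoted in this issue for the default parameters

for n in (10000, 16000, 20000):
    written = fill_as_written(n, BLOCK)
    conventional = fill_to_block(n, BLOCK)
    print(
        f"n={n}: as written pads {written} (total {n + written}), "
        f"conventional pads {conventional} (total {n + conventional})"
    )
```

For a 10000-sample signal the remainder is 10000, so the padded length is 20000, which is not a multiple of 16000, whereas padding by 6000 would reach exactly 16000.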
