bricewalker / hey-jetson

Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.

License: GNU General Public License v3.0

Shell 0.02% Jupyter Notebook 66.91% Python 5.49% CSS 2.62% HTML 5.98% JavaScript 9.19% Less 4.64% SCSS 5.15%

attention automatic-speech-recognition azure azure-cognitive-services css deep-learning deep-neural-networks flask html inference jetson-tx2 keras nvidia-jetson-tx2 python recurrent-neural-networks rest-api sentiment-analysis speech-recognition speech-to-text tensorflow


hey-jetson's Issues

Inferencing on jetson

I am interested in this project for doing emotion detection from speech.

Do we need to use Flask for inference on the Jetson? There are projects that run inference by deploying the model directly, like this one: https://github.com/dusty-nv/jetson-inference

Any advice on emotion detection from speech for this project?

Thanks

Hi brice,
I'm getting this error and I don't understand it at all. Some help would be much appreciated. I've installed all the dependencies and prepared the dataset, and training starts, but then I get this error.

2019-07-17 14:57:26.395650: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at variable_ops.cc:104 : Already exists: Resource __per_step_3/training/Adam/gradients/bidirectional_7/while_1/strided_slice_3/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var/N10tensorflow19TemporaryVariableOp6TmpVarE
2019-07-17 14:57:26.396630: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at variable_ops.cc:104 : Already exists: Resource __per_step_3/training/Adam/gradients/bidirectional_7/while_1/strided_slice_3/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var/N10tensorflow19TemporaryVariableOp6TmpVarE
2019-07-17 14:57:26.397577: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at variable_ops.cc:104 : Already exists: Resource __per_step_3/training/Adam/gradients/bidirectional_7/while_1/strided_slice_3/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var/N10tensorflow19TemporaryVariableOp6TmpVarE
Traceback (most recent call last):
  File "train_model.py", line 842, in <module>
    spectrogram=True) # True for Spectrogram/False for MFCC
  File "train_model.py", line 536, in train_model
    callbacks=[checkpointer, terminator, logger, time_machiner, tensor_boarder], verbose=verbose)
  File "/home/nvidia/anaconda3/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/nvidia/anaconda3/lib/python3.7/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/nvidia/anaconda3/lib/python3.7/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/home/nvidia/anaconda3/lib/python3.7/site-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/nvidia/anaconda3/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/nvidia/anaconda3/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/nvidia/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.AlreadyExistsError: Resource __per_step_3/training/Adam/gradients/bidirectional_7/while_1/strided_slice_3/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var/N10tensorflow19TemporaryVariableOp6TmpVarE
  [[{{node training/Adam/gradients/bidirectional_7/while_1/strided_slice_3/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var}}]]

jetson_requirement.txt too many missing packages

I have a Jetson Nano with L4T 4.2.
Running pip install -r jetson_requirement.txt does not run to completion.

The first package that is not found is apturl.
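
One way to see the full list of missing packages, rather than stopping at the first one, is to install each requirement separately and collect the failures. The following is only a sketch: the helper names (parse_requirements, pip_install, install_each) are hypothetical and not part of the repository, and it assumes the file contains one plain requirement per line. apturl is normally shipped as an Ubuntu system package, so it may be that the file came from a pip freeze of a system Python, but that is a guess.

```python
import subprocess
import sys


def parse_requirements(text):
    """Return requirement specifiers, skipping blank lines and comments."""
    specs = []
    for raw in text.splitlines():
        # Assumption: '#' only starts a comment (no URL fragments in the file).
        line = raw.split("#", 1)[0].strip()
        if line:
            specs.append(line)
    return specs


def pip_install(spec):
    """Install a single requirement with pip; return True on success."""
    result = subprocess.run([sys.executable, "-m", "pip", "install", spec])
    return result.returncode == 0


def install_each(text, installer=pip_install):
    """Try every requirement on its own and return the ones that failed."""
    failed = []
    for spec in parse_requirements(text):
        if not installer(spec):
            failed.append(spec)
    return failed


# Example use on the Jetson (not executed here):
#     with open("jetson_requirement.txt") as handle:
#         print("Could not install:", install_each(handle.read()))
```

The resulting list shows which entries need another source (for example the system package manager) and which can simply be dropped.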

What are those hardware components?

Hi Bricewalker,
This is a very interesting project. Could you please share the list of hardware components?

Is it possible to use Adversarial Training?

like this project:

Multi-GPU scenario

Hello @bricewalker ,

Thanks for sharing this. Very good work. Can this be scaled to multi-GPU training?

Getting an error while building environment

Getting this error:

conda env create -f environment.yml
Solving environment: failed

ResolvePackageNotFound:

openssl==1.1.1b=hfa6e2cd_2
m2w64-flac==1.3.1=3
icu==58.2=ha66f8fd_1
h5py==2.9.0=py37h5e291fa_0
python==3.7.3=h8c8aaf0_1
qt==5.9.7=vc14h73c81de_0
mkl-service==2.0.2=py37he774522_0
cryptography==2.7=py37hb32ad35_0
pyyaml==5.1=py37he774522_0
winpty==0.4.3=4
tensorflow-base==1.13.1=gpu_py37h871c8ca_0
matplotlib==3.1.0=py37hc8f65d3_0
wincertstore==0.2=py37_0
mkl_fft==1.0.12=py37h14836fe_0
grpcio==1.16.1=py37h351948d_1
msys2-conda-epoch==20160418=1
m2w64-speex==1.2rc2=3
tensorboard==1.13.1=py37h33f27b4_0
yaml==0.1.7=hc54c509_2
protobuf==3.8.0=py37h33f27b4_0
mkl_random==1.0.2=py37h343c172_0
m2w64-libogg==1.3.2=3
m2w64-gcc-libs==5.3.0=7
m2w64-gcc-libgfortran==5.3.0=6
m2w64-libsndfile==1.0.26=2
libprotobuf==3.8.0=h7bd577a_0
zlib==1.2.11=h62dcd97_3
hdf5==1.10.4=h7ebc959_0
pyqt==5.9.2=py37h6538335_2
libsodium==1.0.16=h9d3ae62_0
sip==4.19.8=py37h6538335_0
pyrsistent==0.14.11=py37he774522_0
m2w64-gcc-libs-core==5.3.0=7
mistune==0.8.4=py37he774522_0
sqlite==3.28.0=he774522_0
m2w64-speexdsp==1.2rc3=3
pyzmq==18.0.0=py37ha925a31_0
pywinpty==0.5.5=py37_1000
libpng==1.6.37=h2a8f88b_0
m2w64-gmp==6.1.0=2
icc_rt==2019.0.0=h0cc432a_1
m2w64-libwinpthread-git==5.0.0.4634.697f757=2
pandas==0.24.2=py37ha925a31_0
pycrypto==2.6.1=py37hfa6e2cd_1002
jpeg==9b=hb83a4c4_2
mkl==2019.4=245
m2w64-libvorbis==1.3.5=2
tornado==6.0.2=py37he774522_0
intel-openmp==2019.4=245
cffi==1.12.3=py37h7a1dbc1_0
statsmodels==0.9.0=py37h452e1ab_0
numpy==1.16.4=py37h19fb1c0_0
pyreadline==2.1=py37_1
freetype==2.9.1=ha9979f8_1
vs2015_runtime==14.15.26706=h3a45250_4
win_inet_pton==1.1.0=py37_0
scikit-learn==0.21.2=py37h6288b17_0
kiwisolver==1.1.0=py37ha925a31_0
vc==14.1=h0510ff6_4
markupsafe==1.1.1=py37he774522_0
tensorflow==1.13.1=gpu_py37h83e5d6a_0
zeromq==4.3.1=h33f27b4_3
numpy-base==1.16.4=py37hc3f5095_0
scipy==1.2.1=py37h29ff71c_0
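
Many of the packages in this list (the m2w64-* entries, vs2015_runtime, vc, wincertstore, win_inet_pton, winpty, pywinpty) exist only for Windows, and every entry carries a build string, which suggests environment.yml was exported on a Windows machine. A possible workaround, sketched below, is to strip the build strings and drop the Windows-only entries before creating the environment. The function name relax_environment and the two lists of Windows-only names are assumptions made for this sketch, not something the repository provides, and the remaining pinned versions may still be unavailable on other platforms.

```python
import re

# Matches a conda dependency line such as "  - numpy=1.16.4=py37h19fb1c0_0".
# Pip-style lines ("    - keras==2.2.4") do not match and are left untouched.
_DEP = re.compile(r"^(\s*-\s*)([A-Za-z0-9_.\-]+)=([^=\s]+)=([^=\s]+)\s*$")

# Assumption: names taken from the error output above that are Windows-only.
WINDOWS_ONLY_NAMES = {
    "vc", "vs2015_runtime", "wincertstore", "win_inet_pton",
    "winpty", "pywinpty",
}
WINDOWS_ONLY_PREFIXES = ("m2w64-", "msys2-")


def relax_environment(yml_text):
    """Drop build strings and Windows-only packages from environment.yml text."""
    kept = []
    for line in yml_text.splitlines():
        match = _DEP.match(line)
        if match is None:
            kept.append(line)  # not a "name=version=build" line
            continue
        prefix, name, version, _build = match.groups()
        if name in WINDOWS_ONLY_NAMES or name.startswith(WINDOWS_ONLY_PREFIXES):
            continue  # skip packages that only exist on Windows
        kept.append(f"{prefix}{name}={version}")
    return "\n".join(kept) + "\n"


# Example use (not executed here):
#     with open("environment.yml") as src:
#         relaxed = relax_environment(src.read())
#     with open("environment_relaxed.yml", "w") as dst:
#         dst.write(relaxed)
```

On the machine where the environment was originally built, conda env export --no-builds produces a file without build strings directly.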

Portuguese data corpus;

I'm trying to run this model using a Portuguese data corpus, but I don't know exactly how to use a PT-BR dataset because of the accented characters in Portuguese, such as:

ACCENTS = 'ãõçâêôáíóúàüóé'

how can I add these characters to this model?
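
A rough sketch of one approach is to extend the character-to-index mapping with the accented letters and to size the output layer accordingly. Everything here is an assumption made for illustration: the base alphabet (apostrophe, space, a to z), the function names, and the idea that the model's output dimension is the alphabet size plus one CTC blank. The repository's own character map may be organised differently.

```python
import string
import unicodedata

ACCENTS = 'ãõçâêôáíóúàüóé'  # copied from the question; 'ó' appears twice


def build_char_maps(extra_chars=""):
    """Build char<->index maps for a base alphabet plus extra characters."""
    base = ["'", " "] + list(string.ascii_lowercase)
    # NFC keeps each accented letter as a single code point.
    extra = list(unicodedata.normalize("NFC", extra_chars))
    # dict.fromkeys keeps the order and drops duplicates such as the second 'ó'.
    alphabet = list(dict.fromkeys(base + extra))
    char_to_index = {ch: i for i, ch in enumerate(alphabet)}
    index_to_char = {i: ch for ch, i in char_to_index.items()}
    return char_to_index, index_to_char


def text_to_int_sequence(text, char_to_index):
    """Encode a transcript as a list of label indices."""
    normalized = unicodedata.normalize("NFC", text.lower())
    return [char_to_index[ch] for ch in normalized]


def int_sequence_to_text(sequence, index_to_char):
    """Decode a list of label indices back into text."""
    return "".join(index_to_char[i] for i in sequence)


char_to_index, index_to_char = build_char_maps(ACCENTS)
# Assumption: the network needs one extra output class for the CTC blank.
output_dim = len(char_to_index) + 1
```

A model trained with the original English alphabet would have to be retrained with the larger output layer, since the existing weights only cover the original classes.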

run_inference() not working properly

I am trying to transcribe an audio file from the LibriSpeech dataset using the run_inference() function in make_predictions.py, after loading model_11.h5, but the predicted transcription is only 2 letters long. The same happens with the other audio samples I have tried: the output text is always 2 letters.
For one of the samples, the array returned by the model's predict() method (stored in the variable "prediction") has shape (1, 149, 29), but the tensor returned by ctc_decode() has shape (1, 2). So I'm guessing that something is wrong with the decoder? I can't quite figure it out.

Is this an issue that you have encountered before? I've attached the python script I've used to run the inference along with the sample audio file and the output to the screen.

file.zip
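
To check whether the decoder or the model output is at fault, it can help to decode the prediction by hand and compare the result with what ctc_decode() returns. Below is a minimal greedy CTC decoding sketch in plain Python. It assumes the blank label is the last class (index 28 for a 29-class output), which is the convention of the TensorFlow CTC decoders, and the function name is hypothetical. If this also yields only two labels, the model output itself is mostly blanks; if it yields a longer sequence, the arguments passed to ctc_decode(), such as input_length, would be worth checking.

```python
def greedy_ctc_decode(frame_probs, blank_index):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_probs is a sequence of per-frame probability lists, for example
    prediction[0].tolist() for a prediction of shape (1, 149, 29).
    """
    # Most likely class for every time step.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_probs
    ]
    decoded = []
    previous = None
    for index in best_path:
        # Keep a label only when it differs from the previous frame
        # and is not the blank symbol.
        if index != previous and index != blank_index:
            decoded.append(index)
        previous = index
    return decoded


# Example use (not executed here):
#     labels = greedy_ctc_decode(prediction[0].tolist(), blank_index=28)
```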

Stops training with this error

2019-07-19 04:56:06.472125: I tensorflow/stream_executor/dso_loader.cc:153] successfully opened CUDA library libcublas.so.10 locally
3455/3455 [==============================] - 21350s 6s/step - loss: 356.8575 - val_loss: 237.5635
Epoch 2/30
1883/3455 [===============>..............] - ETA: 2:39:53 - loss: nan Batch 1882: Invalid loss, terminating training
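
The log alone does not show why the loss became nan at batch 1882. One possible cause with CTC training is an utterance whose network output has too few time steps for its transcript, which can make the loss infinite; others include exploding gradients or a learning rate that is too high. The sketch below checks only the first possibility. The function names and the (num_frames, labels) sample format are assumptions for illustration, and num_frames means the number of time steps at the model output, not in the raw audio.

```python
def min_frames_for_labels(labels):
    """Minimum number of output frames CTC needs to emit this label sequence.

    Each label needs one frame, and each pair of identical neighbouring
    labels needs one extra blank frame between them.
    """
    repeats = sum(1 for a, b in zip(labels, labels[1:]) if a == b)
    return len(labels) + repeats


def find_infeasible_samples(samples):
    """Return indices of (num_frames, labels) pairs that CTC cannot align."""
    bad = []
    for i, (num_frames, labels) in enumerate(samples):
        if num_frames < min_frames_for_labels(labels):
            bad.append(i)
    return bad
```

Running a check like this over the training set before training would show whether any sample needs to be filtered out.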

Hardware setup used.

Hello @bricewalker
Could you clarify what hardware setup you used for inference?

Which NVIDIA Jetson version?
Did you use an external microphone connected to the Jetson, or the integrated mic in the camera?
Any other hardware that you connected to the Jetson?
Thanks in advance.

Is there any docker environment for the Jetson that will work with this?

I've been running into Keras errors involving generic utility functions, along with other module problems, when I try to set up my own Jetson to run this.