ctc-asr's People

Contributors

mdangschat, yweweler

ctc-asr's Issues

Inference graph

Hello,
I just wanted to know where you are saving the .pbtxt file? I noticed your code creates this graph file but I am not able to locate the code snippet for creating it.
Thanks in advance

Error with output node name when freezing the graph

Hi, I am trying to freeze the graph, but when I run "bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=/path_to_file/graph.pbtxt", I get the output below.
How do I know which of these to use as the output node?

No inputs spotted.
Found 36 variables: (name=global_step, type=int64(9), shape=[]) (name=conv/conv2d/kernel, type=float(1), shape=[11,41,1,32]) (name=conv/conv2d/bias, type=float(1), shape=[32]) (name=conv/conv2d_1/kernel, type=float(1), shape=[11,21,32,32]) (name=conv/conv2d_1/bias, type=float(1), shape=[32]) (name=conv/conv2d_2/kernel, type=float(1), shape=[11,21,32,96]) (name=conv/conv2d_2/bias, type=float(1), shape=[96]) (name=rnn/cudnn_lstm/opaque_kernel, type=float(1), shape=) (name=dense4/dense/kernel, type=float(1), shape=[4096,2048]) (name=dense4/dense/bias, type=float(1), shape=[2048]) (name=logits/dense/kernel, type=float(1), shape=[2048,29]) (name=logits/dense/bias, type=float(1), shape=[29]) (name=beta1_power, type=float(1), shape=[]) (name=beta2_power, type=float(1), shape=[]) (name=conv/conv2d/kernel/Adam, type=float(1), shape=[11,41,1,32]) (name=conv/conv2d/kernel/Adam_1, type=float(1), shape=[11,41,1,32]) (name=conv/conv2d/bias/Adam, type=float(1), shape=[32]) (name=conv/conv2d/bias/Adam_1, type=float(1), shape=[32]) (name=conv/conv2d_1/kernel/Adam, type=float(1), shape=[11,21,32,32]) (name=conv/conv2d_1/kernel/Adam_1, type=float(1), shape=[11,21,32,32]) (name=conv/conv2d_1/bias/Adam, type=float(1), shape=[32]) (name=conv/conv2d_1/bias/Adam_1, type=float(1), shape=[32]) (name=conv/conv2d_2/kernel/Adam, type=float(1), shape=[11,21,32,96]) (name=conv/conv2d_2/kernel/Adam_1, type=float(1), shape=[11,21,32,96]) (name=conv/conv2d_2/bias/Adam, type=float(1), shape=[96]) (name=conv/conv2d_2/bias/Adam_1, type=float(1), shape=[96]) (name=rnn/cudnn_lstm/opaque_kernel/Adam, type=float(1), shape=) (name=rnn/cudnn_lstm/opaque_kernel/Adam_1, type=float(1), shape=) (name=dense4/dense/kernel/Adam, type=float(1), shape=[4096,2048]) (name=dense4/dense/kernel/Adam_1, type=float(1), shape=[4096,2048]) (name=dense4/dense/bias/Adam, type=float(1), shape=[2048]) (name=dense4/dense/bias/Adam_1, type=float(1), shape=[2048]) (name=logits/dense/kernel/Adam, type=float(1), shape=[2048,29]) 
(name=logits/dense/kernel/Adam_1, type=float(1), shape=[2048,29]) (name=logits/dense/bias/Adam, type=float(1), shape=[29]) (name=logits/dense/bias/Adam_1, type=float(1), shape=[29])
Found 59 possible outputs: (name=global_step/read, op=Identity) (name=global_step/cond/switch_t, op=Identity) (name=global_step/cond/switch_f, op=Identity) (name=global_step/add, op=Add) (name=seed2, op=Select) (name=IteratorToStringHandle, op=IteratorToStringHandle) (name=rnn/cudnn_lstm/Identity, op=Identity) (name=rnn/cudnn_lstm/zeros/Less, op=Less) (name=rnn/cudnn_lstm/zeros_1/Less, op=Less) (name=dense4/dense/kernel/Regularizer/l2_regularizer, op=Mul) (name=dense_to_sparse/Shape, op=Shape) (name=gradients/zeros_like, op=ZerosLike) (name=gradients/dense4/dropout/dropout/mul_grad/tuple/control_dependency_1, op=Identity) (name=gradients/dense4/dropout/dropout/truediv_grad/tuple/control_dependency_1, op=Identity) (name=gradients/dense4/Minimum_grad/tuple/control_dependency_1, op=Identity) (name=gradients/zeros_like_3, op=ZerosLike) (name=gradients/rnn/cudnn_lstm/CudnnRNN_grad/tuple/control_dependency_1, op=Identity) (name=gradients/rnn/cudnn_lstm/CudnnRNN_grad/tuple/control_dependency_2, op=Identity) (name=gradients/conv/Minimum_2_grad/tuple/control_dependency_1, op=Identity) (name=gradients/conv/Minimum_1_grad/tuple/control_dependency_1, op=Identity) (name=gradients/conv/Minimum_grad/tuple/control_dependency_1, op=Identity) (name=gradients/conv/conv2d/Conv2D_grad/tuple/control_dependency, op=Identity) (name=conv/conv2d/kernel/Adam/read, op=Identity) (name=conv/conv2d/kernel/Adam_1/read, op=Identity) (name=conv/conv2d/bias/Adam/read, op=Identity) (name=conv/conv2d/bias/Adam_1/read, op=Identity) (name=conv/conv2d_1/kernel/Adam/read, op=Identity) (name=conv/conv2d_1/kernel/Adam_1/read, op=Identity) (name=conv/conv2d_1/bias/Adam/read, op=Identity) (name=conv/conv2d_1/bias/Adam_1/read, op=Identity) (name=conv/conv2d_2/kernel/Adam/read, op=Identity) (name=conv/conv2d_2/kernel/Adam_1/read, op=Identity) (name=conv/conv2d_2/bias/Adam/read, op=Identity) (name=conv/conv2d_2/bias/Adam_1/read, op=Identity) (name=cond/switch_t, op=Identity) (name=cond/switch_f, op=Identity) 
(name=zeros, op=Fill) (name=rnn/cudnn_lstm/opaque_kernel/Adam/cond/switch_t, op=Identity) (name=rnn/cudnn_lstm/opaque_kernel/Adam/cond/switch_f, op=Identity) (name=rnn/cudnn_lstm/opaque_kernel/Adam/read, op=Identity) (name=cond_1/switch_t, op=Identity) (name=cond_1/switch_f, op=Identity) (name=zeros_1, op=Fill) (name=rnn/cudnn_lstm/opaque_kernel/Adam_1/cond/switch_t, op=Identity) (name=rnn/cudnn_lstm/opaque_kernel/Adam_1/cond/switch_f, op=Identity) (name=rnn/cudnn_lstm/opaque_kernel/Adam_1/read, op=Identity) (name=dense4/dense/kernel/Adam/read, op=Identity) (name=dense4/dense/kernel/Adam_1/read, op=Identity) (name=dense4/dense/bias/Adam/read, op=Identity) (name=dense4/dense/bias/Adam_1/read, op=Identity) (name=logits/dense/kernel/Adam/read, op=Identity) (name=logits/dense/kernel/Adam_1/read, op=Identity) (name=logits/dense/bias/Adam/read, op=Identity) (name=logits/dense/bias/Adam_1/read, op=Identity) (name=Adam, op=AssignAdd) (name=concat, op=ConcatV2) (name=concat_1, op=ConcatV2) (name=Merge/MergeSummary, op=MergeSummary) (name=save/Identity, op=Identity)

How can I train with my dataset?

I have a dataset: one folder 'wav' (.wav files) and one text file with one line per wav file, in the format name_wav text_of_wav.
So how can I train with this data? Thanks so much, I'm a beginner.
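
One possible first step is to turn the transcript file into a CSV that lists each audio path next to its text. The sketch below only shows that conversion. The column names `path` and `label`, the `wav` directory default, and the helper names are assumptions made for illustration; the CSV layout the project actually expects is not described in this thread, so check it against the README before use.

```python
import csv
import io


def parse_transcript(lines, wav_dir="wav"):
    """Turn 'name_wav text_of_wav' lines into (path, text) pairs."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        # The first token is the file name, the rest is the transcription.
        name, _, text = line.partition(" ")
        if not name.lower().endswith(".wav"):
            name += ".wav"
        rows.append((f"{wav_dir}/{name}", text.strip()))
    return rows


def write_csv(rows, handle):
    """Write the pairs as CSV; the header names are hypothetical."""
    writer = csv.writer(handle)
    writer.writerow(["path", "label"])
    writer.writerows(rows)


# Example with made-up transcript lines.
rows = parse_transcript(["utt1 hello world", "", "utt2.wav  good morning"])
buffer = io.StringIO()
write_csv(rows, buffer)
```

To use it on real data, pass an opened transcript file to `parse_transcript` and an opened output file (with `newline=""`) to `write_csv`.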

Update Documentation

  • Directories:
    • Point out the required speech_checkpoints and speech-corpus dirs.
    • Remember to update the tree output.
  • CSV: Add information about the required CSV format to README.md. (#8)
  • Reference the speech-corpus-dl git.
  • Reset params.py and validate the default params. (#10)

Common Voice Dataset

Hi,

I just wanted to know if all the datasets you have used are clean speech. Specifically, I am wondering about the Common Voice dataset: have you analyzed it by any chance? Since they offer a mobile app as well as a browser platform for recording, I feel there is a chance that the recordings are noisy.

Thank you

About models

Hello, can I have a trained model that you don't need? My computer's computing power is relatively poor. I want to test the results of the model and then consider training with cloud services.

Configuration for low memory GPU

I use a laptop with 2 GB of GPU memory (Nvidia MX150).

I am trying to build a new language model, so I have tried source code from many projects (DeepSpeech, PyTorch, etc.).

To make my laptop capable of handling the process, I ran the other source code with a low batch size and a low n_hidden. With your code I already tried reducing the batch size to 1 and num_units_rnn to 1024, but it still runs out of GPU memory.

Do you have any recommended settings?

Command that I use:
python3 asr/train.py -- --used_model ds2 --rnn_cell rnn_relu --feature_type mfcc --batch_size 1 --max_epochs 15 --cudnn True --allow_vram_growth True --num_units_rnn 1024 --delete tensorboard learning_rate 0.00001

Input & output of graph

I wanted to know what the input and output nodes of the graph generated by your code are. Could you please provide me with this information?
Thank you in advance

Issue with input and output names

Hello, I am currently trying to freeze the graph from this model, and I am unable to do so because when I inspect the "graph.pbtxt" created after training, there is no node with the name "logits/dense".

Please help me figure out what the output node name is so I can freeze the graph to .pb.

Thank you
Regards,
Rahul B
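
One way to narrow down candidate output nodes is to list the node names in graph.pbtxt under a given name scope and drop the ones that look training-related. The sketch below does this with a regular expression on the text file, without TensorFlow. It assumes each node block starts with `name:` followed by `op:`, that the output lives under the `logits/` scope (suggested by the `logits/dense/kernel` variable in the summarize_graph output above), and that the listed markers identify training nodes. The sample graph text and its node `logits/dense/BiasAdd` are made up for illustration, not taken from the real graph.

```python
import re

# Matches the start of each node block: node { name: "..." op: "..." ...
NODE_RE = re.compile(r'node\s*\{\s*name:\s*"([^"]+)"\s*op:\s*"([^"]+)"')

# Substrings that mark optimizer, gradient, saver, or regularizer nodes
# (an assumption based on the names in the summarize_graph output).
TRAINING_MARKERS = ("gradients/", "/Adam", "save/", "Regularizer")


def list_nodes(pbtxt_text):
    """Return (name, op) pairs for every node found in the text."""
    return NODE_RE.findall(pbtxt_text)


def candidate_outputs(pbtxt_text, scope="logits/"):
    """Keep nodes inside `scope` that do not look training-related."""
    return [
        (name, op)
        for name, op in list_nodes(pbtxt_text)
        if name.startswith(scope)
        and not any(marker in name for marker in TRAINING_MARKERS)
    ]


# Made-up sample; replace with open("graph.pbtxt").read() for a real file.
sample = '''
node {
  name: "logits/dense/kernel/Adam"
  op: "VariableV2"
}
node {
  name: "logits/dense/BiasAdd"
  op: "BiasAdd"
  input: "logits/dense/MatMul"
}
node {
  name: "gradients/zeros_like"
  op: "ZerosLike"
}
'''
candidates = candidate_outputs(sample)
```

The result is only a shortlist. The node to freeze still has to be confirmed against the model code.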

Value of Beam Width

Hi
Could I have some info on how the beam width of 1024 was chosen? What is the role of the beam width parameter? I am confused about this parameter.
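
On the role of the parameter in general: in CTC beam search decoding, the beam width is the number of candidate label prefixes kept after each time step. A larger width explores more hypotheses at a higher compute cost, and a width of 1 keeps only the single best prefix at each step. The sketch below is a minimal pure-Python CTC prefix beam search, not the decoder used in this repository; the function name, the blank index 0, and the toy probabilities are assumptions for illustration.

```python
from collections import defaultdict


def ctc_beam_search(probs, beam_width, blank=0):
    """Decode per-frame label probabilities with CTC prefix beam search.

    probs: list of frames, each a list of probabilities per label.
    Returns the best label prefix (a tuple) and its total probability.
    """
    # prefix -> (prob. of paths ending in blank, prob. ending in non-blank)
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_non_blank) in beams.items():
            for label, p in enumerate(frame):
                if label == blank:
                    # A blank keeps the prefix unchanged.
                    next_beams[prefix][0] += (p_blank + p_non_blank) * p
                elif prefix and prefix[-1] == label:
                    # A repeated label collapses unless a blank came between.
                    next_beams[prefix][1] += p_non_blank * p
                    next_beams[prefix + (label,)][1] += p_blank * p
                else:
                    next_beams[prefix + (label,)][1] += (p_blank + p_non_blank) * p
        # Prune: keep only the `beam_width` most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(score) for prefix, score in ranked[:beam_width]}
    best_prefix, best_score = max(beams.items(), key=lambda kv: sum(kv[1]))
    return best_prefix, sum(best_score)


# Toy input: two frames, labels are 0 = blank and 1 = "a".
toy = [[0.6, 0.4], [0.6, 0.4]]
narrow = ctc_beam_search(toy, beam_width=1)
wide = ctc_beam_search(toy, beam_width=2)
```

On this toy input, a width of 1 discards the prefix "a" after the first frame and ends with the empty output (probability 0.36), while a width of 2 keeps it and finds that the paths collapsing to "a" sum to 0.64. Wider beams reduce the chance of pruning the eventual best hypothesis early.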
