specs2text's Issues

My prediction is only a's and e's

I have trained specs2text with the small_model, and when I test it, I only get "a" and "e" as output. The input I use for testing is the spectrogram obtained from the WavAudio class. What am I doing wrong?

------------------------test code--------------------------------
from keras import backend as K
from data_gen import WavAudio
from model import small_model
import numpy as np

# Output alphabet; the CTC blank is the extra class at index len(labels).
labels = [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j",
          "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
          "v", "w", "x", "y", "z", "'"]

audio_path = "datagen_utils/datasets/LibriSpeech/train-clean-100-wav/4014/186179/4014-186179-0024.wav"

# Compute the spectrogram of the test utterance.
wav = WavAudio(audio_path)
wavr = wav.specgram

decode_model = small_model((None, wavr.shape[0], 256),
                           len(labels) + 1, 1000, train=False)

decode_model.load_weights('small_model5x3.h5')

# Add a batch dimension before predicting.
wavr1 = np.expand_dims(wavr, axis=0)

pred = decode_model.predict(wavr1)

def labels_to_text(labs):
    ret = []
    for c in labs:
        if c == len(labels):  # CTC Blank
            ret.append("_")
        else:
            ret.append(labels[c])
    return "".join(ret)

def decode_predict_ctc(out, top_paths=1):
    results1 = []
    beam_width = 100
    if beam_width < top_paths:
        beam_width = top_paths
    for i in range(top_paths):
        # Beam-search decode; every frame of each batch item is treated as valid input.
        labs = K.get_value(K.ctc_decode(out, input_length=np.ones(out.shape[0]) * out.shape[1],
                                        greedy=False, beam_width=beam_width, top_paths=top_paths)[0][i])[0]
        text = labels_to_text(labs)
        results1.append(text)

    return results1

results = decode_predict_ctc(pred)
# The printed strings are Spanish: "RESULTADO DE LA PREDICCION" = "prediction result".
print("RESULTADO DE LA PREDICCION----------------------------------------------------------")
print("Transcript original:",
      "was a constantly moving line of motor trucks coming forward with men and shells while out ahead of them tremendous and menacing big tanks")
print("Prediccion: ", results)

--------------------------what I get---------------------------------------
RESULTADO DE LA PREDICCION----------------------------------------------------------
Transcript original: for he began to suspect who she was she however without noticing the excitement of cardenio continuing her story went on to say
Prediccion: ['a e e a a e e e e e e e a a e a e e ea e a e e']
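One way to narrow this down is to look at the raw per-frame predictions without going through the beam search. The sketch below is only an illustration, not part of specs2text: it assumes the model output has shape (time_steps, num_classes) per utterance and that the CTC blank is the last class index (len(labels)), as labels_to_text above suggests. The helper names greedy_ctc_decode and frame_class_histogram are hypothetical.

```python
import numpy as np

# Same alphabet as in the test code above.
LABELS = [" "] + [chr(c) for c in range(ord("a"), ord("z") + 1)] + ["'"]
BLANK = len(LABELS)  # assumption: the CTC blank is the last class index


def greedy_ctc_decode(probs, blank=BLANK):
    """Greedy CTC decoding for one utterance.

    probs: array of shape (time_steps, num_classes) with per-frame scores.
    Takes the best class per frame, collapses consecutive repeats,
    then drops blanks.
    """
    best = np.argmax(probs, axis=-1)
    decoded = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            decoded.append(LABELS[idx])
        prev = idx
    return "".join(decoded)


def frame_class_histogram(probs):
    """Count how many frames pick each class as their argmax."""
    best = np.argmax(probs, axis=-1)
    return np.bincount(best, minlength=probs.shape[-1])
```

With the variables from the test code, this would be called as greedy_ctc_decode(pred[0]) and frame_class_histogram(pred[0]). If the histogram shows nearly all frames voting for blank, space, "a" and "e", the collapse already happens in the network output rather than in the decoding step.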
