
adventures-in-ml-code's People

Contributors

adventuresinml, romeokienzler


adventures-in-ml-code's Issues

Is that Q-learning for a DDQN, or is that a wrongly implemented SARSA?

Thanks for all your stuff, but one question...

Is this Q-learning for a DDQN, or is it a wrongly implemented SARSA?

See: https://stackoverflow.com/a/41420616

if target_network is None:
    # standard Q-learning / DQN target: max over the primary network's Q(s', .)
    updates[valid_idxs] += GAMMA * np.amax(prim_qtp1.numpy()[valid_idxs, :], axis=1)
else:
    # double DQN target: the primary network selects the action,
    # the target network evaluates it
    prim_action_tp1 = np.argmax(prim_qtp1.numpy(), axis=1)
    q_from_target = target_network(next_states)
    updates[valid_idxs] += GAMMA * q_from_target.numpy()[batch_idxs[valid_idxs], prim_action_tp1[valid_idxs]]

Isn't the correct code like this?

if target_network is None:
    updates[valid_idxs] += GAMMA * np.amax(prim_qtp1.numpy()[valid_idxs, :], axis=1)
else:
    # proposed alternative: max over the target network's Q(s', .)
    q_from_target = target_network(next_states)
    updates[valid_idxs] += GAMMA * np.amax(q_from_target.numpy()[valid_idxs, :], axis=1)

???
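For reference, a minimal NumPy sketch (my own, not code from the repo) of the three one-step bootstrap targets under discussion. The argument names are hypothetical: q_prim_tp1 / q_targ_tp1 stand for the primary and target network Q-values at the next states, and a_tp1 for the action actually taken there (only SARSA needs it):

import numpy as np

GAMMA = 0.99

def bootstrap_targets(rewards, q_prim_tp1, q_targ_tp1, a_tp1):
    # rewards    : (batch,) immediate rewards
    # q_prim_tp1 : (batch, n_actions) primary-network Q(s', .)
    # q_targ_tp1 : (batch, n_actions) target-network Q(s', .)
    # a_tp1      : (batch,) action actually taken in s' (SARSA only)
    batch = np.arange(len(rewards))

    # Q-learning / vanilla DQN: bootstrap with the max over Q(s', .)
    q_learning = rewards + GAMMA * np.amax(q_targ_tp1, axis=1)

    # double Q-learning / DDQN: the primary network selects the action,
    # the target network evaluates it
    a_star = np.argmax(q_prim_tp1, axis=1)
    double_q = rewards + GAMMA * q_targ_tp1[batch, a_star]

    # SARSA: bootstrap with the action the agent actually took in s'
    sarsa = rewards + GAMMA * q_targ_tp1[batch, a_tp1]

    return q_learning, double_q, sarsa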

keras_lstm.py does not show the results claimed in the tutorial?

I run

python keras_lstm.py 1

with:
detlef@ubuntu-i7:~$ python -c 'import tensorflow as tf; print(tf.__version__)'  # for Python 2
1.12.0

(tested with the GPU turned off too, but normally using CUDA 9.1 on a GTX 1070 Ti)

I ran 50 epochs and got only:
Epoch 50/50
1549/1549 [==============================] - 169s 109ms/step - loss: 5.2143 - categorical_accuracy: 0.1464 - val_loss: 7.6633 - val_categorical_accuracy: 0.0466

The tutorial says you should reach about 40% training and 20% validation accuracy?

Thanks a lot for the wonderful tutorial!

Tensorflow on hadoop

Can you please provide a tutorial on how to run tf_word2vec.py on Hadoop, basically to distribute the workload and then reduce?

keras_word2vec.py doesn't converge

Hi, please see below - it doesn't converge... Can you confirm it is still working on your side?
FYI: running on Python 2.7, Keras 2.1.4, TensorFlow 1.5.

Iteration 75600, loss=0.764099955559
Iteration 75700, loss=0.703801393509
Iteration 75800, loss=0.786265671253
Iteration 75900, loss=0.683647096157
Iteration 76000, loss=0.696249544621
Iteration 76100, loss=0.686785459518
Iteration 76200, loss=0.641143143177
Iteration 76300, loss=0.66484028101
Iteration 76400, loss=0.673577725887
Iteration 76500, loss=0.706019639969
Iteration 76600, loss=0.662336111069
Iteration 76700, loss=0.693303346634
Iteration 76800, loss=0.667164921761
Iteration 76900, loss=0.699192345142
Iteration 77000, loss=0.666543602943
Iteration 77100, loss=0.750823795795
Iteration 77200, loss=0.216024711728
Iteration 77300, loss=0.774852216244
Iteration 77400, loss=0.725894510746
Iteration 77500, loss=0.800480008125
Iteration 77600, loss=0.599587082863
Iteration 77700, loss=0.712833166122
Iteration 77800, loss=0.707559227943
Iteration 77900, loss=0.757015168667
Iteration 78000, loss=0.791208267212
Iteration 78100, loss=0.773834228516
Iteration 78200, loss=0.702738165855
Iteration 78300, loss=0.701055109501
Iteration 78400, loss=0.707786381245
Iteration 78500, loss=0.702587544918
Iteration 78600, loss=1.01350021362
Iteration 78700, loss=0.684418618679
Iteration 78800, loss=0.670415282249
Iteration 78900, loss=0.692475199699
Iteration 79000, loss=0.699831724167
Iteration 79100, loss=0.66406583786
Iteration 79200, loss=0.572940707207
Iteration 79300, loss=0.687783002853
Iteration 79400, loss=0.693389892578
Iteration 79500, loss=0.518278539181
Iteration 79600, loss=0.726815402508
Iteration 79700, loss=0.696648955345
Iteration 79800, loss=0.739838123322
Iteration 79900, loss=0.70836687088
Iteration 80000, loss=0.712565600872
Nearest to however: idle, paperback, boris, consolidated, preserved, protest, africans, pointing,
Nearest to four: bremen, vi, fire, designations, citing, ruth, flash, flanders,
Nearest to such: mathrm, adaptive, urban, places, radio, exhibit, corporate, meets,
Nearest to world: shelter, elite, jet, protons, evident, somalia, original, democrats,
Nearest to were: clan, expectancy, comprises, compiler, persians, maxwell, defining, allah,
Nearest to eight: involves, d, hearts, bit, apart, player, press, orthodox,
Nearest to that: prevention, warrior, include, treaty, congo, belief, aerospace, dia,
Nearest to can: waterways, gwh, chord, marriages, rituals, crossing, defended, known,
Nearest to while: dial, exhibits, selective, leonard, extensions, concern, perfectly, egyptian,
Nearest to or: eventually, heard, organised, mirrors, piano, blessed, touch, crowd,
Nearest to after: lab, track, eritrea, implemented, fl, papacy, history, mpeg,
Nearest to first: cp, obvious, demons, allowing, libya, watching, prototype, mistake,
Nearest to use: corner, greenwich, neighbours, biology, converted, armenia, superhero, welsh,

Iteration 190000, loss=1.28259468079
Nearest to however: africans, lack, pressed, voted, consolidated, navigator, developed, absinthe,
Nearest to four: january, vi, deeply, flash, fire, creating, overcome, implementations,
Nearest to such: known, to, mathrm, and, of, radio, the, in,
Nearest to world: separation, background, original, conditions, dimensional, course, verb, characters,
Nearest to were: a, of, to, in, and, one, the, is,
Nearest to eight: hearts, referred, herself, hamilton, census, laser, cameron, arms,
Nearest to that: in, of, the, and, for, a, to, as,
Nearest to can: rico, chord, scheduled, planted, costume, reflects, asks, tied,
Nearest to while: darker, newspapers, loans, crusade, played, method, structural, variable,
Nearest to or: the, in, and, one, on, of, to, a,
Nearest to after: continuity, musician, unique, bengal, hormones, center, fairy, publishers,
Nearest to first: governed, obvious, original, chemicals, fairly, evaluation, cp, aa,
Nearest to use: corner, guitarist, buddha, fauna, arising, electors, god, painters,
Nearest to used: legion, sphere, households, karaoke, ask, nl, raf, persons,
Nearest to people: canterbury, evaluation, kingdom, loyalty, renowned, province, joel, catcher,
Nearest to called: mobility, costa, shorter, labels, manpower, continued, eve, mother,

This tutorial is very outdated

Two things.

This tutorial uses softmax_cross_entropy_with_logits() instead of softmax_cross_entropy_with_logits_v2(). That should be changed.
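For what it's worth, a minimal sketch of that change under TF 1.x; y and logits here are hypothetical placeholders standing in for the tutorial's labels and network output:

import tensorflow as tf  # TF 1.x

# hypothetical stand-ins for the tutorial's labels and network output
y = tf.placeholder(tf.float32, [None, 10])
logits = tf.placeholder(tf.float32, [None, 10])

# old, deprecated call:
# cross_entropy = tf.reduce_mean(
#     tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))

# suggested replacement:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits))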

Also, this is importing the MNIST database from a soon-to-be-deprecated source. I got it working by replacing the imports with:

import tensorflow as tf
# silence the deprecation warnings from the old input_data module
old_v = tf.logging.get_verbosity()
tf.logging.set_verbosity(tf.logging.ERROR)
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
tf.logging.set_verbosity(old_v)
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
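Alternatively (just a sketch, not the tutorial's code), the same data can be loaded without the deprecated input_data module via tf.keras.datasets:

import numpy as np
import tensorflow as tf

# load MNIST from the Keras datasets module instead of
# tensorflow.examples.tutorials.mnist
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# flatten to 784-dim vectors, scale to [0, 1] and one-hot encode the labels
x_train = x_train.reshape(-1, 784).astype(np.float32) / 255.0
x_test = x_test.reshape(-1, 784).astype(np.float32) / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)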

I submitted a working model if you want to present a working tutorial.

ValueError while running the policy gradient code on Colab

ValueError Traceback (most recent call last)
in ()
56
57 if done:
---> 58 loss = update_network(network, rewards, states, actions, num_actions)
59 tot_reward = sum(rewards)
60 print(f"Episode: {episode}, Reward: {tot_reward}, avg loss: {loss:.5f}")

10 frames
in update_network(network, rewards, states, actions, num_actions)
37 discounted_rewards /= np.std(discounted_rewards)
38 states = np.vstack(states)
---> 39 loss = network.train_on_batch(states, discounted_rewards)
40 return loss
41

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in train_on_batch(self, x, y, sample_weight, class_weight, reset_metrics, return_dict)
1349 class_weight)
1350 train_function = self.make_train_function()
-> 1351 logs = train_function(iterator)
1352
1353 if reset_metrics:

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in call(self, *args, **kwds)
578 xla_context.Exit()
579 else:
--> 580 result = self._call(*args, **kwds)
581
582 if tracing_count == self._get_tracing_count():

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
625 # This is the first call of call, so we have to initialize.
626 initializers = []
--> 627 self._initialize(args, kwds, add_initializers_to=initializers)
628 finally:
629 # At this point we know that the initialization is complete (or less

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in _initialize(self, args, kwds, add_initializers_to)
504 self._concrete_stateful_fn = (
505 self._stateful_fn._get_concrete_function_internal_garbage_collected( # pylint: disable=protected-access
--> 506 *args, **kwds))
507
508 def invalid_creator_scope(*unused_args, **unused_kwds):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
2444 args, kwargs = None, None
2445 with self._lock:
-> 2446 graph_function, _, _ = self._maybe_define_function(args, kwargs)
2447 return graph_function
2448

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _maybe_define_function(self, args, kwargs)
2775
2776 self._function_cache.missed.add(call_context_key)
-> 2777 graph_function = self._create_graph_function(args, kwargs)
2778 self._function_cache.primary[cache_key] = graph_function
2779 return graph_function, args, kwargs

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
2665 arg_names=arg_names,
2666 override_flat_arg_shapes=override_flat_arg_shapes,
-> 2667 capture_by_value=self._capture_by_value),
2668 self._function_attributes,
2669 # Tell the ConcreteFunction to clean up its graph once it goes out of

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
979 _, original_func = tf_decorator.unwrap(python_func)
980
--> 981 func_outputs = python_func(*func_args, **func_kwargs)
982
983 # invariant: func_outputs contains only Tensors, CompositeTensors,

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in wrapped_fn(*args, **kwds)
439 # wrapped allows AutoGraph to swap in a converted function. We give
440 # the function a weak reference to itself to avoid a reference cycle.
--> 441 return weak_wrapped_fn().wrapped(*args, **kwds)
442 weak_wrapped_fn = weakref.ref(wrapped_fn)
443

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
966 except Exception as e: # pylint:disable=broad-except
967 if hasattr(e, "ag_error_metadata"):
--> 968 raise e.ag_error_metadata.to_exception(e)
969 else:
970 raise

ValueError: in user code:

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:571 train_function  *
    outputs = self.distribute_strategy.run(
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:951 run  **
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
    return fn(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:533 train_step  **
    y, y_pred, sample_weight, regularization_losses=self.losses)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/compile_utils.py:204 __call__
    loss_value = loss_obj(y_t, y_p, sample_weight=sw)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:143 __call__
    losses = self.call(y_true, y_pred)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:246 call
    return self.fn(y_true, y_pred, **self._fn_kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py:1527 categorical_crossentropy
    return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:4561 categorical_crossentropy
    target.shape.assert_is_compatible_with(output.shape)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py:1117 assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))

ValueError: Shapes (24, 1) and (24, 2) are incompatible
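Not the repo author's fix, but for context: the (24, 1) vs (24, 2) mismatch is the kind of shape error you get when the targets passed to a categorical-crossentropy loss are not one-hot. One common way to train a REINFORCE policy with stock Keras losses is to pass one-hot encoded actions as y_true and the discounted returns as sample weights; a minimal sketch, assuming a 2-action CartPole-style policy network:

import numpy as np
import tensorflow as tf

num_actions = 2  # assumed (e.g. CartPole)

# hypothetical policy network with a softmax over the actions
network = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(num_actions, activation='softmax'),
])
network.compile(loss='categorical_crossentropy', optimizer='adam')

def update_network_sketch(network, discounted_rewards, states, actions):
    # y_true must match the (batch, num_actions) shape of the softmax output,
    # so one-hot encode the actions that were actually taken ...
    one_hot_actions = tf.keras.utils.to_categorical(actions, num_classes=num_actions)
    # ... and weight each sample's cross-entropy by its discounted return,
    # which gives the REINFORCE gradient -G_t * log pi(a_t | s_t)
    return network.train_on_batch(np.vstack(states), one_hot_actions,
                                  sample_weight=np.array(discounted_rewards))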

Keras word2vec is much faster than TensorFlow word2vec?

Thanks for your work.
I had a question about word2vec.

I ran both of your word2vec implementations, viz. keras_word2vec.py and tf_word2vec.py.

Keras word2vec with the TensorFlow backend seems faster than the TensorFlow word2vec. Ideally it should not be, since Keras is indirectly calling TensorFlow.
The TensorFlow code took 1182 seconds to run 15 iterations, whereas Keras took just 796 seconds for 15 iterations.

How is Keras faster than TensorFlow?

Can you please help me?
My CPU instance:

AWS instance: c4.4xlarge (compute optimized)
CPU: Intel Xeon E5-2666 v3 @ 2.90 GHz
CPU cores: 16 (2 CPUs with 8 cores each)
Memory: 30 GB
FPU: yes

Thanks.

TF 2.3.0: ValueError: operands could not be broadcast together with shapes (31,) (32,32)

I am getting this error while running the code in "per_duelingq_spaceinv_tf2.py" on Google Colab, which uses TensorFlow 2.3.0:

ValueError                                Traceback (most recent call last)
<ipython-input-8-000aec5df542> in <module>()
     29 
     30     if steps > DELAY_TRAINING:
---> 31       loss = train(primary_network, memory, target_network)
     32       update_network(primary_network, target_network)
     33       _, error = get_per_error(tf.reshape(old_state_stack, (1, POST_PROCESS_IMAGE_SIZE[0], 

1 frames
<ipython-input-5-920194395f77> in get_per_error(states, actions, rewards, next_states, terminal, primary_network, target_network)
     10     # the q value for the prim_action_tp1 from the target network
     11     q_from_target = target_network(next_states)
---> 12     updates = rewards + (1 - terminal) * GAMMA * q_from_target.numpy()[:, prim_action_tp1]
     13     target_q[:, actions] = updates
     14     # calculate the loss / error to update priorites

ValueError: operands could not be broadcast together with shapes (31,) (32,32) 

Any suggestions would be useful.

Thanks & Regards,
Swagat
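Not a fix from the repo, but the (32, 32) in the error is the classic NumPy advanced-indexing pitfall: indexing a (batch, n_actions) array with [:, index_array] returns a (batch, batch) matrix rather than one value per row. A minimal standalone sketch of the difference:

import numpy as np

batch_size, n_actions = 32, 6
q_from_target = np.random.rand(batch_size, n_actions)
prim_action_tp1 = np.random.randint(0, n_actions, size=batch_size)

# a slice plus an index array gathers every index for every row -> (32, 32)
wrong = q_from_target[:, prim_action_tp1]
print(wrong.shape)   # (32, 32)

# per-sample gathering needs matching row indices -> (32,)
right = q_from_target[np.arange(batch_size), prim_action_tp1]
print(right.shape)   # (32,)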

double/dueling Q learning

Shouldn't the model be fitted (model.fit(...)) and used for prediction (model.predict(state)) in the double Q-learning (and also the dueling Q-learning) examples? It seems you also forgot to apply the same in the Atari example.
Are the graphs in the associated blog posts from actual experiments, or should we expect something different? I mean, have you tested your implementations?

Thanks
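For context, a rough sketch (my own, not taken from the repo) of what a fit/predict-style double-Q update could look like, assuming primary_network and target_network are compiled Keras models with one output per action:

import numpy as np

GAMMA = 0.99  # assumed discount factor

def double_q_fit_step(primary_network, target_network, batch):
    # batch = (states, actions, rewards, next_states, terminal), all NumPy arrays
    states, actions, rewards, next_states, terminal = batch
    idx = np.arange(len(rewards))

    # current estimates become the regression targets, except for the taken actions
    target_q = primary_network.predict(states)

    # double Q: the primary network selects the next action, the target network evaluates it
    next_actions = np.argmax(primary_network.predict(next_states), axis=1)
    next_q = target_network.predict(next_states)[idx, next_actions]
    target_q[idx, actions] = rewards + (1.0 - terminal) * GAMMA * next_q

    return primary_network.fit(states, target_q, epochs=1, verbose=0)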

Policy Gradient REINFORCE algorithm not converging.

First of all, thank you for the tutorial here!

I am trying to implement/run the code from the tutorial; however, the results do not converge after 500 steps, as shown in the figure "Reward: Training progress of Policy Gradient RL in Cartpole environment". Even after 5000 steps, the reward is around 10. Is this correct?

Thanks again!
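For reference, a minimal sketch (my own formulation, with GAMMA assumed to be 0.99) of the discounted, normalised returns that drive the REINFORCE update:

import numpy as np

GAMMA = 0.99  # assumed discount factor

def discounted_returns(rewards):
    # G_t = r_t + GAMMA * G_{t+1}, computed backwards over the episode
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + GAMMA * running
        returns[t] = running
    # normalise to zero mean / unit variance to stabilise the policy gradient
    returns -= returns.mean()
    returns /= returns.std() + 1e-8
    return returns

# example: an episode of length 5 with reward 1 per step
print(discounted_returns(np.ones(5)))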

While trying to parse the command-line args in lstm_tutorial.py (lines 14-17), I get this error:

usage: ipykernel_launcher.py [-h] [--data_path DATA_PATH] run_opt
ipykernel_launcher.py: error: argument run_opt: invalid int value: 'C:\Users\Sourav\AppData\Roaming\jupyter\runtime\kernel-15d13538-092b-4243-bed1-ed8946e08bc4.json'
An exception has occurred, use %tb to see the full traceback.

SystemExit: 2

C:\Users\Sourav\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py:2855: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)

Could you please help me on this? Thanks!
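Not part of the original script, but a common workaround when running an argparse-based script inside Jupyter is to pass the argument list explicitly instead of letting argparse read ipykernel's command line; a sketch using the tutorial's two arguments:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('run_opt', type=int, default=1,
                    help='An integer: 1 to train, 2 to test')
parser.add_argument('--data_path', type=str, default='',
                    help='The full path of the training data')

# inside a notebook, supply the arguments yourself; from a terminal this
# would come from "python lstm_tutorial.py 1"
args = parser.parse_args(['1'])
print(args.run_opt)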

Why can the weights and bias be initialized? (CNN tutorial)

Hello,

I have code similar to yours, but when I run it in Python it always returns the error 'Attempting to use uninitialized value weights_10 (bias_10)'.

Since the weights and bias are defined inside a function, I would expect them not to be initializable from outside (but even when I try local initialization, it still returns the same error).

Can I know some more about how these two variables work in your code? (Mine is very similar to yours, but in my case it returns an error.)
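Not the tutorial's exact code, but a sketch of how the pattern usually works in TF 1.x: variables created inside a helper function still live in the default graph, so a single tf.global_variables_initializer() covers them, provided the initializer op is created (and run) after all the layer-building functions have been called:

import tensorflow as tf  # TF 1.x style

def conv_layer(input_data, num_filters, name):
    # variables created inside a function are still added to the default graph
    weights = tf.get_variable(name + "_W", shape=[5, 5, 1, num_filters],
                              initializer=tf.truncated_normal_initializer(stddev=0.03))
    bias = tf.get_variable(name + "_b", shape=[num_filters],
                           initializer=tf.zeros_initializer())
    out = tf.nn.conv2d(input_data, weights, strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(out + bias)

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
layer1 = conv_layer(x, 32, "layer1")

# create the initializer *after* all variables exist, then run it first
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    # ... sess.run(layer1, feed_dict={x: ...}) etc.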

keras_lstm.py as .ipynb file incl. fix of gfile approach

""" This is the jupyter notebook version of the tutorial with some small fixes.
    Instead of running it in command line like "python keras_lstm.py 1" with 
    runopt parameter here = 1, you need to assign run_opt parameter in the code!

    To run this code, you'll need to first download and extract the text dataset
    from here: http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz. Change the
    data_path variable below to your local extraction path and assign run_opt
    """

data_path = "C:\\Users\\Andy\\Documents\\simple-examples\\data\\"

# 'An integer: 1 to train, 2 to test'
run_opt = 1

####

# all from https://adventuresinmachinelearning.com/keras-lstm-tutorial/
# and originally full code from https://github.com/adventuresinML/adventures-in-ml-code/blob/master/keras_lstm.py
# issue of running it as jupyter nb: https://github.com/adventuresinML/adventures-in-ml-code/issues/12
# (solved here)

from __future__ import print_function
import collections
import os
import tensorflow as tf
from keras.models import Sequential, load_model
from keras.layers import Dense, Activation, Embedding, Dropout, TimeDistributed
from keras.layers import LSTM
from keras.optimizers import Adam
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint
import numpy as np
import pandas as pd

####

def build_vocab(filename):
    data = pd.read_csv(filename, encoding='utf-8', header = None)
    data = ''.join([str(x) for x in data[0]]).split()

    counter = collections.Counter(data)
    count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))

    words, _ = list(zip(*count_pairs))
    word_to_id = dict(zip(words, range(len(words))))

    return word_to_id

def file_to_word_ids(filename, word_to_id):
    data = pd.read_csv(filename, encoding='utf-8', header = None)
    data = ''.join([str(x) for x in data[0]]).split()

    return [word_to_id[word] for word in data if word in word_to_id]


def load_data():
    # get the data paths
    train_path = os.path.join(data_path, "ptb.train.txt")
    valid_path = os.path.join(data_path, "ptb.valid.txt")
    test_path = os.path.join(data_path, "ptb.test.txt")

    # build the complete vocabulary, then convert text data to list of integers
    word_to_id = build_vocab(train_path)

    train_data = file_to_word_ids(train_path, word_to_id)

    valid_data = file_to_word_ids(valid_path, word_to_id)
    test_data = file_to_word_ids(test_path, word_to_id)
    vocabulary = len(word_to_id)
    reversed_dictionary = dict(zip(word_to_id.values(), word_to_id.keys()))

    print(train_data[:5])
    print(list(word_to_id.items())[:10])
    print(vocabulary)
    print(" ".join([reversed_dictionary[x] for x in train_data[:10]]))
    return train_data, valid_data, test_data, vocabulary, reversed_dictionary

####

train_data, valid_data, test_data, vocabulary, reversed_dictionary = load_data()

####

class KerasBatchGenerator(object):

    def __init__(self, data, num_steps, batch_size, vocabulary, skip_step=5):
        self.data = data
        self.num_steps = num_steps
        self.batch_size = batch_size
        self.vocabulary = vocabulary
        # this will track the progress of the batches sequentially through the
        # data set - once the data reaches the end of the data set it will reset
        # back to zero
        self.current_idx = 0
        # skip_step is the number of words which will be skipped before the next
        # batch is skimmed from the data set
        self.skip_step = skip_step

    def generate(self):
        x = np.zeros((self.batch_size, self.num_steps))
        y = np.zeros((self.batch_size, self.num_steps, self.vocabulary))
        while True:
            for i in range(self.batch_size):
                if self.current_idx + self.num_steps >= len(self.data):
                    # reset the index back to the start of the data set
                    self.current_idx = 0
                x[i, :] = self.data[self.current_idx:self.current_idx + self.num_steps]
                temp_y = self.data[self.current_idx + 1:self.current_idx + self.num_steps + 1]
                # convert all of temp_y into a one hot representation
                y[i, :, :] = to_categorical(temp_y, num_classes=self.vocabulary)
                self.current_idx += self.skip_step
            yield x, y

####

num_steps = 30
batch_size = 20
train_data_generator = KerasBatchGenerator(train_data, num_steps, batch_size, vocabulary,
                                           skip_step=num_steps)
valid_data_generator = KerasBatchGenerator(valid_data, num_steps, batch_size, vocabulary,
                                           skip_step=num_steps)

####

hidden_size = 500
use_dropout=True
model = Sequential()
model.add(Embedding(vocabulary, hidden_size, input_length=num_steps))
model.add(LSTM(hidden_size, return_sequences=True))
model.add(LSTM(hidden_size, return_sequences=True))
if use_dropout:
    model.add(Dropout(0.5))
model.add(TimeDistributed(Dense(vocabulary)))
model.add(Activation('softmax'))

####

optimizer = Adam()
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['categorical_accuracy'])

print(model.summary())

####

checkpointer = ModelCheckpoint(filepath=data_path + '/model-{epoch:02d}.hdf5', verbose=1)
num_epochs = 1 #50
if run_opt == 1:
    model.fit_generator(train_data_generator.generate(), len(train_data)//(batch_size*num_steps), num_epochs,
                        validation_data=valid_data_generator.generate(),
                        validation_steps=len(valid_data)//(batch_size*num_steps), callbacks=[checkpointer])
    # model.fit_generator(train_data_generator.generate(), 2000, num_epochs,
    #                     validation_data=valid_data_generator.generate(),
    #                     validation_steps=10)
    model.save(data_path + "final_model.hdf5")
elif run_opt == 2:
    model = load_model(data_path + "/model-40.hdf5")
    dummy_iters = 40
    example_training_generator = KerasBatchGenerator(train_data, num_steps, 1, vocabulary,
                                                     skip_step=1)
    print("Training data:")
    for i in range(dummy_iters):
        dummy = next(example_training_generator.generate())
    num_predict = 10
    true_print_out = "Actual words: "
    pred_print_out = "Predicted words: "
    for i in range(num_predict):
        data = next(example_training_generator.generate())
        prediction = model.predict(data[0])
        predict_word = np.argmax(prediction[:, num_steps-1, :])
        true_print_out += reversed_dictionary[train_data[num_steps + dummy_iters + i]] + " "
        pred_print_out += reversed_dictionary[predict_word] + " "
    print(true_print_out)
    print(pred_print_out)
    # test data set
    dummy_iters = 40
    example_test_generator = KerasBatchGenerator(test_data, num_steps, 1, vocabulary,
                                                     skip_step=1)
    print("Test data:")
    for i in range(dummy_iters):
        dummy = next(example_test_generator.generate())
    num_predict = 10
    true_print_out = "Actual words: "
    pred_print_out = "Predicted words: "
    for i in range(num_predict):
        data = next(example_test_generator.generate())
        prediction = model.predict(data[0])
        predict_word = np.argmax(prediction[:, num_steps - 1, :])
        true_print_out += reversed_dictionary[test_data[num_steps + dummy_iters + i]] + " "
        pred_print_out += reversed_dictionary[predict_word] + " "
    print(true_print_out)
    print(pred_print_out)

How are #iterations, step_size and the number of epochs related?

Hello,

Thank you for this wonderful tutorial !!
From the TensorFlow word2vec tutorial and the code for tf_word2vec.py on GitHub, I am not able to understand how "#iterations", "step_size" and "#epochs" are related.

Is #iterations = #epochs? If so, what is the relation between step_size and batch size?

Please, let me know.

Thanks.
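For what it's worth, the usual relationship (a rough sketch with made-up numbers, assuming "step_size" here means the number of samples consumed per training step, i.e. the batch size): an iteration is one batch update, an epoch is one full pass over the training data, so iterations per epoch = number of training samples / batch size.

# illustrative numbers only, not taken from tf_word2vec.py
num_samples = 17000000   # e.g. training pairs generated from the corpus
batch_size = 128         # samples consumed per training step (iteration)
num_epochs = 5

iterations_per_epoch = num_samples // batch_size   # one epoch = one full pass
total_iterations = num_epochs * iterations_per_epoch

print(iterations_per_epoch, total_iterations)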

keras_lstm.py with fix of not working gfile approach

# start from the command line: python keras_lstm.py [-h] [--data_path DATA_PATH] run_opt
# 'An integer: 1 to train, 2 to test'
# i.e.: python keras_lstm.py 1
# or: python keras_lstm.py 2

from __future__ import print_function
import collections
import os
import tensorflow as tf
from keras.models import Sequential, load_model
from keras.layers import Dense, Activation, Embedding, Dropout, TimeDistributed
from keras.layers import LSTM
from keras.optimizers import Adam
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint
import numpy as np
import argparse
import pandas as pd

"""To run this code, you'll need to first download and extract the text dataset
    from here: http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz. Change the
    data_path variable below to your local exraction path"""

# data_path = "C:\\Users\Andy\Documents\simple-examples\data"

parser = argparse.ArgumentParser()
parser.add_argument('run_opt', type=int, default=1, help='An integer: 1 to train, 2 to test')
parser.add_argument('--data_path', type=str, default=data_path, help='The full path of the training data')
args = parser.parse_args()
if args.data_path:
    data_path = args.data_path

def read_words(filename):
    # Changed from tf.gfile to tf.io.gfile
    # https://github.com/tensorflow/tensorflow/issues/31315
    # with tf.io.gfile.GFile(filename, "r") as f:
    # https://github.com/tensorflow/tensorflow/issues/33563
    #     return f.read().decode("utf-8").replace("\n", "<eos>").split()
    data = pd.read_csv(filename, encoding='utf-8', header = None)
    data = ''.join([str(x) for x in data[0]]).split()
    return data

def build_vocab(filename):
    data = read_words(filename)

    counter = collections.Counter(data)
    count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))

    words, _ = list(zip(*count_pairs))
    word_to_id = dict(zip(words, range(len(words))))

    return word_to_id


def file_to_word_ids(filename, word_to_id):
    data = read_words(filename)
    return [word_to_id[word] for word in data if word in word_to_id]


def load_data():
    # get the data paths
    train_path = os.path.join(data_path, "ptb.train.txt")
    valid_path = os.path.join(data_path, "ptb.valid.txt")
    test_path = os.path.join(data_path, "ptb.test.txt")

    # build the complete vocabulary, then convert text data to list of integers
    word_to_id = build_vocab(train_path)
    train_data = file_to_word_ids(train_path, word_to_id)
    valid_data = file_to_word_ids(valid_path, word_to_id)
    test_data = file_to_word_ids(test_path, word_to_id)
    vocabulary = len(word_to_id)
    reversed_dictionary = dict(zip(word_to_id.values(), word_to_id.keys()))

    print(train_data[:5])
    print(word_to_id)
    print(vocabulary)
    print(" ".join([reversed_dictionary[x] for x in train_data[:10]]))
    return train_data, valid_data, test_data, vocabulary, reversed_dictionary

train_data, valid_data, test_data, vocabulary, reversed_dictionary = load_data()


class KerasBatchGenerator(object):

    def __init__(self, data, num_steps, batch_size, vocabulary, skip_step=5):
        self.data = data
        self.num_steps = num_steps
        self.batch_size = batch_size
        self.vocabulary = vocabulary
        # this will track the progress of the batches sequentially through the
        # data set - once the data reaches the end of the data set it will reset
        # back to zero
        self.current_idx = 0
        # skip_step is the number of words which will be skipped before the next
        # batch is skimmed from the data set
        self.skip_step = skip_step

    def generate(self):
        x = np.zeros((self.batch_size, self.num_steps))
        y = np.zeros((self.batch_size, self.num_steps, self.vocabulary))
        while True:
            for i in range(self.batch_size):
                if self.current_idx + self.num_steps >= len(self.data):
                    # reset the index back to the start of the data set
                    self.current_idx = 0
                x[i, :] = self.data[self.current_idx:self.current_idx + self.num_steps]
                temp_y = self.data[self.current_idx + 1:self.current_idx + self.num_steps + 1]
                # convert all of temp_y into a one hot representation
                y[i, :, :] = to_categorical(temp_y, num_classes=self.vocabulary)
                self.current_idx += self.skip_step
            yield x, y

num_steps = 30
batch_size = 20
train_data_generator = KerasBatchGenerator(train_data, num_steps, batch_size, vocabulary,
                                           skip_step=num_steps)
valid_data_generator = KerasBatchGenerator(valid_data, num_steps, batch_size, vocabulary,
                                           skip_step=num_steps)

hidden_size = 500
use_dropout=True
model = Sequential()
model.add(Embedding(vocabulary, hidden_size, input_length=num_steps))
model.add(LSTM(hidden_size, return_sequences=True))
model.add(LSTM(hidden_size, return_sequences=True))
if use_dropout:
    model.add(Dropout(0.5))
model.add(TimeDistributed(Dense(vocabulary)))
model.add(Activation('softmax'))

optimizer = Adam()
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['categorical_accuracy'])

print(model.summary())
checkpointer = ModelCheckpoint(filepath=data_path + '/model-{epoch:02d}.hdf5', verbose=1)
num_epochs = 50
if args.run_opt == 1:
    model.fit_generator(train_data_generator.generate(), len(train_data)//(batch_size*num_steps), num_epochs,
                        validation_data=valid_data_generator.generate(),
                        validation_steps=len(valid_data)//(batch_size*num_steps), callbacks=[checkpointer])
    # model.fit_generator(train_data_generator.generate(), 2000, num_epochs,
    #                     validation_data=valid_data_generator.generate(),
    #                     validation_steps=10)
    model.save(data_path + "/final_model.hdf5")
elif args.run_opt == 2:
    model = load_model(data_path + "/model-40.hdf5")
    dummy_iters = 40
    example_training_generator = KerasBatchGenerator(train_data, num_steps, 1, vocabulary,
                                                     skip_step=1)
    print("Training data:")
    for i in range(dummy_iters):
        dummy = next(example_training_generator.generate())
    num_predict = 10
    true_print_out = "Actual words: "
    pred_print_out = "Predicted words: "
    for i in range(num_predict):
        data = next(example_training_generator.generate())
        prediction = model.predict(data[0])
        predict_word = np.argmax(prediction[:, num_steps-1, :])
        true_print_out += reversed_dictionary[train_data[num_steps + dummy_iters + i]] + " "
        pred_print_out += reversed_dictionary[predict_word] + " "
    print(true_print_out)
    print(pred_print_out)
    # test data set
    dummy_iters = 40
    example_test_generator = KerasBatchGenerator(test_data, num_steps, 1, vocabulary,
                                                     skip_step=1)
    print("Test data:")
    for i in range(dummy_iters):
        dummy = next(example_test_generator.generate())
    num_predict = 10
    true_print_out = "Actual words: "
    pred_print_out = "Predicted words: "
    for i in range(num_predict):
        data = next(example_test_generator.generate())
        prediction = model.predict(data[0])
        predict_word = np.argmax(prediction[:, num_steps - 1, :])
        true_print_out += reversed_dictionary[test_data[num_steps + dummy_iters + i]] + " "
        pred_print_out += reversed_dictionary[predict_word] + " "
    print(true_print_out)
    print(pred_print_out)
