How to make prediction for new cases in lab3_RNN ? about tensorflow-tutorial HOT 7 CLOSED

alrojo commented on September 26, 2024

How can I make predictions for new cases in lab3_RNN?

from tensorflow-tutorial.

Comments (7)

alrojo commented on September 26, 2024

Hmm, you would need to know the ID of the tag.

s4sarath commented on September 26, 2024

@alrojo - I tried to implement this. If we have new data, we get enc_state from dynamic_rnn. Then we take the EOS (#) token from the target_embedding matrix, calculate the new state, multiply it by W_out, add b_out, and make a prediction using argmax. The embedding vector at the argmax position is then used to calculate the next state, and so on, until we reach max_length (if provided) or the EOS token. But could you add an implementation to the IPython notebook for prediction on unseen data?
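The loop described in this comment can be sketched in NumPy, outside of TensorFlow. This is only an illustrative sketch, not the tutorial's implementation: the function names `gru_step` and `greedy_decode`, the parameter dictionary `p`, and all sizes are assumptions made for the example, and the weights are random rather than trained. The GRU equations follow the ones in the decoder code posted later in this thread.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x_t, h, p):
    """One GRU step for a single example; p is a dict of weight arrays."""
    z = sigmoid(x_t @ p["W_z_x"] + h @ p["W_z_h"] + p["b_z"])  # update gate
    r = sigmoid(x_t @ p["W_r_x"] + h @ p["W_r_h"] + p["b_r"])  # reset gate
    c = np.tanh(x_t @ p["W_c_x"] + (r * h) @ p["W_c_h"] + p["b_c"])  # candidate
    return (1 - z) * c + z * h


def greedy_decode(enc_state, embeddings, p, W_out, b_out, eos_id, max_length):
    """Feed the previous argmax prediction back in until EOS or max_length."""
    state = enc_state
    x_t = embeddings[eos_id]          # the EOS embedding starts the decoder
    tokens = []
    for _ in range(max_length):
        state = gru_step(x_t, state, p)
        logits = state @ W_out + b_out
        token = int(np.argmax(logits))
        tokens.append(token)
        if token == eos_id:
            break
        x_t = embeddings[token]       # embedding of the predicted token
    return tokens


# Toy usage with random, untrained weights (all sizes are made up).
rng = np.random.default_rng(0)
vocab, emb_dim, units = 6, 4, 5
p = {}
for gate in "zrc":
    p[f"W_{gate}_x"] = rng.normal(size=(emb_dim, units))
    p[f"W_{gate}_h"] = rng.normal(size=(units, units))
    p[f"b_{gate}"] = np.zeros(units)
embeddings = rng.normal(size=(vocab, emb_dim))
W_out = rng.normal(size=(units, vocab))
b_out = np.zeros(vocab)
decoded = greedy_decode(rng.normal(size=units), embeddings, p, W_out, b_out,
                        eos_id=0, max_length=10)
print(decoded)
```

With trained weights, `enc_state` would be the encoder's final state for the new input, and `decoded` would hold the predicted token IDs up to and including the EOS token.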

alrojo commented on September 26, 2024

Hi @s4sarath, we are working on something that would accomplish this.
You can check out this PR for more information.
Ideas and contributions are also very welcome.

s4sarath commented on September 26, 2024

@alrojo - Yeah, actually I did this by splitting everything in tf_utils.decoder into separate training and validation (testing) functions. But I feel it is amateurish: I am not a hardcore programmer, so I don't follow proper Python style or PEP 8. Anyway, I will try to make changes or give suggestions based on your PR at tensorflow. Thanks for your contributions, by the way; I use your code for reverse engineering, to learn the algorithm better.

alrojo commented on September 26, 2024

I'm glad you can make use of it. I did the exact same thing last year when I started learning deep learning and Python from this developer: github.com/bennane. The only PEP 8 requirements I really think about are: use 4 spaces instead of tabs, and don't write more than 79 characters on one line.

Yes, feel free to bring questions/development ideas to the PR or here.
The architecture/code is still very much under development.

s4sarath commented on September 26, 2024

@alrojo - Hi. I was trying to modify your attention decoder to create a new kind of attention. I have a 2D matrix and a 3D tensor, let's say 2 x 10 and 2 x 10 x 5. What I need is for the first row of the 2D matrix to be matrix-multiplied with the first batch item of the 3D tensor, then the second row of the 2D matrix with the second batch item, resulting in a 2 x 5 matrix. So I decided to use a while loop inside the loop of your attention decoder. But I am getting an error: InvalidArgumentError: TensorArray TensorArray: Could not read from TensorArray index 0 because it has not yet been written to.
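The row-by-batch product described here is a batched matrix multiply, and it may not need an inner while loop at all. A minimal NumPy sketch, using the 2 x 10 and 2 x 10 x 5 shapes from the comment (the array names `a` and `b` are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 10))       # one row per batch item
b = rng.normal(size=(2, 10, 5))    # one 10 x 5 matrix per batch item

# Row i of `a` times matrix i of `b`, for all i at once.
out = np.einsum("bi,bij->bj", a, b)

# The same thing written as the explicit per-item loop described above.
looped = np.stack([a[i] @ b[i] for i in range(a.shape[0])])

print(out.shape)  # (2, 5)
```

In TensorFlow the same result can usually be obtained by expanding the 2D tensor to shape [2, 1, 10], applying a batched matrix multiply (tf.matmul in recent versions, tf.batch_matmul in older ones), and squeezing the middle axis; which call is available depends on the installed version.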

If I am not supposed to post it here, I am sorry, and I will take back the code and comment.

import tensorflow as tf
from tensorflow.python.ops import tensor_array_ops
from tensorflow.python.framework import ops
from tensorflow.python.ops import nn_ops
from tensorflow.python.ops import math_ops


###
# a custom masking function, takes sequence lengths and makes masks
def mask(sequence_lengths):
    # based on this SO answer: http://stackoverflow.com/a/34138336/118173
    batch_size = tf.shape(sequence_lengths)[0]
    max_len = tf.reduce_max(sequence_lengths)

    lengths_transposed = tf.expand_dims(sequence_lengths, 1)

    rng = tf.range(max_len)
    rng_row = tf.expand_dims(rng, 0)

    return tf.less(rng_row, lengths_transposed)

###
# decoder with attention

def attention_decodercustom_(attention_input, attention_lengths, initial_state, target_input,
                      target_input_lengths, num_units, num_attn_units, embeddings, W_out, b_out,
                      batch_len = 3, name='decoder', swap=False):
    """Decoder with attention.
    Note that the number of units in the attention decoder must always
    be equal to the size of the initial state/attention input.
    Keyword arguments:
        attention_input:    the input to put attention on. expected dims: [batch_size, attention_length, attention_dims]
        initial_state:      The initial state for the decoder RNN.
        target_input:       The target to replicate. Expected: [batch_size, max_target_sequence_len, embedding_dims]
        num_attn_units:     Number of units in the alignment layer that produces the context vectors.
    """
    with tf.variable_scope(name):
        target_dims = target_input.get_shape()[2]
        attention_dims = attention_input.get_shape()[2]
        input_max_len =  attention_input.get_shape()[1]
        attn_len = tf.shape(attention_input)[1]
        max_sequence_length = tf.reduce_max(target_input_lengths)
        num_units = attention_dims
        weight_initializer = tf.truncated_normal_initializer(stddev=0.1)
        attention_input_mod = tf.transpose(attention_input , [0,2,1])
        # map initial state to num_units
        var = tf.get_variable # for ease of use
        # target_dims + num_units is because we stack embeddings and prev. hidden state to
        # optimize speed
        W_z_x = var('W_z_x', shape=[target_dims, num_units], initializer=weight_initializer)
        W_z_h = var('W_z_h', shape=[num_units, num_units], initializer=weight_initializer)
        b_z = var('b_z', shape=[num_units], initializer=weight_initializer)
        W_r_x = var('W_r_x', shape=[target_dims, num_units], initializer=weight_initializer)
        W_r_h = var('W_r_h', shape=[num_units, num_units], initializer=weight_initializer)
        b_r = var('b_r', shape=[num_units], initializer=weight_initializer)
        W_c_x = var('W_c_x', shape=[target_dims, num_units], initializer=weight_initializer)
        W_c_h = var('W_c_h', shape=[num_units, num_units], initializer = weight_initializer)
        b_c = var('b_c', shape=[num_units], initializer=weight_initializer)
        middle_matrix = var('middle', shape=[num_units, num_units], initializer = weight_initializer)
        # project initial state

        # TODO: don't use convolutions!
        # TODO: fix the bias (b_a)


        # make inputs time-major
        inputs = tf.transpose(target_input, perm=[1, 0, 2])
        inputs_temp = inputs
        # make tensor array for inputs, these are dynamic and used in the while-loop
        # these are not in the api documentation yet, you will have to look at github.com/tensorflow
        input_ta_temp = tensor_array_ops.TensorArray(tf.float32, size=1, dynamic_size=True , name='input_ta_temp')
        input_ta_temp = input_ta_temp.unpack(inputs_temp)
        time = tf.constant(0)

        # calculate the GRU
        x_t = input_ta_temp.read(time)
        z = tf.sigmoid(tf.matmul(x_t, W_z_x) + tf.matmul(initial_state, W_z_h) + b_z) # update gate
        r = tf.sigmoid(tf.matmul(x_t, W_r_x) + tf.matmul(initial_state, W_r_h) + b_r) # reset gate
        c = tf.tanh(tf.matmul(x_t, W_c_x) + tf.matmul(r*initial_state, W_c_h) + b_c) # proposed new state
        new_state = (1-z)*c + z*initial_state # new state
        initial_state = new_state
        input_ta = tensor_array_ops.TensorArray(tf.float32, size=1, dynamic_size=True, name = 'input_ta')
        input_ta = input_ta.unpack(inputs)



        def decoder_cond(time, state, output_ta_t, attention_tracker):
            return tf.less(time, max_sequence_length)


        def decoder_body_builder(feedback=False):
            def decoder_body(time, old_state, output_ta_t, attention_tracker):
                if feedback:
                    def from_previous():
                        prev_1 = tf.matmul(old_state, W_out) + b_out
                        return tf.gather(embeddings, tf.argmax(prev_1, 1))
                    x_t = tf.cond(tf.greater(time, 0), from_previous, lambda: input_ta.read(0))
                else:
                    x_t = input_ta.read(time)

                # calculate the GRU

                def sub_decoder_cond(sub_time, temp_holder_):
                    return tf.less(sub_time, 3)

                def sub_decoder_body_builder():

                    def sub_decoder_body(sub_time ,temp_holder_t):
                        sub_x_t = tf.reshape(sub_initial_.read(sub_time) , [1,-1])
                        sub_i_t = sub_input.read(sub_time)
                        sub_res = tf.matmul(sub_x_t, sub_i_t)
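                        # NOTE: TensorArray.write() returns a new TensorArray; the result is
                        # discarded here, which is a likely cause of the "has not yet been
                        # written to" error. sub_res is also computed but never written.
                        temp_holder_t.write(sub_time, sub_x_t)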

                        return(sub_time+1, temp_holder_t )
                    return sub_decoder_body




                we_project = tf.tanh(tf.matmul( initial_state , middle_matrix ))
                sub_initial_ = tensor_array_ops.TensorArray(tf.float32, size=1, dynamic_size=True ,tensor_array_name='sub_initial')
                sub_input = tensor_array_ops.TensorArray(tf.float32, size=1, dynamic_size=True, tensor_array_name = 'sub_input')
                temp_holder = tensor_array_ops.TensorArray(tf.float32, size=1, dynamic_size=True , tensor_array_name = 'temp_holder'  )

                sub_initial_ = sub_initial_.unpack(we_project)
                sub_input = sub_input.unpack(attention_input_mod)
                sub_time = tf.constant(0)

                sub_loop_vars = [sub_time, temp_holder]

                _, temp_holder = tf.while_loop(sub_decoder_cond,
                                               sub_decoder_body_builder(),
                                               sub_loop_vars,
                                               swap_memory=swap)




                alpha_time = temp_holder.pack()
                # temp_holder.close()
                alpha = tf.to_float(mask(attention_lengths)) * alpha_time
                alpha_softmax = alpha
                # alpha_softmax = tf.nn.softmax(alpha)
                z = tf.sigmoid(tf.matmul(x_t, W_z_x) + tf.matmul(old_state, W_z_h) + b_z) # update gate
                r = tf.sigmoid(tf.matmul(x_t, W_r_x) + tf.matmul(old_state, W_r_h) + b_r) # reset gate
                c = tf.tanh(tf.matmul(x_t, W_c_x) + tf.matmul(r*old_state, W_c_h) + b_c) # proposed new state
                new_state = (1-z)*c + z*old_state # new state

                # writing output
                output_ta_t = output_ta_t.write(time+1, new_state)
                attention_tracker = attention_tracker.write(time, alpha_softmax)
                # context = tf.reduce_sum(tf.expand_dims(alpha_softmax, 2) * attention_input, [1])


                return (time + 1, new_state, output_ta_t, attention_tracker)
            return decoder_body


        output_ta = tensor_array_ops.TensorArray(tf.float32, size=1, dynamic_size=True, infer_shape=False)
        attention_tracker = tensor_array_ops.TensorArray(tf.float32, size=1, dynamic_size=True, infer_shape=False)
        time = tf.constant(0)
        loop_vars = [time, initial_state, output_ta, attention_tracker]

        _, state, output_ta, valid_attention_tracker = tf.while_loop(decoder_cond,
                                               decoder_body_builder(),
                                               loop_vars,
                                               swap_memory=swap)

        # _, valid_state, valid_output_ta, valid_attention_tracker = tf.while_loop(decoder_cond,
        #                                                 decoder_body_builder(feedback=True),
        #                                                 loop_vars,
        #                                                 swap_memory=swap)

        dec_out = tf.transpose(output_ta.pack(), perm=[1, 0, 2])
        # valid_dec_out = tf.transpose(valid_output_ta.pack(), perm=[1, 0, 2])
        valid_attention_tracker = tf.transpose(valid_attention_tracker.pack(), perm=[1, 0, 2])

        # return dec_out, valid_dec_out, valid_attention_tracker

        return dec_out,  valid_attention_tracker
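The `mask` helper at the top of the code above builds a boolean mask from sequence lengths by comparing a range of positions against each length. A small NumPy sketch of the same idea, useful for checking the expected shape and values by hand (the name `sequence_mask` and the example lengths are assumptions for illustration):

```python
import numpy as np


def sequence_mask(sequence_lengths):
    """Boolean mask of shape [batch_size, max_len]; True where position < length."""
    lengths = np.asarray(sequence_lengths)
    positions = np.arange(lengths.max())          # [max_len]
    return positions[None, :] < lengths[:, None]  # broadcast to [batch, max_len]


m = sequence_mask([1, 3, 2])
print(m.astype(int))
```

Multiplying attention scores by this mask, as the decoder does with `tf.to_float(mask(attention_lengths))`, zeroes out positions past the end of each sequence, so the scores must have shape [batch_size, max_len] for the product to line up.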

alrojo commented on September 26, 2024

Hi, TensorFlow supports this type of behaviour now in the seq2seq section. However, to keep the tutorial a learning experience we will keep the old way of doing it, as it gives the learner an intuition about how to build custom encoders and decoders from scratch in TensorFlow.
