
rnn-tutorial-gru-lstm's People

Contributors

dennybritz

rnn-tutorial-gru-lstm's Issues

about s_t1

I think s_t1 should instead be computed as follows:
s_t1 = (T.ones_like(z_t1) - z_t1) * s_t1_prev + z_t1 * c_t1
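
For context, both update conventions appear in the literature; they differ only in whether z_t1 gates the previous state or the candidate, and the learned parameters adapt either way. A minimal NumPy sketch (toy dimensions, plain sigmoid instead of hard_sigmoid) contrasting the tutorial's line with the one proposed above:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One GRU step with toy sizes: hidden size H, embedding size E.
H, E = 4, 3
rng = np.random.RandomState(0)
U = rng.randn(3, H, E) * 0.1   # input weights for the z, r, c gates
W = rng.randn(3, H, H) * 0.1   # recurrent weights for the z, r, c gates
b = np.zeros((3, H))
x_e = rng.randn(E)
s_prev = rng.randn(H)

z = sigmoid(U[0].dot(x_e) + W[0].dot(s_prev) + b[0])
r = sigmoid(U[1].dot(x_e) + W[1].dot(s_prev) + b[1])
c = np.tanh(U[2].dot(x_e) + W[2].dot(s_prev * r) + b[2])

s_tutorial = (1.0 - z) * c + z * s_prev   # convention used in gru_theano.py
s_proposed = (1.0 - z) * s_prev + z * c   # convention suggested in this issue

# The two are equivalent up to relabeling z as (1 - z); neither is wrong.
print(s_tutorial, s_proposed)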

functools32 error

The file "requirements.txt" must be changed in the line 6

Now: functools32==3.2.3.post2
Later: functools32==3.2.3-2

Readme Error

The following line must be corrected:

source venv/bin/active

should read:

source venv/bin/activate

Update your RNN tutorial part 4

Hi Denny,

You have written an impressive tutorial about RNNs. I am wondering when you will publish part 4 of the RNN tutorial on your blog.

Best,
Siqin

Comment Scoring

Can this be used to score a comment as well, e.g., to get the probability of the comment under the language model?
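
A minimal sketch of one way to do this, assuming (as in the generate_sentence code quoted in another issue below) that model.predict(sentence) returns one softmax distribution over the vocabulary per input position; summing the log-probabilities of the observed next words gives a sentence score:

import numpy as np

def score_sentence(model, sentence_indices):
    # Return the log-probability of a tokenized sentence under the model.
    # Assumes model.predict(indices) yields one probability distribution over
    # the vocabulary per position, where position t predicts the word at t+1.
    probs = model.predict(sentence_indices)
    log_prob = 0.0
    for t in range(len(sentence_indices) - 1):
        next_word = sentence_indices[t + 1]
        log_prob += np.log(probs[t][next_word] + 1e-12)  # epsilon guards against log(0)
    return log_prob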

about batch size of the sgd algorithm

Firstly, thank you very much! Your blog has helped so many people learn about RNNs.
I have some questions about the batch-size parameter. I am new to deep learning, so please forgive me if my question looks stupid.
I have learned that when we use the SGD algorithm to optimize the loss function of a CNN, we always give SGD a batch size, but I never use a batch size of 1; I think one is too small.
When the batch size equals 1, the equation below does not seem right. (The screenshot is from the online book Neural Networks and Deep Learning, http://neuralnetworksanddeeplearning.com/chap1.html.)

[screenshot of the book's mini-batch gradient estimate, in which the true gradient is approximated by averaging the per-example gradients over a mini-batch of size m]

But I have read your blog and GitHub code, and I found that in both the RNN and the LSTM you use a batch size of 1. So my first question is: why do you use a batch size of 1?
I also found that your code does not support changing the batch size of the SGD algorithm. I am trying to modify your code to support a configurable batch size (see the sketch below). Or do you think it is necessary to modify it?
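
A minimal NumPy sketch of the averaging step in question, with a hypothetical grad_fn standing in for the model's gradient computation (this is not the tutorial's code, just an illustration of mini-batch vs. batch-size-1 SGD):

import numpy as np

def sgd_epoch(params, X, y, grad_fn, learning_rate=0.01, batch_size=1):
    # One epoch of (mini-batch) SGD over NumPy arrays X and y.
    # grad_fn(params, X_batch, y_batch) is assumed to return the gradient of
    # the loss averaged over the batch; with batch_size=1 this reduces to the
    # pure per-example SGD used in the tutorial.
    n = len(X)
    order = np.random.permutation(n)
    for start in range(0, n, batch_size):
        batch = order[start:start + batch_size]
        grad = grad_fn(params, X[batch], y[batch])
        params = params - learning_rate * grad
    return params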

ValueError: sum(pvals[:-1]) > 1.0

Hi, I am following the tutorial by Denny Britz here, and I ran into a problem with np.random.multinomial(1, next_word_probs).

Here is my related code:

def generate_sentence(model, index_to_word, word_to_index, min_length=5):
    # We start the sentence with the start token plus a fixed seed prefix
    new_sentence = [word_to_index[SENTENCE_START_TOKEN], word_to_index["white"], word_to_index["plane"]]
    # Repeat until we get an end token
    while not new_sentence[-1] == word_to_index[SENTENCE_END_TOKEN]:
        next_word_probs = model.predict(new_sentence)[-1]
        samples = np.random.multinomial(1, next_word_probs)
        sampled_word = np.argmax(samples)
        new_sentence.append(sampled_word)
        # Sometimes we get stuck if the sentence becomes too long, e.g. "........" :(
        # And: we don't want sentences with UNKNOWN_TOKEN's
        if len(new_sentence) > 100 or sampled_word == word_to_index[UNKNOWN_TOKEN]:
            return None
    if len(new_sentence) < min_length:
        return None
    return new_sentence

And here is the error message:

Traceback (most recent call last):

  File "<ipython-input-13-5282294d5250>", line 1, in <module>
    runfile('C:/Users/cerdas/Documents/bil/lat/rnn-tutorial-gru-lstm-master/train.py', wdir='C:/Users/cerdas/Documents/bil/lat/rnn-tutorial-gru-lstm-master')

  File "C:\Users\cerdas\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "C:\Users\cerdas\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/cerdas/Documents/bil/lat/rnn-tutorial-gru-lstm-master/train.py", line 53, in <module>
    generate_sentences(model, 10, index_to_word, word_to_index)

  File "C:\Users\cerdas\Documents\bil\lat\rnn-tutorial-gru-lstm-master\utils.py", line 189, in generate_sentences
    sent = generate_sentence(model, index_to_word, word_to_index)

  File "C:\Users\cerdas\Documents\bil\lat\rnn-tutorial-gru-lstm-master\utils.py", line 166, in generate_sentence
    samples = np.random.multinomial(1, next_word_probs)

  File "mtrand.pyx", line 4630, in mtrand.RandomState.multinomial

ValueError: sum(pvals[:-1]) > 1.0

I have searched for similar issues, and I suspect the problem is with np.random.multinomial.

I found a likely explanation in this answer:

The root of this problem arises from NumPy's implicit data casting: the output of my softmax() is of float32 type; however, numpy.random.multinomial() implicitly casts the pvals to float64. This data-type casting can sometimes cause pvals.sum() to exceed 1.0 due to numerical rounding.

But I still have no idea how to solve the problem.
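
A common workaround, offered here as a sketch rather than as the tutorial author's fix, is to cast the probabilities to float64 yourself and renormalize before sampling, e.g. inside generate_sentence:

import numpy as np

next_word_probs = model.predict(new_sentence)[-1]
# Cast to float64 and renormalize so the values still sum to 1.0 after the
# precision change; this avoids the "sum(pvals[:-1]) > 1.0" ValueError.
next_word_probs = np.asarray(next_word_probs, dtype=np.float64)
next_word_probs = next_word_probs / next_word_probs.sum()
samples = np.random.multinomial(1, next_word_probs)
sampled_word = np.argmax(samples)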

two GRU layers?

Hi,

I have read your GRU code, https://github.com/dennybritz/rnn-tutorial-gru-lstm/blob/master/gru_theano.py, and I see that two GRU layers are stacked:
# GRU Layer 1
z_t1 = T.nnet.hard_sigmoid(U[0].dot(x_e) + W[0].dot(s_t1_prev) + b[0])
r_t1 = T.nnet.hard_sigmoid(U[1].dot(x_e) + W[1].dot(s_t1_prev) + b[1])
c_t1 = T.tanh(U[2].dot(x_e) + W[2].dot(s_t1_prev * r_t1) + b[2])
s_t1 = (T.ones_like(z_t1) - z_t1) * c_t1 + z_t1 * s_t1_prev

# GRU Layer 2
z_t2 = T.nnet.hard_sigmoid(U[3].dot(s_t1) + W[3].dot(s_t2_prev) + b[3])
r_t2 = T.nnet.hard_sigmoid(U[4].dot(s_t1) + W[4].dot(s_t2_prev) + b[4])
c_t2 = T.tanh(U[5].dot(s_t1) + W[5].dot(s_t2_prev * r_t2) + b[5])
s_t2 = (T.ones_like(z_t2) - z_t2) * c_t2 + z_t2 * s_t2_prev

# Final output calculation
# Theano's softmax returns a matrix with one row, we only need the row
o_t = T.nnet.softmax(V.dot(s_t2) + c)[0]

Can I set 'o_t = T.nnet.softmax(V.dot(s_t1) + c)[0]'?
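
For what it's worth, a minimal sketch of that single-layer variant (assuming, as in gru_theano.py, that both layers share the same hidden dimension and V maps the hidden state to the vocabulary, so V.dot(s_t1) is shape-compatible):

# Inside forward_prop_step: keep only GRU Layer 1 and read the output off s_t1.
z_t1 = T.nnet.hard_sigmoid(U[0].dot(x_e) + W[0].dot(s_t1_prev) + b[0])
r_t1 = T.nnet.hard_sigmoid(U[1].dot(x_e) + W[1].dot(s_t1_prev) + b[1])
c_t1 = T.tanh(U[2].dot(x_e) + W[2].dot(s_t1_prev * r_t1) + b[2])
s_t1 = (T.ones_like(z_t1) - z_t1) * c_t1 + z_t1 * s_t1_prev

# The second layer is then simply unused (or can be removed along with
# U[3:6], W[3:6], and b[3:6]).
o_t = T.nnet.softmax(V.dot(s_t1) + c)[0]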

encoding error?

Hi dennybritz,

I tried to run "train.py" and couldn't pass the data file "reddit-comments-2015.csv" reading part in load_data. My python environment is WinPython-64bit-3.4.4.4Qt5.

At first, the error said that str doesn't have the decode attribute. If I removed the decode part, I got the message similar to the following line:
"UnicodeEncodeError: 'gbk' codec can't encode character '\udca0' in position 356: illegal multibyte sequence".

I could open the csv file in Notepad++ and see its encoding as 'utf-8'.

What did I do wrong? Is it because python 2/3 code incompatible with each other? How can I fix the problem?

Thanks.
chenmaosi
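
A minimal sketch of one possible fix for the Python 3 case described above, assuming the loading code opens the CSV in the default (locale) encoding and then calls .decode('utf-8'): open the file with an explicit UTF-8 encoding instead and drop the decode call (the path and column index here are illustrative):

import csv

# Python 3: read the CSV as UTF-8 text directly; errors='replace' is a
# pragmatic way to survive any stray undecodable bytes instead of raising.
with open("data/reddit-comments-2015.csv", "r", encoding="utf-8", errors="replace") as f:
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)  # skip the header row
    comments = [row[0] for row in reader]  # assumes the comment text is in the first column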
