
deeplearningzerotoall's People

Contributors

antil1, astriker, bkryusim, bluemelon715, chg0901, crlotwhite, cynthia, davinnovation, ep1804, forybm, fuzer, godpeny, gzupark, healess, hunkim, jayjun911, jeff-hou, jennykang, jihobak, jin-chong, jukyellow, kkweon, malgus1995, qoocrab, sihyeon-kim, skyer9, surfertas, sxjscience, togheppi, wizardbc


deeplearningzerotoall's Issues

Refactor datasets out to a singular source?

Follow-up from #17

The datasets are currently (aside from the CSV imports) defined inline, which can get a bit messy when you have to keep the data consistent across multiple (in this case) implementations of the same thing.

I've been wondering if it would make sense to refactor all of the data into a single place and have the individual lesson scripts import it. I've also been wondering about providing a local scikit-learn-style bundle, so the audience can easily switch to datasets readily available in scikit-learn if they want to experiment with other data, and reuse the same data while playing around with other frameworks such as scikit-learn.
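
A hedged sketch of what such a shared data module could look like (datasets.py and the loader names are hypothetical, not existing files in the repo):

# datasets.py -- hypothetical shared data module
import numpy as np

def load_xor():
    x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
    y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)
    return x_data, y_data

def load_linear_toy():
    x_data = np.array([[1.], [2.], [3.]], dtype=np.float32)
    y_data = np.array([[1.], [2.], [3.]], dtype=np.float32)
    return x_data, y_data

# in a lesson script:
# from datasets import load_xor
# x_data, y_data = load_xor()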

reduce_sum() function does not recognize 2nd argument 'axis'

There seems to be an issue with the reduce_sum() function in lab-06-1-softmax_classifier.py.

When the second argument is "axis=1", a TypeError is raised: "unexpected keyword argument".
From my experience with Python this shouldn't be an error, but I am not sure whether it is a version issue (I believe I have the most recent version) or something else.

[Screenshot: TypeError traceback]

In my case, it works when I fix it from
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
to
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), 1))

Also, a reference to the TensorFlow documentation:
https://www.tensorflow.org/api_docs/python/tf/reduce_sum
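
A hedged note on the likely cause: in the TensorFlow 1.x API this argument is named axis, while older releases (before the 1.0 API change) expected reduction_indices, so an "unexpected keyword argument" TypeError usually means an older TensorFlow is installed. The positional form works either way because the second positional argument is the reduction axis in both APIs.

import tensorflow as tf

print(tf.__version__)  # check which API version is actually installed

# TF 1.x (and the lab code):
# cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
# Older releases:
# cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), reduction_indices=1))
# Works in both (positional axis):
# cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), 1))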

I'm the person who previously posted that TensorFlow wasn't recognized in Atom

I wrote that TensorFlow works in the Anaconda Prompt but not in Atom.
I was told the reason it works in the Anaconda Prompt but not in Atom is that I have multiple Python installations.
Thinking it over, I realized I had installed Python from the official website, removed it through the Control Panel, and then installed Anaconda.
I suspect that's the cause. How can I completely remove the Python installed from the official website?

Lab 09-5 Sigmoid used instead of Softmax

Problem

In lab-09-5-softmax_back_prop.py,

Although softmax is a generalization of the logistic function, the two have different formulas and different derivatives. The current implementation is technically performing multiple independent logistic regressions instead of multi-class classification.

This means the sum of each output row will not equal 1 when sigmoid is used instead of softmax.

[Screenshot: example output rows that do not sum to 1]

The sigmoid implementation in the file

def sigma(x):
    #  sigmoid function
    return tf.div(tf.constant(1.0),
                  tf.add(tf.constant(1.0), tf.exp(-x)))


def sigma_prime(x):
    # derivative of the sigmoid function
    return sigma(x) * (1 - sigma(x))

Suggestion

There are two options

  1. modify the filename to sigmoid backprop
  2. actually implement correct softmax

The first suggestion is preferred since the derivative of softmax can be difficult for beginners, yet it should be clear that tf.nn.softmax is not sigmoid.

The other files with softmax in their filenames all use tf.nn.softmax, since they rely on TensorFlow.
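
A hedged NumPy-only illustration of the difference (not the lab code): softmax normalizes each row so it sums to 1, element-wise sigmoid does not.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x, axis=1, keepdims=True))   # shift for numerical stability
    return e / np.sum(e, axis=1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 0.5]])

print(sigmoid(logits).sum(axis=1))   # rows do NOT sum to 1
print(softmax(logits).sum(axis=1))   # rows sum to 1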

lab-12-5-rnn raises a ValueError when building dynamic_rnn

lab-12-5-rnn_stock_prediction.py
I tried to run this code, but it raises an error.
My TF version is 1.1.0.

outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)

===============================================================
ValueError Traceback (most recent call last)
in ()
2 cell = tf.contrib.rnn.BasicLSTMCell(
3 num_units=hidden_dim, state_is_tuple=True, activation=tf.tanh)
----> 4 outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)

C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\rnn.py in dynamic_rnn(cell, inputs, sequence_length, initial_state, dtype, parallel_iterations, swap_memory, time_major, scope)
551 swap_memory=swap_memory,
552 sequence_length=sequence_length,
--> 553 dtype=dtype)
554
555 # Outputs of _dynamic_rnn_loop are always shaped [time, batch, depth].

C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\rnn.py in _dynamic_rnn_loop(cell, inputs, initial_state, parallel_iterations, swap_memory, sequence_length, dtype)
718 loop_vars=(time, output_ta, state),
719 parallel_iterations=parallel_iterations,
--> 720 swap_memory=swap_memory)
721
722 # Unpack final output if not using output tuples.

C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\control_flow_ops.py in while_loop(cond, body, loop_vars, shape_invariants, parallel_iterations, back_prop, swap_memory, name)
2621 context = WhileContext(parallel_iterations, back_prop, swap_memory, name)
2622 ops.add_to_collection(ops.GraphKeys.WHILE_CONTEXT, context)
-> 2623 result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
2624 return result
2625

C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\control_flow_ops.py in BuildLoop(self, pred, body, loop_vars, shape_invariants)
2454 self.Enter()
2455 original_body_result, exit_vars = self._BuildLoop(
-> 2456 pred, body, original_loop_vars, loop_vars, shape_invariants)
2457 finally:
2458 self.Exit()

C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\control_flow_ops.py in _BuildLoop(self, pred, body, original_loop_vars, loop_vars, shape_invariants)
2404 structure=original_loop_vars,
2405 flat_sequence=vars_for_body_with_tensor_arrays)
-> 2406 body_result = body(*packed_vars_for_body)
2407 if not nest.is_sequence(body_result):
2408 body_result = [body_result]

C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\rnn.py in _time_step(time, output_ta_t, state)
703 skip_conditionals=True)
704 else:
--> 705 (output, new_state) = call_cell()
706
707 # Pack state if using state tuples

C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\rnn.py in ()
689
690 input_t = nest.pack_sequence_as(structure=inputs, flat_sequence=input_t)
--> 691 call_cell = lambda: cell(input_t, state)
692
693 if sequence_length is not None:

C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py in __call__(self, inputs, state, scope)
233 def __call__(self, inputs, state, scope=None):
234 """Long short-term memory cell (LSTM)."""
--> 235 with _checked_scope(self, scope or "basic_lstm_cell", reuse=self._reuse):
236 # Parameters of gates are concatenated into one multiply for efficiency.
237 if self._state_is_tuple:

C:\Program Files\Anaconda3\envs\tensorflow\lib\contextlib.py in __enter__(self)
57 def __enter__(self):
58 try:
---> 59 return next(self.gen)
60 except StopIteration:
61 raise RuntimeError("generator didn't yield") from None

C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py in _checked_scope(cell, scope, reuse, **kwargs)
91 "To share the weights of an RNNCell, simply "
92 "reuse it in your second calculation, or create a new one with "
---> 93 "the argument reuse=True." % (scope_name, type(cell).__name__))
94
95 # Everything is OK. Update the cell's scope and yield it.

ValueError: Attempt to have a second RNNCell use the weights of a variable scope that already has weights: 'rnn/basic_lstm_cell'; and the cell was not constructed as BasicLSTMCell(..., reuse=True). To share the weights of an RNNCell, simply reuse it in your second calculation, or create a new one with the argument reuse=True.
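
A hedged workaround, assuming the error comes from executing the cell-building code twice in the same interactive session (so the 'rnn/basic_lstm_cell' variable scope already holds weights): rebuild the graph from scratch before creating the cell again, or restart the kernel. The sizes below are illustrative, not the lab's values.

import tensorflow as tf

tf.reset_default_graph()   # clear any previously built graph

hidden_dim, seq_length, data_dim = 10, 7, 5
X = tf.placeholder(tf.float32, [None, seq_length, data_dim])

cell = tf.contrib.rnn.BasicLSTMCell(
    num_units=hidden_dim, state_is_tuple=True, activation=tf.tanh)
outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)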

lab04: Is it correct that different batch sizes generate different results?

Not only are your results different between lab-04-3 and lab-04-4 (your sample results also show this), but the result of lab-04-4 also differs depending on its batch size. I've seen the same behavior with NumPy instead of TensorFlow.

Is it natural to get different results depending on the batch size in TensorFlow? If so, can we say this will not cause any problems when using batches?

Need FC after RNN outputs

@imcomking reported this:

Since tf.contrib.seq2seq.sequence_loss requires logits, we need an FC layer, because the RNN outputs have already gone through an activation function.

Or we need to implement a simple sequence_loss which can take RNN outputs.

outputs, _states = tf.nn.dynamic_rnn(
    cell, X, initial_state=initial_state, dtype=tf.float32)

sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
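
A hedged sketch of the first option (a fully connected layer projecting the RNN outputs to logits before sequence_loss); all sizes are illustrative assumptions, not values from the lab:

import tensorflow as tf

batch_size, sequence_length, input_dim, hidden_size, num_classes = 1, 6, 5, 5, 5

X = tf.placeholder(tf.float32, [None, sequence_length, input_dim])
Y = tf.placeholder(tf.int32, [None, sequence_length])

cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)

# Flatten, apply FC with no activation to get raw logits, reshape back.
x_for_fc = tf.reshape(outputs, [-1, hidden_size])
logits = tf.contrib.layers.fully_connected(x_for_fc, num_classes, activation_fn=None)
logits = tf.reshape(logits, [batch_size, sequence_length, num_classes])

weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=logits, targets=Y, weights=weights)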

Lab 10. Back Propagation Implementation.

In lab-10-X1-mnist_back_prop.py,

Back propagation is defined as follows:

# Forward
l1 = tf.add(tf.matmul(X, w1), b1)
a1 = sigma(l1)
l2 = tf.add(tf.matmul(a1, w2), b2)
y_pred = sigma(l2)

diff = (y_pred - Y)

# Back prop (chain rule)
d_l2 = diff * sigma_prime(l2)
d_b2 = d_l2
d_w2 = tf.matmul(tf.transpose(a1), d_l2)
d_a1 = tf.matmul(d_l2, tf.transpose(w2))
d_l1 = d_a1 * sigma_prime(l1)
d_b1 = d_l1
d_w1 = tf.matmul(tf.transpose(X), d_l1)

Problem

This backpropagation is only correct when the loss function is the squared error

L = 1/2 * sum((y_pred - Y)^2)

Proof

Current forward step:

l1 = X * w1 + b1,  a1 = sigma(l1),  l2 = a1 * w2 + b2,  y_pred = sigma(l2)

If we assume the loss function above, then

dL/dl2 = (y_pred - Y) * sigma'(l2)
dL/dw2 = a1^T * dL/dl2

which is represented as

d_l2 = diff * sigma_prime(l2)
d_w2 = tf.matmul(tf.transpose(a1), d_l2)

We can continue for other variables

Conclusion

  1. The current loss is a variant of mean squared error, and this loss is usually used for regression problems
  2. This is a classification problem (MNIST)
  3. Interestingly, with the current hyperparameters, it works well
  4. However, if you change the hyperparameters, say increasing the batch size to something reasonable like 128 or above, it will fail to converge due to the wrong loss function (because it will try to match all 10 classes at once instead of focusing on the correct label)
  5. I suspect the reason it works is the MNIST dataset's characteristics: most of the data is 0, the pixels of the image background
  6. With the correct cross-entropy loss function, there is no issue with batch size or learning rate
  7. I suggest that the loss function should be clearly defined before deriving any backpropagation. This follows Andrej Karpathy's approach as well.
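
A hedged sketch of conclusion 6, i.e. the same manual backprop driven by a softmax + cross-entropy loss; the layer sizes and initialization below are illustrative assumptions, not the lab's exact values:

import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])   # one-hot labels
w1 = tf.Variable(tf.truncated_normal([784, 30]))
b1 = tf.Variable(tf.truncated_normal([1, 30]))
w2 = tf.Variable(tf.truncated_normal([30, 10]))
b2 = tf.Variable(tf.truncated_normal([1, 10]))

def sigma(x):
    return tf.sigmoid(x)

def sigma_prime(x):
    return sigma(x) * (1 - sigma(x))

# Forward
l1 = tf.add(tf.matmul(X, w1), b1)
a1 = sigma(l1)
l2 = tf.add(tf.matmul(a1, w2), b2)
y_pred = tf.nn.softmax(l2)

N = tf.cast(tf.shape(X)[0], tf.float32)

# Backprop for loss = -mean over samples of sum(Y * log(y_pred), axis=1):
# the softmax + cross-entropy gradient w.r.t. l2 is simply (y_pred - Y) / N,
# so no sigma_prime(l2) term appears here.
d_l2 = (y_pred - Y) / N
d_b2 = tf.reduce_sum(d_l2, axis=0, keep_dims=True)
d_w2 = tf.matmul(tf.transpose(a1), d_l2)
d_a1 = tf.matmul(d_l2, tf.transpose(w2))
d_l1 = d_a1 * sigma_prime(l1)
d_b1 = tf.reduce_sum(d_l1, axis=0, keep_dims=True)
d_w1 = tf.matmul(tf.transpose(X), d_l1)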

Gradient descent algorithm and learning rate

I think 'learning rate' should be explained with 'Gradient descent algorithm'.

import numpy as np
from keras import optimizers
from keras.layers import Dense
from keras.models import Sequential


x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

model = Sequential()
model.add(Dense(1, input_dim=1))

sgd = optimizers.SGD(lr=0.1)
model.compile(loss='mse', optimizer=sgd)

# prints summary of the model to the terminal
model.summary()

model.fit(x_train, y_train, epochs=100)

y_predict = model.predict(np.array([5]))
print(y_predict)

This code trains for only 100 epochs and is more precise.

Our manual backprop weight average is missing

For example,

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-x-xor-nn-back_prop.py

d_W1 = tf.matmul(tf.transpose(X), d_l1)

X's shape is (?, 2), X^T's shape is (2, ?), and d_l1's shape is (?, 2). The shape of d_W1 should be (2, 2), but the values of d_W1 are proportional to the sample size. We need to average these values.

The current sample size is only 4, so it's OK, but when the sample size is large, it does not work.

To reproduce add this code:

# double the data twelve times (4 -> 16384 samples)
for _ in range(12):
    x_data = np.vstack([x_data, x_data])
    y_data = np.vstack([y_data, y_data])

FYI, d_b values are averaged:
tf.reduce_mean(d_b1, axis=[0])
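
A hedged sketch of the corresponding fix for the weight gradient (X and d_l1 below are stand-in placeholders for the lab's tensors):

import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 2])
d_l1 = tf.placeholder(tf.float32, [None, 2])

N = tf.cast(tf.shape(X)[0], tf.float32)        # dynamic sample count
d_W1 = tf.matmul(tf.transpose(X), d_l1) / N    # shape (2, 2), independent of sample size
d_b1 = tf.reduce_mean(d_l1, axis=[0])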

change comment cost -> cost/loss

#26
I noticed that the comments and variable names around the cost function mix the terms cost and loss.
Since people just starting out might be confused by the terminology, I wrote both as "cost/loss" in the comments. What do you think?

Keras 2 API support


  • klab-11-1-cnn_mnist.py
    line 61 : model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
    border_mode='valid',
    input_shape=input_shape))
    line 63 : model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
    # on Keras 2, Convolution2D should be changed to Conv2D (see the sketch after this list)
  • klab-09-2-xor-nn.py
    line 14 : model.fit(x_data, y_data, nb_epoch=2000)
    # on Keras 2, nb_epoch should be changed to epochs
  • klab-05-2-logistic_regression_diabetes.py
    line 14 : model.fit(x_data, y_data, nb_epoch=2000)
    # on Keras 2, nb_epoch should be changed to epochs
  • klab-04-3-file_input_linear_regression.py
    line 15 : model.fit(x_data, y_data, nb_epoch=2000)
    # on Keras 2, nb_epoch should be changed to epochs
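
A hedged sketch of the Keras 2 equivalents; the surrounding model definition is an illustrative assumption, only the layer and argument names are the point:

from keras.models import Sequential
from keras.layers import Conv2D

nb_filters, kernel_size, input_shape = 32, (3, 3), (28, 28, 1)

model = Sequential()
# Keras 1: Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='valid', ...)
model.add(Conv2D(nb_filters, kernel_size, padding='valid', input_shape=input_shape))
model.add(Conv2D(nb_filters, kernel_size))

# Keras 1: model.fit(x_data, y_data, nb_epoch=2000)
# Keras 2: model.fit(x_data, y_data, epochs=2000)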

TF examples take over all GPU memory

I find that if we run the TF examples, they will try to take over all of the available GPU memory (see: https://www.tensorflow.org/tutorials/using_gpu). This can cause trouble on public servers where many users share the GPUs.

For example, when running klab-12-5-seq2seq.py, the GPU usage could be like this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:02:00.0     Off |                  N/A |
| 17%   47C    P2    44W / 200W |   7786MiB /  8113MiB |      9%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 0000:03:00.0     Off |                  N/A |
|  0%   41C    P2    43W / 200W |   7715MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1080    Off  | 0000:82:00.0     Off |                  N/A |
|  0%   44C    P2    43W / 200W |   7715MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

I used the method suggested in https://www.tensorflow.org/tutorials/using_gpu and the memory will be allocated incrementally now.

(Adding the following lines after the import will solve the problem https://github.com/hunkim/DeepLearningZeroToAll/blob/master/Keras/klab-12-5-seq2seq.py#L12)

import tensorflow as tf
from keras.utils.vis_utils import plot_model
from keras.backend.tensorflow_backend import set_session

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
set_session(tf.Session(config=config))

The GPU memory usage becomes:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:02:00.0     Off |                  N/A |
|  0%   47C    P2    44W / 200W |    294MiB /  8113MiB |     12%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 0000:03:00.0     Off |                  N/A |
|  0%   41C    P8    14W / 200W |    115MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1080    Off  | 0000:82:00.0     Off |                  N/A |
|  0%   44C    P8    14W / 200W |    115MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

Is there a way to enable this by default?
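
For the plain TensorFlow labs (not just Keras), a hedged sketch of the same idea is to pass a config with allow_growth when the session is created:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # allocate GPU memory incrementally

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    # ... run the training loop as usual ...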

SELU does not work well for MNIST

I simply added SELU, but it does not work well in our setting.

@kkweon Any thoughts? The code is https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-8-mnist_nn_selu(wip).py.
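
For reference, a hedged sketch of the SELU activation as published by Klambauer et al. (2017); this only illustrates the activation itself and is not necessarily the code in lab-10-8-mnist_nn_selu(wip).py:

import tensorflow as tf

def selu(x):
    # published alpha/scale constants from the SELU paper
    alpha = 1.6732632423543772
    scale = 1.0507009873554805
    return scale * tf.where(x >= 0.0, x, alpha * tf.exp(x) - alpha)

# SELU expects inputs normalized to zero mean / unit variance and, ideally,
# weights initialized with stddev = sqrt(1 / fan_in) ("lecun_normal").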

ReLU:

Epoch: 0001 cost = 0.467785273
Epoch: 0002 cost = 0.172470962
Epoch: 0003 cost = 0.130023701
Epoch: 0004 cost = 0.108112177
Epoch: 0005 cost = 0.095870705
Epoch: 0006 cost = 0.085150063
Epoch: 0007 cost = 0.076167965
Epoch: 0008 cost = 0.068576291
Epoch: 0009 cost = 0.065072622
Epoch: 0010 cost = 0.057339554
Epoch: 0011 cost = 0.056553968
Epoch: 0012 cost = 0.050890055
Epoch: 0013 cost = 0.052106281
Epoch: 0014 cost = 0.048473668
Epoch: 0015 cost = 0.046916210
Epoch: 0016 cost = 0.045329482
Epoch: 0017 cost = 0.044389233
Epoch: 0018 cost = 0.040508846
Epoch: 0019 cost = 0.040360809
Epoch: 0020 cost = 0.039615057
Epoch: 0021 cost = 0.033379408
Epoch: 0022 cost = 0.038295556
Epoch: 0023 cost = 0.035628493
Epoch: 0024 cost = 0.035665076
Epoch: 0025 cost = 0.036193184
Epoch: 0026 cost = 0.032183025
Epoch: 0027 cost = 0.029882973
Epoch: 0028 cost = 0.034569517
Epoch: 0029 cost = 0.029946580
Epoch: 0030 cost = 0.029602805
Epoch: 0031 cost = 0.030723976
Epoch: 0032 cost = 0.030724677
Epoch: 0033 cost = 0.031303227
Epoch: 0034 cost = 0.029194953
Epoch: 0035 cost = 0.026885245
Epoch: 0036 cost = 0.029389468
Epoch: 0037 cost = 0.027995968
Epoch: 0038 cost = 0.028875235
Epoch: 0039 cost = 0.028979162
Epoch: 0040 cost = 0.028229359
Epoch: 0041 cost = 0.026483263
Epoch: 0042 cost = 0.027438245
Epoch: 0043 cost = 0.028106798
Epoch: 0044 cost = 0.025572559
Epoch: 0045 cost = 0.029911634
Epoch: 0046 cost = 0.025304895
Epoch: 0047 cost = 0.023481862
Epoch: 0048 cost = 0.024350552
Epoch: 0049 cost = 0.027861851
Epoch: 0050 cost = 0.023458240
Learning Finished!
Accuracy: 0.9811

SELU:

Epoch: 0001 cost = 0.581152383
Epoch: 0002 cost = 0.152600192
Epoch: 0003 cost = 0.123071806
Epoch: 0004 cost = 0.112109157
Epoch: 0005 cost = 0.108158814
Epoch: 0006 cost = 0.085280647
Epoch: 0007 cost = 0.079516315
Epoch: 0008 cost = 0.080767340
Epoch: 0009 cost = 0.081876351
Epoch: 0010 cost = 0.071583069
Epoch: 0011 cost = 0.072077164
Epoch: 0012 cost = 0.061165387
Epoch: 0013 cost = 0.051244403
Epoch: 0014 cost = 0.060514518
Epoch: 0015 cost = 0.053664495
Epoch: 0016 cost = 0.052410684
Epoch: 0017 cost = 0.051711780
Epoch: 0018 cost = 0.042591221
Epoch: 0019 cost = 0.049355222
Epoch: 0020 cost = 0.035542590
Epoch: 0021 cost = 0.045674886
Epoch: 0022 cost = 0.044495216
Epoch: 0023 cost = 0.038447011
Epoch: 0024 cost = 0.041875363
Epoch: 0025 cost = 0.033574819
Epoch: 0026 cost = 0.032955440
Epoch: 0027 cost = 0.040609113
Epoch: 0028 cost = 0.030550260
Epoch: 0029 cost = 0.036755668
Epoch: 0030 cost = 0.038283713
Epoch: 0031 cost = 0.031447820
Epoch: 0032 cost = 0.035755835
Epoch: 0033 cost = 0.025528288
Epoch: 0034 cost = 0.026880071
Epoch: 0035 cost = 0.034818432
Epoch: 0036 cost = 0.031805422
Epoch: 0037 cost = 0.030338240
Epoch: 0038 cost = 0.028175194
Epoch: 0039 cost = 0.029212171
Epoch: 0040 cost = 0.037782007
Epoch: 0041 cost = 0.047215328
Epoch: 0042 cost = 0.019241051
Epoch: 0043 cost = 0.027530965
Epoch: 0044 cost = 0.025878927
Epoch: 0045 cost = 0.023170506
Epoch: 0046 cost = 0.014257575
Epoch: 0047 cost = 0.035372872
Epoch: 0048 cost = 0.028085495
Epoch: 0049 cost = 0.018021979
Epoch: 0050 cost = 0.035318558
Learning Finished!
Accuracy: 0.977

A question about the code in lab-03-2

In lab-03-2, the cost is defined as
cost = tf.reduce_sum(tf.square(hypothesis - Y))
but shouldn't it be
cost = tf.reduce_mean(tf.square(hypothesis - Y))
to be exact, since the cost is the mean over all samples?

Of course, using the sum or the mean does not change where the minimum is,
but I'm wondering whether there is a reason it has to be the sum.

[Error] klab-11-1-cnn_mnist.py

Traceback (most recent call last):
File "klab-11-1-cnn_mnist.py", line 59, in
model.add(Conv2D(nb_filters, kernel_size, padding='valid', input_shape=input_shape))
TypeError: __init__() missing 1 required positional argument: 'nb_col'

Why do I get this error?

Minor error in lab-10-X1, backprop process

I always appreciate this repository and the lecture videos on YouTube.
Unfortunately, I think there is an error in the backpropagation step described in lab-10-X1.

Line 43, d_l2 = diff * sigma_prime(l2), I think it should be d_l2 = diff.

Lab-09-05

Hello, Professor.
I have a question about the recently uploaded "implementing backprop with TF" code, since some parts differ from what I expected.

  1. tf.one_hot(Y, nb_classes) produces an incorrect encoding when the labels in Y do not start at 0. The labels in the current Y data start at 1.

Reference code

If the input y starts at 1:
[1 2 3 4 5 6 7]

Result of tf.one_hot(y, 7):
[[ 0.  1.  0.  0.  0.  0.  0.]
 [ 0.  0.  1.  0.  0.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.]
 [ 0.  0.  0.  0.  0.  1.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.]
 [ 0.  0.  0.  0.  0.  0.  0.]]

If the input y starts at 0:
[0 1 2 3 4 5 6]

Result of tf.one_hot(y, 7):
[[ 1.  0.  0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.]
 [ 0.  0.  1.  0.  0.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.]
 [ 0.  0.  0.  0.  0.  1.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.]]
  2. The backpropagation formula differs from what I expected.
    Current code:
d_layer1 = diff * sigma_prime(layer1)
d_b = 1 * d_layer1
d_w = tf.matmul(tf.transpose(X), d_layer1)

What I have in mind instead:

  • there should be no sigma_prime, and
  • d_w should be divided by N.
N = xy.shape[0]  # number of data samples
d_layer1 = diff
d_b = 1 * d_layer1
d_w = tf.matmul(tf.transpose(X), d_layer1) / N
  • Why sigma_prime is not needed:
# Forward pass
z = x * w + b
a = sigmoid(z)

Loss = - sum[ y_i * log(a_i) + (1 - y_i) * log(1 - a_i) ] / N

Loss_i = - y_i * log(a_i) - (1 - y_i) * log(1 - a_i)

dLoss_i/d_ai = (a_i - y_i) / (a_i * (1 - a_i))
d_ai/dz_i = a_i * (1 - a_i) # derivative of the sigmoid

dz_i/dw = x_i
dz_i/db = 1
Therefore,

dLoss_i / da_i * da_i/dz_i = dLoss_i/dz_i = (a_i - y_i)  # the sigmoid derivative term cancels

dLoss_i/dw = (a_i - y_i) x_i

Converting this to matrix form:

dLoss_i / dW = dW = x^T * (a - y)

dLoss / dW = x^T * (a - y) / N
  • Why we divide by N:
Loss = - reduce_mean[ y_i * log(a_i) + (1 - y_i) * log(1 - a_i) ]
     = - sum[...] / N

so that N carries all the way through to dLoss.

The lab code where the above issue appears

Adding Batch Normalization Examples

Abstract

TL;DR Let's add a batch norm

Issue

The same issue was raised a few days ago, but I don't see a PR coming.
Anyway, I think it's a good idea to add batch normalization examples because it's quite confusing for beginners.

Why it's confusing: when there is a batch normalization layer, the exponential moving averages (momentum) must be updated at training time. However, these update ops are not part of the training step by default, because they are not needed to compute the output of the network.

Therefore, one must include

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope=name)
with tf.control_dependencies(update_ops):
    train_op = optimizer(lr).minimize(self.loss)

There are other ways but this is the simplest one.
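
A hedged end-to-end sketch of that pattern (layer sizes, optimizer, and names are illustrative assumptions, not the final lab code):

import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])
is_training = tf.placeholder(tf.bool)

h = tf.layers.dense(X, 256)
h = tf.layers.batch_normalization(h, training=is_training)  # registers update ops
h = tf.nn.relu(h)
logits = tf.layers.dense(h, 10)

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=logits))

# make the train op depend on the moving-average updates
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)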

Suggestion

A question about the cost calculation in lab-10-5-mnist_nn_dropout.py

batch_xs, batch_ys = mnist.train.next_batch(batch_size)
feed_dict = {X: batch_xs, Y: batch_ys, keep_prob: 0.7}
sess.run(optimizer, feed_dict=feed_dict)
avg_cost += sess.run(cost, feed_dict=feed_dict) / total_batch

The code above is an excerpt from the session part of lab-10-5-mnist_nn_dropout.py.
There, cost is run again after running optimizer; when dropout is applied, the set of active neurons seems to differ between those two runs.

batch_xs, batch_ys = mnist.train.next_batch(batch_size)
sess.run([optimizer],feed_dict={X:batch_xs,Y:batch_ys,dropout_rate:0.7})
cost_1 = sess.run(cost,feed_dict={X:batch_xs,Y:batch_ys,dropout_rate:0.7})
cost_2 = sess.run(cost,feed_dict={X:batch_xs,Y:batch_ys,dropout_rate:0.7})
avg_cost+=cost_1/total_batch

If I modify the code as above, cost_1 equals the cost of the original code and cost_2 is just the value obtained from running it once more.

In this case I confirmed that cost_1 and cost_2 have different values.

The output of print(cost_1, cost_2) is shown below.

[Screenshot: cost_1 and cost_2 print different values]

So the code itself runs fine, but (only when dropout is applied) I think the cost should be computed in the same run as the optimizer to report an accurate training error, for example as sketched below. Or can the current way of computing it also be considered a training error? The cost computed now does still indicate how well the network is converging, so I'm confused about this point.
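
A hedged sketch of that change, reusing the names from the excerpt above, so cost and the optimizer see the same dropout mask:

c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
avg_cost += c / total_batch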

Thank you.

lab04-03 vs. lab04-04: Returns of 'tf.decode_csv()'

I thought the 'xy' in both source files was the same matrix, which leads to a question about why they are sliced differently in your files. The results of xy[:, 0:-1], xy[:, [-1]] are clearly different from xy[0:-1], xy[-1:] in NumPy.

I tried to inspect the result of 'tf.decode_csv()' from the terminal, but it took too long on my laptop. Also, the description of its return value at https://www.tensorflow.org/api_docs/python/tf/decode_csv is too vague for me to understand.

Based on this information(https://www.tensorflow.org/programmers_guide/reading_data), I tried changing code little like that:

$ diff lab-04-4-tf_reader_linear_regression.py lab-04-4-tf_reader_linear_regression_2.py
20c20,21
< xy = tf.decode_csv(value, record_defaults=record_defaults)
---
> col1, col2, col3, col4 = tf.decode_csv(value, record_defaults=record_defaults)
> features = tf.stack([col1, col2, col3])
24c25
<     tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10)
---
>     tf.train.batch([features, [col4]], batch_size=10)

Their results are the same, so can I conclude that tf.decode_csv returns one tensor per column ([], [], [], []) because record_defaults is ([], [], [], [])? And when you wrote tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10), does it mean the following in lab-04-04?

xy = [col0, col1, col2, col3]
xy[0:-1] -> columns 0..2 (the features), xy[-1:] -> column 3 (the label, index -1)

If my question is too ambiguous to understand, I will make it clearer.

I installed TensorFlow but I can't figure out why it isn't recognized

Right now, if I activate TensorFlow in the Anaconda Prompt and start Python, import tensorflow as tf works fine.
But when I run code in Atom, I get "No module named tensorflow".
So the hello-TensorFlow example does not run.
And when I paste in the linear regression example, it doesn't work in cmd or in Atom either.
TensorFlow 1.0 is installed.
Does anyone know why this doesn't work?
It's very frustrating.
I'm using Windows 10.

convert py to ipynb

I'm planning to convert the .py files to .ipynb.
Is anyone else already working on this conversion,
or is there a list of files to convert?

Naming of w_val, cost_val

I think Lab03-1 could use w_record, cost_record, or other names instead of w_val and cost_val;
the current names are hard to understand at a glance.

refactoring MinMaxScaler

Summary

  • Currently, two types of MinMaxScaler are used across the labs
  • It's not consistent
  • Let's choose one and standardize!

Type 1: using sklearn

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
xy = scaler.fit_transform(xy)

Problems

  1. it creates another dependency on sklearn

Type 2: custom defined function

import numpy as np

def MinMaxScaler(data):
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    # noise term prevents the zero division
    return numerator / (denominator + 1e-7)

xy = MinMaxScaler(xy)

Problems

  1. Instead of MinMaxScaler, it should be named min_max_scaler(data) since it's a function.
  2. It takes multiple lines every time we use it (maybe create a utils.py? see the sketch after this list)
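
A hedged sketch of such a shared helper (utils.py and the function name are hypothetical, not files that exist in the repo yet):

# utils.py -- hypothetical shared helper
import numpy as np

def min_max_scaler(data):
    """Scale each column of `data` to the [0, 1] range."""
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    # noise term prevents division by zero
    return numerator / (denominator + 1e-7)

Each lab script could then simply do: from utils import min_max_scaler.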

Relevant files

  • lab-07-3-linear_regression_min_max.py
  • lab-12-5-rnn_stock_prediction.py
  • Keras/klab-04-4-stock_linear_regression.py
  • Keras/klab-12-3-rnn_stock_prediction.py
  • Keras/klab-12-4-rnn_deep_prediction.py
  • PyTorch/lab-12-5-stock_prediction.py

OutOfRangeError

While following the lecture "ML lab 04-2: Reading data from a file with TensorFlow",
I'm trying it out with a dataset from Kaggle.

I imported the HR_comma_sep.csv data and ran the following:

# coding: utf-8

import tensorflow as tf
import numpy as np

url = '../kookjinkim/Downloads/HR_comma_sep.csv'
filename_queue = tf.train.string_input_producer([url], shuffle=False, name='filename_queue')

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)

record_defaults = [[0.], [0.], [0.], [0.], [0.], [0.], [0.], [0.], [0.], [0.]]
xy = tf.decode_csv(value, record_defaults=record_defaults)

train_x_batch, train_y_batch = tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10)

train_x_batch  # inspect the tensor

X = tf.placeholder(tf.float32, shape=[None, 9])
Y = tf.placeholder(tf.float32, shape=[None, 1])

W = tf.Variable(tf.random_normal([9, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')

hypothesis = tf.sigmoid(tf.matmul(X, W) + b)

cost = tf.reduce_mean(tf.square(hypothesis - Y))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

coord = tf.train.Coordinator()
thread = tf.train.start_queue_runners(sess=sess, coord=coord)

for step in range(2001):
    x_batch, y_batch = sess.run([train_x_batch, train_y_batch])
    if step % 100 == 0:
        print(x_batch, y_batch)
    cost_val, hy_val = sess.run([cost, hypothesis], feed_dict={X: x_batch, Y: y_batch})

    if step % 100 == 0:
        print(step, "cost: ", cost_val, "Prediction: ", hy_val)

coord.clear_stop()
coord.join(thread)

At the loop above, I get the following error:

OutOfRangeError Traceback (most recent call last)
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1021 try:
-> 1022 return fn(*args)
1023 except errors.OpError as e:

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
1003 feed_dict, fetch_list, target_list,
-> 1004 status, run_metadata)
1005

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/contextlib.py in __exit__(self, type, value, traceback)
88 try:
---> 89 next(self.gen)
90 except StopIteration:

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in raise_exception_on_not_ok_status()
465 compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 466 pywrap_tensorflow.TF_GetCode(status))
467 finally:

OutOfRangeError: FIFOQueue '_2_batch/fifo_queue' is closed and has insufficient elements (requested 10, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]

During handling of the above exception, another exception occurred:

OutOfRangeError Traceback (most recent call last)
in ()
1 for step in range(2001):
----> 2 x_batch, y_batch = sess.run([train_x_batch, train_y_batch])
3 if step % 100 == 0:
4 print(x_batch,y_batch)
5 cost_val,hy_val = sess.run([cost,hypothesis], feed_dict = {X: x_batch,Y:y_batch})

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
765 try:
766 result = self._run(None, fetches, feed_dict, options_ptr,
--> 767 run_metadata_ptr)
768 if run_metadata:
769 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
963 if final_fetches or final_targets:
964 results = self._do_run(handle, final_targets, final_fetches,
--> 965 feed_dict_string, options, run_metadata)
966 else:
967 results = []

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1013 if handle is None:
1014 return self._do_call(_run_fn, self._session, feed_dict, fetch_list,
-> 1015 target_list, options, run_metadata)
1016 else:
1017 return self._do_call(_prun_fn, self._session, handle, feed_dict,

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1033 except KeyError:
1034 pass
-> 1035 raise type(e)(node_def, op, message)
1036
1037 def _extend_graph(self):

OutOfRangeError: FIFOQueue '_2_batch/fifo_queue' is closed and has insufficient elements (requested 10, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]

Caused by op 'batch', defined at:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ipykernel/main.py", line 3, in
app.launch_new_instance()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 474, in start
ioloop.IOLoop.instance().start()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tornado/ioloop.py", line 887, in start
handler_func(fd_obj, events)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 276, in dispatcher
return self.dispatch_shell(stream, msg)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 228, in dispatch_shell
handler(stream, idents, msg)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 390, in execute_request
user_expressions, allow_stdin)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 501, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2717, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2821, in run_ast_nodes
if self.run_code(code, result):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
train_x_batch, train_y_batch = tf.train.batch([xy[0:-1],xy[-1:]], batch_size=10)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 872, in batch
name=name)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 667, in _batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 458, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 1310, in _queue_dequeue_many_v2
timeout_ms=timeout_ms, name=name)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1226, in init
self._traceback = _extract_stack()

OutOfRangeError (see above for traceback): FIFOQueue '_2_batch/fifo_queue' is closed and has insufficient elements (requested 10, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]

lab-12-4-rnn_long_char ValueError

outputs, _states = tf.nn.dynamic_rnn(cell, X_one_hot, dtype=tf.float32)
X_for_fc = tf.reshape(outputs, [-1, hidden_size])
outputs = tf.contrib.layers.fully_connected(X_for_fc, num_classes, activation_fn=None)
outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])

ValueError: Attempt to reuse RNNCell <tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.BasicLSTMCell object at 0x00000256226FEC50> with a different variable scope than its first use. First use of cell was with scope 'rnn/multi_rnn_cell/cell_0/basic_lstm_cell', this attempt is with scope 'rnn/multi_rnn_cell/cell_1/basic_lstm_cell'. Please create a new instance of the cell if you would like it to use a different set of weights. If before you were using: MultiRNNCell([BasicLSTMCell(...)] * num_layers), change to: MultiRNNCell([BasicLSTMCell(...) for _ in range(num_layers)]). If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.)
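
A minimal hedged sketch of the change the error message asks for (hidden_size and num_layers are illustrative values, not necessarily those in lab-12-4): build a separate BasicLSTMCell instance per layer instead of reusing one cell object.

import tensorflow as tf

hidden_size, num_layers = 128, 2

def lstm_cell():
    return tf.contrib.rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)

cell = tf.contrib.rnn.MultiRNNCell(
    [lstm_cell() for _ in range(num_layers)], state_is_tuple=True)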

supplement code for "Out of Memory" issue

file: lab-11-2-mnist_deep_cnn.py

The mnist.test data set is too big for some systems, which causes an out-of-memory error:
"ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[10000,32,28,28]"

I think it would be good to provide supplementary code in a comment, for example:
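
A hedged sketch (accuracy, X, Y, keep_prob, sess, and mnist are the objects already defined in lab-11-2-mnist_deep_cnn.py; the helper name and batch size are assumptions): evaluate the test set in chunks instead of feeding all 10,000 images at once.

import numpy as np

def evaluate_in_batches(sess, accuracy, X, Y, keep_prob, images, labels, batch_size=500):
    accuracies = []
    for start in range(0, len(images), batch_size):
        end = start + batch_size
        acc = sess.run(accuracy, feed_dict={X: images[start:end],
                                            Y: labels[start:end],
                                            keep_prob: 1.0})
        accuracies.append(acc)
    return np.mean(accuracies)

# print('Accuracy:', evaluate_in_batches(sess, accuracy, X, Y, keep_prob,
#                                        mnist.test.images, mnist.test.labels))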

Cost(loss) result using placeholder vs. primitive list

I noticed that the cost reported by 'lab-02-2-linear_regression_feed.py' is slightly different when using a placeholder versus a plain Python list.

x_train = [1, 2, 3]
y_train = [1, 2, 3]

# Our hypothesis XW+b
hypothesis = x_train * W + b

# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - y_train))

0 2.82329 [ 2.12867713] [-0.85235667]
20 0.190351 [ 1.53392804] [-1.05059612]
...
1980 1.32962e-05 [ 1.00423515] [-0.00962736]
2000 1.20761e-05 [ 1.00403607] [-0.00917497]

[ 5.0110054]
[ 2.50091505]
[ 1.49687922 3.50495124]

X = tf.placeholder(tf.float32, shape=[None])
Y = tf.placeholder(tf.float32, shape=[None])

# Our hypothesis XW+b
hypothesis = X * W + b

# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))

sess.run([cost, W, b, train], feed_dict={X: [1, 2, 3], Y: [1, 2, 3]})

0 3.52408 [ 2.12867713] [-0.85235667]
20 0.197499 [ 1.53392804] [-1.05059612]
1960 1.47111e-05 [ 1.004444] [-0.01010205]
1980 1.3361e-05 [ 1.00423515] [-0.00962736]
...
2000 1.21343e-05 [ 1.00403607] [-0.00917497]

[ 5.0110054]
[ 2.50091505]
[ 1.49687922 3.50495124]

Hypothesis, W, and b are the same, while the cost is different even with those identical values. I would expect the cost to be equal regardless of the node type, so I cannot understand why the cost differs while everything else agrees. Did I miss something?

patch to remove a UserWarning

I found that a UserWarning occurs when the learning rate parameter is passed to the compile() function,
so I made a PR.
Sorry for not noticing the contribution guideline documents before opening the PR.

Lab experiments in MXNet

I've decided to add the following 7 lab experiments implemented in MXNet.
Also, all the scripts will be tested against the latest MXNet master.

TODOs

Basic

  • Lab 4-3
  • Lab 5-2
  • Lab 6-2
  • Lab 11-2
  • Lab 11-5
  • Lab 12-4
  • Lab 12-5

Advanced

  • Spatial Transformer Network
  • GAN that fits a mixture of Gaussian

try to make model in Class

I'm trying to build an LSTM model as a class,

but when I run it with the class I get NaN values.

Can anyone help?

Here is my code:

class ForecastLSTM(object):

    def __init__(self, sequence_length, data_dim, output_dim, hidden_dim):
        self.X = tf.placeholder(tf.float32, [None, sequence_length, data_dim], name="X")
        self.Y = tf.placeholder(tf.float32, [None, output_dim], name="Y")

        cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_dim, state_is_tuple=True,
                                            activation=tf.tanh)
        outputs, _states = tf.nn.dynamic_rnn(cell, self.X, dtype=tf.float32)
        self.Y_pred = tf.contrib.layers.fully_connected(outputs[:, -1], output_dim,
                                                        activation_fn=None)

        self.loss = tf.reduce_sum(tf.sqrt((self.Y_pred - self.Y)))


with sess.as_default():

    forecast = ForecastLSTM(sequence_length=5,
                            data_dim=5,
                            output_dim=1,
                            hidden_dim=10)

    optimizer = tf.train.AdagradOptimizer(1e-2)
    train = optimizer.minimize(forecast.loss)

    sess.run(tf.global_variables_initializer())

    for i in range(10000):
        _, l = sess.run([train, forecast.loss],
                        feed_dict={forecast.X: trainX, forecast.Y: trainY})
        print(i, l)
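
A hedged observation rather than a confirmed diagnosis: tf.sqrt of a negative difference is NaN, so a loss of the form tf.reduce_sum(tf.sqrt(Y_pred - Y)) becomes NaN as soon as any prediction falls below its target; a squared-error loss avoids this. A tiny demonstration:

import tensorflow as tf

with tf.Session() as sess:
    y_pred = tf.constant([[0.5], [2.0]])
    y_true = tf.constant([[1.0], [1.0]])
    print(sess.run(tf.sqrt(y_pred - y_true)))     # [[nan], [1.]]
    print(sess.run(tf.square(y_pred - y_true)))   # [[0.25], [1.]]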
