Comments (12)
We just need to divide by the batch size.
N = X.shape[0]
d_W1 = tf.matmul(tf.transpose(X), d_l1) / N
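A minimal NumPy sketch of why the division is needed, assuming a toy batch (the random values and the batch size of 4 are illustrative, not taken from the lab file). The product of the transposed input with the upstream gradient sums one outer product per sample, so dividing by N turns that sum into a mean over the batch:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4                                # batch size (illustrative)
X = rng.normal(size=(N, 2))          # inputs, one row per sample
d_l1 = rng.normal(size=(N, 2))       # upstream gradient, one row per sample

# X^T @ d_l1 adds up one outer product per sample ...
summed = X.T @ d_l1
per_sample = np.stack([np.outer(X[i], d_l1[i]) for i in range(N)])

# ... so dividing by N gives the mean gradient over the batch.
d_W1 = summed / N
```

Without the division, the effective learning rate would grow with the batch size.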
from deeplearningzerotoall.
How about something like this? I removed sigma_prime.
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
Y_pred = tf.sigmoid(tf.matmul(layer1, W2) + b2)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(Y_pred) + (1 - Y) *
                       tf.log(1 - Y_pred))
d_Y_pred = (Y_pred - Y) / (Y_pred * (1.0 - Y_pred) + 1e-7)
d_sigma = Y_pred * (1 - Y_pred)
# Layer 2
d_o2 = d_Y_pred * d_sigma
d_l2 = tf.multiply(d_o2, d_sigma)
d_b2 = d_l2
d_W2 = tf.matmul(tf.transpose(layer1), d_l2)
# Mean
d_b2_mean = tf.reduce_mean(d_b2, axis=[0])
d_W2_mean = d_W2 / tf.cast(tf.shape(layer1)[0], dtype=tf.float32)
# Layer 1
d_o1 = layer1 * (1-layer1)
d_l1 = tf.multiply(tf.matmul(d_l2, tf.transpose(W2)), d_o1)
d_b1 = d_l1
d_W1 = tf.matmul(tf.transpose(X), d_l1)
# Mean
d_W1_mean = d_W1 / tf.cast(tf.shape(X)[0], dtype=tf.float32)
d_b1_mean = tf.reduce_mean(d_b1, axis=[0])
# Weight update
step = [
    tf.assign(W2, W2 - learning_rate * d_W2_mean),
    tf.assign(b2, b2 - learning_rate * d_b2_mean),
    tf.assign(W1, W1 - learning_rate * d_W1_mean),
    tf.assign(b1, b1 - learning_rate * d_b1_mean)
]
from deeplearningzerotoall.
I don't have a machine to run a test right now, but I guess it will work.
However, I'm not sure if it's better. It looks quite complicated to me already.
from deeplearningzerotoall.
@kkweon Do you need a machine to run it? :-) I think your brain is enough.
The previous code starts with diff = hypothesis - Y, which is hard to understand.
Let me know if you have any refactoring suggestions.
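For readers wondering where a starting point like diff = hypothesis - Y comes from, here is a small NumPy check. It assumes a sigmoid output with the cross-entropy cost used in this thread; the labels and predictions below are made-up stand-ins. Multiplying the loss derivative by the sigmoid derivative cancels the denominator, leaving the plain difference:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = np.array([[0.0], [1.0], [1.0], [0.0]])            # XOR-style labels
Y_pred = rng.uniform(0.05, 0.95, size=Y.shape)        # stand-in sigmoid outputs

# d(loss)/d(Y_pred) for cross-entropy (the 1e-7 guard is left out here)
d_Y_pred = (Y_pred - Y) / (Y_pred * (1.0 - Y_pred))
# d(sigmoid)/d(pre-activation), written in terms of the output
d_sigma = Y_pred * (1.0 - Y_pred)

# Chain rule: the two factors cancel to Y_pred - Y
diff = d_Y_pred * d_sigma
```

With the 1e-7 term in the denominator, as in the thread's code, the equality holds only approximately.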
from deeplearningzerotoall.
@hunkim
I personally prefer d_o2 * d_sigma over tf.multiply(d_o2, d_sigma), because:
- it's more natural
- it's safer, because every operator is overloaded in the tf.Tensor class and tested
- it's less verbose
As you may remember, when TensorFlow reached 1.0, all the basic operations were renamed.
People who used tf.mul had to manually change their code to tf.multiply.
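The point about renames can be illustrated with a plain-Python sketch. The Tensor class and multiply function below are hypothetical stand-ins written for this example, not TensorFlow's actual implementation; they only show how an overloaded operator shields callers from the name of the underlying function:

```python
class Tensor:
    """Hypothetical stand-in for a tensor type; not TensorFlow's tf.Tensor."""

    def __init__(self, values):
        self.values = list(values)

    def __mul__(self, other):
        # The operator forwards to the named function, so code that writes
        # `a * b` keeps working even if that function is renamed later.
        return multiply(self, other)


def multiply(a, b):
    """Element-wise product (the name is illustrative)."""
    return Tensor(x * y for x, y in zip(a.values, b.values))


d_o2 = Tensor([1.0, 2.0, 3.0])
d_sigma = Tensor([0.5, 0.5, 0.5])

via_operator = d_o2 * d_sigma           # unaffected by a rename of multiply
via_function = multiply(d_o2, d_sigma)  # breaks if multiply is renamed
```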
from deeplearningzerotoall.
Refactored:
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(Y_pred) + (1 - Y) *
                       tf.log(1 - Y_pred))
# Loss derivative
d_Y_pred = (Y_pred - Y) / (Y_pred * (1.0 - Y_pred) + 1e-7)
# Layer 2
d_sigma2 = Y_pred * (1 - Y_pred)
d_l2 = d_Y_pred * d_sigma2
d_b2 = d_l2
d_W2 = tf.matmul(tf.transpose(layer1), d_l2)
# Mean
d_b2_mean = tf.reduce_mean(d_b2, axis=[0])
d_W2_mean = d_W2 / tf.cast(tf.shape(layer1)[0], dtype=tf.float32)
# Layer 1
d_sigma1 = layer1 * (1-layer1)
d_l1 = d_l2 * d_sigma1
d_b1 = d_l1
d_W1 = tf.matmul(tf.transpose(X), d_l1)
# Mean
d_W1_mean = d_W1 / tf.cast(tf.shape(X)[0], dtype=tf.float32)
d_b1_mean = tf.reduce_mean(d_b1, axis=[0])
# Weight update
step = [
    tf.assign(W2, W2 - learning_rate * d_W2_mean),
    tf.assign(b2, b2 - learning_rate * d_b2_mean),
    tf.assign(W1, W1 - learning_rate * d_W1_mean),
    tf.assign(b1, b1 - learning_rate * d_b1_mean)
]
make sense?
from deeplearningzerotoall.
looks good. autopep8 will do the rest.
from deeplearningzerotoall.
@kkweon This is the right version:
# Network
#          p1     a1           l1     p2     a2           l2 (y_pred)
# X -> (*) -> (+) -> (sigmoid) -> (*) -> (+) -> (sigmoid) -> (loss)
#       ^      ^                   ^      ^
#       |      |                   |      |
#       W1     b1                  W2     b2
# Loss derivative
d_Y_pred = (Y_pred - Y) / (Y_pred * (1.0 - Y_pred) + 1e-7)
# Layer 2
d_sigma2 = Y_pred * (1 - Y_pred)
d_a2 = d_Y_pred * d_sigma2
d_p2 = d_a2
d_b2 = d_a2
d_W2 = tf.matmul(tf.transpose(l1), d_p2)
# Mean
d_b2_mean = tf.reduce_mean(d_b2, axis=[0])
d_W2_mean = d_W2 / tf.cast(tf.shape(l1)[0], dtype=tf.float32)
# Layer 1
d_l1 = tf.matmul(d_p2, tf.transpose(W2))
d_sigma1 = l1 * (1 - l1)
d_a1 = d_l1 * d_sigma1
d_b1 = d_a1
d_p1 = d_a1
d_W1 = tf.matmul(tf.transpose(X), d_a1)
# Mean
d_W1_mean = d_W1 / tf.cast(tf.shape(X)[0], dtype=tf.float32)
d_b1_mean = tf.reduce_mean(d_b1, axis=[0])
# Weight update
step = [
    tf.assign(W2, W2 - learning_rate * d_W2_mean),
    tf.assign(b2, b2 - learning_rate * d_b2_mean),
    tf.assign(W1, W1 - learning_rate * d_W1_mean),
    tf.assign(b1, b1 - learning_rate * d_b1_mean)
]
Can you run it in your brain?
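For anyone who would rather not rely on their brain alone, the derivation can be checked numerically. The sketch below is a NumPy translation of the same steps, not the lab code itself: it assumes XOR data and random initial weights, omits the 1e-7 guard, and compares the manual gradients against central finite differences of the cost. The helper names forward and numerical_grad are made up for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, Y, W1, b1, W2, b2):
    l1 = sigmoid(X @ W1 + b1)
    Y_pred = sigmoid(l1 @ W2 + b2)
    cost = -np.mean(Y * np.log(Y_pred) + (1 - Y) * np.log(1 - Y_pred))
    return l1, Y_pred, cost

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)
W1, b1 = rng.normal(size=(2, 2)), rng.normal(size=2)
W2, b2 = rng.normal(size=(2, 1)), rng.normal(size=1)

l1, Y_pred, cost = forward(X, Y, W1, b1, W2, b2)
N = X.shape[0]

# Manual backprop, following the same steps as the code above
d_Y_pred = (Y_pred - Y) / (Y_pred * (1.0 - Y_pred))
d_a2 = d_Y_pred * (Y_pred * (1 - Y_pred))   # through the output sigmoid
d_W2_mean = l1.T @ d_a2 / N
d_b2_mean = d_a2.mean(axis=0)
d_l1 = d_a2 @ W2.T                          # back through W2
d_a1 = d_l1 * (l1 * (1 - l1))               # through the hidden sigmoid
d_W1_mean = X.T @ d_a1 / N
d_b1_mean = d_a1.mean(axis=0)

def numerical_grad(param, eps=1e-6):
    """Central finite differences of the cost; perturbs `param` in place."""
    grad = np.zeros_like(param)
    for idx in np.ndindex(param.shape):
        old = param[idx]
        param[idx] = old + eps
        plus = forward(X, Y, W1, b1, W2, b2)[2]
        param[idx] = old - eps
        minus = forward(X, Y, W1, b1, W2, b2)[2]
        param[idx] = old                    # restore the original value
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

for name, param, manual in [("W1", W1, d_W1_mean), ("b1", b1, d_b1_mean),
                            ("W2", W2, d_W2_mean), ("b2", b2, d_b2_mean)]:
    print(name, np.allclose(numerical_grad(param), manual, atol=1e-6))
```

Dropping the matmul with the transposed W2 in the layer 1 step, as in the earlier refactored draft, should make the W1 and b1 comparisons fail, which is a quick way to see why that step matters.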
from deeplearningzerotoall.
Yes, the comment really helped. It looks great.
If I can, anyone should be able to run this in his/her brain. So, it's awesome.
from deeplearningzerotoall.
@kkweon Do you like the naming? p for product and a for addition.
from deeplearningzerotoall.
@hunkim It should be fine with the comment. Honestly, I first thought it was a name for the activation layer, but I was able to figure it out by reading the comment.
from deeplearningzerotoall.
@kkweon I still don't like the names. Let me know if you have any suggestions.
from deeplearningzerotoall.