hanbt / learn_dl

Deep learning algorithms source code for beginners

License: Apache License 2.0

Python 100.00%

learn_dl's Issues

Error in perceptron.py

TypeError: <lambda>() missing 1 required positional argument: 'w'
Python version: 3.x

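The calc_delta_k method in rnn.py (recurrent neural network) is implemented incorrectly

First of all, many thanks to hanbingtao, the author of "Zero-Basics Introduction to Deep Learning" (零基础入门深度学习), for the hard work that went into such a good tutorial and its code.

While studying, I found an obvious error in rnn.py, the code for Zero-Basics Introduction to Deep Learning (5) - Recurrent Neural Networks. I confirmed it with the gradient check program. The problem and a fix are described below for the author's reference.

Original method: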

def calc_delta_k(self, k, activator):
    '''
    Compute delta at time k from delta at time k+1
    '''
    state = self.state_list[k+1].copy()
    element_wise_op(self.state_list[k+1],
                    activator.backward)
    self.delta_list[k] = np.dot(
        np.dot(self.delta_list[k+1].T, self.W),
        np.diag(state[:,0])).T

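There are two obvious errors here:

state should be self.state_list[k].copy(), not element k+1.
After state is copied out, element_wise_op is never applied to it. state itself should be passed to element_wise_op so that activator.backward is applied to it element by element.

Analysis:
After taking state = self.state_list[k].copy() and applying element_wise_op, state holds the array of activation-function derivatives at time k. The error term at time k is this array multiplied by the product of the error term at time k+1 and W.

Fix: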
def calc_delta_k(self, k, activator):
    '''
    Compute delta at time k from delta at time k+1
    '''
    state = self.state_list[k].copy()
    element_wise_op(state,
                    activator.backward)
    self.delta_list[k] = np.dot(
        np.dot(self.delta_list[k+1].T, self.W),
        np.diag(state[:,0])).T

Verification:
The test data is changed as follows (4-dimensional inputs, three input samples):
def data_set():
    x = [np.array([[1], [2], [3], [8]]),
         np.array([[2], [3], [4], [-9]]),
         np.array([[-1], [-2], [4], [3]])]
    d = np.array([[1], [2]])
    return x, d

The check program is changed as follows (4-dimensional inputs, 3 hidden neurons per layer, three input samples):
def gradient_check():
    '''
    Gradient check
    '''
    # Define an error function: the sum of all node outputs
    error_function = lambda o: o.sum()

    rl = RecurrentLayer(4, 3, IdentityActivator(), 1e-3)

    # Compute the forward values
    x, d = data_set()
    rl.forward(x[0])
    rl.forward(x[1])
    rl.forward(x[2])

    # Build the sensitivity map
    sensitivity_array = np.ones(rl.state_list[-1].shape,
                                dtype=np.float64)
    # Compute the gradient
    rl.backward(sensitivity_array, IdentityActivator())

    # Check the gradient
    epsilon = 10e-4
    for i in range(rl.W.shape[0]):
        for j in range(rl.W.shape[1]):
            rl.W[i,j] += epsilon
            rl.reset_state()
            rl.forward(x[0])
            rl.forward(x[1])
            rl.forward(x[2])
            err1 = error_function(rl.state_list[-1])
            rl.W[i,j] -= 2*epsilon
            rl.reset_state()
            rl.forward(x[0])
            rl.forward(x[1])
            rl.forward(x[2])
            err2 = error_function(rl.state_list[-1])
            expect_grad = (err1 - err2) / (2 * epsilon)
            rl.W[i,j] += epsilon
            print('weights(%d,%d): expected - actural %f - %f' % (
                i, j, expect_grad, rl.gradient[i,j]))

With the original calc_delta_k, the output is:
D:\python_2.7\python.exe D:/python_code/learn_dl-master/rnn.py
weights(0,0): expected - actural 0.000095 - 1.000000
weights(0,1): expected - actural 0.000372 - 1.000000
weights(0,2): expected - actural 0.000512 - 1.000000
weights(1,0): expected - actural 0.000095 - 1.000000
weights(1,1): expected - actural 0.000372 - 1.000000
weights(1,2): expected - actural 0.000512 - 1.000000
weights(2,0): expected - actural 0.000095 - 1.000000
weights(2,1): expected - actural 0.000372 - 1.000000
weights(2,2): expected - actural 0.000512 - 1.000000

Process finished with exit code 0

With the modified calc_delta_k, the output is:
D:\python_2.7\python.exe D:/python_code/learn_dl-master/rnn.py
weights(0,0): expected - actural -0.001360 - -0.001360
weights(0,1): expected - actural 0.000520 - 0.000520
weights(0,2): expected - actural 0.000452 - 0.000452
weights(1,0): expected - actural -0.001360 - -0.001360
weights(1,1): expected - actural 0.000520 - 0.000520
weights(1,2): expected - actural 0.000452 - 0.000452
weights(2,0): expected - actural -0.001360 - -0.001360
weights(2,1): expected - actural 0.000520 - 0.000520
weights(2,2): expected - actural 0.000452 - 0.000452

Process finished with exit code 0

This shows that the original calc_delta_k function is incorrect and that the modified version is correct.
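
The recursion discussed in this issue can also be checked independently of rnn.py. The following is a minimal sketch, not the repository's code: it assumes a plain recurrent layer s_t = tanh(U x_t + W s_{t-1}) with s_0 = 0, and takes the error E to be the sum of the final state, as the gradient check in this issue does. The names forward, analytic_grad_W and numeric_grad_W are hypothetical. The activation derivative is taken from the state at the time step whose delta is being computed, as the proposed fix does, and the resulting gradient is compared with a finite-difference estimate.

```python
import numpy as np

def forward(U, W, xs):
    """Run the recurrence and return [s_0, s_1, ..., s_T] with s_0 = 0."""
    states = [np.zeros((W.shape[0], 1))]
    for x in xs:
        states.append(np.tanh(U @ x + W @ states[-1]))
    return states

def analytic_grad_W(U, W, xs):
    """Gradient of E = sum(s_T) with respect to W, via backpropagation through time."""
    states = forward(U, W, xs)
    T = len(xs)
    # delta_T = dE/dnet_T; tanh'(net) = 1 - tanh(net)^2, written through the state
    delta = 1.0 - states[T] ** 2
    grad = np.zeros_like(W)
    for k in range(T, 0, -1):
        grad += delta @ states[k - 1].T
        # delta_{k-1} uses the derivative at time k-1, not at time k
        delta = (W.T @ delta) * (1.0 - states[k - 1] ** 2)
    return grad

def numeric_grad_W(U, W, xs, eps=1e-6):
    """Central-difference estimate of the same gradient."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + eps
        up = forward(U, W, xs)[-1].sum()
        W[idx] = old - eps
        down = forward(U, W, xs)[-1].sum()
        W[idx] = old
        grad[idx] = (up - down) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
U = rng.normal(scale=0.5, size=(3, 4))
W = rng.normal(scale=0.5, size=(3, 3))
xs = [rng.normal(size=(4, 1)) for _ in range(3)]
max_diff = np.abs(analytic_grad_W(U, W, xs) - numeric_grad_W(U, W, xs)).max()
print(max_diff)  # should be tiny if the recursion is right
```

A non-linear activation is used here on purpose: with an identity activator the derivative is 1 everywhere, so the check can only expose indexing mistakes through side effects such as the one reported above.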

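Vectorized programming error: ValueError: operands could not be broadcast together with shapes (10,) (10,300)

Hello, and thank you for the articles and the code.
While practicing with the vectorized code you provide for chapter 3 (neural networks), I ran into the following error: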

`ValueError Traceback (most recent call last)
in ()
142 last_error_ratio = error_ratio
143 if __name__ == '__main__':
--> 144 train_and_evaluate()

in train_and_evaluate()
132 while True:
133 epoch += 1
--> 134 network.train(train_labels, train_data_set, 0.3, 1)
135 print '%s epoch %d finished' % (now(), epoch)
136 if epoch % 10 == 0:

/home/superman/learn_dl/fc.py in train(self, labels, data_set, rate, epoch)
98 for d in range(len(data_set)):
99 self.train_one_sample(labels[d],
--> 100 data_set[d], rate)
101
102 def train_one_sample(self, label, sample, rate):

/home/superman/learn_dl/fc.py in train_one_sample(self, label, sample, rate)
102 def train_one_sample(self, label, sample, rate):
103 self.predict(sample)
--> 104 self.calc_gradient(label)
105 self.update_weight(rate)
106

/home/superman/learn_dl/fc.py in calc_gradient(self, label)
108 delta = self.layers[-1].activator.backward(
109 self.layers[-1].output
--> 110 ) * (label - self.layers[-1].output)
111 for layer in self.layers[::-1]:
112 layer.backward(delta)

ValueError: operands could not be broadcast together with shapes (10,) (10,300) `

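In the "Neural network in practice: code implementation" section, in order to practice vectorized programming, I changed from bp import * to from fc import *, and the error above appeared. Two places seem to differ noticeably: class network( ) and def train_data_set():. What should I change to make this work? Thanks.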
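
One plausible way to arrive at exactly these shapes, offered as a sketch under assumptions rather than a diagnosis: if the biases are column vectors of shape (n, 1) while each sample and label is a flat 1-D vector, NumPy broadcasting silently widens the hidden activations to (300, 300) and the output to (10, 300), and the subtraction label - output then fails. The layer sizes 784, 300 and 10 and all names below are assumptions for illustration, and the activation function is left out because it does not change shapes.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(300, 784)), np.zeros((300, 1))  # assumed layer sizes
W2, b2 = rng.normal(size=(10, 300)), np.zeros((10, 1))

def forward(x):
    hidden = np.dot(W1, x) + b1      # (300,) + (300, 1) broadcasts to (300, 300)
    return np.dot(W2, hidden) + b2

flat_sample = np.zeros(784)          # shape (784,)
flat_label = np.zeros(10)            # shape (10,)
bad_output = forward(flat_sample)    # shape (10, 300)
try:
    flat_label - bad_output          # (10,) vs (10, 300) cannot broadcast
    failed = False
except ValueError:
    failed = True

# Reshaping both into column vectors keeps every intermediate shape (n, 1)
col_sample = flat_sample.reshape(-1, 1)
col_label = flat_label.reshape(-1, 1)
good_output = forward(col_sample)    # shape (10, 1)
error = col_label - good_output      # shape (10, 1)
print(bad_output.shape, failed, good_output.shape)
```

If this is the cause, the change belongs in the data loading step (for example in the function that builds the training set), so that every sample and label reaches the network as a column vector.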

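The BPTT in the RNN seems to have a problem

Looking for help. According to the article, element_wise_op in calc_delta_k should take the derivative on state, not on state_list. Which one is right, the article or the code?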

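Could you add main-function code that calls each of the CNN modules?

Hello, sorry to bother you.
mnist.py has a main function that shows how to use each module of the fully connected network. Stepping through it with pudb while it runs makes the theory in the article easier to understand.
It would be great if the CNN and the recurrent neural network also had a similar main function that calls each module and ties the training and classification functions together.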

A question about computing the zero padding

The following code:
zp = (self.input_width + self.filter_width - 1 - expanded_width) / 2
is derived by inverting the formula W2 = (W1 - F + 2P) / S + 1. Is there a problem with it?
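
The inversion can be checked with a few lines of arithmetic. This sketch assumes the setting in which the quoted line is used: a stride-1 convolution over a map of width expanded_width whose result should have width input_width. Substituting S = 1, W1 = expanded_width and W2 = input_width into W2 = (W1 - F + 2P) / S + 1 and solving for P gives P = (input_width + F - 1 - expanded_width) / 2, which matches the quoted expression. The function names are hypothetical.

```python
def conv_output_width(input_width, filter_width, zero_padding, stride):
    """W2 = (W1 - F + 2P) / S + 1"""
    return (input_width - filter_width + 2 * zero_padding) // stride + 1

def backward_zero_padding(input_width, filter_width, expanded_width):
    """The same formula solved for P with S = 1, W1 = expanded_width, W2 = input_width."""
    return (input_width + filter_width - 1 - expanded_width) // 2

input_width, filter_width = 7, 3
# Forward pass without padding at stride 1 shrinks the width from 7 to 5
expanded_width = conv_output_width(input_width, filter_width, 0, 1)
zp = backward_zero_padding(input_width, filter_width, expanded_width)
# Padding by zp and convolving at stride 1 restores the original width
restored = conv_output_width(expanded_width, filter_width, zp, 1)
print(expanded_width, zp, restored)  # 5 2 7
```

Note that under Python 3 the `/` in the quoted line produces a float; integer division (`//`) is used here so the result can serve as a padding size.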

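The numpy version of bp.py has quite a few errors

The tensor sizes and the gradient initialization of the output layer are not handled, which causes problems. Also, the gradient for layer i is computed using the W of layer i.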

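Questions about Zero-Basics Introduction to Deep Learning (6) - Long Short-Term Memory networks (LSTM)

Thanks to the author for such a good tutorial. I have a few questions:
1. In lstm.py, the test function calls backward and passes the target output d as the second argument, which ends up being assigned to delta_h. Why is that? Shouldn't the name delta_h mean the target output d minus the h produced by forward?
2. I don't quite understand the purpose of gradient_check(). The tutorial gives no explanation of it, so the meaning of this code is unclear to me. My own understanding: is the gradient check meant to inspect the gradient computation and see whether exploding or vanishing gradients occur?
3. The LSTM code has an update() function for updating the parameters, but the test case never calls it. An important use of LSTMs is prediction. It would be great if the author could provide a more complete test case that prints the predicted values after the parameters have been updated through BPTT.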

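Zero-Basics Introduction to Deep Learning (6) - Long Short-Term Memory networks (LSTM): formula (13) for propagating the error term backward through time is wrong

First of all, many thanks to hanbingtao, the author of "Zero-Basics Introduction to Deep Learning" (零基础入门深度学习), for the hard work that went into such a good tutorial and its code.

Formula (13), which propagates the error term backward through time, is clearly written incorrectly.
The formula for propagating the error term back to an arbitrary time k:

$$\delta_k^T = \prod_{j=k}^{t-1} \delta_{o,j}^T W_{oh} + \delta_{f,j}^T W_{fh} + \delta_{i,j}^T W_{ih} + \delta_{\tilde{c},j}^T W_{ch} \qquad (13)$$

The original formula computes delta(j) for j = k through j = t-1 and then multiplies the delta(j) together to obtain delta(k). This is clearly wrong. It should be an iterative procedure: given delta(t), use formula (58) from the article to obtain delta(t-1), then apply formula (58) repeatedly until delta(k) is reached. It is not a running product.

The sample code lstm.py given with the article shows the same thing: it loops, stepping back through time one step at a time, rather than computing a product.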

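CNN gradient check

Wasn't the ReLU function used at first? Why is the linear function f(x)=x used now?
If I want to use ReLU in the forward pass and then run the gradient check, how should sensitivity_array be set? This is the part I don't understand.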

In the gradient check in rnn.py, the computed gradient is the same no matter what value epsilon takes

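The chain rule is applied incorrectly in the convolutional neural network article

https://www.zybuluo.com/hanbingtao/note/485480
By the chain rule it should be: the partial derivative of E_d with respect to a^{l}, multiplied by the partial derivative of a^{l} with respect to net^{l-1}. You wrote the partial derivative of E_d with respect to a^{l-1}, multiplied by the partial derivative of a^{l-1} with respect to net^{l-1}.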

bp.py under python3

Running python3/bp.py always fails with: 'ConstNode' object has no attribute 'set_output'

The FC backward is wrong

delta should be equal to self.activator.backward(self.output) * np.dot(self.W.T, delta_array), right? You wrote input.

Is there a problem with the backward method in fc.py? Should self.input be changed to self.output?

def backward(self, delta_array):
    '''
    Compute the gradients of W and b in the backward pass
    delta_array: the error term passed in from the previous layer
    '''
    # Eq. 8
    self.delta = self.activator.backward(self.input) * np.dot(      # Why is this input rather than output?
        self.W.T, delta_array)
    self.W_grad = np.dot(delta_array, self.input.T)
    self.b_grad = delta_array
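
One way to test the input-versus-output question is a numerical check. The sketch below is not the repository's fc.py. It assumes a two-layer sigmoid network with error E = 0.5 * sum((a2 - y)^2), defines each delta as the derivative of E with respect to a layer's weighted input, and assumes the activator's backward takes an activation a and returns a * (1 - a). Under those assumptions, the error term handed back to the earlier layer needs the derivative evaluated at the current layer's input, because that input is the earlier layer's output. All names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_backward(a):
    # Derivative of the sigmoid, expressed through its output a
    return a * (1.0 - a)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 1))
y = rng.uniform(size=(2, 1))
W1, b1 = rng.normal(size=(3, 4)), np.zeros((3, 1))
W2, b2 = rng.normal(size=(2, 3)), np.zeros((2, 1))

def loss():
    a1 = sigmoid(W1 @ x + b1)
    a2 = sigmoid(W2 @ a1 + b2)
    return 0.5 * np.sum((a2 - y) ** 2)

# Forward pass
a1 = sigmoid(W1 @ x + b1)
a2 = sigmoid(W2 @ a1 + b2)

# Layer 2: delta_array is dE/dnet2, and the layer's input is a1
delta_array = (a2 - y) * sigmoid_backward(a2)
W2_grad = delta_array @ a1.T
# Error term for layer 1: derivative evaluated at layer 2's input a1
delta_prev = sigmoid_backward(a1) * (W2.T @ delta_array)
W1_grad = delta_prev @ x.T

def numeric_grad(W, eps=1e-6):
    """Central-difference gradient of loss() with respect to W (perturbed in place)."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + eps
        up = loss()
        W[idx] = old - eps
        down = loss()
        W[idx] = old
        grad[idx] = (up - down) / (2 * eps)
    return grad

W1_num, W2_num = numeric_grad(W1), numeric_grad(W2)
print(np.abs(W1_grad - W1_num).max(), np.abs(W2_grad - W2_num).max())
```

Whether this matches fc.py depends on what delta_array and self.delta denote there; running a check of this kind against the actual classes would settle it either way.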

Copying and pasting the articles drops the formulas, which makes for a poor experience

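Formulas 57 and 58 in Zero-Basics Introduction to Deep Learning (6) - Long Short-Term Memory networks (LSTM) have problems

In 58, Woh, Wfh, Wih and Wch should be transposed, and the Woh, Wfh, Wih and Wch factors should come first, with delta after them.

The derivation in this article is very good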

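Zero-Basics Introduction to Deep Learning (4) - Convolutional Neural Networks: a question about backward

Why isn't input_array padded twice during backward?
After expanded_array is obtained, a padded_array is computed as well.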

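A rather basic beginner question

In the gradient check method gradient_check in bp.py, the network error is computed as network_error = lambda vec1, vec2: 0.5 * reduce(lambda a, b: a + b, map(lambda v: (v[0] - v[1]) * (v[0] - v[1]), zip(vec1, vec2))). Why this formula?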
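
Read term by term, the quoted lambda computes half the sum of squared differences between two vectors, E = 0.5 * sum((v1_i - v2_i)^2). The factor 0.5 is a common convention: it cancels the 2 that appears when the square is differentiated, so the derivative of E with respect to v1_i is simply (v1_i - v2_i). The sketch below restates the lambda for Python 3 (where reduce lives in functools) next to an equivalent, more readable form; half_sse is a hypothetical name.

```python
from functools import reduce

# The expression from the issue, with the multiplication signs restored
network_error = lambda vec1, vec2: 0.5 * reduce(
    lambda a, b: a + b,
    map(lambda v: (v[0] - v[1]) * (v[0] - v[1]), zip(vec1, vec2)))

def half_sse(outputs, targets):
    """Half the sum of squared errors, written with a generator expression."""
    return 0.5 * sum((o - t) ** 2 for o, t in zip(outputs, targets))

a = network_error([1.0, 2.0], [0.0, 4.0])
b = half_sse([1.0, 2.0], [0.0, 4.0])
print(a, b)  # 2.5 2.5
```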

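Are the ConvLayer initialization values in the CNN specified incorrectly?

In cnn.py (https://github.com/hanbt/learn_dl/blob/master/cnn.py), ConvLayer is initialized as:

cl = ConvLayer(5,5,3,3,3,2,1,2,IdentityActivator(),0.001)

This sets channel_number to 3 and filter_number to 2, but the article says:

We can think of a Feature Map as the image features extracted by the convolution. Three Filters extract three different sets of features from the original image, which gives three Feature Maps, also called three channels.

So the number of channels should equal the number of filters, but in the initialization they differ. I wonder whether the values were set incorrectly at initialization.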

rewriting perceptron.py and linear_unit.py with deeplearn.js

https://github.com/jaassoon/deeplearnjs/blob/hbt/demos/perceptron/perceptron.ts
and
https://github.com/jaassoon/deeplearnjs/blob/hbt/demos/linear_unit/linear_unit.ts

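Gradient check in the CNN (all-ones array)

Why is the sensitivity map in the CNN gradient check an all-ones array? Is it so that none of the weight gradients are 0 during backpropagation?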

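A very good series of articles. Will there be a follow-up?

The end of part 7 mentions that reinforcement learning content is still to come. Is there more?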

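About the perceptron and the XOR problem

A perceptron cannot solve the XOR problem, yet when I feed the XOR truth table into perceptron.py, it still prints normal-looking results. Why is that?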
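
A program that runs and prints something is not the same as a program that has learned the function. XOR is not linearly separable, so no single perceptron with a step activation can classify all four rows correctly, and comparing predictions against labels makes that visible. The sketch below is a hypothetical stand-alone perceptron, not the repository's class; the learning rate of 1.0 and 20 epochs are arbitrary choices for illustration.

```python
def step(x):
    return 1 if x > 0 else 0

def train_perceptron(samples, labels, rate=1.0, epochs=20):
    """Classic perceptron rule: w += rate * (label - output) * x."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in zip(samples, labels):
            output = step(sum(xi * wi for xi, wi in zip(x, weights)) + bias)
            delta = label - output
            weights = [wi + rate * delta * xi for xi, wi in zip(x, weights)]
            bias += rate * delta
    return weights, bias

def count_correct(weights, bias, samples, labels):
    """Number of rows whose prediction equals the label."""
    return sum(
        step(sum(xi * wi for xi, wi in zip(x, weights)) + bias) == label
        for x, label in zip(samples, labels))

samples = [[1, 1], [0, 0], [1, 0], [0, 1]]
and_labels = [1, 0, 0, 0]
xor_labels = [0, 0, 1, 1]

and_correct = count_correct(*train_perceptron(samples, and_labels), samples, and_labels)
xor_correct = count_correct(*train_perceptron(samples, xor_labels), samples, xor_labels)
print(and_correct, xor_correct)  # 4 of 4 for AND; fewer than 4 for XOR
```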

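I feel gradient_check has some problems

According to the formula, the weights w should be the same throughout.

actual_gradient = conn.get_gradient()
By the time this line fetches the gradient, the weights are already W[new]; W[old] has long since changed.
The gradient obtained after running predict again is then not the one for the previous W[old] weights.

The way this function is wrapped up seems a bit confusing... I'm not sure whether I'm right.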

Error in lstm.py

Line 264 and Line 266,
bi_grad = self.delta_f_list[t]
bo_grad = self.delta_f_list[t]
should be corrected as:
bi_grad = self.delta_i_list[t]
bo_grad = self.delta_o_list[t]

How to fix "matplotlib does not support generators as input"

def plot(linear_unit):
    import matplotlib.pyplot as plt
    input_vecs, labels = get_training_dataset()
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(map(lambda x: x[0], input_vecs), labels)
    weights = linear_unit.weights
    bias = linear_unit.bias
    x = range(0,12,1)
    y = map(lambda x:weights[0] * x + bias, x)
    ax.plot(x, y)
    plt.show()
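
In Python 3, map returns a lazy iterator instead of a list, which is what the message in the title objects to. A minimal sketch of the usual workaround is to build real lists before handing them to the plotting calls. The weights and bias values below are hypothetical stand-ins for a trained linear unit, and the matplotlib calls are left as a comment so the snippet runs without a display.

```python
input_vecs = [[5], [3], [8], [1.4], [10.1]]
labels = [5500, 2300, 7600, 1800, 11400]
weights, bias = [1000.0], 100.0    # hypothetical values, for illustration only

lazy = map(lambda v: v[0], input_vecs)   # an iterator in Python 3, not a list

# List comprehensions produce concrete sequences that plotting functions accept
xs = [v[0] for v in input_vecs]
line_x = list(range(0, 12, 1))
line_y = [weights[0] * x + bias for x in line_x]

# These lists would then be passed on:
#   ax.scatter(xs, labels)
#   ax.plot(line_x, line_y)
print(xs, line_y[:3])
```

Wrapping the original map calls in list(...) has the same effect, if keeping the code close to the Python 2 version matters more than readability.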

TypeError: <lambda>() missing 1 required positional argument: 'w'

Hi, I ran the perceptron.py code, but it reported an error. Environment: Python 3.6
return self.activator(
    reduce(lambda a, b: a + b,
           map(lambda xw: xw[0] * xw[1],
               zip(input_vec, self.weights))
           , 0.0) + self.bias)

Error in rnn.py

In this function, state_list should be indexed at k, that is, the state of the layer whose error is being computed, and element_wise_op should be applied to the state variable that was fetched.
def calc_delta_k(self, k, activator):
    '''
    Compute delta at time k from delta at time k+1
    '''
    state = self.state_list[k+1].copy()
    element_wise_op(self.state_list[k+1],
                    activator.backward)
    self.delta_list[k] = np.dot(
        np.dot(self.delta_list[k+1].T, self.W),
        np.diag(state[:,0])).T
Fix:
def calc_delta_k(self, k, activator):
    '''
    Compute delta at time k from delta at time k+1
    '''
    state = self.state_list[k].copy()
    element_wise_op(state, activator.backward)
    self.delta_list[k] = np.dot(
        np.dot(self.delta_list[k+1].T, self.W),
        np.diag(state[:,0])).T

The perceptron does not run under Python 3

Running under python3 fails with:
map(lambda (x, w): x * w,
^
SyntaxError: invalid syntax
I don't know why.
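
Python 3 removed tuple parameters from function signatures (PEP 3113), so lambda (x, w): ... is a syntax error there; reduce also moved into functools. A minimal sketch of a Python 3 compatible version follows. It is written as a stand-alone function with hypothetical parameter names rather than as the repository's method, and the sample weights are made up for illustration.

```python
from functools import reduce

def predict(input_vec, weights, bias, activator):
    # Each zipped pair arrives as one tuple argument and is indexed explicitly
    weighted = map(lambda xw: xw[0] * xw[1], zip(input_vec, weights))
    return activator(reduce(lambda a, b: a + b, weighted, 0.0) + bias)

def step(x):
    return 1 if x > 0 else 0

result_on = predict([1, 1], [0.1, 0.2], -0.2, step)
result_off = predict([0, 0], [0.1, 0.2], -0.2, step)
print(result_on, result_off)  # 1 0
```

The same weighted sum can also be written as sum(x * w for x, w in zip(input_vec, weights)), which avoids both map and reduce.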

mnist.py: the Loader class fails to load

Hello, when I run transpose(get_training_data_set()) in mnist.py, the Loader class reports an error.

     24         Convert an unsigned byte character to an integer
     25         '''
---> 26         return struct.unpack('B', byte)[0]
     27 
     28 

TypeError: a bytes-like object is required, not 'int'

My data was downloaded through tensorflow.

from tensorflow.examples.tutorials.mnist import input_data
mnist=input_data.read_data_sets('', one_hot=True)

Any guidance would be appreciated, thanks.

Is there a slip in how the backward function in fc.py computes the weight and bias gradients?

self.delta = self.activator.backward(self.input) * np.dot(self.W.T, delta_array)
self.W_grad = np.dot(delta_array, self.input.T) 
self.b_grad = delta_array

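The weights in np.dot(self.W.T, delta_array) should be those of the next layer, not those of the current layer;
delta_array is the error term passed in from the next layer, and the weight and bias gradients should be computed from the current layer's error term.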

self.W_grad = np.dot(self.delta, self.input.T) 
self.b_grad = self.delta

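Is backward() in fc.py wrong?

On line 47, this layer's delta should be the backward() result of this layer's output, multiplied by w times the previous layer's delta.
Should it be changed to self.delta = self.activator.backward(self.output) * np.dot( self.W.T, delta_array)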

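A question about the perceptron's gradient

I'm curious: when the gradient update is done, why is the gradient label - output?
In "Statistical Learning Methods" (统计学习方法), the update rule is w = w + αyixi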

def _update_weights(self, input_vec, output, label, rate):
        delta = label - output
        print("delta\t: %f" % delta)
        self.weights = [w + rate * delta * x for (x, w) in zip(input_vec, self.weights)]
        self.bias += rate * delta
        print(self)

Hello, this implements a single-layer LSTM network, right? Is there code for a multi-layer LSTM network?

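In the LSTM, why are the weight gradients of Wfx, Wix, Wox and Wcx not accumulated over the past time steps t?

While studying the LSTM I noticed a problem: the weight gradients of Wfx, Wix, Wox and Wcx are computed without accumulating over the past time steps t. In the recurrent neural network (RNN), the gradient of the loss function E with respect to the weight matrix U is the sum of the gradients of E with respect to U at each time step. So it seems strange that in the LSTM, the gradients of the loss function with respect to Wfx, Wix, Wox and Wcx use only the gradient at the last time step, with no accumulation over earlier time steps.
I suspect the computation of the weight gradients of Wfx, Wix, Wox and Wcx in the LSTM is wrong, and that these gradients should also be accumulated over the time steps t.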

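Zero-Basics Introduction to Deep Learning (2) - Why does it get less accurate the more it trains?

After 100 training iterations, the salary for 1.5 years of work is still lower than the one for 1.4 years?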

    input_vecs = [[5], [3], [8], [1.4], [10.1]]
    # Expected outputs: monthly salaries, in one-to-one correspondence with the inputs
    labels = [5500, 2300, 7600, 1800, 11400]

[1055.2229722313405] -346.68177883
[1165.2048779199872] -335.792481237
[1165.693282501847] -335.694800321
[1139.8517310863037] -344.308650793
[1045.8913152544783] -356.053702772
[1055.5765973142927] -349.135644157
[1165.4456104538835] -338.257524045
[1165.9970840426429] -338.147229327
[1140.2017633586106] -346.745669555
[1046.212288373489] -358.494353928
[1055.9254484763605] -351.556382426
[1165.6830931106465] -340.689288898
weights :[1165.6830931106465]
bias    :-340.689289

Work 3.4 years, monthly salary = 3622.63
Work 15 years, monthly salary = 17144.56
Work 1.5 years, monthly salary = 1407.84
Work 6.3 years, monthly salary = 7003.11

to_int in bp

Hello teacher,
self.to_int(content[start + i * 28 + j]))
Here content is bytes, and content[] already returns an int, so to_int isn't needed, right?
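
The observation holds for Python 3: indexing a bytes object yields an int, whereas slicing yields bytes, and struct.unpack requires a bytes-like argument. A minimal sketch of a to_int helper that tolerates both cases follows; it is a hypothetical stand-alone function, not the repository's method.

```python
import struct

def to_int(byte):
    """Accept either an int (from indexing bytes in Python 3) or a length-1 bytes object."""
    if isinstance(byte, int):
        return byte
    return struct.unpack('B', byte)[0]

content = bytes([0, 128, 255])
from_index = to_int(content[1])      # content[1] is already the int 128
from_slice = to_int(content[2:3])    # content[2:3] is b'\xff', unpacked to 255
print(from_index, from_slice)        # 128 255
```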

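With more than one input parameter, the linear unit can no longer compute the result

For example, with training data generated from y = 5x1 + 2x2 + 10 (the x is the letter X, not a multiplication sign), the results come out completely wrong.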
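
The report includes neither the data nor the learning rate, so the cause cannot be determined from it. One common cause, stated here as an assumption, is a learning rate that is too large for the scale of the inputs, which makes the updates diverge. The sketch below shows that a linear unit with two inputs can recover y = 5x1 + 2x2 + 10 when the inputs lie in [0, 1] and the rate is moderate. It uses NumPy and full-batch gradient descent instead of per-sample updates, and every name and constant in it is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 2))        # inputs kept in a small range
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + 10.0       # noise-free targets

weights = np.zeros(2)
bias = 0.0
rate = 0.5                                     # assumed learning rate
for _ in range(5000):
    error = y - (X @ weights + bias)           # label - output for every sample
    weights += rate * (X.T @ error) / len(X)   # average of delta * x over the batch
    bias += rate * error.mean()

print(weights, bias)   # should approach [5, 2] and 10
```

With inputs in the tens or hundreds, the same rate would typically overshoot; scaling the inputs or lowering the rate are the usual remedies.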

error

When I used fc.py to train on MNIST, an error occurred:
File "D:\pycharm\project\fc.py", line 107, in calc_gradient
) * (label - self.layers[-1].output)
ValueError: operands could not be broadcast together with shapes (10,) (10,300)

perceptron.py fails to run

Thanks for sharing. Is there a Python 3 version?

When testing with fc.py, why does the loss go up instead of down, and why is the accuracy so low?

Running test() in fc.py produced this log:
after epoch 1 loss: 0.016419
after epoch 2 loss: 0.053763
after epoch 3 loss: 0.108310
after epoch 4 loss: 0.171011
after epoch 5 loss: 0.225434
after epoch 6 loss: 0.261430
after epoch 7 loss: 0.281527
after epoch 8 loss: 0.291327
after epoch 9 loss: 0.295843
after epoch 10 loss: 0.297988
correct_ratio: 5.47%

def predict(self, input_vec):
    '''
    Take an input vector and return the perceptron's output
    '''
    # Pair up input_vec [x1,x2,x3,...] and weights [w1,w2,w3,...]
    # into [(x1,w1),(x2,w2),(x3,w3),...]
    # then use map to compute [x1*w1, x2*w2, x3*w3]
    # and finally use reduce to sum them
    return self.activator(
        reduce(lambda a, b: a + b,
               map(lambda (x, w): x * w, zip(input_vec, self.weights)),
               0.0) + self.bias)