
capsule's People

Contributors

bojone


capsule's Issues

Why ReLU after the primary caps?

Why is the activation of the last Conv2D layer ReLU rather than the squash function?
Shouldn't it be the squash function?
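For comparison, the squash nonlinearity from the paper can be sketched in NumPy (a standalone illustration with my own naming, not the repo's code):

```python
import numpy as np

def squash(s, axis=-1, eps=1e-7):
    # Scale each vector so its norm lies in [0, 1) while keeping its direction;
    # unlike ReLU, this acts on whole capsule vectors, not on single units.
    squared_norm = np.sum(np.square(s), axis=axis, keepdims=True)
    scale = squared_norm / (1.0 + squared_norm) / np.sqrt(squared_norm + eps)
    return scale * s

v = squash(np.array([[3.0, 4.0]]))  # input norm 5 -> output norm 25/26
```

One possible reading is that the Conv2D layers are ordinary feature extractors and squash is only applied inside the Capsule layer, but that is for the author to confirm.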

Why reshape before the capsule?

Hi,
I couldn't understand why the following line is needed in the test code:

cnn = Reshape((-1, 128))(cnn)

and what does the 128 correspond to here?

Thanks in advance.
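If it helps, here is what the reshape does to the shapes, assuming a 28x28 MNIST input (so the spatial grid after the convolution/pooling layers in capsule_test works out to 8x8):

```python
import numpy as np

# Hypothetical output of the last Conv2D(128, ...) for one 28x28 image:
# two 3x3 convs -> 24x24, 2x2 average pooling -> 12x12, two more 3x3 convs -> 8x8.
cnn_out = np.zeros((1, 8, 8, 128))
# Reshape((-1, 128)) flattens the spatial grid into 8*8 = 64 vectors of length
# 128; the Capsule layer then sees 64 input capsules of dimension 128 (the 128
# is simply the channel count of the last Conv2D).
capsules_in = cnn_out.reshape(1, -1, 128)
```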

Squash function

From the paper "Dynamic Routing Between Capsules", I think the squash scale should be
scale = K.sqrt(s_squared_norm) / (K.sqrt(s_squared_norm) + s_squared_norm)
rather than the one written in your code:
scale = K.sqrt(s_squared_norm) / (0.5 + s_squared_norm)
When I changed it to the paper's version, accuracy improved to 0.936. Could you try it?
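Both scales keep the output norm strictly below 1; a quick numerical check of the two variants (plain NumPy, mirroring the two formulas above):

```python
import numpy as np

s_squared_norm = np.array([0.25, 1.0, 4.0])   # sample squared norms ||s||^2
norm = np.sqrt(s_squared_norm)
scale_repo = norm / (0.5 + s_squared_norm)    # variant in the code
scale_paper = norm / (norm + s_squared_norm)  # variant proposed in this issue
# The squashed vector's norm is scale * ||s||; both variants stay below 1.
out_repo = scale_repo * norm
out_paper = scale_paper * norm
```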

Keras 2.1.2 / Theano 0.9 GPU error

Traceback (most recent call last):
  File "capsule_test.py", line 46, in
    capsule = Capsule(10, 16, 3, True)(cnn)
  File "/home/.../anaconda3/lib/python3.6/site-packages/keras/engine/topology.py", line 603, in __call__
    output = self.call(inputs, **kwargs)
  File "/data/.../study/Capsule/Capsule_Keras.py", line 60, in call
    c = K.softmax(b)
  File "/home/.../anaconda3/lib/python3.6/site-packages/keras/backend/theano_backend.py", line 1552, in softmax
    return T.nnet.softmax(x)
  File "/home/.../anaconda3/lib/python3.6/site-packages/theano/tensor/nnet/nnet.py", line 809, in softmax
    return softmax_op(c)
  File "/home/.../anaconda3/lib/python3.6/site-packages/theano/gof/op.py", line 615, in call
    node = self.make_node(*inputs, **kwargs)
  File "/home/.../anaconda3/lib/python3.6/site-packages/theano/tensor/nnet/nnet.py", line 431, in make_node
    x.type)
ValueError: x must be 1-d or 2-d tensor of floats. Got TensorType(float32, 3D)

With the TensorFlow backend there is no error.
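A possible workaround (my own sketch, not the repo's fix) is to build the softmax from exp/sum over the desired axis instead of calling the backend softmax, since Theano's T.nnet.softmax only accepts 1-d or 2-d input:

```python
import numpy as np

def softmax_any_axis(b, axis=1):
    # Numerically stable softmax over an arbitrary axis of an n-d array;
    # constructed from exp/sum, so it avoids T.nnet.softmax's 2-d restriction.
    b = b - np.max(b, axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(b)
    return e / np.sum(e, axis=axis, keepdims=True)

c = softmax_any_axis(np.random.randn(2, 10, 64), axis=1)
```

The same exp/sum construction can be written with K.exp and K.sum so it runs identically on both backends.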

A problem when running with Keras 2.2.4

b = K.batch_dot(o, u_hat_vecs, [2, 3])
ValueError: Can not do batch_dot on inputs with shapes (None, 10, 10, 16) and (None, 10, None, 16) with axes=[2, 3]. x.shape[2] != y.shape[3] (10 != 16)

Also, in b = K.batch_dot(o, u_hat_vecs, [2, 3]),
shouldn't b be accumulated with the b from the previous routing iteration?
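For reference, with the shapes the layer intends, batch_dot(o, u_hat_vecs, [2, 3]) contracts o's last axis against u_hat_vecs's last axis; a NumPy einsum equivalent (concrete sizes are my assumption based on the error message):

```python
import numpy as np

o = np.random.randn(2, 10, 16)               # (batch, num_capsule, dim_capsule)
u_hat_vecs = np.random.randn(2, 10, 64, 16)  # (batch, num_capsule, input_num, dim_capsule)
# K.batch_dot(o, u_hat_vecs, [2, 3]) with these shapes computes:
b = np.einsum('bnd,bnid->bni', o, u_hat_vecs)  # (batch, num_capsule, input_num)
```

The ValueError suggests o arrived with an extra dimension, (None, 10, 10, 16) instead of (None, 10, 16), so the contraction axes no longer line up in Keras 2.2.4.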

About conv1d, local_conv1d, the softmax layer, and the pooling layer in the capsule model

Hello. I'd like to ask about your use of conv1d and local_conv1d: does this operation still count as a matrix multiplication? I'm not clear on how conv1d works, and online explanations are vague. Could you explain in detail whether this operation can substitute for W_ij*u_i?
Also, I don't understand your custom softmax: why subtract the maximum value? Moreover, in the original paper the sum in the softmax is over j, i.e. over the number of capsules (caps_num), but your code seems to sum over i.
Finally, Hinton apparently wanted to avoid pooling layers, yet your code seems to add cnn = AveragePooling2D((2,2))(cnn); is that appropriate?
Thanks for sharing the code; its speed-up is exciting!
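On the first question: a Conv1D (or shared-weight local conv) with kernel size 1 applies one weight matrix to every position independently, which is exactly a per-capsule matrix multiply; a NumPy sketch (the sizes here are my own, for illustration):

```python
import numpy as np

batch, input_num, in_dim = 2, 64, 128   # input capsules u_i of dimension 128
out_dim = 10 * 16                       # num_capsule * dim_capsule
u = np.random.randn(batch, input_num, in_dim)
W = np.random.randn(in_dim, out_dim)    # the kernel of a size-1 Conv1D
# Applying W at every position is the matrix product u_hat_i = u_i W,
# i.e. the W*u transform of the capsule paper with a shared W across inputs.
u_hat = u @ W
```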

Meeting an error

The program raises an error while running.
For cnn = Conv2D(64, (3, 3), activation='relu')(input_image) at line 41,
the error is:
ValueError: The channel dimension of the inputs should be defined. Found None.

Keras version == 2.1.5

Thanks for your contribution!~

Loss becomes NaN

Hello.
I've read your three articles on capsules and learned a great deal.
Starting from your code, I switched to a different dataset and trained for more iterations than in your example, and the loss becomes NaN. Following your suggestion I replaced K.epsilon() in squash with 0.001, but that did not solve the problem.

What part of the computation could produce this result?

About Primary Capsule

Hi Su, since the layer before the capsule in your capsule_test uses
Reshape((-1,128)), which should correspond to the primary capsules,
does the 128 correspond to the capsule dim rather than the number of capsules?
Looking at Hinton's original paper and the implementation below,
https://github.com/XifengGuo/CapsNet-Keras
both seem to use 8 as the primary capsules' feature dimension.
The computation of course works even with dim and num swapped;
I just want to ask whether, relative to the original paper, you intended to choose 128 as the capsule num.

P.S. The blog explains things really well; I gained a lot from it, thanks!

A problem when running with Python 3

When executing greater = np.sort(Y_pred, axis=1)[:,-2] > 0.5 # check whether the prediction exceeds 0.5

I get this error:
numpy.core._internal.AxisError: axis 1 is out of bounds for array of dimension 1

I don't know how to solve it. Thanks.
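The error means Y_pred reached np.sort as a 1-D array, so axis 1 does not exist; the line assumes a 2-D (samples x classes) array. A minimal check of the intended behavior:

```python
import numpy as np

Y_pred = np.array([[0.1, 0.7, 0.6]])   # shape (1, 3): 2-D as the line expects
# Second-largest score per row; > 0.5 flags samples with two confident classes.
greater = np.sort(Y_pred, axis=1)[:, -2] > 0.5
# If Y_pred ever comes back 1-D (e.g. a single sample), reshape it first:
y1d = np.array([0.1, 0.7, 0.6])
greater_1d = np.sort(y1d.reshape(1, -1), axis=1)[:, -2] > 0.5
```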

About the capsule output

Hello, I have a question I'd like to ask:

# build the CNN + Capsule classification model
input_image = Input(shape=(None,None,1))
cnn = Conv2D(64, (3, 3), activation='relu')(input_image)
cnn = Conv2D(64, (3, 3), activation='relu')(cnn)
cnn = AveragePooling2D((2,2))(cnn)
cnn = Conv2D(128, (3, 3), activation='relu')(cnn)
cnn = Conv2D(128, (3, 3), activation='relu')(cnn)
cnn = Reshape((-1, 128))(cnn)
capsule = Capsule(10, 16, 3, True)(cnn)
output = Lambda(lambda x: K.sqrt(K.sum(K.square(x), 2)), output_shape=(10,))(capsule)

model = Model(inputs=input_image, outputs=output)
model.compile(loss=lambda y_true,y_pred: y_true*K.relu(0.9-y_pred)**2 + 0.25*(1-y_true)*K.relu(y_pred-0.1)**2,
              optimizer='adam',
              metrics=['accuracy'])

In this model, the input capsule to output = Lambda(lambda x: K.sqrt(K.sum(K.square(x), 2)), output_shape=(10,))(capsule) has shape [batch_size, num_capsule, dim_capsule], and the Lambda takes the L2 norm of each capsule vector.

My understanding is that this operation converts each capsule into a likelihood (probability); could it be replaced by a sum or a mean here? Is there some other reason for taking the norm?

Thanks.
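A small NumPy comparison of the norm and mean readouts, just to illustrate the difference:

```python
import numpy as np

capsule = np.random.randn(4, 10, 16)   # (batch, num_capsule, dim_capsule)
# The Lambda layer reads each capsule's L2 norm: always >= 0, so it can be
# interpreted as a confidence / likelihood for that class.
norms = np.sqrt(np.sum(np.square(capsule), axis=2))
# A sum or mean also reduces each vector to one number, but it can be negative
# and measures the components' average sign, not the vector's length.
means = np.mean(capsule, axis=2)
```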

Can Capsule be used for regression problems?

Many thanks to the author for sharing.

I'd like to try Capsule on a regression problem: the idea is to convert the final vector into a scalar with a neural layer. Is this approach reasonable?
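A minimal sketch of such a regression head (my own illustration in NumPy: the capsule output is flattened and mapped to a scalar, which in Keras terms would be a Dense(1) on the flattened capsules):

```python
import numpy as np

capsule = np.random.randn(4, 10, 16)          # (batch, num_capsule, dim_capsule)
flat = capsule.reshape(capsule.shape[0], -1)  # (batch, 160)
W = np.random.randn(160, 1)                   # weights of a hypothetical Dense(1) head
b = np.zeros(1)
y = flat @ W + b                              # one scalar prediction per sample
```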
