
keras's Introduction

Distributed Machine Learning Common Codebase


DMLC-Core is the backbone library supporting all DMLC projects; it offers the bricks to build efficient and scalable distributed machine learning libraries.

Developer Channel: Join the chat at https://gitter.im/dmlc/dmlc-core


Known Issues

  • The RecordIO format is not portable across processors of different endianness, so a RecordIO file saved on an x86 machine cannot be loaded on a SPARC machine: x86 is little-endian while SPARC is big-endian. A minimal guard is sketched below.
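
A minimal endianness guard, sketched in Python (hedged; the check itself is generic and not part of dmlc-core's API):

    import sys

    # RecordIO files are endian-dependent; refuse to write them on a host whose
    # byte order differs from the machines that will read them (here: little-endian x86).
    if sys.byteorder != 'little':
        raise RuntimeError('RecordIO written here would not load on little-endian readers')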

Contributing

Contributions to dmlc-core are welcome! dmlc-core follows Google's C++ style guide. If you are interested in contributing, take a look at the feature wishlist and open a new issue if you would like to add something.

  • DMLC-Core uses the C++11 standard. Ensure that your C++ compiler supports C++11.
  • Try to introduce the minimum number of dependencies possible.

Checklist before submitting code

  • Type make lint and fix all reported style problems.
  • Type make doc and fix all reported warnings.

NOTE

Dependencies:

libcurl4-openssl-dev

keras's People

Contributors

carlthome, dbonadiman, edersantana, farizrahman4u, fchollet, gvtulder, gw0, henry0312, howard0su, jeammimi, jfsantos, junwei-pan, lukedeo, matsuyamax, maxpumperla, neggert, nzw0301, olegsinavski, patyork, phreeza, piiswrong, staticskies, tdhd, the-moliver, tleeuwenburg, tristandeleu, vzhong, wxs, yajiedesign, yaringal

keras's Issues

Strange Error in LSTM Layer

The line LSTM(1000, activation='tanh', return_sequences=True)(ht) produces a strange error for me:

/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py in get_initial_states(self, x)
202 def get_initial_states(self, x):
203 # build an all-zero tensor of shape (samples, output_dim)
--> 204 initial_state = K.zeros_like(x) # (samples, timesteps, input_dim)
205 initial_state = K.sum(initial_state, axis=(1, 2)) # (samples,)
206 initial_state = K.expand_dims(initial_state) # (samples, 1)

/usr/local/lib/python3.5/dist-packages/keras/backend/mxnet_backend.py in zeros_like(x, name)
822 if name is None:
823 name = _autogen_name('zerolikeinit')
--> 824 y = mx._symbol_internal._zeros(dtype=dtype(x))
825 return KerasSymbol(mx._symbol_internal._identity_with_attr_like_rhs(y, x.symbol), name=name, is_var=True)
826

AttributeError: module 'mxnet' has no attribute '_symbol_internal'
MXNet version 0.12.0, Keras v1.2.2

Errors when calling keras.applications.inception_v3

I switched my Keras backend from TensorFlow to MXNet in order to use MXNet's multi-GPU training. However, code that runs successfully with the TensorFlow backend does not seem to be compatible with the MXNet backend. It prints the following:

Using MXNet backend.
train and valid generator is ok
steps_per_epoch: 55673
validation_steps: 1236
Traceback (most recent call last):
File "train_inception_v3_transfer_learning.py", line 64, in
base_model = InceptionV3(include_top=False, weights=None, input_shape=(3, INPUT_HEIGHT, INPUT_WIDTH))
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/keras/applications/inception_v3.py", line 151, in InceptionV3
(3, 3), strides=(1, 1), border_mode='same')(x)
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py", line 572, in call
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py", line 635, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py", line 166, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/keras/layers/pooling.py", line 160, in call
dim_ordering=self.dim_ordering)
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/keras/layers/pooling.py", line 251, in _pooling_function
border_mode, dim_ordering, pool_mode='avg')
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/keras/backend/mxnet_backend.py", line 33, in func_wrapper
train_ret = func(*args, **kwargs)
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/keras/backend/mxnet_backend.py", line 2912, in pool2d
stride=strides)
File "", line 39, in Pooling
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/mxnet/_ctypes/symbol.py", line 127, in _symbol_creator
ctypes.byref(sym_handle)))
File "/home/xierenqiang/install/anaconda3/lib/python3.6/site-packages/mxnet/base.py", line 129, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Invalid Input: 'same', valid values are: {'full', 'valid'}, in operator Pooling(name="", stride="(1, 1)", pooling_convention="same", pool_type="avg", kernel="(3, 3)")

Error while running test_core.py

test_core.py:433:
/usr/local/lib/python2.7/dist-packages/keras/utils/test_utils.py:80: in layer_test
    y = layer(x)
/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py:546: in call
    self.build(input_shapes[0])
/usr/local/lib/python2.7/dist-packages/keras/layers/core.py:1240: in build
    constraint=self.W_constraint)
/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py:418: in add_weight
    weight = initializer(shape, name=name)
/usr/local/lib/python2.7/dist-packages/keras/initializations.py:66: in glorot_uniform
    return uniform(shape, s, name=name)
/usr/local/lib/python2.7/dist-packages/keras/initializations.py:33: in uniform
    return K.random_uniform_variable(shape, -scale, scale, name=name)
/usr/local/lib/python2.7/dist-packages/keras/backend/mxnet_backend.py:18: in func_wrapper
    ret = func(*args, **kwargs)
/usr/local/lib/python2.7/dist-packages/keras/backend/mxnet_backend.py:828: in random_uniform_variable
    value = mx.random.uniform(low=low, high=high, dtype='float32', shape=shape)
mxnet/cython/ndarray.pyx:167: in ndarray._make_ndarray_function.generic_ndarray_function (mxnet/cython/ndarray.cpp:3614)
    ???
E   MXNetError: Invalid Parameter format for dtype expect int but value='float32', in operator uniform(name="", low="-1.09544511501", shape="(3, 2)", dtype="float32", high="1.09544511501")

Error using Keras optimizers with MXNet backend

Hi,
I've been trying to run Keras with the MXNet backend, and it seems not to like Keras optimizers. When I do the following, I get this error while trying to train:
from keras.optimizers import Nadam
opt = Nadam(lr=0.0002)
gpu_list = ["gpu(0)"]
model.compile(optimizer=opt, loss='binary_crossentropy', context=gpu_list)
model.fit(X_train, y_train, shuffle=True, batch_size=batch_size, nb_epoch=1)

Traceback (most recent call last):
File "train.py", line 157, in
train()
File "train.py", line 130, in train
hist = model.fit(X_train, y_train, shuffle=True, batch_size=batch_size, nb_epoch=1)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1198, in fit
initial_epoch=initial_epoch)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 893, in _fit_loop
outs = f(ins_batch)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1954, in train_function
data, label, _, data_shapes, label_shapes = self._adjust_module(inputs, 'train')
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1912, in _adjust_module
self._mod.init_optimizer(kvstore=self._kvstore, optimizer=self.optimizer)
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/bucketing_module.py", line 385, in init_optimizer
force_init=force_init)
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/module.py", line 509, in init_optimizer
assert isinstance(optimizer, opt.Optimizer)
AssertionError

The training code runs when I use an MXNet optimizer, but then I can't save models:
from mxnet.optimizer import Nadam
opt = Nadam(learning_rate=0.0002)
gpu_list = ["gpu(0)"]
model.compile(optimizer=opt, loss='binary_crossentropy', context=gpu_list)
model.fit(X_train, y_train, shuffle=True, batch_size=batch_size, nb_epoch=1)
model.save(local_model_file_name, overwrite=True)

Traceback (most recent call last):
File "train.py", line 157, in
train()
File "train.py", line 136, in train
model.save(local_model_file_name, overwrite=True)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2642, in save
save_model(self, filepath, overwrite)
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 74, in save_model
'config': model.optimizer.get_config()
AttributeError: 'Nadam' object has no attribute 'get_config'

Any thoughts? Thanks.

keras_shape and backend_shape mismatch with shape changing operation

Here is the detailed bug report from the user - https://github.com/bgshin/mxnet_cnn/blob/master/src/bug.md

Summary of the issue:

  • keras_shape and backend_shape do not match after applying shape-changing operations. When performing a tensor operation (such as multiplication), Keras checks the validity of the operation using keras_shape.
  • If the shape of a tensor is changed by some operation (e.g. slicing a tensor), then both keras_shape and backend_shape should be updated.
  • However, for MXNet, only backend_shape is changed. This causes an error when applying an operation that requires correct shape information; a minimal sketch follows.
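
A minimal sketch of the reported mismatch, hedged and untested, written against Keras 1.2.x with the MXNet backend (the slice syntax assumes the backend supports indexing a symbol):

    from keras import backend as K

    x = K.placeholder(shape=(2, 4))       # keras_shape is (2, 4)
    y = x[:, :2]                          # backend_shape becomes (2, 2) ...
    # ... but per this report, y's keras_shape still says (2, 4) on MXNet,
    # so a shape-validated elementwise op can fail:
    z = y * K.placeholder(shape=(2, 2))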

random_binomial NotImplementedError

model.add(Embedding(max_features,
                    embedding_dims,
                    input_length=maxlen,
                    dropout=0.2))

This line throws this exception:

    train_ret = func(*args, **kwargs)
  File "env\lib\site-packages\keras\backend\mxnet_backend.py", line 3007, in random_binomial
    raise NotImplementedError
NotImplementedError
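
Until random_binomial is implemented, a hedged workaround sketch is to emulate it with random_uniform, which the MXNet backend does provide (this also assumes the backend implements cast and lesser_equal):

    from keras import backend as K

    def random_binomial(shape, p=0.0, dtype='float32'):
        # Each element is 1 with probability p and 0 otherwise.
        return K.cast(K.lesser_equal(K.random_uniform(shape), p), dtype)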

Missing ops

  • eye: instantiate an identity matrix and return it.
  • var: variance of a tensor, alongside the specified axis.
  • std: standard deviation of a tensor, alongside the specified axis.
  • any: bitwise reduction (logical OR).
  • all: bitwise reduction (logical AND).
  • permute_dimensions: permutes axes in a tensor. (transpose already exists)
  • resize_images
  • resize_volumes
  • repeat_elements
  • repeat
  • tile
  • flatten: MXNet's flatten is currently a batch flatten; a normal flatten is needed.
  • expand_dims: adds a 1-sized dimension at index "dim". (already exists)
  • squeeze: removes a 1-sized dimension from the tensor at index "axis".
  • stack: stacks a list of rank-R tensors into a rank R+1 tensor.
  • one_hot
  • in_top_k
  • pool3d: no local support; the 'same' border_mode is missing.
  • l2_normalize: missing axis, or axis could be converted to a mode.
  • hard_sigmoid: activation.
  • softsign: activation.
  • atrous_conv2d
  • separable_conv2d
  • reverse

Underlying MXNet Model Extraction

It would be beneficial to have a way to extract the underlying MXNet module, similar to how a user can call K.get_session() with the TensorFlow backend. The global _MODEL object is inaccessible via the API, so there is no way I can see to access the MXNet module.

In my own fork I have added the following to mxnet_backend.py:

def get_mxnet_module():
    return _MODEL._mod
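
Hypothetical usage of the proposed helper (the name comes from the fork above; nothing here is official API):

    mod = get_mxnet_module()   # the underlying MXNet module driving training
    print(type(mod))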

Please let me know if there's a better way to do it. If not, I'd be happy to submit a PR with the above as I believe that feature would be useful to others as well.

No matching distribution found for keras-mxnet

Hello,
pip reports "No matching distribution found for keras-mxnet":

(pyenv1) keras heidi $ pip install mxnet==0.11.0 --user
Collecting mxnet==0.11.0
  Using cached mxnet-0.11.0-cp35-cp35m-macosx_10_12_x86_64.whl
Requirement already satisfied: numpy in /Users/heidi/.pyenv/versions/3.5.2/envs/pyenv1/lib/python3.5/site-packages (from mxnet==0.11.0)
Requirement already satisfied: graphviz in /Users/heidi/.pyenv/versions/3.5.2/envs/pyenv1/lib/python3.5/site-packages (from mxnet==0.11.0)
Installing collected packages: mxnet
Successfully installed mxnet-0.11.0
(pyenv1)keras heidi $ pip install keras-mxnet --user
Collecting keras-mxnet
  Could not find a version that satisfies the requirement keras-mxnet (from versions: )
No matching distribution found for keras-mxnet
(pyenv1)keras heidi $

Multiple GPU issue

Hi,

My model runs on a single GPU, but it fails on multiple GPUs. Here is my code:

x_train, y_train = batch_reader.get_batch()
gpu_list = ["gpu(0)", "gpu(1)", "gpu(2)", "gpu(3)"]
model_dist.compile(loss=losses.dist_loss_cls(C.max_radius), optimizer=optimizer, context=gpu_list)
model_dist.fit(x_train, y_train, batch_size=20, nb_epoch = num_epochs, callbacks=[checkpoint_fixed_name])

The error I got was:

RuntimeError: simple_bind error. Arguments:
input_1: (5, 1L, 32L, 32L, 32L)
[13:36:31] src/storage/storage.cc:59: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: invalid device ordinal

Would anyone please help me? Thanks.
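
One common cause of "invalid device ordinal" (a hedged guess) is requesting more GPU indices than the machine actually exposes. A quick check, assuming an MXNet version recent enough to provide mxnet.context.num_gpus:

    import mxnet as mx

    n = mx.context.num_gpus()                            # GPUs actually visible
    gpu_list = ['gpu(%d)' % i for i in range(min(n, 4))]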

How to predict on the GPU with Keras 1.2 and MXNet

I have trained a model using multiple GPUs, but in the prediction phase I found that this code does not run on the GPU:
model_cnn.predict(train_set, batch_size=CFG['batch_size'])
So how can I run predict on the GPU?
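
A hedged sketch: in this fork the device list is passed at compile time, so one approach is to (re)compile the trained model with a GPU context before predicting. This mirrors the compile(..., context=...) usage shown in the other issues here; the loss and optimizer are placeholders:

    model_cnn.compile(loss='categorical_crossentropy', optimizer='sgd',
                      context=['gpu(0)'])
    preds = model_cnn.predict(train_set, batch_size=CFG['batch_size'])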

mxnet_backend.clear_session should clear the placeholder name dictionary

Hello,

Right now I am trying to use the Keras (MXNet) library as a model converter. If you can port an existing MXNet model to a Keras model, then you can convert from this intermediate format to the Theano, CNTK, and TensorFlow formats. This also works the other way around.

To make this happen I need access to the names of all nodes. Unfortunately, the input variable name has an appended number which keeps increasing, and there is no way to reset it.

It would be helpful if mxnet_backend.clear_session() reset the placeholder_name_dict variable.
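
Until then, a hedged workaround sketch, assuming placeholder_name_dict (the dict named above) is exposed at module level in the fork's mxnet_backend:

    from keras.backend import mxnet_backend as KB

    KB.clear_session()
    KB.placeholder_name_dict.clear()   # reset placeholder numbering by hand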

Failing test 06/21/2017

Require sparse support:

Test cases
test_sparse_mlp
test_sparse_dot
test_sparse_concat

Require rnn symbolic loop implementation:

Test cases
test_dynamic_behavior[SimpleRNN]
test_dynamic_behavior[GRU]
test_dynamic_behavior[LSTM]
imdb_lstm
conv_lstm

Utility function issues, not major blockers:

Test case: Reason
test_value_manipulation: print_tensor should return the tensor
test_gradient: requires implementation of symbol gradients
test_function: requires updating variables
test_switch: requires implementation of switching between two operations
test_random_binomial: requires implementation of random_binomial
test_ctc: requires implementation of ctc_batch_cost
test_map: requires implementation of mapping a function over elements
test_foldl: requires implementation of reducing elements using a function
test_foldr: requires implementation of reducing elements using a function
test_Eigenvalue_reg: not supported

Functions incompatible with MXNet:

Test case: Reason
test_nn_operations: MXNet categorical_crossentropy doesn't support from_logits
test_arange: Keras requires that when start >= stop and step > 0 this function return an empty sequence; currently MXNet returns an error
test_batchnorm_mode_0_or_2: Keras-MXNet doesn't currently support batchnorm mode 2
test_shared_batchnorm: Keras-MXNet doesn't currently support batchnorm mode 2
variational_autoencoder_deconv: target shape too big for the mx.sym.Deconvolution operator
mnist_acgan: with the MXNet backend, parameters are not shared between a concatenated model and the separate models it was built from, so the user has to train the generator directly instead of combining generator and discriminator into one model
test_sequential_model_saving: optimizer state is not preserved when saving/loading a model
mnist_net2net: MXNet doesn't support setting weights to a larger shape

Image-ordering issues in test cases. These pass after the tests are modified, so they are not actually failing tests:

Test cases
test_image_classification
test_conv2d
test_conv3d
test_atrous_conv_2d
test_averagepooling_2d
test_zero_padding_3d
test_TimeDistributed

No training speed improvement from multiple GPUs with MXNet as the backend

Hi, I have some questions about training speed when using multiple GPUs with MXNet as the Keras backend. According to https://mxnet.incubator.apache.org/how_to/multi_devices.html: "By default, MXNet partitions a data batch evenly among the available GPUs. Assume a batch size b and assume there are k GPUs, then in one iteration each GPU will perform forward and backward on b/k examples. The gradients are then summed over all GPUs before updating the model." I think that when the batch size b is fixed, each GPU computes gradients on b/k examples, which should take less time than computing gradients on b examples with a single GPU. As a result, with the same batch size, weight updates using multiple GPUs should be faster per iteration than with a single GPU. But in my experiments I found that training with multiple GPUs is slower than with a single GPU.

Below are the relevant parts of my code, which uses a fully-connected network:

model = Sequential()
model.add(Dropout(0.1, input_shape=(2056,)))
model.add(Dense(2800, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2800, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2800, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(257))
model.summary()
opt = SGD()
NUM_GPU = 4
gpu_list = []
for i in range(NUM_GPU):
    gpu_list.append('gpu(%d)' % i)
batch_size = 128
model.compile(loss=my_loss, optimizer=opt, context=gpu_list)

I don't know whether my understanding is right; why is there no speed improvement with multiple GPUs? Can anyone answer my questions? Thanks!
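
One hedged observation: with the global batch size held at 128, each of the 4 GPUs processes only 32 examples per step, so the per-device compute is small and gradient synchronization overhead can dominate. A common remedy is to scale the batch size with the device count (x_train, y_train, and num_epochs assumed from the surrounding training script):

    # Keep 128 examples per device so compute amortizes the sync cost.
    batch_size = 128 * NUM_GPU
    model.fit(x_train, y_train, batch_size=batch_size, nb_epoch=num_epochs)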

Can we use Xception V1 model with mxnet backend

Hi, I installed Keras with MXNet 0.11 and successfully ran mnist_mlp.py; everything is OK.
Now I want to use the Xception module in keras/applications, and I found these comments in the file:

> Also do note that this model is only available for the TensorFlow backend,
> due to its reliance on `SeparableConvolution` layers.

I want to confirm whether it's true.

@piiswrong

Many thanks!

K.gradients: NotImplementedError

OK, I'm looking at mxnet_backend.py, and especially at how gradients are calculated, since I'm developing a custom optimizer. However, K.gradients does not implement any kind of call into mx:

def gradients(loss, variables):
    """Returns the gradients of `variables` (list of tensor variables)
    with regard to `loss`.
    """
    raise NotImplementedError

So my question is, how are gradients actually calculated, say in a call like:

grads=K.gradients(loss, params)

in my optimizer?

I mean, how the heck does SGD even work with the mx backend?

Thanks

Unable to use Batchnormalization with multi-GPUs in MXNet backend

Summary

Unable to use BatchNormalization with the MXNet backend when using multiple GPUs. After debugging the issue, I found a mismatch in the shape of a batchnorm param in the KVStore: in mxnet/model.py, the KVStore is initialized with a (64,) shape but is then updated with a (256,64,1,1) shape.

Stacktrace and Debug messages

Below are the stack trace and my debug messages from the initialize_kvstore and update_params_on_kvstore functions. Observe the param shape at index 4: there is a mismatch.

In initialize kvstore

kvstore - <mxnet.kvstore.KVStore object at 0x7fbfdb6729d0>
len of param_arrays - 304
len of arg_params - 304
len of param_names - 304
update_on_kvstore - True
Index - 0
Param name - normal1
Arg params - <NDArray 3x7x7 @cpu(0)>
Index - 1
Param name - convolution2d_1_b
Arg params - <NDArray 1 @cpu(0)>
Index - 2
Param name - batchnormalization_1_running_mean
Arg params - <NDArray 1 @cpu(0)>
Index - 3
Param name - batchnormalization_1_running_std
Arg params - <NDArray 1 @cpu(0)>
Index - 4
Param name - batchnormalization_1_gamma
Arg params - <NDArray 1 @cpu(0)>
arg_params in idx 4 - <NDArray 64 @cpu(0)>
param name at idx 4 - batchnormalization_1_gamma

In update_params_on_kvstore
param_arrays - 304
grad_arrays - 304
kvstore - <mxnet.kvstore.KVStore object at 0x7fbfdb6729d0>
Index - 0
arg_list[0] <NDArray 64x3x7x7 @gpu(0)>
Current index - 0
Index - 1
arg_list[0] <NDArray 64 @gpu(0)>
Current index - 1
Index - 2
arg_list[0] <NDArray 64 @gpu(0)>
Current index - 2
Index - 3
arg_list[0] <NDArray 64 @gpu(0)>
Current index - 3
Index - 4
arg_list[0] <NDArray 256x64x1x1 @gpu(0)>
Current index - 4
Len of arg_list - 16
Len of grad_list - 16
Arg list[0] - <NDArray 256x64x1x1 @gpu(0)>
Grad list[0] - <NDArray 256x64x1x1 @gpu(0)>
[16:14:11] /home/ubuntu/mxnet/dmlc-core/include/dmlc/./logging.h:304: [16:14:11] src/ndarray/ndarray.cc:319: Check failed: from.shape() == to->shape() operands shape mismatchfrom.shape = (256,64,1,1) to.shape=(64,)

Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7fc120b0a46c]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet10CopyFromToERKNS_7NDArrayEPS0_i+0x546) [0x7fc12154c056]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet7kvstore10CommDevice6ReduceEiRKSt6vectorINS_7NDArrayESaIS3_EEi+0x384) [0x7fc121925de4]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet7kvstore12KVStoreLocal4PushERKSt6vectorIiSaIiEERKS2_INS_7NDArrayESaIS7_EEi+0x175) [0x7fc121928015]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/libmxnet.so(MXKVStorePush+0x7b0) [0x7fc1218cbbc0]
[bt] (5) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7fc0b1ef8e40]
[bt] (6) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x2eb) [0x7fc0b1ef88ab]
[bt] (7) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(_ctypes_callproc+0x48f) [0x7fc0ba1083df]
[bt] (8) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(+0x11d82) [0x7fc0ba10cd82]
[bt] (9) python(PyObject_Call+0x43) [0x4b0cb3]

Traceback (most recent call last):
File "/home/ubuntu/keras_benchmarks/test_cifar_resnet.py", line 131, in
run_time, memory_usage = profile(train_model)
File "/home/ubuntu/keras_benchmarks/profiler.py", line 84, in profile
func_to_profile()
File "/home/ubuntu/keras_benchmarks/test_cifar_resnet.py", line 125, in train_model
validation_data=(X_test, Y_test))
File "/usr/local/lib/python2.7/dist-packages/Keras-1.2.2-py2.7.egg/keras/engine/training.py", line 1559, in fit_generator
class_weight=class_weight)
File "/usr/local/lib/python2.7/dist-packages/Keras-1.2.2-py2.7.egg/keras/engine/training.py", line 1322, in train_on_batch
outputs = self.train_function(ins)
File "/usr/local/lib/python2.7/dist-packages/Keras-1.2.2-py2.7.egg/keras/engine/training.py", line 1959, in train_function
self._mod.update()
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/module/bucketing_module.py", line 408, in update
self._curr_module.update()
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/module/module.py", line 575, in update
self._kvstore)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/model.py", line 132, in _update_params_on_kvstore
kvstore.push(index, grad_list, priority=-index)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/kvstore.py", line 162, in push
ctypes.c_int(priority)))
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/base.py", line 85, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [16:14:11] src/ndarray/ndarray.cc:319: Check failed: from.shape() == to->shape() operands shape mismatchfrom.shape = (256,64,1,1) to.shape=(64,)

Note: I used the ResNet50 architecture on the CIFAR dataset with batch size 32.

@mli @piiswrong @madjam @bhavinthaker

loss is nan

(/home/howardsu/keras/keras/engine/training.py:1823): train_function
1823 return [x.asnumpy().sum() for x in self._train_mod.get_outputs()]
(Pydb) p self._train_mod.get_outputs()[0].asnumpy()
array([ nan], dtype=float32)

model.fit return is incorrect

    history = model.fit(X_train, y_train, nb_epoch=12, batch_size=16,
                        validation_data=(X_test, y_test), verbose=2)
    config = optimizer.get_config()
    assert type(config) == dict
    assert history.history['val_acc'][-1] >= target

yajiedesign's test results show (tests failing due to missing ops):

test_loss_masking: missing any
test_activations: missing softplus, hard_sigmoid
test_optimizers: missing adamax, nadam

[Keras MXNet Interface] Failing Keras examples

In keras example imdb_lstm.py:
ValueError: Cannot unroll a RNN if the time dimension is undefined
This requires RNN support.

In keras example conv_lstm.py:
mxnet.base.MXNetError: value 0 for Parameter num_outputs should be greater equal to 1, in operator SliceChannel(name="", num_outputs="0", squeeze_axis="1"
This requires implementation of rnn symbolic loop.

In keras example mnist_acgan.py:
Error: SoftmaxCrossEntropy only accept 1D label

In keras example variational_autoencoder_deconv.py:
mxnet.base.MXNetError: Error in operator uniform15: [18:19:01] src/operator/./deconvolution-inl.h:75: Check failed: pad_y >= target_shape[0] (28 vs. 29) too big target shape
This requires support for oversized target_shape.

TypeError: kernel_initializer keyword not understood when building a dense layer.

Here is the output:
Using MXNet backend.
Traceback (most recent call last):
File "run.py", line 128, in
model = Model(config_args['alpha'], config_args['gamma'], config_args['input_size'], config_args['hidden_size'])
File "/tmp/Model.py", line 37, in init
kernel_initializer='glorot_normal'))
File "/usr/local/lib/python3.6/dist-packages/Keras-1.2.2-py3.6.egg/keras/layers/core.py", line 785, in init
super(Dense, self).init(**kwargs)
File "/usr/local/lib/python3.6/dist-packages/Keras-1.2.2-py3.6.egg/keras/engine/topology.py", line 326, in init
raise TypeError('Keyword argument not understood:', kwarg)
TypeError: ('Keyword argument not understood:', 'kernel_initializer')

The same issue comes up for use_bias=False.

I'm not sure why this is the case. I'm using the master version from this repo: version 1.2.2 with Python 3.6.7 and MXNet 1.3.1.
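
A likely explanation (hedged): kernel_initializer and use_bias are Keras 2 argument names, while this fork tracks Keras 1.2.2, which spells them init and bias. A sketch of the 1.2.x equivalent (hidden_size is assumed from the snippet above):

    from keras.layers import Dense

    layer = Dense(hidden_size, init='glorot_normal', bias=False)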

Convolution is missing border_mode

border_mode needs to be added to Convolution:
valid rounds the output size down;
same pads so the output is the same size as the input;
full rounds the output size up.
(Output-size formulas for the three modes are sketched below.)
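
A sketch of the three conventions for one spatial dimension, using the usual output-length formulas (input length n, kernel k, stride s); this is illustrative arithmetic, not backend code:

    def conv_output_length(n, k, s, border_mode):
        if border_mode == 'valid':    # no padding, round down
            return (n - k) // s + 1
        if border_mode == 'same':     # pad so the output covers the input: ceil(n / s)
            return (n + s - 1) // s
        if border_mode == 'full':     # pad k - 1 on both sides, round up
            return (n + k - 2) // s + 1
        raise ValueError(border_mode)

    # e.g. n=32, k=3, s=1 -> valid: 30, same: 32, full: 34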

Modulo Operation not supported

    self.mod_ids = Lambda(lambda sent: sent % (nr_tune-1)+1,
                          output_shape=(self.max_length,))

This returns an error: TypeError: unsupported operand type(s) for %: 'KerasSymbol' and 'int'

Argument issue with the Pooling layer (AveragePooling2D: border_mode)

I want to use the keras.applications.inception_v3 model. After I load that model, there is an error in a pooling layer.

Error Message
:----> 1 InceptionV3(include_top=True,input_tensor=None, input_shape=None,weights=None)

/pang/linetor/miniconda2/envs/py27/lib/python2.7/site-packages/keras/applications/inception_v3.pyc in InceptionV3(include_top, weights, input_tensor, input_shape, classes)
149
150 branch_pool = AveragePooling2D(
--> 151 (3, 3), strides=(1, 1), border_mode='same')(x)
152 branch_pool = conv2d_bn(branch_pool, 32, 1, 1)
153 x = merge([branch1x1, branch5x5, branch3x3dbl, branch_pool],

/pang/linetor/miniconda2/envs/py27/lib/python2.7/site-packages/keras/engine/topology.pyc in call(self, x, mask)
570 if inbound_layers:
571 # This will call layer.build() if necessary.
--> 572 self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
573 # Outputs were already computed when calling self.add_inbound_node.
574 outputs = self.inbound_nodes[-1].output_tensors

/pang/linetor/miniconda2/envs/py27/lib/python2.7/site-packages/keras/engine/topology.pyc in add_inbound_node(self, inbound_layers, node_indices, tensor_indices)
633 # creating the node automatically updates self.inbound_nodes
634 # as well as outbound_nodes on inbound layers.
--> 635 Node.create_node(self, inbound_layers, node_indices, tensor_indices)
636
637 def get_output_shape_for(self, input_shape):

/pang/linetor/miniconda2/envs/py27/lib/python2.7/site-packages/keras/engine/topology.pyc in create_node(cls, outbound_layer, inbound_layers, node_indices, tensor_indices)
164
165 if len(input_tensors) == 1:
--> 166 output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
167 output_masks = to_list(outbound_layer.compute_mask(input_tensors[0], input_masks[0]))
168 # TODO: try to auto-infer shape

/pang/linetor/miniconda2/envs/py27/lib/python2.7/site-packages/keras/layers/pooling.pyc in call(self, x, mask)
158 strides=self.strides,
159 border_mode=self.border_mode,
--> 160 dim_ordering=self.dim_ordering)
161 return output
162

/pang/linetor/miniconda2/envs/py27/lib/python2.7/site-packages/keras/layers/pooling.pyc in _pooling_function(self, inputs, pool_size, strides, border_mode, dim_ordering)
249 border_mode, dim_ordering):
250 output = K.pool2d(inputs, pool_size, strides,
--> 251 border_mode, dim_ordering, pool_mode='avg')
252 return output
253

/pang/linetor/miniconda2/envs/py27/lib/python2.7/site-packages/keras/backend/mxnet_backend.pyc in func_wrapper(*args, **kwargs)
31 old = learning_phase()
32 set_learning_phase(1)
---> 33 train_ret = func(*args, **kwargs)
34 set_learning_phase(0)
35 test_ret = func(*args, **kwargs)

/pang/linetor/miniconda2/envs/py27/lib/python2.7/site-packages/keras/backend/mxnet_backend.pyc in pool2d(x, pool_size, strides, border_mode, dim_ordering, pool_mode)
2910 x = _preprocess_convnd_input(x, dim_ordering)
2911 s = mx.sym.Pooling(data=x.symbol, kernel=pool_size, pool_type=pool_mode, pooling_convention=border_mode,
-> 2912 stride=strides)
2913 out = _postprocess_convnd_output(KerasSymbol(s), dim_ordering)
2914 return out

/pang/linetor/.local/lib/python2.7/site-packages/mxnet/symbol.pyc in Pooling(data, global_pool, cudnn_off, kernel, pool_type, pooling_convention, stride, pad, name, attr, out, **kwargs)

/pang/linetor/.local/lib/python2.7/site-packages/mxnet/_ctypes/symbol.pyc in _symbol_creator(handle, args, kwargs, keys, vals, name)
125 c_array(ctypes.c_char_p, [c_str(i) for i in keys]),
126 c_array(ctypes.c_char_p, [c_str(str(i)) for i in vals]),
--> 127 ctypes.byref(sym_handle)))
128
129 if args and kwargs:

/pang/linetor/.local/lib/python2.7/site-packages/mxnet/base.pyc in check_call(ret)
127 """
128 if ret != 0:
--> 129 raise MXNetError(py_str(_LIB.MXGetLastError()))
130
131 if sys.version_info[0] < 3:

MXNetError: Invalid Input: 'same', valid values are: {'full', 'valid'}, in operator Pooling(name="", stride="(1, 1)", pooling_convention="same", pool_type="avg", kernel="(3, 3)")

Reading the error message, I think the problem is the border_mode argument. In Keras 1.2.2, the Pooling layer's border_mode must be 'valid' or 'same', but the MXNet backend says the valid values are {'full', 'valid'}.
If I use 'same', MXNet raises an error; if I use 'full', Keras raises an error. So I can't implement InceptionResNetV2.

Is there any comment about this?


Topic classification model in keras with MXNet backend errors out - Too many slices such that some splits are empty

Running the Reuters topic classification model under keras/examples with the MXNet backend and 8 GPUs errors out with the following error:
Error: Too many slices such that some splits are empty

Code - https://github.com/fchollet/keras/blob/master/examples/reuters_mlp.py

Setting:

  1. MXNet latest commit till (March 8, 2017)
  2. dmlc/keras latest commit till (March 8, 2017)
  3. Number of GPU - 8

However, the same example works with the TensorFlow backend.

Also, with the MXNet backend it works fine for 1, 2, or 4 GPUs.
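
A plausible cause (hedged): MXNet splits each batch evenly across the devices, so any batch smaller than the GPU count, such as a short final batch, produces empty slices. A sketch that keeps every batch a multiple of the device count (x_train and y_train assumed from the example script):

    num_gpus = 8
    batch_size = 32 - (32 % num_gpus)             # round down to a multiple
    num_batches = len(x_train) // batch_size
    x_train = x_train[:num_batches * batch_size]  # drop the short final batch
    y_train = y_train[:num_batches * batch_size]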

Error with Theano backend and multiple GPUs

I'm trying to use the Theano backend with multiple GPUs and I'm getting this error: ValueError: Invalid argument "context" passed to K.function. However, if I run the Theano GPU test script, everything is correct. Any idea?

Thanks!

Prediction incorrect when load saved model

Hi,

Now I am using Keras 1.2.2 with the MXNet backend on ResNet50, and I have observed a very weird phenomenon:

If we use MXNet as the backend, finish training, save the model to disk (by model.save_weight(…)), and then reload the model to do prediction, we get almost the same output for all training data: the reloaded model predicts nearly identical values on the training data and classifies everything into one category. This shouldn't happen, since training accuracy is very high. If we use the trained model directly to predict on the training data (without saving and loading), everything is fine.

However, if we use Theano as the backend (keeping all code unchanged and changing ~/.keras/keras.json to use Theano), then everything is fine: the reloaded model does its job correctly.

Have you ever seen such a weird phenomenon? Do you think there is an issue with how we save the model?

One more piece of information: the way we used multiple GPUs for training was to add context=['gpu(0)', 'gpu(1)', 'gpu(2)', 'gpu(3)'] directly when compiling the model. I am assuming no additional configuration is needed.

Please let me know if you need more information.

(Score screenshot omitted: all samples receive nearly identical scores.)

mxnet to keras model .h5

Is it possible to read in .params + symbol.json files and convert them to the Keras .h5 format, now that the backend is supported? This would be a nice workaround for MXNet -> CoreML support until coremltools officially supports MXNet.
