
tensorflow-mnist-cnn's Introduction

Convolutional Neural Network for MNIST

An implementation of a convolutional neural network (CNN) for MNIST with various techniques such as data augmentation, dropout, and batch normalization.

Network architecture

The 4-layer CNN has the following architecture (a TensorFlow sketch follows the list).

  • input layer : 784 nodes (MNIST image size)
  • first convolution layer : 5x5x32
  • first max-pooling layer
  • second convolution layer : 5x5x64
  • second max-pooling layer
  • third fully-connected layer : 1024 nodes
  • output layer : 10 nodes (number of classes for MNIST)
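
A minimal sketch of this architecture in the TensorFlow 1.x layers API (illustrative only, not the repository's exact cnn_model.py, which targets r0.12):

import tensorflow as tf

def cnn_sketch(x, is_training=False):
    # x: [batch, 784] flattened MNIST images; returns [batch, 10] logits
    x_image = tf.reshape(x, [-1, 28, 28, 1])                         # input layer: 784 nodes
    # first convolution layer: 5x5 kernel, 32 filters, followed by max-pooling
    conv1 = tf.layers.conv2d(x_image, 32, 5, padding='same', activation=tf.nn.relu)
    pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)   # 28x28 -> 14x14
    # second convolution layer: 5x5 kernel, 64 filters, followed by max-pooling
    conv2 = tf.layers.conv2d(pool1, 64, 5, padding='same', activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)   # 14x14 -> 7x7
    # third fully-connected layer: 1024 nodes, with dropout during training
    flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
    fc3 = tf.layers.dense(flat, 1024, activation=tf.nn.relu)
    fc3 = tf.layers.dropout(fc3, rate=0.5, training=is_training)
    # output layer: 10 nodes (one logit per class)
    return tf.layers.dense(fc3, 10)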

Tools for improving CNN performance

The following techniques are employed to improve the performance of the CNN.

Train

1. Data augmentation

The amount of training data is increased 5-fold by means of the following (a sketch follows the list).

  • Random rotation : each image is rotated by a random angle in the range [-15°, +15°].
  • Random shift : each image is randomly shifted by a value in the range [-2px, +2px] along both axes.
  • Zero-centered normalization : (PIXEL_DEPTH/2) is subtracted from each pixel value and the result is divided by PIXEL_DEPTH.
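
A sketch of this augmentation in Python (assumptions: scipy.ndimage for the geometric transforms and PIXEL_DEPTH = 255; the repository's preprocessing may differ in details such as the order of operations):

import numpy as np
from scipy.ndimage import rotate, shift

PIXEL_DEPTH = 255.0  # assumption: 8-bit grayscale pixels

def augment(image):
    # image: [28, 28] array; returns one randomly perturbed, normalized copy
    angle = np.random.uniform(-15.0, 15.0)       # random rotation in [-15, +15] degrees
    out = rotate(image, angle, reshape=False)
    dy, dx = np.random.randint(-2, 3, size=2)    # random shift in [-2, +2] pixels per axis
    out = shift(out, [dy, dx])
    # zero-centered normalization
    return (out - PIXEL_DEPTH / 2.0) / PIXEL_DEPTH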

2. Parameter initializers

  • Weight initializer : Xavier initializer (see the sketch after this list)
  • Bias initializer : constant (zero) initializer
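
In TensorFlow 1.x these correspond to, for example (the variable shapes below are for the first convolution layer and are illustrative):

import tensorflow as tf

weights = tf.get_variable('weights', shape=[5, 5, 1, 32],
                          initializer=tf.contrib.layers.xavier_initializer())
biases = tf.get_variable('biases', shape=[32],
                         initializer=tf.constant_initializer(0.0))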

3. Batch normalization

All convolution/fully-connected layers use batch normalization.
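
The checkpoint keys quoted in the issues below (e.g. fc3/BatchNorm/moving_mean) suggest tf.contrib.layers.batch_norm; a sketch of one conv + batch-norm + ReLU block under that assumption:

import tensorflow as tf

def conv_bn_relu(x, filters, is_training):
    # convolution without its own activation; batch norm sits between conv and ReLU
    h = tf.layers.conv2d(x, filters, 5, padding='same', activation=None)
    # batch statistics during training, moving averages at test time
    h = tf.contrib.layers.batch_norm(h, is_training=is_training,
                                     updates_collections=None)
    return tf.nn.relu(h)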

4. Dropout

The third fully-connected layer employs the dropout technique.
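
A sketch of test-time-aware dropout with the r0.12-era tf.nn API (the keep probability of 0.7 is an assumption, not taken from the repository):

import tensorflow as tf

def dropout_fc3(fc3, is_training):
    # is_training: scalar boolean tensor, so the graph can switch modes via feed_dict
    keep_prob = tf.cond(is_training,
                        lambda: tf.constant(0.7),   # assumed keep probability
                        lambda: tf.constant(1.0))   # no dropout at test time
    return tf.nn.dropout(fc3, keep_prob)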

5. Exponentially decayed learning rate

The learning rate is decayed after every epoch.
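
A sketch using tf.train.exponential_decay (the base rate, decay factor, and batch size are assumptions):

import tensorflow as tf

global_step = tf.Variable(0, trainable=False)
steps_per_epoch = 550  # e.g. 55,000 training images / batch size 100
learning_rate = tf.train.exponential_decay(
    1e-3, global_step, decay_steps=steps_per_epoch,
    decay_rate=0.95, staircase=True)
# pass global_step to the optimizer's minimize() so the counter, and hence
# the staircase decay (one drop per epoch), advances with every batch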

Test

1. Ensemble prediction

Every model makes a prediction (votes) for each test instance and the final output prediction is the one that receives the highest number of votes.
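
A NumPy sketch of this majority vote (the array shapes are assumptions):

import numpy as np

def majority_vote(logits_per_model):
    # logits_per_model: list of [num_test, 10] score arrays, one per model
    votes = np.stack([np.argmax(l, axis=1) for l in logits_per_model])  # [models, num_test]
    # count how many votes each class receives for every test instance
    counts = np.apply_along_axis(np.bincount, 0, votes, minlength=10)   # [10, num_test]
    return np.argmax(counts, axis=0)  # ties go to the lowest class index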

Usage

Train

python mnist_cnn_train.py

Training logs are saved in "logs/train". The trained model is saved as "model/model.ckpt".

Test a single model

python mnist_cnn_test.py --model-dir <model_directory> --batch-size <batch_size> --use-ensemble False

  • <model_directory> is the location where the model to be tested is saved. Do not include the "model.ckpt" filename.
  • <batch_size> is used to reduce the memory burden on the machine. MNIST has 10,000 test images; any batch size gives the same result but requires a different amount of memory.

For example: python mnist_cnn_test.py --model-dir model/model01_99.61 --batch-size 5000 --use-ensemble False.

Test ensemble prediction

python mnist_cnn_test.py --model-dir <model_directory> --batch-size <batch_size> --use-ensemble True

  • <model_directory> is the root directory, whose sub-directories each contain one model.

For example: python mnist_cnn_test.py --model-dir model --batch-size 5000 --use-ensemble True.

Simulation results

The CNN was trained 30 times with the same hyper-parameters, giving the following results.

  • A single model : 99.61% accuracy.
    (the model is saved in "model/model01_99.61".)
  • Ensemble prediction : 99.72% accuracy.
    (All 5 models under "model/" are used. The collection of 5 models was found by trial and error.)

The 99.72% accuracy ranks 5th according to Here.

Acknowledgement

This implementation has been tested on TensorFlow r0.12.


tensorflow-mnist-cnn's Issues

TensorFlow variables already exist

If I run your code exactly, TensorFlow complains about variables already being defined, e.g. with the following error:

ValueError: Variable conv1/weights already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?

Do variables have to be reused or not? How can you set it globally?
[I'm using TensorFlow 1.3, Python 3.6 on Linux]
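
A common workaround (not from this repository) is to clear the default graph before rebuilding the model, since re-running graph-construction code in the same Python session, e.g. in a notebook, leaves the old variables in place:

import tensorflow as tf

# drop previously defined variables such as conv1/weights before rebuilding,
# which avoids the error without setting any reuse flags
tf.reset_default_graph()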

NotFoundError (see above for traceback): Key fc3/BatchNorm/moving_mean not found in checkpoint

Thanks for your code! And I have some questions, glad for your help.
With Ubuntu 18.04 and TensorFlow 0.12.0, when I run
python mnist_cnn_test.py --model-dir model/model01_99.61 --batch-size 5000 --use-ensemble False
it doesn't work and returns NotFoundError (see above for traceback): Key fc3/BatchNorm/moving_mean not found in checkpoint [[Node: save/RestoreV2_9 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_9/tensor_names, save/RestoreV2_9/shape_and_slices)]]

And when I run python mnist_cnn_test.py --model-dir model/model01_99.61 --batch-size 5000 --use-ensemble True,
the "accuracy" is only 0.092. All models are like this.
Looking forward to your reply.

mnist_cnn_train.py test accuracy values off

When the model is created with y = cnn_model.CNN(x), the is_training variable is not passed. Thus, in the testing section, when performing y_final = sess.run(y, feed_dict={x: batch_xs, y_: batch_ys, is_training: False}), the is_training: False entry has no effect. This impacts your accuracy.

If you use the mnist_cnn_train.py test function, the model is initialized with the is_training parameter and gives a result approximately 0.5% higher.

I changed it to y = cnn_model.CNN(x, is_training=is_training) and now the accuracy percentages match for both modules.

Just as a side note: tf.scalar_summary is deprecated in TensorFlow 1.4.

Possible bug with is_training parameter

Hello,

Going through the code of your project, I think the is_training parameter is not taken into account for the CNN model in the file mnist_cnn_train.py.

I've seen that the cnn_model.CNN function takes an "is_training" argument with a default of True, which prevents the code from crashing.

In mnist_cnn_train.py, you define the is_training placeholder but don't use it when calling cnn_model.CNN. You do use it in the training and testing loops of the same file, so I assume this is not intended behavior.

I haven't tested it yet, but I think the is_training entry of the feed_dict is simply ignored, and this causes dropout to be applied during the testing loop (the same goes for batch normalization). This bug could be the cause of issue #1.
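
A minimal sketch of the wiring this issue suggests, with the placeholder actually passed into cnn_model.CNN (shapes taken from the README above):

import tensorflow as tf
import cnn_model  # the repository's model definition

x = tf.placeholder(tf.float32, [None, 784])
is_training = tf.placeholder(tf.bool)

# passing the placeholder in means feeding {is_training: False} at test
# time really disables dropout and switches batch norm to moving averages
y = cnn_model.CNN(x, is_training=is_training)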

If I change the test batch size, the result is different

I use this code on my own data. I restore the model (saver.restore(sess, myModelPath)) and then test.
If I set the test batch size to 1, the result is [ 0. 0. 0. 0. 0.].
If I set the test batch size to 5, the result is [ 1. 0. 0. 0. 1.].
Why is it different?

No model.ckpt was saved under directory model/

I ran python mnist_cnn_train.py in a terminal and it returned

Optimization Finished!
test accuracy for the stored model: 0.9932

Training logs are saved in "logs/train"; however, there is no trained model saved as "model/model01_99.61/model.ckpt" or in any other directory. When I run python mnist_cnn_test.py --model-dir model/model01_99.61 --batch-size 5000 --use-ensemble False, it returns an error message:

NotFoundError (see above for traceback): Key fc3/BatchNorm/beta not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Where is the problem?
