Generate simple data to evaluate the algorithm.
NOTE: Using test_func without decode_batch (CTC) works better than using CTC decoding in the prediction phase.
- clean set (expected RNN model accuracy > 97%)
- noise set (expected accuracy < clean set)
- padding set => so we can then apply IQA to improve accuracy on the noise and padding sets
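A minimal sketch of how the three sets could be compared, assuming a hypothetical `predict_texts(model, X)` helper that wraps the test function (X only, no length input, greedy CTC decoding) and that each set is a list of (image, label) pairs; all names and thresholds here are illustrative, not the project's actual code:

```python
import numpy as np

def accuracy_on_set(model, samples, predict_texts):
    """Exact-match accuracy over a list of (image, label) pairs."""
    X = np.stack([img for img, _ in samples])
    labels = [lbl for _, lbl in samples]
    preds = predict_texts(model, X)  # hypothetical helper: takes only X, no lengths
    return float(np.mean([p == l for p, l in zip(preds, labels)]))

# expected ordering: clean (> 0.97) >= padding >= noise
for name, subset in [("clean", clean_set), ("noise", noise_set), ("padding", padding_set)]:
    print(name, accuracy_on_set(model, subset, predict_texts))
```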
1. Change the predict function in Recognizor to the test function (only X as input, not length).
2. Pad the identity card background to match the real-world situation.
3. Create a thin dataset for 1080 training.
4. **Survey noise and padding methods**
1. Add noise (matching real conditions) to the background [DONE]
2. Improve RNN training
- loss visualization is not smooth
- data is augmented too much => too much variation
- batch size is small
- too much noise
- Increase the batch size.
- About data generation (a training loop matching these points is sketched below):
  2.0 [x] Train on shuffled generated data MORE times before generating new data.
  2.1 [x] Reduce the randomness in the data generation function: random fonts, random noise.
  2.2 [x] Use less noise: only use noise that appears in real situations.
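A minimal sketch of points 2.0-2.2 above, assuming a hypothetical `gen_data(fonts, noise_level)` generator; the repeat count, round count, and the restricted font/noise settings are illustrative assumptions:

```python
# gen_data, model and REAL_FONTS are project-specific placeholders, not real names
EPOCHS_PER_GENERATED_SET = 5          # 2.0: reuse each generated set several times

for _ in range(20):
    # 2.1 / 2.2: fewer random fonts, only noise that occurs in real images
    X, y = gen_data(fonts=REAL_FONTS, noise_level="realistic")
    model.fit(X, y, epochs=EPOCHS_PER_GENERATED_SET, shuffle=True)
```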
- [] Standardize the data.
- [] Check the preprocessing used for training.
- [] Too much regularization: dropout, batch norm, L2 weight decay. Reduce them (research more ...), e.g. as in the sketch below.
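Purely illustrative values for dialing the regularization down in Keras; the input shape, filter count, and rates below are assumptions, not the project's current settings:

```python
from keras.layers import Input, Conv2D, Dropout
from keras.regularizers import l2

inp = Input(shape=(64, 256, 1))                      # example input shape
x = Conv2D(64, (3, 3), padding="same",
           kernel_regularizer=l2(1e-5))(inp)         # weaker L2 penalty
x = Dropout(0.1)(x)                                  # lower dropout rate
```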
- Switch layers from train mode to test mode: BatchNorm, Dropout, ... For example, don't use Dropout in test mode (see the K.learning_phase note below).
- Decrease the learning rate (use SGD with lr=0.003 instead of Adadelta).
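A sketch of that optimizer swap, assuming the usual Keras CTC setup where the real loss is computed inside the model and a dummy loss is passed to compile; only lr=0.003 comes from the note above, the output name and clipnorm are assumptions:

```python
from keras.optimizers import SGD

# lr=0.003 from the note above; clipnorm is an extra assumption for CTC stability
model.compile(optimizer=SGD(lr=0.003, clipnorm=5.0),
              loss={"ctc": lambda y_true, y_pred: y_pred})
```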
- [] Initialize the weights properly (maybe).
- Use dropout in the LSTM: LSTM(dropout=0.4). Note that `dropout` (on the inputs) is different from `recurrent_dropout` (on the recurrent state).
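A small sketch of that distinction using the standard Keras arguments; the unit count and the Bidirectional wrapper are assumptions:

```python
from keras.layers import LSTM, Bidirectional

# `dropout` drops input connections, `recurrent_dropout` drops recurrent-state connections
rnn = Bidirectional(LSTM(256, return_sequences=True,
                         dropout=0.4, recurrent_dropout=0.0))
```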
- Save the best model over the entire training process => compare by loss.
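One way to do this in Keras is a ModelCheckpoint callback; the file name, monitored metric, and fit arguments are assumptions:

```python
from keras.callbacks import ModelCheckpoint

# keep only the checkpoint with the lowest validation loss seen so far
checkpoint = ModelCheckpoint("best_model.h5", monitor="val_loss",
                             save_best_only=True, mode="min")
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=50, callbacks=[checkpoint])
```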
- A small batch size => more difference between batches => the loss differs a lot from batch to batch.
- Training for only a short time means the model cannot learn the training set deeply.
- With dropout, we cannot apply dropout at test time, because at test time we need to use all the weights.
Note that if your model behaves differently in the training and testing phases (e.g. if it uses Dropout, BatchNormalization, etc.), you will need to pass the learning phase flag to your function:

```python
from keras import backend as K

get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                                  [model.layers[3].output])

# learning phase 0 = test mode
layer_output = get_3rd_layer_output([x, 0])[0]
```
- Large lr => hard to converge.
- Dropout for the input and dropout for the recurrent state. BatchNorm ordering: Conv2D (no activation) -> BatchNorm -> Activation (ReLU). https://stackoverflow.com/questions/39691902/ordering-of-batch-normalization-and-dropout-in-tensorflow
  Should use BatchNorm before the activation :(
  https://github.com/ducha-aiki/caffenet-benchmark/blob/master/batchnorm.md
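A minimal sketch of the Conv -> BatchNorm -> ReLU ordering described above; the input shape and filter count are placeholder assumptions:

```python
from keras.layers import Input, Conv2D, BatchNormalization, Activation

inp = Input(shape=(64, 256, 1))              # example input shape
x = Conv2D(64, (3, 3), padding="same")(inp)  # no activation on the conv itself
x = BatchNormalization()(x)                  # normalize before the nonlinearity
x = Activation("relu")(x)
```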
- [] Low accuracy: add more samples that look like the test set to the training set.
  1.1 [x] Add a red sine background for training.
  1.2 [] Add a padded ID area when the ID card is padded.
  1.3 [x] Find a font for the ID card.
Example augmenters from imgaug:
- Add((-20, 20))
- AdditiveGaussianNoise
- Salt p=(0, 0.03)
- Pepper p=(0, 0.03)
- GaussianBlur sigma=(0, 0.25) or (0, 0.5)
- AverageBlur k=(1, 3)
- MedianBlur k=(1, 3)
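A sketch combining the augmenters above into one imgaug pipeline; the AdditiveGaussianNoise scale, the random_order flag, and the fixed odd kernel for MedianBlur are assumptions added here (the notes only list k=(1, 3)):

```python
import imgaug.augmenters as iaa

seq = iaa.Sequential([
    iaa.Add((-20, 20)),
    iaa.AdditiveGaussianNoise(scale=(0, 0.03 * 255)),  # scale is an assumed value
    iaa.Salt(p=(0, 0.03)),
    iaa.Pepper(p=(0, 0.03)),
    iaa.GaussianBlur(sigma=(0, 0.5)),
    iaa.AverageBlur(k=(1, 3)),
    iaa.MedianBlur(k=3),  # median kernel must be odd; the notes list k=(1, 3)
], random_order=True)

images_aug = seq.augment_images(images)  # `images`: batch of uint8 numpy arrays
```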