
Comments (5)

eiderman commented on May 18, 2024

The easiest way to use them is to pass the batch_normalize argument to conv2d or fully_connected (both also support setting it through defaults_scope). The argument takes a pt.BatchNormalizationArguments object, which can be shared between layers (I see now that the auto-generated documentation for it is broken). The defaults are likely good enough for experimentation; the main one you would want to toggle is probably scale_after_normalization, which controls whether a learned multiplier is used.

Batch normalization helps a lot more in convolutional layers than in fully connected ones, but you may see a benefit there too if your batch size is big enough. Also note that your train and test networks will differ when using batch normalization, so you need to construct one with phase=pt.Phase.train and the other with phase=pt.Phase.test. This is because the actual batch statistics are used during training, whereas a running average collected during training is used at test time.
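
To make the train/test difference concrete, here is a minimal pure-Python sketch of batch normalization on a list of scalars. It is not Pretty Tensor code: the class name, `update_rate`, `epsilon`, and their values are assumptions chosen only for illustration. In training mode it normalizes with the statistics of the current batch and moves a running average toward them; in test mode it normalizes with the running average alone.

```python
import math


class BatchNormSketch:
    """Toy 1-D batch normalization; illustrative only, not Pretty Tensor code."""

    def __init__(self, update_rate=0.1, epsilon=1e-3):
        self.update_rate = update_rate  # hypothetical name and value
        self.epsilon = epsilon          # keeps the division away from zero
        self.running_mean = 0.0
        self.running_var = 1.0

    def __call__(self, batch, training):
        if training:
            # Training: use the statistics of this batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Move the running averages toward the batch statistics.
            self.running_mean += self.update_rate * (mean - self.running_mean)
            self.running_var += self.update_rate * (var - self.running_var)
        else:
            # Test: use only what was accumulated during training.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / math.sqrt(var + self.epsilon) for x in batch]


bn = BatchNormSketch()
for _ in range(200):
    bn([1.0, 2.0, 3.0, 4.0], training=True)

# The same input batch, normalized in each mode.
test_out = bn([10.0, 20.0], training=False)   # running averages
train_out = bn([10.0, 20.0], training=True)   # this batch's own statistics
```

The two outputs differ for the same input, which is the reason the train and test graphs have to be built separately.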

Also, I don't mind answering questions here, especially if you close them when you are satisfied.

from prettytensor.

Hvass-Labs commented on May 18, 2024

I've looked at the function conv2d which takes an argument batch_normalize which is a BatchNormalizationArguments object, as you say. Confusingly it is set to False by default, where I would have thought None would be more appropriate?

Anyway, when I search for its definition using the PyCharm editor, I end up in the file pretty_tensor_normalization_methods.py with the following:

BatchNormalizationArguments = collections.namedtuple(
    'BatchNormalizationArguments',
    ('learned_moments_update_rate', 'variance_epsilon',
     'scale_after_normalization'))

There's not much documentation in this file so it isn't clear how to use this, what the parameters mean and which values might be appropriate.

Could I ask you to modify the code I wrote above to give an example of how to do this?

Thanks.


eiderman commented on May 18, 2024

The parameters map directly to the batch_normalization method. It is unlikely that you will want to change variance_epsilon since it mostly just avoids problems around 0. learned_moments_update_rate also has a reasonable default (it changes the decay factor for the exponential moving average used in test or inference). They are exposed more for completeness and to support edge cases where they may be useful. You may find some value in playing with scale_after_normalization, which controls a multiplier applied to each depth channel. See https://github.com/google/prettytensor/blob/master/docs/PrettyTensor.md#batch_normalize for more details.

# One shared arguments object; only scale_after_normalization is overridden.
norm = pt.BatchNormalizationArguments(scale_after_normalization=True)

# x_pretty and y_true come from the network code earlier in this issue.
with pt.defaults_scope(activation_fn=tf.nn.relu, phase=pt.Phase.train):
    y_pred, loss = x_pretty.\
        conv2d(kernel=5, depth=64, name='layer_conv1', batch_normalize=norm).\
        max_pool(kernel=2, stride=2).\
        conv2d(kernel=5, depth=64, name='layer_conv2', batch_normalize=norm).\
        max_pool(kernel=2, stride=2).\
        flatten().\
        fully_connected(size=256, name='layer_fc1', batch_normalize=norm).\
        fully_connected(size=128, name='layer_fc2').\
        softmax_classifier(class_count=10, labels=y_true)
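
As a rough illustration of why a single keyword can be enough when constructing the arguments object, the sketch below rebuilds a namedtuple with the same three field names and gives every field a default of None. The None defaults, the `resolve` helper, and the fallback numbers are assumptions made for this example, not values taken from Pretty Tensor.

```python
import collections

# Same field names as the namedtuple quoted earlier in this thread.
# Assumption: each field defaults to None, meaning "use the layer's default".
# (The defaults= argument needs Python 3.7 or newer.)
BatchNormArgsSketch = collections.namedtuple(
    'BatchNormArgsSketch',
    ('learned_moments_update_rate', 'variance_epsilon',
     'scale_after_normalization'),
    defaults=(None, None, None))

# Hypothetical fallback values, chosen only for this illustration.
FALLBACKS = BatchNormArgsSketch(
    learned_moments_update_rate=0.0003,
    variance_epsilon=0.001,
    scale_after_normalization=False)


def resolve(args):
    """Replace every field left as None with its fallback value."""
    return BatchNormArgsSketch(*(
        fallback if value is None else value
        for value, fallback in zip(args, FALLBACKS)))


# Only one field is set explicitly; the other two stay None until resolved.
norm_sketch = BatchNormArgsSketch(scale_after_normalization=True)
resolved = resolve(norm_sketch)
```

Because the object is an immutable tuple, the same instance can be passed to several layers without one layer affecting another.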

(In reply to Hvass-Labs's comment of Wed, Aug 17, 2016, 1:10 AM.)

Hvass-Labs commented on May 18, 2024

Thanks for the example. However, I get an error using your sample code.

First I write the following which you suggest:

norm = pt.BatchNormalizationArguments(scale_after_normalization=True)

And in the code for defining the network I have the following line for creating the first conv-layer, as you suggest:

conv2d(kernel=5, depth=64, name='layer_conv1', batch_normalize=norm).\

But this causes the following exception:

UnboundLocalError: local variable 'kwargs' referenced before assignment

which is raised in line 1981 in pretty_tensor_class.py which reads:

result = func(non_seq_layer, *args, **kwargs)

What I do instead is that I use batch_normalize=True in the call to conv2d(). But it's not really clear from the docs what this does.
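
For readers unfamiliar with this exception: Python raises UnboundLocalError when a function reads a local name that no executed branch has assigned. The sketch below reproduces that general pattern with a toy dispatcher. It is a guess at the kind of bug involved, not the actual code in pretty_tensor_class.py, and `call_layer` and `fake_layer` are hypothetical names.

```python
def call_layer(func, layer, batch_normalize=False):
    """Toy dispatcher; NOT the actual Pretty Tensor code."""
    if batch_normalize is True:
        kwargs = {'batch_normalize': True}
    elif batch_normalize is False:
        kwargs = {}
    # Any other value (for example an arguments object) skips both branches,
    # so 'kwargs' is never bound before the next line reads it.
    return func(layer, **kwargs)


def fake_layer(layer, **kwargs):
    """Stand-in for a layer method; just echoes what it received."""
    return (layer, kwargs)


# Works: one of the branches assigns kwargs.
ok = call_layer(fake_layer, 'layer_conv1', batch_normalize=True)

# Fails: neither branch runs, so kwargs is unbound.
try:
    call_layer(fake_layer, 'layer_conv1', batch_normalize=object())
    error_name = None
except UnboundLocalError as err:
    error_name = type(err).__name__
```

A pattern like this would be consistent with batch_normalize=True working while an arguments object fails, but the real cause may be different.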

I've read the following doc which you suggested, but it really doesn't explain much:

https://github.com/google/prettytensor/blob/master/docs/PrettyTensor.md#batch_normalize

The docs also don't make clear what difference the phases make in Pretty Tensor. When I look at the docs for e.g. evaluate_precision_recall(), it appears that the testing phase completely changes the semantics of the function. So I probably don't want to use Pretty Tensor's training / testing phases, because they might change the semantics in unpredictable and undocumented ways, which would cause bugs in my code that are very hard to find.

Once again I'd like to encourage you to significantly improve the documentation, because it is frustrating to try to learn Pretty Tensor from the current docs. Scikit-learn has very good documentation which could serve as inspiration. But I read in the TensorFlow forum that the dev team is currently consolidating the builder APIs, so perhaps you have different plans going forward?


eiderman commented on May 18, 2024

Thanks for the bug report. I found the problem and will provide a fix.

Pretty Tensor is supported and alive, but it is a rather small effort now and so fixes take time.

The larger effort is tf.contrib.learn and it has some tutorials here: https://www.tensorflow.org/versions/r0.9/tutorials/index.html

These can all be mixed and matched, and any functions that you like and find missing can be added (the simplest way would be pt.Register(tf.contrib.layers.BLAH)). I'm taking the documentation request seriously.
