Code Monkey home page Code Monkey logo

cocob's Introduction

COCOB

TensorFlow implementation of COCOB from the paper

Backprop without Learning Rates Through Coin Betting
Francesco Orabona and Tatiana Tommasi
https://arxiv.org/abs/1705.07795

Description

COntinuous COin Betting (COCOB) is a novel algorithm for stochastic subgradient descent (SGD) that does not require any learning rate setting. Contrary to previous methods, we do not adapt the learning rates, nor we make use of the assumed curvature of the objective function. Instead, we reduce the optimization process to a game of betting on a coin and obtain a learning rate free procedure for deep networks.

How do we reduce SGD to coin betting?
Betting on a coin works in this way: start with $1 and bet some money wt on the outcome of a coin toss gt, that can be +1 or -1. Similarly, in the optimization world, we want to minimize a 1D convex function f(w), and the gradients gt that we can receive are only +1 and -1. Thus, we can treat the gradients gt as the outcomes of the coin toss, and the value of the parameter wt as the money bet in each round.
If we make a lot of money with the betting it means that we are very good at predicting the gradients, and in optimization terms it means that we converge quickly to the minimum of our function. More in details, the average of our bets will converge to the minimizer at a rate that depends on the dual of the growth rate of our money.

Why using this reduction?
Because algorithms to bet on coins are already known, they are optimal, parameter-free, and very intuitive. So, in the reduction we get an optimal parameter-free algorithm, almost for free. Our paper extends a known betting strategy to a data-dependent one.

Is this magical?
No, you could get the same results just running in parallel copies of SGD with different learning rates and then combining them with an algorithm on top. This would lead exactly to the same convergence rate we get with COCOB, but the advantage here is that you just need to run one algorithm!

We refer the interested reader to the paper for many more details.

Code & Usage

Here you find the scripts needed to reproduce the experiment on MNIST data with a fully connected 2-layers network (1000 hidden units each and ReLU activations, mini-batch size of 100) as reported in Figure 2 (top row) of our paper.

To run the code simply cd to the mnist directory and use

python mnist_fully_connected.py

It will create a directory data where the MNIST data are downloaded and saved. In case of any problem with the data download it can also be done manually with

wget https://storage.googleapis.com/cvdf-datasets/mnist/train-images-idx3-ubyte.gz
wget http://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz
wget http://storage.googleapis.com/cvdf-datasets/mnist/t10k-images-idx3-ubyte.gz
wget http://storage.googleapis.com/cvdf-datasets/mnist/t10k-labels-idx1-ubyte.gz

While running, the code will print on screen the training cost and test error per epoch

epoch 0, training cost 7.38284, test error 0.885 
epoch 1, training cost 0.0914842, test error 0.0413998 
epoch 2, training cost 0.0321911, test error 0.0226998
...
epoch 37, training cost 5.72368e-05, test error 0.0164999 
epoch 38, training cost 5.53296e-05, test error 0.0163999 
epoch 39, training cost 5.34391e-05, test error 0.0164999

History

  • 05/31/2017 COCOB version 1.0 - Initial release

cocob's People

Contributors

ttommasi avatar bremen79 avatar

Watchers

James Cloos avatar paper2code - bot avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.