Code Monkey home page Code Monkey logo

klaus's Introduction

Klaus

This is a show-case of how to run a state-of-the-art image classifier from iOS devices.

Warning ahead, this is a show-case, and it is open-source. However, the ccv's Xcode project presented here will not be maintained. Using full ccv implementation: http://github.com/liuliu/ccv

Challenge

Deep learning based image classifier normally has large memory footprint. ccv's default image classifier uses around 220MiB memory, which is reasonable on desktop, but a bit too much on mobile devices. A deep learning based image classifier also comes with large data files (so-called pre-trained model), ccv's pre-trained model is about 110MiB, Caffe's is about 200MiB, and OverFeat's is about 1GiB. Delivering a mobile utility app with 100MiB data file is quite unreasonable.

In Klaus, accuracy is scarified for smaller memory footprint as well as smaller data files. Specifically, on imageNet 2012, ccv's default image classifier has top-5 missing rate at 16.17% (meaning that given an image, for total 5 guesses, ccv's default image classifier has 83.83% chances to get it right). The mobile-friendly image classifier has top-5 missing rate at 18.22%. This is achieved with a 19.3MiB pre-trained model.

Detail

The first difference of this new mobile-friendly image classifier is its full connect layer size. ccv's default image classifier follows Matt's model, and thus, the full connect layer has 4096 neurons. The new mobile-friendly image classifier has only 2048 neurons for that layer. This change effectively cut memory footprint by half, and it is where the accuracy loss comes from. The pre-trained mobile friendly model can be downloaded from: http://static.libccv.org/image-net-2012-mobile.sqlite3.

ccv's default image classifier already compresses its parameters with half-precision float-point. We've done comparison previously and it showed no loss of accuracy by going from 32-bit to 16-bit. Since full connect layer resides most model parameters, I've done more experiments on what accuracy loss we are looking at when going more aggressive on quantization.

I've using this code snippet to generate quantization table for full connect layers with K-mean algorithm: https://gist.github.com/liuliu/9117a0011a682ab231d3.

After the quantization table generated, full connect layer's model parameters are exported to a PNG file with this code snippet: https://gist.github.com/liuliu/970d97db15f47c196454.

With this code snippet: https://gist.github.com/liuliu/9c737fa53a62d7165f2c, the quantized model parameters are loaded back for analysis.

It turns out that the loss of accuracy is not much.

8-bit: 41.15% (1), 18.18% (5) (for one guess, it has 58.85% chances to get it right, for 5 guesses, it has 81.82% changes to get it right)

4-bit: 41.38% (1), 18.28% (5)

2-bit: 45.22% (1), 20.62% (5)

To strike the balance between accuracy and size, a mixed model is chose: the first full connect layer will be quantized to 4-bit, the other full connect layers will be quantized to 8-bit, which gives 18.20% top-5 missing rate.

The quantization table later is stored at half-precision inside the sqlite3 model with this code snippet: https://gist.github.com/liuliu/1f3e2c1fceb5f1b47dc5.

The full connect layer parameters are stored separately into a series of PNG files, you can use this code snippet to load it back: https://gist.github.com/liuliu/10b7b067ace070ee7e33.

The beautiful part of using PNG file to store parameters is that all optimization techniques available to PNG is now available to us. The generated PNG file can be further reduced with pngcrush -brute. In fact, 10% data file size reduced with pngcrush.

ccv's newer CPU implementation also uses NEON instruction set to speed up convolutional layer computation, thus, the forward pass can be completed around 3 seconds on iPhone 5 (averaging 10 patches).

More

This is obviously a very early presentation about how to run deep learning based image classifier on mobile devices. A few explorations based on these early results may possible. For example, memory footprint with the mobile-friendly version is still around 100MiB, however, with quantization, the full connect layers could expect 4 times smaller memory footprint when implemented properly. The full connect layers may be further deepened but with fewer neurons each layer to get better accuracy and performance.

As always, Klaus is open-sourced under BSD 3-clause license, and the pretrained model is under Creative Commons Attribution 4.0 International License. Hope you will enjoy it.

klaus's People

Contributors

liuliu avatar

Watchers

James Cloos avatar DL avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.