CLIP-lite-android-demo

A demo for running a quantized CLIP model (ViT-B/32) on Android.

Usage

Run this Jupyter notebook to generate the quantized models:

  • clip-image-encoder-quant-int8
  • clip-text-encoder-quant-int8

Place them into app\src\main\assets.

Then build and run in your IDE.

Note: Do NOT use a PyTorch version newer than 1.13, or the conversion to ONNX format will fail.
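
A guard at the top of the notebook can catch an incompatible install before the export step runs. The sketch below is not taken from the original notebook: the helper name `is_supported_torch` is hypothetical, and it assumes that "1.13" covers every 1.13.x patch release.

```python
import re

MAX_SUPPORTED = (1, 13)  # newest PyTorch (major, minor) allowed by the note above


def is_supported_torch(version: str) -> bool:
    """Return True if `version` (e.g. "1.13.1+cu117") is 1.13.x or older.

    Hypothetical helper: treating all 1.13.x patch releases as supported
    is an assumption, not something the original notebook states.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) <= MAX_SUPPORTED


# In the notebook this would be called as is_supported_torch(torch.__version__).
print(is_supported_torch("1.13.1+cu117"))  # True
print(is_supported_torch("2.0.1"))         # False
```

Only the major and minor numbers are compared, so local build suffixes such as `+cu117` are ignored.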

This project is just for testing, so forgive my casual code. Good luck :)

Performance

Model Size

  • Original (Float 32)
    • ImageEncoder: 335 MB
    • TextEncoder: 242 MB
  • Quantized (Int8)
    • ImageEncoder: 91.2 MB
    • TextEncoder: 61.3 MB
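
As a quick check on the figures above, the reduction factor for each encoder can be computed directly. This is plain arithmetic on the listed sizes; the dictionary and function names are illustrative only.

```python
# Model sizes in MB, copied from the list above.
SIZES_MB = {
    "ImageEncoder": {"fp32": 335.0, "int8": 91.2},
    "TextEncoder": {"fp32": 242.0, "int8": 61.3},
}


def compression_ratio(fp32_mb: float, int8_mb: float) -> float:
    """Return how many times smaller the int8 file is than the fp32 file."""
    return fp32_mb / int8_mb


for name, size in SIZES_MB.items():
    ratio = compression_ratio(size["fp32"], size["int8"])
    print(f"{name}: {ratio:.2f}x smaller")
```

Both ratios come out a little under 4x, which is the ceiling when 32-bit weights are stored as 8-bit values. Why they fall short is not stated here; one plausible reason is that some tensors and the quantization parameters remain in floating point.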

Loss

Accuracy compared to the original CLIP ViT-B/32 model:

CIFAR100     int8     Original (fp32)   Loss
2000 pics    0.825    0.871             -0.046
5000 pics    0.830    0.940             -0.11

Speed

Time to encode 500 pics in a single thread:

Device: Xiaomi 12S @ Snapdragon 8+ Gen 1

Resolution   On-disk Size   Model   Time
400px        21KB           fp32    ~54s
400px        21KB           int8    ~20s
1000px       779KB          fp32    ~62s
1000px       779KB          int8    ~27s
4096px       1.7MB          int8    ~60s
4096px       4MB            int8    ~87s
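
The totals in the table can be converted into an average wall-clock time per image, which makes the rows easier to compare. The sketch below only rearranges the numbers already listed; the names `RUNS` and `ms_per_image` are illustrative.

```python
BATCH_SIZE = 500  # images per run, as stated above

# (resolution, on-disk size, model, total seconds) rows copied from the table.
RUNS = [
    ("400px", "21KB", "fp32", 54),
    ("400px", "21KB", "int8", 20),
    ("1000px", "779KB", "fp32", 62),
    ("1000px", "779KB", "int8", 27),
    ("4096px", "1.7MB", "int8", 60),
    ("4096px", "4MB", "int8", 87),
]


def ms_per_image(total_seconds: float, batch_size: int = BATCH_SIZE) -> float:
    """Average wall-clock milliseconds spent on one image."""
    return total_seconds * 1000 / batch_size


for resolution, size, model, seconds in RUNS:
    print(f"{resolution:>6} {size:>6} {model}: ~{ms_per_image(seconds):.0f} ms/image")
```

The 400px int8 run works out to about 40 ms per image, and the 400px fp32 run to about 108 ms. These averages include file reading and decoding, not only the model's forward pass.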

Note:

  • The encode time for each image is about 35 to 45 ms.
  • Images with a larger on-disk size take longer to read. I have tried down-sampling large images while decoding instead of reading the whole file.
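
One common way to down-sample at decode time is to pick a power-of-two subsampling factor that keeps the image just above the model's input size. The sketch below illustrates that idea and is not the app's actual code: `sample_size` is a hypothetical helper, 224 is the input resolution of CLIP ViT-B/32, and the example dimensions assume square images. On Android, a factor like this is the kind of value passed to `BitmapFactory.Options.inSampleSize`.

```python
def sample_size(width: int, height: int, target: int = 224) -> int:
    """Largest power-of-two factor that keeps the shorter side >= target.

    Hypothetical helper for illustration; the source does not say how the
    demo app chooses its down-sampling factor.
    """
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("width, height and target must be positive")
    factor = 1
    # Keep doubling while the next factor still leaves enough pixels.
    while min(width, height) // (factor * 2) >= target:
        factor *= 2
    return factor


for side in (400, 1000, 4096):
    factor = sample_size(side, side)
    print(f"{side}px -> factor {factor}, decoded side {side // factor}px")
```

With these inputs a 4096px image is decoded at 256px (factor 16), a 1000px image at 250px (factor 4), and a 400px image is left at full size.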

Acknowledgement
