docvec's Introduction

DocVec - Wasm meets Semantic search

For a long time, I wanted an excuse to see what all the hype around WebGPU and WebAssembly was about. Then I attended a Rust Wasm meetup and was eager to find a project to learn these technologies.

docVec is a fully working client-side semantic search engine, i.e. the model runs ENTIRELY on the client machine. This is NOT a production-ready project.

My goals for the project were to:

  • Use Rust for NN inference
  • Use the GPU for model inference and see how mature wgpu is. Luckily, I found the amazing wonnx project. I had to hack around some issues with running transformers and implement some missing ONNX operators (cf. PR) for this to work. I am still re-implementing the project's MatMul broadcasting and, where possible, trying to improve the compute shader performance.
  • Implement the whole logic in a WebAssembly module written in Rust. The goal here is to understand some Wasm internals and the limitations that come with them.
  • Keep the JS to a minimum.
  • Don't overcomplicate the search engine. For now, a simple flat vector index suffices.
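
The flat index from the last goal can be sketched in a few lines of Rust. This is a minimal illustration and not docVec's actual code: the `FlatIndex` type, its methods, and the `squared_l2` helper are hypothetical names, and it assumes embeddings are plain `Vec<f32>` values stored next to the text they were computed from. A search compares the query against every stored vector, which is what makes the index "flat".

```rust
/// A flat (brute-force) index: every stored vector is compared to the query.
/// Hypothetical type for illustration only.
pub struct FlatIndex {
    dim: usize,
    vectors: Vec<Vec<f32>>,
    texts: Vec<String>,
}

impl FlatIndex {
    pub fn new(dim: usize) -> Self {
        Self { dim, vectors: Vec::new(), texts: Vec::new() }
    }

    /// Store one embedding together with the text chunk it represents.
    pub fn add(&mut self, vector: Vec<f32>, text: &str) {
        assert_eq!(vector.len(), self.dim, "embedding has the wrong dimension");
        self.vectors.push(vector);
        self.texts.push(text.to_string());
    }

    /// Return the texts of the `k` stored vectors closest to `query`,
    /// nearest first.
    pub fn search(&self, query: &[f32], k: usize) -> Vec<String> {
        assert_eq!(query.len(), self.dim, "query has the wrong dimension");
        // Score every vector, then sort by distance (ascending).
        let mut scored: Vec<(f32, usize)> = self
            .vectors
            .iter()
            .enumerate()
            .map(|(i, v)| (squared_l2(query, v), i))
            .collect();
        scored.sort_by(|a, b| a.0.total_cmp(&b.0));
        scored
            .into_iter()
            .take(k)
            .map(|(_, i)| self.texts[i].clone())
            .collect()
    }
}

/// Squared Euclidean distance; it orders neighbors the same way as L2.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}
```

Squared distance is used because it preserves the L2 ordering without a square root. Scanning and sorting the whole index on every query is the simplest approach and only sensible for small indexes such as a single page.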

Maintainer

  1. Download the gte-small model from Hugging Face

    cd model/
    git clone https://huggingface.co/Supabase/gte-small
  2. Install the ONNX simplifier, onnxsim

  3. Simplify the model and fix the input batch size and sequence length

    python -m onnxsim gte-small/onnx/model.onnx  gte-small/onnx/sim_model.onnx \
     --overwrite-input-shape "input_ids:1,512" "attention_mask:1,512" "token_type_ids:1,512"
  4. Install wasm-pack

    cargo install wasm-pack
  5. Clone the modified version of wonnx (temporary)

    cd ..
    git clone https://github.com/AmineDiro/wonnx.git
    cd wonnx  # enter the clone before switching branches
    git checkout broadcast-matmul
  6. Build the WebAssembly module and serve the page

    cd ..  # go to project root
    ./build.sh && python3 -m http.server 8000

Now you can access the semantic search module at http://localhost:8000 🌟
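
Because step 3 fixes the model inputs to shape `1,512`, every tokenized string has to be padded or truncated to exactly 512 entries before inference. The following is a minimal sketch of that preparation, not docVec's actual code. The names `FixedInputs`, `to_fixed_inputs`, `SEQ_LEN`, and `PAD_ID` are hypothetical. The padding id of 0 assumes a BERT-style vocabulary, and the `i64` element type is also an assumption, so check both against the tokenizer and runtime you use.

```rust
/// Sequence length fixed by `--overwrite-input-shape` in step 3.
pub const SEQ_LEN: usize = 512;
/// Assumption: the tokenizer uses id 0 for padding (BERT-style vocabulary).
pub const PAD_ID: i64 = 0;

/// The three model inputs, each exactly `SEQ_LEN` long (batch size 1).
pub struct FixedInputs {
    pub input_ids: Vec<i64>,
    pub attention_mask: Vec<i64>,
    pub token_type_ids: Vec<i64>,
}

/// Pad or truncate `token_ids` to `SEQ_LEN` and build the matching masks.
pub fn to_fixed_inputs(token_ids: &[i64]) -> FixedInputs {
    // Number of real tokens that fit in the fixed window.
    let used = token_ids.len().min(SEQ_LEN);

    let mut input_ids = token_ids[..used].to_vec();
    input_ids.resize(SEQ_LEN, PAD_ID);

    // 1 marks a real token, 0 marks padding the model should ignore.
    let mut attention_mask = vec![1; used];
    attention_mask.resize(SEQ_LEN, 0);

    FixedInputs {
        input_ids,
        attention_mask,
        // Single-segment input: all tokens belong to segment 0.
        token_type_ids: vec![0; SEQ_LEN],
    }
}
```

Note that this simple truncation cuts off the end of long inputs, including any closing special token. A tokenizer's own truncation setting would normally handle that case.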

TODO:

  • Backend (wasm):

    • Project scaffolding using wasm-bindgen
    • Generate string embedding using wonnx and gte-small model:
      • Add Erf operator to wonnx
      • Modify MatMul broadcasting checks (this is temporary)
      • Reimplement correct MatMul with broadcasting
      • Investigate float NaN issues on Vulkan backend for wgpu
    • Tokenize input in wasm tokenizers
    • Build index:
      • Split page text
      • Embed text using sentence-transformers
      • Load index in wasm module
    • Implement L2 distance and return the k nearest neighbors (as Vec<String>)
  • Frontend:

    • Download an example wiki page as simple HTML
    • Loop over page elements and search for the matching HTML element
    • Highlight just the matching text and a little of the surrounding context
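
For the "Load index in wasm module" item, one option is to ship the precomputed embeddings as a single byte buffer and decode it inside the module. The sketch below assumes a hypothetical serialization: row-major, little-endian `f32` values with no header. The README does not state the format docVec actually uses, and `decode_vectors` is an illustrative name.

```rust
/// Decode a flat little-endian `f32` buffer into rows of length `dim`.
/// Returns `None` if `dim` is zero or the buffer is not a whole number of rows.
/// Assumed layout (hypothetical): row-major, 4 bytes per value, no header.
pub fn decode_vectors(bytes: &[u8], dim: usize) -> Option<Vec<Vec<f32>>> {
    let row_bytes = dim.checked_mul(4)?;
    if row_bytes == 0 || bytes.len() % row_bytes != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(row_bytes)
            .map(|row| {
                row.chunks_exact(4)
                    .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
                    .collect()
            })
            .collect(),
    )
}
```

A plain byte buffer is convenient here because a `&[u8]` is easy to pass across the JS/Wasm boundary. For gte-small, `dim` would be 384, the size of its output embeddings.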

docvec's Issues

Getting error in embedding.rs

Thanks for the tutorial on Wasm and WebGPU using Rust. I was exploring this space, came across your blog, and found other resources too. I tried running docvec but I am getting a size error. It seems related to the array initialisation in the JS code generated at compile time, where the array size is set to 128.
