Comments (5)

hamishcunningham avatar hamishcunningham commented on June 8, 2024

What machine spec did you choose for the test rig? (One of my students wants to demo Willow at a conference session in a couple of weeks and is wondering whether to use CPU or GPU...)

from willow-inference-server.

kristiankielhofner avatar kristiankielhofner commented on June 8, 2024

That's awesome!

GPU - hands down.

Even if you go with something like a Tesla P4 (lowest cost, lowest power, single slot, passive cooling) or a GTX 1070, it can do most voice-command-length speech segments at 5x realtime (at least). An RTX 4090 (nice!) is 45x! CPU is... Not that.
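To make the "Nx realtime" figures above concrete, here is a minimal sketch of how a realtime multiple relates audio length to processing time. The function names and the 3-second clip length are illustrative assumptions, not part of WIS; only the 5x and 45x multiples come from the comment above.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many times faster than realtime the audio was processed."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def processing_time(audio_seconds: float, factor: float) -> float:
    """Return the seconds needed to process a clip at a given realtime multiple."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return audio_seconds / factor


if __name__ == "__main__":
    clip = 3.0  # hypothetical voice command length in seconds (assumption)
    # Multiples quoted in the thread: 5x (Tesla P4 / GTX 1070), 45x (RTX 4090).
    for label, factor in [("5x", 5.0), ("45x", 45.0)]:
        ms = processing_time(clip, factor) * 1000
        print(f"{label} realtime: about {ms:.0f} ms for a {clip:.0f} s clip")
```

Under these assumptions, a higher multiple translates directly into lower latency for each spoken command, which is what matters for a live demo.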

As long as the CPU isn't terrible, it really doesn't matter as much performance-wise when using GPU. Of course there is some variation, but by far the most complex and performance-intensive tasks in WIS are offloaded to GPU.

hamishcunningham avatar hamishcunningham commented on June 8, 2024

Thanks, I've got a GTX 1070 rig running well, but wondered what config you were planning on working with for CPU. I'd like to experiment with Vicuna too but guess I currently need more than the 8 GB VRAM?

kristiankielhofner avatar kristiankielhofner commented on June 8, 2024

Our CPUs are all over the place - from 6-7 year old Intel i[something], ten year old Xeons, AMD Ryzens, to AMD Threadrippers, etc. I'm hesitant to recommend specific CPUs because there's so much variety and it starts to get into things like types of RAM, etc. There's so much more variation in potential system hardware outside of GPU. My general take is: recent-ish AMDs (Ryzen something, etc) are much better with power and have excellent performance; otherwise anything will work - even REALLY old CPUs that don't even have AVX, etc (if using GPU). When it comes to CPU, WIS really isn't any different from any other application. I would just take what you already know/have experienced with CPUs and apply that to WIS - older CPUs are slower, consume more power, etc. However, if the CPU is especially slow, at a certain point the performance advantage of GPU diminishes significantly.

Vicuna/LLMs are a completely different animal. RTX 3090 is essentially the minimum to have the required VRAM and performance for a reasonable experience. We quantize Vicuna down to 4-bit and that's the only thing that makes it work in even that amount of VRAM.
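As a rough illustration of why 4-bit quantization matters for VRAM, the sketch below estimates the memory taken by model weights alone as parameters times bits per weight, divided by 8 bits per byte. The 13-billion parameter count is an assumption for illustration (the comment above does not say which Vicuna size is used), and the estimate ignores activations, the KV cache, and any other models loaded on the same GPU, so real usage will be higher.

```python
def weight_bytes(num_params: int, bits_per_weight: int) -> float:
    """Estimate the bytes needed to store model weights only."""
    if num_params < 0 or bits_per_weight <= 0:
        raise ValueError("num_params must be >= 0 and bits_per_weight > 0")
    return num_params * bits_per_weight / 8


if __name__ == "__main__":
    params = 13_000_000_000  # hypothetical parameter count (assumption)
    for bits in (16, 8, 4):
        gb = weight_bytes(params, bits) / 1e9  # decimal gigabytes
        print(f"{bits}-bit weights: about {gb:.1f} GB")
```

This is only a lower bound on what the card needs; it shows the direction of the saving from quantization rather than a full VRAM budget.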

hamishcunningham avatar hamishcunningham commented on June 8, 2024

gr8, tnx!
