Comments (5)
What machine spec did you choose for the test rig? (One of my students wants to demo Willow at a conference session in a couple of weeks, and is wondering whether to use CPU or GPU...)
from willow-inference-server.
That's awesome!
GPU - hands down.
Even if you go with something like a Tesla P4 (lowest cost, lowest power, single slot, passive cooling) or a GTX 1070, it can do most voice-command-length speech segments at 5x realtime (at least). An RTX 4090 (nice!) is 45x! CPU is... not that.
As long as the CPU isn't terrible, it really doesn't matter as much performance-wise when using GPU. There is some variation, of course, but by far the most complex and performance-intensive tasks in WIS are offloaded to the GPU.
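For a rough sense of what those realtime factors mean in practice, here's a back-of-the-envelope sketch. The 3-second command length is an assumption for illustration, not a number from this thread:

```python
# Rough wall-clock latency implied by the realtime factors quoted above.
# "5x realtime" means 5 seconds of audio transcribed per 1 second of compute.

def transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Approximate wall-clock time to transcribe a speech segment."""
    return audio_seconds / realtime_factor

command = 3.0  # assumed length of a typical short voice command, in seconds

print(f"P4 / GTX 1070 (~5x): {transcription_seconds(command, 5):.2f} s")
print(f"RTX 4090 (~45x):     {transcription_seconds(command, 45):.2f} s")
```

Either way the response lands well under a second for a short command, which is why the comment above treats the GPU choice as less critical than simply having one.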
Thanks, I've got a GTX 1070 rig running well, but I wondered what config you were planning to work with for CPU. I'd like to experiment with Vicuna too, but I'm guessing I currently need more than 8 GB of VRAM?
Our CPUs are all over the place: 6-7-year-old Intel i[something]s, ten-year-old Xeons, AMD Ryzens, AMD Threadrippers, etc. I'm hesitant to recommend specific CPUs because there's so much variety, and it quickly gets into things like RAM type, etc. There's far more variation in potential system hardware outside of the GPU. My general take: recent-ish AMDs (Ryzen something, etc.) are much better on power and have excellent performance; otherwise anything will work, even REALLY old CPUs that don't have AVX, etc. (if using GPU). When it comes to CPU, WIS really isn't any different from any other application. I would take what you already know/have experienced with CPUs and apply it to WIS: older CPUs are slower, consume more power, and so on. That said, if the CPU is especially low-performance, at a certain point the performance advantages of the GPU diminish significantly.
Vicuna/LLMs are a completely different animal. RTX 3090 is essentially the minimum to have the required VRAM and performance for a reasonable experience. We quantize Vicuna down to 4-bit and that's the only thing that makes it work in even that amount of VRAM.
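The VRAM arithmetic behind that 4-bit claim is easy to sketch. A minimal example, assuming the 13B-parameter Vicuna variant (the thread doesn't say which model size is used):

```python
# Back-of-the-envelope VRAM needed for the model weights alone. This ignores
# activations, KV cache, and framework overhead, so real usage is higher.

def weight_vram_gb(n_params: float, bits_per_param: int) -> float:
    """Decimal GB of memory required to hold the weights at a given precision."""
    return n_params * bits_per_param / 8 / 1e9

n_params = 13e9  # assumed: Vicuna-13B; the thread doesn't name the variant

print(f"fp16 : {weight_vram_gb(n_params, 16):.1f} GB")  # more than a 3090's 24 GB
print(f"4-bit: {weight_vram_gb(n_params, 4):.1f} GB")   # leaves headroom on 24 GB
```

At fp16 the weights alone already exceed a 3090's 24 GB, while 4-bit quantization brings them down to a quarter of that, which matches the point above that quantization is what makes it fit at all.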
gr8, tnx!