Comments (8)
The zsl (zero-shot labeling) example does that. You pass candidate class names with the --text
argument, and zsl scores those classes.
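To make the scoring step concrete, here is a minimal sketch of how zero-shot labeling typically ranks candidates: each candidate text is embedded, compared to the image embedding by cosine similarity, and the similarities are turned into probabilities with a softmax. The toy 3-d vectors, the function names, and the `logit_scale` value are assumptions for illustration; this is not the code used by the zsl example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_scores(image_emb, text_embs, logit_scale=100.0):
    """Return one probability per candidate label.

    logit_scale is an assumed temperature; a real CLIP checkpoint
    carries its own learned value.
    """
    sims = [logit_scale * cosine(image_emb, t) for t in text_embs]
    return softmax(sims)

# Toy embeddings standing in for real CLIP outputs (made up for illustration).
labels = ["a photo of a dog", "a photo of a cat"]
image_emb = [0.9, 0.1, 0.0]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

probs = zero_shot_scores(image_emb, text_embs)
best = labels[probs.index(max(probs))]
print(best, probs)
```

Because the softmax only compares the candidates against each other, the probabilities always sum to 1 even when none of the labels fits the image, which is why the choice of candidates matters.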
from clip.cpp.
Oh, got it. This is called image captioning. Numerous models exist for it, but state-of-the-art results come from models like LLaVA, which is basically CLIP + LLaMA bridged with a linear layer. I'm thinking of creating another project that combines clip.cpp with llama.cpp for efficient LLaVA inference, but that may be delayed for a week or so because there are other features I'd like to implement in this repo first.
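As a rough illustration of what "bridged with a linear layer" means, the sketch below projects a vision feature into the language model's embedding space with a single affine map and places the result in front of the text token embeddings. All sizes, weights, and names here are hypothetical toy values, not taken from LLaVA, clip.cpp, or llama.cpp.

```python
def project(image_features, weight, bias):
    """Apply y = W x + b to map a vision feature into the LLM embedding space."""
    return [
        sum(w * x for w, x in zip(row, image_features)) + b
        for row, b in zip(weight, bias)
    ]

# Hypothetical tiny sizes: a 3-d vision feature and 2-d LLM token embeddings.
vision_feature = [1.0, 2.0, 3.0]
weight = [
    [0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]
bias = [0.0, -1.0]

image_token = project(vision_feature, weight, bias)

# The projected vector sits ahead of the text token embeddings, so the
# language model reads it as part of its input sequence.
text_tokens = [[0.1, 0.2], [0.3, 0.4]]
llm_input = [image_token] + text_tokens
print(llm_input)
```

In a real model the projection would be applied to many image patch features rather than one vector, and its weights would be learned during training.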
If you don't have candidates, a CLIP model won't work for image classification. In that case, I guess your best bet would be to hardcode class names from a common dataset such as OpenImages by modifying the zsl example.
p.s.: I'm planning to experiment with a transfer learning method on the edge to train a single head layer on top of the CLIP backbone for image classification, but I don't know when yet.
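For readers unfamiliar with the idea of a single head layer on a frozen backbone, here is a minimal sketch: the image embeddings are treated as fixed inputs, and only one linear layer is trained with softmax cross-entropy. The 2-d "embeddings", labels, and hyperparameters are made up for illustration, and this is not the method planned for this repo.

```python
import math

def softmax(z):
    """Numerically stable softmax."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def logits_for(weights, bias, x):
    """Linear layer: one logit per class."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def train_head(features, targets, num_classes, lr=0.5, epochs=200):
    """Fit a single linear layer on frozen features with plain SGD."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    bias = [0.0] * num_classes
    for _ in range(epochs):
        for x, y in zip(features, targets):
            probs = softmax(logits_for(weights, bias, x))
            for c in range(num_classes):
                # Gradient of cross-entropy with respect to logit c.
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
                bias[c] -= lr * grad
    return weights, bias

def predict(weights, bias, x):
    """Return the index of the highest-scoring class."""
    scores = logits_for(weights, bias, x)
    return scores.index(max(scores))

# Toy stand-ins for frozen backbone embeddings, with two classes.
features = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8]]
targets = [0, 0, 1, 1]

weights, bias = train_head(features, targets, num_classes=2)
predictions = [predict(weights, bias, x) for x in features]
print(predictions)
```

Since the backbone is never updated, the embeddings can be computed once and cached, which keeps this kind of training cheap.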
Sounds like you want to feed the image embedding to an LLM (like LLaVA).
I do not have a general list of candidates to provide.
Can you go into more detail?
"common dataset such as OpenImages"
Do you mean going to OpenImages and building a text prompt with all of its classes to pass on the command line?
Yes, or only the classes that you are actually expecting to appear in the image. This is how zero-shot labeling is supposed to work.
If you could describe your exact use case, I'd try to make a more detailed comment.
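As a sketch of restricting the candidates to the classes you expect, the snippet below turns a short list of class names into prompts and builds an argument list with one `--text` per prompt. Only the `--text` argument comes from this thread; the binary name `./zsl`, the `-m` and `--image` flags, the file names, the prompt template, and repeating `--text` once per label are all assumptions that should be checked against the example's actual usage text.

```python
import shlex

def build_zsl_command(image_path, labels, model_path="model.bin", binary="./zsl"):
    """Build an argv list with one --text argument per candidate label.

    binary, "-m", and "--image" are hypothetical placeholders.
    """
    args = [binary, "-m", model_path, "--image", image_path]
    for label in labels:
        args += ["--text", label]
    return args

# Only the classes actually expected to appear in the image.
expected = ["cat", "dog", "person"]
labels = [f"a photo of a {name}" for name in expected]

command = build_zsl_command("photo.jpg", labels)
print(shlex.join(command))
```

The list can be handed to `subprocess.run(command)` once the flags are confirmed; passing a list rather than a single string avoids quoting problems with labels that contain spaces.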
Here’s an example. I want to create a Discord bot that watches for a reaction on an image and posts a message describing what the image looks like, for people who have trouble seeing but know what the words mean.