Comments (7)
Hi @tianjunyu0871,
There are two versions of CLIP (ResNet and ViT).
Their encoding sizes differ: 640 for ResNet and 512 for ViT.
I assume this is your issue.
It should be solvable by passing different command-line arguments.
Does this help?
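A minimal sketch of this kind of mismatch, assuming 640-dimensional ResNet features and 512-dimensional ViT features. The helper names `expected_prefix_dim` and `check_embedding` are hypothetical and not part of the repository; they only illustrate why features extracted with one CLIP variant fail when the model was configured for the other:

```python
def expected_prefix_dim(is_rn: bool) -> int:
    """Return the CLIP feature size the mapping network expects (assumed values)."""
    return 640 if is_rn else 512


def check_embedding(embedding: list, is_rn: bool) -> None:
    """Raise a descriptive error if the embedding size does not match the chosen variant."""
    expected = expected_prefix_dim(is_rn)
    if len(embedding) != expected:
        raise ValueError(
            f"CLIP embedding has {len(embedding)} features but the model expects "
            f"{expected}; check that the same CLIP variant was used for feature "
            "extraction and for training."
        )


# A 512-dimensional (ViT-style) embedding passes when is_rn is False.
check_embedding([0.0] * 512, is_rn=False)
```

If the features were extracted with the ResNet variant, the same call would need `is_rn=True`, which mirrors switching the corresponding command-line argument.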
from clip_prefix_caption.
Thank you for your reply.
Does the parameter is_rn stand for ResNet? If so, why does is_rn appear in the following command? Is it a typo?
In addition, could you share the pre-trained weights of the MLP and the evaluation code? Thank you so much!
from clip_prefix_caption.
Yes, this is an error.
Thank you very much for pointing it out.
I will fix it ASAP.
We use the same evaluation code as the OSCAR repository,
just replacing the JSON files with our own.
We already shared the MLP weights; see the "Inference Notebooks" section in the README.
from clip_prefix_caption.
I tried to modify the prediction code, and the following error occurred while loading the pre-trained Transformer weights.
I don't know whether the problem is in my code. Could you share your code for prediction with the Transformer? Thank you very much!
from clip_prefix_caption.
Prediction with the transformer is available in this notebook.
from clip_prefix_caption.
I have learned a lot from your work, but I still have a few questions and hope you can answer them.
First question: I tried removing the stop token, but the results were poor. Is there a good way to generate more than one sentence?
Second question: Have you tried different GPT models, such as GPT2-medium or GPT2-large? Is the difference significant?
Third question: What does the prefix_length_clip parameter mean in training?
Looking forward to your reply. Thank you very much!
from clip_prefix_caption.
To generate more than one sentence, you should replace the inference algorithm (e.g. with beam search).
Using a variant of beam search, you can produce different captions.
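As a rough illustration of that idea, here is a generic beam search over a toy next-token distribution. It is a sketch only, not the repository's inference code: `next_token_probs` is a hypothetical stand-in for the language model, and `toy_model` is invented for the example.

```python
import math
from typing import Callable, Dict, List, Tuple

Beam = Tuple[List[str], float]  # (tokens so far, cumulative log-probability)


def beam_search(
    next_token_probs: Callable[[List[str]], Dict[str, float]],
    beam_size: int = 3,
    max_len: int = 5,
    stop_token: str = ".",
) -> List[Beam]:
    """Return up to `beam_size` candidate sequences, best first."""
    beams: List[Beam] = [([], 0.0)]
    finished: List[Beam] = []
    for _ in range(max_len):
        candidates: List[Beam] = []
        for tokens, score in beams:
            for token, prob in next_token_probs(tokens).items():
                if prob > 0.0:
                    candidates.append((tokens + [token], score + math.log(prob)))
        # Keep only the highest-scoring continuations.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == stop_token:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # include sequences that hit max_len without stopping
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_size]


def toy_model(tokens: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a language model's next-token distribution."""
    if not tokens:
        return {"a": 0.6, "the": 0.4}
    if tokens[-1] in ("a", "the"):
        return {"dog": 0.5, "cat": 0.5}
    return {".": 1.0}


captions = [" ".join(tokens) for tokens, _ in beam_search(toy_model)]
```

Each returned beam is a separate candidate caption, so a beam size greater than one yields several alternatives for the same image.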
We haven't tried different GPT models.
prefix_length_clip controls the transformer mapping network: it is the size (in tokens) of the CLIP embedding, since part of the prefix is a learned constant.
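To make the token counts concrete, here is a small shape calculation. It assumes the CLIP embedding is projected to `prefix_length_clip` tokens and concatenated with `prefix_length` learned constant tokens before entering the transformer. The function `mapper_shapes` is hypothetical, the example values are illustrative, and 768 is the GPT-2 (small) embedding width:

```python
from typing import Dict, Tuple

Shape = Tuple[int, int, int]  # (batch, tokens, embedding width)


def mapper_shapes(
    batch_size: int,
    prefix_length: int,
    prefix_length_clip: int,
    gpt_dim: int = 768,
) -> Dict[str, Shape]:
    """Return the tensor shapes at the input of the mapping transformer (assumed layout)."""
    clip_tokens = (batch_size, prefix_length_clip, gpt_dim)  # projected CLIP embedding
    const_tokens = (batch_size, prefix_length, gpt_dim)      # learned constant prefix
    transformer_input = (batch_size, prefix_length_clip + prefix_length, gpt_dim)
    return {
        "clip_tokens": clip_tokens,
        "const_tokens": const_tokens,
        "transformer_input": transformer_input,
    }


shapes = mapper_shapes(batch_size=2, prefix_length=10, prefix_length_clip=10)
```

Under these assumptions, raising `prefix_length_clip` gives the CLIP embedding more tokens in the transformer's input sequence without changing the number of learned constant tokens.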
from clip_prefix_caption.