Comments (5)
Hi @manushree635, please check this page:
https://zenodo.org/records/4783391
from onellm.
They have a CSV file where each audio clip has 5 captions. How did you convert it to the format required by COCO eval?
I'm not able to replicate the Clotho v2 numbers reported in the paper when using all 5 reference captions in Clotho. If you could guide me with this, it would be really helpful.
Also, the SPIDEr score reported for LTU is incorrect: you have reported the SPICE score instead of SPIDEr. That is an unfair comparison, given how much SPIDEr and SPICE scores differ.
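For context on why the two numbers are not interchangeable: SPIDEr is conventionally defined as the arithmetic mean of CIDEr and SPICE (in practice, the CIDEr-D variant computed by the COCO caption toolkit). A minimal sketch of that relationship follows; the function name `spider` is just an illustrative helper, not part of any library.

```python
def spider(cider: float, spice: float) -> float:
    """SPIDEr as the arithmetic mean of the CIDEr and SPICE scores."""
    return (cider + spice) / 2.0
```

Because CIDEr is often several times larger than SPICE on audio captioning benchmarks, a SPICE value placed in a SPIDEr column will generally look much lower than the true SPIDEr.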
The converted annotation file is at:
https://huggingface.co/datasets/csuhan/OneLLM_Eval/blob/main/audio/clothov2/eval_clothocap_ann.json
Sorry for the confusion about LTU. We will update the table soon.
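For readers who want to see what such a conversion might look like, here is a minimal sketch. It is not necessarily how the file above was produced. It assumes the Clotho caption CSV has the columns `file_name` and `caption_1` through `caption_5`, and that the target is the generic COCO caption layout (an `images` list plus an `annotations` list keyed by `image_id`). The function name `clotho_csv_to_coco`, the use of the file name as the image id, and the placeholder `info`, `licenses`, and `type` fields are all assumptions made for illustration.

```python
import csv
import io


def clotho_csv_to_coco(csv_text: str) -> dict:
    """Turn Clotho-style caption CSV text into a COCO-caption-style dict.

    Each CSV row becomes one "image" entry (here: one audio clip) and
    five annotation entries, one per reference caption.
    """
    images = []
    annotations = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        clip_id = row["file_name"]  # assumption: file name doubles as the id
        images.append({"id": clip_id})
        for i in range(1, 6):
            annotations.append({
                "image_id": clip_id,
                "id": len(annotations),  # running unique annotation id
                "caption": row[f"caption_{i}"].strip(),
            })
    return {
        # Placeholders, included in case the loader looks for these keys.
        "info": {},
        "licenses": [],
        "type": "captions",
        "images": images,
        "annotations": annotations,
    }
```

The resulting dict can be written out with `json.dump` and passed to a COCO-style caption evaluator as the reference file. The ids used in the prediction file must match the `image_id` values chosen here.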
Hey @csuhan, I used the annotation file and the checkpoint mentioned in the README. These are the scores I'm getting, which are far from the scores reported in the paper. Is there something I'm doing wrong? Can you please help me with this?
Bleu_1: 0.481
Bleu_2: 0.271
Bleu_3: 0.165
Bleu_4: 0.100
METEOR: 0.139
ROUGE_L: 0.321
CIDEr: 0.237