
dsg's People

Contributors

j-min, jponttuset


dsg's Issues

Which field is the annotation id?

Thanks for your amazing work!
I have a couple of questions about the annotations (tifa_v1.0_question_answers.json):
1. Did you annotate DSG from COCO 2014 or 2017? Is "coco_val_id" the image id? I cannot find the corresponding number in instances_val2017.json or instances_train2017.json (see the lookup sketch after the example entry below).
2. Which number is the COCO annotation id?
{
"id": "coco_435097",
"caption": "Some very big furry brown bears in a big grass field.",
"question": "is this a grass field?",
"choices": [
"yes",
"no"
],
"answer": "yes",
"element_type": "location",
"element": "grass field",
"coco_val_id": "471450"
},
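
A minimal sketch for checking which COCO val split contains a given "coco_val_id" such as 471450 (the annotation file paths are assumptions about a local COCO download, not files from this repo):

import json

target_id = 471450  # the "coco_val_id" value from the entry above

for path in ["annotations/instances_val2014.json",
             "annotations/instances_val2017.json"]:
    with open(path) as f:
        image_ids = {img["id"] for img in json.load(f)["images"]}
    print(path, "contains image id", target_id, ":", target_id in image_ids)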

Detailed dependencies for reproducing CLIPScore on TIFA160

Hey there, thank you for the great work! It really inspires my work.
I tried to reproduce Table 12, specifically CLIPScore.

I found that with the CLIPScore repo, different package versions (such as Pillow 8.4 vs. 9.4, torch 1.7 vs. 2.0, numpy 1.20.0 or higher) return different values, and consequently different correlation values. Also, clipscore prepends the prefix "A photo depicts ", but the TIFA v1.0 CLIPScore seems to correspond to using no prefix at all.
When I reproduce on TIFA160, I get slightly different values
(the DSG paper reports 0.276 / 0.191):

  • Pillow==9.4.0
    • prefix "A photo depicts ": 0.299 / 0.226
    • prefix "": 0.279 / 0.209
  • Pillow==8.4.0
    • prefix "A photo depicts ": 0.285 / 0.215
    • prefix "": 0.266 / 0.199

It would be really helpful if you could provide the package dependencies you used for the paper and whether you used the prefix when calculating CLIPScore.
Thanks!
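
For reference, this is the per-image CLIP-S computation I am assuming: a minimal sketch using the openai/CLIP package and the standard 2.5 * max(cos, 0) definition, with the prefix exposed as an argument. The image path and caption below are placeholders, and this is not necessarily the setup used for the paper.

import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def clip_score(image_path, caption, prefix="A photo depicts "):
    # Encode one image and one (optionally prefixed) caption, then score.
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0).to(device)
    text = clip.tokenize([prefix + caption], truncate=True).to(device)
    with torch.no_grad():
        img_feat = model.encode_image(image)
        txt_feat = model.encode_text(text)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    return 2.5 * max((img_feat @ txt_feat.T).item(), 0.0)

# Hypothetical usage, with and without the prefix:
# clip_score("images/coco_435097.png", "Some very big furry brown bears in a big grass field.")
# clip_score("images/coco_435097.png", "Some very big furry brown bears in a big grass field.", prefix="")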

Could you open-source the annotation system?

Hello,

I've recently delved into the Davidsonian Scene Graph and found it absolutely fascinating. I want to follow in your footsteps and continue exploring this domain.
However, I am stuck on building the annotation system. Could you open-source the DSG annotation system?

Thank you for your time and consideration, and once again, kudos to the team for such an outstanding job!

Best regards!

How to create the '_OAI_KEY.txt' file

Hi, I get 'FileNotFoundError: [Errno 2] No such file or directory: './_OAI_KEY.txt'' when trying to run 'ti2_eval_example.ipynb'. How do I create my own '_OAI_KEY.txt'?
Could you help me, please?
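
A minimal sketch that creates the file, assuming the notebook simply reads a plain-text OpenAI API key from ./_OAI_KEY.txt (the exact format expected by ti2_eval_example.ipynb is not confirmed here):

# Write your own OpenAI API key to the path the notebook looks for.
# The key value below is a placeholder.
with open("./_OAI_KEY.txt", "w") as f:
    f.write("sk-<your-openai-api-key>")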

Duplicated text prompts?

Hi,

It seems items midjourney_61 and midjourney_65 are identical. Are there plans to update the 1k set?
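
A minimal sketch for listing items whose prompt text is identical, such as the reported midjourney_61 / midjourney_65 pair (the file name and the {item_id: prompt} layout are assumptions, not the repo's actual data format):

import json
from collections import defaultdict

with open("dsg_1k_prompts.json") as f:  # hypothetical path
    id2prompt = json.load(f)            # assumed mapping: item_id -> prompt text

text2ids = defaultdict(list)
for item_id, prompt in id2prompt.items():
    text2ids[prompt.strip().lower()].append(item_id)

for prompt, ids in text2ids.items():
    if len(ids) > 1:
        print(ids, "->", prompt)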

Indentation level of some code

DSG/query_utils.py, lines 350 to 389 at 3f844c1:

# 2) Run LM calls
if verbose:
    print(f"Running LM calls with {num_workers} workers.")
if num_workers == 1:
    total_output = []
    for kwargs in tqdm.tqdm(total_kwargs):
        prompt = kwargs["prompt"]
        output = generate_fn(prompt)
        total_output += [output]
else:
    from multiprocessing import Pool
    with Pool(num_workers) as p:
        total_inputs = [d['prompt'] for d in total_kwargs]
        total_output = list(
            tqdm.tqdm(p.imap(generate_fn, total_inputs), total=len(total_inputs)))

    # 3) Postprocess LM outputs
    id2outputs = {}
    for i, id_ in enumerate(
        tqdm.tqdm(
            ids,
            dynamic_ncols=True,
            ncols=80,
            disable=not verbose,
            desc="Postprocessing LM outputs"
        )
    ):
        test_input = id2inputs[id_]["input"]
        raw_prediction = total_output[i]
        prediction = parse_fn(raw_prediction).strip()
        out_datum = {}
        out_datum["id"] = id_
        out_datum["input"] = test_input
        out_datum["output"] = prediction
        id2outputs[id_] = out_datum

Hello,

Thank you for sharing your great work!

Upon attempting to generate a DSG, I've noticed that the indentation of certain lines may be incorrect. It appears they should be indented one level less, as the current formatting causes an exception to be thrown. Once I adjusted the indentation level, the generation process proceeded without any issues.
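
For concreteness, the adjustment I made was to dedent the postprocessing block one level, so it runs for both the single-worker and multiprocessing paths (shown here as I understand the intended structure; the exact affected lines may differ):

# 3) Postprocess LM outputs  (dedented to the same level as "# 2) Run LM calls")
id2outputs = {}
for i, id_ in enumerate(
    tqdm.tqdm(
        ids,
        dynamic_ncols=True,
        ncols=80,
        disable=not verbose,
        desc="Postprocessing LM outputs"
    )
):
    test_input = id2inputs[id_]["input"]
    raw_prediction = total_output[i]
    prediction = parse_fn(raw_prediction).strip()
    out_datum = {}
    out_datum["id"] = id_
    out_datum["input"] = test_input
    out_datum["output"] = prediction
    id2outputs[id_] = out_datum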

Qinyu
