
progprompt-vh's Introduction

ProgPrompt on VirtualHome

This is the code release for the paper ProgPrompt: Generating Situated Robot Task Plans using Large Language Models. It contains code for replicating the results on the VirtualHome dataset.

Setup

Create a conda environment (or your virtualenv):

conda create -n progprompt python==3.9

Install dependencies:

pip install -r requirements.txt

Clone the VirtualHome repository and, from its root directory, install it by running:

pip install -e .

Note: If you encounter an error about the wrong number of arguments to the function execute, then in the file virtualhome/src/virtualhome/simulation/evolving_graph/execution.py, line 67, add *args to the signature as follows:

    def execute(self, script: Script, state: EnvironmentState, info: ExecutionInfo, char_index, *args):  # *args absorbs any extra positional arguments passed by the caller

This was tested on VirtualHome commit f84ee28a75b23318ee1bf652862b1c993269cd06.

Finally, download the VirtualHome Unity simulator and make sure it runs. The simulator can run on the desktop or on a virtual X server.
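
A quick smoke test you can adapt to confirm the simulator is reachable. This is a sketch, not part of the repository; it assumes the pip-installed package exposes virtualhome.simulation.unity_simulator, and the binary path, port, and display are placeholders for your system:

    from virtualhome.simulation.unity_simulator import comm_unity

    # Launch the downloaded simulator binary and connect to it.
    comm = comm_unity.UnityCommunication(
        file_name="/path/to/v2.3_virtualhome_sim.x86_64",  # placeholder path to the simulator binary
        port="8080",
        x_display="0",  # only needed when running on a virtual X server
    )
    comm.reset(0)  # load scene 0
    success, graph = comm.environment_graph()  # query the scene graph
    print(success, len(graph["nodes"]), "nodes in the scene graph")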

Running evaluation

Here is a minimal example of how to run the evaluation script. Replace the {arguments in curly braces} with values appropriate to your system:

python3 scripts/run_eval.py --progprompt-path $(pwd) --expt-name {expt_name} --openai-api-key {key} --unity-filename {v2.3_virtualhome_sim} --display {0}

For more options and arguments, look inside scripts/run_eval.py.


progprompt-vh's Issues

evaluation metric

In run_eval.py:

    results["overall"] = {
        'PSR': sum(sr)/len(sr),
        "SR": sr.count(1.0)/len(sr),
        "Precision": 1 - sum(unchanged_conds)/sum(total_unchanged_conds),
        "Exec": sum(exec_per_task)/len(exec_per_task)
    }

Could you please explain which of these correspond to "SR", "Exec", and "GCR" in the paper? My understanding is that the paper's SR is computed from "PSR" or "SR" here, and the paper's Exec comes from "Exec" in the code. But how is "GCR" obtained? Is it the same as "Precision", i.e. checking that the executor keeps unchanged the states that should remain unchanged throughout the execution, and translating that into the overlap between the final achieved state g' and the ground-truth final state g?
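
For reference, a rough sketch of the overlap computation described above: the fraction of ground-truth goal conditions that hold in the achieved final state. This is not the repository's implementation; condition_holds is a hypothetical helper that checks a single goal condition against a state:

    def goal_condition_recall(final_state, goal_conditions, condition_holds):
        # Fraction of ground-truth goal conditions satisfied in the achieved final state.
        satisfied = sum(1 for cond in goal_conditions if condition_holds(final_state, cond))
        return satisfied / len(goal_conditions)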

After the program runs, the action commands can be issued, but the VirtualHome window is always a black screen and the animation cannot be seen

[WALK] (205) [0]
[FIND] (247) [0]
[FIND] (272) [0]
[GRAB] (272) [0]
[PUTIN] (272) (247) [0]
[FIND] (271) [0]
[GRAB] (271) [0]
[PUTIN] (271) (247) [0]
Executing: throw away the lime

[WALK] (205) [0]
[FIND] (229) [0]
[FIND] (440) [0]
[GRAB] (440) [0]
[FIND] (229) [0]
[OPEN] (229) [0]
[PUTIN] (440) (229) [0]
[CLOSE] (229) [0]
Executing: put the wine glass in the kitchen cabinet

[WALK] (205) [0]
[FIND] (198) [0]
[GRAB] (198) [0]
[FIND] (236) [0]
[OPEN] (236) [0]
[OPEN] (236) [0]
[PUTIN] (198) (236) [0]
[OPEN] (236) [0]
[CLOSE] (236) [0]
Executing: put the candle on the living room shelf

[WALK] (335) [0]
[FIND] (69) [0]
[GRAB] (69) [0]
[FIND] (43) [0]
[FIND] (69) [0]
[GRAB] (69) [0]
[PUTIN] (69) (250) [0]
Executing: listen to radio

[WALK] (335) [0]
[FIND] (176) [0]
[FIND] (262) [0]
[SWITCHOFF] (262) [0]
[SWITCHON] (262) [0]
[FIND] (176) [0]
[FIND] (428) [0]
[SWITCHON] (428) [0]
[FIND] (176) [0]
[FIND] (176) [0]
[SWITCHOFF] (176) [0]
[SWITCHON] (176) [0]
Executing: bring pillow to the sofa

[WALK] (335) [0]
[FIND] (186) [0]
[GRAB] (186) [0]
[FIND] (368) [0]
[PUTIN] (186) (368) [0]
Executing: open window

[WALK] (70) [0]
[FIND] (70) [0]
[OPEN] (70) [0]
Executing: cut apple

[WALK] (205) [0]
[FIND] (438) [0]
[GRAB] (438) [0]
[FIND] (283) [0]
[GRAB] (283) [0]
[FIND] (231) [0]
[FIND] (438) [0]
[GRAB] (438) [0]
[PUTIN] (438) (231) [0]
[FIND] (231) [0]
[PUTIN] (283) (231) [0]
Executing: wash mug

[WALK] (205) [0]
[FIND] (247) [0]
[FIND] (248) [0]
[FIND] (50) [0]
[FIND] (248) [0]
[SWITCHON] (248) [0]
[FIND] (194) [0]
[FIND] (447) [0]
[FIND] (447) [0]
[GRAB] (447) [0]
[FIND] (247) [0]
[PUTIN] (447) (247) [0]
[FIND] (267) [0]
[GRAB] (267) [0]
[FIND] (247) [0]
[PUTIN] (267) (247) [0]
[FIND] (266) [0]
[GRAB] (266) [0]
[FIND] (247) [0]
[PUTIN] (266) (247) [0]

----Results----
{'PSR': 0.30428571428571427, 'SR': 0.2, 'Precision': 0.9966354241768806, 'Exec': 0.8809396159396158}


OpenAI API Update

Kindly update the code in utils_execute.py to be compatible with the new OpenAI SDK.
When crafting a response (line 38), one now needs to use "openai.completions.create" instead of "openai.Completion.create". The relevant discussion is linked here.
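
A sketch of the migration for the legacy completions endpoint, assuming openai>=1.0; the API key, model name, and prompt below are placeholders, not values from this repository:

    # Old style (openai < 1.0):
    #   response = openai.Completion.create(model=..., prompt=..., max_tokens=...)
    #   text = response["choices"][0]["text"]

    # New style (openai >= 1.0):
    from openai import OpenAI

    client = OpenAI(api_key="YOUR_KEY")  # placeholder key
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",  # example completions-capable model
        prompt="walk to the kitchen",    # placeholder prompt
        max_tokens=256,
    )
    text = response.choices[0].text

In openai>=1.0 the module-level call openai.completions.create(...) also works, which matches the suggestion above.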

I also suggest updating the README to mention the filepath changes that need to be made in virtualhome/virtualhome/__init__.py, and where virtualhome needs to be cloned relative to progprompt, to make setup easier for first-time users.

Sincerely

question for test dataset

I noticed in your paper that the dataset has 70 tasks, but I only found 34 of them on GitHub. Where are the remaining tasks? Thank you.

question for executing generated plan

I ran your code and generated a plan, but while the plan is executing the simulator just switches images very quickly from frame to frame without playing any animation. I wondered whether this is due to a different version of the simulator, but I noticed we are both using version 2.3. If you could provide a download link for the simulator you used, that would be great! Thank you.

gpt_version issues

Hello, author. I am very interested in your work, but I have encountered some issues while studying it and hope you can answer them. I set "gpt_version" in run_eval.py to "gpt-3.5-turbo-instruct" and "gpt-3.5-turbo-instruct-0914". However, with these versions all SR values come out as zero. Is this normal? Also, when using "code-davinci-002", it reports that the model does not exist. Could you please let me know which model versions I can use for training and testing with the current API key?
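
As a side note, a quick way to see which models a given API key can access, sketched with the current OpenAI SDK (openai>=1.0); nothing here is specific to this repository:

    from openai import OpenAI

    client = OpenAI(api_key="YOUR_KEY")  # placeholder key
    # Print the model IDs available to this API key.
    for model in client.models.list():
        print(model.id)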
