
vlmbench's People

Contributors

kzzheng


vlmbench's Issues

Running in headless mode raises an OpenGL error.

Hi,
I'm trying to use this package, and running python examples/gym_test.py raises the following error:

This plugin does not support createPlatformOpenGLContext!


Error: signal 11:

<path_to_CoppeliaSim_Edu_V4_1_0>/libcoppeliaSim.so.1(_Z11_segHandleri+0x30)[0x2b256a2deae0]
/lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x2b255d1c7090]
<path_to_CoppeliaSim_Edu_V4_1_0>/libQt5Gui.so.5(_ZNK14QOpenGLContext10shareGroupEv+0x0)[0x2b256b992060]
<path_to_CoppeliaSim_Edu_V4_1_0>/libQt5Gui.so.5(_ZN16QOpenGLFunctions25initializeOpenGLFunctionsEv+0x4b)[0x2b256bc5ea4b]
<path_to_CoppeliaSim_Edu_V4_1_0>/libQt5Gui.so.5(_ZN24QOpenGLFramebufferObjectC1EiiNS_10AttachmentEjj+0xc8)[0x2b256bc62a18]
<path_to_CoppeliaSim_Edu_V4_1_0>/libsimExtOpenGL3Renderer.so(_ZN18CFrameBufferObjectC2Eii+0x5a)[0x2b25a0acf24a]
<path_to_CoppeliaSim_Edu_V4_1_0>/libsimExtOpenGL3Renderer.so(_ZN16COpenglOffscreenC1EiiiP14QOpenGLContext+0x72)[0x2b25a0acf602]
<path_to_CoppeliaSim_Edu_V4_1_0>/libsimExtOpenGL3Renderer.so(_Z21executeRenderCommandsbiPv+0x2550)[0x2b25a0acdb90]
<path_to_CoppeliaSim_Edu_V4_1_0>/libcoppeliaSim.so.1(_ZN16CPluginContainer11extRendererEiPv+0x19)[0x2b256a4a8249]
<path_to_CoppeliaSim_Edu_V4_1_0>/libcoppeliaSim.so.1(_ZN13CVisionSensor24_extRenderer_prepareViewEi+0x347)[0x2b256a1af107]
QMutex: destroying locked mutex

I have installed PyRep and CoppeliaSim, and I also set the following environment variables:

export COPPELIASIM_ROOT=EDIT/ME/PATH/TO/COPPELIASIM/INSTALL/DIR
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$COPPELIASIM_ROOT
export QT_QPA_PLATFORM_PLUGIN_PATH=$COPPELIASIM_ROOT

Do you have any tips?
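
One common workaround for this class of failure (untested here) is to give CoppeliaSim a virtual X display so the OpenGL3 renderer can create a context, e.g. via the xvfb package:

sudo apt-get install xvfb
xvfb-run -a -s "-screen 0 1400x900x24" python examples/gym_test.py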

Dataset about long-horizon tasks

Hello! I am interested in obtaining the dataset about long-horizon tasks such as in Figure 1 of the paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation". This dataset would be extremely useful for advancing research in multi-task visual-and-language manipulation learning. If it is not possible to share the dataset, would it be possible to provide guidance on how to generate similar datasets, including long-horizon task trajectories with observations, abstract instructions, and decomposed sub-tasks?

Thank you for your time in considering this request.

A question about the observations

Thanks for your great work!
Are there any segmentation masks or bounding boxes in the dataset?
If not, would you consider adding these annotations, or are there interfaces in this work to extract them?
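
One possible avenue: since amsolver is derived from RLBench, per-camera segmentation masks may already be exposed through the observation config. A sketch assuming the RLBench-style API carries over (the module path and attribute names below are taken from RLBench and are unverified for this fork):

from amsolver.observation_config import ObservationConfig  # path assumed from RLBench

obs_config = ObservationConfig()
obs_config.set_all(True)                     # enable all camera streams
obs_config.left_shoulder_camera.mask = True  # per-camera segmentation masks
obs_config.wrist_camera.mask = True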

Solution to the gdrive OOB problem

Recently, the original gdrive stopped working due to Google's new policy. To solve this problem, please check the detailed discussion in this issue. Here I summarize the key points of one working solution:

  1. Follow the instructions to install the latest Go (at least 1.18).
  2. Clone another gdrive repo.
  3. Get your clientId and clientSecret by following the instructions in that repo.
  4. Edit handlers_drive.go to include your credentials.
  5. Execute compile to get the binary file gdrive.
  6. Run gdrive about to authorize access to your Google Drive.
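
Condensed into shell form, the steps look roughly like this (the repo URL is a placeholder for the fork linked in the issue discussion):

go version                      # must report 1.18 or newer
git clone <gdrive-fork-url> && cd gdrive
# paste your clientId and clientSecret into handlers_drive.go, then:
./compile                       # produces the gdrive binary
./gdrive about                  # starts the OAuth flow to authorize access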

waypoint_type and other fields are None in waypoints_info in cliport_test

Hi VLMBench gurus,

I'm just trying to run the cliport_test script with the provided models.

I installed the code and dependencies, downloaded the trained model, and fetched the seen and unseen splits of the data:
bash download_dataset.sh -s ~/vlmbench/Dataset -p valid_unseen -t pick
bash download_dataset.sh -s ~/vlmbench/Dataset -p valid_seen -t pick

When I run cliport_test, fields such as waypoint_type are all None in the waypoints_info array:
python vlm/scripts/cliport_test.py --task pick --data_folder /Path/Dataset/valid --checkpoints_folder /Path/models

This is the waypoints_info array:
['waypoint0', None, None, 1, None, False, array([ 0.09891216, -0.09790423, 0.85907155, -0.64400905, 0.76425868, -0.01948901, 0.02795067])],
['waypoint1', None, None, 1, None, False, array([ 0.10433818, -0.0974073 , 0.7792573 , -0.64400905, 0.76425868, -0.01948901, 0.02795067])],
['waypoint2', None, None, 1, None, False, array([ 0.09755566, -0.09802846, 0.8790251 , -0.64400905, 0.76425868, -0.01948901, 0.02795067])],
['waypoint3', None, None, 1, None, False, array([ 0.43130848, -0.15798603, 0.85353625, 0.39130512, 0.79267031, -0.0298808 , 0.46654186])]

The code catches this error and says "need re-generate: /Path/valid/seen/pick_cube_shape/variation0/episodes/episode4".

Am I missing some installation step? Am I downloading the correct dataset files?

Thanks,
Le

Error when running task.reset()

When I tried to write my own Python script to run a task, as in RLBench, I encountered two problems:

  1. The images from all five cameras are black.
  2. When execution reaches task.reset(), I get an error with the following message:

External call to simCallScriptFunction failed (_WriteCustomDataBlock@PyRep): Script function does not exist.
After the error occurred, V-REP crashed.
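
For reference, a minimal sketch of this kind of script, assuming amsolver mirrors the RLBench API it forks (module paths and the task class name are guesses based on the traceback below):

from amsolver.environment import Environment          # paths assumed from RLBench
from amsolver.action_modes import ArmActionMode, ActionMode
from vlm.tasks.drop_pen_color import DropPenColor     # class name assumed

action_mode = ActionMode(ArmActionMode.ABS_JOINT_VELOCITY)
env = Environment(action_mode, headless=False)
env.launch()
task = env.get_task(DropPenColor)
descriptions, obs = task.reset()   # crashes here with _WriteCustomDataBlock@PyRep
env.shutdown()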
In terminal:
Traceback (most recent call last):
  File "test.py", line 77, in <module>
    descriptions, obs = task.reset()
  File "/home/xiesenwei/robotics/VLMBench/vlmbench/amsolver/task_environment.py", line 93, in reset
    desc = self._scene.init_episode(
  File "/home/xiesenwei/robotics/VLMBench/vlmbench/amsolver/backend/scene.py", line 131, in init_episode
    self.descriptions = self._active_task.init_episode(index)
  File "/home/xiesenwei/robotics/VLMBench/vlmbench/vlm/tasks/drop_pen_color.py", line 26, in init_episode
    return super().init_episode(index)
  File "/home/xiesenwei/robotics/VLMBench/vlmbench/vlm/tasks/drop_pen.py", line 49, in init_episode
    waypoints = GraspTask.get_path(try_ik_sampling=False, ignore_collisions=True)
  File "/home/xiesenwei/robotics/VLMBench/vlmbench/amsolver/backend/unit_tasks.py", line 284, in get_path
    WriteCustomDataBlock(waypoint.get_handle(),"waypoint_type","pre_grasp")
  File "/home/xiesenwei/robotics/VLMBench/vlmbench/amsolver/backend/utils.py", line 276, in WriteCustomDataBlock
    pyrep_utils.script_call('_WriteCustomDataBlock@PyRep', PYREP_SCRIPT_TYPE,
  File "/home/xiesenwei/anaconda3/envs/vlm/lib/python3.8/site-packages/pyrep/backend/utils.py", line 65, in script_call
    return sim.simExtCallScriptFunction(
  File "/home/xiesenwei/anaconda3/envs/vlm/lib/python3.8/site-packages/pyrep/backend/sim.py", line 698, in simExtCallScriptFunction
    _check_return(ret)
  File "/home/xiesenwei/anaconda3/envs/vlm/lib/python3.8/site-packages/pyrep/backend/sim.py", line 27, in _check_return
    raise RuntimeError(
RuntimeError: The call failed on the V-REP side. Return value: -1

Error: signal 11:

/home/xiesenwei/robotics/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so.1(_Z11_segHandleri+0x30)[0x7f4ba9411ae0]
/lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7f4c4bd49090]
/home/xiesenwei/robotics/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Core.so.5(_ZNK18QThreadStorageData3getEv+0x2b)[0x7f4ba6e3d1eb]
/home/xiesenwei/robotics/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5OpenGL.so.5(_Z19qt_qgl_paint_enginev+0x2d)[0x7f4ba8ba61ed]
/home/xiesenwei/robotics/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Widgets.so.5(_ZN14QWidgetPrivate11repaint_sysERK7QRegion+0x94)[0x7f4ba84a5094]
/home/xiesenwei/robotics/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Widgets.so.5(_ZN14QWidgetPrivate16syncBackingStoreEv+0x5f)[0x7f4ba84bea6f]
/home/xiesenwei/robotics/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Widgets.so.5(_ZN7QWidget5eventEP6QEvent+0x300)[0x7f4ba84d5920]
/home/xiesenwei/robotics/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Widgets.so.5(_ZN19QApplicationPrivate13notify_helperEP7QObjectP6QEvent+0x9c)[0x7f4ba849792c]
/home/xiesenwei/robotics/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Widgets.so.5(_ZN12QApplication6notifyEP7QObjectP6QEvent+0x2b0)[0x7f4ba849ead0]
/home/xiesenwei/robotics/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Core.so.5(_ZN16QCoreApplication15notifyInternal2EP7QObjectP6QEvent+0x108)[0x7f4ba7005008]
QMutex: destroying locked mutex

Meanwhile, I encountered an error while running dataset_generator_NLP.py in the tools directory: "The call failed on the V-REP side. Return value: -1"

Thanks for any help

ImportError pyrep

Hello, I ran into an ImportError:

File "VLMbench/amsolver/__init__.py", line 5, in <module>
    import pyrep
ModuleNotFoundError: No module named 'pyrep'

Next I tried to install this package using pip. However, it raised another error:

File "VLMbench/amsolver/__init__.py", line 9, in <module>
    raise ImportError(
ImportError: PyRep version must be greater than 4.1.0.2. Please update PyRep.

The latest version of pyrep that pip can install is 3.2.0, and I am stuck here now :(
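
Note that the pyrep package on PyPI appears to be an unrelated project, which would explain why pip stops at 3.2.0. The usual route is to install PyRep from source; the sketch below follows the stepjam/PyRep README (COPPELIASIM_ROOT must already be set):

git clone https://github.com/stepjam/PyRep.git
cd PyRep
pip install -r requirements.txt
pip install .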

How to edit obj part textures

Hi, thanks for your splendid work.

I would like to modify the texture of a specific part of an obj, such as the top drawer of an entire drawer model.
From this line in PyRep, it seems possible to change textures programmatically rather than by editing the obj texture itself. Is there any guidance or example code for this?

Thank you!
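
In case it helps, a hedged sketch of per-part texture replacement with PyRep's texture API; it assumes the top drawer is (or can be ungrouped into) its own Shape, and every name and path below is a placeholder:

from pyrep import PyRep
from pyrep.objects.shape import Shape
from pyrep.const import TextureMappingMode

pr = PyRep()
pr.launch('scene_with_drawer.ttt', headless=True)  # placeholder scene
pr.start()

# create_texture loads an image and returns a helper plane plus the texture
plane, texture = pr.create_texture('wood.png')     # placeholder image file
top_drawer = Shape('drawer_top')                   # placeholder object name
top_drawer.set_texture(texture, TextureMappingMode.PLANE)
plane.remove()                                     # helper plane no longer needed

pr.stop()
pr.shutdown()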

Evaluation issues

Hello, I checked your evaluation script cliport_test.py, but I am confused about how the beginning phase is handled.

TwoStreamClipLingUNetLatTransporterAgent.act takes in the observation and instruction at each step, but it directly outputs the place action (skipping pick). So I wonder how this can work at the beginning of task evaluation, when nothing has been picked yet. More generally, this is a question about how you extend a two-stage agent (CLIPort) to an arbitrary number of steps, as claimed in the paper. In other words, directly applying place seems reasonable only when pick can be skipped (e.g., the object has already been picked).

By the way, I cannot access the waypoint information in the small sample dataset. The waypoints might provide some clues to the question above, but for now I have no idea.
