
skill-chaining's Introduction

Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization

[Project website] [Paper]

This project is a PyTorch implementation of Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization, published in CoRL 2021.

Note that Unity rendering for the IKEA Furniture Assembly Environment is temporarily unavailable because the Unity-MuJoCo plugin is deprecated in the new version of MuJoCo (2.1). It still works with MuJoCo 2.0.

Files and Directories

  • run.py: launches an appropriate trainer based on algorithm
  • policy_sequencing_trainer.py: trainer for policy sequencing
  • policy_sequencing_agent.py: model and training code for policy sequencing
  • policy_sequencing_rollout.py: rollout with policy sequencing agent
  • policy_sequencing_config.py: hyperparameters
  • method/: implementation of IL and RL algorithms
  • furniture/: IKEA furniture environment
  • demos/: default demonstration directory
  • log/: default training log directory
  • result/: evaluation result directory

Prerequisites

  • Ubuntu 18.04 or above
  • Python 3.6
  • MuJoCo 2.1

Installation

  1. Clone this repository and submodules.
$ git clone --recursive git@github.com:clvrai/skill-chaining.git
  2. Install MuJoCo 2.1 and add the following environment variables to ~/.bashrc or ~/.zshrc. Note that the code is also compatible with MuJoCo 2.0, which supports Unity rendering.
# download mujoco 2.1
$ mkdir ~/.mujoco
$ wget https://mujoco.org/download/mujoco210-linux-x86_64.tar.gz -O mujoco_linux.tar.gz
$ tar -xvzf mujoco_linux.tar.gz -C ~/.mujoco/
$ rm mujoco_linux.tar.gz

# add mujoco to LD_LIBRARY_PATH
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin

# for GPU rendering
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia

# only for a headless server
$ export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so
  3. Install Python dependencies
$ sudo apt-get install cmake libopenmpi-dev libgl1-mesa-dev libgl1-mesa-glx libosmesa6-dev patchelf libglew-dev

# software rendering
$ sudo apt-get install libgl1-mesa-glx libosmesa6 patchelf

# window rendering
$ sudo apt-get install libglfw3 libglew2.0
  4. Install the furniture and method submodules
$ cd furniture
$ pip install -e .
$ cd ../method
$ pip install -e .
$ pip install torch torchvision
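
A quick way to confirm that the environment variables from step 2 took effect is to inspect LD_LIBRARY_PATH from Python. The following is a minimal sketch, not part of the repository; the helper names `mujoco_bin_dir` and `mujoco_on_library_path` are hypothetical, and the path assumes the default `~/.mujoco/mujoco210` location used in the commands above.

```python
import os
import posixpath


def mujoco_bin_dir(home: str) -> str:
    """Directory that the install step appends to LD_LIBRARY_PATH."""
    return posixpath.join(home, ".mujoco", "mujoco210", "bin")


def mujoco_on_library_path(env: dict, home: str) -> bool:
    """Return True if the MuJoCo 2.1 bin directory is listed in LD_LIBRARY_PATH."""
    raw = env.get("LD_LIBRARY_PATH", "")
    # Entries are colon-separated; ignore empty entries and trailing slashes.
    entries = [entry.rstrip("/") for entry in raw.split(":") if entry]
    return mujoco_bin_dir(home) in entries


if __name__ == "__main__":
    home = os.path.expanduser("~")
    if mujoco_on_library_path(dict(os.environ), home):
        print("MuJoCo bin directory found in LD_LIBRARY_PATH")
    else:
        print(f"Missing from LD_LIBRARY_PATH: {mujoco_bin_dir(home)}")
```

Run it in a fresh shell after editing ~/.bashrc or ~/.zshrc, since the exports only apply to shells started afterwards.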

Usage

For chair_ingolf_0650, simply replace table_lack_0825 with chair_ingolf_0650 in the commands. To train on a GPU, specify the desired GPU number (e.g. --gpu 0). To change the random seed, append, e.g., --seed 0 to the command.

To enable wandb logging, add the following arguments with your wandb entity and project names: --wandb True --wandb_entity [WANDB ENTITY] --wandb_project [WANDB_PROJECT].

  1. Generate demos
# Sub-task demo generation
python -m furniture.env.furniture_sawyer_gen --furniture_name table_lack_0825 --demo_dir demos/table_lack/ --reset_robot_after_attach True --max_episode_steps 200 --num_connects 1 --n_demos 200 --start_count 0 --phase_ob True
python -m furniture.env.furniture_sawyer_gen --furniture_name table_lack_0825 --demo_dir demos/table_lack/ --reset_robot_after_attach True --max_episode_steps 200 --num_connects 1 --n_demos 200 --preassembled 0 --start_count 1000 --phase_ob True
python -m furniture.env.furniture_sawyer_gen --furniture_name table_lack_0825 --demo_dir demos/table_lack/ --reset_robot_after_attach True --max_episode_steps 200 --num_connects 1 --n_demos 200 --preassembled 0,1 --start_count 2000 --phase_ob True
python -m furniture.env.furniture_sawyer_gen --furniture_name table_lack_0825 --demo_dir demos/table_lack/ --reset_robot_after_attach True --max_episode_steps 200 --num_connects 1 --n_demos 200 --preassembled 0,1,2 --start_count 3000 --phase_ob True

# Full-task demo generation
python -m furniture.env.furniture_sawyer_gen --furniture_name table_lack_0825 --demo_dir demos/table_lack_full/ --reset_robot_after_attach True --max_episode_steps 800 --num_connects 4 --n_demos 200 --start_count 0 --phase_ob True
  2. Train sub-task policies
mpirun -np 16 python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_0 --num_connects 1 --run_prefix p0
mpirun -np 16 python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_1 --num_connects 1 --preassembled 0 --run_prefix p1 --load_init_states log/table_lack_0825.gail.p0.123/success_00024576000.pkl
mpirun -np 16 python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_2 --num_connects 1 --preassembled 0,1 --run_prefix p2 --load_init_states log/table_lack_0825.gail.p1.123/success_00030310400.pkl
mpirun -np 16 python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_3 --num_connects 1 --preassembled 0,1,2 --run_prefix p3 --load_init_states log/table_lack_0825.gail.p2.123/success_00027852800.pkl
  3. Collect successful terminal states from sub-task policies. Find the best-performing checkpoint in WandB and replace the checkpoint path with it (e.g. --init_ckpt_path log/table_lack_0825.gail.p0.123/ckpt_00021299200.pt).
python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_0 --num_connects 1 --run_prefix p0 --is_train False --num_eval 200 --record_video False --init_ckpt_path log/table_lack_0825.gail.p0.123/ckpt_00000000000.pt
python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_1 --num_connects 1 --preassembled 0 --run_prefix p1 --is_train False --num_eval 200 --record_video False --init_ckpt_path log/table_lack_0825.gail.p1.123/ckpt_00000000000.pt
python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_2 --num_connects 1 --preassembled 0,1 --run_prefix p2 --is_train False --num_eval 200 --record_video False --init_ckpt_path log/table_lack_0825.gail.p2.123/ckpt_00000000000.pt
python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_3 --num_connects 1 --preassembled 0,1,2 --run_prefix p3 --is_train False --num_eval 200 --record_video False --init_ckpt_path log/table_lack_0825.gail.p3.123/ckpt_00000000000.pt
  4. Train skill chaining. Use the best-performing checkpoints (--ps_ckpts) and their successful terminal states (--ps_load_init_states).
# Ours
mpirun -np 16 python -m run --algo ps --furniture_name table_lack_0825 --num_connects 4 --run_prefix ours \
--ps_ckpts log/table_lack_0825.gail.p0.123/ckpt_00021299200.pt,log/table_lack_0825.gail.p1.123/ckpt_00021299200.pt,log/table_lack_0825.gail.p2.123/ckpt_00021299200.pt,log/table_lack_0825.gail.p3.123/ckpt_00021299200.pt \
--ps_load_init_states log/table_lack_0825.gail.p0.123/success_00021299200.pkl,log/table_lack_0825.gail.p1.123/success_00021299200.pkl,log/table_lack_0825.gail.p2.123/success_00021299200.pkl,log/table_lack_0825.gail.p3.123/success_00021299200.pkl \
--ps_demo_paths demos/table_lack/Sawyer_table_lack_0825_0,demos/table_lack/Sawyer_table_lack_0825_1,demos/table_lack/Sawyer_table_lack_0825_2,demos/table_lack/Sawyer_table_lack_0825_3

# Policy Sequencing (Clegg et al. 2018)
mpirun -np 16 python -m run --algo ps --furniture_name table_lack_0825 --num_connects 4 --run_prefix ps \
--ps_ckpts log/table_lack_0825.gail.p0.123/ckpt_00021299200.pt,log/table_lack_0825.gail.p1.123/ckpt_00021299200.pt,log/table_lack_0825.gail.p2.123/ckpt_00021299200.pt,log/table_lack_0825.gail.p3.123/ckpt_00021299200.pt \
--ps_load_init_states log/table_lack_0825.gail.p0.123/success_00021299200.pkl,log/table_lack_0825.gail.p1.123/success_00021299200.pkl,log/table_lack_0825.gail.p2.123/success_00021299200.pkl,log/table_lack_0825.gail.p3.123/success_00021299200.pkl \
--ps_demo_paths demos/table_lack/Sawyer_table_lack_0825_0,demos/table_lack/Sawyer_table_lack_0825_1,demos/table_lack/Sawyer_table_lack_0825_2,demos/table_lack/Sawyer_table_lack_0825_3
  5. Train baselines
# BC
python -m run --algo bc --max_global_step 1000 --furniture_name table_lack_0825 --demo_path demos/table_lack_full/Sawyer_table_lack_0825 --record_video False --run_prefix bc --gpu 0

# GAIL
mpirun -np 16 python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack_full/Sawyer_table_lack_0825 --num_connects 4 --max_episode_steps 800 --max_global_step 200000000 --run_prefix gail --gail_env_reward 0

# GAIL+PPO
mpirun -np 16 python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack_full/Sawyer_table_lack_0825 --num_connects 4 --max_episode_steps 800 --max_global_step 200000000 --run_prefix gail_ppo

# PPO
mpirun -np 16 python -m run --algo ppo --furniture_name table_lack_0825 --num_connects 4 --max_episode_steps 800 --max_global_step 200000000 --run_prefix ppo
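
The sub-task commands in steps 1 to 4 follow a regular pattern: sub-task i uses --preassembled 0,...,i-1 and --run_prefix p{i}, and the chaining step joins one checkpoint, one success file, and one demo path per sub-task with commas. The sketch below only builds those argument strings; it is not part of the repository. The function names are hypothetical, and the seed suffix 123 and the demos/table_lack/ directory are assumptions copied from the example paths above.

```python
FURNITURE = "table_lack_0825"
SEED = 123  # assumption: taken from example run directories such as "...gail.p0.123"
DEMO_DIR = "demos/table_lack"  # assumption: the --demo_dir used in step 1


def preassembled_arg(subtask: int) -> str:
    """Parts attached before sub-task i: "" for 0, "0" for 1, "0,1" for 2, ..."""
    return ",".join(str(i) for i in range(subtask))


def run_dir(subtask: int) -> str:
    """Log directory of the GAIL run for one sub-task."""
    return f"log/{FURNITURE}.gail.p{subtask}.{SEED}"


def chaining_args(best_steps):
    """Build the comma-separated `--algo ps` arguments.

    best_steps[i] is the training step of the best checkpoint of sub-task i.
    """
    ckpts, states, demos = [], [], []
    for i, step in enumerate(best_steps):
        # File names use an 11-digit, zero-padded step count.
        ckpts.append(f"{run_dir(i)}/ckpt_{step:011d}.pt")
        states.append(f"{run_dir(i)}/success_{step:011d}.pkl")
        demos.append(f"{DEMO_DIR}/Sawyer_{FURNITURE}_{i}")
    return {
        "--ps_ckpts": ",".join(ckpts),
        "--ps_load_init_states": ",".join(states),
        "--ps_demo_paths": ",".join(demos),
    }


if __name__ == "__main__":
    for flag, value in chaining_args([21299200] * 4).items():
        print(flag, value)
```

Passing a different step per sub-task avoids hand-editing four long paths when the best checkpoints differ between sub-tasks.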

Citation

If you find this useful, please cite

@inproceedings{lee2021adversarial,
  title={Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization},
  author={Youngwoon Lee and Joseph J. Lim and Anima Anandkumar and Yuke Zhu},
  booktitle={Conference on Robot Learning},
  year={2021},
}

skill-chaining's Issues

MPI fails when trainer has `--wandb True`

Good day,

Given how important wandb is for ablation studies, it would be very helpful to get it running without crashing the script. I understand from #1 that this does not seem to happen on your side; however, it is not an issue with the combination of MPI and wandb alone either.

Running a test script like the following with mpirun -n 1 is fine.

import json
import wandb

wandb_entity = "my-entity"
wandb_project = "my-project"

exclude = ["device"]

with open('~/skill-chaining/log/table_lack_0825.gail.p0.123/params.json', "r") as fp:
    cdict = json.load(fp)

wandb.init(
    resume='table_lack_0825.gail.p0.123',
    project=wandb_project,
    config={k: v for k, v in cdict.items() if k not in exclude},
    dir='~/skill-chaining/log/table_lack_0825.gail.p0.123',
    entity=wandb_entity,
    notes='',
    mode="online",
)
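
One detail to watch if the test script is copied verbatim: Python's open() does not expand a leading ~, so a path written literally as '~/skill-chaining/...' raises FileNotFoundError (the paths here were presumably shortened for posting). A minimal sketch of the config-loading part with the expansion made explicit; `load_wandb_config` is a hypothetical helper name, not something from the repository or from wandb.

```python
import json
import os


def load_wandb_config(path, exclude=("device",)):
    """Load params.json, expanding a leading "~", and drop the excluded keys."""
    with open(os.path.expanduser(path), "r") as fp:
        cdict = json.load(fp)
    # Same filtering as the dict comprehension passed to wandb.init above.
    return {k: v for k, v in cdict.items() if k not in exclude}
```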

Using MPI with run.py and wandb enabled, however, crashes the script. It is neither a resource issue nor an error inherent to combining MPI and wandb:

$ mpirun -n 1 python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_0 --num_connects 1 --run_prefix p0 --gpu 0 --wandb True --max_global_step 100000000 --wandb_entity my-entity --wandb_project my-project
pybullet build time: Apr 21 2022 20:41:06
[DEBUG] Wandb Init Before
~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/torchvision/transforms/functional_pil.py:228: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  interpolation: int = Image.BILINEAR,
~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/torchvision/transforms/functional_pil.py:295: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  interpolation: int = Image.NEAREST,
~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/torchvision/transforms/functional_pil.py:328: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  interpolation: int = Image.BICUBIC,
wandb: Currently logged in as: my-team (use `wandb login --relogin` to force relogin)
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  getting local rank failed
  --> Returned value No permission (-17) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_init failed
  --> Returned value No permission (-17) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "No permission" (-17) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[digi2:2953274] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
Problem at: ~/skill-chaining/method/robot_learning/main.py 133 _make_log_files
Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 995, in init
    run = wi.init()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 648, in init
    backend.cleanup()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/backend/backend.py", line 246, in cleanup
    self.interface.join()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 475, in join
    super().join()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface.py", line 653, in join
    _ = self._communicate_shutdown()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 472, in _communicate_shutdown
    _ = self._communicate(record)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 226, in _communicate
    return self._communicate_async(rec, local=local).get(timeout=timeout)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 231, in _communicate_async
    raise Exception("The wandb backend process has shutdown")
Exception: The wandb backend process has shutdown
wandb: ERROR Abnormal program exit
Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 995, in init
    run = wi.init()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 648, in init
    backend.cleanup()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/backend/backend.py", line 246, in cleanup
    self.interface.join()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 475, in join
    super().join()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface.py", line 653, in join
    _ = self._communicate_shutdown()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 472, in _communicate_shutdown
    _ = self._communicate(record)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 226, in _communicate
    return self._communicate_async(rec, local=local).get(timeout=timeout)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 231, in _communicate_async
    raise Exception("The wandb backend process has shutdown")
Exception: The wandb backend process has shutdown

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "~/skill-chaining/run.py", line 44, in <module>
    SkillChainingRun(parser).run()
  File "~/skill-chaining/run.py", line 10, in __init__
    super().__init__(parser)
  File "~/skill-chaining/method/robot_learning/main.py", line 44, in __init__
    self._make_log_files()
  File "~/skill-chaining/method/robot_learning/main.py", line 133, in _make_log_files
    mode="online" if config.wandb else "disabled",
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 1033, in init
    raise Exception("problem") from error_seen
Exception: problem

Any idea what could be the problem?

Setting up the repository

Hello,
Starting this issue to continue from clvrai/furniture#32 which has been closed.
I've done as advised and switched to this repo, which also works out since I'll need to use T-STAR.
I am somewhat confused as to whether this repo or https://github.com/clvrai/furniture/tree/tstar is the most appropriate, as they both seem to be trying to do the same thing, but as advised I'll stick to this one for now.

So far, a few problems I have run into are that step 1 (demo generation) only seems to use one core and does not accept --gpu 0, making it quite slow. mpirun also doesn't quite work, as the demos all overwrite each other.

A bigger issue however is step 2. I am not very familiar with wandb, but nevertheless made an account and logged in. Regardless, due to some wandb interaction the script crashes. I also cannot test further steps as they seem to depend on the previous ones.

$ python -m run --algo gail --furniture_name table_lack_0825 --demo_path demos/table_lack/Sawyer_table_lack_0825_0 --num_connects 1 --run_prefix p0 --gpu 0

pybullet build time: Mar 12 2022 19:43:28
[2022-04-01 22:43:14,427] Run a base worker.
[2022-04-01 22:43:14,428] Create log directory: log_refactor/table_lack_0825.gail.p0.123
[2022-04-01 22:43:14,428] Create video directory: log_refactor/table_lack_0825.gail.p0.123/video
[2022-04-01 22:43:14,428] Create demo directory: log_refactor/table_lack_0825.gail.p0.123/demo
[2022-04-01 22:43:14,446] Store parameters in log_refactor/table_lack_0825.gail.p0.123/params.json
wandb: Currently logged in as: khalid-rohith-team (use `wandb login --relogin` to force relogin)
wandb: ERROR Error while calling W&B API: project not found (<Response [404]>)
Thread SenderThread:
Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/lib/retry.py", line 102, in __call__
    result = self._call_fn(*args, **kwargs)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/internal_api.py", line 146, in execute
    six.reraise(*sys.exc_info())
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/six.py", line 719, in reraise
    raise value
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/internal_api.py", line 140, in execute
    return self.client.execute(*args, **kwargs)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 52, in execute
    result = self._get_result(document, *args, **kwargs)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 60, in _get_result
    return self.transport.execute(document, *args, **kwargs)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/transport/requests.py", line 39, in execute
    request.raise_for_status()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.wandb.ai/graphql

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/apis/normalize.py", line 24, in wrapper
    return func(*args, **kwargs)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/internal_api.py", line 1296, in upsert_run
    response = self.gql(mutation, variable_values=variable_values, **kwargs)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/lib/retry.py", line 118, in __call__
    if not check_retry_fn(e):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/util.py", line 872, in no_retry_auth
    raise CommError("Permission denied, ask the project owner to grant you access")
wandb.errors.CommError: Permission denied, ask the project owner to grant you access

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/internal_util.py", line 54, in run
    self._run()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/internal_util.py", line 105, in _run
    self._process(record)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/internal.py", line 312, in _process
    self._sm.send(record)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/sender.py", line 237, in send
    send_handler(record)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/sender.py", line 695, in send_run
    self._init_run(run, config_value_dict)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/sender.py", line 733, in _init_run
    commit=run.git.last_commit or None,
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/apis/normalize.py", line 62, in wrapper
    six.reraise(CommError, CommError(message, err), sys.exc_info()[2])
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/six.py", line 718, in reraise
    raise value.with_traceback(tb)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/apis/normalize.py", line 24, in wrapper
    return func(*args, **kwargs)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/internal/internal_api.py", line 1296, in upsert_run
    response = self.gql(mutation, variable_values=variable_values, **kwargs)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/lib/retry.py", line 118, in __call__
    if not check_retry_fn(e):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/util.py", line 872, in no_retry_auth
    raise CommError("Permission denied, ask the project owner to grant you access")
wandb.errors.CommError: Permission denied, ask the project owner to grant you access
wandb: ERROR Internal wandb error: file data was not synced
Problem at: ~/TESTDIR/skill-chaining/method/robot_learning/main.py 143 _make_log_files
Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 954, in init
    run = wi.init()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 614, in init
    backend.cleanup()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/backend/backend.py", line 248, in cleanup
    self.interface.join()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 467, in join
    super().join()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface.py", line 630, in join
    _ = self._communicate_shutdown()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 464, in _communicate_shutdown
    _ = self._communicate(record)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 222, in _communicate
    return self._communicate_async(rec, local=local).get(timeout=timeout)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 227, in _communicate_async
    raise Exception("The wandb backend process has shutdown")
Exception: The wandb backend process has shutdown
wandb: ERROR Abnormal program exit
Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 954, in init
    run = wi.init()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 614, in init
    backend.cleanup()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/backend/backend.py", line 248, in cleanup
    self.interface.join()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 467, in join
    super().join()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface.py", line 630, in join
    _ = self._communicate_shutdown()
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 464, in _communicate_shutdown
    _ = self._communicate(record)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 222, in _communicate
    return self._communicate_async(rec, local=local).get(timeout=timeout)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/interface/interface_shared.py", line 227, in _communicate_async
    raise Exception("The wandb backend process has shutdown")
Exception: The wandb backend process has shutdown

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "~/TESTDIR/skill-chaining/run.py", line 43, in <module>
    SkillChainingRun(parser).run()
  File "~/TESTDIR/skill-chaining/run.py", line 10, in __init__
    super().__init__(parser)
  File "~/TESTDIR/skill-chaining/method/robot_learning/main.py", line 51, in __init__
    self._make_log_files()
  File "~/TESTDIR/skill-chaining/method/robot_learning/main.py", line 143, in _make_log_files
    notes=config.notes,
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 992, in init
    six.raise_from(Exception("problem"), error_seen)
  File "<string>", line 3, in raise_from
Exception: problem

Assembly succeeds but no success.pkl file generated

Since #1 (which I've closed as the overall impression I got was that I should have researched the matter myself more extensively), I've managed to obtain successful states for assembly of the first table leg (using GAIL).
My theory as to why this didn't work before is that the training program doesn't like to be interrupted from warmup until reaching a success state, even if it has checkpoints to pick up from.
I've also noticed that for some reason, after a few successful episodes, GAIL seems to 'forget' and can't succeed again (see the figure below, in which the large reward spikes match up with successful episodes).

[Figure: rewards_sawyer_gail_1leg]

But onto the main focus of this issue: Even after obtaining successful episodes, in which the checkpoint video proves there is successful assembly of the table leg, there is no success_(...).pkl file to use as a starting point for the next leg.
This is strange, as having https://github.com/youngwoon/robot-learning/blob/11bc2ac1b89a0f2e772bd092a87ec2415a785617/robot_learning/trainer.py#L370-L372 classify the video as belonging to a successful episode should also mean the same thing for https://github.com/youngwoon/robot-learning/blob/11bc2ac1b89a0f2e772bd092a87ec2415a785617/robot_learning/trainer.py#L195-L197.
Since this codebase is quite dense, I can't quite figure out the difference or relation between the values in info and info.keys(), and hence how to generate the file.

Policy Sequencing Error

Hello, I would like to share an error that I found occurring when running the policy sequencing script.

Traceback (most recent call last):
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "~/anaconda3/envs/IKEA_1/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "~/skill-chaining/run.py", line 43, in <module>
    SkillChainingRun(parser).run()
  File "~/skill-chaining/method/robot_learning/main.py", line 143, in run
    trainer.train()
  File "~/skill-chaining/policy_sequencing_trainer.py", line 133, in train
    partial=False,
  File "~/skill-chaining/policy_sequencing_trainer.py", line 255, in _evaluate_partial
    is_train=False, record_video=record_video, partial=partial
  File "~/skill-chaining/policy_sequencing_rollout.py", line 272, in run_episode
    ac, ac_before_activation = agent[subtask].act(ob, is_train=is_train)
  File "~/skill-chaining/policy_sequencing_agent.py", line 128, in __getitem__
    return self._rl_agents[key]
IndexError: list index out of range

It occurred while attempting to chain the first two table-leg assembly sub-tasks (so far I can't get GAIL to do the remaining legs; see issue #6).

mpirun -np 8 python -m run --algo ps --furniture_name table_lack_0825 --num_connects 2 --run_prefix ours \
--ps_ckpts log/table_lack_0825.gail.p0.123/ckpt_00018124800.pt,log/table_lack_0825.gail.p1.123/ckpt_00036556800.pt \
--ps_load_init_states log/table_lack_0825.gail.p0.123/success_00018124800.pkl,log/table_lack_0825.gail.p1.123/success_00036556800.pkl \
--ps_demo_paths demos/table_lack/Sawyer_table_lack_0825_0,demos/table_lack/Sawyer_table_lack_0825_1

I found that the agent array in the following code only had the policy sequencing agent, even though the program was trying to access further agents.

ac, ac_before_activation = agent[subtask].act(ob, is_train=is_train)

A quick fix seems to be replacing subtask with 0. The script no longer crashes, but I am not sure whether this affects result quality.
I would also like to ask whether the policy sequencing stage of training (section 4) needs a further evaluation step (like section 3 for the subtask algorithm).
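
The traceback above ends inside `__getitem__`, where `self._rl_agents[key]` is indexed with the subtask number. A minimal sketch of that failure mode follows. It assumes a wrapper that holds one sub-agent per subtask; the class name `SequencingAgentSketch` and the string placeholders are hypothetical and are not taken from the repository. The sketch only shows what hard-coding index 0 does to the selection, not how the real agent is built.

```python
class SequencingAgentSketch:
    """Hypothetical stand-in for a wrapper holding one sub-agent per subtask."""

    def __init__(self, rl_agents):
        self._rl_agents = list(rl_agents)

    def __len__(self):
        return len(self._rl_agents)

    def __getitem__(self, key):
        # Same lookup as in the traceback, with a more descriptive message.
        if not 0 <= key < len(self._rl_agents):
            raise IndexError(
                f"subtask {key} requested, but only "
                f"{len(self._rl_agents)} sub-agent(s) are loaded"
            )
        return self._rl_agents[key]


# Only one sub-agent is loaded, although two subtasks are being chained.
agent = SequencingAgentSketch(["policy_0"])

# The workaround (always index 0) avoids the crash, but every subtask
# is then acted on by the first sub-agent.
selected = [agent[0] for subtask in range(2)]
print(selected)
```

In this sketch the workaround silently reuses the first policy for the second subtask, so checking how many sub-agents were actually loaded from `--ps_ckpts` may be more informative than suppressing the error.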

Running into a "No module named run" error

$ mpirun -np 2 python3 -m run --algo ppo --furniture_name table_lack_0825 --demo_path demos/table_lack_full/ --num_connects 4 --max_episode_steps 800 --max_global_step 200000000 --run_prefix ppo
Invalid MIT-MAGIC-COOKIE-1 key/usr/bin/python3: No module named run
/usr/bin/python3: No module named run
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[47051,1],1]
  Exit code:    1
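
One possible cause, not confirmed in this thread, is that the command was launched from outside the repository root: `python -m run` looks for `run.py` on the module search path, which starts with the current working directory. The sketch below checks whether a module named `run` is discoverable from a given directory. The helper name `module_found_in` and the temporary directories are illustrative assumptions.

```python
import importlib
import importlib.machinery
import os
import tempfile


def module_found_in(name, directory):
    """Return True if `name` can be located when searching only `directory`."""
    importlib.invalidate_caches()
    spec = importlib.machinery.PathFinder.find_spec(name, [directory])
    return spec is not None


# Simulate a repository root containing run.py and an unrelated directory.
with tempfile.TemporaryDirectory() as repo_dir, \
        tempfile.TemporaryDirectory() as other_dir:
    open(os.path.join(repo_dir, "run.py"), "w").close()
    found_in_repo = module_found_in("run", repo_dir)
    found_elsewhere = module_found_in("run", other_dir)

print(found_in_repo, found_elsewhere)
```

If the check fails for the directory the command is run from, changing into the cloned skill-chaining directory before calling `mpirun` is the first thing to try.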

Running error: demo path, unparsed argument is detected

error log:

user@user:~/skill-chaining$ mpirun -np 4 python3 -m run --algo ppo --furniture_name table_lack_0825 --demo_path demos/table_lack/ --num_connects 4 --max_episode_steps 800 --max_global_step 200000000 --run_prefix ppo
Invalid MIT-MAGIC-COOKIE-1 keypybullet build time: Feb 28 2022 22:32:49
/usr/local/lib/python3.8/dist-packages/ale_py/roms/__init__.py:94: DeprecationWarning: Automatic importing of atari-py roms won't be supported in future releases of ale-py. Please migrate over to using `ale-import-roms` OR an ALE-supported ROM package. To make this warning disappear you can run `ale-import-roms --import-from-pkg atari_py.atari_roms`.For more information see: https://github.com/mgbellemare/Arcade-Learning-Environment#rom-management
  _RESOLVED_ROMS = _resolve_roms()
[2022-05-01 22:22:52,277] Unparsed argument is detected:
['--demo_path', 'demos/table_lack/']
pybullet build time: Feb 28 2022 22:32:49
/usr/local/lib/python3.8/dist-packages/ale_py/roms/__init__.py:94: DeprecationWarning: Automatic importing of atari-py roms won't be supported in future releases of ale-py. Please migrate over to using `ale-import-roms` OR an ALE-supported ROM package. To make this warning disappear you can run `ale-import-roms --import-from-pkg atari_py.atari_roms`.For more information see: https://github.com/mgbellemare/Arcade-Learning-Environment#rom-management
  _RESOLVED_ROMS = _resolve_roms()
[2022-05-01 22:22:52,545] Unparsed argument is detected:
['--demo_path', 'demos/table_lack/']
pybullet build time: Feb 28 2022 22:32:49
/usr/local/lib/python3.8/dist-packages/ale_py/roms/__init__.py:94: DeprecationWarning: Automatic importing of atari-py roms won't be supported in future releases of ale-py. Please migrate over to using `ale-import-roms` OR an ALE-supported ROM package. To make this warning disappear you can run `ale-import-roms --import-from-pkg atari_py.atari_roms`.For more information see: https://github.com/mgbellemare/Arcade-Learning-Environment#rom-management
  _RESOLVED_ROMS = _resolve_roms()
[2022-05-01 22:22:52,705] Unparsed argument is detected:
['--demo_path', 'demos/table_lack/']
pybullet build time: Feb 28 2022 22:32:49
/usr/local/lib/python3.8/dist-packages/ale_py/roms/__init__.py:94: DeprecationWarning: Automatic importing of atari-py roms won't be supported in future releases of ale-py. Please migrate over to using `ale-import-roms` OR an ALE-supported ROM package. To make this warning disappear you can run `ale-import-roms --import-from-pkg atari_py.atari_roms`.For more information see: https://github.com/mgbellemare/Arcade-Learning-Environment#rom-management
  _RESOLVED_ROMS = _resolve_roms()
[2022-05-01 22:22:52,995] Unparsed argument is detected:
['--demo_path', 'demos/table_lack/']

I first ran the command:
# Sub-task demo generation

$ python -m furniture.env.furniture_sawyer_gen --furniture_name table_lack_0825 --demo_dir demos/table_lack/ --reset_robot_after_attach True --max_episode_steps 200 --num_connects 1 --n_demos 200 --start_count 0 --phase_ob True

and got the folder demos/table_lack with lots of Sawyer_table_lack_0825_xxx.pkl files

Then I ran the command:

$ mpirun -np 4 python3 -m run --algo ppo --furniture_name table_lack_0825 --demo_path demos/table_lack/ --num_connects 4 --max_episode_steps 800 --max_global_step 200000000 --run_prefix ppo

and I ran into the error mentioned above. What is demo_path expected to point to? Did I miss a step, or am I running into a different problem? Thanks.
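
The log message is consistent with an argument parser that does not define `--demo_path` for the selected algorithm and collects unknown flags instead of rejecting them. The following sketch reproduces that behaviour with `argparse.parse_known_args`. It assumes, without confirmation from the repository, that the codebase parses arguments this way; the parser below is a hypothetical reduction and declares only two of the flags from the command.

```python
import argparse

# Hypothetical parser that knows some flags but not --demo_path.
parser = argparse.ArgumentParser()
parser.add_argument("--algo", type=str, default="ppo")
parser.add_argument("--furniture_name", type=str, default=None)

argv = [
    "--algo", "ppo",
    "--furniture_name", "table_lack_0825",
    "--demo_path", "demos/table_lack/",
]

# parse_known_args returns the parsed namespace plus the leftover tokens.
args, unparsed = parser.parse_known_args(argv)

if unparsed:
    print("Unparsed argument is detected:")
    print(unparsed)
```

Under this assumption the flag is simply ignored for `--algo ppo`, which would match a pure RL baseline that does not read demonstrations.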

Difficulty in training increasing number of subtasks

Good morning,

Following the steps in sections 2 and 3 of the README, I have observed that training for the 1st leg is generally successful and uneventful, but training further table legs (using the previous subtask's success states as init states for the current subtask) becomes increasingly difficult. The demos have been ruled out as a cause of the problem.

Whereas the 1st leg finished in 1 run (each run ~50M steps) and the 2nd in 2, the 3rd has failed even after 5 attempts.
It also seems that in the case of the 2nd leg, despite many videos proving an abundance of successes, the evaluator (section 3) returns only a few successes, corresponding to different steps from those indicated by the video title. This issue picks up on some points of #3, but the main idea is that I suspect the evaluator is at fault somehow, with the negative effects cascading down and becoming more noticeable with the increased number of subtasks.

Is it natural for this behaviour to emerge in training or is there something wrong on my side? And what could it be?

Question about the paper

Hi @youngwoon,

Thanks for the nice work. I have a question about the design of your termination regularizer. If the initial set discriminator learns to distinguish the states perfectly, then from the discriminator's perspective the termination state of the current skill is different from the initial state of the next skill. In this case, $R_{TSR}$ will be 0. Isn't this the opposite of what is expected (the initial and termination states should be matched)? Can you help explain the idea behind it? Thanks.
