
symmetricrl's Introduction

On Learning Symmetric Locomotion

Installation

There is no need for compilation. You can install all requirements using Pip, though you might prefer to install some packages manually, including PyTorch.

Installation using Pip

# TODO: create and activate your virtual env of choice

# download the repo as well as its submodules
git clone https://github.com/UBCMOCCA/SymmetricRL --recurse-submodules

cd SymmetricRL
pip install -r requirements  # you might prefer to install some packages (including PyTorch) yourself

There is also a helper script in setup/setup_cc.sh that can be used to install the requirements on Compute Canada.

Running Locally

To run an experiment named test_experiment with the PyBullet humanoid environment you can run:

./scripts/local_run_playground_train.sh  w2_test_experiment  env_name='pybullet_envs:Walker2DBulletEnv-v0'

# run the same experiment with the NET architecture symmetry method (other options include "traj", "loss", "phase", and "net2")
./scripts/local_run_playground_train.sh  w2_net_experiment  env_name='pybullet_envs:Walker2DBulletEnv-v0' mirror_method='net'

The w2_net_experiment is the name of the experiment. This command will create a new experiment directory inside the runs directory that contains the following files:

  • pid: the process ID of the task running the training algorithm
  • progress.csv: a CSV file containing the data about the training progress
  • slurm.out: the output of the process. You can use tail -f to view the contents
  • configs.json: a JSON file containing all the hyper-parameter values used in this run
  • run.json: extra useful stuff about the run including the host information and the git commit ID (only works if GitPython is installed)
  • models: a directory containing the saved models

If you use Compute Canada, you can also use the other scripts, such as cedar_run_playground_train.sh, to create a batch job. These scripts use the same argument structure but also allow you to run the same task with multiple replicates using the num_replicates variable.

Plotting Results

The plot_from_csv.py script can be helpful for plotting the learning curves:

python -m playground.plot_from_csv --load_paths runs/*/*/  --columns mean_rew max_rew  --smooth 2

# to group the results based on the name
python -m playground.plot_from_csv --load_paths runs/*/*/  --columns mean_rew max_rew  --name_regex ".*__([^_\/])*" --group 1

  • The load_paths argument specifies which directories the script should look into
  • It opens the progress.csv file, plots the requested columns on the y-axis, and uses the row argument for the x-axis (defaults to total_num_steps)
  • You can also provide a name_regex to make the figure legends simpler and more readable, e.g. --name_regex 'walker-(.*)mirror\/' would turn runs/2019_07_08__23_53_20__walker-lossmirror/1 into simply loss.
  • group can be used to aggregate the results of multiple runs of the same experiment into one. name_regex is used to specify the groups.
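
To make the interaction of name_regex and group concrete, here is the example above as plain Python: the regex capture group supplies the legend/group label.

```python
import re

# The example path from above: extract the method name from a run directory.
path = "runs/2019_07_08__23_53_20__walker-lossmirror/1"
match = re.search(r"walker-(.*)mirror\/", path)
print(match.group(1))  # -> loss
```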

Running Learned Policy

The enjoy.py script can be used to run a learned policy and render the results:

python -m playground.enjoy with experiment_dir=runs/<EXPERIMENT_DIRECTORY>

# plot the joint positions over time
python -m playground.evaluate with experiment_dir=runs/<EXPERIMENT_DIRECTORY> plot=True

Evaluating Results

The evaluate.py script is used for evaluating learned policies. The results will be saved to a new file called evaluate.json inside the experiment_dir where the policy was loaded from.

# evaluate with 10000 steps
python -m playground.evaluate with experiment_dir=runs/<EXPERIMENT_DIRECTORY> max_steps=10000
# with rendering
python -m playground.evaluate with experiment_dir=runs/<EXPERIMENT_DIRECTORY> render=True

# evaluate all experiments
for d in runs/*/*/; do python -m playground.evaluate with experiment_dir=$d; done

Metrics (will be expanded):

  • Symmetric Index (joint angle and torque)
    • Uses the MetricsEnv environment wrapper from symmetric/metric_utils.py

NOTE: the current script assumes that the environment has an interface similar to the PyBullet locomotion environments, and that the following attributes are present in the unwrapped environment:

  • robot.feet_contact
  • robot.ordered_joints
    • The joint names that have the word left in them are assumed to be on the left side. Any other joint that does not contain the word abdomen is assumed to be on the right side.
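
This naming convention can be sketched as a small helper. This is an illustrative sketch only (the actual logic lives in symmetric/metric_utils.py and may differ, and the "common" label is our own invention for abdomen joints):

```python
def classify_joint(name):
    """Classify a joint by name, following the convention described above:
    'left' in the name => left side; 'abdomen' => neither side; else right."""
    if "left" in name:
        return "left"
    if "abdomen" in name:
        return "common"  # abdomen joints belong to neither side
    return "right"

for joint in ["left_knee", "right_knee", "abdomen_z", "ankle_y"]:
    print(joint, "->", classify_joint(joint))
```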

The environments that are currently tested and supported:

  • Walker2DBulletEnv
  • HumanoidBulletEnv
  • Walker3DCustomEnv
  • Walker3DStepperEnv

Citation

Please cite the following paper if you found our work useful. Thanks!

Farzad Abdolhosseini and Hung Yu Ling and Zhaoming Xie and Xue Bin Peng and Michiel van de Panne. "On Learning Symmetric Locomotion", Proc. ACM SIGGRAPH Motion, Interaction, and Games (MIG 2019).

@inproceedings{2019-MIG-symmetry,
  title={On Learning Symmetric Locomotion},
  author={Farzad Abdolhosseini and Hung Yu Ling and Zhaoming Xie and Xue Bin Peng and Michiel van de Panne},
  booktitle = {Proc. ACM SIGGRAPH Motion, Interaction, and Games (MIG 2019)},
  year={2019}
}

symmetricrl's People

Contributors

belinghy, farzadab, zhaomingxie


symmetricrl's Issues

Auxiliary Loss weight tuning

Tune the w parameter in the auxiliary loss method.
Environments:

  • Walker2D: 2
  • Walker3D: 4
  • Humanoid?
  • Cassie2D: 4
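
For context, the auxiliary loss method penalizes asymmetric actions with weight w. A minimal sketch of such a loss term, assuming mirror_s/mirror_a observation and action mirroring functions are available (all names here are hypothetical, not the repo's API):

```python
import numpy as np

def symmetry_loss(policy, states, mirror_s, mirror_a, w):
    """Auxiliary symmetry term: w * mean ||pi(s) - M_a(pi(M_s(s)))||^2."""
    actions = policy(states)
    mirrored = mirror_a(policy(mirror_s(states)))
    return w * np.mean((actions - mirrored) ** 2)

# Toy check: an odd policy (a = -s) with sign-flip mirrors is perfectly symmetric.
states = np.random.randn(8, 4)
loss = symmetry_loss(lambda s: -s, states, lambda x: -x, lambda x: -x, w=4.0)
print(loss)  # -> 0.0
```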

Paper write-up

Paper link

  • methods
  • page limit: 7-10
  • template
  • keywords and CCS
  • experimental results
  • metrics
  • description of mirror functions
  • net2?
  • results section
  • conference name, keywords, ...

Future Work

  • new environments
    • non-locomotion
    • more symmetries: Ant, Sudoku
    • quadruped (raisim)
  • symmetric gait
    • as a reward: r_t = | s_t - M(s_{t-T/2}) |
    • as a differentiable loss?
  • revisiting PHASE for non-imitation tasks
    • better designed env
    • adaptive/foot-strike based?
      • based on last foot strike side?
  • unique net for each config
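
The "symmetric gait as a reward" bullet above could be sketched with a short state buffer. This is a hypothetical shaping term, not something implemented in the repo; SymmetryReward, mirror_fn, and cycle_len are invented names:

```python
from collections import deque

import numpy as np

class SymmetryReward:
    """Shaping term r_t = -|s_t - M(s_{t-T/2})|: penalize deviation of the
    current state from the mirrored state half a gait cycle (T/2) ago."""

    def __init__(self, mirror_fn, cycle_len):
        self.mirror_fn = mirror_fn
        self.buffer = deque(maxlen=cycle_len // 2)

    def __call__(self, state):
        if len(self.buffer) < self.buffer.maxlen:
            reward = 0.0  # not enough history for a half-cycle comparison yet
        else:
            reward = -np.abs(state - self.mirror_fn(self.buffer[0])).sum()
        self.buffer.append(state)
        return reward

# A perfectly mirrored trajectory (with M = negation here) accrues zero penalty.
sym = SymmetryReward(mirror_fn=lambda s: -s, cycle_len=4)
rewards = [sym(np.array([s])) for s in [1.0, 2.0, -1.0, -2.0]]
```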

Tune gait cycle for phase-based

We need to add an artificial phase variable to the base environments (no imitation) in order to use the phase-based method. Since the gait cycle length is fixed, it can be constraining for this method, so we need to tune the cycle length using either:

  • hyper-parameter tuning
  • looking at a learned motion and calculating the optimal cycle length based on it
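
The second option, measuring the cycle length from a learned motion, could be approximated by autocorrelation of a recorded joint-angle trajectory. This is our own sketch; estimate_cycle_length is not part of the repo:

```python
import numpy as np

def estimate_cycle_length(signal, min_lag=5):
    """Estimate the gait period as the lag maximizing the autocorrelation,
    ignoring lags below min_lag (which are trivially self-similar)."""
    signal = signal - signal.mean()
    corr = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    return min_lag + int(np.argmax(corr[min_lag:]))

# Synthetic joint angle with a period of 25 steps.
t = np.arange(500)
angle = np.sin(2 * np.pi * t / 25)
print(estimate_cycle_length(angle))  # -> 25
```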

Camera-ready version

Deadline: Sep 20th

  • re-write the results/summary section -> should be clearer and more to the point
  • fix typos
  • publish code
  • add the missing reference
  • (optional) put phase-plot diagrams in the main paper
  • (optional) tables to plots
  • video
  • clean-up code

Clean-up Codebase

  • push final changes
  • refactor/clean up if needed
  • documentation
  • make mocca_envs public
    • documentation for mocca_envs
    • clean up for mocca_envs

Tasks other than locomotion?

Many other tasks have symmetry including pendulum, cartpole, etc. We experimented with some of them but never got to do a thorough comparison.

Problem about mirror_inds["sideneg_obs_inds"] / mirror_inds["sideneg_act_inds"] in sym_envs.py

Hi,

I'm a little confused about how the mirrored obs/acts are obtained.
If I understand correctly, mirror_inds["sideneg_obs_inds"] should be the indices of observations belonging to the left or right side of the robot that need to be negated and swapped when mirroring.
My problem is that only the left-side observations are included in mirror_inds["sideneg_obs_inds"] in the code; shouldn't the corresponding right-side observations also be included?
For example:
the mirroring function should satisfy obs = M(M(obs)).
If we have two corresponding left and right observations, say ob[left] = 1 and ob[right] = 2, and only "left" is in mirror_inds["sideneg_obs_inds"], then after the first mirroring ob[left] = 2 and ob[right] = -1, and after mirroring again, ob[left] = -1 and ob[right] = -2. That does not seem correct. The same applies to action mirroring.

Looking forward to your response!
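
For what it's worth, a correct mirror function must be an involution, obs = M(M(obs)), and for a swapped pair that must also be negated, negating both sides preserves this. A small sketch (the index layout and function names here are illustrative, not the repo's actual mirror_inds structure):

```python
import numpy as np

def mirror(obs, swap_pairs, neg_inds):
    """Swap each (left, right) index pair, then negate the given indices."""
    out = obs.copy()
    for left, right in swap_pairs:
        out[left], out[right] = obs[right], obs[left]
    out[neg_inds] *= -1
    return out

obs = np.array([1.0, 2.0])  # [left, right]

# Side-negated pair: negating BOTH swapped indices keeps M an involution.
m_both = lambda x: mirror(x, swap_pairs=[(0, 1)], neg_inds=[0, 1])
print(m_both(m_both(obs)))  # -> [1. 2.]

# Negating only the left index breaks it: M(M(obs)) ends up at [-1, -2],
# matching the calculation in the question above.
m_left = lambda x: mirror(x, swap_pairs=[(0, 1)], neg_inds=[0])
print(m_left(m_left(obs)))  # -> [-1. -2.]
```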

Does adding artificial phase (/time) help?

The Phase_Walker2DBulletEnv-v0 has an added artificial phase variable which might be useful to the other algorithms as well, so to be fair we might need to compare the others with phase as well.
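
Adding such a phase variable to another environment could look roughly like the wrapper below. This is a hypothetical sketch (PhaseWrapper and cycle_len are invented names; the actual Phase_Walker2DBulletEnv-v0 implementation may differ):

```python
import numpy as np

class PhaseWrapper:
    """Append a cyclic phase variable in [0, 1) to each observation."""

    def __init__(self, env, cycle_len=30):
        self.env = env
        self.cycle_len = cycle_len
        self.t = 0

    def _augment(self, obs):
        phase = (self.t % self.cycle_len) / self.cycle_len
        return np.append(obs, phase)

    def reset(self):
        self.t = 0
        return self._augment(self.env.reset())

    def step(self, action):
        obs, rew, done, info = self.env.step(action)
        self.t += 1
        return self._augment(obs), rew, done, info

class _DummyEnv:  # stand-in environment for the demo below
    def reset(self):
        return np.zeros(2)

    def step(self, action):
        return np.zeros(2), 0.0, False, {}

env = PhaseWrapper(_DummyEnv(), cycle_len=30)
obs = env.reset()
print(obs.shape)  # -> (3,) : two original dims plus the phase variable
```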

When cloning the submodules, git clone is always denied by GitHub. I wonder why this happens.

Fetching submodule .environments
Already up to date.
Cloning into '/home/dyj/SymmetricRL/cassie_sim_to_real'...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@github.com:zxieaa/cassie_sim_to_real.git' into submodule path '/home/dyj/SymmetricRL/cassie_sim_to_real' failed
Failed to clone 'cassie_sim_to_real'. Retry scheduled
Cloning into '/home/dyj/SymmetricRL/cassie_sim_to_real'...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@github.com:zxieaa/cassie_sim_to_real.git' into submodule path '/home/dyj/SymmetricRL/cassie_sim_to_real' failed
Failed to clone 'cassie_sim_to_real' a second time, aborting

Plots

  • learning curves
    • 4 curves
    • same color across
    • ticks: time steps, Average Return
  • tables for symmetry indices
  • phase plots

Comparison with Replicates

Run all the methods with 5 random seed replicates each.
Environments:

  • Walker2D
  • Cassie2D
  • Walker3D?
  • Humanoid?
  • Stepper?

Phase Plot Metric

  • draw the phase plot for different environments
  • calculate metric based on the phase plot similarity of left and right
    • foot strike based? different cycle lengths, how to find the middle of the cycle (half or optimize for min distance)

Preliminary Results

Current results aren't consistent with our expectations (and previous results):

Walker 2D

[learning-curve plots omitted]

Walker 3D (fixed variance)

[learning-curve plots omitted]

Walker 3D (learned variance)

[learning-curve plot omitted]

  • Though the net method should have used a symmetric variance, so it's expected that it might not work.

Humanoid

[learning-curve plot omitted]

Cassie 2D

[learning-curve plot omitted]

Presentation

I need to prepare the presentation slides and content. What are the steps and the questions that need to be answered to do so?

  • answer important questions
  • prepare slides
  • prepare video
  • prepare the speech
