AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model (ICLR 2024)

(See more visual examples on the Project Page)

Official Python implementation of the ICLR 2024 paper: AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model.

Download datasets and pretrained models

The training datasets for Hopper/Walker/Humanoid can be downloaded from this OneDrive Link. After downloading, please unzip the file to the root directory.

.
└── datasets
    ├── behaviors
    │   └── [task].hdf5 # state-action trajectory datasets for various behaviors.
    └── feedbacks
        └── [task]_[label_type]_[train_or_eval].hdf5 # pairwise trajectory evaluation feedback. [label_type] can be 'syn' for synthetic labels and 'hum' for human feedback labels.
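
As a quick sanity check after unzipping, the layout above can be turned into a list of expected file paths. This is a minimal sketch using only the standard library, not part of the repository. The helper names are hypothetical, and it assumes the `[train_or_eval]` placeholder expands to the literal strings `train` and `eval`.

```python
from pathlib import Path


def expected_dataset_files(root, task, label_type):
    """Return the dataset paths implied by the directory layout for one task."""
    datasets = Path(root) / "datasets"
    files = [datasets / "behaviors" / f"{task}.hdf5"]
    # Assumption: [train_or_eval] expands to the literal strings "train" and "eval".
    for split in ("train", "eval"):
        files.append(datasets / "feedbacks" / f"{task}_{label_type}_{split}.hdf5")
    return files


def missing_dataset_files(root, task, label_type):
    """Return the expected paths that do not exist as files under root."""
    return [p for p in expected_dataset_files(root, task, label_type) if not p.is_file()]
```

Calling `missing_dataset_files(".", "walker", "syn")` from the repository root should return an empty list if the archive was unzipped in the right place and the naming assumption holds.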

Models pre-trained on synthetic (syn) labels can be downloaded from this OneDrive Link. After downloading, please unzip the file to the root directory.

.
├── datasets
│   └── attr_label
│       └── [task]_[label_type].hdf5 # labels given by pre-trained attribute models.
└── results
    ├── attr_func
    │   └── [task]_[label_type].hdf5 # pre-trained transformer-based attribute models.
    ├── diffusion
    │   └── [task]_[label_type].hdf5 # pre-trained diffusion models.
    └── evaluation
        └── [task]_[label_type]_[seed].hdf5 # test logs for pre-trained models.

Quick Start

python train_attr_func.py --task walker --label_type syn --device [YOUR_DEVICE]
python train_diffusion_model.py --task walker --label_type syn --device [YOUR_DEVICE]
python eval.py --task walker --label_type syn --device [YOUR_DEVICE]
python plot.py --task walker
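
The four commands above are meant to be run in order, so it may be convenient to chain them and stop at the first failure. The following is a sketch under assumptions, not a script shipped with the repository. The function names are hypothetical, and the default device string `cpu` is only a placeholder for `[YOUR_DEVICE]`, whose accepted values the README does not specify.

```python
import subprocess
import sys


def quick_start_commands(task="walker", label_type="syn", device="cpu"):
    """Build the Quick Start commands in the order listed in the README."""
    # Assumption: "cpu" stands in for [YOUR_DEVICE]; replace it with your device.
    common = ["--task", task, "--label_type", label_type, "--device", device]
    return [
        [sys.executable, "train_attr_func.py", *common],
        [sys.executable, "train_diffusion_model.py", *common],
        [sys.executable, "eval.py", *common],
        [sys.executable, "plot.py", "--task", task],
    ]


def run_quick_start(**kwargs):
    """Run each stage in sequence; check=True raises if a stage exits non-zero."""
    for cmd in quick_start_commands(**kwargs):
        subprocess.run(cmd, check=True)


# Usage (from the repository root): run_quick_start(task="walker", label_type="syn")
```

Using `check=True` means a failed training stage raises `subprocess.CalledProcessError` before the later stages start.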

Citation

@inproceedings{dong2024aligndiff,
  title={AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model},
  author={Zibin Dong and Yifu Yuan and Jianye HAO and Fei Ni and Yao Mu and YAN ZHENG and Yujing Hu and Tangjie Lv and Changjie Fan and Zhipeng Hu},
  booktitle={The Twelfth International Conference on Learning Representations, {ICLR}},
  year={2024},
}

Note: The code has been refactored for better readability and improved performance. If you encounter any problems, feel free to email [email protected]. In this new implementation, the hyperparameters have not been carefully tuned, yet the number of diffusion sampling steps for the Hopper/Walker/Humanoid tasks has been reduced to 5, down from the 10/10/20 steps suggested in the paper, while still achieving good performance. The performance on the three tasks is $0.652\pm0.009$, $0.638\pm0.023$, and $0.312\pm0.011$, respectively. Hopper and Walker both outperform the results reported in the paper; Humanoid scores slightly lower, but its decision speed is 4 times faster.
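
The step counts in the note can be turned into per-task reduction factors with a few lines of arithmetic. This is an illustrative sketch only. It assumes that decision time scales roughly linearly with the number of sampling steps, which the note does not state explicitly. The variable names are hypothetical.

```python
# Sampling steps suggested in the paper versus the refactored implementation.
paper_steps = {"Hopper": 10, "Walker": 10, "Humanoid": 20}
refactored_steps = 5

# Assumption: per-decision cost is roughly proportional to the number of steps.
step_reduction = {task: steps / refactored_steps for task, steps in paper_steps.items()}

for task, factor in step_reduction.items():
    print(f"{task}: {paper_steps[task]} -> {refactored_steps} steps ({factor:.0f}x fewer)")
```

Under that assumption, the Humanoid factor of 4 matches the 4-times speedup quoted in the note, and Hopper and Walker each use half as many steps as before.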
