Hello, I am trying to use this algorithm (rewritten in PyTorch with

Parameters used for motion imitation (awr, open issue)

xbpeng commented on August 16, 2024

Parameters used for motion imitation

xbpeng commented on August 16, 2024

Sure, here are the hyperparameters for the motion imitation tasks with the humanoid:

"actor_net_layers": [1024, 512],
"actor_stepsize": 0.0000015,
"actor_momentum": 0.9,
"actor_init_output_scale": 0.01,
"actor_batch_size": 256,
"actor_steps": 200,
"action_std": 0.05,

"critic_net_layers": [1024, 512],
"critic_stepsize": 0.01,
"critic_momentum": 0.9,
"critic_batch_size": 256,
"critic_steps": 100,

"discount": 0.95,
"samples_per_iter": 4096,
"replay_buffer_size": 50000,
"normalizer_samples": 1000000,

"weight_clip": 50,
"td_lambda": 0.95,
"temp": 1.0,

ManifoldFR commented on August 16, 2024

Thanks! I also have another couple of questions: were actions normalized like in the original DeepMimic code, and was MPI used to speed up data collection and train agents?

xbpeng commented on August 16, 2024

Yes, actions were also normalized. Apart from using AWR instead of PPO, the rest of the setup was the same.

ManifoldFR commented on August 16, 2024

Appendix C of the paper says a temperature of 0.05 is used with a step size of 0.00005, but the config file in this repo sets the temperature to 1.0 and changes the learning rates. Which one should be used? I can see where the tradeoff happens with this parameter: in my experiments, adjusting it made the difference between being able to train on an environment or not.

xbpeng commented on August 16, 2024

In the code we are using advantage normalization, so the temperature is just set to 1.0. The temp of 0.05 was used without advantage normalization. If you are using the code, a temp of 1 should work for the tasks.
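To illustrate why normalization makes a temperature of 1.0 workable, here is a minimal sketch of exponentiated-advantage weights with and without normalization. The function name `awr_weights`, the `eps` constant, and the zero-mean, unit-std normalization are assumptions made for this example, not the repository's actual code. With normalization the weights do not depend on the scale of the raw advantages, so the temperature no longer has to be tuned to the reward scale.

```python
import math

def awr_weights(advantages, temp=1.0, weight_clip=50.0, normalize=True, eps=1e-5):
    """Return clipped weights min(exp(A / temp), weight_clip) for each advantage.

    Hypothetical helper: the normalization scheme (zero mean, unit std) is an
    assumption made for this sketch.
    """
    adv = list(advantages)
    if normalize:
        mean = sum(adv) / len(adv)
        std = math.sqrt(sum((a - mean) ** 2 for a in adv) / len(adv))
        adv = [(a - mean) / (std + eps) for a in adv]
    # Clip in log space so that exp() cannot overflow for large A / temp.
    log_clip = math.log(weight_clip)
    return [math.exp(min(a / temp, log_clip)) for a in adv]

# Rescaling the raw advantages leaves the normalized weights (almost) unchanged.
print(awr_weights([1.0, 2.0, 3.0]))
print(awr_weights([10.0, 20.0, 30.0]))
# Without normalization, the same rescaling changes the weights drastically.
print(awr_weights([1.0, 2.0, 3.0], normalize=False))
print(awr_weights([10.0, 20.0, 30.0], normalize=False))
```

Without normalization, the effective sharpness of the weighting depends on the advantage scale, which is presumably why a small temperature such as 0.05 had to be chosen by hand in that setting.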

ManifoldFR commented on August 16, 2024

Thanks! I'm interested in how the temperature and weight clip interact. I guess having a lot of weights clipped should be bad news, right? Intuitively, if half of the weights are set to 20, then you lose information on the relative quality of the corresponding actions in the gradient. Perhaps I'll look into it.
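One way to look into this is to measure how many weights hit the clip for a given temperature. A weight exp(A / temp) exceeds the clip exactly when A > temp * log(weight_clip), so the clipped fraction can be computed directly from the advantages. The sketch below assumes this unnormalized form of the weights; `clipped_fraction` and the sample advantages are hypothetical and only serve as an illustration.

```python
import math

def clipped_fraction(advantages, temp, weight_clip):
    """Fraction of samples whose weight exp(A / temp) would exceed weight_clip."""
    # exp(A / temp) > weight_clip  <=>  A > temp * log(weight_clip)
    threshold = temp * math.log(weight_clip)
    clipped = sum(1 for a in advantages if a > threshold)
    return clipped / len(advantages)

# Made-up advantages, for illustration only.
advantages = [-1.0, 0.0, 0.1, 0.2, 1.0, 4.0]

# A lower temperature lowers the threshold, so more weights saturate.
print(clipped_fraction(advantages, temp=1.0, weight_clip=20.0))
print(clipped_fraction(advantages, temp=0.05, weight_clip=20.0))
```

All clipped samples receive the same weight, so their relative ordering no longer influences the policy update. Logging this fraction during training is one way to check whether the temperature and clip are set sensibly.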
