Comments (3)
All transfers in navigation environments using the discriminator were done using the point mass. Thus, the point mass row contains the correct hyperparameters. As mentioned in the text, we use a decay on the learning rate of the discriminator (hence --discrim-decay true), do not collect online data for the discriminator (--discrim-decay false), and discrim-time-limit refers to the episode length of the imitated agent. For example, the point mass is much faster than the ant, so it doesn't make sense to collect data from the point mass while it is just sitting at the goal. discrim-time-limit sets how long the point mass episodes are during data collection.
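To illustrate why a time limit matters for a fast agent, here is a minimal sketch of capping episode length during data collection. It is not the repository's code: the `collect_episode` helper, the toy 1-D agent, and all numbers are hypothetical and only mirror the idea behind discrim-time-limit.

```python
from typing import Callable, List, Tuple


def collect_episode(
    step_fn: Callable[[float], Tuple[float, bool]],
    start: float,
    time_limit: int,
) -> List[float]:
    """Roll out one episode, stopping after `time_limit` steps or when done."""
    states = [start]
    state = start
    for _ in range(time_limit):
        state, done = step_fn(state)
        states.append(state)
        if done:
            break
    return states


def make_agent(speed: float, goal: float = 10.0):
    """Hypothetical 1-D agent that moves toward `goal` and then sits there."""
    def step_fn(state: float) -> Tuple[float, bool]:
        new_state = min(state + speed, goal)
        return new_state, False  # never signals done; it just idles at the goal
    return step_fn


fast = make_agent(speed=2.0)  # stands in for the fast point mass
long_rollout = collect_episode(fast, start=0.0, time_limit=50)
short_rollout = collect_episode(fast, start=0.0, time_limit=5)

# Count how many recorded states are just the agent sitting at the goal.
idle_long = sum(1 for s in long_rollout if s == 10.0)
idle_short = sum(1 for s in short_rollout if s == 10.0)
print(idle_long, idle_short)
```

With the long limit, most of the recorded states are the agent idling at the goal; with the short limit, nearly every state is actual motion, which is the kind of data one would presumably want the discriminator to see.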
Here's the general procedure for reproducing the maze results with the discriminator.
- Train the point mass low level on PointMass_Low (I believe it's named something similar).
- Train the point mass high level, PointMaze_High, using PointMass_Low.
- Train the Ant low level with the discriminator using data from PointMass_Low.
- Compose PointMaze_High with AntDiscrim_Low in a zero-shot manner. This can be done with the composition_test.py script. Note that depending on the type of maze evaluation you want to do, you may need to edit the compose_params function in utils/loader.py.
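The dependencies between these steps can be sketched as a small graph. This is only an illustration of the ordering: the stage labels reuse the names from the list above, while the `STAGES` table and `run_order` helper are hypothetical and do not correspond to any command or script in the repository.

```python
from typing import Dict, List

# Stage label -> stages whose trained policies or data it depends on.
# Labels follow the procedure above; they are not real CLI commands.
STAGES: Dict[str, List[str]] = {
    "PointMass_Low": [],
    "PointMaze_High": ["PointMass_Low"],
    "AntDiscrim_Low": ["PointMass_Low"],
    "composition_test": ["PointMaze_High", "AntDiscrim_Low"],
}


def run_order(stages: Dict[str, List[str]]) -> List[str]:
    """Return the stages in an order where dependencies always come first."""
    ordered: List[str] = []
    visited = set()

    def visit(name: str) -> None:
        if name in visited:
            return
        for dep in stages[name]:
            visit(dep)
        visited.add(name)
        ordered.append(name)

    for name in stages:
        visit(name)
    return ordered


order = run_order(STAGES)
print(order)
```

The point is simply that both the high-level maze policy and the discriminator-trained Ant low level depend on the point mass low level, and the zero-shot composition comes last.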
from hierarchical_morphology_transfer.
@jhejna Many thanks for the quick reply!
- I found that the name of PointMass_Low is PointMassLargeMJ_Low, and I just want to confirm it with you.
- I trained a PointMaze_High policy and an Ant_Low policy to do a zero-shot transfer, as mentioned in my other question, and I didn't edit the compose_params function you mentioned. Do you mean that I need to edit it for Ant_Discrim?
- The Ant sometimes gets stuck in one location during the zero-shot transfer, as you can see in this image (even though the Ant is not overturned). Do you have any idea why? The Ant_Low looks correct.
- Do I always need to set high-level-skips manually? I think you try to store it here, but it doesn't work now.
- I found some minor bugs in the code. The type is int, right? I got empty images with this code; the following should work.
- I updated the test_composition function to do onscreen rendering:
def test_composition(low_name, high_name, env_name, g=0, k=None, num_ep=100):
    params = compose_params(low_name, high_name, env_name, k=k)
    model, env = load(high_name, params, best=True)
    print("COMPOSED PARAMS", params)
    print("ENV", env)
    ep_rewards = list()
    rewards = list()
    obs = env.reset()
    if g == 0:
        # No GIF requested: evaluate num_ep episodes with onscreen rendering.
        while True:
            action, _states = model.predict(obs)
            obs, reward, done, info = env.step(action)
            rewards.append(reward)
            if done:
                ep_rewards.append(sum(rewards))
                print("REWARD", sum(rewards), len(rewards), "Ep to go:", num_ep, "cur avg", np.mean(ep_rewards))
                num_ep -= 1
                rewards = []
                if num_ep == 0:
                    break
                obs = env.reset()
            env.render()
    else:
        # GIF requested: record g frames offscreen and save them.
        gif_frames = list()
        for _ in range(g):
            action, _states = model.predict(obs)
            obs, reward, done, info = env.step(action)
            frame = env.render(mode='rgb_array')
            gif_frames.append(frame)
            rewards.append(reward)
            if done:
                ep_rewards.append(sum(rewards))
                print("REWARD", sum(rewards), len(rewards), "Ep to go:", num_ep, "cur avg", np.mean(ep_rewards))
                num_ep -= 1
                rewards = []
                if num_ep == 0:
                    break
                obs = env.reset()
        import imageio
        render_path = os.path.join(RENDERS, 'composition_' + low_name + '.gif')
        os.makedirs(os.path.dirname(render_path), exist_ok=True)
        print("saving to ", render_path)
        # Keep every 4th frame to reduce the GIF size.
        imageio.mimsave(render_path, gif_frames[::4], subrectangles=True, duration=0.05)
        print("completed saving")
- Yeah, that should be the correct one.
- In compose_params there is a line that disables sampling goals for the maze. This is the difference between the Maze and Maze End evaluations. Depending on the type of evaluation you want to run, you will need to comment / uncomment this. https://github.com/jhejna/hierarchical_morphology_transfer/blob/master/bot_transfer/utils/loader.py#L403
- Hmmmm. We did observe this once or twice, but not to the extent that is seen here. I'm not sure exactly what would be causing this. Perhaps try training the Ant low level for more than 2.5 million timesteps, then make sure that you are using the "best" policy saved during training. Additionally, confirm that contact information is enabled in the environment and that the mujoco_py version is <2.0. I can perhaps investigate this later when I have more time.
- The code makes its best guess at what the skip level should be. When running evals, we set this by hand to 35.
- Thanks for pointing that out in the parser! As far as rendering goes, this is only meant to be used when debug rendering is enabled for the high level wrapper: https://github.com/jhejna/hierarchical_morphology_transfer/blob/master/bot_transfer/envs/hierarchical.py#L196. It's commented out because it makes everything run really slowly when enabled.
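As a rough illustration of what a skip level of 35 could mean, the sketch below holds one high-level action (a subgoal) fixed for 35 low-level steps. This reading of high-level-skips is an assumption, and `CountingLowLevel` and `high_level_step` are hypothetical stand-ins, not the repository's wrapper in bot_transfer/envs/hierarchical.py.

```python
class CountingLowLevel:
    """Hypothetical stand-in for a low-level policy plus its environment."""

    def __init__(self) -> None:
        self.steps = 0
        self.position = 0.0

    def step(self, subgoal: float) -> float:
        # Toy dynamics: move 10% of the remaining distance toward the subgoal.
        self.position += 0.1 * (subgoal - self.position)
        self.steps += 1
        return self.position


def high_level_step(low: CountingLowLevel, subgoal: float, skip: int) -> float:
    """Hold one high-level action (a subgoal) for `skip` low-level steps."""
    position = low.position
    for _ in range(skip):
        position = low.step(subgoal)
    return position


low = CountingLowLevel()
# 35 is the value mentioned above for evals; its meaning here is assumed.
final = high_level_step(low, subgoal=1.0, skip=35)
print(low.steps, round(final, 3))
```

Under this assumption, a skip that is too small would give the low level too little time to reach each subgoal, which is one reason the value may need to be set by hand rather than guessed.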