Implementation of MuZero in PyTorch on OpenAI Gym's CartPole environment.
To train a model, run the main script: python muzero.py
DISCLAIMER: this code is early research code.
- Takes about 350 episodes to consistently achieve a score of 500 on CartPole-v1.
- Takes about 400 episodes to consistently achieve a score of 200 on CartPole-v0.
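
The episode counts above depend on what "consistently" means, which this description does not define. As a minimal sketch, assuming it means that every one of the last `window` episodes reached the target score, a check over a list of per-episode scores could look like the following. The function name `first_consistent_episode` and the default window of 10 are hypothetical and are not taken from muzero.py.

```python
from collections import deque
from typing import Iterable, Optional


def first_consistent_episode(
    scores: Iterable[float], target: float, window: int = 10
) -> Optional[int]:
    """Return the 1-based episode at which the last `window` scores all reached `target`.

    Assumption: "consistently" means `window` consecutive episodes at or above
    the target. Returns None if that never happens.
    """
    recent = deque(maxlen=window)  # holds only the most recent `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        if len(recent) == window and min(recent) >= target:
            return episode
    return None


if __name__ == "__main__":
    # Illustrative scores only, not measured results.
    example_scores = [120, 500, 430, 500, 500, 500]
    print(first_consistent_episode(example_scores, target=500, window=3))
```

A stricter or looser definition (for example, a rolling mean instead of a rolling minimum) would shift the reported episode count, so the numbers above should be read as approximate.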