Collaborators:
- Georgia Gabriela Sampaio ([email protected]; Computer Science, Electrical Engineering)
- Max Sobol Mark ([email protected]; Computer Science)
#description: TBD
#references: TBD
- Decide which ACME agent to use, and set it up with entropy maximization.
- Replicate the latent variable-conditioned policy model.
- Replicate the discriminator.
- Replicate the hierarchical RL environments, adding goals.
- Create the hierarchical policy.
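
As a rough illustration of how the latent-conditioned policy and the discriminator in the list above might fit together, here is a minimal sketch in plain Python. The notes do not name the method being replicated, so this assumes a discriminator-based skill reward of the form log q(z|s) - log p(z) with a uniform prior over a fixed number of discrete skills. The function names (`log_softmax`, `skill_reward`, `condition_on_skill`) are hypothetical and do not come from ACME or from the project.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def skill_reward(discriminator_logits: Sequence[float], skill: int) -> float:
    """Intrinsic reward log q(z|s) - log p(z) for the active skill.

    Assumption: the discriminator outputs one logit per skill for the
    current state, and the skill prior p(z) is uniform.
    """
    num_skills = len(discriminator_logits)
    log_q = log_softmax(discriminator_logits)[skill]
    log_p = -math.log(num_skills)  # uniform prior (assumed)
    return log_q - log_p


def condition_on_skill(observation: Sequence[float], skill: int,
                       num_skills: int) -> List[float]:
    """Append a one-hot skill vector to the observation (hypothetical helper).

    This is one simple way to feed the latent variable to the policy.
    """
    one_hot = [0.0] * num_skills
    one_hot[skill] = 1.0
    return list(observation) + one_hot
```

With this definition, the reward is zero when the discriminator is no better than chance and positive when it identifies the active skill from the state more reliably than the prior would. A hierarchical policy could then choose `skill` at a slower timescale and pass it to the low-level policy through `condition_on_skill`, though that design is also an assumption here.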