JuliaReinforcementLearning / ReinforcementLearningZoo.jl
Home Page: https://juliareinforcementlearning.org/
License: MIT License
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours; please be patient!
Sorry if this is trivial, but how can I run an experiment?
I tried to:
run(E`JuliaRL_BasicDQN_CartPole`)
But I get:
ERROR: LoadError: UndefVarError: @E_cmd not defined
What am I missing?
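(A guess at what is missing, as a sketch only: in a later issue below it is noted that the E`...` string macro, @E_cmd, lives in ReinforcementLearningCore, and in the PPO report further down the macro resolves after loading the umbrella package. So something like the following session may be what's needed:)
julia> using ReinforcementLearning          # re-exports the experiment machinery, including the E`...` macro
julia> run(E`JuliaRL_BasicDQN_CartPole`)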
Hello,
TD3 is mentioned in the README, but I can't seem to find it anywhere in the source. Is it implemented?
Along the same lines, I tried the JuliaRL_TD3_Pendulum
experiment from the docs and it looks like it is missing:
julia> using ReinforcementLearning
julia> run(E`JuliaRL_TD3_Pendulum`)
ERROR: LoadError: MethodError: no method matching Experiment(::Val{:JuliaRL}, ::Val{:TD3}, ::Val{:Pendulum}, ::Nothing)
Closest candidates are:
Experiment(::Any, ::Any, ::Any, ::Any, ::String) at /Users/rlee18/.julia/packages/ReinforcementLearningCore/xwt8K/src/core/experiment.jl:7
Experiment(::Any, ::Any, ::Any, ::Any, ::Any) at /Users/rlee18/.julia/packages/ReinforcementLearningCore/xwt8K/src/core/experiment.jl:7
Experiment(::Val{:JuliaRL}, ::Val{:DDPG}, ::Val{:Pendulum}, ::Nothing; save_dir, seed) at /Users/rlee18/.julia/packages/ReinforcementLearningZoo/uxBP8/src/experiments/rl_envs.jl:677
...
Stacktrace:
[1] Experiment(::String) at /Users/rlee18/.julia/packages/ReinforcementLearningCore/xwt8K/src/core/experiment.jl:32
[2] @E_cmd(::LineNumberNode, ::Module, ::Any) at /Users/rlee18/.julia/packages/ReinforcementLearningCore/xwt8K/src/core/experiment.jl:25
in expression starting at REPL[2]:1
Thanks!
julia> using ReinforcementLearning
julia> run(E`JuliaRL_PPO_Pendulum`)
ERROR: LoadError: UndefVarError: action_space not defined
Stacktrace:
[1] Experiment(::Val{:JuliaRL}, ::Val{:PPO}, ::Val{:Pendulum}, ::Nothing; save_dir::Nothing, seed::Int64)
@ ReinforcementLearningZoo ~/.julia/packages/ReinforcementLearningZoo/ma4P7/src/experiments/rl_envs/JuliaRL_PPO_Pendulum.jl:17
[2] Experiment(::Val{:JuliaRL}, ::Val{:PPO}, ::Val{:Pendulum}, ::Nothing)
@ ReinforcementLearningZoo ~/.julia/packages/ReinforcementLearningZoo/ma4P7/src/experiments/rl_envs/JuliaRL_PPO_Pendulum.jl:9
[3] Experiment(s::String)
@ ReinforcementLearningCore ~/.julia/packages/ReinforcementLearningCore/NWrFY/src/core/experiment.jl:35
[4] var"@E_cmd"(__source__::LineNumberNode, __module__::Module, s::Any)
@ ReinforcementLearningCore ~/.julia/packages/ReinforcementLearningCore/NWrFY/src/core/experiment.jl:25
in expression starting at REPL[2]:1
Running Pendulum with some other algorithm, or running PPO with some other environment, seems to work. I can't really see where the problem is, but I thought you would like to know something is amiss.
This is my script for running:
id = "JuliaRL_A2C_CartPole"
e = Experiment(id)
agent = e.policy
env = e.env
stop_condition = e.stop_condition
hook = TotalRewardPerEpisode()
run(agent, env, stop_condition, hook)
rewards = hook.rewards
and it gives me this error:
LoadError: MethodError: no method matching +(::Float64, ::Vector{Float32})
For element-wise addition, use broadcasting with dot syntax: scalar .+ array
I found that the error comes from line 139 of hook.jl:
function (hook::TotalRewardPerEpisode)(::PostActStage, agent, env)
hook.reward += reward(env)
end
Here, reward(env) returns a vector:
Float32[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
I didn't change any code; I'm just using the newest version of the package, installed via add ReinforcementLearning#master.
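A possible workaround sketch (not the package's built-in hook; this assumes the environment here is a vectorized one, so reward(env) returns one reward per sub-environment):

# Hypothetical hook that keeps one running total per sub-environment.
mutable struct TotalRewardPerEpisodeVec
    rewards::Vector{Float32}
end
TotalRewardPerEpisodeVec(n::Int) = TotalRewardPerEpisodeVec(zeros(Float32, n))

function (hook::TotalRewardPerEpisodeVec)(::PostActStage, agent, env)
    # Broadcasting (.+=) avoids the +(::Float64, ::Vector{Float32}) MethodError above.
    hook.rewards .+= reward(env)
end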
Hi! I'm using the PPO implementation for my custom environment with a continuous action space. I built my custom experiments based on the PPO pendulum experiment template, where the actor and critic are defined explicitly with optimizer=ADAM(3e-4). After playing with it for a while, I realized that I have to use the optimizer defined as part of the ActorCritic type if I want to change the learning rate, etc. It looks like the optimizers defined for the actor and critic are not used, so it would be less confusing if the optimizer were specified only in the ActorCritic call in the template.
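In other words, something like the following (a sketch based on the pendulum template; actor_model and critic_model stand for the Flux chains from that template, and the exact keyword layout is assumed, not copied from the current source):

approximator = ActorCritic(
    actor  = NeuralNetworkApproximator(model = actor_model),   # no per-network optimizer here
    critic = NeuralNetworkApproximator(model = critic_model),  # no per-network optimizer here
    optimizer = ADAM(3e-4),  # the one optimizer that is actually applied during update!
)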
I installed the package fine on my desktop computer, but I wanted to try using a GPU via nextjournal.com,
so on an instance with Julia 1.3.1 I install it with:
Pkg.add("ReinforcementLearning")
(I also add Flux, just in case, due to the error below.)
Then, when running using ReinforcementLearning, I get:
ERROR: LoadError: LoadError: UndefVarError: TrackedArray not defined
Stacktrace:
[1] getproperty(::Module, ::Symbol) at ./Base.jl:13
[2] top-level scope at /root/.julia/packages/ReinforcementLearning/qSdCS/src/learner/dqn.jl:70
[3] include at ./boot.jl:328 [inlined]
[4] include_relative(::Module, ::String) at ./loading.jl:1105
[5] include at ./Base.jl:31 [inlined]
[6] include(::String) at /root/.julia/packages/ReinforcementLearning/qSdCS/src/ReinforcementLearning.jl:1
[7] top-level scope at /root/.julia/packages/ReinforcementLearning/qSdCS/src/ReinforcementLearning.jl:38
[8] include at ./boot.jl:328 [inlined]
[9] include_relative(::Module, ::String) at ./loading.jl:1105
[10] include(::Module, ::String) at ./Base.jl:31
[11] top-level scope at none:2
[12] eval at ./boot.jl:330 [inlined]
[13] eval(::Expr) at ./client.jl:425
[14] top-level scope at ./none:3
in expression starting at /root/.julia/packages/ReinforcementLearning/qSdCS/src/learner/dqn.jl:70
in expression starting at /root/.julia/packages/ReinforcementLearning/qSdCS/src/ReinforcementLearning.jl:38
P.S. I am very excited about this package! It's so clean and nicely structured!
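(For what it's worth, an UndefVarError for TrackedArray usually means an older ReinforcementLearning release was resolved against a Flux version that no longer ships Tracker; TrackedArray existed in Flux 0.9 and earlier. A rough way to check, using only standard Pkg commands:)

using Pkg
Pkg.status()   # see which versions of ReinforcementLearning and Flux were resolved
Pkg.update()   # a newer release, or a fresh environment with compatible bounds, may fix it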
I found a possible error:
In rainbow.jl, when calculating the error at lines 177 to 179, the logits from the online network are not passed through softmax.
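For illustration, the kind of change being suggested looks roughly like this (a sketch, not the actual rainbow.jl code; logits and target_dist are made-up names):

using Flux: logsoftmax
using Statistics: mean

# logits: n_atoms × batch_size output of the online network
# target_dist: projected target distribution of the same shape
logp = logsoftmax(logits; dims = 1)                  # normalize the logits first
batch_losses = -vec(sum(target_dist .* logp; dims = 1))
loss = mean(batch_losses)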
I think this package can be a superset of google/dopamine after implementing IQN (which should be ready in the next week). And maybe stable-baselines as the next step?
A pretrained agent is supposed to be loaded with Agent(artifact"EnvType_envname_Method_version_role").
legal_actions
According to the README, we only need to load ReinforcementLearningZoo and ReinforcementLearningEnvironments. But I think we also need to load ReinforcementLearningCore to be able to run single-line experiments, since the macro @E_cmd is defined in ReinforcementLearningCore:
sid dev-RLZoo-GridWorlds $ julia --project=.
(Julia startup banner: Version 1.5.0 (2020-08-01), official release)
julia> using ReinforcementLearningZoo
julia> using ReinforcementLearningEnvironments
julia> run(E`JuliaRL_BasicDQN_CartPole`)
ERROR: LoadError: UndefVarError: @E_cmd not defined
in expression starting at REPL[3]:1
julia>
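A sketch of the workaround described above (assuming the E`...` string macro, @E_cmd, is indeed defined and exported by ReinforcementLearningCore):

julia> using ReinforcementLearningCore     # provides the E`...` string macro
julia> using ReinforcementLearningZoo
julia> using ReinforcementLearningEnvironments
julia> run(E`JuliaRL_BasicDQN_CartPole`)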
Necessary changes:
Add a dist field to the PPOLearner (just like @norci did in VPG).
The following method needs to be extended to recognize environments with a continuous action space. Currently the PPOLearner is assumed to return a (batch of) logits. I'd suggest renaming PPOLearner to PPOPolicy and returning an action directly.
A GaussianNetwork is also needed (a rough sketch follows below).
Calculating the entropy loss in update! is hard-coded. It would be better to split it into a separate function to support continuous distributions (or to reuse the one in StatsBase or Distributions; but use them with caution! I had some problems using them with Zygote before).
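A rough sketch of what such a Gaussian policy head and a hand-written, Zygote-friendly entropy term could look like (names and layout are assumptions, not the package's actual implementation):

using Flux

# Hypothetical Gaussian policy head: maps a state to the mean and standard
# deviation of a diagonal Gaussian over continuous actions.
struct GaussianActor{P,M,S}
    pre::P     # shared feature extractor, e.g. a Chain of Dense layers
    μ::M       # mean head
    logσ::S    # log standard-deviation head
end
Flux.@functor GaussianActor

function (actor::GaussianActor)(state)
    h = actor.pre(state)
    return actor.μ(h), exp.(actor.logσ(h))
end

# Entropy of a diagonal Gaussian, 0.5 * (1 + log(2πσ²)) summed over dimensions,
# written out by hand rather than via Distributions.entropy so it stays
# differentiable under Zygote.
gaussian_entropy(σ) = sum(0.5f0 .* (1f0 .+ log.(Float32(2π) .* σ .^ 2)))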