Comments (14)
Hi, thanks for your comment.
In short, yes, I agree with your pain. However, I still believe this is not generally possible due to the current implementation of flatten
with Discrete which cannot be solved without possibly breaking a large amount of users' code.
I don't know if SB3 supports custom spaces but I suggest making your own custom space for OneHot. An alternative solution is a hack where you create a new class that inherits from Box with a custom sample
function that is unflattenable.
from gymnasium.
Hey, I think this is a good idea.
Would you be interested in adding the argument to MultiBinary
?
from gymnasium.
Hey @pseudo-rnd-thoughts , absolutely. How do we proceed from here?
from gymnasium.
I'm not sure about this tbh. If we introduce this, we'd also have to adjust the contains
method.
The code in both sample
and contains
would probably have a structure like
def sample(self):
if self.onehot:
...
else:
...
at which point the functionality should probably just be split into two classes.
So then we can consider whether we want to create a new OneHot
space. But unless I'm missing something, this conveys the same exact information as a Discrete
(or MultiDiscrete
if we want a different shape), just represented differently, so it feels redundant.
from gymnasium.
I am not sure I completely agree that this would convey the same information of MultiDiscrete
...
MultiDiscrete
uses integer encoding to represent different classes whereas this directly point to one-hot encoding, which just feel way more natural when dealing with lots of problems.
I agree you can always go from one to the other, it is just a matter of how common one practice is with respect to others.
The contains
method can be adjusted with ease. As a matter of fact, I have just implemented my own custom space starting from MultiBinary
and modifying it. I could post is here maybe, you'll see is nothing too complex.
I agree with the self.onehot
switch. I was thinking I could implement the sample
and contains
methods with:
def sample(self):
def sample_onehot(self):
# code for sampling with onehot
return sample
def sample_simple(self):
# old code
return sample
if self.onehot:
return sample_onehot()
else:
return sample_simple()
The same structure would be used for contains. Maybe I could post here my OneHotSpace
once I am done testing it out.
from gymnasium.
I am not sure I completely agree that this would convey the same information of MultiDiscrete
I agree you can always go from one to the other, it is just a matter of how common one practice is with respect to others.
These two statements are in contradiction.
To be clear with what I mean, you have a direct correspondence between Discrete(n)
and Onehot(n)
. 3
is the exact same as [0, 0, 0, 1]
, it's just differently presented. If we add a OneHot
space, we do not enable using any new spaces because anything you'd like to express as a OneHot
, you can already express as a Discrete
.
I was thinking I could implement the sample and contains methods with:
To clarify my other point about the if X: ... else: ...
structure, the point was that it's generally just bad practice. Here's one SO thread, you can find a bunch more sources explaining it. The code with two functions defined inside of sample
is even more hacky, and we're trying to stay pretty strict on good code quality unless there are really good reasons to compromise on it.
Overall my argument against adding this goes roughly as follows:
- If we were to implement this as part of
MultiBinary
, we'd introduce a code smell via the boolean argument - We don't want to introduce bad new code
- If we were to add this feature in some way, a far preferable approach would be just splitting it in two separate classes, i.e.
MultiBinary
stays as it is, andOneHot
is created - If
OneHot
exists, it exactly duplicatesDiscrete
orMultiDiscrete
, so adding it is redundant.
I might be open to a different approach for adding this functionality, but I don't know how important it even is for Gymnasium to handle. Converting between integers and one-hot arrays is fairly trivial. If you want to use one-hot in your code, you can call space.contains(onehot.argmax())
and it's done. Going the other way could be something like np.eye(space.n)[space.sample()]
It miiiiiiiight make sense to have back-and-forth conversion as a method on Discrete
, so you'd have e.g.
space = Discrete(5)
space.to_onehot(space.sample()) # [0, 1, 0, 0, 0]
space.from_onehot([0, 1, 0, 0, 0]) # 2
but I'm not sure how useful it really is since DL/RL libraries usually have their own utilities to do that (and it's a one-liner anyways, unless we want to optimize for high dimensionality)
from gymnasium.
Thanks for the very precise answer. I clearly am a novice to these topics, hence my appreciation for such a detailed answer.
I agree on most of the things you said. I am still convinced that, for good design purposes in custom environments, having a OneHot
space already available would make things easier but this only is a personal opinion. By the way, I already implemented such space (mainly by modifying here-and-there MultiBinary
), maybe you can have a look if you thing that might interesting.
Ultimately, I am convinced that having a OneHot already available would make custom-env development easier, but I am more than happy to accept your feedback on this
from gymnasium.
from gymnasium.
Very similar to MultiBinary ;)
from gymnasium.
This conversion reminds me of #102 which pointed out that flatten(Discrete) -> Box
means that samples of Box
are not one-hotted therefore, unflatten(Box.sample)
are not contained in the original Discrete
space.
Therefore this might introduce more problems if people expect that flatten_space(Discrete) -> OneHot
To me, this is part of the legacy / technical debt that Gym / Gymnasium has and is difficult to change / fix due to its size and aim of backward compatibility.
from gymnasium.
This conversion reminds me of #102 which pointed out that
flatten(Discrete) -> Box
means that samples ofBox
are not one-hotted therefore,unflatten(Box.sample)
are not contained in the originalDiscrete
space.
Therefore this might introduce more problems if people expect thatflatten_space(Discrete) -> OneHot
I am not sure I completely understand this... Can you please point out which conversion are you referring to?
So why not adding a OneHot
space for all those cases in which it just feel more natural than using Discrete
/MultiDiscrete
for custom environment design? I already have mine implemented and tested. Let me know if you would be interested in testing it out.
from gymnasium.
Hey there, any update?
from gymnasium.
Hey,
Read #102 for the whole discussion, in short, Gymnasium allows users to flatten a space which for Discrete
is transformed to a Box
. The problem is that samples generated by the Box
cannot be reverted back to the Discrete values as they are not one-hotted. See the example code in #102. The problem is that we can't solve this and just have to put a warning that flattened samples, i.e., Box.sample()
, are not guaranteed to exist within the original space, i.e., Discrete
.
One of the proposals in #102 was to introduce a OneHot
as replacement to Box
in this case but we found there are critical reasons we can't do this.
In addition to the code smell issues pointed out by @RedTachyon, I suspect that introducing the OneHot
will cause confusion with the issue related above and given that Discrete
is functionally identical to OneHot
for users if they want this feature then I don't think we will add a OneHot
space. Finally, adding a space into Gymnasium is not as simple as one thinks given we have many spaces.utils
and vector.utils
functions that need to be added / updated.
Thank you for opening the issue, I am going to close the issue, we can continue the discussion until there is a particularly strong argument that we haven't considered yet then we can reopen
from gymnasium.
Hi, I have a fairly simple motivating example and ran across this and #102.
I'm trying to write a new gym env (hopefully written The Right Way from the start), and I'd like to run stable-baselines3 on it.
The env has an action space of type Dict(Discrete, ...)
.
stable-baseline3 does not support Dict
action spaces, so as far as I can tell the Right way to handle this is to wrap the env in a FlattenAction wrapper. However, doing so results in errors as stable-baselines I assume calls sample() on its end, and generates invalid one-hots of e.g. all-zeros.
The advice in #102 to sample before flattening does not seem to apply here as the sampling isn't on my end.
The only fix I can think of would be to do some truly horrible one-hotifying in the FlattenAction wrapper before the invalid values can be passed to unflatten
. np.argmax
also doesn't seem like it would suffice alone as it front-biases the results in the case of ties. Similarly if I changed my space to use MultiBinary my env-internal code would be a mess of calls to an unbiased-argmax.
But overall the fundamental problem seems to me to be that there does not exist a space that is both 1. flat 2. sample() behaves like a one-hot. Discrete violates 1, MultiBinary violates 2. Without such a space, there doesn't appear to be a 'clean' env setup for both random sampling and actually feeding values to/from a NN.
To me, this would make the addition of the Discrete-lookalike OneHot
space worth it. It would also save users from googling for half an hour trying to figure out whether they should go for MultiBinary/Box or Discrete when they know they want a one hot (as I did). Due to this I think in some cases it may actually reduce confusion in spite of the similarity to Discrete.
from gymnasium.
Related Issues (20)
- The dynamics for Toy text environments HOT 3
- [Bug Report] MultiDiscrete spaces' __eq__ method needs to check shape. HOT 3
- [Proposal] new continuous space `Continuous` HOT 1
- [Bug Report] Code examples in Vector API documentation use deprecated "gym.vector.make" instead of "gym.make_vec" HOT 3
- [Proposal] Add Optional State Variable in `initial` method of `FuncEnv` HOT 1
- [Question] Rough timeline for Atari (ALE) support for gymnasium>=1.0. HOT 3
- [Question] Collect multiple images from different cameras HOT 6
- [Proposal] Default metadata in BaseMujocoEnv HOT 1
- Setting up seed properly in Custom env HOT 1
- [Question] ValueError: ndarray is not C-contiguous HOT 3
- [Question] Is there any reason to add epsilon to action mean? HOT 4
- How to interact with the LunarLander-v2 with a keyboard?
- [Bug Report] Human vector rendering fails
- spaces.Dict: Make `sorted(spaces.items())` optional HOT 2
- [Bug Report] Wrong information about the beginning observation in InvertedDoublePendulum-v4 HOT 1
- ERROR: Cannot install gymnasium[atari]==0.26.1, gymnasium[atari]==0.26.2 and gymnasium[atari]==0.26.3 because these package versions have conflicting dependencies. HOT 2
- [Bug Report] Discrete space does not register as inheriting from gymnasium.spaces.Space when imported directly HOT 1
- [Question] What actions are expected for a previously terminated sub-env in a vector (next `env.step()` should return reset obs so the actions shouldn't matter)? HOT 2
- [Bug Report] 'HumanRendering' object has no attribute 'window' HOT 1
- [Bug Report] Can't install the Gymnasium with Atari dependencies. HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from gymnasium.