Hello, Not sure if this repo is active, but I am interested in using

How to eliminate "invalid actions" about atc-reinforcement-learning HOT 3 OPEN

fvalka commented on September 25, 2024

Comments (3)

fvalka commented on September 25, 2024

Hello Eric,

The repo is not very active, but I still am.

Sounds like you're trying to perform actions which are outside of the action space.

If you are using the continuous action space with normalization (the default), every action component should be normalized to the range -1 to 1.

see the action space definition here:

atc-reinforcement-learning/envs/atc/atc_gym.py

Lines 81 to 82 in c603a40

    
    self.action_space = gym.spaces.Box(low=np.array([-1, -1, -1]),
                                       high=np.array([1, 1, 1]))
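A minimal sketch of keeping an action inside that Box before passing it to `env.step`. It assumes the "invalid action" reports come from components falling outside [-1, 1], which is only a guess based on the symptoms described; the helper name `clip_action` is hypothetical and not part of the repository.

```python
def clip_action(raw_action, low=-1.0, high=1.0):
    """Clamp every component of an action into [low, high].

    Hypothetical helper: the bounds mirror the Box definition quoted above.
    """
    return [min(max(float(a), low), high) for a in raw_action]


# A raw network output with two components outside the allowed range.
raw = [1.7, -0.2, -3.0]
action = clip_action(raw)
print(action)
```

For NumPy arrays, `np.clip(raw, -1.0, 1.0)` does the same thing in one call.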

Hope that helped.

All the best
Fabian

epaulz-vt commented on September 25, 2024

Thank you for your response. I have managed to move past the invalid action issue. However, I am having a hard time understanding how to properly interact with the action space of this environment from my custom DeepQ network... let me explain.

When training on an environment like CartPole or LunarLander, the "action space" is a set of scalar values (say 0-4), one of which is selected and then interpreted, and perhaps translated, by the environment. When I use that approach here, it seems that each "action" is a tuple of 3 separate actions (v, h, phi). When I try to choose a scalar action, I get an error because the environment expects to be able to index my action. However, my attempts to modify my model to select and store actions as tuples do not seem to be working.
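One common way to bridge a scalar-output DQN and a 3-component action is a lookup table that maps each scalar index to one (v, h, phi) triple. The sketch below is an illustration under assumptions, not the repository's method: `BINS_PER_DIM`, `make_action_table`, and `index_to_action` are hypothetical names, and three levels per component is an arbitrary choice.

```python
import itertools

BINS_PER_DIM = 3  # hypothetical: number of discrete levels per component


def make_action_table(bins_per_dim=BINS_PER_DIM, dims=3):
    """Build every combination of evenly spaced levels in [-1, 1]."""
    levels = [-1.0 + 2.0 * i / (bins_per_dim - 1) for i in range(bins_per_dim)]
    return list(itertools.product(levels, repeat=dims))


ACTION_TABLE = make_action_table()


def index_to_action(index):
    """Translate the scalar index chosen by the network into an indexable action."""
    return list(ACTION_TABLE[index])


# The network would then have len(ACTION_TABLE) outputs instead of env.action_space.n.
print(len(ACTION_TABLE))
print(index_to_action(13))
```

The replay buffer can keep storing the scalar index, so the rest of a standard DQN loop stays unchanged; only the value passed to `env.step` goes through `index_to_action`. The table grows as bins cubed, so finer control quickly becomes expensive.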

Do you perhaps have any examples of training a model other than those from 'baselines' so that I could get a better idea of how to interact with this environment? I am very interested in getting this working.

epaulz-vt commented on September 25, 2024

I suppose a simpler way to explain my dilemma is that I don't quite understand how to interact with the continuous action space (I am still fairly new to machine learning). I see that there seems to be a way to switch the environment to a discrete action space. However, no matter which mode it's in, when I try to inspect the action space with "num_outputs = env.action_space.n" it tells me that 'Box' and 'MultiDiscrete' don't have an 'n' attribute.
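For context, in Gym only `Discrete` spaces expose `n`; `MultiDiscrete` exposes `nvec` (one size per dimension) and `Box` exposes `shape`, `low`, and `high`. A rough sketch of sizing the network output for each case follows. The function name `num_outputs_for` is hypothetical, and the stand-in objects only mimic the relevant attributes so the snippet runs without the environment installed; the `nvec` values are made up, not taken from the repository.

```python
from types import SimpleNamespace


def num_outputs_for(space):
    """Return an output size that fits the kind of action space given."""
    if hasattr(space, "n"):
        # Discrete: one output (e.g. one Q-value) per action.
        return int(space.n)
    if hasattr(space, "nvec"):
        # MultiDiscrete: one group of outputs per dimension.
        return [int(k) for k in space.nvec]
    # Box: one continuous output per dimension.
    return int(space.shape[0])


# Stand-ins that mimic the attributes of real Gym spaces (assumed values).
box_like = SimpleNamespace(shape=(3,))
multi_discrete_like = SimpleNamespace(nvec=[5, 5, 5])
discrete_like = SimpleNamespace(n=4)

print(num_outputs_for(box_like))
print(num_outputs_for(multi_discrete_like))
print(num_outputs_for(discrete_like))
```

With a real environment, `num_outputs_for(env.action_space)` would replace the direct `env.action_space.n` lookup.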
