Comments (6)
Strange, it works here. Also, line 341 in the original file is args = parser.parse_args() and not the solve_mdp line. Are you sure you didn't change anything in the file?
from ai-toolbox.
As a sanity check, you could try adding the following at line 264 (in the loop that constructs the probabilities):
for state in range(len(S)):
    coord = decodeState(state)
    T.append([[getTransitionProbability(coord, action,
                                        decodeState(next_state))
               for next_state in range(len(S))] for action in A])
    print([sum(x) for x in T[-1]])  # <-------------------- ADD THIS LINE
The last line will perform the sums for you; all printed numbers should be 1.0, otherwise something is going wrong.
from ai-toolbox.
Ah, I see. Something is going wrong in the creation of the transition function. What the Python code in the example is doing is creating a 3-dimensional matrix, with dimensions SxAxS, where each entry T[s][a][s'] corresponds to the probability of transitioning from state s to state s', given action a.
What this means is that, to be correct probability distributions, all these numbers must be valid probabilities (between 0 and 1), and the sum over all possible transitions in T[s][a] must be 1.0. The error is raised because these assumptions do not hold, so the program aborts. The prints are there to check that all these sums are indeed 1.0, and in your case they are not, which explains it.
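If you'd rather have the check collected in one place than read the printed sums by eye, here is a minimal sketch of the same idea. The helper name check_transition_function and the tolerance are made up for illustration and are not part of AI-Toolbox; it only assumes T is a nested list indexed as T[s][a][s'].

```python
def check_transition_function(T, tol=1e-9):
    """Return (state, action, row_sum) for every T[s][a] that is not a
    valid probability distribution. Hypothetical helper, not library code."""
    bad = []
    for s, per_action in enumerate(T):
        for a, row in enumerate(per_action):
            total = sum(row)
            # Every entry must lie in [0, 1] and the row must sum to 1.0.
            out_of_range = any(p < 0.0 or p > 1.0 for p in row)
            if out_of_range or abs(total - 1.0) > tol:
                bad.append((s, a, total))
    return bad


# Toy model with two states and two actions; T[0][1] is deliberately broken.
T = [
    [[0.5, 0.5], [0.5, 0.25]],
    [[1.0, 0.0], [0.0, 1.0]],
]
problems = check_transition_function(T)
print(problems)
```

An empty list means every row passed; otherwise each tuple tells you which state and action to look at.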
That said, I think I found the problem. It's a subtle difference in how integer division works between Python 2 and Python 3. I'm going to do a patch soon that makes the example more "cross-compatible" with both versions.
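To make the difference concrete, here is a small sketch of what happens under Python 3. The values of SQUARE_SIZE and state are picked only for illustration and may not match your example file.

```python
SQUARE_SIZE = 11  # assumed value, for illustration only
state = 1234      # arbitrary state index

# In Python 3, "/" is always true division and yields a float.
# In Python 2, "/" between two ints floors the result instead.
true_div = state / SQUARE_SIZE

# "//" is floor division and yields an int in both versions.
floor_div = state // SQUARE_SIZE

# Once the state becomes a float, the next "%" yields a fractional
# "coordinate", so the decoded positions are no longer grid cells.
fractional_coord = true_div % SQUARE_SIZE

print(type(true_div).__name__, type(floor_div).__name__)
print(fractional_coord == int(fractional_coord))
```

With fractional coordinates the transition probabilities get computed for positions that don't exist on the grid, which would be consistent with the sums not coming out as 1.0.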
As a quick fix, you need to replace the decodeState(state) function in your example file with:
def decodeState(state):
    """
    Convert from state_index to coordinate.

    Parameters
    ----------
    state: int
        Index of the state.

    Returns
    -------
    coord: tuple of int
        Four element tuple containing the position of the tiger and antelope.
    """
    coord = []
    for _ in range(4):
        c = state % SQUARE_SIZE
        state = state // SQUARE_SIZE  # This is the changed line that forces an integer division in Python 3.
        coord.append(c)
    return tuple(coord)
Regarding your problem, there's no direct way to do this "cleanly" without modifying the internals of the library (depending on which algorithm you plan to use).
The best solution that will maintain compatibility with everything else is to simply use the same transitions for all actions in your particular states. So, for all actions, the transition probabilities will be the same. This makes it so that picking the action does not affect the environment, which is what you want. This will allow planning algorithms to work correctly (like for example value iteration).
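A rough sketch of what I mean, assuming your transition function is a nested list indexed as T[s][a][s'] like in the example. The helper make_action_independent and the choice of which action to copy from are hypothetical, not something the library provides.

```python
def make_action_independent(T, states, reference_action=0):
    """For each listed state, copy the reference action's transition row
    to all actions, so the chosen action has no effect there.
    Hypothetical helper; modifies T in place and returns it."""
    for s in states:
        reference_row = list(T[s][reference_action])
        for a in range(len(T[s])):
            T[s][a] = list(reference_row)  # separate copy per action
    return T


# Toy model: two states, two actions.
T = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.3, 0.7], [0.6, 0.4]],
]
make_action_independent(T, states=[1])
print(T[1])
```

You would probably want to do the same for the rewards in those states, otherwise the planner can still prefer one action over another there.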
If you plan to use reinforcement learning (for example Q-learning), then you might also want to force your agent to pick a specific action (say, 0) in those states, as it will prevent unnecessary exploration in states where picking an action does not do anything.
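If you end up writing the action selection yourself, it could look roughly like this. This is plain Python with an epsilon-greedy rule I picked for illustration; select_action, forced_states and the Q-table layout Q[state][action] are all assumptions and not the library's API.

```python
import random


def select_action(Q, state, forced_states, epsilon=0.1, forced_action=0,
                  rng=random):
    """Epsilon-greedy selection that skips exploration in states where
    the action has no effect. Hypothetical sketch, not library code."""
    if state in forced_states:
        return forced_action  # always the same action, no exploration
    if rng.random() < epsilon:
        return rng.randrange(len(Q[state]))  # explore uniformly
    row = Q[state]
    return max(range(len(row)), key=row.__getitem__)  # greedy action


# Toy Q-table: two states, two actions.
Q = [[0.0, 1.0], [5.0, 2.0]]
forced = select_action(Q, 1, forced_states={1}, epsilon=1.0)
greedy = select_action(Q, 0, forced_states=set(), epsilon=0.0)
print(forced, greedy)
```

The forced branch runs before the exploration roll, so the agent never spends exploration steps in those states.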
Let me know if what I wrote makes sense to you :)
from ai-toolbox.
Sure, that'd be cool! I love to see what people are doing with the library :)
If the example works now, feel free to close the issue; if you then have more trouble with your setting, just open another one, no problem. Good luck for now!
from ai-toolbox.