
DiscreteValueIteration


This package implements the discrete value iteration algorithm in Julia for solving Markov decision processes (MDPs). The user should define the problem according to the API in POMDPs.jl. Examples of problem definitions can be found in POMDPModels.jl. For an extensive tutorial, see these notebooks.

There are two solvers in the package. The "vanilla" ValueIterationSolver calls functions from the POMDPs.jl interface in every iteration, while the SparseValueIterationSolver first creates sparse transition and reward matrices and then performs value iteration on that matrix representation. Both solvers take advantage of sparsity, but the SparseValueIterationSolver is generally faster because of low-level optimizations, whereas the ValueIterationSolver has the advantage that it does not require allocating transition matrices (which could be too large to fit in memory).
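For reference, the update both solvers perform is the standard Bellman backup. The following is a minimal sketch of value iteration on explicit matrices, independent of this package's internals (the function name and argument layout are illustrative, not part of the package's API):

```julia
# Plain value iteration, sketched for illustration only.
# T is a vector of S×S transition matrices (one per action, rows = current
# state), R is an S×A reward matrix, and γ is the discount factor.
function value_iteration(T, R, γ; tol=1e-6, maxiter=100)
    S, A = size(R)
    V = zeros(S)
    for _ in 1:maxiter
        # Q(s,a) = R(s,a) + γ * Σ_{s'} T(s'|s,a) V(s')
        Q = hcat((R[:, a] .+ γ .* (T[a] * V) for a in 1:A)...)
        Vnew = vec(maximum(Q; dims=2))   # greedy Bellman backup
        maximum(abs.(Vnew .- V)) < tol && return Vnew
        V = Vnew
    end
    return V
end
```

The SparseValueIterationSolver effectively precomputes the matrices T and R once, so each iteration is a handful of sparse matrix-vector products rather than many interface calls.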

Installation

Start Julia and make sure you have the JuliaPOMDP registry:

import POMDPs
POMDPs.add_registry()

Then install using the standard package manager:

using Pkg; Pkg.add("DiscreteValueIteration")

Usage

Use

using POMDPs
using DiscreteValueIteration
@requirements_info ValueIterationSolver() YourMDP()
@requirements_info SparseValueIterationSolver() YourMDP()

to get a list of POMDPs.jl functions necessary to use the solver. This should return a list of the following functions to be implemented for your MDP:

discount(::MDP)
n_states(::MDP)
n_actions(::MDP)
transition(::MDP, ::State, ::Action)
reward(::MDP, ::State, ::Action, ::State)
stateindex(::MDP, ::State)
actionindex(::MDP, ::Action)
actions(::MDP, ::State)
support(::StateDistribution)
pdf(::StateDistribution, ::State)
states(::MDP)
actions(::MDP)
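As a concrete illustration, here is a hypothetical two-state, two-action MDP implementing the functions above. The type name SimpleMDP and the dynamics are invented for this sketch; Deterministic comes from POMDPModelTools and already provides support and pdf for the transition distribution (exact package and function names may differ across POMDPs.jl versions):

```julia
using POMDPs
using POMDPModelTools # provides the Deterministic distribution

# A toy MDP with states 1:2 and actions 1:2; action a moves to state a.
struct SimpleMDP <: MDP{Int, Int} end

POMDPs.discount(::SimpleMDP) = 0.95
POMDPs.n_states(::SimpleMDP) = 2
POMDPs.n_actions(::SimpleMDP) = 2
POMDPs.states(::SimpleMDP) = 1:2
POMDPs.actions(::SimpleMDP) = 1:2
POMDPs.actions(::SimpleMDP, s::Int) = 1:2
POMDPs.stateindex(::SimpleMDP, s::Int) = s
POMDPs.actionindex(::SimpleMDP, a::Int) = a
POMDPs.transition(::SimpleMDP, s::Int, a::Int) = Deterministic(a)
POMDPs.reward(::SimpleMDP, s::Int, a::Int, sp::Int) = sp == 2 ? 1.0 : 0.0
```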

Once the above functions are defined, the solver can be called with the following syntax:

using DiscreteValueIteration

mdp = MyMDP() # initializes the MDP
solver = ValueIterationSolver(max_iterations=100, belres=1e-6) # initializes the Solver type
policy = solve(solver, mdp) # runs value iteration and returns the policy

To extract the optimal action for a given state from the resulting policy, simply call the action function:

a = action(policy, s) # returns the optimal action for state s
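The policy also stores the computed utilities. Assuming the standard POMDPs.jl value function is implemented for this policy type, they can be queried as:

```julia
v = value(policy, s) # expected discounted return when starting in state s
```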
