boettiger-lab / sarsop Goto Github PK

:package: A library for solving POMDPs

License: GNU General Public License v2.0

R 4.50% C++ 81.85% C 10.98% Makefile 0.91% TeX 1.76%

r-package pomdp fisheries decision-theory

sarsop's Introduction

SARSOP for R

library(sarsop)
library(tidyverse) # for plotting

Problem definition

Our problem is defined by a state space, states, representing the true fish stock size (in arbitrary units), and an action space, actions representing the number of fish that will be harvested (or attempted to harvest). For simplicity, we will permit any action from 0 harvest to the maximum possible state size.

A stock recruitment function, f describes the expected future state given the current state. The true future state will be a stochastic draw with this mean.

A reward function determines the value of taking action of harvesting h fish when stock size is x fish; for simplicity this example assumes a fixed price per unit harvest, with no cost on harvesting effort. Future rewards are discounted.

states <- seq(0,1, length=50)
actions <- states
observations <- states
sigma_g <- 0.1
sigma_m <- 0.2
reward_fn <- function(x,h) pmin(x,h) # - .001*h
discount <- 0.95

r <- 1
K <- 0.75
f <- function(x, h){ # ricker
  s <- pmax(x - h, 0)
  s * exp(r * (1 - s / K) )
}

Semi-analytic solution to Deterministic problem

For comparison, we note that an exact solution to the deterministic or low-noise problem comes from Reed 1979, which proves that a constant escapement policy (S^) is optimal, with (\tfrac{df}{dx}|_{x = S^} = 1/\gamma) for discount (\gamma),

S_star <- optimize(function(x) -f(x,0) + x / discount, 
                   c(min(states),max(states)))$minimum
det_policy <- sapply(states, function(x) if(x < S_star) 0 else x - S_star)
det_action <- sapply(det_policy, function(x) which.min(abs(actions - x)))

When the state is observed without error, the problem is a Markov Decision Process (MDP) and can be solved by stochastic dynamic programming (e.g. policy iteration) over the discrete state and action space. To do so, we need matrix representations of the above transition function and reward function.

sarsop provides a convenience function for generating transition, observation, and reward matrices given these parameters for the fisheries management problem:

m <- fisheries_matrices(states, actions, observations, reward_fn, 
                        f, sigma_g, sigma_m, noise = "lognormal")

POMDP Solution

In the POMDP problem, the true state is unknown, but measured imperfectly. We introduce an observation matrix to indicate the probability of observing a particular state (y) given a true state (x). In principle this could depend on the action taken as well, though for simplicity we assume only a log-normal measurement error independent of the action chosen.

Long-running code to actually compute the solution.

log_dir <- "inst/extdata/vignette" 
alpha <- sarsop(m$transition, m$observation, m$reward, discount, 
                log_dir = log_dir,
                precision = .1, timeout = 200) # run much longer for more precise curve

sarsop logs solution files in the specified directory, along with a metadata table. The metadata table makes it convenient to store multiple solutions in a single directory, and load the desired solution later using it’s id or matching metatata. We can read this solution from the log where it is stored.

Given the model matrices and alpha vectors. Start belief with a uniform prior over states, compute & plot policy:

state_prior = rep(1, length(states)) / length(states) # initial belief
df <- compute_policy(alpha, m$transition, m$observation, m$reward,  state_prior)

## append deterministic action
df$det <- det_action

ggplot(df, aes(states[state], states[state] - actions[policy])) + 
  geom_line(col='blue') + 
  geom_line(aes(y = states[state] - actions[det]), col='red')

Simulate management under the POMDP policy:

set.seed(12345)
x0 <- which.min(abs(states - K))
Tmax <- 20
sim <- sim_pomdp(m$transition, m$observation, m$reward, discount, 
                 state_prior, x0 = x0, Tmax = Tmax, alpha = alpha)

Plot simulation data:

sim$df %>%
  select(-value) %>%
  mutate(state = states[state], action = actions[action], obs = observations[obs]) %>%
  gather(variable, stock, -time) %>%
  ggplot(aes(time, stock, color = variable)) + geom_line()  + geom_point()

Plot belief evolution:

sim$state_posterior %>% 
  data.frame(time = 1:Tmax) %>%
  filter(time %in% seq(1,Tmax, by = 2)) %>%
  gather(state, probability, -time, factor_key =TRUE) %>% 
  mutate(state = as.numeric(state)) %>% 
  ggplot(aes(state, probability, group = time, alpha = time)) + geom_line()

Developer Notes

Unfortunately the appl source code is a bit dated and not suitable for using as a shared library. It builds with lot of warnings and on Windows it only builds with MS Visual Studio. This package tries to make things as easy as possible for the user by bundling the appl executables and wrap them with system calls in R. This package also provides higher-level functions for POMDP analysis.

Thanks

The underlying algorithm, SARSOP: Successive Approximations of the Reachable Space under Optimal Policies was designed and described by Hanna Kurniawati, David Hsu, and Wee Sun Lee, in Department of Computer Science, National University of Singapore in 2008, see doi:10.15607/RSS.2008.IV.009. Kurniawati et al acknowledge Yanzhu Du and Xan Huang for helping with the software implementation in C++, which at the time of that publication, was available at http://motion.comp.nus.edu.sg/projects/pomdp/pomdp.html. That implementation has since moved (in 2017, after this R package wrapper was first implemented in 2016) to the GitHub repository: https://github.com/AdaCompNUS/sarsop. The C++ implementation, “APPL” acknowledges contributions based on an early version of ZMDP by Trey Smith (http://www.cs.cmu.edu/~trey/zmdp/). ZMDP in turn uses code from pomdp-solve by Tony Cassandra (http://www.cassandra.org/pomdp/code/index.shtml). The POMDPX parser uses TinyXML by Lee Thomason (http://www.grinninglizard.com/tinyxml/). Part of APPL makes use of code based on an early version of ZMDP released under Apache License 2.0. The POMDPX parser makes use of TinyXML released under zlib License. The rest of APPL is released under GNU General Public License V2.

Jeroen Ooms developed the original R package compilation setup, Mykel Kochenderfer and Markus Herrmann have been helpful in providing windows builds using MS Visual Studio. Milad Memarzadeh wrote the R POMDPX parsers and contributed to pacakge development. Carl Boettiger developed and maintains the R package. The C++ library here is modified from the original only to conform to modern C++ development standards and avoid compilation errors or warnings under gcc-8 and gcc-9, as well as modern clang compilers.

sarsop's People

Contributors

Stargazers

Watchers

Forkers

jimhester abarner luzvpascal aculich miladm12 minghao2016 rnaimehaom jeroen chanheungcho

sarsop's Issues

Errors on Apple clang 10.0 (Apple Mojave X-Code)

The package will still install and compile the pomdpsol binaries successfully on Apple Mojave using clang 10.0, but the resulting code will fail to run, throwing a segfault with exit statis -11.

This can be replicated on Travis MacOS X build by setting the xcode version to 10.1: https://travis-ci.org/boettiger-lab/sarsop/jobs/502695869
The error does not occur with any previous version of xcode, e.g. https://travis-ci.org/boettiger-lab/sarsop/jobs/502695870 (e.g. High Sierra X-Code flavor)

Linux builds seem un-impacted by this: I can build and run successfully on g++ / gcc toolchain, and can also confirm that code runs fine on Debian system that uses the latest (llvm-7) clang compilers.

The problem behavior occurs only on Apple’s llvm 10.0 AFIK (which is ~ based on open source LLVM 6.0.1, according to wikipedia).

On both Appveyor and my local windows machine, the code runs fine, though notably the binaries are not compiled at install time on Windows, but instead uses the pre-packaged binaries. However, on CRAN winbuilder machines throw a similar runtime error... Gabor has suggested this can be reproduced if the windows machines are missing an MSVC dll (Microsoft Visual Studio C++ libs), so may be unrelated.

pomdpsol producing error

Hi Carl,

first I wanted to mention that I am not very familiar with the sarsop package, as I am trying to use it through the pomdp package.
I did raise an issue with pomdp github: mhahsler/pomdp#14

There is a process halt when I try

library(pomdp)
data(Tiger)
pomdp::solve_SARSOP(Tiger)
Error in processx::run(path, strsplit(args, " ")[[1]], spinner = spinner, :
System command 'pomdpsol.exe' failed, exit status: -1073741515, stderr empty
Type .Last.error.trace to see where the error occurred
.Last.error.trace

Stack trace:

pomdp:::solve_SARSOP(Tiger)
base:::do.call(sarsop::pomdpsol, c(list(model = model_file, output = policy_file, ...
(function (model, output = tempfile(), precision = 0.001, timeout = NULL, ...
sarsop:::exec_program("pomdpsol", args, stdout = stdout, stderr = stderr, ...
processx::run(path, strsplit(args, " ")[[1]], spinner = spinner, ...
throw(new_process_error(res, call = sys.call(), echo = echo, ...

x System command 'pomdpsol.exe' failed, exit status: -1073741515, stderr empty

I am raising the issue here since I have also retried with the following, which removes the pomdp call to sarsop pomdpsol:

pomdp::write_POMDP(Tiger, "tigerpomdp")
sarsop::pomdpsol(paste(getwd(), "tigerpomdp", sep = "/"))
Error in processx::run(path, strsplit(args, " ")[[1]], spinner = spinner, :
System command 'pomdpsol.exe' failed, exit status: -1073741515, stderr empty
Type .Last.error.trace to see where the error occurred

Has this error been encountered by others?
Or am I doing something wrong that is easily detectable by people who use sarsop more regularly than me? One reason I am not sure is that my error is still occurring after I have tried some troubleshooting recommended by users of the pomdp package, and they haven't been able to reproduce my error.

Thanks and much appreciated.

Check assert_has_appl() in pomdpsol

Hi Carl,

I get on Windows the following:

> pomdpsol(model, output = policy, timeout = 1)
Error in processx::run(path, strsplit(args, " ")[[1]], spinner = spinner,  : 
  System command 'pomdpsol.exe' failed, exit status: -1073741515, stderr empty
Type .Last.error.trace to see where the error occurred

The error message is not very self-explanatory. Maybe you can check assert_has_appl() in pomdpsol and create an error message that points out that the installation had a problem and what most likely needs to be done (I assume installing something on Windows).

Best,
-Michael

write_pomdp should support XML version, factored MOMDP version

pomdpsol should parse standard-out info

Should extract such information as initialize time, convergence information.

Address warnings thrown by modern compilers

Note the following warnings on compile (These are taken From clang / llvm-7 on Debian, as clang is known for better error messages. Similar warnings appear on more recent gcc builds too.)

(below showing unique examples only, the same warning occurs in many many locations for each of these)

pomdp_spec.y:934:17: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
      ERR_enter("Parser<enterUniformMatrix>:", currentLineNumber,

include/pomdp_spec.yy.cc:900:2: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
        register yy_state_type yy_current_state;

appl-0.96/Parser/Cassandra/decision-tree.c:104:3: warning: implicit declaration of function 'checkAllocatedPointer' is invalid in C99
      [-Wimplicit-function-declaration]
  checkAllocatedPointer((void *)out);
  ^

appl-0.96/Parser/POMDPX/SparseTable.cpp:525:1: warning: control may reach end of non-void function [-Wreturn-type]
}
^
1 warning generated.

(and these ones seen only once:)

appl-0.96/Parser/POMDPX/tinyxmlparser.cpp -o appl-0.96/Parser/POMDPX/tinyxmlparser.o
appl-0.96/Parser/POMDPX/tinyxmlparser.cpp:325:19: warning: '&&' within '||' [-Wlogical-op-parentheses]
        while (*p && IsWhiteSpace(*p) || *p == '\n' || *p == '\r')
               ~~~^~~~~~~~~~~~~~~~~~~ ~~
appl-0.96/Parser/POMDPX/tinyxmlparser.cpp:325:19: note: place parentheses around the '&&' expression to silence this warning
        while (*p && IsWhiteSpace(*p) || *p == '\n' || *p == '\r')
                  ^
               (                     )
1 warning generated.

appl-0.96/Core/BeliefForest.cpp:17:15: warning: equality comparison result unused [-Wunused-comparison]
        sampleEngine == NULL;   
        ~~~~~~~~~~~~~^~~~~~~
appl-0.96/Core/BeliefForest.cpp:17:15: note: use '=' to turn this equality comparison into an assignment
        sampleEngine == NULL;   
                     ^~
                     =
1 warning generated.

appl-0.96/Bounds/AlphaVectorPolicy.cpp -o appl-0.96/Bounds/AlphaVectorPolicy.o
appl-0.96/Bounds/AlphaVectorPolicy.cpp:21:19: warning: assigning field to itself [-Wself-assign-field]
        this->policyFile = policyFile;
                         ^
appl-0.96/Bounds/AlphaVectorPolicy.cpp:165:27: warning: format specifies type 'char *' but the argument has type 'char (*)[200]' [-Wformat]
                if(fscanf(infile, "%s", &dvalue)==EOF){
                                   ~~   ^~~~~~~
appl-0.96/Bounds/AlphaVectorPolicy.cpp:178:37: warning: format specifies type 'float *' but the argument has type 'double *' [-Wformat]
                sscanf(contents, "%d %f", &index, &value);
                                     ~~           ^~~~~~
                                     %lf

appl-0.96/MathLib/SparseVector.cpp:486:23: warning: format specifies type 'unsigned int' but the argument has type 'const momdp::SparseVector *' [-Wformat]
        printf("this: %X\n", this);
                      ~~     ^~~~
1 warning generated.

Misc small issues

file is called readpolicy_MOMDP.R but function is findpolicy_MOMDP()
arguments to findpolicy_MOMDP() should be defined in roxygen .
lots of redundant code between read_policy() and findpolicy_MOMDP(). these functions could probably be combined, with the function either figuring out which one to use (e.g. if fs is given, do the MOMDP behavior)
what's the difference between write_pomdpx() function and the writepomdpx_POMDP() function? Looks like we can use the first and drop the second? (and update the pomdpplus code accordingly)?
write_pomdpx() then needs roxygen docs, @export