Code Monkey home page Code Monkey logo

inlautils's Introduction

INLAutils

Build Status codecov.io cran version

A package containing utility functions for the R-INLA package.

There's a fair bit of overlap with inlabru.

Installation

To install, first install INLA.

install.packages("INLA", repos="https://www.math.ntnu.no/inla/R/stable")

then install INLAutils

# From github
library(devtools)
install_github('timcdlucas/INLAutils')

# Load packages
library(INLA)
library(INLAutils)

Overview

Plotting

I find the the plot function in INLA annoying and I like ggplot2. So INLAutils provides an autoplot method for INLA objects.

      data(Epil)
      ##Define the model
      formula = y ~ Trt + Age + V4 +
               f(Ind, model="iid") + f(rand,model="iid")
      result = inla(formula, family="poisson", data = Epil, control.predictor = list(compute = TRUE))
     
      p <- autoplot(result)

plot of chunk autoplot

Because these are ggplot2 objects, we can easily modify them.

  # Find data names with names(p[[1]]$data)
  p[[1]] + 
    geom_line(aes(colour = var), size = 1.3) +
    palettetown::scale_colour_poke(pokemon = 'Oddish', spread = 4)

plot of chunk autoplot2

There is an autoplot method for INLA SPDE meshes.

    m = 100
    points = matrix(runif(m * 2), m, 2)
    mesh = inla.mesh.create.helper(
      points = points,
      cutoff = 0.05,
      offset = c(0.1, 0.4),
      max.edge = c(0.05, 0.5))
    
    autoplot(mesh)

plot of chunk autoplot_mesh

There are functions for plotting more diagnostic plots.

 data(Epil)
 observed <- Epil[1:30, 'y']
 Epil <- rbind(Epil, Epil[1:30, ])
 Epil[1:30, 'y'] <- NA
 ## make centered covariates
 formula = y ~ Trt + Age + V4 +
          f(Ind, model="iid") + f(rand,model="iid")
 result = inla(formula, family="poisson", data = Epil,
               control.predictor = list(compute = TRUE, link = 1))
 ggplot_inla_residuals(result, observed, binwidth = 0.1)

plot of chunk plot_residuals

 ggplot_inla_residuals2(result, observed, se = FALSE)
## `geom_smooth()` using method = 'loess'

plot of chunk plot_residuals

Finally there is a function for combining shapefiles, rasters (or INLA projections) and meshes. For more fine grained control the geoms defined in inlabru might be useful.

# Create inla projector
n <- 20
loc <- matrix(runif(n*2), n, 2)
mesh <- inla.mesh.create(loc, refine=list(max.edge=0.05))
projector <- inla.mesh.projector(mesh)

field <- cos(mesh$loc[,1]*2*pi*3)*sin(mesh$loc[,2]*2*pi*7)
projection <- inla.mesh.project(projector, field)

# And a shape file
crds <- loc[chull(loc), ]
SPls <- SpatialPolygons(list(Polygons(list(Polygon(crds)), ID = 'a')))

# plot
ggplot_projection_shapefile(projection, projector, SPls, mesh)

plot of chunk shapefileraster

Analysis

There are some helper functions for general analyses.

INLAstep runs stepwise variable selection with INLA.

  data(Epil)
  stack <- inla.stack(data = list(y = Epil$y),
                      A = list(1),
                      effects = list(data.frame(Intercept = 1, Epil[3:5])))
                      
  result <- INLAstep(fam1 = "poisson", 
                     Epil,
                     in_stack = stack,
                     invariant = "0 + Intercept",
                     direction = 'backwards',
                     include = 3:5,
                     y = 'y',
                     y2 = 'y',
                     powerl = 1,
                     inter = 1,
                     thresh = 2)
  
  result$best_formula
## y ~ 0 + Intercept + Base + Age + V4
## <environment: 0xb6b15b0>
  autoplot(result$best_model, which = 1)

plot of chunk INLAstep

makeGAM helps create a function object for fitting GAMs with INLA.

 data(Epil)
 formula <- makeGAM('Age', invariant = '', linear = c('Age', 'Trt', 'V4'), returnstring = FALSE)
 formula
## y ~ +Age + Trt + V4 + f(inla.group(Age), model = "rw2")
## <environment: 0xc1d0688>
 result = inla(formula, family="poisson", data = Epil)

Spatial leave-one-out cross-validation (sloo-cv)

The package INLAutils provides an approach to run sloo-cv for INLA objects.

# generate a dataframe and INLA output for the function
set.seed(10)
coords <- data.frame(long = c(rnorm(70), rnorm(30, 3)), lat = rnorm(100))
x <- data.frame(x1 = rnorm(100), x2 = rnorm(100))# x1 no relat., x2 pos. relat.
y <- x$x2 * 2 + rnorm(100)
dataf1 <- sp::SpatialPointsDataFrame(coords = coords, data = data.frame(y = y, x))
mesh <- INLA::inla.mesh.2d(loc = sp::coordinates(dataf1), max.edge = c(3, 3),cutoff = 1.3)
spde <- INLA::inla.spde2.matern(mesh, alpha=2)#SPDE model is defined
A <- INLA::inla.spde.make.A(mesh, loc = sp::coordinates(dataf1))#projector matrix
dataframe <- data.frame(dataf1) #generate dataframe with response and covariate
modform<-stats::as.formula(paste('y ~ -1+ x1 + x2 + y.intercept + f(spatial.field, model=spde)'))
stk <- INLA::inla.stack(data = list(y=dataframe$y), 
                        A = list(A, 1),
                        effects = list(list(spatial.field=1:spde$n.spde),
                        list(y.intercept = rep(1, length(dataframe$y)),
                             covariate = dataframe[c(-1)])), 
                        tag='est')
out <- INLA::inla(modform, family='normal',Ntrials = 1, data=INLA::inla.stack.data(stk, spde=spde),
                  control.predictor = list(A =INLA::inla.stack.A(stk),link=1),
                  control.compute = list( config=TRUE),control.inla = list(int.strategy='eb'))
out.field <- INLA::inla.spde2.result(out,'spatial.field', spde, do.transf = TRUE)
range.out <- INLA::inla.emarginal(function(x) x, out.field$marginals.range.nominal[[1]])

# parameters for the SLOO process
ss <- 20 # sample size to process (number of SLOO runs)
rad <- min(range.out, max(dist(coords)) / 4) # define the radius of the spatial buffer surrounding the removed point. Make sure it isn't bigger than 25% of the study area (Le Rest 2014)
modform <- y ~ -1+ y.intercept + x1 + x2 + f(spatial.field, model=spde)
alpha <- 0.05 # rmse and mae confidence intervals (1-alpha)

# run the function
cv <- inlasloo(dataframe = dataframe, 
               long = 'long', lat = 'lat',
               y = 'y', ss = ss, 
               rad = rad, modform = modform,
               mesh = mesh, family = 'normal',
               mae = TRUE)

plot of chunk inlaslooplot of chunk inlasloo

inlautils's People

Contributors

pyt215 avatar salauer avatar timcdlucas avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

inlautils's Issues

free x and y

I think most facets should be free in x and y with an option to not be.

Height of distributions depends on range of x (sums to 1 over range of x).

Option could simply be to document well

ww[[2]] + facet_wrap('var', scale = 'free')

or something.

Some errors

library(INLA)
library(INLAutils)

data <- data.frame(y = rpois(100, 10), x1 = rnorm(100))

data$x2 <- sin(data$y / 2) + rnorm(100, sd = 0.1)

ggplot(data, aes(y, x2)) + geom_point()

model <- inla(y ~ x1 + x2, data = data, family = 'poisson')

autoplot(model)

data stack in INLA step

stk <- inla.stack(data = list(y = dataframe$y), 
                  A = list(A, 1),
                  effects = list(s.index,
                                 list(y.intercept = rep(1, length(dataframe$y)),
                                      covariate = dataframe[-1])), 
                  tag='est')
                  
stk <- inla.stack(data = list(y = dataframe$y),
                  A = list(A, 1), 
                  effects = list(c(s.index, list(y.intercept = 1)),
                                 list(dataframe[-1])),
                  tag='est')

I think both of these formats work in inla but only the second seems to work in INLAstep. No idea why.

tests for which need to remove which values

data(Tokyo)
summary(Tokyo)

## Define the model
formula = y ~ f(time, model="rw2", cyclic=TRUE, param=c(1,0.0001)) - 1

## The call to inla
result = inla(formula, family="binomial", Ntrials=n, data=Tokyo)

autoplot(result)

fails because which = 1 doesn't work for whatever reason. It gives the error but then doesn't remove 1 from which.

Plot priors

Add option to plot priors.

I guess this should be for all plots where that makes sense.

It should default to off as the default prior is just mega flat

Error on priors = T with binomial data.

  inla.model <- inla(formula, 
                     family = 'binomial',
                     data = inla.stack.data(inla_point_stack, spde_full = spde_full),
                     Ntrials = examined,
                      control.family = list(link = "probit"),
                     control.predictor = list(A = inla.stack.A(inla_point_stack), compute=TRUE, link = 1)
                     )

autoplot(inla.model, which = 1, priors = T)

errors. Not sure why. This is the model with crazy posteriors.

autoplot to do

  • Axis titles
  • Tick labels i.e. 1:250 is not useful. Cut down to 10 max.
  • Understand what the fitted values and linear predictor plots actually are. Then make funciton names and docs fit.
  • reconsider CI. Darker grey inside, etc.

Autoplot with smoother

formula3 = Y ~ f(region.struct, model="besag", graph.file = g) +
           f(region, model = "iid") + f(x, model = "rw2")

result3 = inla(formula3, family = "poisson", data = Germany, E = E)
autoplot(result3)

The smoother has x axis label "ID" which doesn't make sense.

inlasloo fails if lat or long is called 'y'


library(sp)
data(meuse)




coords <- meuse[, c('x', 'y')] %>% scale
dataf1 <- sp::SpatialPointsDataFrame(coords = coords, data = meuse[, -c(1:2)])

mesh <- inla.mesh.2d(loc = sp::coordinates(dataf1), max.edge = c(0.2, 0.5), cutoff = 0.1)
spde <- inla.spde2.matern(mesh, alpha=2) # SPDE model is defined
A <- inla.spde.make.A(mesh, loc = sp::coordinates(dataf1)) # projector matrix
dataframe <- data.frame(dataf1) # generate dataframe with response and covariate
modform <- cadmium ~ -1 + y.intercept + ffreq + om + soil + lime + f(spatial.field, model = spde)
modform2 <- cadmium ~ -1 + y.intercept + ffreq + om + soil + lime

# make index for spatial field
s.index <- inla.spde.make.index(name="spatial.field",n.spde=spde$n.spde)

## Prepare the data
stk <- inla.stack(data=list(cadmium=dataframe$cadmium),
                      A=list(A,1), 
                      effects=list(c(s.index,list(y.intercept=1)),
                                   list(dataframe[, 7:10])),
                      tag='est')

out <- inla(modform, family = 'normal', Ntrials = 1,
            data = inla.stack.data(stk, spde = spde),
            control.predictor = list(A = inla.stack.A(stk), link = 1),
            control.compute = list(config = TRUE), 
            control.inla = list(int.strategy = 'eb'))
out.field <- inla.spde2.result(out,'spatial.field', spde, do.transf = TRUE)
range.out <- inla.emarginal(function(x) x, out.field$marginals.range.nominal[[1]])

# parameters for the SLOO process
ss <- 20 # sample size to process (number of SLOO runs)
# define the radius of the spatial buffer surrounding the removed point. 
rad <- min(range.out, max(dist(coords)) / 4) 
# Make sure it isn't bigger than 25% of the study area (see Le Rest et al.(2014))
alpha <- 0.05 # rmse and mae confidence intervals (1-alpha)

# run the function to compare both models
cv <- inlasloo(dataframe = dataframe, 
               long = 'x', lat = 'y',
               y = 'cadmium', ss = ss, 
               rad = rad, 
               modform = list(modform, modform2),
               mesh = mesh, family = 'normal',
               mae = TRUE)
  

because of the funky ordering of these lines.

    colnames(dataframe)[colnames(dataframe) == y] <- "y"
    colnames(dataframe)[colnames(dataframe) == long] <- "long"
    colnames(dataframe)[colnames(dataframe) == lat] <- "lat"

Add tests for plots

Mega basic plots to check that functions don't error.

Will bump up test coverage which is nice.

Worth doing to set out what needs to be in model autoplot. Combine with going through INLA plot function.

autoplot with multiple likelihood models


## An example with three independent AR(1)'s with separate means, but
## with the same hyperparameters. These are observed with three
## different likelihoods.

n = 100
x1 = arima.sim(n=n, model=list(ar=c(0.9))) + 0
x2 = arima.sim(n=n, model=list(ar=c(0.9))) + 1
x3 = arima.sim(n=n, model=list(ar=c(0.9))) + 2

## Binomial observations
Nt = 10 + rpois(n,lambda=1)
y1 = rbinom(n, size=Nt, prob = exp(x1)/(1+exp(x1)))

## Poisson observations
Ep = runif(n, min=1, max=10)
y2 = rpois(n, lambda = Ep*exp(x2))

## Gaussian observations
y3 = rnorm(n, mean=x3, sd=0.1)

## stack these in a 3-column matrix with NA's where not observed
y = matrix(NA, 3*n, 3)
y[1:n, 1] = y1
y[n + 1:n, 2] = y2
y[2*n + 1:n, 3] = y3

## define the model
r = c(rep(1,n), rep(2,n), rep(3,n))
rf = as.factor(r)
i = rep(1:n, 3)
formula = y ~ f(i, model="ar1", replicate=r, constr=TRUE) + rf -1
data = data.frame(y, i, r, rf)

## parameters for the binomial and the poisson
Ntrial = rep(NA, 3*n)
Ntrial[1:n] = Nt
E = rep(NA, 3*n)
E[1:n + n] = Ep

result = inla(formula, family = c("binomial", "poisson", "normal"),
              data = data, Ntrial = Ntrial, E = E,
              control.family = list(
                      list(),
                      list(),
                      list(initial=0)))

gives

Error in eval(expr, envir, enclos) : object 'X0.975quant' not found

Vignettes

Going to want fairly extensive vignettes:

  • General INLA usage

  • Spatial INLA usage

  • inlaSDM analysis

Forward selection is not working

  # Try and make a dataset where a variable WILL get added.
  Epil2 <- Epil
  Epil2$Base <- Epil$y + rnorm(nrow(Epil), sd = 0.01)
  Epil2$Age <- Epil$y + rnorm(nrow(Epil), sd = 0.01)
  
  stack2 <- inla.stack(data = list(y = Epil2$y),
                      A = list(1),
                      effects = list(data.frame(Intercept = 1, Epil2[3:5])))
  
  result1 <- INLAstep(fam1 = "poisson", 
                      Epil2,
                      in_stack = stack2,
                      invariant = "0 + Intercept",
                      direction = 'forwads',
                      include = 3:5,
                      y = 'y',
                      y2 = 'y',
                      powerl = 1,
                      inter = 1,
                      thresh = 0.001)

Does the sloo readme make sense?

Hi @pyt215 I'm just tidying up the readme and reknitting it.

https://github.com/timcdlucas/INLAutils/tree/dev

I think the code that was in there wasn't working. So I took code from the examples. But now I'm a bit confused because it looks like it has binomial response data, fitted with a normal likelihood in the first case and with a gamma in the sloo line?

I'll try and work it out, or maybe you could let me know what it should be.

Surg example causes autoplot error

data(Surg)
formula = r ~ f(hospital,model="iid",param=c(0.001,0.001))
mod.surg = inla(formula,data=Surg,family="binomial",Ntrials=n)

autoplot(mod.surg)

breaks

Random effects sorted

Might be useful to sort the random effects in autoplot. This will easily show the range of the random effects.

Switch ggfortify for cowplot

autoplot should

print with grid_arrange or something.
return the plain plot list.

Documentation should show

p <- autoplot()
p[1] <- p[1] + theme_grey()
plot_grid(p)

Does Ntrials break autoplot?

  formula <- positive ~ x1 + x2
  
  
  xx = data.frame(x1 = rnorm(length(pointcases$examined)), x2 = rnorm(length(pointcases$examined)))
  inla_point_stack <- inla.stack(tag = 'est', 
                                 data = list(positive = pointcases$examined,
                                             examined = pointcases$examined),
                                 A = list(1), 
                                 effects = list(xx))
                                 
  
  inla.model <- inla(formula, 
                     family = 'binomial',
                     data = inla.stack.data(inla_point_stack),
                     Ntrials = examined
                     )

Much improved README

Include at least:

  • autoplot.inla.mesh
  • plot_residuals
  • makeGAM
  • inlaSDM
  • stepINLA
  • ggplot_raster_shapefile

INLA ridge

Might just be a function to create the formula needed.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.