
Modelling Cognitive Processes

ggdmc is an R package for modelling cognitive processes. Its focus is on fitting challenging hierarchical Bayesian models via Markov chain Monte Carlo, but it can also fit cognitive models with conventional methods, such as maximum likelihood estimation and least squares. The package uses population-based Markov chain Monte Carlo (pMCMC) as its sampling method.

Getting Started

This example demonstrates the Wiener diffusion model. For other models, see my tutorials site. The R function names in ggdmc aim to tell the user what each function does; for example, BuildModel builds a model object.

As users of Bayesian tools are often reminded, it is always good practice to check the result of a model fit. Note that the order of parameters in a parameter vector (i.e., p.vector) must follow the order in the p.vector reported by BuildModel. Some built-in checks try to safeguard this, but they are far from bulletproof.
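Because the model object reports the expected order (and GetPNames retrieves it in versions before 0.2.7.5), a simple safeguard is to reindex the vector by the reported names. This is only a sketch; it assumes a model built by BuildModel and a fully named p.vector:

```r
## Sketch: reorder a named parameter vector to match the order
## the model expects (GetPNames is available in, e.g., 0.2.6.0).
pnames <- GetPNames(model)
stopifnot(all(sort(pnames) == sort(names(p.vector))))
p.vector <- p.vector[pnames]   ## reindex by name
```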

Fit a fixed-effect model to a participant

## Set up model ----
## Fixing sv & sz to 0 to set up a Wiener diffusion model
require(ggdmc)
model <- BuildModel(
  p.map     = list(a = "1", v="1", z="1", d="1", sz="1", sv="1", t0="1", 
                   st0="1"),
  match.map = list(M = list(s1 = "r1", s2 = "r2")),
  factors   = list(S = c("s1", "s2")),
  responses = c("r1","r2"),
  constants = c(st0 = 0, d = 0, sv = 0, sz = 0),  
  type      = "rd")   

npar <- model@npar   ## Note this works for version > 0.2.7.5; 
## npar <- length(GetPNames(model))   ## Use GetPNames instead in 0.2.6.0

p.vector <- c(a=1, v=1.5, z=0.5, t0=.15)
dat <- simulate(model, nsim = 50, ps = p.vector)
dmi <- BuildDMI(dat, model)

p.prior <- BuildPrior(
  dists = rep("tnorm", npar),
  p1=c(a=1, v=0, z=1, t0=1),
  p2=c(a=1, v=2, z=1, t0=1),
  lower = c(0, -5, rep(0, 2)),
  upper = rep(NA, npar))

## Fit model -------------
fit0 <- StartNewsamples(dmi, p.prior)
fit  <- run(fit0)

## Check model -----------
plot(fit)
plot(fit, den = TRUE)
plot(fit, pll = FALSE)
plot(fit, pll = FALSE, den = TRUE)

isconv <- gelman(fit)
est    <- summary(fit, recovery = TRUE, ps = p.vector, verbose = TRUE)

Fit fixed-effect and hierarchical models to multiple participants

require(ggdmc)
model <- BuildModel(
  p.map     = list(a = "1", v ="1", z ="1", d ="1", sz ="1", sv ="1", t0 ="1", 
                   st0 ="1"),
  match.map = list(M = list(s1 = "r1", s2 = "r2")),
  factors   = list(S = c("s1", "s2")),
  responses = c("r1","r2"),
  constants = c(st0 = 0, d = 0, sv = 0, sz = 0),
  type      = "rd")

npar <- model@npar
pop.mean  <- c(a = 2,   v = 4,  z = 0.5, t0 = 0.3)
pop.scale <- c(a = 0.5, v = .5, z = 0.1, t0 = 0.05)
pop.prior <- BuildPrior(
    dists = rep("tnorm", npar),
    p1    = pop.mean,
    p2    = pop.scale,
    lower = c(0,-5,  0, 0),
    upper = c(5, 7,  1, 1))

## Simulate some data
dat <- simulate(model, nsub = 50, nsim = 30, prior = pop.prior)
dmi <- BuildDMI(dat, model)
ps <- attr(dat, "parameters")

p.prior <- BuildPrior(
    dists = rep("tnorm", npar),
    p1    = pop.mean,
    p2    = pop.scale*5,
    lower = c(0,-5, 0, 0),
    upper = c(5, 7, 1, 1))

plot(p.prior, ps = ps)  ## Check if all true values are in the range 

## Sampling separately
fit0 <- StartNewsamples(dmi, p.prior, ncore = 4)
fit  <- run(fit0, 5e2, ncore = 4)
fit  <- run(fit, 1e2, add = TRUE, ncore = 4)  ## add additional 100 samples

## Check model -----
isconv <- gelman(fit, verbose = TRUE)
plot(fit)
est0 <- summary(fit, recovery = TRUE, ps = ps, verbose = TRUE)

## Sampling hierarchically
mu.prior <- BuildPrior(
    dists = rep("tnorm", npar),
    p1    = pop.mean,
    p2    = pop.scale*5,
    lower = c(0,-5,  0, 0),
    upper = c(5, 7,  1, 1))

sigma.prior <- BuildPrior(
    dists = rep("beta", npar),
    p1    = c(a=1, v=1, z=1, t0=1),
    p2    = rep(1, npar),
    upper = rep(1, npar))

## !!!The names are important!!!
priors <- list(pprior = p.prior, location = mu.prior, scale = sigma.prior)
names(priors)
## [1] "pprior"   "location" "scale"

## Fit hierarchical model ----
fit0 <- StartNewsamples(dmi, priors)
fit  <- run(fit0, 5e2)

p0 <- plot(fit, hyper = TRUE)
p0 <- plot(fit, hyper = TRUE, den = TRUE, pll=FALSE)

## Check model -----------
## hgelman function is deprecated 
res  <- gelman(fit, verbose = TRUE)
est0 <- summary(fit, recovery = TRUE, ps = ps, verbose = TRUE)
est1 <- summary(fit, hyper = TRUE, recovery = TRUE, ps = pop.mean,  type = 1, verbose = TRUE)
est2 <- summary(fit, hyper = TRUE, recovery = TRUE, ps = pop.scale, type = 2, verbose = TRUE)

Response time models

  1. The LBA model, type = "norm",
  2. The DDM, type = "rd",
  3. The Wiener diffusion model, type = "rd", with sv and sz fixed to 0

PDA-based models

  1. The Piecewise LBA model 0; CPU-based PDA likelihoods; type = "plba0",
  2. The Piecewise LBA model 1; CPU-based PDA likelihoods; type = "plba1",
  3. The Piecewise LBA model 0; GPU-based PDA likelihoods; type = "plba0_gpu",
  4. The Piecewise LBA model 1; GPU-based PDA likelihoods; type = "plba1_gpu",
  5. The LBA model; GPU-based PDA likelihoods; type = "norm_pda_gpu",
  6. The leaky, competing accumulator model (Experimental!).

The PDA-based models (items 1 to 5 above) are separated from the latest version of the package; for these, see my BRM paper and the associated packages there (OSF project). The leaky, competing accumulator model (item 6) is in a separate module, which has yet to be incorporated. See the LCA tutorial for its testing results using MLE.

For details regarding the PLBA types, please see Holmes, Trueblood, and Heathcote (2016).

Experimental (untested) models

  1. 2-D/circular drift-diffusion model, type = "cddm"
  2. Prospective memory model, type = "norm" (see tutorial for more details)
  3. Time-varying changes in other free parameters

Further information

One aim in designing ggdmc was to be able to read objects from DMC, with which it shares many similarities. There are, however, some differences. For example, in the latest version of ggdmc, the dimensions of the theta and phi arrays are 'npar x nchain x nmc', whereas DMC uses 'nchain x npar x nmc'. We changed this to accommodate the Armadillo convention and to reduce the computation time spent manipulating matrices and arrays. Similarly, the dimensions of the 'log_likelihoods' and 'summed_log_prior' matrices are 'nchain x nmc'; DMC uses 'nmc x nchain'. Remember to transpose them if you want to move objects back and forth. Currently, we use the R functions 'aperm' and 't' to transpose arrays and matrices when operating between DMC and ggdmc. The following two convenience functions perform this operation.

DMC2ggdmc <- function(x) {
  ## x is an object of posterior samples from individual subject fit
  x$theta <- aperm(x$theta, c(2, 1, 3))
  x$summed_log_prior <- t(x$summed_log_prior)
  x$log_likelihoods <- t(x$log_likelihoods)
  class(x) <- c('list', 'model')
  return(x)
}
ggdmc2DMC <- function(x) {
  ## Change $ to @ when operating on output from ggdmc's run function,
  ## because ggdmc now uses S4 classes
  x$theta <- aperm(x$theta, c(2, 1, 3))
  x$summed_log_prior <- t(x$summed_log_prior)
  x$log_likelihoods <- t(x$log_likelihoods)
  return(x)
}

Note that Dstats.dmc in DMC is also affected by the difference in array and matrix dimensions, because Dstats.dmc calculates the means of the theta/phi array across the second margin:

apply(samples$theta,2,mean)
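The effect of the layout difference can be shown with a toy base-R array (no ggdmc or DMC objects involved): the same per-parameter means come from margin 2 in the DMC layout but margin 1 in the ggdmc layout.

```r
## Toy arrays only; not ggdmc/DMC objects.
set.seed(1)
nchain <- 3; npar <- 4; nmc <- 10
theta_dmc <- array(rnorm(nchain * npar * nmc),
                   dim = c(nchain, npar, nmc))   ## DMC: nchain x npar x nmc

theta_ggdmc <- aperm(theta_dmc, c(2, 1, 3))      ## ggdmc: npar x nchain x nmc

## Parameter means live on different margins in the two layouts
m_dmc   <- apply(theta_dmc,   2, mean)
m_ggdmc <- apply(theta_ggdmc, 1, mean)
stopifnot(isTRUE(all.equal(m_dmc, m_ggdmc)))
```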

ggdmc provides a DIC function, which uses a back-end function, deviance_model, to perform the same operation.

The 3-accumulator LBA model tutorial illustrates an example of this back-and-forth operation.

Note that ggdmc uses S4 classes after version 0.2.7.5, so use the "@" operator to extract object components (i.e., slots).
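A minimal base-R illustration of the difference (the Posterior class here is made up for demonstration; it is not a ggdmc class):

```r
library(methods)  ## setClass/new live in the methods package

## Hypothetical S4 class for demonstration only (not a ggdmc class)
setClass("Posterior", representation(npar = "numeric"))
x <- new("Posterior", npar = 4)

x@npar            ## S4 objects use "@" to access slots
slot(x, "npar")   ## equivalent functional form
## x$npar would fail: "$" is for lists and S3 objects, not S4 slots
```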

Prerequisites

  • R (>= 3.3.0)
  • R packages: Rcpp (>= 0.12.10), RcppArmadillo (>= 0.7.700.3.0), ggplot2 (>= 2.1.0), coda (>= 0.16-1), matrixStats, data.table
  • Windows users need Rtools (>= 3.3.0.1959)
  • Mac OS users need to make Clang understand the OpenMP flag.
  • Linux/Unix users may need to install the Open MPI library if it has not been installed.
  • Armadillo may need a recent g++ compiler > 4.6

Installation

ggdmc uses S4 classes after version 0.2.7.5. The new design enables a more user-friendly interface in the help pages.

From CRAN (0.2.6.0):

install.packages("ggdmc")

From source:

install.packages("ggdmc_0.2.8.0.tar.gz", repos = NULL, type="source")

From GitHub (need devtools) (0.2.8.0):

devtools::install_github("yxlin/ggdmc")

For Microsoft R users:

As of 06-01-2020, because Microsoft R uses R version 3.5.3, users who wish to deploy ggdmc on Microsoft R may encounter two challenges. First, RcppArmadillo on MRAN lags behind the version on CRAN and has yet to include recent Armadillo functions, for instance randperm. This can be resolved by installing RcppArmadillo directly from its source tarball, downloaded from CRAN. Second, the default installation process on Windows looks for a package binary matching the machine's R version, so Microsoft R may look for a ggdmc build matching R 3.5.3 and fail to find one. This can likewise be resolved by installing from the source tarball.

For Mac Users:

1. Install gfortran. As of 27 Aug 2018, the gfortran version has to be 6.1, even on macOS High Sierra 10.13.4; gfortran 6.3 may not work.

2. Install clang4-r. James Balamuta has created a convenient tool, clang4-r. Once you install it, your clang will understand the OpenMP flag in ggdmc. The aim is simply to make macOS understand the OpenMP flag, so you may use other methods if you prefer not to install clang4-r; it is just the most straightforward one we have found so far. However, we have yet to look into the source code of clang4-r, so use it at your own risk.

The configure script now disables OpenMP, so macOS users should be able to install without encountering the OpenMP problem.

FAQ:

  1. When the error message arises, "/usr/bin/ld: cannot find -lgsl" and/or "/usr/bin/ld: cannot find -lgslcblas", installing libgsl-dev may resolve this problem (Ubuntu).
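On Debian/Ubuntu systems, the missing GSL development files named in that error message can be installed with:

```shell
sudo apt-get install libgsl-dev
```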

Citation

Lin, Y.-S. and Strickland, L. (2020). Evidence accumulation models with R: A practical guide to hierarchical Bayesian methods. The Quantitative Methods for Psychology.

Contributors

The R documentation, tutorials, C++ codes, parallel computations, new genetic algorithm, R helper functions and R packaging are developed by Yi-Shin Lin. A substantial part of R codes for handling experimental designs are adapted from the DMC, developed by Andrew Heathcote (Heathcote et al., 2018). You could find different and more interesting cognitive models in DMC.

Please report bugs to me or open an issue here.

Correction

The help page for the likelihood function in Density.cpp stated that it returns log-likelihoods (v2.8.0). An inspection of the source code found that it returns likelihoods, not log-likelihoods (13-06-2022; v2.8.1). Thanks to Nachshon Meiran for pointing it out.
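In practice this means applying log yourself before summing, as the MLE objective functions in the tutorials do. A hedged sketch, assuming a named p.vector and a dmi object built by BuildDMI:

```r
## likelihood() returns likelihoods (densities), not log-likelihoods
den <- likelihood(p.vector, dmi)   ## one density per trial
sll <- sum(log(den))               ## summed log-likelihood
```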

License

GPL-2

Acknowledgments

  • The PDF, CDF and random number generation of the DDM were derived from Voss & Voss's fast-dm 30.2 and rtdists 0.9-0.
  • Truncated normal functions were originally based on Jonathan Olmsted's RcppTN 0.1-8 at https://github.com/olmjo/RcppTN, Christopher Jackson's R codes in msm package, and Robert's paper (1995, Statistics & Computing).
  • Thanks to Matthew Gretton's consultation regarding the rtdists package.
  • Thanks to Andrew Heathcote for lending me his MacBook Air. ggdmc works on OS X (macOS High Sierra Version 10.13.4)
  • The PDF and random number generation of the 2-D diffusion/circular diffusion model is based on Smith (2016).

Reference

  • Heathcote, A., Lin, Y.-S., Reynolds, A., Strickland. L. Gretton, M., & Matzke, D. (2018). Dynamic models of choice, Behavior Research Methods. https://doi.org/10.3758/s13428-018-1067-y
  • Lin, Y.-S. and Strickland, L. (2020). Evidence accumulation models with R: A practical guide to hierarchical Bayesian methods. The Quantitative Methods for Psychology.
  • Smith, P. (2016). Diffusion Theory of Decision Making in Continuous Report, Psychological Review, 123(4), 425-451. http://dx.doi.org/10.1037/rev0000023


ggdmc's Issues

configure cannot handle GCC correctly

Configure script does not work correctly: it fails to identify gcc as gcc and cannot understand that gcc supports OpenMP.

* installing *source* package ‘ggdmc’ ...
** using staged installation
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether /opt/local/bin/g++-mp-12 -std=gnu++17 accepts -g... yes
checking how to run the C++ preprocessor... /opt/local/bin/g++-mp-12 -std=gnu++17 -E
checking whether we are using the GNU C++ compiler... (cached) yes
checking whether /opt/local/bin/g++-mp-12 -std=gnu++17 accepts -g... (cached) yes
checking whether g++ version is sufficient... almost
configure: WARNING: Compiler self-identifies as being compliant with GNUC extensions but is not g++.
checking for macOS... found
checking for macOS Apple compiler... not found
checking for clang compiler... not found
configure: WARNING: unsupported macOS build detected; if anything breaks, you keep the pieces.
checking LAPACK_LIBS... R-supplied partial LAPACK found
configure: WARNING: Some complex-valued LAPACK functions may not be available
configure: creating ./config.status
config.status: creating inst/include/RcppArmadilloConfigGenerated.h
config.status: creating src/Makevars
** libs
using C++ compiler: ‘g++-mp-12 (MacPorts gcc12 12.3.0_0) 12.3.0’
using C++11

trying to get slot "dmi" from an object (class "model") that is not an S4 object

Hi, Yi-Shin,

Thanks for sharing your package.

I encountered an error when I tried to reproduce the code in the tutorial: https://yxlin.github.io/fixed-effect-model/one_participant/

Everything worked until the posterior predictive check:

> pp  <- predict_one(fit, xlim = c(0, 5))
Error in predict_one(fit, xlim = c(0, 5)) : 
  trying to get slot "dmi" from an object (class "model") that is not an S4 object 

The same problem occurred when I tried to reproduce the code in the README file:

## Set up model ----
## Fixing sv & sz to 0 sets up a Wiener diffusion model
require(ggdmc)
model <- BuildModel(
  p.map     = list(a = "1", v="1", z="1", d="1", sz="1", sv="1", t0="1", 
                   st0="1"),
  match.map = list(M = list(s1 = "r1", s2 = "r2")),
  factors   = list(S = c("s1", "s2")),
  responses = c("r1","r2"),
  constants = c(st0 = 0, d = 0, sv = 0, sz = 0),  
  type      = "rd")   


> npar <- model@npar
Error: trying to get slot "npar" from an object (class "model") that is not an S4 object 

Here is my system information:

> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_HK.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_HK.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_HK.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_HK.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.12.8 dplyr_1.0.0       ggdmc_0.2.6.0 

Trying to calculate DICs from data of 8 participants

Hi, Yi-Shin,

Thanks for sharing your package, it has been very useful and I'm learning a lot with it!

I got an error when I tried to reproduce the code of this page: https://rdrr.io/github/yxlin/ggdmc/man/DIC-methods.html

The DIC function runs when used with one participant, but when I tried to run it with the whole fit object, it did not work. This is the error:

Error in apply(object$theta, 2, mean) :
dim(X) must have a positive length

This is my session info:

> sessionInfo()
R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Spain.utf8  LC_CTYPE=Spanish_Spain.utf8   
[3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C                  
[5] LC_TIME=Spanish_Spain.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
 [1] ez_4.4-0          DescTools_0.99.48 ggdmc_0.2.6.0    
 [4] ggpubr_0.5.0      rstatix_0.7.1     lubridate_1.9.0  
 [7] timechange_0.1.1  forcats_1.0.0     stringr_1.5.0    
[10] dplyr_1.1.1       purrr_1.0.1       readr_2.1.4      
[13] tidyr_1.3.0       tibble_3.2.1      ggplot2_3.4.2    
[16] tidyverse_2.0.0  

Thanks for any help you can give me.

LBA MLE Example

Hi,

First, thanks for releasing this package. I had a bit of trouble reproducing the example for MLE of the LBA model presented here: https://yxlin.github.io/basics/mle/. This is using version 0.2.6 in CRAN on Windows. I tried also using the latest from GitHub but this introduced some runtime errors.

For version 0.2.6, I had to make a slight modification to the code presented, converting the likelihood_norm call to simply likelihood. I am not sure if this is correct, but I assumed so, since likelihood_norm does not seem to exist in version 0.2.6.

I have included my script and output below. As you can see, the parameters are not recovered correctly. Any insights into what I might be doing wrong? Am I using the incorrect likelihood?

Thanks,
Dave

library(ggdmc)
model <- BuildModel(
  p.map     = list(A = "1", B = "1", t0 = "1", mean_v = "M", sd_v = "1",
                   st0 = "1"),
  match.map = list(M = list(s1 = 1, s2 = 2)),
  factors   = list(S = c("s1", "s2")),
  constants = c(st0 = 0, sd_v = 1),
  responses = c("r1", "r2"),
  type      = "norm")

p.vector <- c(A = .75, B = 1.25, t0 = .15, mean_v.true = 2.5, mean_v.false = 1.5)
ntrial <- 1e3

## use the seed option to make sure I always replicate the result
## remove it, if you want to see the stochastic process.
dat <- simulate(model, nsim = ntrial, ps = p.vector, seed = 123)
dmi <- BuildDMI(dat, model)

## This is to create a column in the data frame to indicate
## correct and error responses.
dmi$C <- ifelse(dmi$S == "s1" & dmi$R == "r1", TRUE,
                ifelse(dmi$S == "s2" & dmi$R == "r2", TRUE,
                       ifelse(dmi$S == "s1" & dmi$R == "r2" ,FALSE,
                              ifelse(dmi$S == "s2" & dmi$R == "r1", FALSE, NA))))

prop.table(table(dmi$C))


objective_fun <- function(par, data) {
  den <- likelihood(par, data)
  return(-sum(log(den)))
}


p.prior <- BuildPrior(
  dists = c("tnorm", "tnorm", "beta", "tnorm", "tnorm"),
  p1    = c(A = 1, B = 1, t0 = 1, mean_v.true = 1, mean_v.false = 1),
  p2    = c(1,  1,  1, 1, 1),
  lower = c(rep(0, 3),  rep(NA, 2)),  
  upper = c(rep(NA, 2), 1, rep(NA, 2)))

init_par <- rprior(p.prior)
names(init_par) <- c("A", "B", "t0", "mean_v.true", "mean_v.false")
res <- nlminb(objective_fun, start = init_par, data = dmi, lower = 0)
round(res$par, 2)  ## remember to check res$convergence

#         A            B           t0  mean_v.true mean_v.false 
#        0.00         2.00         0.07         2.73         1.83 

installing error in install_github

When I tried to install ggdmc, I got following error message.

devtools::install_github("yxlin/ggdmc")
Downloading GitHub repo yxlin/ggdmc@master
Error in FUN(X[[i]], ...) :
Invalid comparison operator in dependency: >=

Installing from source is working. Thank you for a great R package.

Help with BuildModel, 4 accumulator LBA

Hi,

Sorry if this question is a bit basic, but I was wondering if you could help me understand the BuildModel command. I am trying to set up an LBA model for an experiment involving a 4-choice task. I have 4 stimuli defined by shape and color (blue_diamond, blue_heart, green_diamond, and green_heart). In addition, a task cue tells the subject whether to attend to the shape or the color of the stimulus, and the subject is then asked to give the corresponding response (blue, green, heart, diamond).

Do you think you could provide me with some insight into how to express such a model to ggdmc via the BuildModel command? Is it possible? I tried to follow along with this 3-accumulator example but got a bit lost (https://yxlin.github.io/cognitive-model/lba3/). Also, incidentally, there seems to be an error in that script: the predict_one function has mismatched brackets.

Thanks for the package and any help you can give me.
