Code Monkey home page Code Monkey logo

powerlmm's Introduction

powerlmm

Travis-CI Build Status CRAN_Status_Badge

Power Analysis for Longitudinal Multilevel/Linear Mixed-Effects Models.

Overview

The purpose of powerlmm is to help design longitudinal treatment studies (parallel groups), with or without higher-level clustering (e.g. longitudinally clustered by therapists, groups, or physician), and missing data. The main features of the package are:

  • Longitudinal two- and three-level (nested) linear mixed-effects models, and partially nested designs.
  • Random slopes at the subject- and cluster-level.
  • Missing data.
  • Unbalanced designs (both unequal cluster sizes, and treatment groups).
  • Design effect, and estimated type I error when the third-level is ignored.
  • Fast analytical power calculations for all designs.
  • Power for small samples sizes using Satterthwaite’s degrees of freedom approximation.
  • Explore bias, Type I errors, model misspecification, and LRT model selection using convenient simulation methods.

Installation

powerlmm can be installed from CRAN and GitHub.

# CRAN, version 0.4.0
install.packages("powerlmm")

# GitHub, dev version
devtools::install_github("rpsychologist/powerlmm")

Example usage

This is an example of setting up a three-level longitudinal model with random slopes at both the subject- and cluster-level, with different missing data patterns per treatment arm. Relative standardized inputs are used, but unstandardized raw parameters values can also be used.

library(powerlmm)
d <- per_treatment(control = dropout_weibull(0.3, 2),
               treatment = dropout_weibull(0.2, 2))
p <- study_parameters(n1 = 11,
                      n2 = 10,
                      n3 = 5,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      dropout = d,
                      effect_size = cohend(-0.8, 
                                           standardizer = "pretest_SD"))

p
#> 
#>      Study setup (three-level) 
#> 
#>               n1 = 11
#>               n2 = 10 x 5 (treatment)
#>                    10 x 5 (control)
#>               n3 = 5      (treatment)
#>                    5      (control)
#>                    10     (total)
#>          total_n = 50     (treatment)
#>                    50     (control)
#>                    100    (total)
#>          dropout =  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10 (time)
#>                     0,  0,  1,  3,  6,  9, 12, 16, 20, 25, 30 (%, control)
#>                     0,  0,  1,  2,  4,  5,  8, 10, 13, 17, 20 (%, treatment)
#> icc_pre_subjects = 0.5
#> icc_pre_clusters = 0
#>        icc_slope = 0.05
#>        var_ratio = 0.02
#>      effect_size = -0.8 (Cohen's d [SD: pretest_SD])
plot(p)

get_power(p, df = "satterthwaite")
#> 
#>      Power Analyis for Longitudinal Linear Mixed-Effects Models (three-level)
#>                   with missing data and unbalanced designs 
#> 
#>               n1 = 11
#>               n2 = 10 x 5 (treatment)
#>                    10 x 5 (control)
#>               n3 = 5      (treatment)
#>                    5      (control)
#>                    10     (total)
#>          total_n = 50     (control)
#>                    50     (treatment)
#>                    100    (total)
#>          dropout =  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10 (time)
#>                     0,  0,  1,  3,  6,  9, 12, 16, 20, 25, 30 (%, control)
#>                     0,  0,  1,  2,  4,  5,  8, 10, 13, 17, 20 (%, treatment)
#> icc_pre_subjects = 0.5
#> icc_pre_clusters = 0
#>        icc_slope = 0.05
#>        var_ratio = 0.02
#>      effect_size = -0.8 (Cohen's d [SD: pretest_SD])
#>               df = 7.971459
#>            alpha = 0.05
#>            power = 68%

Unequal cluster sizes

Unequal cluster sizes is also supported, the cluster sizes can either be random (sampled), or the marginal distribution can be specified.

p <- study_parameters(n1 = 11,
                      n2 = unequal_clusters(2, 3, 5, 20),
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      effect_size = cohend(-0.8, 
                                           standardizer = "pretest_SD"))

get_power(p)
#> 
#>      Power Analyis for Longitudinal Linear Mixed-Effects Models (three-level)
#>                   with missing data and unbalanced designs 
#> 
#>               n1 = 11
#>               n2 = 2, 3, 5, 20 (treatment)
#>                    2, 3, 5, 20 (control)
#>               n3 = 4           (treatment)
#>                    4           (control)
#>                    8           (total)
#>          total_n = 30          (control)
#>                    30          (treatment)
#>                    60          (total)
#>          dropout = No missing data
#> icc_pre_subjects = 0.5
#> icc_pre_clusters = 0
#>        icc_slope = 0.05
#>        var_ratio = 0.02
#>      effect_size = -0.8 (Cohen's d [SD: pretest_SD])
#>               df = 6
#>            alpha = 0.05
#>            power = 44%

Cluster sizes follow a Poisson distribution

p <- study_parameters(n1 = 11,
                      n2 = unequal_clusters(func = rpois(5, 5)), # sample from Poisson
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      effect_size = cohend(-0.8, 
                                           standardizer = "pretest_SD"))

get_power(p, R = 100, progress = FALSE) # expected power by averaging over R realizations
#> 
#>      Power Analyis for Longitudinal Linear Mixed-Effects Models (three-level)
#>                   with missing data and unbalanced designs 
#> 
#>               n1 = 11
#>               n2 = rpois(5, 5) (treatment)
#>                    -           (control)
#>               n3 = 5           (treatment)
#>                    5           (control)
#>                    10          (total)
#>          total_n = 24.96       (control)
#>                    24.96       (treatment)
#>                    49.92       (total)
#>          dropout = No missing data
#> icc_pre_subjects = 0.5
#> icc_pre_clusters = 0
#>        icc_slope = 0.05
#>        var_ratio = 0.02
#>      effect_size = -0.8 (Cohen's d [SD: pretest_SD])
#>               df = 8
#>            alpha = 0.05
#>            power = 48% (MCSE: 1%)
#> 
#> NOTE: n2 is randomly sampled. Values are the mean from R = 100 realizations.

Convenience functions

Several convenience functions are also included, e.g. for creating power curves.

x <- get_power_table(p, 
                     n2 = 5:10, 
                     n3 = c(4, 8, 12), 
                     effect_size = cohend(c(0.5, 0.8), standardizer = "pretest_SD"))
plot(x)

Simulation

The package includes a flexible simulation method that makes it easy to investigate the performance of different models. As an example, let’s compare the power difference between the 2-level LMM with 11 repeated measures, to doing an ANCOVA at posttest. Using sim_formula different models can be fit to the same data set during the simulation.

p <- study_parameters(n1 = 11,
                      n2 = 40, 
                      icc_pre_subject = 0.5,
                      cor_subject = -0.4,
                      var_ratio = 0.02,
                      effect_size = cohend(-0.8, 
                                           standardizer = "pretest_SD"))

# 2-lvl LMM
f0 <- sim_formula("y ~ time + time:treatment + (1 + time | subject)")

# ANCOVA, formulas with no random effects are with using lm()
f1 <- sim_formula("y ~ treatment + pretest", 
                  data_transform = transform_to_posttest, 
                  test = "treatment")

f <- sim_formula_compare("LMM" = f0, 
                         "ANCOVA" = f1)

res <- simulate(p, 
                nsim = 2000, 
                formula = f, 
                cores = parallel::detectCores(logical = FALSE))

We then summarize the results using summary. Let’s look specifically at the treatment effects.

summary(res, para = list("LMM" = "time:treatment",
                         "ANCOVA" = "treatment"))
#> Model:  summary 
#> 
#> Fixed effects: 'time:treatment', 'treatment'
#> 
#>   model M_est theta M_se SD_est Power Power_bw Power_satt
#>     LMM  -1.1  -1.1 0.32   0.32  0.93     0.93          .
#>  ANCOVA -11.0   0.0 3.70   3.70  0.85     0.84       0.84
#> ---
#> Number of simulations: 2000  | alpha:  0.05
#> Time points (n1):  11
#> Subjects per cluster (n2 x n3):  40 (treatment)
#>                                  40 (control)
#> Total number of subjects:  80 
#> ---
#> At least one of the models applied a data transformation during simulation,
#> summaries that depend on the true parameter values will no longer be correct,
#> see 'help(summary.plcp_sim)'

We can also look at a specific model, here’s the results for the 2-lvl LMM.

summary(res, model = "LMM")
#> Model:  LMM 
#> 
#> Random effects 
#> 
#>          parameter  M_est theta est_rel_bias prop_zero is_NA
#>  subject_intercept  99.00 100.0     -0.00560         0     0
#>      subject_slope   2.00   2.0     -0.00310         0     0
#>              error 100.00 100.0      0.00086         0     0
#>        cor_subject  -0.39  -0.4     -0.03200         0     0
#> 
#> Fixed effects 
#> 
#>       parameter   M_est theta M_se SD_est Power Power_bw Power_satt
#>     (Intercept)  0.0048   0.0 1.30   1.30 0.050        .          .
#>            time  0.0011   0.0 0.25   0.25 0.054        .          .
#>  time:treatment -1.1000  -1.1 0.32   0.32 0.930     0.93          .
#> ---
#> Number of simulations: 2000  | alpha:  0.05
#> Time points (n1):  11
#> Subjects per cluster (n2 x n3):  40 (treatment)
#>                                  40 (control)
#> Total number of subjects:  80 
#> ---
#> At least one of the models applied a data transformation during simulation,
#> summaries that depend on the true parameter values will no longer be correct,
#> see 'help(summary.plcp_sim)'

Launch interactive web application

The package’s basic functionality is also implemented in a Shiny web application, aimed at users that are less familiar with R. Launch the application by typing

library(powerlmm)
shiny_powerlmm()

powerlmm's People

Contributors

rpsychologist avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

powerlmm's Issues

References

Dear Mr Magnusson,

First of all congratulations for this package. It seems that you put a lot of effort on that!

I have some questions about it. I saw the references that you provided, but unfortunately i do not have access to the books. I read the articles and i realized that you use slightly different calculations than what i can find elswhere( i.e. "Effects of study Duration, Frequency of observation, and sample size on Power in Studies of group differences in Polynomial Change"). Especially i observed a difference in the standardized effect and in the variance ratio, which i saw being calculated by using the total variance in the denominator and not only the within variance. And by comparing the results from this package with the PASS software i observed a slight difference in the estimated sample sizes. Can you please provide me with some articles or anything that i can read and understand your calculations ?

Thank you very much for your time.

Kind regards,
John Zavrakidis

Simulate results

Dear Kristoffer,

I am very glad to see that you updated the powerlmm package and that you are trying to improve it! Keep up the good work!

To my problem now, I am working on longitudinal models. More precisely, I have 2 level structure with subjects randomized in 2 treatments and they are followed over time. Treatment is the only covariate in the data along with Time and their interaction, where I am most interested in.

I try to simulate some data. When I use the following input, the time and the treatment estimates seem to mess up with each other. In the input I understand that the Time estimate(slope in the control group) is given through the parameter “fixed_slope”, but looking at the results further below, I see that the theta for Time is 0 while the input that I specified(fixed_slope = 0.2) is given to the Treatment theta! And in addition, I see that the mean estimate of the time is 0.20( as it should be)!! Is this a bug or have I misunderstood something here?

P.S. What is the differnece between Power & Power_bw ?

Input

      p1 <- study_parameters(n1 = 10, # number of measurements per subject
                             n2 = 10, # number of subjects per treatment
                             fixed_intercept = 2.4,
                             fixed_slope = 0.20,
                             sigma_subject_intercept = 0.61,
                             sigma_subject_slope = 0.03,
                             cor_subject = -0.30,
                             sigma_error = 0.78,
                             effect_size = -0.8)

Sim

      ttt <- simulate(p1, nsim = 5000)

Output

 Model:  correct 
Random effects

     parameter     M_est   theta est_rel_bias prop_zero  is_NA
subject_intercept  0.3925  0.3721        0.055    0.0044 0.0000
 subject_slope     0.0018  0.0009        1.038    0.0014 0.0000
         error     0.6019  0.6084       -0.011    0.0000 0.0000
   cor_subject    -0.2174 -0.3000       -0.275    0.4922 0.0044

Fixed effects

  parameter       M_est  theta  M_se  SD_est  Power Power_bw Power_satt
(Intercept)       2.4032  2.400  0.242  0.240   1.00       NA        NaN
  treatment      -0.0076  0.200  0.342  0.339   0.06       NA        NaN
      time        0.2000  0.000  0.030  0.029   1.00       NA        NaN
 time:treatment  -0.0884 -0.089  0.042  0.040   0.54      0.49        NaN
---
 Number of simulations: 5000  | alpha:  0.05
 Time points (n1):  10
 Subjects per cluster (n2 x n3):  10 (treatment)
                                     10 (control)

Thanks,
John

simulate: satterthwaite dfs give error when time:treatment is omitted

The following model will fail:

p <- study_parameters(n1 = 5, 
                     n2 = 5, 
                     icc_pre_subject = 0.5)

f <- "y ~ 1 + time + (1 | subject)"

res <- simulate(object = p,
        nsim = 10,
        formula = f,
        satterthwaite = TRUE)

Even though it is a custom model, it is probably a good idea to support this type of custom use.

Would need to update:

  • simulate would need an argument for which terms Satterthwaite dfs should be calculated
  • more stuff?

Before this is supported it would be better to give a more informative error message.

Allow crossed designs

Cluster can be crossed with treatments, e.g. therapists that deliver two interventions. These models are currently not supported by powerlmm.

image

FEATURE REQUEST: R code from shiny app

Hi,
Would it be very hard to implement a button in the shiny app, so that one can download the R code generated by the app?
Thank you very much,
Andrea

Feature request: Generate multilevel data with no clusters

Hi, would it be possible to include options for generating non longitudinal multilevel datasets (without cluster groups etc).

What I'm trying to simulate is a 2 level data set with a level 2 predictor (subject level) predicting a level 1 outcome (observation level).
I like the possibility to control all the parameters (like ICCs or ESs) and could use it in the context of purely continuous variables at different levels of measurement.

Or maybe it's already possible, but I can't figure out how?

All the best!

Shiny App: 2-level design gives NaN% Power

Dear Kristoffer,

I tried out the shiny app right now, but when I select a 2-level design a power of NaN% is shown (with default values, but also when changing the values).

The 3-level design works.

R shows the following warnings:
Warning in qt(1 - alpha/2, df = df) : NaNs wurden erzeugt
Warning in qt(alpha/2, df = df) : NaNs wurden erzeugt

I'm using powerlmm version 0.4.0.

Best,
Caroline

Availability for R version 3.6.1

Any chance to get an Update for this package so it is available for R Version 3.6.1?

This is what I get when trying to install the package
Warning in install.packages :
package ‘powerlmm’ is not available (for R version 3.6.1 Patched)

Thanks for any help

Simulating continous predictors at given level

Hello,
is there any way of simulating a dataset with (for example) 2 cont. predictors measured at different levels of the hierarchy?

What I'm trying to do here is computing simulated power for a continuous fixed effect given c(varying ICC, varying subject number, varying observation per subject).

I'd really appreciate some help with this

All the best!

An issue with powerlmm - error message Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, : invalid value 0 for 'digits' argument

Hello,
I am new to the Powerlmm package and haven't used R for a while. I have tried to update all relevant packages. But when I tested the Powerlmm package with the example provided online, it kept giving me this error message
Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, :
invalid value 0 for 'digits' argument

Here was what I tested:

library(powerlmm)

p <- study_parameters(n1 = 11,

  •                   n2 = 20,
    
  •                   icc_pre_subject = 0.5,
    
  •                   var_ratio = 0)
    

get_power(p)

Any feedback or guidance would be much appreciated!

simulate: Bug when a parameter is NA, but the term is still included in model formula

The following code will give an error.

library(powerlmm)
p <- study_parameters(n1 = 6,
                      n2 = 5,
                      n3 = 5,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = NA,
                      var_ratio = 0.02,
                      icc_slope = 0.05,
                      cohend = 0
                      )

res <- simulate(p, nsim = 5, formula = "y ~ time * treatment + (1 | subject) + (1 | cluster)")
summary(res)

summary will fail, code will work if icc_pre_cluster = 0 is used instead.

Use a study setup to calculate power for t-test/ANCOVA

It would be useful to be able to use a study setup to calculate power for an ANCOVA or t-test, i.e. we use the assumed DGP but fit an ANCOVA/t-test. For instance, power for an ANCOVA should be:

p <- study_parameters(
    design = study_design(),
    n1 = 2,
    n2 = 10,
    T_end = 1,
    icc_pre_subject = c(0.7),
    var_ratio = NA,
    sigma_error = 10,
    cor_subject = 0,
    effect_size = cohend(0.5)
)

n <- p$n2
alpha <- 0.05
delta <- p$effect_size[[1]]$get()$ES
# -1 df added because the covariate
df <- n * 2 - 2 - 1
r <- get_ICC_pre_subjects(p)
sd <- sqrt(1 * (1 - r^2))
lambda <- sqrt(n / 2) * delta / sd

power <- pt(qt(1 - alpha / 2, df = df), df = df, ncp = lambda, lower.tail = FALSE) +
    pt(qt(alpha / 2, df = df), df = df, ncp = lambda)

Allow comparison of more than two models when simulating

Currently simulate(object, formula = formula) only allows at most two different formulas, in a named list with the names "correct" or "wrong". It would be better to allow custom names and any number of formulas. eg

f <- list("3_lvl_correct" = "y ~ treatment * time + (1 + time | subject) + (time | cluster)",
             "3_lvl_no_cor" = "y ~ treatment * time + (1 + time | subject) +  (time || cluster)",
             "2_lvl" = "y ~ treatment * time + (1 + time | subject)"
)

uneven time between observations

Hi, thanks for putting together this great tool.
Is there a way I can input a different t, example: c(0, 1, 3, 9, 12)?
Also, I couldn't figure out how to change the default df.
Thanks!

dev version of installation error (... non-zero exit status)

Dear Dr. Magnusson,

I'm trying to install the dev version of your powerlmm package from github. However, I get the following error. Just was wondering if am I missing something? (multiple friends of mine faced the same issue)

library(remotes)
remotes::install_github("rpsychologist/powerlmm", dependencies = TRUE)

Error: Failed to install 'powerlmm' from GitHub:
  (converted from warning) installation of package ‘C:/Users/AppData/Local/Temp/RtmpKUee0F/file4ed83f5e3806/
powerlmm_0.4.0.9000.tar.gz’ had non-zero exit status

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.