ucd-ipo / agroft

Agricultural Field Trial Statistics Package

Home Page: http://ucd-ipo.github.io/agroft/

License: BSD 3-Clause "New" or "Revised" License

R 100.00%

agroft's People

Contributors

msimmond rkingdc moorepants

agroft's Issues

Add note to split plot output saying why Shapiro-Wilk is not present

See: #35 (comment)

Determine what operating systems we need to support.

Do we need to support operating systems other than Windows?

Remove the square root option for transformation

It's redundant since the power transformation can do this. Also sq root is more common for count data which we aren't allowing for the dependent variable.
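A quick check of the redundancy claim, as a minimal sketch: `power_transform` is a hypothetical helper name, not a function from the app, and the values are made up.

```r
# Hypothetical helper standing in for the app's power transformation.
power_transform <- function(x, p) x^p

y <- c(1.2, 4.0, 9.5, 16.0)  # made-up positive responses

# A power of 0.5 reproduces the square root transformation.
same <- isTRUE(all.equal(power_transform(y, 0.5), sqrt(y)))
print(same)
```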

In Post-hoc tests tab, change the title to 'Post-hoc tests and figures'

plot_effects.R needs modularization

This file is extremely difficult to understand. It is a ~800 line function with god knows how many nested if statements. It'll need some major cleanup if any more plotting functionality needs to be added.

shinyBS > 0.5 and readxl are now on CRAN

The packages shinyBS and readxl are on CRAN now, so the install_github part of the initialize_AIP() function for those packages can be changed to install.packages, and the section of code that loads readxl and uses read_excel can be uncommented. Users should now be able to upload Excel files!
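A rough sketch of what the CRAN-based install step could look like. `install_if_missing` is a hypothetical helper, not the actual body of initialize_AIP(), and the Excel path in the comment is a placeholder.

```r
# Hypothetical helper: install from CRAN only the packages that are not
# already available, and return the names that were missing.
install_if_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  not_installed <- pkgs[!installed]
  if (length(not_installed) > 0) {
    install.packages(not_installed)
  }
  invisible(not_installed)
}

# Intended use (not run here because it needs network access):
# install_if_missing(c("shinyBS", "readxl"))
# my.data <- readxl::read_excel("path/to/upload.xlsx")
```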

Let users set units on plots

See: #34 (comment)

Post hoc Analysis

Add functions for multiple mean comparisons that separate treatment means and label them with the corresponding grouping letters, using three methods: Fisher’s protected least significant difference (LSD), Fisher’s unprotected LSD, and Tukey's procedure. Add an option to choose the significance level.
Add a pop-up reminder that mean separations should only be done if that factor (treatment) is significant in the ANOVA table (i.e. p-value < the chosen significance level), unless the user is doing Fisher’s unprotected LSD.
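For reference, a base-R sketch of the three comparisons on the built-in PlantGrowth data (an assumption; the app would use the user's data). It does not produce grouping letters, which would need an extra package such as agricolae. Unadjusted pairwise t-tests with a pooled SD are used here as the LSD comparisons.

```r
alpha <- 0.05  # user-chosen significance level

fit <- aov(weight ~ group, data = PlantGrowth)
p_trt <- summary(fit)[[1]][["Pr(>F)"]][1]  # overall treatment p-value

# Unprotected LSD: unadjusted pairwise t-tests using the pooled SD.
lsd <- with(PlantGrowth,
            pairwise.t.test(weight, group, p.adjust.method = "none"))

# Protected LSD: same comparisons, but only reported when the ANOVA
# F-test for the factor is significant at the chosen level.
if (p_trt < alpha) {
  print(lsd$p.value)
} else {
  message("Treatment not significant; mean separation is not advised.")
}

# Tukey's HSD at the matching confidence level.
tukey <- TukeyHSD(fit, "group", conf.level = 1 - alpha)
print(tukey)
```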

examples of testing assumptions (normality by shapiro-wilk, HOV by Levene's) for split-plot designs

@lmpincus Help! I can't find this!

Add functions to test ANOVA assumptions

Check for normality and homogeneity of variance using Shapiro Wilk and Levene’s HOV tests, respectively.

Shapiro-Wilk
Levene's
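A minimal base-R sketch of both checks for a one-factor CRD, using the built-in PlantGrowth data as a stand-in. Levene's test is computed by hand as an ANOVA on absolute deviations from the group medians (the median-centered variant), so no extra package is assumed.

```r
d <- PlantGrowth
fit <- aov(weight ~ group, data = d)

# Normality of residuals: Shapiro-Wilk.
sw <- shapiro.test(residuals(fit))

# Homogeneity of variance: Levene's test, done as a one-way ANOVA on the
# absolute deviations of each observation from its group median.
d$abs_dev <- abs(d$weight - ave(d$weight, d$group, FUN = median))
levene <- anova(lm(abs_dev ~ group, data = d))
levene_p <- levene[["Pr(>F)"]][1]

print(sw)
print(levene)
```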

Downloading report before analyses is performed raises error and hangs the app.

The button should be disabled until there is sufficient info.


processing file: report.Rmd
  |.......                                                          |  11%
   inline R code fragments

  |..............                                                   |  22%
label: unnamed-chunk-1 (with options) 
List of 4
 $ echo   : logi FALSE
 $ warning: logi FALSE
 $ error  : logi FALSE
 $ message: logi FALSE

  |......................                                           |  33%
   inline R code fragments

  |.............................                                    |  44%
label: unnamed-chunk-2 (with options) 
List of 3
 $ echo   : logi FALSE
 $ warning: logi FALSE
 $ message: logi FALSE

  |....................................                             |  56%
  ordinary text without R code

  |...........................................                      |  67%
label: unnamed-chunk-3 (with options) 
List of 4
 $ echo   : logi FALSE
 $ results: chr "asis"
 $ warning: logi FALSE
 $ message: logi FALSE

  |...................................................              |  78%
  ordinary text without R code

  |..........................................................       |  89%
label: unnamed-chunk-4 (with options) 
List of 3
 $ echo   : logi FALSE
 $ message: logi FALSE
 $ warning: logi FALSE

  |.................................................................| 100%
   inline R code fragments


output file: report.md

Warning in file.rename(out, file) :
  cannot rename file '_analysis_report_2015-07-21.html' to '/tmp/RtmpXJMAG9/file31004bc0aff0.html', reason 'Invalid cross-device link'
Error opening file: 2
Error reading: 9
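The 'Invalid cross-device link' warning above comes from file.rename() being asked to move a file between two filesystems. One possible workaround, sketched with a hypothetical helper `move_file` (not a function in the app), is to fall back to copy-and-delete. This does not address disabling the button, which would have to be handled in the Shiny UI.

```r
# Hypothetical helper: try a rename first, and if that fails (for example
# across filesystems) copy the file and remove the original.
move_file <- function(from, to) {
  ok <- suppressWarnings(file.rename(from, to))
  if (!ok) {
    ok <- file.copy(from, to, overwrite = TRUE)
    if (ok) unlink(from)
  }
  invisible(ok)
}
```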

Add subtitle to model fit summary and move the ANOVA table

In Analysis tab, under 'Model Fit Summary', add subtitle, 'ANOVA table'.
Also move this output to below assumption tests.

Switch to interaction.plot, a builtin R function

Right now we are using intxplot from the HH library, which requires some C libs to install. We can avoid that library altogether by using the builtin interaction.plot.
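A minimal example of the builtin, using the warpbreaks data that ships with R as a stand-in for user data (two factors and a numeric response). The labels are arbitrary.

```r
# Cell means, for reference; interaction.plot() computes the same
# summary internally through its `fun` argument.
cell_means <- with(warpbreaks, tapply(breaks, list(tension, wool), mean))

with(warpbreaks,
     interaction.plot(x.factor = tension,
                      trace.factor = wool,
                      response = breaks,
                      fun = mean,
                      xlab = "Tension",
                      ylab = "Mean breaks",
                      trace.label = "Wool"))
```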

edit to Tukey 1df test

instead of:
summary(one.df.model)

use:
anova(one.df.model)

Nuke plot_effects.R

It is GPL code and may not be needed. See issue #22 for more details.

It is loaded in global.R, but doesn't seem to be used.

I'd like to change the license from GPL to something that isn't copyleft

BSD, MIT, or Apache are preferable.

Unless there is a specific reason for needing GPL features, i.e. forcing downstream users to use the same license, then we should remove that restriction and choose a more liberal license that only requires essentially "citation".

Include resources in the application

From @msimmond:

This is what I was asking about at the last meeting. Jason, do you think there's any way to link the presentation materials I'm creating in PowerPoint to the app in GitHub? I'd like people to be able to use the ppt materials, but with a citation.

Sounds like we need some way to host general materials like a PowerPoint and have them accessible from the application.

Edit title 'Tukey's Test' to 'Tukey's Test for Nonadditivity'

In Analysis output

Add function to "detransform" data

This could be an option at the end of the LSD step. The function produces LSD table with detransformed means. If this is selected, the post-hoc plots use the detransformed means.
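A sketch of what such a function might look like for the log and power transformations. `detransform` and its arguments are hypothetical, and the log is assumed to be the natural log. Note that a back-transformed mean of logged data is a geometric mean, not the arithmetic mean of the original data.

```r
# Hypothetical helper: invert a transformation applied to the response.
#   type  = "log"   assumes the natural log was used
#   type  = "power" assumes y^power was used, with power != 0
detransform <- function(x, type = c("log", "power"), power = 1) {
  type <- match.arg(type)
  switch(type,
         log = exp(x),
         power = x^(1 / power))
}

# Example: means computed on the transformed scale, then back-transformed.
log_means <- c(ctrl = log(5.0), trt1 = log(4.6))
print(detransform(log_means, "log"))
```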

Change the formula debug message into a proper display of the formula.

"error: subscript out of bounds"

It still seems to work, but the error pops up.

The power calculation can potentially fail, resulting in NA/NaN/Inf in the dependent variable.

To reproduce use the corn data with a linear regression:

observation.pow ~ rx
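A possible guard, sketched with a hypothetical helper `safe_power_transform`. In R, a negative value raised to a fractional power gives NaN and zero raised to a negative power gives Inf; whether either is what happens with the corn data is not established here.

```r
# Hypothetical helper: apply the power transformation and stop with a
# readable message if any result is NA, NaN, or infinite.
safe_power_transform <- function(y, p) {
  out <- suppressWarnings(y^p)
  bad <- !is.finite(out)
  if (any(bad)) {
    stop(sprintf("Power %s gives %d non-finite value(s); choose another transformation.",
                 format(p), sum(bad)))
  }
  out
}

print(safe_power_transform(c(1, 4, 9), 0.5))
```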

Plots

The user should be able to produce standard bar charts, histograms, and scatterplots that include confidence intervals if relevant.
The titles, x and y axes, and colors of the plots should be editable via the GUI.
The interaction plot should be available if model includes an interaction.
For the linear regression model, the user should be able to generate residuals vs. fitted graph to test assumptions.
Bar chart comparing the means in ANOVA with standard error for each variable?
interaction.plot (Maegan will send me an example one)
Look at residual plots in ANOVA: residuals on the y axis and predicted values on the x axis (to evaluate whether they are meeting the assumptions of their tests).
Scatter plot for single variable with R^2 and trend line.
Plots need to be moved into the analyses steps instead of at the end in step 5.
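Two of the requested plots sketched in base graphics, using the built-in cars data as a stand-in: a scatter plot with trend line and R^2 in the title, and a residuals vs. fitted plot. Titles and labels are arbitrary.

```r
fit <- lm(dist ~ speed, data = cars)
r2 <- summary(fit)$r.squared

# Scatter plot with fitted trend line and R^2.
plot(dist ~ speed, data = cars,
     xlab = "Speed", ylab = "Stopping distance",
     main = sprintf("Linear fit, R^2 = %.2f", r2))
abline(fit)

# Residuals (y axis) against predicted values (x axis).
plot(fitted(fit), residuals(fit),
     xlab = "Predicted values", ylab = "Residuals",
     main = "Residuals vs. fitted")
abline(h = 0, lty = 2)
```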

Add Randomized Complete Block Design Options

Must be able to analyze CRD (completely randomized design) and RCBD. There should be options for 1 and 2 treatments, and their interaction:
- Y ~ trtA
- Y ~ trtA + trtB + trtA*trtB
- Y ~ trtA + Block
- Y ~ trtA + trtB + trtA*trtB + Block
Add a reminder that pops up anytime there’s a significant treatment x treatment interaction, that main effects can not be evaluated.
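A sketch of the two-treatment RCBD case and the interaction reminder, using the built-in npk data (N and P as the two treatments, block as the block). The message text and the 0.05 level are placeholders.

```r
alpha <- 0.05

# Two treatments, their interaction, and a block term.
fit <- aov(yield ~ N + P + N:P + block, data = npk)
tab <- anova(fit)
print(tab)

# Reminder shown when the treatment x treatment interaction is significant.
p_int <- tab["N:P", "Pr(>F)"]
if (p_int < alpha) {
  message("Significant N x P interaction: main effects cannot be evaluated.")
}
```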

initialize_AIP() gives warnings and packages are not available

Warning messages:
1: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘knitr’
2: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘gridExtra’
3: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘agricolae’
4: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘effects’
5: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘lsmeans’
6: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘shinyAce’
7: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘shinyBS’
> AIP()

Listening on http://127.0.0.1:4289
Loading required package: estimability
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : 
  namespace ‘lattice’ 0.20-24 is already loaded, but >= 0.20.27 is required
Error : package or namespace load failed for ‘lsmeans’

plots for checking assumptions

Two things:

Can you please update the parameters for the plot(model...) so that there is less white space between subplots? I found this to work:
par(mfrow = c(2, 1), oma = c(0, 0, 2, 0),
mar = c(4,4,2,2))

-Can you please change the 2 plots produced to: (1) res x pred (already there) and (2) histogram or kernel density plot (I think we were calling it) - instead of QQ
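Put together, the two requests might look like this. It is a sketch on the built-in PlantGrowth data, with the par() settings from above and a histogram with a kernel density overlay in place of the QQ plot.

```r
fit <- aov(weight ~ group, data = PlantGrowth)
res <- residuals(fit)

# Tighter margins between the two stacked subplots.
op <- par(mfrow = c(2, 1), oma = c(0, 0, 2, 0), mar = c(4, 4, 2, 2))

# (1) Residuals vs. predicted values.
plot(fitted(fit), res, xlab = "Predicted", ylab = "Residuals")
abline(h = 0, lty = 2)

# (2) Histogram of residuals with a kernel density estimate.
hist(res, freq = FALSE, main = "", xlab = "Residuals")
lines(density(res))

mtext("Assumption checks", outer = TRUE)
par(op)  # restore previous graphics settings
```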

Anytime data is loaded all the other input variables should be cleared.

shinyjs can do this but it requires R > 3.1
Here is a proposal by the shinyjs author to add reset functionality to shiny: rstudio/shiny#870

Introduce Mixed Effects Option for RCBD and Split Plot

The user should be able to select the location as a random effect and the mixed effect model results computed. Should include a pop-up box explaining why location is being analyzed as a random effect.
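One way this could be fitted, as a sketch only: it uses the nlme package (shipped with standard R installations) and the built-in npk data, with block standing in for location since npk has no location column. Which package the app should use is not decided here.

```r
library(nlme)

# Fixed effects for the treatments, random intercept for the grouping
# factor (block here, as a stand-in for location).
mixed_fit <- lme(yield ~ N * P, random = ~ 1 | block, data = npk)

print(anova(mixed_fit))    # tests of the fixed effects
print(VarCorr(mixed_fit))  # variance components
```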

edit to tukey test #2

Can you remove all rows in the Tukey's ANOVA table except for 'sq.pred'? If not, add subtitle to 'Tukey's Test for Nonadditivity': 'Attention: refer only to 'sq.pred' in this table - ignore all other rows'
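Dropping the other rows looks feasible by indexing the anova table by row name. A sketch on synthetic RCBD data (one observation per treatment x block cell; the numbers are random and only for illustration):

```r
set.seed(1)
d <- expand.grid(trt = factor(1:4), block = factor(1:3))
d$y <- rnorm(nrow(d), mean = 10)  # synthetic response

# Additive model, then the squared predicted values as an extra 1-df term.
additive <- lm(y ~ trt + block, data = d)
d$sq.pred <- fitted(additive)^2
one.df.model <- lm(y ~ trt + block + sq.pred, data = d)

# Keep only the 'sq.pred' row of the ANOVA table.
tukey_row <- anova(one.df.model)["sq.pred", ]
print(tukey_row)
```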

Reloading the browser kills the app

To reproduce, use runApp('inst/app') to start the server and then refresh the browser window. The server disconnects, R unblocks, and there are no error messages.

shiny app with some ideas that might be good to integrate

I found a shiny app with a similar idea as this one.

https://vnijs.shinyapps.io/base/?SSUID=2e1f1c7227d5dc9bbff1b3a311982f0c

Check it out. Website here:

http://vnijs.github.io/radiant/

edit to tukey test

Change the squared predicted variable name to something consistent for any scenario, such as 'sq.pred'.

Have a button that downloads an R script with the code.

The report is useful, but an R script with all the code needed to reproduce the GUI analyses might be even better. This is not in the deliverables of the contract.

Variance-weighted ANOVA example

Following up on #13 here's an example of a variance-weighted ANOVA that can be performed if transformations don't resolve the problems with assumption tests for the scenario 'CRD with 1 treatment':

Instead of assuming equal variance with: model <- aov(y ~ group, data = my.data)
Do variance-weighted ANOVA that does not assume equal variances. There could be option below transformation drop-down to do that, but don't allow transformation + variance-weight ANOVA.
oneway.test(y ~ group, data = my.data)
(Source: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/oneway.test.html)
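A runnable version of the comparison on the built-in PlantGrowth data (a stand-in for my.data). With var.equal = TRUE, oneway.test gives the same F-test as aov; the default is the variance-weighted (Welch) test.

```r
# Variance-weighted ANOVA: does not assume equal variances.
welch <- oneway.test(weight ~ group, data = PlantGrowth)

# Same function with the equal-variance assumption switched on.
classic <- oneway.test(weight ~ group, data = PlantGrowth, var.equal = TRUE)

# Ordinary one-way ANOVA for comparison.
aov_p <- summary(aov(weight ~ group, data = PlantGrowth))[[1]][["Pr(>F)"]][1]

print(welch)
print(c(classic = as.numeric(classic$p.value), aov = aov_p))
```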

We need a decent library name

So Lauren said we should call this the "Agricultural Field Trial Statistics Package". That is a mouthful so we need something simpler for the library name. We could make an acronym: "AFTSP" or "AFTS" but I'm not really fond of that. Ian named it "AIP", which I don't even know what it stands for.

What do you all think about "Chaval"? It is the Urdu word for "rice".

So "Chaval: An Agricultural Field Trial Statistics Package" and you would install and load it with:

install.packages('chaval')
library('chaval')
chaval()

find out correct procedure if significant interaction in CRD and RCBD

Currently, we are doing the following: if there is a significant interaction between 2 treatment factors (A and B) in a CRD or RCBD fixed effects model (e.g. y ~ A + B + A:B + Blk), we perform an LSD on the combination of all levels of both factors (e.g., in a 3 x 3 factorial, we would do an LSD on 9 treatments), as opposed to first subsetting the data for each level of A and re-running the ANOVA on B (y ~ B + Blk #for each level of A) and doing LSD test if significant effect of B, and vice versa for A. I am not sure if both approaches are valid (i.e. looking at all combinations versus subsetting). In the latter approach we'd have to retest all ANOVA assumptions for each individual ANOVA. YUCK
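To make the two options concrete, here is a sketch of both on the built-in warpbreaks data, a CRD with wool as A and tension as B. It shows the mechanics only and does not settle which approach is valid. For an RCBD the block term would have to be added to each model, and the pooled-SD shortcut in the first approach would no longer match the model residual.

```r
# Approach 1: treat every A x B combination as one treatment and compare
# all cell means with unadjusted (LSD-style) pairwise t-tests.
cell <- interaction(warpbreaks$wool, warpbreaks$tension)
lsd_all <- pairwise.t.test(warpbreaks$breaks, cell, p.adjust.method = "none")
print(lsd_all$p.value)

# Approach 2: subset by each level of A and re-run the ANOVA on B.
by_wool <- lapply(split(warpbreaks, warpbreaks$wool), function(d) {
  anova(lm(breaks ~ tension, data = d))
})
print(by_wool)
```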

Ensure app can run from a flash drive

This is what Ian used:

http://oddhypothesis.blogspot.com/2014/04/deploying-self-contained-r-apps-to.html

make factor names visible for each Levene's test

Still need to update for transformed data scenarios: (1) below the LSD output for transformed data, add an LSD table for the original data with its grouping letters replaced by those from the transformed-data LSD table; (2) replace the bar plot with (1).

Add Split Plot Methods

There should be a dropdown function under ANOVA for split plot methods. R function available in the agricolae package (split plot only at this time).

Add calculation of confidence intervals whenever relevant

Users should be able to download a report in PDF format.

The app should be able to print a PDF report of the functions performed.

Add reminder that the graphs can be downloaded by right clicking the image.

Add a popup over the graph images that remind people that they can save the images by right clicking.

Add functions to transform datasets

Transforms: log, square root, power. Add a pop-up that suggests/explains different types of transformations and when they are used.

Inconsistent results in split plot designs

From the console:

> library('agricolae')
> data('plots')
> spcrd.fit <- aov(yield ~ A + B + A:B + Error(A:block), plots)
> summary(spcrd.fit)

Error: A:block
          Df Sum Sq Mean Sq F value Pr(>F)
A          1 0.1071  0.1071   0.143   0.77
Residuals  1 0.7500  0.7500               

Error: Within
          Df Sum Sq Mean Sq F value   Pr(>F)    
A          1   4.20    4.20   1.172   0.3045    
B          2  29.78   14.89   4.155   0.0486 *  
A:B        2 300.44  150.22  41.922 1.37e-05 ***
Residuals 10  35.83    3.58                     

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> sprcbd.fit <- aov(yield ~ A + B + A:B + block + Error(A:block), plots)
> summary(sprcbd.fit)

Error: A:block
      Df Sum Sq Mean Sq
A      1 0.1071  0.1071
block  1 0.7500  0.7500

Error: Within
          Df Sum Sq Mean Sq F value   Pr(>F)    
A          1   4.20    4.20   1.172   0.3045    
B          2  29.78   14.89   4.155   0.0486 *  
A:B        2 300.44  150.22  41.922 1.37e-05 ***
Residuals 10  35.83    3.58                     

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

And results from the app for spcrd:

And for sprcbd:

Error: A:block
          Df Sum Sq Mean Sq F value Pr(>F)
A          1  0.222  0.2222   0.108  0.774
block      2  2.111  1.0556   0.514  0.661
Residuals  2  4.111  2.0556               

Error: Within
          Df Sum Sq Mean Sq F value   Pr(>F)    
B          2  29.78   14.89   3.458 0.082744 .  
A:B        2 300.44  150.22  34.890 0.000112 ***
Residuals  8  34.44    4.31                     

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
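One possible explanation, not confirmed in this issue: the console output shows block with 1 Df, which is what R reports when block is stored as a number rather than a factor, while the app output shows block with 2 Df. The sketch below uses synthetic data with the same layout (2 levels of A, 3 of B, 3 blocks); converting the columns to factors reproduces the app's degrees of freedom. R warns that the Error() model is singular for this formula, so the warning is suppressed here.

```r
set.seed(42)
# Synthetic stand-in for the 'plots' data: numeric codes, random yields.
plots_like <- expand.grid(B = 1:3, A = 1:2, block = 1:3)
plots_like$yield <- rnorm(nrow(plots_like), mean = 20, sd = 2)

# Convert the design columns to factors before fitting.
as_factors <- transform(plots_like,
                        A = factor(A), B = factor(B), block = factor(block))

sp_fit <- suppressWarnings(
  aov(yield ~ A + B + A:B + block + Error(A:block), data = as_factors)
)
print(summary(sp_fit))

# Residual degrees of freedom in the Within stratum.
within_df <- sp_fit[["Within"]]$df.residual
print(within_df)
```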

Add instructions to the README to pull and install the latest app from GitHub using devtools

Add USAID branding

Add USAID branding and project logo: could just have a link to a website or a window pop-up under “about” for app.

Dealing with datasets that have missing values

We'll have to figure out how to do this, but for now as a reminder...
Excerpt from "http://www.unh.edu/halelab/BIOL933/lectures/lect_16_reading.pdf":
'In summary, a major problem in the analysis of unbalanced data is the contamination of means,
and thus the differences among means, by effects of other factors. The solution to these
problems is to replace missing data by their least square estimates and to remove the
contaminating effects of other factors through a proper adjustment of means. With R, all this
means is that you should use the Type II SS (Anova()) and the lsmeans() function.'

Cleanup analyses types

Remove t-test as the same thing can be accomplished with one-way ANOVA on a factor with two levels.
We need a selection box for experiment design that comes before analyses type. This should have a selection for CRD/RCBD/None and a checkbox to signal whether this is a split plot design if CRD or RCBD is selected or not. The subsequent steps in the analysis side panel will be adjusted (or automatically selected) based on this selection. The variable choices for split plot should reflect the main plot and split variable. The split plot may use aov or lm depending on what type of design it is (see the example in #15).
The analyses type should be ANOVA or linear model or selected automatically based on the split plot selections.
Remove RCBD from analysis, it is an experiment design type.
Maybe only do single variate linear model.

See the design in the wiki: https://github.com/ucd-ipo/aip-analysis/wiki

Create Github org to host this app

I'd like to move this application to a Github organization so it isn't under Kyle or my username. @msimmond, what would be a good org name? Is this under your lab? Or is it under a project?

Add disclaimer

Add disclaimer (e.g. to best of our knowledge this is correct and we cannot be held accountable for any conclusions or data generated using this)