ggsignif's Introduction

ggsignif: Significance Brackets for ‘ggplot2’

Introduction

This package provides an easy way to indicate if two groups are significantly different. Commonly this is shown by a bar on top connecting the groups of interest which itself is annotated with the level of significance (NS, *, **, ***). The package provides a single layer (geom_signif) that takes the groups for comparison and the test (t.test, wilcox etc.) and adds the annotation to the plot.

Citation

If you wish to cite this package in a publication, you can run the following command in your R console:

citation("ggsignif")
#> To cite 'ggsignif' in publications use:
#> 
#>   Ahlmann-Eltze, C., & Patil, I. (2021). ggsignif: R Package for
#>   Displaying Significance Brackets for 'ggplot2'. PsyArxiv.
#>   doi:10.31234/osf.io/7awm6
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Article{,
#>     title = {{ggsignif}: R Package for Displaying Significance Brackets for {'ggplot2'}},
#>     author = {Ahlmann-Eltze Constantin and Indrajeet Patil},
#>     year = {2021},
#>     journal = {PsyArxiv},
#>     url = {https://psyarxiv.com/7awm6},
#>     doi = {10.31234/osf.io/7awm6},
#>   }

Example

You can first install this package from CRAN:

install.packages("ggsignif")

Or get the latest development version:

install.packages("remotes")
remotes::install_github("const-ae/ggsignif")

Plot significance

library(ggplot2)
library(ggsignif)

p1 <- ggplot(mpg, aes(class, hwy)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("compact", "midsize"), c("minivan", "suv")),
    map_signif_level = TRUE, textsize = 6
  ) +
  ylim(NA, 48)
p1

Control the direction (either x or y) via orientation

p2 <- ggplot(
  data = mpg,
  mapping = aes(
    x = hwy,
    y = class
  )
) +
  geom_boxplot(
    orientation = "y"
  ) +
  geom_signif(
    comparisons = list(
      c("compact", "midsize"),
      c("minivan", "suv")
    ),
    map_signif_level = TRUE,
    textsize = 6,
    margin_top = 0.08,
    step_increase = 0.05,
    tip_length = 0.01,
    orientation = "y"
  )
p2

Compatible with coord_flip

p1 + coord_flip()

Setting the precise location

This is important if you use position="dodge", because in that case I cannot calculate the correct position of the bars automatically.

# Calculate annotation
anno <- t.test(
  iris[iris$Petal.Width > 1 & iris$Species == "versicolor", "Sepal.Width"],
  iris[iris$Species == "virginica", "Sepal.Width"]
)$p.value

# Make plot with custom x and y position of the bracket
ggplot(iris, aes(x = Species, y = Sepal.Width, fill = Petal.Width > 1)) +
  geom_boxplot(position = "dodge") +
  geom_signif(
    annotation = formatC(anno, digits = 1),
    y_position = 4.05, xmin = 2.2, xmax = 3,
    tip_length = c(0.2, 0.04)
  )

ggsignif is compatible with faceting (facet_wrap or facet_grid). The significance label is calculated for each facet in which the axis labels listed in comparisons occur. Note that ggsignif fails to calculate the significance if the data is grouped globally (e.g., by setting color, fill, or group in ggplot(aes(...))). Grouping the data per geom is fine (e.g., setting the fill within geom_boxplot(aes(fill = ...))).

ggplot(diamonds, aes(x = cut, y = carat)) +
  geom_boxplot(aes(fill = color)) +
  geom_signif(comparisons = list(
    c("Fair", "Good"),
    c("Very Good", "Ideal")
  )) +
  facet_wrap(~color) +
  ylim(NA, 6.3)

Advanced Example

Sometimes you need fine-grained control over the location of the significance bars in combination with facet_wrap or facet_grid. In those cases you can set the flag manual=TRUE and provide the annotations as a data.frame:

annotation_df <- data.frame(
  color = c("E", "H"),
  start = c("Good", "Fair"),
  end = c("Very Good", "Good"),
  y = c(3.6, 4.7),
  label = c("Comp. 1", "Comp. 2")
)

annotation_df
#>   color start       end   y   label
#> 1     E  Good Very Good 3.6 Comp. 1
#> 2     H  Fair      Good 4.7 Comp. 2

ggplot(diamonds, aes(x = cut, y = carat)) +
  geom_boxplot() +
  geom_signif(
    data = annotation_df,
    aes(xmin = start, xmax = end, annotations = label, y_position = y),
    textsize = 3, vjust = -0.2,
    manual = TRUE
  ) +
  facet_wrap(~color) +
  ylim(NA, 5.3)

You can ignore the warning about the missing aesthetics.

For further details, see: https://const-ae.github.io/ggsignif/articles/intro.html

Maintenance

This package is provided as is, and we currently have neither plans nor the capacity to add new features to it. If there is nonetheless a feature that you would like to see in the package, you are always welcome to submit a pull request, which we will try to address as soon as possible.

Code of Conduct

Please note that the ggsignif project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

ggsignif's People

Contributors

albluca, const-ae, ilia-kats, indrajeetpatil, m-colley, romanhaa, smargell, xiangpin

ggsignif's Issues

Warning is produced when map_signif_level is specified as a numeric vector

Hi,

A warning is generated by if (params$map_signif_level == TRUE) when map_signif_level is specified as a numeric vector.

Reproducible example:

library(ggplot2)
library(ggsignif)
ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot()+
  geom_signif(comparisons = list(c("setosa", "versicolor")),
              map_signif_level = c("****"=0.0001, "***"=0.001, "**"=0.01,  "*"=0.05))

Warning message:

In if (params$map_signif_level == TRUE) { :
The condition has length > 1 and only the first element will be used

The warning is generated because if can only evaluate a logical vector of length 1.

Suggestion:

if (params$map_signif_level [1] == TRUE)

Thanks for your work:-)!
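As a side note on the suggestion above, a length-safe check could be sketched in base R with isTRUE(), which always returns a single TRUE or FALSE. The helper name uses_default_levels is hypothetical and not part of ggsignif; this only illustrates the idea and is not the package's actual fix.

```r
# Hypothetical helper (not part of ggsignif): decide whether the default
# significance levels should be used.
uses_default_levels <- function(map_signif_level) {
  # isTRUE() returns one logical value for any input, so `if` never
  # receives a condition of length > 1.
  isTRUE(map_signif_level)
}

custom_levels <- c("****" = 0.0001, "***" = 0.001, "**" = 0.01, "*" = 0.05)

uses_default_levels(TRUE)           # logical flag
uses_default_levels(FALSE)          # logical flag
uses_default_levels(custom_levels)  # named numeric vector, no warning
```

Compared with indexing the first element, this also avoids treating a numeric vector whose first entry happens to equal 1 as TRUE.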

For loop for anova p-values

Can I use a for loop with your package to make the p-values of aov appear in the boxplot?
I tried this, but I was not able to iterate over the Pr>F values. Currently, the p-values are getting replaced.

Thanks for your time!

Changing confidence level

It is not very clear from the vignette, what you can actually specify in the test.args argument. However, I would like to change the confidence level of the test. Is that possible?
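A minimal sketch of how test.args might be used, assuming it is a named list that is forwarded to the test function. Note that for t.test the conf.level argument only affects the reported confidence interval, so it does not change the p-value that is drawn; an argument such as var.equal does change the test itself.

```r
library(ggplot2)
library(ggsignif)

x <- iris$Sepal.Length[iris$Species == "setosa"]
y <- iris$Sepal.Length[iris$Species == "versicolor"]

# conf.level changes the confidence interval, not the p-value
p_default <- t.test(x, y)$p.value
p_conf99 <- t.test(x, y, conf.level = 0.99)$p.value
same_p <- isTRUE(all.equal(p_default, p_conf99))

# Arguments that alter the test can be handed over as a named list
# (assumption: test.args is passed on to the function named in `test`)
p <- ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("setosa", "versicolor")),
    test = "t.test",
    test.args = list(var.equal = TRUE)
  )
```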

'textsize' does not affect the text size

Hi Constantin,

Thanks for the awesome package! I was agonizing over how to plot dozens of significance comparisons over several figures today, and was ecstatic when I found that you had gone ahead and implemented this!

The issue I'm having is that 'textsize' doesn't seem to be doing anything. It's a minor issue as I can create larger plots and scale them down after, but just a heads up.

Thanks again!
Virginia

Missing p-value with multiple comparisons

Hi there,

First of all, thank you for your package. It's very helpful! I'm using the package to display the results of multiple tests, and some p-values are not plotted.

This is an example with synthetic data:

library(ggplot2)
library(ggsignif)

ggplot(data.frame(y=runif(100), x=sample(c("A", "B", "C", "D"), size = 100, replace = TRUE)), aes(x = x, y=y)) +
geom_boxplot() + geom_signif(comparisons = list(c(1,2), c(2,3), c(1,3), c(1,4), c(2,4)), step_increase = .1)

and this is the result:

[Image: resulting plot]

I tried playing around with step_increase, but it did not help. Any idea on how to fix it? Thanks!

Best wishes,
Luca

Puzzled by annotations that contain duplicate contents

Hi Constantin,
The package you provide is exceedingly useful.
However, I'm having trouble with aes(annotation=). When the annotations contain duplicate values, they are merged and drawn at the midpoint coordinate.
You can try this code which comes from the site https://cran.r-project.org/web/packages/ggsignif/vignettes/intro.html.
After I modified the content of annotation, the strange result emerged.
geom_signif(stat="identity", data=data.frame(x=c(0.875, 1.875), xend=c(1.125, 2.125), y=c(5.8, 8.5), annotation=c("**", "**")), aes(x=x,xend=xend, y=y, yend=y, annotation=annotation))
Waiting for your solution.

working with log2 data

Hi Constantin,

Very nice package, your ggsignif, and released just in time as I got into making such plots yesterday ;-)

While things work fine with test data of one kind (linear or log2), I have problems with the significance bracket however when I combine them.

I am working with gene expression data, and they are commonly displayed as log2 values to avoid the squishing of data points in the low range. The t test for group comparison however is best done on the original values. So that I'd like to draw the plot with the log2 data points while showing the t test results of the linear data.

Here is a synthetic example:

# create the data:
myDF <- data.frame(value=c(runif(15, min=7500, max=12500), runif(15, min=75, max=125)), group=c(rep("A", 15), rep("B", 15)))
myDF$value.log2 <- log(myDF$value, 2)
# work with only the log2 data:
ggplot(myDF, aes(group, value.log2)) + geom_jitter(width=0.02) + geom_signif(comparisons=list(c("A", "B")), test="t.test", y_position=16)
# work with only the linear data:
ggplot(myDF, aes(group, value)) + geom_jitter(width=0.02) + geom_signif(comparisons=list(c("A", "B")), test="t.test", y_position=16)
# combine the data (note the suddenly negative scale):
ggplot(myDF, aes(group, value.log2)) + geom_jitter(width=0.02) + geom_signif(aes(group, value), comparisons=list(c("A", "B")), test="t.test", y_position=16)
# combine the data, limiting the displayed y axis range to where the data and the bracket are (note that the bracket is no longer displayed):
ggplot(myDF, aes(group, value.log2)) + geom_jitter(width=0.02) + geom_signif(aes(group, value), comparisons=list(c("A", "B")), test="t.test", y_position=16) + ylim(1, 17)

Perhaps I am just getting lost in the intricacies of ggplot here, or perhaps this is a case not foreseen by ggsignif. In any case, I'd appreciate your help in finding a solution to this problem.

Best regards,

Anton
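One possible workaround, sketched under the assumption that geom_signif skips its own test when annotations is supplied: compute the t-test on the linear values outside the plot and pass the formatted p-value as the label, so that the layer only ever sees the log2 scale. The seed and y_position value are illustrative choices.

```r
library(ggplot2)
library(ggsignif)

set.seed(1)  # illustrative seed so the sketch is reproducible
myDF <- data.frame(
  value = c(runif(15, min = 7500, max = 12500), runif(15, min = 75, max = 125)),
  group = rep(c("A", "B"), each = 15)
)
myDF$value.log2 <- log2(myDF$value)

# Test on the original (linear) values
p_linear <- t.test(value ~ group, data = myDF)$p.value

# Plot the log2 values and draw the precomputed p-value as the annotation
p <- ggplot(myDF, aes(group, value.log2)) +
  geom_jitter(width = 0.02) +
  geom_signif(
    comparisons = list(c("A", "B")),
    annotations = formatC(p_linear, digits = 2),
    y_position = 14
  )
```

Because the layer no longer maps the linear column to y, the y scale stays on the log2 range.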

Is it possible to compare between facets?

For example, suppose I wanted to compare classes in the mpg data set against themselves based on whether they had manual or automatic transmissions (i.e. compact(auto) vs compact(manual)).

Is this possible?

library(tidyverse)
library(ggsignif)

mpg <- mpg %>% 
  separate(trans, c("type", "variant"), sep="\\(")

ggplot(mpg, aes(class, hwy)) +
  geom_boxplot() +
  facet_grid(.~type) 

Dashed horizontal line

Is there any way to make the horizontal line (and line tips, if tip_length != 0) a different linetype? For example, if I wanted a dashed horizontal line? I kept trying to pass the linetype = 2 to various plot layers (including the base layer), but that didn't work. Other aesthetics like color (changes color of both the line and the annotation) and size (changes size of annotation only) worked just fine. Any suggestions?

I think having the option for dashed lines might be useful, especially for making figures for publications that require black and white images (or those that make you pay for color images). Thanks again!

Tip length

Hi,
How can I get rid of the tip length in aes?
I only want the significance labels (*, **, ***), without the lines.

Thanks!

error when adding geom_signif to a plot

Dear Constantin

Thank you very much for your package ggsignif, I really appreciate it.
I am trying to add the results of a chi-square test to a ggplot, and it was mentioned that I could use ggsignif:

https://stackoverflow.com/questions/51886623/compare-dependent-proportions-in-a-ggplot

Moreover, I found your advice to use geom_signif

#23

However, if I add this to my plot:

geom_signif(data = annotation_df,
              aes(annotations = annotations, xmin = xmin, xmax = xmax, y_position = y_position),
              manual = TRUE)

df <- data.frame(timepoint=rep(0:2, each=10),response=c("A","B","A","A","A","A","A","A","B","B","A","A","A","A","A","A","A","B","B","B","A","B","B","B","B","B","A","B","B","B"),variable=rep(c("var1","var2"),each=5, 3), subject=rep(1:5,6))
df$timepoint <- factor(df$timepoint, level=c(1,0,2), labels=c("method_A","baseline","method_B"))

df %>% add_count(timepoint,variable,response) %>% add_count(timepoint,variable) %>% mutate(freq=n/nn*100) %>% mutate(total=1) -> df

stats <-data.frame(xmax=c(rep(c("baseline","method_B"),2)))
stats %>% mutate(xmin=as.factor(c(rep(c("method_A","baseline"),2)))) %>% 
  mutate(annotations=c("1","0.2","1","0.5")) %>% 
  mutate(y_position=5) %>% 
  mutate(variable=as.factor(c("var1","var1","var2","var2"))) -> annotation_df

ggplot(df,
       aes(x = timepoint, stratum = response, alluvium = subject,
           y = total, 
           fill = response, label = paste(freq,"%") )) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  geom_text(stat = "stratum", size = 3) +
  theme(legend.position = "none") +
  geom_signif(data = annotation_df,
              aes(annotations = annotations, xmin = xmin, xmax = xmax, y_position = y_position),
              manual = TRUE) +
  facet_wrap(~variable) 

I get this error:
Warning: Ignoring unknown aesthetics: annotations, xmin, xmax, y_position
Error in FUN(X[[i]], ...) : object 'response' not found

If I leave out geom_signif(...) everything works.
Thank you for any advice,
Jacob

feature request - annotate with magnitude of difference in mean

Thanks for the great ggplot2 extension.

Is it possible to easily display the magnitude of the difference between two groups?

I would like to be able to see whether two groups are economically as well as statistically significantly different, with an annotation more like "+1.05**".

I know this is possible by passing in a custom data frame with the new values however it requires a fair amount of code. It involves:

  • Compute the mean across different groups of the dataset
  • Pivot the data to apply a pairwise difference calculation across all combinations
  • Unpivot the data back to tidy format to pass into the geom.
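For a single pair of groups, the label described above can be sketched without any pivoting. The helper diff_label is hypothetical (not part of ggsignif), the star thresholds are assumed to follow the usual 0.05 / 0.01 / 0.001 convention, and the sketch assumes that geom_signif draws whatever is passed to annotations.

```r
library(ggplot2)
library(ggsignif)

# Hypothetical helper: signed difference in means followed by stars
diff_label <- function(x, y, test = t.test) {
  delta <- mean(y) - mean(x)
  p <- test(x, y)$p.value
  stars <- if (p < 0.001) "***" else if (p < 0.01) "**" else if (p < 0.05) "*" else "NS"
  paste0(sprintf("%+.2f", delta), stars)
}

x <- iris$Sepal.Length[iris$Species == "setosa"]
y <- iris$Sepal.Length[iris$Species == "versicolor"]
label <- diff_label(x, y)

p <- ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("setosa", "versicolor")),
    annotations = label
  )
```

For many pairs, the same helper could be applied over a list of comparisons with lapply before handing the resulting labels to annotations.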

Can't add p values to stacked bar plot

Hi,

First of all, thank you for developing and maintaining the ggsignif !

I was trying to add significant p values to a faceted stacked bar plot but kept getting an error message saying

Error in check_factor(f) : object 'Rank' not found

Below are the data and code to reproduce my problem:

library(tidyverse) 
#> Warning: package 'tidyverse' was built under R version 3.4.4
#> Warning: package 'ggplot2' was built under R version 3.4.4
#> Warning: package 'tibble' was built under R version 3.4.4
#> Warning: package 'tidyr' was built under R version 3.4.4
#> Warning: package 'readr' was built under R version 3.4.4
#> Warning: package 'purrr' was built under R version 3.4.4
#> Warning: package 'dplyr' was built under R version 3.4.4
#> Warning: package 'stringr' was built under R version 3.4.4
#> Warning: package 'forcats' was built under R version 3.4.4
library(cowplot) 
#> Warning: package 'cowplot' was built under R version 3.4.4
#> 
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:ggplot2':
#> 
#>     ggsave
library(ggsignif) 
#> Warning: package 'ggsignif' was built under R version 3.4.4

# Make a dataframe for plotting stacked bar plot
df <- data.frame(Diet = rep(c("REF", "IM"), each = 8),
                 Variable = c("hpv", "hpv", "hpv", "hpv", "smc", "smc", "lpc", "lpc",
                              "hpv", "hpv", "hpv", "smc", "smc", "smc", "lpc", "lpc"),
                 Rank = c("Mild", "Moderate", "Marked", "Severe", "Normal", "Mild", "Normal", "Mild",
                          "Mild", "Moderate", "Marked", "Normal", "Mild", "Moderate", "Normal", "Mild"),
                 Percent = c(5.56, 38.9, 44.4, 11.1, 38.9, 61.1, 77.8, 22.2, 
                             16.7, 66.7, 16.7, 11.1, 72.2, 16.7, 50, 50)
                 )

# Specify the desired orders of factors and convert "Rank" to an ordered factor
df$Diet <- factor(df$Diet, levels = c("REF", "IM"))
df$Variable <- factor(df$Variable, levels = c("hpv", "smc", "lpc"))
df$Rank <- ordered(df$Rank, levels = c("Normal", "Mild", "Moderate", "Marked", "Severe")) # Rank as ordered factor

# Define color scheme 
my_col = c(Normal = "royalblue2", Mild = "peachpuff1", Moderate = "tan1", Marked = "tomato", Severe = "red3")

# Make stacked barplot 
p <- ggplot(df, aes(Diet, Percent, fill = forcats::fct_rev(Rank))) + # forcats::fct_rev() reverses stacked bars
  geom_bar(stat = "identity") +
  facet_wrap(~ Variable, nrow = 1) +
  scale_fill_manual(values = my_col) + 
  scale_y_continuous(limits = c(0, 105), breaks = 0:5*20, expand = expand_scale(mult = c(0, 0.05))) +
  labs(title = "Stacked bar plot", y = "%") +
  guides(fill = guide_legend(title = "Rank")) + 
  theme_cowplot()
  
# Make a data frame for p value annotation
anno <- data.frame(Variable = "hpv",
                   p = 0.03,
                   start = "REF",
                   end = "IM",
                   y = 102)

# Add p value to the plot
p + geom_signif(data = anno,
                aes(xmin = start, 
                    xmax = end, 
                    annotations = p, 
                    y_position = y),
                textsize = 4, 
                tip_length = 0,
                manual = TRUE)
#> Warning: Ignoring unknown aesthetics: xmin, xmax, annotations, y_position
#> Error in check_factor(f): object 'Rank' not found

                 
# P values can't be added, even when I tried to add the p value manually using geom_text + geom_segment
# P values can be added if the barplots are not stacked by Rank

Created on 2018-11-05 by the reprex package (v0.2.1)

Session info
devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value                         
#>  version  R version 3.4.3 (2017-11-30)  
#>  os       Windows 10 x64                
#>  system   x86_64, mingw32               
#>  ui       RTerm                         
#>  language (EN)                          
#>  collate  Chinese (Simplified)_China.936
#>  ctype    Chinese (Simplified)_China.936
#>  tz       Europe/Berlin                 
#>  date     2018-11-05                    
#> 
#> - Packages --------------------------------------------------------------
#>  package     * version date       lib source        
#>  assertthat    0.2.0   2017-04-11 [1] CRAN (R 3.4.4)
#>  backports     1.1.2   2017-12-13 [1] CRAN (R 3.4.3)
#>  base64enc     0.1-3   2015-07-28 [1] CRAN (R 3.4.1)
#>  bindr         0.1.1   2018-03-13 [1] CRAN (R 3.4.4)
#>  bindrcpp      0.2.2   2018-03-29 [1] CRAN (R 3.4.4)
#>  broom         0.5.0   2018-07-17 [1] CRAN (R 3.4.3)
#>  callr         3.0.0   2018-08-24 [1] CRAN (R 3.4.4)
#>  cellranger    1.1.0   2016-07-27 [1] CRAN (R 3.4.4)
#>  cli           1.0.1   2018-09-25 [1] CRAN (R 3.4.4)
#>  colorspace    1.3-2   2016-12-14 [1] CRAN (R 3.4.4)
#>  cowplot     * 0.9.3   2018-07-15 [1] CRAN (R 3.4.4)
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 3.4.4)
#>  curl          3.2     2018-03-28 [1] CRAN (R 3.4.4)
#>  debugme       1.1.0   2017-10-22 [1] CRAN (R 3.4.4)
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 3.4.4)
#>  devtools      2.0.0   2018-10-19 [1] CRAN (R 3.4.3)
#>  digest        0.6.18  2018-10-10 [1] CRAN (R 3.4.4)
#>  dplyr       * 0.7.7   2018-10-16 [1] CRAN (R 3.4.4)
#>  evaluate      0.12    2018-10-09 [1] CRAN (R 3.4.4)
#>  forcats     * 0.3.0   2018-02-19 [1] CRAN (R 3.4.4)
#>  fs            1.2.6   2018-08-23 [1] CRAN (R 3.4.4)
#>  ggplot2     * 3.1.0   2018-10-25 [1] CRAN (R 3.4.4)
#>  ggsignif    * 0.4.0   2017-08-03 [1] CRAN (R 3.4.4)
#>  glue          1.3.0   2018-07-17 [1] CRAN (R 3.4.4)
#>  gtable        0.2.0   2016-02-26 [1] CRAN (R 3.4.4)
#>  haven         1.1.2   2018-06-27 [1] CRAN (R 3.4.4)
#>  hms           0.4.2   2018-03-10 [1] CRAN (R 3.4.4)
#>  htmltools     0.3.6   2017-04-28 [1] CRAN (R 3.4.4)
#>  httr          1.3.1   2017-08-20 [1] CRAN (R 3.4.4)
#>  jsonlite      1.5     2017-06-01 [1] CRAN (R 3.4.4)
#>  knitr         1.20    2018-02-20 [1] CRAN (R 3.4.4)
#>  lattice       0.20-35 2017-03-25 [2] CRAN (R 3.4.3)
#>  lazyeval      0.2.1   2017-10-29 [1] CRAN (R 3.4.4)
#>  lubridate     1.7.4   2018-04-11 [1] CRAN (R 3.4.4)
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 3.4.4)
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 3.4.4)
#>  mime          0.6     2018-10-05 [1] CRAN (R 3.4.4)
#>  modelr        0.1.2   2018-05-11 [1] CRAN (R 3.4.4)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 3.4.4)
#>  nlme          3.1-137 2018-04-07 [1] CRAN (R 3.4.4)
#>  pillar        1.3.0   2018-07-14 [1] CRAN (R 3.4.4)
#>  pkgbuild      1.0.2   2018-10-16 [1] CRAN (R 3.4.3)
#>  pkgconfig     2.0.2   2018-08-16 [1] CRAN (R 3.4.4)
#>  pkgload       1.0.1   2018-10-11 [1] CRAN (R 3.4.4)
#>  plyr          1.8.4   2016-06-08 [1] CRAN (R 3.4.4)
#>  prettyunits   1.0.2   2015-07-13 [1] CRAN (R 3.4.4)
#>  processx      3.2.0   2018-08-16 [1] CRAN (R 3.4.4)
#>  ps            1.1.0   2018-08-10 [1] CRAN (R 3.4.4)
#>  purrr       * 0.2.5   2018-05-29 [1] CRAN (R 3.4.4)
#>  R6            2.3.0   2018-10-04 [1] CRAN (R 3.4.4)
#>  Rcpp          0.12.19 2018-10-01 [1] CRAN (R 3.4.4)
#>  readr       * 1.1.1   2017-05-16 [1] CRAN (R 3.4.4)
#>  readxl        1.1.0   2018-04-20 [1] CRAN (R 3.4.4)
#>  remotes       2.0.1   2018-10-19 [1] CRAN (R 3.4.3)
#>  rlang         0.2.2   2018-08-16 [1] CRAN (R 3.4.4)
#>  rmarkdown     1.10    2018-06-11 [1] CRAN (R 3.4.4)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.4.4)
#>  rvest         0.3.2   2016-06-17 [1] CRAN (R 3.4.4)
#>  scales        1.0.0   2018-08-09 [1] CRAN (R 3.4.4)
#>  sessioninfo   1.1.0   2018-09-25 [1] CRAN (R 3.4.4)
#>  stringi       1.1.7   2018-03-12 [1] CRAN (R 3.4.4)
#>  stringr     * 1.3.1   2018-05-10 [1] CRAN (R 3.4.4)
#>  testthat      2.0.1   2018-10-13 [1] CRAN (R 3.4.4)
#>  tibble      * 1.4.2   2018-01-22 [1] CRAN (R 3.4.4)
#>  tidyr       * 0.8.1   2018-05-18 [1] CRAN (R 3.4.4)
#>  tidyselect    0.2.5   2018-10-11 [1] CRAN (R 3.4.4)
#>  tidyverse   * 1.2.1   2017-11-14 [1] CRAN (R 3.4.4)
#>  usethis       1.4.0   2018-08-14 [1] CRAN (R 3.4.4)
#>  withr         2.1.2   2018-03-15 [1] CRAN (R 3.4.4)
#>  xml2          1.2.0   2018-01-24 [1] CRAN (R 3.4.4)
#>  yaml          2.2.0   2018-07-25 [1] CRAN (R 3.4.4)
#> 
#> [1] C:/Users/ljt89/Documents/R/win-library/3.4
#> [2] C:/Program Files/R/R-3.4.3/library
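For the stacked bar plot report above, the README's note on grouping suggests one thing to try: map fill inside geom_bar() rather than in the top-level ggplot() call, so that the annotation layer is not asked to evaluate a column (here Rank) that its own data frame lacks. The following is a reduced sketch with made-up percentages and no facets, not a verified fix for the original data.

```r
library(ggplot2)
library(ggsignif)

# Reduced, made-up data in the same shape as the report
df <- data.frame(
  Diet = factor(rep(c("REF", "IM"), each = 2), levels = c("REF", "IM")),
  Rank = rep(c("Mild", "Severe"), times = 2),
  Percent = c(60, 40, 80, 20)
)

anno <- data.frame(start = "REF", end = "IM", y = 102, label = "p = 0.03")

p <- ggplot(df, aes(Diet, Percent)) +
  # fill is mapped per geom, not globally
  geom_bar(aes(fill = Rank), stat = "identity") +
  geom_signif(
    data = anno,
    aes(xmin = start, xmax = end, annotations = label, y_position = y),
    tip_length = 0,
    manual = TRUE
  ) +
  ylim(0, 110)
```

As in the README's advanced example, a warning about unknown aesthetics is expected with manual = TRUE.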

Computation failed in `stat_signif()`

Hello,
I ran into some problems when using this package. Could you help me?
This is my code:

my_comparisons <- list(c("A", "B"), c("B", "C"), c("C", "D"))
ggplot(distance, aes(x=Group,y=Distance, color=Method,shape=Method)) +
  geom_boxplot(fill="cornflowerblue",
               color="black", notch=TRUE) +
  geom_point(position = "jitter", color="blue", alpha=.5) +
  geom_rug(side="1", color="black")+theme_bw()+
  geom_signif(comparisons = my_comparisons,test = "t.test")+
  facet_grid(.~Method)

This is my data structures:
Group Distance Method
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
B 162 Hamming distance
B 151 Hamming distance
B 90 Hamming distance
B 150 Hamming distance
B 131 Hamming distance
B 107 Hamming distance
B 145 Hamming distance
B 87 Hamming distance
B 103 Hamming distance
B 96 Hamming distance
B 114 Hamming distance
B 102 Hamming distance
B 103 Hamming distance
B 91 Hamming distance
B 71 Hamming distance
B 77 Hamming distance
B 77 Hamming distance
B 67 Hamming distance
B 40 Hamming distance
C 179 Hamming distance
C 167 Hamming distance
C 109 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 129 Hamming distance
C 89 Hamming distance
C 86 Hamming distance
C 109 Hamming distance
C 86 Hamming distance
C 93 Hamming distance
C 89 Hamming distance
C 80 Hamming distance
C 55 Hamming distance
D 275 Hamming distance
D 250 Hamming distance
D 193 Hamming distance
D 241 Hamming distance
D 235 Hamming distance
D 186 Hamming distance
D 240 Hamming distance
D 174 Hamming distance
D 183 Hamming distance
D 193 Hamming distance
D 171 Hamming distance
D 182 Hamming distance
D 169 Hamming distance
D 159 Hamming distance
D 141 Hamming distance
D 131 Hamming distance
D 122 Hamming distance
D 111 Hamming distance
D 94 Hamming distance
I want to draw a picture like this:
[Image: desired plot]

But with the data and code above I can't get a similar picture:
[Image: Rplot03]

And I got some warning messages:
Warning messages:
1: Computation failed in stat_signif():
missing value where TRUE/FALSE needed
2: Computation failed in stat_signif():
missing value where TRUE/FALSE needed
3: Computation failed in stat_signif():
missing value where TRUE/FALSE needed
Could you help me ? Thank you !

change text size of asterisk

Really awesome ggplot package. I'd like to increase the size of the asterisks we get when map_signif_level = TRUE.

If I just set 'size = 14', it changes the bracket, not the asterisk size. Changing the text size in theme() doesn't affect the asterisks either.

Is there any way to change the text annotations ggsignif adds to the plots? Ideally I'd like to change size and face.

Thanks!

tip_length in aes

Would it be possible to set the tip_length in the aes?

I am working on some time series data. So making a custom dataframe with the annotations could be nice.
Since I am comparing two different time series at same time/x it would be nicer to remove the tip and just have the p-value/annotation.

Here is an example (not time series though, just as an example for this specific problem):

annotation_df <- data.frame(color=c("E", "H"), 
                            start=c("Good", "Good"), 
                            end=c("Very Good", "Good"),
                            y=c(max(diamonds$carat), max(diamonds$carat)),
                            label=c("Comp. 1", "Comp. 2"),
                            tip_length = c(0.5,0))

annotation_df
#>   color start       end   y   label
#> 1     E  Good Very Good 3.6 Comp. 1
#> 2     H  Fair      Good 4.7 Comp. 2

ggplot(diamonds, aes(x=cut, y=carat)) +
  geom_boxplot() +
  geom_signif(data=annotation_df,
              aes(xmin=start, xmax=end, annotations=label, y_position=y, tip_length = tip_length),
              textsize = 3, vjust = -0.2,
              manual=TRUE) +
  facet_wrap(~ color) +
  ylim(NA, 5.3)

[Image: resulting plot]

Annotations disappear when setting a high y_position

Hi,

I am using this wonderful package, which I found very useful to my work, and I am faced with a little problem here.
I am trying to do a violin plot using Seurat package.

library(Seurat)
library(ggsignif)
fig <- VlnPlot(object = pbmc_small, features.plot = 'PC1', do.return = TRUE)
fig  + geom_signif(xmin = 3, xmax = 4, y_position = 8 , annotations="**")

[Image: plot1]

It needs a bit of refinement because the annotation overlaps other part of the plot, so I set y_position from 8 to 9

library(Seurat)
library(ggsignif)
fig <- VlnPlot(object = pbmc_small, features.plot = 'PC1', do.return = TRUE) +
coord_cartesian(ylim = c(-4,15))
fig  + geom_signif(xmin = 3, xmax = 4, y_position = 9 , annotations="**")

[Image: plot2]

However, the annotation disappeared, and I do not know how to fix this.
Thank you very much in advance!

Change asterisks size?

Hi,

very nice package!
One question:
Is it possible to change only the size (or colour) of the asterisks on top of the line, instead of changing only the line size?
I tried the argument size = x, but that works only for the line and not for the asterisks.

Many thanks,
Ni-Ar

Significance stars misaligned

I find that the significance stars do not line up well between each comparison. There are a lot of stars clumped together.

Here is some reproducible code using the mpg dataset. I want to show significance in the boxplot, named p, between 2seater and all the other cars:

p <- ggplot(mpg, aes(class, hwy))
p + geom_boxplot()+geom_signif(comparisons = list(c("2seater", "compact"), c("2seater", "midsize"), c("2seater", "minivan"), c("2seater", "pickup"), c("2seater", "subcompact"), c("2seater", "suv")), annotation="***", color="red",y_position = 40)

Unfortunately the significance stars are not spread out properly:
[Image: resulting plot]

It is also kind of hard to tell which pairs I am comparing. Do you have any suggestions to fix?

Thanks so much!!
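A likely contributor to the clumping is that a single y_position puts every bracket at the same height. One way to spread them out, sketched here with the step_increase parameter that also appears in the README, is to let each additional comparison move up by a fraction of the plot height. The value 0.1 is an arbitrary choice.

```r
library(ggplot2)
library(ggsignif)

others <- c("compact", "midsize", "minivan", "pickup", "subcompact", "suv")
comparisons <- lapply(others, function(cl) c("2seater", cl))

p <- ggplot(mpg, aes(class, hwy)) +
  geom_boxplot() +
  geom_signif(
    comparisons = comparisons,
    annotations = rep("***", length(comparisons)),  # fixed labels as in the report
    step_increase = 0.1,  # each bracket sits 10% of the y range above the previous one
    color = "red"
  )
```

With the brackets stacked at different heights, their end points show which pair each label belongs to.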

Expanded to Support Two-Factor Anova

Hi,
I really love this package and have been using it extensively of late. I've recently written a piece of code that allows you to run a test like a two-factor anova followed by a post-hoc like a TukeyHSD and then automatically passes those values to geom_signif using the manual override. The x and y values are all calculated without user input. It relies on broom. I've been using it for large numbers of graphs at once. I was wondering if I'd be able to contribute to the package to save some other folks the work?
Best,
Margot

Feature request: annotate horizontal geoms

Wonderful package: you've addressed a huge gap in the ggplot ecosystem!

I wonder if you've thought about something which is a pretty natural extension--annotating comparable horizontal geoms? for instance:

library(ggplot2)
library(ggsignif)
library(ggstance)

ggplot(iris, aes(y=Species, x=Sepal.Length)) +
  geom_boxploth() +
  geom_signif(comparisons = list(c("versicolor", "virginica")), 
          map_signif_level=TRUE)

results in

Warning messages:
1: In f(..., self = self) : NAs introduced by coercion
2: Computation failed in `stat_signif()`:
missing value where TRUE/FALSE needed 

Problems using alternative tests (i.e. anova)

Thanks for the fantastic package !

I have a small issue. When I use geom_signif to add the significance layer with the code below, the layer appears, and I understand that it is based on the Wilcoxon test (please correct me if I am wrong).
However, I would like to use ANOVA, because my data show a significant difference with ANOVA but not with the Wilcoxon test.
When I set test = "anova", the significance layer does not appear.

Could you please help me with this issue as soon as possible?
Thanks for your time.
Please find the code below.

group<-ggplot(data, aes(x=level,y=height,fill=level))+geom_boxplot()  + 
  labs(x="level",y="height") + 
  theme(plot.title = element_text(hjust = 0.5)) +
  geom_signif(comparisons = list(c("High", "Low"), test = "anova"), 
              map_signif_level=T)
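Two observations on the snippet above, offered as a sketch rather than a confirmed diagnosis: test = "anova" sits inside list(...), so it is treated as part of comparisons, and anova() itself does not take two numeric vectors. Assuming that a custom test only needs to accept two vectors and return a list with a p.value entry, a one-way ANOVA wrapper could look like this. The name anova_test is hypothetical.

```r
library(ggplot2)
library(ggsignif)

# Hypothetical wrapper (not part of ggsignif): one-way ANOVA on two groups,
# returning a list with a p.value element like t.test() does.
anova_test <- function(x, y, ...) {
  d <- data.frame(
    value = c(x, y),
    group = factor(rep(c("x", "y"), times = c(length(x), length(y))))
  )
  fit <- aov(value ~ group, data = d)
  list(p.value = summary(fit)[[1]][["Pr(>F)"]][1])
}

p <- ggplot(iris, aes(Species, Sepal.Width)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("versicolor", "virginica")),
    test = anova_test,  # note: outside list(...)
    map_signif_level = TRUE
  )
```

For exactly two groups, a one-way ANOVA gives the same p-value as a t-test with equal variances, so test = "t.test" with test.args = list(var.equal = TRUE) would be an alternative.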

Error in the value of the p-value

Hello,

I have noticed that when requesting a t-test for the calculation of the p-value, I don't get the same value as from the function t.test.

t.test(condition1, condition2) I get p-value = 0.00253
geom_signif(comparisons = list(c(condition1,condition2),
test= "t.test",
map_signif_level = FALSE)
I get p-value = 0.53

I was wondering what were the parameters that you are using for the t.test

Thank you,

Pauline
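One possible explanation, offered only as a guess: if the test argument is not picked up (for example because it ends up inside list(...)), the layer would fall back to its default, which is assumed here to be wilcox.test. The two tests generally give different p-values on the same data, which can be checked directly in base R:

```r
# Same two samples, two different tests
x <- iris$Sepal.Width[iris$Species == "versicolor"]
y <- iris$Sepal.Width[iris$Species == "virginica"]

p_t <- t.test(x, y)$p.value
# exact = FALSE avoids the ties warning by using the normal approximation
p_w <- wilcox.test(x, y, exact = FALSE)$p.value

c(t.test = p_t, wilcox.test = p_w)
```

It may also be worth checking that comparisons contains the group labels shown on the x axis (as character strings) rather than the data vectors themselves.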

Error message

I am trying to use the ggsignif package to annotate the statistical difference in pollen crude protein content between two different study sites (Site A and Site E). I have both the ggsignif and ggplot2 packages installed, along with the latest version of RStudio (1.0.143), and all packages are updated. However, I keep getting an error message,

Error: No stat called StatSignif.

Below is the code for the simple plot I am attempting to make. Any help is much appreciated.

ggplot(km2014long, aes(x=site_letter, y=protein_dry_percent)) + geom_boxplot() + geom_signif(comparisons = list(c("A","E")), map_signif_level = TRUE)

feature request: allow user given text as 'p value'

To give full control to the user, it would be nice if there were a way to pass a character vector to be plotted as the 'p value'. I am thinking of something along the lines of:

geom_signif(comparisons = list(c("A", "B"), c("A", "C"), c("A", "D")),
            pvalues     = c("< 0.02", "not computable", "****"))

This is just a silly example, of course.
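
For comparison, a later example in this document calls geom_signif with an annotations argument. A minimal sketch of the requested behaviour using that argument, with made-up data and assuming one label per comparison in the same order:

```r
library(ggplot2)
library(ggsignif)

# Made-up data with four groups
set.seed(3)
df <- data.frame(
  group = rep(c("A", "B", "C", "D"), each = 10),
  value = rnorm(40)
)

p <- ggplot(df, aes(x = group, y = value)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("A", "B"), c("A", "C"), c("A", "D")),
    annotations = c("< 0.02", "not computable", "****"), # shown as given
    step_increase = 0.1 # stack the brackets so they do not overlap
  )
```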

Feature Request: 3-dimensional data

Can you make it possible for comparisons to work not only with x-axis groups but also with "fill/color" groups?

Great addition to ggplot by the way, I'm looking forward to using this!

Custom annotations when using facets

Hello, I am trying to add custom annotations to faceted (facet_wrap) ggplot2 barplots and I am unable to make it work.
When I use the built-in tests in stat_signif(), everything seems to work from a technical standpoint: I see the faceted barplots with all selected contrasts annotated with p-values. Nonetheless, I see strange things, such as comparisons that should clearly be statistically significant but are reported as non-significant.
Therefore I decided to run the tests separately and then add the annotations to the barplots manually.
This works nicely for a single plot, but when I add facets nothing works: I see just plain barplots without any annotation. I have tried adding repeated lists to 'comparisons' and 'annotations', but this piles all of the list elements on top of each other in every facet panel rather than using one element per panel.
I hope I was able to explain my issue in a clear way.

Thanks for any suggestion

Michele
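
A sketch of one approach, modelled on the per-facet annotation example in the package README: the annotations are stored in a data frame that also contains the faceting variable, and geom_signif is called with manual = TRUE so that each row is drawn only in its own panel. The column names start, end, y and label are arbitrary choices, and ggplot2 is expected to warn about unknown aesthetics when the layer is created.

```r
library(ggplot2)
library(ggsignif)

# One row per bracket; `color` matches the faceting variable below
annotation_df <- data.frame(
  color = c("E", "H"),
  start = c("Good", "Fair"),
  end = c("Very Good", "Good"),
  y = c(3.6, 4.7),
  label = c("Comp. 1", "Comp. 2")
)

p <- ggplot(diamonds, aes(x = cut, y = carat)) +
  geom_boxplot() +
  geom_signif(
    data = annotation_df,
    aes(xmin = start, xmax = end, annotations = label, y_position = y),
    textsize = 3, vjust = -0.2,
    manual = TRUE
  ) +
  facet_wrap(~color) +
  ylim(NA, 5.3)
```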

How to specify test.args

I am trying to specify additional arguments for the test performed by geom_signif. However, the following does not work:
geom_signif(comparisons = list(as.character(genotype)),
test = "t.test",test.args = c(alternative = "two.sided",var.equal = T,paired = F))
I get 11 warnings, that read:
Warning messages:
1: Computation failed in stat_signif():
invalid argument type

Could you give an example of how to specify the test.args argument?
Thanks!
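
A likely cause, shown as a sketch with the iris data rather than the poster's: c() coerces mixed values to a single type, so the logical arguments become character strings, which would explain the "invalid argument type" message. Passing test.args as a list keeps each argument's type. Separately, comparisons is expected to be a list of label pairs.

```r
library(ggplot2)
library(ggsignif)

# c() turns everything into character strings
args_vec <- c(alternative = "two.sided", var.equal = TRUE, paired = FALSE)

# list() keeps the logical values as logicals
args_list <- list(alternative = "two.sided", var.equal = TRUE, paired = FALSE)

p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("versicolor", "virginica")),
    test = "t.test",
    test.args = args_list
  )
```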

Is it possible to put the annotations in bold?

Hi there!

Thanks a lot for a great (and sanity saving!) package!
This is something small but I was wondering if it's possible to make the annotations of geom_signif in bold? I already saw it's possible to change the size (with textsize) and even the font (with family), but maybe I'm missing how to put it in bold?

Thanks in advance!

Use with coord_cartesian when outliers present

I have not been able to control the position of the significance brackets when there are outliers present.
Here's an example of what I mean:

mydf <- data.frame(ID=paste(sample(LETTERS, 163, replace=TRUE), sample(1:1000, 163, replace=FALSE), sep=''), Group=c(rep('C',10),rep('FH',10),rep('I',19),rep('IF',42),rep('NA',14),rep('NF',42),rep('NI',15),rep('NS',10),rep('PGMC4',1)), Value=rnorm(n=163))   
CN <- combn(levels(mydf$Group), 2, simplify = FALSE)  

#This is what I want the plot to look like 
ggplot(mydf, aes(x=Group, y=Value, fill=Group)) + geom_boxplot(outlier.shape = NA) + stat_compare_means(comparisons = CN)  

#Add outliers 
mydf$Value[4] <- 300 
mydf$Value[5] <- 765 
mydf$Value[6] <- 12000   

# the plot with outliers 
ggplot(mydf, aes(x=Group, y=Value, fill=Group)) + geom_boxplot(outlier.shape = NA) + stat_compare_means(comparisons = CN)

How can I incorporate coord_cartesian with this plot and get the brackets in the position that I want?
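
The code above uses stat_compare_means, which is not part of ggsignif. With geom_signif, one option is to fix the bracket heights with y_position and zoom with coord_cartesian, which changes the visible range without dropping data. A minimal sketch with made-up data; the chosen heights and limits are arbitrary:

```r
library(ggplot2)
library(ggsignif)

# Made-up data with one extreme outlier
set.seed(1)
df <- data.frame(
  Group = rep(c("C", "FH", "I"), each = 20),
  Value = rnorm(60)
)
df$Value[4] <- 300

p <- ggplot(df, aes(x = Group, y = Value, fill = Group)) +
  geom_boxplot(outlier.shape = NA) +
  geom_signif(
    comparisons = list(c("C", "FH"), c("FH", "I")),
    y_position = c(3.5, 4.2) # bracket heights set by hand, in data units
  ) +
  coord_cartesian(ylim = c(-3, 5)) # zoom only; the outlier stays in the test
```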

Error with use

Hello,

Thanks for creating this package, this was what I was missing in life.
However, I cannot get the package to work. Every time I use it I get this error:

Warning message:
Computation failed in stat_signif():
missing value where TRUE/FALSE needed

This is the used code:

pall <-  ggplot(enzymtest, aes(x =Land.use , y =nmol)) +
  geom_boxplot() +
  theme_bw() +
  geom_signif(comparisons = list(c("Forest", "Maize"),
              map_signif_level = TRUE, textsize=6))
pall

I have the newest R
Do you know if I can fix it?
Kind regards Nienke
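
A likely cause, sketched with made-up data in place of enzymtest: in the call above, the closing parenthesis of list() comes after textsize, so map_signif_level and textsize end up as extra elements of comparisons instead of arguments to geom_signif. Closing list() right after the pair of labels keeps them separate.

```r
library(ggplot2)
library(ggsignif)

# Made-up stand-in for the poster's `enzymtest` data
set.seed(7)
enzymtest <- data.frame(
  Land.use = rep(c("Forest", "Maize"), each = 10),
  nmol = c(rnorm(10, mean = 50, sd = 5), rnorm(10, mean = 40, sd = 5))
)

# What the original call passes as `comparisons`: three elements
broken <- list(c("Forest", "Maize"), map_signif_level = TRUE, textsize = 6)

# What was probably intended: a single pair of labels
fixed <- list(c("Forest", "Maize"))

pall <- ggplot(enzymtest, aes(x = Land.use, y = nmol)) +
  geom_boxplot() +
  theme_bw() +
  geom_signif(
    comparisons = fixed,
    map_signif_level = TRUE,
    textsize = 6
  )
```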

Move labels below lines

Is there a way to put the label below the line? I'm working on a graph with negative values, so the bars are above the annotation, and I'd like the line to be above the annotation as well (thus, the line would fall between the bar and the annotation).
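
A possible approach, offered as an untested-in-context sketch: geom_signif has a vjust argument (used elsewhere in this document with small values), and assuming it follows the usual text justification convention, a value above 1 should place the label under the line. The data, y_position and vjust values below are made up for illustration.

```r
library(ggplot2)
library(ggsignif)

# Made-up data with negative values
set.seed(11)
df <- data.frame(
  group = rep(c("A", "B"), each = 10),
  value = c(rnorm(10, mean = -5), rnorm(10, mean = -8))
)

p <- ggplot(df, aes(x = group, y = value)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("A", "B")),
    map_signif_level = TRUE,
    y_position = -12, # put the bracket under the data
    tip_length = 0,
    vjust = 1.5       # assumption: values above 1 push the text below the line
  )
```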

Alpha error

I'm really looking forward to using this package more, but I can't quite figure out this issue...

ggplot(iris, aes(x=Species, y=Sepal.Length)) + 
  geom_boxplot() +
  geom_signif(comparisons = list(c("versicolor", "virginica")), 
          map_signif_level=TRUE)

...gives me the error:
Error in alpha(data$colour, data$alpha) : Data must either be a data frame or a matrix

Any ideas on why this might be? Thanks!

Error with use

I tried to use one of the vignettes on this page. Following is the error message:

ggplot(dat, aes(Group, Value)) +
     geom_bar(aes(fill = Sub), stat="identity", position="dodge", width=.5) +
   geom_signif(stat="identity",
                 data=data.frame(x=c(0.875, 1.875), xend=c(1.125, 2.125),
                                 y=c(5.8, 8.5), annotation=c("**", "NS")),
                 aes(x=x,xend=xend, y=y, yend=y, annotation=annotation)) +
     geom_signif(comparisons=list(c("S1", "S2")), annotations="***",
                 y_position = 9.3, tip_length = 0, vjust=0.4) +
     scale_fill_manual(values = c("grey80", "grey20"))

#> Error in alpha(data$colour, data$alpha) : 
  Data must either be a data frame or a matrix

Making the "*" and "NS"s bigger

Is there a way to increase the text size of the significance mapping?
For me, size only changes the width of the lines.
Thanks
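
The examples elsewhere in this document set the label size with textsize, separately from size. A minimal sketch with the iris data; the value 8 is arbitrary:

```r
library(ggplot2)
library(ggsignif)

p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("versicolor", "virginica")),
    map_signif_level = TRUE,
    textsize = 8 # size of the "*" / "NS" labels
  )
```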

Text is hidden when using facet_grid

Hi, thank you for making a great library. I like ggsignif very much, but I found that the text is half covered when I use facet_grid (see attachment). Is there any way to fix it?

Rplot.pdf
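
The attachment is not reproduced here, so this is only a guess at the cause: if the label is clipped at the top of each panel, leaving more room above the data may help. The README example does this with ylim(NA, 48); the sketch below instead expands the y scale by a fraction, using the mpg data and an arbitrary 15% of extra space.

```r
library(ggplot2)
library(ggsignif)

p <- ggplot(mpg, aes(x = drv, y = hwy)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("4", "f")),
    map_signif_level = TRUE
  ) +
  facet_grid(. ~ year) +
  # leave extra room above the data so the label has space in every panel
  scale_y_continuous(expand = expansion(mult = c(0.05, 0.15)))
```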

Interaction on x-axis

Hi, first I wanna say thanks for your package - it works great so far but...
I have an issue with putting significance annotations to my bar plot where on x-axis I plotted an interaction between A and B (levels of A are: A1, A2 and levels of B are: B1, B2):

data %>%
  ggplot(aes(x = interaction(A, B), y = Y)) +
  geom_bar(stat = "identity", position = position_dodge(width = .9), aes(fill = interaction(A, B)), color = "white") +
  geom_errorbar(aes(ymin = Y - se, ymax = Y + se), width = .2, size = .8,
                position = position_dodge(width = .9))

When I run "levels(interaction(A, B))" I get: A1.B1, A1.B2, A2.B1, A2.B2, so if I wanna add to this plot geom_signif that looks for example like this:
geom_signif(comparisons = list(c("A1.B1", "A2.B1")))

I get an error:

Error in f(...) :
Can only handle data with groups that are plotted on the x-axis

Is there any solution to fix it? Thanks for help

'gpar' error when using color aesthetics in ggplot

When using geom_signif on a ggplot plot that has color aesthetics, I get the error below. However, it works if the color aesthetics are specified in another geom. It took me a bit to find the cause and a workaround for the problem; I'm not sure whether I was using ggplot incorrectly here.

The minimal example test data:

require(ggplot2)
require(ggsignif)

df <- data.frame(
    'data' = c(1,2,3,4,5,6,7,8,9,10),
    'group' = c(rep('group 1', 5), rep('group 2', 5))
)

This results in an error

p <- ggplot(df, aes(y = data, x = group, group = group, color = group)) + geom_boxplot()
p + geom_signif(comparisons = list(c('group 1', 'group 2')), y_position = 11)

Error in check.length(gparname) : 
  'gpar' element 'fontsize' must not be length 0
In addition: Warning message:
In is.na(colour) : is.na() applied to non-(list or vector) of type 'NULL'

while this works as expected

p <- ggplot(df, aes(y = data, x = group, group = group)) + geom_boxplot(aes(color = group))
p + geom_signif(comparisons = list(c('group 1', 'group 2')), y_position = 11)

Comparisons defined not by columns but levels of one column

Hi,

The "ggsignif" package looks handy, but is there a way to define comparisons by column?
In my case I have a logical variable in one column and I generate a bargraph out of it:

ggplot(persons, aes(x = gender, y = height)) +
  geom_boxplot()

tip_length not used when geom_signif() with stat = "identity"

Hi,

As can be seen in your vignette, the tips at the ends of the significance bars are not drawn, regardless of the tip_length value supplied, when the comparisons are not passed to geom_signif(). I believe this is because the parameter is not passed on by geom_signif() when stat != "signif".

Using stat = "identity" is a useful approach when one wants to annotate the graph with custom significance values, e.g. results of multiple comparisons, lsmeans, etc., which might be stored in a data.frame. I'm not too familiar with the ggplot way of coding aesthetics, so for now I've avoided forking and proposing a PR, mostly because I would need a lot of trial and error, but I feel like this should be an easy fix for the maintainer.

Thanks for looking into it!
Thomas

geom_signif fails when reassigning factor levels

Hi Constantin,

Thank you for your work in developing the geom_signif extension to ggplot. It is a great tool.

I want to bring to your attention an issue I have run into using geom_signif: the geom_signif layer will not render in the plot if factor levels are reassigned within ggplot. I found this while using geom_signif to annotate some bar plots.

Here is an example:

library(plyr)
library(ggplot2)
library(ggsignif)

#generate some data

mtcars.meanMPG <- 
  ddply(
    mtcars,
    .(carb, am),
    summarize,
    meanMPG = round(mean(mpg),3)
  )

This works:

ggplot(
  data=mtcars.meanMPG,
  aes(
    x=factor(carb),
    y=meanMPG,
    fill=am,
    group=am
  )
)+
  geom_bar(
    stat = "identity",
    position = position_dodge(preserve = "single")
  )+
  geom_signif(
    annotation="p = 0.01",
    y_position=29,
    xmin=1.7,
    xmax=2.3,
    tip_length = c(0.01, 0.01)
  )

This does not:

ggplot(
  data=mtcars.meanMPG,
  aes(
    x=factor(carb, levels = unique(rev(mtcars.meanMPG$carb))),
    y=meanMPG,
    fill=am,
    group=am
  )
)+
  geom_bar(
    stat = "identity",
    position = position_dodge(preserve = "single")
  )+
  geom_signif(
    annotation="p = 0.01",
    y_position=29,
    xmin=1.7,
    xmax=2.3,
    tip_length = c(0.01, 0.01)
  )

Nor does this:

mtcars.meanMPG$carb <- 
factor(mtcars.meanMPG$carb, levels = unique(rev(mtcars.meanMPG$carb)))

ggplot(
  data=mtcars.meanMPG,
  aes(
    x=carb,
    y=meanMPG,
    fill=am,
    group=am
  )
)+
  geom_bar(
    stat = "identity",
    position = position_dodge(preserve = "single")
  )+
  geom_signif(
    annotation="p = 0.01",
    y_position=29,
    xmin=1.7,
    xmax=2.3,
    tip_length = c(0.01, 0.01)
  )

I often find myself reassigning factor levels in order to get figures to render correctly. Just letting you know in case this is something you want to look into.

Thanks,
-Joe

Padding between text and horizontal line

It would be great to have an option like text_padding to allow a gap to be specified between the text and the horizontal line. Currently the text is often just touching the line, which could be improved slightly for publication etc. Thanks!
