const-ae / ggsignif

Easily add significance brackets to your ggplots

Home Page: https://const-ae.github.io/ggsignif/

License: GNU General Public License v3.0

R 65.22% TeX 32.64% Makefile 2.15%

asterisk ggplot-extension ggplot2 rstats significance-stars

ggsignif's Issues

feature request - annotate with magnitude of difference in mean

Thanks for the great ggplot2 extension.

Is it possible to easily display the magnitude of the difference between two groups?

I would like to be able to see whether two groups differ by an economically meaningful amount as well as being statistically significantly different, with an annotation such as "+1.05**".

I know this is possible by passing in a custom data frame with the new values; however, it requires a fair amount of code. It involves:

1. Computing the mean across the different groups of the dataset
2. Pivoting the data to apply a pairwise difference calculation across all combinations
3. Unpivoting the data back to tidy format to pass into the geom
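
The steps above can be sketched in base R without any pivoting. This is only an illustration: pairwise_mean_labels is a hypothetical helper (not part of ggsignif), it uses a Welch t-test for the stars, and the cut-offs 0.05 / 0.01 / 0.001 are assumptions.

```r
# Hypothetical helper (not part of ggsignif): one row per pair of groups with
# the difference in means and a label such as "+1.05**".
pairwise_mean_labels <- function(values, groups, digits = 2) {
  groups <- factor(groups)
  means <- tapply(values, groups, mean)
  pairs <- combn(levels(groups), 2, simplify = FALSE)
  do.call(rbind, lapply(pairs, function(p) {
    diff <- unname(means[p[2]] - means[p[1]])
    pval <- t.test(values[groups == p[1]], values[groups == p[2]])$p.value
    # Assumed star thresholds; adjust to taste.
    stars <- if (pval < 0.001) "***" else if (pval < 0.01) "**" else if (pval < 0.05) "*" else ""
    data.frame(start = p[1], end = p[2], diff = diff, p.value = pval,
               label = paste0(sprintf(paste0("%+.", digits, "f"), diff), stars),
               stringsAsFactors = FALSE)
  }))
}

labels_df <- pairwise_mean_labels(iris$Sepal.Length, iris$Species)
```

The resulting data frame has the start, end, and label columns that a manual annotation data frame needs; a y position column would still have to be added before passing it to the geom.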

Computation failed in `stat_signif()`

Hello,
I ran into some problems when using this package. Could you help me?
This is my code:

my_comparisons <- list(c("A", "B"), c("B", "C"), c("C", "D"))
ggplot(distance, aes(x=Group,y=Distance, color=Method,shape=Method)) +
  geom_boxplot(fill="cornflowerblue",
               color="black", notch=TRUE) +
  geom_point(position = "jitter", color="blue", alpha=.5) +
  geom_rug(side="1", color="black")+theme_bw()+
  geom_signif(comparisons = my_comparisons,test = "t.test")+
  facet_grid(.~Method)

This is my data structure:
Group Distance Method
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
A 0 Hamming distance
B 162 Hamming distance
B 151 Hamming distance
B 90 Hamming distance
B 150 Hamming distance
B 131 Hamming distance
B 107 Hamming distance
B 145 Hamming distance
B 87 Hamming distance
B 103 Hamming distance
B 96 Hamming distance
B 114 Hamming distance
B 102 Hamming distance
B 103 Hamming distance
B 91 Hamming distance
B 71 Hamming distance
B 77 Hamming distance
B 77 Hamming distance
B 67 Hamming distance
B 40 Hamming distance
C 179 Hamming distance
C 167 Hamming distance
C 109 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 152 Hamming distance
C 129 Hamming distance
C 89 Hamming distance
C 86 Hamming distance
C 109 Hamming distance
C 86 Hamming distance
C 93 Hamming distance
C 89 Hamming distance
C 80 Hamming distance
C 55 Hamming distance
D 275 Hamming distance
D 250 Hamming distance
D 193 Hamming distance
D 241 Hamming distance
D 235 Hamming distance
D 186 Hamming distance
D 240 Hamming distance
D 174 Hamming distance
D 183 Hamming distance
D 193 Hamming distance
D 171 Hamming distance
D 182 Hamming distance
D 169 Hamming distance
D 159 Hamming distance
D 141 Hamming distance
D 131 Hamming distance
D 122 Hamming distance
D 111 Hamming distance
D 94 Hamming distance
I want to draw a picture like this:

But with the data and code above, I can't get a similar picture.

And I got some warning messages:
Warning messages:
1: Computation failed in stat_signif():
missing value where TRUE/FALSE needed
2: Computation failed in stat_signif():
missing value where TRUE/FALSE needed
3: Computation failed in stat_signif():
missing value where TRUE/FALSE needed
Could you help me? Thank you!

For loop for anova p-values

Can I use a for loop with your package to make the p-values of aov appear in the boxplot?
I tried, but I am not able to iterate over the Pr(>F) values; the p-values keep getting overwritten.

Thanks for your time!
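
A minimal sketch of such a loop, assuming the goal is one ANOVA p-value per response variable. The iris columns are only stand-ins for the poster's data. Writing each result into its own slot of a preallocated vector avoids overwriting the previous value.

```r
# One-way ANOVA p-value (the "Pr(>F)" entry) for several response variables.
responses <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
aov_p <- numeric(length(responses))
names(aov_p) <- responses

for (i in seq_along(responses)) {
  # Build the formula <response> ~ Species for the i-th response.
  fit <- aov(reformulate("Species", response = responses[i]), data = iris)
  # Store the result at position i instead of reusing a single variable.
  aov_p[i] <- summary(fit)[[1]][["Pr(>F)"]][1]
}
```

Each stored value could then be formatted as text and supplied to the plot of the matching variable as an annotation.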

Significance stars misaligned

I find that the significance stars do not line up well between each comparison. There are a lot of stars clumped together.

Here is some reproducible code using the mpg dataset. I want to show significance in the boxplot, named p, between 2seater and all the other cars:

p <- ggplot(mpg, aes(class, hwy))
p + geom_boxplot()+geom_signif(comparisons = list(c("2seater", "compact"), c("2seater", "midsize"), c("2seater", "minivan"), c("2seater", "pickup"), c("2seater", "subcompact"), c("2seater", "suv")), annotation="***", color="red",y_position = 40)

Unfortunately the significance stars are not spread out properly:

It is also kind of hard to tell which pairs I am comparing. Do you have any suggestions to fix?

Thanks so much!!
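
One way to spread the brackets out is sketched below. It assumes that y_position accepts one value per comparison and that annotations takes one label per comparison; the starting height of 46 and the spacing of 3 are arbitrary choices for the mpg data.

```r
library(ggplot2)
library(ggsignif)

# Compare "2seater" against every other class of interest.
others <- c("compact", "midsize", "minivan", "pickup", "subcompact", "suv")
comparisons <- lapply(others, function(other) c("2seater", other))

p <- ggplot(mpg, aes(class, hwy)) +
  geom_boxplot() +
  geom_signif(comparisons = comparisons,
              annotations = rep("***", length(comparisons)),
              # One height per bracket so that the brackets do not overlap.
              y_position = seq(46, by = 3, length.out = length(comparisons)),
              color = "red")
```

Stacking the brackets at different heights also makes it easier to see which pair each bracket belongs to.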

Puzzled by annotations that contain duplicate content

Hi Constantin,
The package you provide is exceedingly useful.
But I'm having trouble with aes(annotation=). When the annotation vector contains duplicate values, the annotations are merged and drawn at the midpoint coordinate.
You can try this code, which comes from https://cran.r-project.org/web/packages/ggsignif/vignettes/intro.html.
After I modified the content of annotation, this strange result appeared.
geom_signif(stat="identity", data=data.frame(x=c(0.875, 1.875), xend=c(1.125, 2.125), y=c(5.8, 8.5), annotation=c("**", "**")), aes(x=x,xend=xend, y=y, yend=y, annotation=annotation))
Waiting for your solution.

How to specify test.args

I am trying to specify additional arguments for the test performed by geom_signif. However, the following does not work:
geom_signif(comparisons = list(as.character(genotype)),
test = "t.test",test.args = c(alternative = "two.sided",var.equal = T,paired = F))
I get 11 warnings, that read:
Warning messages:
1: Computation failed in stat_signif():
invalid argument type

Could you give an example of how to specify the test.args argument?
Thanks!
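
A base R sketch of what may be going wrong here: c() coerces mixed types to a single type, so the logical flags turn into strings, whereas list() keeps each argument's type. The do.call() line assumes the extra arguments are appended after the two samples; that calling convention is an assumption, not taken from the package source.

```r
# c() coerces everything to character: "two.sided", "TRUE", "FALSE".
bad_args <- c(alternative = "two.sided", var.equal = TRUE, paired = FALSE)

# list() keeps the logical values as logicals.
good_args <- list(alternative = "two.sided", var.equal = TRUE, paired = FALSE)

x <- iris$Sepal.Length[iris$Species == "setosa"]
y <- iris$Sepal.Length[iris$Species == "versicolor"]

# Assumed calling convention: the two samples followed by the extra arguments.
res <- do.call(t.test, c(list(x, y), good_args))
```

In the plot call this would correspond to test.args = list(alternative = "two.sided", var.equal = TRUE, paired = FALSE).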

Error in the value of the p-value

Hello,

I have noticed that when asking for the t-test to calculate the p-value, I wasn't getting the same value as from the function t.test.

t.test(condition1, condition2) I get p-value = 0.00253
geom_signif(comparisons = list(c(condition1,condition2),
test= "t.test",
map_signif_level = FALSE)
I get p-value = 0.53

I was wondering what were the parameters that you are using for the t.test

Thank you,

Pauline

How do you ignore the "NS" and just show those which are significantly different?

eg:
https://openi.nlm.nih.gov/imgs/512/213/3265556/PMC3265556_1756-0500-4-539-1.png
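
One possible approach, sketched in base R, is to run the tests first and hand only the significant pairs to the plot. significant_pairs is a hypothetical helper, and the t-test and alpha = 0.05 are assumptions; PlantGrowth is used purely as example data.

```r
# Hypothetical helper: return only the pairs of groups whose p-value is below alpha.
significant_pairs <- function(values, groups, alpha = 0.05, test = t.test) {
  pairs <- combn(levels(factor(groups)), 2, simplify = FALSE)
  keep <- vapply(pairs, function(p) {
    test(values[groups == p[1]], values[groups == p[2]])$p.value < alpha
  }, logical(1))
  pairs[keep]
}

sig <- significant_pairs(PlantGrowth$weight, PlantGrowth$group)
```

The returned list has the same shape as the comparisons argument, so it could be passed on as comparisons = sig, provided the same test is used in the plot.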

Problems using alternative tests (i.e. anova)

Thanks for the fantastic package !

I have a small issue. When I use geom_signif to add the significance layer with the following code, it works, and I understand that the result is based on the Wilcoxon test. Please correct me if I am wrong.
However, I would like to use ANOVA, since my data show a significant difference with ANOVA but not with the Wilcoxon test.
When I set test = "anova", the significance layer does not appear.

Could you please help me with this issue as soon as possible?
Thanks for your time.
Please find the code below.

group<-ggplot(data, aes(x=level,y=height,fill=level))+geom_boxplot()  + 
  labs(x="level",y="height") + 
  theme(plot.title = element_text(hjust = 0.5)) +
  geom_signif(comparisons = list(c("High", "Low"), test = "anova"), 
              map_signif_level=T)
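
A hedged sketch of a custom test that could stand in for "anova". It assumes that the test is called with the values of the two compared groups and that the result must be a list with an entry named p.value; aov_two_groups is a hypothetical name, and PlantGrowth is example data.

```r
# Hypothetical wrapper: one-way ANOVA on two samples, returning a p.value entry.
aov_two_groups <- function(x, y, ...) {
  d <- data.frame(value = c(x, y),
                  group = factor(rep(c("x", "y"), c(length(x), length(y)))))
  fit <- aov(value ~ group, data = d)
  list(p.value = summary(fit)[[1]][["Pr(>F)"]][1])
}

x <- PlantGrowth$weight[PlantGrowth$group == "trt1"]
y <- PlantGrowth$weight[PlantGrowth$group == "trt2"]
res <- aov_two_groups(x, y)
```

With only two groups, a one-way ANOVA gives the same p-value as an equal-variance t-test, so test = "t.test" with var.equal = TRUE in test.args should be equivalent. Note also that in the call above, test = "anova" sits inside list(), so it would be treated as part of comparisons rather than as an argument of geom_signif.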

feature request: allow user given text as 'p value'

To give full control to the user, it would be nice if it were possible to pass a character vector to be plotted as the 'p value'. I am thinking of something along the lines of:

geom_signif(comparisons = list(c("A", "B"), c("A", "C"), c("A", "D")),
            pvalues     = c("< 0.02", "not computable", "****"))

This is just a silly example, of course.
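
For comparison, a sketch using the annotations argument, which appears in vignette code quoted elsewhere on this page. It assumes one label and one y position per comparison; the iris data and the heights 8.2 and 8.6 are arbitrary.

```r
library(ggplot2)
library(ggsignif)

p <- ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot() +
  geom_signif(comparisons = list(c("setosa", "versicolor"),
                                 c("versicolor", "virginica")),
              # Free text drawn instead of a computed p-value.
              annotations = c("< 0.02", "not computable"),
              y_position = c(8.2, 8.6))
```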

Text is hidden when using facet_grid

Hi, Thank you for making a great library. I like your ggsignif very much. But I found the text is half covered when I use facet_grid. (See attachment). Is there any way to fix it?

Rplot.pdf

Change asterisks size?

Hi,

very nice package!
One question:
-is it possible to specifically change only the size (or colour) of the asterisks on top of the line, instead of changing only the line size?
I tried the argument size = x, but that works only for the line and not for the asterisks.

Many thanks,
Ni-Ar
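
A short sketch of the two size settings, assuming size controls the bracket line width and textsize controls the label, as other posts on this page describe; the values 0.5 and 6 are arbitrary.

```r
library(ggplot2)
library(ggsignif)

p <- ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot() +
  geom_signif(comparisons = list(c("versicolor", "virginica")),
              map_signif_level = TRUE,
              size = 0.5,     # width of the bracket lines
              textsize = 6)   # size of the stars / label text
```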

Feature request: support for coord_flip

When using coord flip, one often wants to have the labels at the right of the brackets. Is there any way to do this?

'gpar' error when using color aesthetics in ggplot

When using geom_signif on a ggplot plot that has color aesthetics, I get the error below. However, it works if the color aesthetics are specified in another geom. It took me a while to find the cause and a workaround, and I'm not sure whether I was using ggplot incorrectly here.

The minimal example test data:

require(ggplot2)
require(ggsignif)

df <- data.frame(
    'data' = c(1,2,3,4,5,6,7,8,9,10),
    'group' = c(rep('group 1', 5), rep('group 2', 5))
)

This results in an error

p <- ggplot(df, aes(y = data, x = group, group = group, color = group)) + geom_boxplot()
p + geom_signif(comparisons = list(c('group 1', 'group 2')), y_position = 11)

Error in check.length(gparname) : 
  'gpar' element 'fontsize' must not be length 0
In addition: Warning message:
In is.na(colour) : is.na() applied to non-(list or vector) of type 'NULL'

while this works as expected

p <- ggplot(df, aes(y = data, x = group, group = group)) + geom_boxplot(aes(color = group))
p + geom_signif(comparisons = list(c('group 1', 'group 2')), y_position = 11)

Use with coord_cartesian when outliers present

I have not been able to control the position of the significance brackets when there are outliers present.
Here's an example of what I mean:

mydf <- data.frame(ID=paste(sample(LETTERS, 163, replace=TRUE), sample(1:1000, 163, replace=FALSE), sep=''), Group=c(rep('C',10),rep('FH',10),rep('I',19),rep('IF',42),rep('NA',14),rep('NF',42),rep('NI',15),rep('NS',10),rep('PGMC4',1)), Value=rnorm(n=163))   
CN <- combn(levels(mydf$Group), 2, simplify = FALSE)  

#This is what I want the plot to look like 
ggplot(mydf, aes(x=Group, y=Value, fill=Group)) + geom_boxplot(outlier.shape = NA) + stat_compare_means(comparisons = CN)  

#Add outliers 
mydf$Value[4] <- 300 
mydf$Value[5] <- 765 
mydf$Value[6] <- 12000   

# the plot with outliers 
ggplot(mydf, aes(x=Group, y=Value, fill=Group)) + geom_boxplot(outlier.shape = NA) + stat_compare_means(comparisons = CN)

How can I incorporate coord_cartesian with this plot and get the brackets in a position that I want?

error when adding geom_signif to a plot

Dear Constantin

Thank you very much for your package ggsignif, I really appreciate it.
I am trying to add the results of a chi-square test to a ggplot, and it was mentioned that I could use ggsignif:

https://stackoverflow.com/questions/51886623/compare-dependent-proportions-in-a-ggplot

Moreover, I found your advice to use geom_signif

#23

However, if I add this to my plot:

geom_signif(data = annotation_df,
              aes(annotations = annotations, xmin = xmin, xmax = xmax, y_position = y_position),
              manual = TRUE)

df <- data.frame(timepoint=rep(0:2, each=10),response=c("A","B","A","A","A","A","A","A","B","B","A","A","A","A","A","A","A","B","B","B","A","B","B","B","B","B","A","B","B","B"),variable=rep(c("var1","var2"),each=5, 3), subject=rep(1:5,6))
df$timepoint <- factor(df$timepoint, level=c(1,0,2), labels=c("method_A","baseline","method_B"))

df %>% add_count(timepoint,variable,response) %>% add_count(timepoint,variable) %>% mutate(freq=n/nn*100) %>% mutate(total=1) -> df

stats <-data.frame(xmax=c(rep(c("baseline","method_B"),2)))
stats %>% mutate(xmin=as.factor(c(rep(c("method_A","baseline"),2)))) %>% 
  mutate(annotations=c("1","0.2","1","0.5")) %>% 
  mutate(y_position=5) %>% 
  mutate(variable=as.factor(c("var1","var1","var2","var2"))) -> annotation_df

ggplot(df,
       aes(x = timepoint, stratum = response, alluvium = subject,
           y = total, 
           fill = response, label = paste(freq,"%") )) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  geom_text(stat = "stratum", size = 3) +
  theme(legend.position = "none") +
  geom_signif(data = annotation_df,
              aes(annotations = annotations, xmin = xmin, xmax = xmax, y_position = y_position),
              manual = TRUE) +
  facet_wrap(~variable)

I get this error:
Warning: Ignoring unknown aesthetics: annotations, xmin, xmax, y_position
Error in FUN(X[[i]], ...) : object 'response' not found

If I leave out geom_signif(...), everything works.
Thank you for any advice,
Jacob

Is it possible to compare between facets?

For example, if I wanted to compare classes in the mpg data set against themselves based on whether they had manual or automatic transmissions (i.e. compact(auto) vs compact(manual)).

Is this possible?

library(tidyverse)
library(ggsignif)

mpg <- mpg %>% 
  separate(trans, c("type", "variant"), sep="\\(")

ggplot(mpg, aes(class, hwy)) +
  geom_boxplot() +
  facet_grid(.~type)

Warning is produced when map_signif_level is specified as a numeric vector

Hi,

A warning is generated in if (params$map_signif_level == TRUE) , when map_signif_level is specified as a numeric vector.

Reproducible example:

library(ggplot2)
library(ggsignif)
ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot()+
  geom_signif(comparisons = list(c("setosa", "versicolor")),
              map_signif_level = c("****"=0.0001, "***"=0.001, "**"=0.01,  "*"=0.05))

Warning message:

In if (params$map_signif_level == TRUE) { :
The condition has length > 1 and only the first element will be used

The warning is generated because if() can only evaluate a condition of length 1.

Suggestion:

if (params$map_signif_level[1] == TRUE)

Thanks for your work:-)!
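
An alternative to indexing the first element is base R's isTRUE(), which is TRUE only for a single logical TRUE. A small sketch, independent of the package internals:

```r
level_map <- c("****" = 0.0001, "***" = 0.001, "**" = 0.01, "*" = 0.05)

# Elementwise comparison: the result has one entry per threshold, not one in total.
n_conditions <- length(level_map == TRUE)

# isTRUE() returns a single logical value for any input.
vector_case <- isTRUE(level_map)   # FALSE: a numeric vector is not TRUE
flag_case <- isTRUE(TRUE)          # TRUE
```

A check written as if (isTRUE(params$map_signif_level)) would therefore take the same branch for TRUE and skip it for a named numeric vector without warning.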

Feature request: Significant label for one sample t Test

Hi there
I wonder how to plot the significance label for a one-sample t-test. In other words, how can I plot the "*" for each column without brackets?

Thanks

Haiyang

Error with use

Hello,

Thanks for creating this package, this was what I was missing in life.
However I do not get the package to work. Every time I use the package I get this error:

Warning message:
Computation failed in stat_signif():
missing value where TRUE/FALSE needed

This is the used code:

pall <-  ggplot(enzymtest, aes(x =Land.use , y =nmol)) +
  geom_boxplot() +
  theme_bw() +
  geom_signif(comparisons = list(c("Forest", "Maize"),
              map_signif_level = TRUE, textsize=6))
pall

I have the newest R
Do you know if I can fix it?
Kind regards Nienke
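
One thing worth checking in the call above is where list() is closed. As posted, map_signif_level and textsize end up inside list(), so they become extra entries of comparisons instead of arguments of geom_signif. Whether this is what triggers the warning is an assumption, but the difference is easy to see in base R:

```r
# As posted: list() is closed after the extra arguments.
as_posted <- list(c("Forest", "Maize"), map_signif_level = TRUE, textsize = 6)

# Probably intended: list() closed right after the pair of group names.
intended <- list(c("Forest", "Maize"))

n_posted <- length(as_posted)    # three "comparisons", two of them not group pairs
n_intended <- length(intended)   # one comparison
```

The intended call would then read geom_signif(comparisons = list(c("Forest", "Maize")), map_signif_level = TRUE, textsize = 6).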

Error message

I am trying to use the ggsignif package to annotate the statistical difference in pollen crude protein content between two different study sites (Site A and Site E). I have both the ggsignif and ggplot2 packages installed, along with the latest version of RStudio (1.0.143), and all packages are updated. However, I keep getting an error message:

Error: No stat called StatSignif.

Below is the code for the simple plot I am attempting to make. Any help is much appreciated.

ggplot(km2014long, aes(x=site_letter, y=protein_dry_percent)) + geom_boxplot() + geom_signif(comparisons = list(c("A","E")), map_signif_level = TRUE)

Feature request: annotate horizontal geoms

Wonderful package. You've addressed a huge lacuna in the ggplot ecosystem!

I wonder if you've thought about something which is a pretty natural extension--annotating comparable horizontal geoms? for instance:

library(ggplot2)
library(ggsignif)
library(ggstance)

ggplot(iris, aes(y=Species, x=Sepal.Length)) +
  geom_boxploth() +
  geom_signif(comparisons = list(c("versicolor", "virginica")), 
          map_signif_level=TRUE)

results in

Warning messages:
1: In f(..., self = self) : NAs introduced by coercion
2: Computation failed in `stat_signif()`:
missing value where TRUE/FALSE needed

Dashed horizontal line

Is there any way to make the horizontal line (and line tips, if tip_length != 0) a different linetype? For example, if I wanted a dashed horizontal line? I kept trying to pass the linetype = 2 to various plot layers (including the base layer), but that didn't work. Other aesthetics like color (changes color of both the line and the annotation) and size (changes size of annotation only) worked just fine. Any suggestions?

I think having the option for dashed lines might be useful, especially for making figures for publications that require black and white images (or those that make you pay for color images). Thanks again!

Comparisons defined not by columns but levels of one column

Hi,

The "ggsignif" package looks handy, but is there a way to define comparisons by column?
In my case I have a logical variable in one column and I generate a bargraph out of it:

ggplot(persons, aes(x = gender, y = height)) +
  geom_boxplot()

Expanded to Support Two-Factor Anova

Hi,
I really love this package and have been using it extensively of late. I've recently written a piece of code that allows you to run a test like a two-factor anova followed by a post-hoc like a TukeyHSD and then automatically passes those values to geom_signif using the manual override. The x and y values are all calculated without user input. It relies on broom. I've been using it for large numbers of graphs at once. I was wondering if I'd be able to contribute to the package to save some other folks the work?
Best,
Margot

Changing the size of the annotation text

Hi, I am passing my p values using the annotation argument. Is there any way to reduce the size of the text? They appear quite big in my plot.
Thanks.

tip_length in aes

Would it be possible to set the tip_length in the aes?

I am working on some time series data. So making a custom dataframe with the annotations could be nice.
Since I am comparing two different time series at same time/x it would be nicer to remove the tip and just have the p-value/annotation.

Here is an example (not time series though, just as an example for this specific problem):

annotation_df <- data.frame(color=c("E", "H"), 
                            start=c("Good", "Good"), 
                            end=c("Very Good", "Good"),
                            y=c(max(diamonds$carat), max(diamonds$carat)),
                            label=c("Comp. 1", "Comp. 2"),
                            tip_length = c(0.5,0))

annotation_df
#>   color start       end   y   label
#> 1     E  Good Very Good 3.6 Comp. 1
#> 2     H  Fair      Good 4.7 Comp. 2

ggplot(diamonds, aes(x=cut, y=carat)) +
  geom_boxplot() +
  geom_signif(data=annotation_df,
              aes(xmin=start, xmax=end, annotations=label, y_position=y, tip_length = tip_length),
              textsize = 3, vjust = -0.2,
              manual=TRUE) +
  facet_wrap(~ color) +
  ylim(NA, 5.3)

Interaction on x-axis

Hi, first I wanna say thanks for your package - it works great so far but...
I have an issue with putting significance annotations to my bar plot where on x-axis I plotted an interaction between A and B (levels of A are: A1, A2 and levels of B are: B1, B2):

data %>%
ggplot(aes(x = interaction(A, B), y = Y)) +
geom_bar(stat="identity", position=position_dodge(width = .9), aes(fill = interaction(A, B)), color = "white") +
geom_errorbar(aes(ymin=Y-se, ymax=Y+se), width = .2, size = .8,
position = position_dodge(width = .9))

When I run "levels(interaction(A, B))" I get: A1.B1, A1.B2, A2.B1, A2.B2, so if I wanna add to this plot geom_signif that looks for example like this:
geom_signif(comparisons = list(c("A1.B1", "A2.B1")))

I get an error:

Error in f(...) :
Can only handle data with groups that are plotted on the x-axis

Is there any solution to fix it? Thanks for help

Missing p-value with multiple comparisons

Hi there,

first of all, thank you for your package. It's very helpful! I'm using the package to display the result of multiple tests and some p-vals are not plotted

This is an example with synthetic data:

library(ggplot2)
library(ggsignif)

ggplot(data.frame(y=runif(100), x=sample(c("A", "B", "C", "D"), size = 100, replace = TRUE)), aes(x = x, y=y)) +
geom_boxplot() + geom_signif(comparisons = list(c(1,2), c(2,3), c(1,3), c(1,4), c(2,4)), step_increase = .1)

and this is the result:

I tried playing around with "step_increase", but it did not help. Any idea how to fix it? Thanks!

Best wishes,
Luca

Alpha error

I'm really looking forward to using this package more, but I can't quite figure out this issue...

ggplot(iris, aes(x=Species, y=Sepal.Length)) + 
  geom_boxplot() +
  geom_signif(comparisons = list(c("versicolor", "virginica")), 
          map_signif_level=TRUE)

...gives me the error:
Error in alpha(data$colour, data$alpha) : Data must either be a data frame or a matrix

Any ideas on why this might be? Thanks!

feature request: multiple testing correction

Unless I am missing something, there is currently no way to include a multiple comparison correction. This would be a very useful addition.
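
Until such an option exists, one workaround is to compute and adjust the p-values beforehand with base R's p.adjust() and pass the results in as text. A sketch, in which the Holm method, the Welch t-test, and the PlantGrowth data are all assumptions:

```r
groups <- PlantGrowth$group
pairs <- combn(levels(groups), 2, simplify = FALSE)

# Raw p-value for each pair of groups.
raw_p <- vapply(pairs, function(p) {
  t.test(PlantGrowth$weight[groups == p[1]],
         PlantGrowth$weight[groups == p[2]])$p.value
}, numeric(1))

# Holm correction across all pairwise tests.
adj_p <- p.adjust(raw_p, method = "holm")
adj_labels <- paste("p =", signif(adj_p, 2))
```

pairs and adj_labels line up one to one, so they could be supplied together as the comparisons and their text annotations.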

Feature Request: 3-dimensional data

Can you make it possible that not only comparisons with x-axis groups but also "fill/color" groups works?

Great addition to ggplot by the way, I'm looking forward to using this!

Padding between text and horizontal line

It would be great to have an option like text_padding to allow a gap to be specified between the text and the horizontal line. Currently the text is often just touching the line, which could be improved slightly for publication etc. Thanks!

Making the "*" and "NS"s bigger

Is there a way to increase the text size of the significance mapping?
size works only for the width of the lines for me
Thanks

Error with use

I tried to use one of the vignettes on this page. Following is the error message:

ggplot(dat, aes(Group, Value)) +
     geom_bar(aes(fill = Sub), stat="identity", position="dodge", width=.5) +
   geom_signif(stat="identity",
                 data=data.frame(x=c(0.875, 1.875), xend=c(1.125, 2.125),
                                 y=c(5.8, 8.5), annotation=c("**", "NS")),
                 aes(x=x,xend=xend, y=y, yend=y, annotation=annotation)) +
     geom_signif(comparisons=list(c("S1", "S2")), annotations="***",
                 y_position = 9.3, tip_length = 0, vjust=0.4) +
     scale_fill_manual(values = c("grey80", "grey20"))

#> Error in alpha(data$colour, data$alpha) : 
  Data must either be a data frame or a matrix

I want to do a paired Wilcoxon test; where should I put the "paired" parameter?

q <- ggplot(zymo, aes(Method, concentrition, fill = Method))+geom_boxplot() + geom_signif(comparisons = compaired,step_increase = 0.5,map_signif_level = F, test = wilcox.test)
I noticed "test.args", but I still wonder how to use it, or how to run a paired Wilcoxon test?
thanks a lot!
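
A base R sketch of how extra arguments such as paired reach the test when they are given as a list. The do.call() convention (two samples followed by the extra arguments) is an assumption about how the arguments are forwarded; the built-in sleep data serve as a paired example, and exact = FALSE only avoids a warning about ties.

```r
# Paired measurements: the same 10 subjects under two treatments.
x <- sleep$extra[sleep$group == "1"]
y <- sleep$extra[sleep$group == "2"]

extra_args <- list(paired = TRUE, exact = FALSE)

# Assumed calling convention: samples first, then the extra arguments.
res <- do.call(wilcox.test, c(list(x, y), extra_args))
```

In the plot call this would correspond to test = "wilcox.test", test.args = list(paired = TRUE). A paired test only makes sense if the rows of the two groups are in matching order.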

Is it possible to put the annotations in bold?

Hi there!

Thanks a lot for a great (and sanity saving!) package!
This is something small but I was wondering if it's possible to make the annotations of geom_signif in bold? I already saw it's possible to change the size (with textsize) and even the font (with family), but maybe I'm missing how to put it in bold?

Thanks in advance!

tip_length not used when geom_signif() with stat = "identity"

Hi,

As can be seen in your vignette, tips at the extremities of significance bars are not drawn regardless of tip_length value supplied in case comparisons to be done are not passed to geom_signif(). I believe this is caused by this parameter not being passed to the geom_signif() function when the stat != "signif".

This is a useful approach when one wants to annotate the graph with custom significance values, e.g. results of multiple comparisons, lsmeans etc., which might be stored in a data.frame. I'm not too familiar with the ggplot way of coding aesthetics, so for now I've avoided forking and proposing a PR, mostly because I would need a lot of trial-and-error, but I feel like this should be an easy fix for the maintainer.

Thanks for looking into it!
Thomas

'textsize' does not affect the text size

Hi Constantin,

Thanks for the awesome package! I was agonizing over how to plot dozens of significance comparisons over several figures today, and was ecstatic when I found that you had gone ahead and implemented this!

The issue I'm having is that 'textsize' doesn't seem to be doing anything. It's a minor issue as I can create larger plots and scale them down after, but just a heads up.

Thanks again!
Virginia

Move labels below lines

Is there a way to put the label below the line? I'm working on a graph with negative values, so the bars are above the annotation, and I'd like the line to be above the annotation as well (thus, the line would fall between the bar and the annotation).

change text size of asterisk

Really awesome ggplot package. I'd like to increase the size of the asterisks we get when map_signif_level = true.

If I just set 'size = 14', it changes the bracket, not the asterisk size. Changing the size of text in theme() doesn't affect the asterisk either.

Is there any way to change the text annotations ggsignif adds to the plots? Ideally I'd like to change size and face.

Thanks!

geom_signif fails when reassigning factor levels

Hi Constantin,

Thank you for your work in developing the geom_signif extension to ggplot. It is a great tool.

I want to bring to your attention an issue I have run into using geom_signif, specifically, that the geom_signif layer will not render in the plot if factor levels are reassigned within ggplot. I found this while using geom_signif to annotate some bar plots

Here is an example:

library(plyr)
library(ggplot2)
library(ggsignif)

#generate some data

mtcars.meanMPG <- 
  ddply(
    mtcars,
    .(carb, am),
    summarize,
    meanMPG = round(mean(mpg),3)
  )

This works:

ggplot(
  data=mtcars.meanMPG,
  aes(
    x=factor(carb),
    y=meanMPG,
    fill=am,
    group=am
  )
)+
  geom_bar(
    stat = "identity",
    position = position_dodge(preserve = "single")
  )+
  geom_signif(
    annotation="p = 0.01",
    y_position=29,
    xmin=1.7,
    xmax=2.3,
    tip_length = c(0.01, 0.01)
  )

This does not:

ggplot(
  data=mtcars.meanMPG,
  aes(
    x=factor(carb, levels = unique(rev(mtcars.meanMPG$carb))),
    y=meanMPG,
    fill=am,
    group=am
  )
)+
  geom_bar(
   stat = "identity",
    position = position_dodge(preserve = "single")
  )+
  geom_signif(
    annotation="p = 0.01",
    y_position=29,
    xmin=1.7,
    xmax=2.3,
    tip_length = c(0.01, 0.01)
  )

Nor does this:

mtcars.meanMPG$carb <- 
factor(mtcars.meanMPG$carb, levels = unique(rev(mtcars.meanMPG$carb)))

ggplot(
  data=mtcars.meanMPG,
  aes(
    x=carb,
    y=meanMPG,
    fill=am,
    group=am
  )
)+
  geom_bar(
    stat = "identity",
    position = position_dodge(preserve = "single")
  )+
  geom_signif(
    annotation="p = 0.01",
    y_position=29,
    xmin=1.7,
   xmax=2.3,
   tip_length = c(0.01, 0.01)
 )

I often find myself reassigning factor levels in order to get figures to render correctly. Just letting you know in case this is something you want to look into.

Thanks,
-Joe

Can't add p values to stacked bar plot

Hi,

First of all, thank you for developing and maintaining ggsignif!

I was trying to add significant p values to a faceted stacked bar plot but kept getting an error message saying

Error in check_factor(f) : object 'Rank' not found

Below are the data and code to reproduce my problem:

library(tidyverse) 
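#> Warning: package 'tidyverse' was built under R version 3.4.4
#> Warning: package 'ggplot2' was built under R version 3.4.4
#> Warning: package 'tibble' was built under R version 3.4.4
#> Warning: package 'tidyr' was built under R version 3.4.4
#> Warning: package 'readr' was built under R version 3.4.4
#> Warning: package 'purrr' was built under R version 3.4.4
#> Warning: package 'dplyr' was built under R version 3.4.4
#> Warning: package 'stringr' was built under R version 3.4.4
#> Warning: package 'forcats' was built under R version 3.4.4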
library(cowplot) 
#> Warning: package 'cowplot' was built under R version 3.4.4
#> 
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:ggplot2':
#> 
#>     ggsave
library(ggsignif) 
#> Warning: package 'ggsignif' was built under R version 3.4.4

# Make a dataframe for plotting stacked bar plot
df <- data.frame(Diet = rep(c("REF", "IM"), each = 8),
                 Variable = c("hpv", "hpv", "hpv", "hpv", "smc", "smc", "lpc", "lpc",
                              "hpv", "hpv", "hpv", "smc", "smc", "smc", "lpc", "lpc"),
                 Rank = c("Mild", "Moderate", "Marked", "Severe", "Normal", "Mild", "Normal", "Mild",
                          "Mild", "Moderate", "Marked", "Normal", "Mild", "Moderate", "Normal", "Mild"),
                 Percent = c(5.56, 38.9, 44.4, 11.1, 38.9, 61.1, 77.8, 22.2, 
                             16.7, 66.7, 16.7, 11.1, 72.2, 16.7, 50, 50)
                 )

# Specify the desired orders of factors and convert "Rank" to an ordered factor
df$Diet <- factor(df$Diet, levels = c("REF", "IM"))
df$Variable <- factor(df$Variable, levels = c("hpv", "smc", "lpc"))
df$Rank <- ordered(df$Rank, levels = c("Normal", "Mild", "Moderate", "Marked", "Severe")) # Rank as ordered factor

# Define color scheme 
my_col = c(Normal = "royalblue2", Mild = "peachpuff1", Moderate = "tan1", Marked = "tomato", Severe = "red3")

# Make stacked barplot 
p <- ggplot(df, aes(Diet, Percent, fill = forcats::fct_rev(Rank))) + # forcats::fct_rev() reverses stacked bars
  geom_bar(stat = "identity") +
  facet_wrap(~ Variable, nrow = 1) +
  scale_fill_manual(values = my_col) + 
  scale_y_continuous(limits = c(0, 105), breaks = 0:5*20, expand = expand_scale(mult = c(0, 0.05))) +
  labs(title = "Stacked bar plot", y = "%") +
  guides(fill = guide_legend(title = "Rank")) + 
  theme_cowplot()
  
# Make a data frame for p value annotation
anno <- data.frame(Variable = "hpv",
                   p = 0.03,
                   start = "REF",
                   end = "IM",
                   y = 102)

# Add p value to the plot
p + geom_signif(data = anno,
                aes(xmin = start, 
                    xmax = end, 
                    annotations = p, 
                    y_position = y),
                textsize = 4, 
                tip_length = 0,
                manual = TRUE)
#> Warning: Ignoring unknown aesthetics: xmin, xmax, annotations, y_position
#> Error in check_factor(f): object 'Rank' not found

                 
# P values can't be added, even when I try to add them manually using geom_text + geom_segment
# P values can be added if the barplots are not stacked by Rank

Created on 2018-11-05 by the reprex package (v0.2.1)

Session info

devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value                         
#>  version  R version 3.4.3 (2017-11-30)  
#>  os       Windows 10 x64                
#>  system   x86_64, mingw32               
#>  ui       RTerm                         
#>  language (EN)                          
#>  collate  Chinese (Simplified)_China.936
#>  ctype    Chinese (Simplified)_China.936
#>  tz       Europe/Berlin                 
#>  date     2018-11-05                    
#> 
#> - Packages --------------------------------------------------------------
#>  package     * version date       lib source        
#>  assertthat    0.2.0   2017-04-11 [1] CRAN (R 3.4.4)
#>  backports     1.1.2   2017-12-13 [1] CRAN (R 3.4.3)
#>  base64enc     0.1-3   2015-07-28 [1] CRAN (R 3.4.1)
#>  bindr         0.1.1   2018-03-13 [1] CRAN (R 3.4.4)
#>  bindrcpp      0.2.2   2018-03-29 [1] CRAN (R 3.4.4)
#>  broom         0.5.0   2018-07-17 [1] CRAN (R 3.4.3)
#>  callr         3.0.0   2018-08-24 [1] CRAN (R 3.4.4)
#>  cellranger    1.1.0   2016-07-27 [1] CRAN (R 3.4.4)
#>  cli           1.0.1   2018-09-25 [1] CRAN (R 3.4.4)
#>  colorspace    1.3-2   2016-12-14 [1] CRAN (R 3.4.4)
#>  cowplot     * 0.9.3   2018-07-15 [1] CRAN (R 3.4.4)
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 3.4.4)
#>  curl          3.2     2018-03-28 [1] CRAN (R 3.4.4)
#>  debugme       1.1.0   2017-10-22 [1] CRAN (R 3.4.4)
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 3.4.4)
#>  devtools      2.0.0   2018-10-19 [1] CRAN (R 3.4.3)
#>  digest        0.6.18  2018-10-10 [1] CRAN (R 3.4.4)
#>  dplyr       * 0.7.7   2018-10-16 [1] CRAN (R 3.4.4)
#>  evaluate      0.12    2018-10-09 [1] CRAN (R 3.4.4)
#>  forcats     * 0.3.0   2018-02-19 [1] CRAN (R 3.4.4)
#>  fs            1.2.6   2018-08-23 [1] CRAN (R 3.4.4)
#>  ggplot2     * 3.1.0   2018-10-25 [1] CRAN (R 3.4.4)
#>  ggsignif    * 0.4.0   2017-08-03 [1] CRAN (R 3.4.4)
#>  glue          1.3.0   2018-07-17 [1] CRAN (R 3.4.4)
#>  gtable        0.2.0   2016-02-26 [1] CRAN (R 3.4.4)
#>  haven         1.1.2   2018-06-27 [1] CRAN (R 3.4.4)
#>  hms           0.4.2   2018-03-10 [1] CRAN (R 3.4.4)
#>  htmltools     0.3.6   2017-04-28 [1] CRAN (R 3.4.4)
#>  httr          1.3.1   2017-08-20 [1] CRAN (R 3.4.4)
#>  jsonlite      1.5     2017-06-01 [1] CRAN (R 3.4.4)
#>  knitr         1.20    2018-02-20 [1] CRAN (R 3.4.4)
#>  lattice       0.20-35 2017-03-25 [2] CRAN (R 3.4.3)
#>  lazyeval      0.2.1   2017-10-29 [1] CRAN (R 3.4.4)
#>  lubridate     1.7.4   2018-04-11 [1] CRAN (R 3.4.4)
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 3.4.4)
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 3.4.4)
#>  mime          0.6     2018-10-05 [1] CRAN (R 3.4.4)
#>  modelr        0.1.2   2018-05-11 [1] CRAN (R 3.4.4)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 3.4.4)
#>  nlme          3.1-137 2018-04-07 [1] CRAN (R 3.4.4)
#>  pillar        1.3.0   2018-07-14 [1] CRAN (R 3.4.4)
#>  pkgbuild      1.0.2   2018-10-16 [1] CRAN (R 3.4.3)
#>  pkgconfig     2.0.2   2018-08-16 [1] CRAN (R 3.4.4)
#>  pkgload       1.0.1   2018-10-11 [1] CRAN (R 3.4.4)
#>  plyr          1.8.4   2016-06-08 [1] CRAN (R 3.4.4)
#>  prettyunits   1.0.2   2015-07-13 [1] CRAN (R 3.4.4)
#>  processx      3.2.0   2018-08-16 [1] CRAN (R 3.4.4)
#>  ps            1.1.0   2018-08-10 [1] CRAN (R 3.4.4)
#>  purrr       * 0.2.5   2018-05-29 [1] CRAN (R 3.4.4)
#>  R6            2.3.0   2018-10-04 [1] CRAN (R 3.4.4)
#>  Rcpp          0.12.19 2018-10-01 [1] CRAN (R 3.4.4)
#>  readr       * 1.1.1   2017-05-16 [1] CRAN (R 3.4.4)
#>  readxl        1.1.0   2018-04-20 [1] CRAN (R 3.4.4)
#>  remotes       2.0.1   2018-10-19 [1] CRAN (R 3.4.3)
#>  rlang         0.2.2   2018-08-16 [1] CRAN (R 3.4.4)
#>  rmarkdown     1.10    2018-06-11 [1] CRAN (R 3.4.4)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.4.4)
#>  rvest         0.3.2   2016-06-17 [1] CRAN (R 3.4.4)
#>  scales        1.0.0   2018-08-09 [1] CRAN (R 3.4.4)
#>  sessioninfo   1.1.0   2018-09-25 [1] CRAN (R 3.4.4)
#>  stringi       1.1.7   2018-03-12 [1] CRAN (R 3.4.4)
#>  stringr     * 1.3.1   2018-05-10 [1] CRAN (R 3.4.4)
#>  testthat      2.0.1   2018-10-13 [1] CRAN (R 3.4.4)
#>  tibble      * 1.4.2   2018-01-22 [1] CRAN (R 3.4.4)
#>  tidyr       * 0.8.1   2018-05-18 [1] CRAN (R 3.4.4)
#>  tidyselect    0.2.5   2018-10-11 [1] CRAN (R 3.4.4)
#>  tidyverse   * 1.2.1   2017-11-14 [1] CRAN (R 3.4.4)
#>  usethis       1.4.0   2018-08-14 [1] CRAN (R 3.4.4)
#>  withr         2.1.2   2018-03-15 [1] CRAN (R 3.4.4)
#>  xml2          1.2.0   2018-01-24 [1] CRAN (R 3.4.4)
#>  yaml          2.2.0   2018-07-25 [1] CRAN (R 3.4.4)
#> 
#> [1] C:/Users/ljt89/Documents/R/win-library/3.4
#> [2] C:/Program Files/R/R-3.4.3/library
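
One possible cause of the error above, assuming the plot's global mapping refers to `Rank` (for example through `fill`): the annotation layer inherits that mapping, but `anno` has no `Rank` column. The sketch below uses made-up data (`df` and its values are hypothetical, as is the renamed `label` column) and sets `inherit.aes = FALSE` on the annotation layer, assuming a recent ggsignif release.

```r
library(ggplot2)
library(ggsignif)

# Hypothetical stand-in for the original data: stacked percentages per group.
df <- data.frame(
  Group = factor(rep(c("REF", "IM"), each = 2), levels = c("REF", "IM")),
  Rank  = rep(c("low", "high"), times = 2),
  Pct   = c(60, 40, 30, 70)
)

# The global mapping references Rank, as assumed for the original plot.
p <- ggplot(df, aes(x = Group, y = Pct, fill = Rank)) +
  geom_col()

# One row per bracket; `label` holds the text to draw.
anno <- data.frame(label = "0.03", start = "REF", end = "IM", y = 102)

# inherit.aes = FALSE stops this layer from looking up Rank in `anno`.
p_annotated <- p +
  geom_signif(
    data = anno,
    aes(xmin = start, xmax = end, annotations = label, y_position = y),
    textsize = 4, tip_length = 0,
    manual = TRUE, inherit.aes = FALSE
  )
```

Adding a `Rank` column to `anno` would be another way to satisfy the inherited mapping, at the cost of tying the bracket to one fill level.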

Rotate the label

Great library, thanks!

Is there a way to rotate the labels?

t.test p-values are not the same as those I calculated with t.test()?

Hi, I recently noticed that the p-value calculated by ggsignif is much larger than the one I get from the t.test() function, even when I set the test method to 't.test'. Is there any explanation for that?
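
A way to narrow this down is to compare the label ggsignif draws with a direct `t.test()` call on the same two groups. The sketch below uses the built-in `mtcars` data rather than the poster's data, and assumes that the layer data of a recent ggsignif release exposes the drawn label in a column named `annotation`.

```r
library(ggplot2)
library(ggsignif)

# Built-in data: fuel economy by transmission type (0 = automatic, 1 = manual).
cars <- transform(mtcars, am = factor(am))

# p-value computed directly; t.test() defaults to Welch's unequal-variance test.
p_direct <- t.test(mpg ~ am, data = cars)$p.value

plt <- ggplot(cars, aes(x = am, y = mpg)) +
  geom_boxplot() +
  geom_signif(comparisons = list(c("0", "1")), test = "t.test")

# The label drawn by ggsignif (a rounded p-value) as stored in the layer data.
p_shown <- unique(as.character(layer_data(plt, 2)$annotation))

# Options such as var.equal = TRUE or paired = TRUE change the result, so a
# manual call only matches if the same options are passed via test.args.
p_pooled <- t.test(mpg ~ am, data = cars, var.equal = TRUE)$p.value
```

If the two values differ for your own data, likely places to look are the arguments of the manual call (`paired`, `var.equal`, `alternative`) and whether both calls see the same observations, for example after axis limits have removed points.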

Tip length

Hi,
How can I get rid of the tip length in aes?
I want to show only the significance marks (*, **, ***), without the lines.

Thanks!
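
A minimal sketch of the two relevant arguments, using the built-in `iris` data as a stand-in: `tip_length = 0` removes the vertical tips, and `map_signif_level = TRUE` replaces the numeric p-value with stars. The horizontal line of the bracket is still drawn; this sketch does not attempt to hide it.

```r
library(ggplot2)
library(ggsignif)

plt <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("setosa", "versicolor")),
    map_signif_level = TRUE,  # show stars instead of the numeric p-value
    tip_length = 0            # no vertical tips at the ends of the bracket
  )
```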

Custom annotations when using facets

Hello, I am trying to add custom annotations to faceted (facet_wrap) ggplot2 barplots and I am unable to make it work.
When I use the built-in tests in stat_signif(), everything seems to work from a technical standpoint: I see the faceted barplots with all selected contrasts annotated with p-values. Nonetheless I see strange things, such as bars that should clearly show a statistically significant difference but are reported as non-significant.
Therefore I decided to run the tests separately and then add the annotations to the barplots manually.
This works nicely for a single plot, but when I add facets nothing works: I just see plain barplots without any annotation. I tried adding repeated lists to 'comparisons' and 'annotations', but then every list element is piled on top of the others in every panel, rather than one element being used per panel.
I hope I have explained my issue clearly.

Thanks for any suggestion

Michele
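
One approach that follows the pattern in the ggsignif README is to pass the annotations as a data frame with `manual = TRUE` and to include the faceting variable as a column, so that each row is drawn only in its own panel. The data, the column names, and the labels below are hypothetical; the labels stand in for results of tests run separately.

```r
library(ggplot2)
library(ggsignif)

# Hypothetical summary data: one bar per treatment in each panel.
bars <- data.frame(
  panel     = rep(c("P1", "P2"), each = 2),
  treatment = rep(c("ctrl", "drug"), times = 2),
  value     = c(1.0, 2.0, 1.5, 1.6)
)

# One row per bracket; the `panel` column decides which facet it is drawn in.
anno <- data.frame(
  panel = c("P1", "P2"),
  start = c("ctrl", "ctrl"),
  end   = c("drug", "drug"),
  y     = c(2.2, 1.8),
  label = c("**", "NS")
)

plt <- ggplot(bars, aes(x = treatment, y = value)) +
  geom_col() +
  geom_signif(
    data = anno,
    aes(xmin = start, xmax = end, annotations = label, y_position = y),
    manual = TRUE
  ) +
  facet_wrap(~panel) +
  ylim(NA, 2.6)  # leave room above the tallest bracket
```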

Changing confidence level

It is not very clear from the vignette what you can actually specify in the test.args argument. I would like to change the confidence level of the test. Is that possible?
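
As a sketch of how the two settings fit together: `test.args` takes a list of arguments that are forwarded to the test function, and `map_signif_level` accepts a named numeric vector of cutoffs for the star labels. Note that in `t.test()` the `conf.level` argument only changes the reported confidence interval, not the p-value, so changing the threshold for "significant" is done through the cutoffs. The data set and the cutoffs below are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggsignif)

plt <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("versicolor", "virginica")),
    test = "t.test",
    # Forwarded to t.test(); any of its arguments can go here.
    test.args = list(alternative = "two.sided", var.equal = TRUE),
    # Custom cutoffs for the star labels instead of the defaults.
    map_signif_level = c("**" = 0.01, "*" = 0.1)
  )
```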

Annotations disappear when setting a high y_position

Hi,

I am using this wonderful package, which I find very useful for my work, but I have run into a small problem.
I am trying to make a violin plot using the Seurat package.

library(Seurat)
library(ggsignif)
fig <- VlnPlot(object = pbmc_small, features.plot = 'PC1', do.return = TRUE)
fig  + geom_signif(xmin = 3, xmax = 4, y_position = 8 , annotations="**")

It needs a bit of refinement because the annotation overlaps another part of the plot, so I changed y_position from 8 to 9:

library(Seurat)
library(ggsignif)
fig <- VlnPlot(object = pbmc_small, features.plot = 'PC1', do.return = TRUE) +
coord_cartesian(ylim = c(-4,15))
fig  + geom_signif(xmin = 3, xmax = 4, y_position = 9 , annotations="**")

However, the annotation disappeared, and I do not know how to fix this.
Thank you very much in advance!

working with log2 data

Hi Constantin,

Very nice package, your ggsignif, and released just in time as I got into making such plots yesterday ;-)

While things work fine with test data of one kind (linear or log2), I have problems with the significance bracket when I combine them.

I am working with gene expression data, which are commonly displayed as log2 values to avoid the squishing of data points in the low range. The t-test for the group comparison, however, is best done on the original values. So I'd like to draw the plot with the log2 data points while showing the t-test results of the linear data.

Here is a synthetic example:

# create the data:
myDF <- data.frame(value=c(runif(15, min=7500, max=12500), runif(15, min=75, max=125)), group=c(rep("A", 15), rep("B", 15)))
myDF$value.log2 <- log(myDF$value, 2)
# work with only the log2 data:
ggplot(myDF, aes(group, value.log2)) + geom_jitter(width=0.02) + geom_signif(comparisons=list(c("A", "B")), test="t.test", y_position=16)
# work with only the linear data:
ggplot(myDF, aes(group, value)) + geom_jitter(width=0.02) + geom_signif(comparisons=list(c("A", "B")), test="t.test", y_position=16)
# combine the data (note the suddenly negative scale):
ggplot(myDF, aes(group, value.log2)) + geom_jitter(width=0.02) + geom_signif(aes(group, value), comparisons=list(c("A", "B")), test="t.test", y_position=16)
# combine the data, limiting the displayed y axis range to where the data and the bracket are (note that the bracket is no longer displayed):
ggplot(myDF, aes(group, value.log2)) + geom_jitter(width=0.02) + geom_signif(aes(group, value), comparisons=list(c("A", "B")), test="t.test", y_position=16) + ylim(1, 17)

Perhaps I am just getting lost in the intricacies of ggplot here, or perhaps this is a case not foreseen by ggsignif. In any case, I'd appreciate your help in finding a solution to this problem.

Best regards,

Anton
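
A possible workaround, sketched under the assumption that a fixed label is acceptable: run the t-test on the linear values outside of ggplot and pass the result through `annotations`, so the layer stays on the log2 scale. In the last example above, the signif layer maps the linear `value` onto the shared y axis, and `ylim(1, 17)` removes every value outside that range, which leaves the layer nothing to draw.

```r
library(ggplot2)
library(ggsignif)

set.seed(1)  # only to make the synthetic data reproducible
myDF <- data.frame(
  value = c(runif(15, min = 7500, max = 12500), runif(15, min = 75, max = 125)),
  group = rep(c("A", "B"), each = 15)
)
myDF$value.log2 <- log2(myDF$value)

# Run the test on the linear values, outside of ggplot.
p_linear <- t.test(value ~ group, data = myDF)$p.value

# Plot the log2 values and attach the precomputed p-value as a fixed label.
# When `annotations` is supplied, no test is run inside the layer.
plt <- ggplot(myDF, aes(group, value.log2)) +
  geom_jitter(width = 0.02) +
  geom_signif(
    comparisons = list(c("A", "B")),
    annotations = sprintf("%.2g", p_linear),
    y_position = 16
  )
```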
