aphalo / ggpmisc

R package ggpmisc is an extension to ggplot2 and the Grammar of Graphics

Home Page: https://docs.r4photobiology.info/ggpmisc

R 44.67% Shell 0.01% TeX 0.45% HTML 54.88%

data-analysis dataviz ggplot2-annotations ggplot2-stats statistics

ggpmisc's Introduction

Computing-related Skills

Programming: R, (Python), C++, (Java), C, bash, awk, Perl, sh, (FORTRAN), Modula-2, Pascal, BASIC, Forth.

Text mark up: $\LaTeX$, $\TeX$, Quarto, Rmarkdown, markdown, HTML.

Revision control systems: git, svn, (cvs), rss.

Drafting: Inkscape, OpenSCAD, Adobe Illustrator.

Image/photography: Capture One, Helicon Focus, Lumariver Profile Designer, RawDigger, VueScan, LightRoom, Photoshop.

Video- and interactive tutorials: FlashBack Pro, R shiny.

IDEs/GUIs I like

WinEdt (for $\LaTeX$, $\TeX$, with embedded R code or not), RStudio (for R scripts, Quarto and Rmarkdown), GitKraken (for git).

Elsewhere

ORCID profile: https://orcid.org/0000-0003-3385-972X

Web site for the book Learn R: As a Language.

R Packages

The sources of the R packages I have published are in public Git repositories at GitHub. Out of the packages that I have authored and maintain, 14 are currently available through CRAN. The total number of package submissions (mostly updates) as author and maintainer is 203 since 2016-01-29.

I have published in CRAN one package update roughly every 15 days, or about 1.97 package updates per month, since 2016-01-29.

The most recent of these updates was published in CRAN on 2024-07-01.

📂 Click to expand a list of my packages at CRAN with the most recently updated one at the top.

R package	Title	Version	Date
ggpp	Grammar Extensions to ‘ggplot2’	0.5.8-1	2024-07-01
ggpmisc	Miscellaneous Extensions to ‘ggplot2’	0.6.0	2024-06-28
gginnards	Explore the Innards of ‘ggplot2’ Objects	0.2.0	2024-05-01
learnrbook	Datasets and Code Examples from P. J. Aphalo’s “Learn R” Book	2.0.1	2024-04-28
photobiologyPlants	Plant Photobiology Related Functions and Data	0.5.0	2024-04-02
photobiologySun	Data for Sunlight Spectra	0.5.0	2024-04-01
photobiology	Photobiological Calculations	0.11.2	2024-03-31
photobiologyFilters	Spectral Transmittance and Spectral Reflectance Data	0.6.0	2024-02-27
photobiologyLEDs	Spectral Data for Light-Emitting-Diodes	0.5.2	2023-11-01
photobiologyLamps	Spectral Irradiance Data for Lamps	0.5.2	2023-10-24
photobiologySensors	Response Data for Light Sensors	0.5.1	2023-10-24
photobiologyWavebands	Waveband Definitions for UV, VIS, and IR Radiation	0.5.2	2023-10-24
ggspectra	Extensions to ‘ggplot2’ for Radiation Spectra	0.3.12	2023-10-21
photobiologyInOut	Read Spectral and Logged Data from Foreign Files	0.4.27	2023-07-20

Updates under development are published at R-Universe as soon as they are merged or committed into the main branch in the repositories at GitHub. Two packages that depend on a commercial closed-source driver, but are usable with a free runtime of the driver, are published only at R-Universe.

R-Universe profile: https://aphalo.r-universe.dev.

Posts and Pages at R for Photobiology

The site R for Photobiology contains 95 posts and pages published since 2016-09-15! I have recently rebuilt the site using Quarto, and I have transferred only some of the posts originally published using WordPress. I am slowly adding more old posts, but only those that remain relevant. The figure below shows original publication date even when posts have been later updated. The source files are in a public repository at GitHub.

I have published one post or page roughly every days, or about 1.0 posts per month, since 2016-09-15.

I published the most recent post 1 day ago.

📂 Click to expand a full list of posts.

Date	Title
2024-07-13	“R Packages: Timeline of Updates”
2024-07-10	“Multichannel LED arrays”
2024-06-11	“Move from Wordpress to Quarto”
2024-06-01	“Fitted-model labels in Markdown”
2024-05-28	“Looking back 40 years”
2024-05-16	“Is this a polynomial?”
2024-04-17	“Annotating Plot Matrices”
2024-02-10	“Repository migrated to R-Universe”
2024-01-13	“ooacquire: Spectral Irradiance Algorithms”
2023-11-27	“Linear Models”
2023-11-21	“Theoretical probability distributions”
2023-11-18	“Flow of code execution”
2023-10-30	“Introduction to Data Visualization”
2023-10-21	“photobiology 0.11.x”
2023-10-21	“Design of Experiments”
2023-09-19	“R at its simplest”
2023-08-19	“Research as a process”
2023-08-19	“Research as a process”
2023-08-14	“Multiple comparisons with ggpmisc”
2023-08-01	“Pairwise labels with ggpp”
2023-07-31	“Open Access Weather and Climate Data”
2023-07-30	“Timelines with ggplot2”
2023-06-25	“Fitted-model labels with ggpmisc and plotly”
2023-06-24	“Fitted-model labels with ggpmisc and gganimate”
2023-06-10	“ooacquire 0.4.x”
2023-06-02	“Functional analysis of spectra with photobiology to fda.usc”
2023-05-30	“Model fitting in R”
2023-05-30	“Randomization and independent replicates”
2023-05-28	“photobiology 0.10.1x”
2023-05-24	“EDA with ggplot2”
2023-05-21	“I have started using Mastodon…”
2023-05-11	“Plant photoreceptors”
2023-05-03	“ggplot2 Basics”
2023-04-27	“Spectral fluorescence with ooacquire”
2023-04-27	“ooacquire: Spectral Irradiance Measurement”
2023-04-15	“Weather data for Finland from FMI”
2023-04-10	“ooacquire 0.3.x”
2023-03-20	“OmniDriver, Java and the whims of companies”
2023-03-04	“R Packages”
2023-02-28	“Packages ggpmisc, ggpp and gginnards”
2023-02-28	“Nudging + repulsion with ggrepel and ggpp”
2023-02-27	“Website migrated to Quarto”
2023-02-27	“Fitted-model labels with ggpmisc”
2023-02-25	“Volcano and quadrant plots with ggpmisc”
2023-02-25	“Data labels in bar plots with ggpp”
2023-02-23	“ggplot insets with package ggpp”
2023-02-20	“Handbook on photobiological calculations with R”
2023-02-20	“A handbook of best practice in plant UV photobiology”
2023-02-19	“The R for Photobiology Suite”
2023-02-19	“Pedro J. Aphalo”
2023-02-18	“R, RStudio and Quarto”
2023-02-18	“Support”
2023-02-15	“About this Website”
2023-02-03	“ggspectra 0.3.10/0.3.11/0.3.12”
2023-01-05	“photobiologyWavebands 0.5.1”
2023-01-05	“photobiology 0.10.15”
2022-12-30	“Are plants and plant canopies flat?”
2022-12-23	“Visit to Universidad Austral de Chile”
2022-12-17	“Enhancing geom_text() and geom_label()”
2022-12-05	“ggpp >= 0.5.0 updates”
2022-10-18	“ooacquire 0.2.6”
2022-10-15	“An R marathon updating packages”
2022-10-15	“photobiologyInOut 0.4.25/0.4.26/0.4.27”
2022-10-15	“photobiology 0.10.14”
2022-10-15	“ggspectra 0.3.9”
2022-10-15	“gginnards >= 0.1.1 updates”
2022-10-05	“photobiology 0.10.13”
2022-10-01	“ooacquire 0.2.4 and 0.2.5”
2022-09-30	“ggpp 0.4.5”
2022-08-24	“Learn R: As a Language”
2022-08-13	“HTML5 compliance of R packages”
2022-08-13	“photobiologyWavebands 0.5.0”
2022-08-05	“ggpmisc >= 0.5.0 updates”
2022-07-23	“photobiology 0.10.12”
2022-07-10	“photobiology 0.10.11”
2022-06-15	“ggpmisc 0.4.7”
2022-05-15	“photobiologyInOut 0.4.24”
2022-05-14	“photobiologyLEDs 0.5.0”
2022-04-29	“R 4.2.0”
2022-04-16	“ggspectra 0.3.8”
2022-01-30	“Instrumentation”
2022-01-30	“Controlled Environment Chambers”
2022-01-29	“LED-based light sources”
2021-10-20	“What is plant intelligence? and what it is not?”
2021-10-17	“Sensing of solar UVA by plants”
2021-10-17	“Cryptochromes and stomatal opening”
2021-07-13	“Perception of solar UV radiation by plants”
2020-07-12	“Performance of package photobiology”
2020-04-25	“UVR8 is an UV-B and UV-A photoreceptor”
2019-04-24	“Benchmarking function `sun_angles()`”
2019-02-22	“Yoctopuce modules: Spectrometer”
2019-02-21	“Yoctopuce modules: Introduction”
2018-08-10	“Using the Quick TUV Calculator”
2017-11-24	“Article titles in the era of the internet”
2016-09-15	“For those interested in optical properties”

Posts and pages at Photo Rumblings and Whispers

The Photo Rumblings and Whispers has 27 posts since 2015-10-18! I have recently rebuilt the site using Quarto, and I have transferred most of the posts originally published using WordPress. I may add one or two old posts. The figure below shows original publication date even when posts and pages have been later updated. I have updated several of the posts and pages and I aim to continue updating them as needed. The source files are in a public repository at GitHub.

I have published one post or page roughly every days, or about 0.3 posts per month, since 2015-10-18.

I published the most recent post or page 154 days ago.

📂 Click to expand a full list of posts.

Date	Title
2024-02-11	“The nitty-gritty details of macrophotography”
2023-08-28	“Photographing Insects: Lenses”
2023-08-22	“WordPress -> Quarto”
2023-08-21	“Pedro J. Aphalo”
2023-05-13	“Flexible and oversized lens hoods”
2023-04-18	“About this Website”
2023-04-15	“UV short-pass filter stacks”
2023-04-14	“UV short-pass filters”
2023-04-12	“Notes on the OM-1 (digital) camera”
2022-10-15	“Small fill/video LED lights revisited”
2021-10-25	“Broad band VIS+NIR LEDs”
2021-08-21	“Photo Rumblings and Whispers”
2021-02-01	“NIR long-pass filters”
2020-09-29	“Hemispherical time-lapse under a tree”
2020-06-30	“Lens Adapter with Filter Drawer”
2020-06-30	“Lens Adapters: Flange-to-Flange Distance”
2020-06-12	“UV-cut filters”
2020-06-11	“UV-IR-cut filters”
2019-08-21	“A time lapse video assembled in ImageJ”
2019-07-14	“Godox AD200 flash for UV, VIS and IR photography”
2019-07-14	“Digital UVA-photography with M43 equipment”
2019-06-19	“Lautaret”
2019-05-11	“Neutral Density (ND) Filters”
2018-05-13	“Camera objectives for digital UV photography”
2017-12-12	“Black anodised aluminium in IR”
2017-10-28	“Photographs through windows”
2015-10-18	“Bracketing”

Updated 2024-07-14 10:33:58.507133

This README file is based on the blog post by Athanasia Mo Mowinckel and the R code by Martin Henze.

ggpmisc's Issues

Wrong equation when using `formula=y~I(1/x)`

When I use stat_poly_eq with formula=y~I(1/x), the wrong equation is printed. In the case of the code below, it should be
y = 1/x,

but instead it gives:

y = -0.00033 + 1.01x

library(ggplot2)
set.seed(0)

tibble::tibble(x=1:100, y=1/x + rnorm(100, sd=0.01)) |>   
  ggplot(aes(x, y))+
  geom_point()+
  geom_smooth(formula=y~I(1/x), method="lm", se=F)+
  ggpmisc::stat_poly_eq(ggpmisc::use_label("eq"), formula=y~I(1/x), method="lm")

Created on 2023-06-29 with reprex v2.0.2
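
A quick way to see what the label should contain, sketched with base R only (this is not how 'ggpmisc' builds its labels): the fitted model has one coefficient for the term I(1/x), so the equation has to be written in 1/x rather than in x. The hand-assembled string is only an illustration.

```r
# Base R sketch: inspect the coefficients of the model from the report above.
set.seed(0)
x <- 1:100
y <- 1 / x + rnorm(100, sd = 0.01)

fit <- lm(y ~ I(1 / x))

# The slope belongs to the transformed term I(1/x), not to x itself.
coefs <- coef(fit)
print(names(coefs))

# Assemble the equation by hand, naming the transformed term explicitly.
eq <- sprintf("y = %.3g + %.3g (1/x)", coefs[[1]], coefs[[2]])
print(eq)
```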

Allow `method` functions in model fit functions to return NA and NULL

stat_poly_line(), stat_poly_eq() and relatives should be able to gracefully handle NA and NULL returned by model fit methods as an indication of failure, so that in such cases no line or equation is added to the plot. This would be useful in user defined wrappers on model fit methods.
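
The kind of user-defined wrapper this request has in mind could look roughly like the base R sketch below. The function name safe_lm and its min.n argument are hypothetical, chosen only for illustration; whether a given version of the stats accepts a NULL return value is exactly what this issue is about.

```r
# Hypothetical wrapper around lm(); the name safe_lm and its min.n
# argument are made up for this sketch.
safe_lm <- function(formula, data, ..., min.n = 3L) {
  # Too few rows for a meaningful fit: signal failure with NULL.
  if (nrow(data) < min.n) {
    return(NULL)
  }
  # Any error raised while fitting is also turned into NULL.
  tryCatch(lm(formula, data = data, ...), error = function(e) NULL)
}

fit_ok <- safe_lm(dist ~ speed, data = cars)
fit_failed <- safe_lm(dist ~ speed, data = cars[1:2, ])
```

A caller can then test the result with is.null() and skip drawing the line or equation for that group.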

Increase unit-tests coverage

As done with 'ggpp', unit tests need to be added to 'ggpmisc' so as to test the code better (in actual practice) and to achieve formally high coverage.

Add support for `lqs()` in `stat_poly_eq()` and `stat_poly_line()`

Currently we support lm() and rlm() and any other function that returns an object of class "lm". MASS::lqs() with method "lts" is a useful alternative that could also be supported rather easily, I think.
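
For context, a minimal sketch of such a fit outside of any plot, using the cars data set as an arbitrary example. It shows that the returned object is of class "lqs" rather than "lm", while its coefficients can still be extracted with coef().

```r
# MASS ships with standard R installations as a recommended package.
library(MASS)

set.seed(1)  # lqs() relies on random resampling
fit <- lqs(dist ~ speed, data = cars, method = "lts")

# The object has class "lqs" and does not inherit from "lm",
# which is why it would need explicit support.
print(class(fit))
print(coef(fit))
```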

stat_fit_glance() for quantreg::rq()

https://stackoverflow.com/questions/65695409/is-there-a-neat-approach-to-label-a-ggplot-plot-with-the-equation-and-other-stat

This would be usable with ggplot2::geom_quantile()

stat_ma_line - Error: Discrete value supplied to continuous scale

There are two problems when using stat_ma_line():

When I use range.y and range.x = "interval", the ggplot code works fine with the variable "y = wind250_1". However, with "y = lee250_1" I get the error: Discrete value supplied to continuous scale.
If I change range.y and range.x to "relative", the code runs fine for both y variables. However, when I exclude the lines "scale_x_continuous(trans='log10')" and "scale_y_continuous(trans='log10')" and use "y = lee250_1", it gives me the error: Discrete value supplied to continuous scale.

The variable is numeric; I even used as.numeric() in the code. The number of values per group could be a problem, as suggested in this post - https://stackoverflow.com/questions/75242355/error-discrete-value-supplied-to-continuous-scale-stat-ma-line. However, that does not make sense to me, considering that the code works with the variable "wind250_1" and not with "lee250_1".

library(ggplot2)
library(ggpmisc)

ROS <- c(0.03, 0.03, 0.03, 0.03, 0.07, 0.07, 0.07, 0.07, 0.07, 0.1, 0.1, 0.1)
wind<- c(0.84, 1.77, 3.5, 6.44, 0.84, 1.77, 3.5, 6.44, 7.55, 0.84, 1.77, 3.5)
rey <- c(31500,66375,131250,241500,31500,66375,131250,241500,283125,31500,66375,131250)
wind250_1 <- c(69.4,69.4,1,1,31.08,37.07,1,1,1,22.8,19.45,1) 
lee250_1 <- c(79.84,125.56,93.34,94.42,33.78,49.6,38.95,40.9,39.32,24.2,32.95,27.46) 

df<- data.frame(ROS,wind,rey,wind250_1,lee250_1) 

ggplot() +
  stat_ma_line(df, mapping=aes(rey, lee250_1),  method="RMA",
               range.y = "relative", range.x = "relative", 
               linewidth = 1,fill = "yellow") +
  geom_point(df, mapping = aes(x = rey, lee250_1, colour=factor(ROS)), 
             size=3)+ 
  xlab("Re") + ylab((expression(paste(tau~"windward"))))+
  #scale_x_continuous(trans='log10') +
  #scale_y_continuous(trans='log10') +
  scale_color_manual(values = c("#0072B2", "#000000","#E7B800","#CC79A7")) + 
  labs(colour = "ROS (m/s)") +
  theme_bw()
#> Error: Discrete value supplied to continuous scale

Peak fitting and adjust `span` or `ignore_threshold` dynamically to limit the number of peaks

Peak fitting could be useful in some cases. New parameters n.min and n.max could be used to dynamically adjust the span, or the ignore_threshold.
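
The idea of adjusting the span dynamically can be sketched in base R as follows. This is not the 'ggpmisc' algorithm: find_peaks_idx() and limit_peaks() are hypothetical helpers, windows are simply truncated at the edges of the data, and only the n.max case is shown.

```r
# Hypothetical helper: index of points that are the unique maximum
# within a centred window of width span (truncated at the edges).
find_peaks_idx <- function(x, span = 3) {
  half <- span %/% 2
  n <- length(x)
  is_peak <- vapply(seq_len(n), function(i) {
    window <- x[max(1, i - half):min(n, i + half)]
    x[i] == max(window) && sum(window == x[i]) == 1
  }, logical(1))
  which(is_peak)
}

# Hypothetical helper: widen the span in steps of 2 until at most
# n.max peaks remain or the span covers the whole series.
limit_peaks <- function(x, n.max, span = 3) {
  repeat {
    peaks <- find_peaks_idx(x, span)
    if (length(peaks) <= n.max || span >= length(x)) {
      return(list(span = span, peaks = peaks))
    }
    span <- span + 2
  }
}

x <- c(1, 3, 1, 5, 1, 2, 1, 9, 1, 4, 1)
res <- limit_peaks(x, n.max = 2)
print(res)
```

An analogous loop could raise an ignore threshold instead of widening the span.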

`stat_multcomp()` Implement support for `orientation = "y"`

This will require flipping the data and, for bars, also rotating the pairwise labels. Support for the angle aesthetic in geom_label() and derived geoms is partly implemented in 'ggpp' 0.5.5 and will be implemented in 'ggplot2' 3.5.5.

equation is not showing

Dear Prof. Pedro, the example in the Readme is not working

I have the following message:

Warning messages:
1: In as.character.polynomial(polynom::as.polynomial(coefs), digits = coef.digits, :
NAs introduced by coercion

library(ggpmisc)

formula <- y ~ x + I(x^2)
ggplot(cars, aes(speed, dist)) +
  geom_point() +
  stat_fit_deviations(formula = formula, colour = "red") +
  stat_poly_line(formula = formula) +
  stat_poly_eq(aes(label =  paste(stat(eq.label), stat(adj.rr.label), sep = "*\", \"*")),
               formula = formula)

Thanks for the help

`stat_multcomp()` Enhance multiple comparisons and contrasts

Add "staircase" pairwise contrasts.
Test with additional model fit functions.

stat_poly_eq fails with errors

stat_poly_eq() fails with the following message:
Warning messages:
1: In as.character.polynomial(polynom::as.polynomial(coefs), digits = coef.digits, :
NAs introduced by coercion
2: Computation failed in stat_poly_eq():
missing value where TRUE/FALSE needed
For testing purposes I used the following reprex from Stack Overflow https://stackoverflow.com/questions/7549694/add-regression-line-equation-and-r2-on-graph/35140066#35140066

library(ggplot2)
library(ggpmisc)
df <- data.frame(x = c(1:100))
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
my.formula <- y ~ x
p <- ggplot(data = df, aes(x = x, y = y)) +
  geom_smooth(method = "lm", se = FALSE, color = "black", formula = my.formula) +
  ggpmisc::stat_poly_eq(formula = my.formula,
                        aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")),
                        parse = TRUE) +
  geom_point()

p
Warning messages:
1: In as.character.polynomial(polynom::as.polynomial(coefs), digits = coef.digits, :
NAs introduced by coercion
2: Computation failed in stat_poly_eq():
missing value where TRUE/FALSE needed
No text is output on the graph.

problem showing only fit coefficients on faceted plots

Using the example dataset in the help for stat_poly_eq...

set.seed(4321)
x <- 1:100
y <- (x + x^2 + x^3) + rnorm(length(x), mean = 0, sd = mean(x^3) / 4)
my.data <- data.frame(x = x, y = y,
                      group = c("A", "B"),
                      y2 = y * c(0.5,2),
                      w = sqrt(x))

# give a name to a formula
formula <- y ~ poly(x, 3, raw = TRUE)

I want to do something like the last example, in which the label is not the entire equation, but is constructed from the fitted parameters.

my.format <-
  "b[0]~`=`~%.3g*\", \"*b[1]~`=`~%.3g*\", \"*b[2]~`=`~%.3g*\", \"*b[3]~`=`~%.3g"
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(formula = formula,
               output.type = "numeric",
               parse = TRUE,
               mapping =
                aes(label = sprintf(my.format,
                                    stat(coef.ls)[[1]][[1, "Estimate"]],
                                    stat(coef.ls)[[1]][[2, "Estimate"]],
                                    stat(coef.ls)[[1]][[3, "Estimate"]],
                                    stat(coef.ls)[[1]][[4, "Estimate"]])
                                    )
                   )

Except I want to use facets.

my.format <-
  "b[0]~`=`~%.3g*\", \"*b[1]~`=`~%.3g*\", \"*b[2]~`=`~%.3g*\", \"*b[3]~`=`~%.3g"
ggplot(my.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  facet_wrap(~group) +
  stat_poly_eq(formula = formula,
               output.type = "numeric",
               parse = TRUE,
               mapping =
                aes(label = sprintf(my.format,
                                    stat(coef.ls)[[1]][[1, "Estimate"]],
                                    stat(coef.ls)[[1]][[2, "Estimate"]],
                                    stat(coef.ls)[[1]][[3, "Estimate"]],
                                    stat(coef.ls)[[1]][[4, "Estimate"]])
                                    )
                   )

The above does not work: the same coefficients are shown for both groups A and B.
However, when I try using geom="debug", the separate coefficients in coef.ls are returned correctly:

Input 'data' to 'draw_panel()':
                             Estimate   Std. Error    t value   Pr(>|t|)
(Intercept)             -2.219423e+04 3.293548e+04 -0.6738701 0.50376620
poly(x, 3, raw = TRUE)1  1.527045e+03 2.853578e+03  0.5351335 0.59513525
poly(x, 3, raw = TRUE)2 -1.478358e+01 6.631448e+01 -0.2229315 0.82457583
poly(x, 3, raw = TRUE)3  9.981851e-01 4.359348e-01  2.2897575 0.02667175

Input 'data' to 'draw_panel()':
                            Estimate   Std. Error    t value   Pr(>|t|)
(Intercept)             17289.934326 3.634737e+04  0.4756859 0.63654947
poly(x, 3, raw = TRUE)1   360.046135 3.055598e+03  0.1178316 0.90671408
poly(x, 3, raw = TRUE)2   -20.054699 6.924042e+01 -0.2896386 0.77339427
poly(x, 3, raw = TRUE)3     1.198387 4.464844e-01  2.6840515 0.01007669

Is this a bug, or am I doing something incorrectly?

Compatibility with `gganimate` package

According to my tests, ggpmisc and gganimate are currently not compatible with each other; please enlighten me if I am wrong:

library(gganimate)
#> Loading required package: ggplot2
library(ggpmisc)
#> Loading required package: ggpp
#> 
#> Attaching package: 'ggpp'
#> The following object is masked from 'package:ggplot2':
#> 
#>     annotate

# Animation without stat_poly_eq()
ggplot(diamonds, aes(x = carat, y = price)) + geom_point() + geom_smooth(method = "lm") +
  transition_states(cut, transition_length = 1, state_length = 1) +
  enter_fade() + exit_shrink() +
  labs(title = "Cut = {closest_state}")
#> `geom_smooth()` using formula = 'y ~ x'

# Static plot with stat_poly_eq()
ggplot(diamonds, aes(x = carat, y = price)) + geom_point() + geom_smooth(method = "lm") +
    stat_poly_eq()
#> `geom_smooth()` using formula = 'y ~ x'

# Static plot with stat_poly_eq() and facets
ggplot(diamonds, aes(x = carat, y = price)) + geom_point() + geom_smooth(method = "lm") +
  stat_poly_eq() + facet_wrap(. ~ cut)
#> `geom_smooth()` using formula = 'y ~ x'

# Adding stat_poly_eq() to the animation causes the error
ggplot(diamonds, aes(x = carat, y = price)) + geom_point() + geom_smooth(method = "lm") +
  transition_states(cut, transition_length = 1, state_length = 1) +
  enter_fade() + exit_shrink() +
  stat_poly_eq() +
  labs(title = "Cut = {closest_state}")
#> `geom_smooth()` using formula = 'y ~ x'
#> Warning: Computation failed in `stat_poly_eq()`
#> Caused by error in `abs()`:
#> ! non-numeric argument to mathematical function
#> Error in `$<-.data.frame`(`*tmp*`, "group", value = ""): replacement has 1 row, data has 0

# Combining stat_poly_eq() with facets fails for each facet
ggplot(diamonds, aes(x = carat, y = price)) + geom_point() + geom_smooth(method = "lm") +
  transition_states(clarity, transition_length = 1, state_length = 1) +
  enter_fade() + exit_shrink() +
  stat_poly_eq() + facet_wrap(. ~ cut) +
  labs(title = "Clarity = {closest_state}")
#> `geom_smooth()` using formula = 'y ~ x'
#> Warning: Computation failed in `stat_poly_eq()`
#> Computation failed in `stat_poly_eq()`
#> Computation failed in `stat_poly_eq()`
#> Computation failed in `stat_poly_eq()`
#> Computation failed in `stat_poly_eq()`
#> Caused by error in `abs()`:
#> ! non-numeric argument to mathematical function
#> Error in `$<-.data.frame`(`*tmp*`, "group", value = ""): replacement has 1 row, data has 0

Created on 2023-06-18 with reprex v2.0.2

It might also be the case that more effort should be made by the author(s) of gganimate, but it would be nice if you could investigate a bit why they are not compatible.

geom_table() cuts off right edge with table.hjust = 0

library(ggplot2)
library(ggpmisc)
library(tibble)

my_df <- data.frame(
  a = c(0, 1, 2, 7777),
  b = c('ggpmisc::geom_table cuts off text', 'one', 'two', 'seven thousand seven hundred and seventy seven')
)
my_tibble <- tibble(x = 0, y = 0, tbl = list(my_df))
t_geom_table <- ggplot(my_df) +
  geom_table(
    data    = my_tibble,
    mapping = aes(x = 0, y = 0, label = tbl),
    table.hjust    = 0,
    table.colnames = FALSE
  ) +
  theme_void()
t_geom_table

Here's an image of the output. The beginning of each line is also offset to the right, which may be related.

stat_fit_fb() is returning duplicated rows!

Using geom_debug() one sees that without groups or facets a data frame with two rows is returned. The example in the User Guide shows this. For some reason two versions of the same methods are run. The problem is not visible in the plot, as two identical and overlapping tables are added to the plot, with the top one hiding the lower one.

CRAN vignette documentation has a formatting error in the table

The CRAN vignettes link has an error in the formatting of table 2, at least in the Chrome and Edge browsers.
https://cran.r-project.org/web/packages/ggpmisc/vignettes/model-based-annotations.html

Formatting of numbers in `stat_..._eq()`

A question at Stack Overflow makes me think that the format string used for coefficient estimates could be a formal parameter of the stat_..._eq() functions.

This needs some thought, as the formatting of the equation as a whole is far from trivial.
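
To make the idea concrete, here is a base R sketch of an equation assembled from coefficient estimates with a user-supplied sprintf() format. The function fmt_poly_eq() and its coef.fmt parameter are hypothetical and only handle a plain polynomial in x; signs are formatted separately from magnitudes, which is one of the complications alluded to above.

```r
# Hypothetical helper: build a polynomial equation string from
# coefficients ordered by increasing power, using one format string.
fmt_poly_eq <- function(coefs, coef.fmt = "%.3g") {
  signs <- ifelse(coefs < 0, "-", "+")
  mags <- sprintf(coef.fmt, abs(coefs))
  powers <- seq_along(coefs) - 1
  vars <- ifelse(powers == 0, "",
                 ifelse(powers == 1, " x", paste0(" x^", powers)))
  rhs <- paste(paste(signs, paste0(mags, vars)), collapse = " ")
  rhs <- sub("^\\+ ", "", rhs)  # drop a leading plus sign
  rhs <- sub("^- ", "-", rhs)   # attach a leading minus sign
  paste("y =", rhs)
}

print(fmt_poly_eq(c(-0.603, 5.06)))
print(fmt_poly_eq(c(2, -3.14159, 0.5), coef.fmt = "%.2f"))
```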

Justification and the new nudge functions

It needs to be explored if justification could be also adjusted for each text label dynamically in the position functions in the same way as the nudges are. This would be very useful when the length of the label text varies, as in this case constant nudge would result in varying distances.

Spacing of "text" output in `stat_poly_eq()`

The spacing of "text" output in stat_poly_eq() could be improved. Possibly, best is not to include the asterisk.

See the question at Stack Overflow. The code below could work as a possible answer.

df <- data.frame(x = c(1:100))
df$y <- 20 + 30 * df$x + rnorm(100, sd = 80)

library(ggpmisc)
#> Loading required package: ggpp
#> Loading required package: ggplot2
#> 
#> Attaching package: 'ggpp'
#> The following object is masked from 'package:ggplot2':
#> 
#>     annotate

ggplot(df, aes(x, y)) + 
  geom_point() +
  stat_poly_line(method = "lm", se = FALSE) +
  stat_poly_eq(use_label(c("eq", "R2"), sep = ", "), output.type = "text")

Created on 2022-09-06 with reprex v2.0.2

Multiple comparisons and contrasts

'ggsignif' exports geom_signif() and stat_signif(); however, pairwise comparisons are done individually, and p-values are returned uncorrected. In addition, geom_signif() and stat_signif() are too interdependent, making it impossible to use stat_signif() with other geometries. No labels to be parsed into expressions are returned, and small p-values are not shown as < 0.001 or similar, as they are in 'ggpmisc'.

After looking at 'ggsignif' code with the intention of submitting a pull request, it seems more reasonable to implement from scratch a statistic and a geometry with similar functionality in 'ggpmisc' than updating 'ggsignif'.

Getting this done soon and included in the 2nd edition of Learn R: As a Language would be ideal.

The geometry can be implemented first, and tested and used on its own, and later one or more statistics could be written making use of it: stat_fit_contrasts() and stat_multcomp().

`use_anova_cols()` and `use_summary_cols()`

The equivalents of use_label() for stat_fit_tb() to be used for selecting and renaming columns.
(Language translations of column headers could also be implemented at a later time.)

Add blog post with examples of `ggpairs()` used with 'ggpmisc' and 'ggpp'

Hi!

Thank you for the outstanding work on ggpmisc!

Over the past few days, I've been attempting, albeit unsuccessfully, to implement ggpmisc's stat_ma_eq() and stat_ma_line() within the upper plots of ggpairs. My objective is to display equations grouped by class in the upper plots with proper spacing, and scatterplots with fitted major axis (MA) lines (or any other type) at the bottom.

While I've managed to successfully integrate the lines, I've encountered challenges in creating a custom function for the upper plots. Specifically, I haven't found a method to ensure the proper display of equation overlap or the names of the corresponding taxa. Any guidance on resolving this issue would be greatly appreciated.

Perhaps a similar approach could be applied to other functions within ggpmisc.

Below is the code I've produced:

library(GGally)
library(ggplot2)
library(ggpmisc)
library(dplyr)  # provides %>%, mutate() and across()


fun_name <- function(data, columns, grp_labels, color, ...) {
  log_data <- function(data) {
    data <- data %>% 
      mutate(across(where(is.numeric), ~ {log10(.)
      }))
    return(data)
  }
  
  upperfun <- function(data, mapping, grp.label){
    ggplot(data = data, mapping = mapping, grp.label=grp.label, color=color) +
      ggpmisc::stat_ma_eq(aes(label = after_stat(paste("bold(", {{grp.label}}, "*\":\")~~", 
                                                       eq.label, sep = ""))),
                          parse = TRUE) +
      geom_blank() +
      facet_wrap(~{{grp.label}}, ncol = 1, 
                 strip.position = "top")+ #I tried to pass them as facet to avoid overlap, but works poorly
      theme_void()+
      theme(strip.background = element_blank(),
            strip.text = element_blank(),
            panel.grid = element_blank())
  }
  
  lowerfun <- function(data, mapping){
    ggplot(data = data, mapping = mapping) +
      geom_point() +
      stat_ellipse(type = "norm") +
      ggpmisc::stat_ma_line(show.legend = T, se = FALSE, method = "lmodel2:MA") +
      theme(panel.grid = element_blank())
  }
  
  data <- log_data(data)
  
  data %>%
    ggpairs(columns = columns,
            ggplot2::aes(colour = {{color}}, alpha = 0.4),
            upper = list(continuous = wrap(upperfun, grp.label = grp_labels, ...)),
            lower = list(continuous = wrap(lowerfun, ...)),
            progress = FALSE) +
    theme_minimal(base_size = 9) +
    theme(panel.grid = element_blank())
}

Thank you

Best wishes
M.

Color fitted lines based on p value or R^2

The question at https://stackoverflow.com/questions/68998729/color-ggscatter-by-r-and-p-values is important. For stat_xxx_eq() this is possible for the equation; a similar approach applied to the stat_xxx_line() functions would allow coloring or even hiding fitted lines based on the p-value or R^2. We should also make adjusted R^2 available. This is fairly easy to implement...

Enable multi-line use_label

Currently, the sep argument in use_label cannot take a \n for output.type=expression, since plotmath doesn't directly understand \n. However, there are ways to use, for example, atop to make multi-line expressions.

It would be great if a multi-line use_label functionality was added. One way this might work is by behind the scenes interpreting sep="\n" to trigger multiline atop functionality.
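
As a rough illustration of the atop() approach mentioned above, the base R sketch below stacks plotmath label strings by nesting atop() calls. The helper atop_label() is hypothetical and not part of 'ggpmisc'; how nested atop() renders (size and spacing of the inner lines) would need to be checked in an actual plot.

```r
# Hypothetical helper: stack labels vertically with plotmath's atop().
# The name atop_label is made up for this sketch.
atop_label <- function(labels) {
  # Nest from the right: atop(l1, atop(l2, l3)) for three labels.
  Reduce(function(a, b) sprintf("atop(%s, %s)", a, b), labels, right = TRUE)
}

lab <- atop_label(c("italic(y) == 1.2 + 3.4 * italic(x)",
                    "italic(R)^2 == 0.98"))
print(lab)

# The string parses into a valid R expression, as plotmath requires.
expr <- parse(text = lab)
```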

Implement format templates for label assembly in use_label()

I just implemented function use_label(), which uses paste() to assemble the combined label and supports a single value for sep. This is a simple user interface usable in many cases. A function fmt_label() could also be useful when users want to have arbitrary separators or additional text. This is to keep a note that if use_label() turns out to be limiting, a function using a template for the overall format would be a reasonable UI, although not necessarily very easy to implement, in particular the handling of parsing errors.

Support orientation in stat_peaks() and stat_valleys()

Recent versions of 'ggplot2' have updated versions of statistics with formal parameter orientation. I have implemented this for the statistics in 'ggpmisc' except for these two. This would add consistency, and allow finding maxima and minima on the x values.

Plotting error due to `after_stat` not in NAMESPACE with v0.4.2 update

With the ggpmisc 0.4.2 update, I'm now seeing this error pop up with existing code that worked with the previous release:

Error in after_stat(npcx) : could not find function "after_stat"

It appears that ggplot2::after_stat should be imported into the NAMESPACE.

Best,
Mike

How to output the values of peaks (by stat_peaks) into a new file?

I have been using your nice package these days, and it looks brilliant. I have a minor question.

I want to save the values of the peaks into a new file, but for now I can only plot the values in the figure. Is there any way for me to do this?

Best,
Xu
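
One possible route, sketched in base R with made-up data and a temporary file: locate the peaks outside of the plot and write them out with write.csv(). The slope-sign criterion used here is a simplification that ignores plateaus and is not the criterion used by stat_peaks(). Another route may be ggplot2::layer_data(), which returns the data computed for a plot layer.

```r
# Base R sketch: locate local maxima and save them to a CSV file.
# The data and the file location are made up for illustration.
x <- 1:7
y <- c(1, 3, 1, 5, 1, 2, 1)

# A point is a local maximum where the slope changes from + to -.
idx <- which(diff(sign(diff(y))) == -2) + 1
peaks <- data.frame(x = x[idx], y = y[idx])

out_file <- tempfile(fileext = ".csv")
write.csv(peaks, out_file, row.names = FALSE)
```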

Axis title not centered when using `symmetric_limits`

When using the function symmetric_limits inside scale_y_continuous like this:

scale_y_continuous(limits = ggpmisc::symmetric_limits)

I get this result with the y-axis title label where it is not centered correctly. I have played around with hjust in theme for axis.title, but can't seem to get it to work. I know I am not providing reproducible code at the moment, and in my use case it is getting passed into the ggiraph package for an interactive plot. I'm not sure if this is fixable or not, or if you have any idea as to why this is happening. Thanks in advance for any help!
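
For readers unfamiliar with the approach: a function passed as limits receives the scale's default limits and returns new ones. The base R sketch below uses a hypothetical stand-in, sym_limits(), to show the idea of limits symmetric around zero. It is not the actual code of symmetric_limits() and does not address the title placement itself.

```r
# Hypothetical stand-in, not the actual code of symmetric_limits():
# return limits that are symmetric around zero.
sym_limits <- function(x) {
  m <- max(abs(x))
  c(-m, m)
}

print(sym_limits(c(-2, 5)))
```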

stat_poly_eq > eq.label does not show "x" when using formula y~x+0.

Running
ggpmisc 0.4.0
R version 4.0.3 (2020-10-10)
MacOS X 11.2.3 x86_64-apple-darwin17.0

library(tibble)
library(ggplot2)
library(ggpmisc)

set.seed(0); tibble(a=1:100, b=runif(length(a), 0.7, 1.3) * (a*5)) %>% 
  ggplot(aes(a,b))+
  geom_point()+
  stat_poly_eq(aes(label=stat(eq.label)), formula=y~x, parse=T)


set.seed(0); tibble(a=1:100, b=runif(length(a), 0.7, 1.3) * (a*5)) %>% 
  ggplot(aes(a,b))+
  geom_point()+
  stat_poly_eq(aes(label=stat(eq.label)), formula=y~x+0, parse=T)

Running first chunk displays the equation y = -0.603 + 5.06 x
Running second chunk displays the equation y = 5.05 instead of y = 5.05 x
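
The expected equation can be checked outside the plot with a base R sketch using the same simulated data: a fit through the origin has a single coefficient, and that coefficient is the slope, so the label should contain x.

```r
set.seed(0)
a <- 1:100
b <- runif(length(a), 0.7, 1.3) * (a * 5)

fit_intercept <- lm(b ~ a)      # corresponds to y ~ x
fit_origin    <- lm(b ~ a + 0)  # corresponds to y ~ x + 0, no intercept

coef(fit_intercept)  # two coefficients: (Intercept) and a
coef(fit_origin)     # one coefficient: the slope, named a
```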

Add support for major axis regression

Add stat_sma_eq() and update stat_fit_deviations() and stat_fit_residuals().
Either package 'smatr' or 'lmodel2' could be used. The catch is that these packages only support very simple formulas, i.e., y ~ x, which could surprise users if this is implemented in the same functions as 'lm', etc.

I suspect that 'smatr' would be the best option, but this needs to be further explored.

Support for ma and sma in stat_fit_deviations() would be particularly useful in teaching.
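
As an illustration of what a major axis (MA) fit computes, here is a minimal base R sketch. It uses neither 'smatr' nor 'lmodel2'; the helper name ma_fit is hypothetical. The MA slope is taken from the first eigenvector of the covariance matrix of x and y.

```r
# Hypothetical helper: major axis (MA) line from the first eigenvector
# of the covariance matrix of x and y.
ma_fit <- function(x, y) {
  v <- eigen(cov(cbind(x, y)))$vectors[, 1]
  slope <- v[2] / v[1]
  c(intercept = mean(y) - slope * mean(x), slope = slope)
}

set.seed(1)
x <- rnorm(50)
y <- 2 * x + rnorm(50, sd = 0.5)

ma_fit(x, y)
coef(lm(y ~ x))  # ordinary least squares, for comparison
```

Unlike lm(), this treats x and y symmetrically, which is why only the simple y ~ x form makes sense for it.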

Only show facets for which R squared >= specified value

I would like to be able to show only the facets with a high R squared. The code below does it, but it would be easier if it could be done entirely within ggplot2, i.e. within the second pipeline. In particular, lm is run in the first pipeline and then again, implicitly, in the second pipeline. More importantly, simpler code would be nice. A variation would be to show only the top k panels by R squared, where k is specified.

library(broom)
library(dplyr)
library(ggplot2)
library(ggpmisc)

# find Trees for which R squared >= 0.97.  Here all but tree 4.
Trees <- Orange %>%
  nest_by(Tree) %>%
  summarize(model = list(lm(age ~ circumference, data)), glance(model)) %>%
  filter(r.squared >= 0.97) %>%
  pull(Tree)

# plot
if (length(Trees)) {
  p <- Orange %>%
    filter(Tree %in% Trees) %>%
    ggplot(aes(circumference, age)) +
      geom_point() +
      stat_poly_eq() +
      geom_smooth(method = "lm", se = FALSE) +
      facet_wrap(~ Tree)
  plot(p)
 }

alpha in geom_label_npc() only works if data.frame is provided

I am trying to add a semi-transparent label to a plot using the geom_label_npc() function. Unfortunately, this does not seem to work, when I directly provide npcx, npcy and the label argument:

library(ggplot2); library(ggpmisc)

ggplot(data = mtcars) +
  geom_point(mapping = aes(wt, mpg)) +
  geom_label_npc(npcx=0.04, npcy=0.95, label="(a)", colour="white", alpha = 0.5)

However, it seems to work, if I provide the same values inside a data.frame object:

df <- data.frame(
  x = 0.04,
  y = c(0.95),
  text = c("(a)")
)

ggplot(data = mtcars) +
  geom_point(mapping = aes(wt, mpg)) +
  geom_label_npc(data = df, aes(npcx = x, npcy = y, label = text), colour="white", alpha=0.5)

This is probably a small bug that can easily be solved.

Another question: is there any way to use geom_label_npc() without a border around the label, or to give the border a different colour from the text?

cannot install ggpmisc (failed splus2R dependencies)

First of all, Thank you very much for your contribution. It looks brilliant.
I would like to ask you if there is any issue when installing your package on a Linux machine (Ubuntu 22.04) and R version 4.2.1.
I can not install the stable or unstable version. It says that splus2R library is not available and I cannot install it.
Could you please help with this?

ERROR: compilation failed for package ‘splus2R’

removing ‘/home/guido/R/x86_64-pc-linux-gnu-library/4.2/splus2R’
Warning in install.packages :
installation of package ‘splus2R’ had non-zero exit status
ERROR: dependency ‘splus2R’ is not available for package ‘ggpmisc’
removing ‘/home/guido/R/x86_64-pc-linux-gnu-library/4.2/ggpmisc’
Warning in install.packages :
installation of package ‘ggpmisc’ had non-zero exit status

I can not install the package

First of all, thank you very much for your contribution. It looks brilliant.
I would like to ask whether there is any known issue when installing your package on a Mac.
I cannot install either the stable or the unstable version.
Could you please help with this?

Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :
namespace ‘quantreg’ 5.61 is being loaded, but >= 5.93 is required
Calls: ... withCallingHandlers -> loadNamespace -> namespaceImport -> loadNamespace
Execution halted
ERROR: lazy loading failed for package ‘ggpmisc’
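
The message says that the installed 'quantreg' (5.61) is older than the version required (>= 5.93), so updating 'quantreg' before installing 'ggpmisc' is the usual remedy. A small sketch of the version comparison follows; the helper name needs_update is hypothetical.

```r
# Hypothetical helper: TRUE when the installed version is older than required.
needs_update <- function(installed, required) {
  package_version(installed) < package_version(required)
}

needs_update("5.61", "5.93")  # versions taken from the error message

# To apply it to the local library (assumes 'quantreg' is installed):
# if (needs_update(as.character(packageVersion("quantreg")), "5.93")) {
#   install.packages("quantreg")
# }
```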

How can I get the equation as text?

Hi,
Thanks for your great tool. I would like to know how to extract the equations from stat_poly_eq as text, rather than only plotting them on the graph.

Best,
xiucz
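
One way to obtain an equation as text without going through the plot is to fit the same model with lm() and format its coefficients. This is a base R sketch for a straight-line fit only; the helper eq_as_text and the use of the cars dataset are illustrative assumptions, and the formatting does not reproduce what stat_poly_eq() generates.

```r
# Hypothetical helper: format a straight-line lm() fit as "y = a + b x".
eq_as_text <- function(fit, digits = 3) {
  b <- unname(coef(fit))
  sprintf("y = %s %s %s x",
          formatC(b[1], digits = digits, format = "g"),
          if (b[2] < 0) "-" else "+",
          formatC(abs(b[2]), digits = digits, format = "g"))
}

fit <- lm(dist ~ speed, data = cars)
eq_text <- eq_as_text(fit)
eq_text

# Write the equation to a text file (illustrative temporary path).
eq_file <- tempfile(fileext = ".txt")
writeLines(eq_text, eq_file)
```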

x on y regression support

In some cases, such as teaching, it is useful to be able to swap x and y as response and explanatory variables. Support for this could be implemented in:
stat_poly_eq() and stat_quant_eq()
stat_fit_deviations() and stat_fit_residuals()
ggplot2::stat_smooth() and ggplot2::stat_quantile()
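
Why the swap matters can be shown with a short base R sketch (the cars dataset is an illustrative choice): regressing y on x and x on y gives two different lines unless the correlation is perfect.

```r
fit_yx <- lm(dist ~ speed, data = cars)  # y on x
fit_xy <- lm(speed ~ dist, data = cars)  # x on y

b_yx <- unname(coef(fit_yx)[2])
b_xy <- unname(coef(fit_xy)[2])

# Drawn in the same axes, the x-on-y line has slope 1 / b_xy,
# which differs from b_yx unless the correlation is perfect.
c(y_on_x = b_yx, x_on_y_as_dy_dx = 1 / b_xy)

# The product of the two slopes equals the squared correlation.
all.equal(b_yx * b_xy, cor(cars$speed, cars$dist)^2)
```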

Simultaneous confidence intervals

A statistic could be rather easily implemented by editing the code of stat_multcomp().

Equations not correctly plotted in Zuzana's code

Need to create a reprex and investigate what is going on...

Documentation of parsing and default

Looking at the documentation of stat_poly_eq and stat_quant_eq, do you think parse = [default] is important enough to appear in the arguments list, or is there a specific reason it doesn't appear there?

While I note it is discussed later, and the default is FALSE, that default is primarily for LaTeX users? Would a more common usage by others suggest a default of parse = TRUE might be preferred?
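
For context, the difference between the two settings can be sketched in base R. With parse = FALSE a label is drawn as the literal string; with parse = TRUE ggplot2 text geoms convert the string into an R expression that is rendered as plotmath. The label used here is an illustrative example, not output copied from the package.

```r
label <- "italic(R)^2 == 0.97"

# parse = FALSE: the string is drawn verbatim, character by character.
label

# parse = TRUE: the string is first converted to an R expression,
# which the graphics device renders as plotmath (italic R, superscript 2).
expr <- parse(text = label)
expr
```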

Confidence intervals for model-fit parameter estimates

Some time ago a request/suggestion for supporting in stat_poly_eq() CIs for the fitted coefficients and other parameter estimates was made. More recently, a pull request at kassambara/ggpubr#418 and comments on it discussed CIs for correlation.

A CI for R2 could be useful in all the _eq statistics and could be obtained with package 'confintr'. For rank correlation, CIs for tau and rho can also be computed using functions from 'confintr'. CIs for parameter estimates would be useful in stat_fit_tb() with summary tables. These could be implemented using confint() methods from 'stats', 'lme4' and 'MASS', or intervals() from 'nlme'.

This is a note to myself, for the future, only tentatively assigned to a milestone as most likely CIs will be implemented in stages at different milestones for different statistics. CIs for rho, tau and R^2 seem to be most useful and would help in achieving consistency in the available labels across similar statistics. Those for parameter estimates are in many circumstances more informative than t-values and p-values.
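
As a rough illustration of the kind of values involved, using only 'stats' (the cars dataset and the 95% level are illustrative choices; 'confintr' is not used here):

```r
fit <- lm(dist ~ speed, data = cars)

# 95% confidence intervals for the intercept and slope.
ci_coef <- confint(fit, level = 0.95)
ci_coef

# 95% confidence interval for Pearson's r, as reported by cor.test().
ci_r <- cor.test(cars$speed, cars$dist)$conf.int
ci_r
```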

Error with inherited method

I am receiving an error message with stat_poly_eq:

Warning message:
Computation failed in stat_poly_eq():
unable to find an inherited method for function ‘isa’ for signature ‘"formula"’

Would appreciate advice on this issue. Thank you.

Implement model-fitting statistics using compute-panel functions

Is it right that ggpmisc does not support lm models without interactions? Any easy way to handle it?

Equations not rendering correctly with ggtext::geom_richtext()

I suspect this is a problem in 'ggtext' rather than 'ggpmisc', but anyway spurious space is inserted where there is not any in eq.label.

Conditional use of "geom" in stat_correlation

Hi,

Thank you for a nice package!
Is there a way to determine the use of geom (either "text" or "label") based on whether the P value is significant or not?
I tried to do so but it didn't work.

x <- (1:100) / 10
y <- x + rnorm(length(x))
my.data <- data.frame(x = x,
                      y = y,
                      y.desc = - y,
                      group = c("A", "B"))


ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_correlation(geom = ifelse(after_stat(p.value) < 0.05, "text", "label"))

Thanks!
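
A note on why this fails: geom is an ordinary argument evaluated once when the layer is created, not an aesthetic, so after_stat() cannot be used inside it. A possible workaround, sketched below, is to run the test before building the plot and choose the geom from the result. This applies to the layer as a whole, not separately to each group or panel; the seed and the name chosen.geom are illustrative.

```r
library(ggplot2)
library(ggpmisc)

set.seed(42)
x <- (1:100) / 10
my.data <- data.frame(x = x, y = x + rnorm(length(x)))

# Run the test outside the plot and pick the geom from its P-value.
p.value <- cor.test(my.data$x, my.data$y)$p.value
chosen.geom <- if (p.value < 0.05) "text" else "label"

p <- ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_correlation(geom = chosen.geom)
```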

Install warning for ggpmisc 0.4.0 (macOS / R 4.1)

I'm running into an issue installing from source that is specific to v0.4.0. The previous v0.3.9 release installs correctly.

install.packages("https://cran.r-project.org/src/contrib/ggpmisc_0.4.0.tar.gz")

trying URL 'https://cran.r-project.org/src/contrib/ggpmisc_0.4.0.tar.gz'
Content type 'application/x-gzip' length 529683 bytes (517 KB)
==================================================
downloaded 517 KB

* installing *source* package ‘ggpmisc’ ...
** package ‘ggpmisc’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
Warning in .recacheSubclasses(def@className, def, env) :
  undefined subclass "numericVector" of class "Mnumeric"; definition not updated
Calls: <Anonymous> ... loadNamespace -> <Anonymous> -> .cacheClass -> .recacheSubclasses
** help
*** installing help indices
*** copying figures
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
Warning in .recacheSubclasses(def@className, def, env) :
  undefined subclass "numericVector" of class "Mnumeric"; definition not updated
Calls: <Anonymous> ... loadNamespace -> <Anonymous> -> .cacheClass -> .recacheSubclasses
** testing if installed package can be loaded from final location
Warning in .recacheSubclasses(def@className, def, env) :
  undefined subclass "numericVector" of class "Mnumeric"; definition not updated
Calls: <Anonymous> ... loadNamespace -> <Anonymous> -> .cacheClass -> .recacheSubclasses
** testing if installed package keeps a record of temporary installation path
* DONE (ggpmisc)

Here's my session info:

sessionInfo()

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] AcidDevTools_0.3.14

loaded via a namespace (and not attached):
[1] compiler_4.1.0 tools_4.1.0

Installation guidelines on Ubuntu 22.04.2 LTS

Hi, please consider adding instructions on how to install the ggpmisc package on an Ubuntu machine. I ran into the problem that a clean Ubuntu installation lacks some non-obvious dependencies:

sudo apt update
sudo apt install gfortran liblapack-dev libopenblas-dev r-cran-sparsem

only after that we can run

install.packages("ggpmisc")

FYI, my sessionInfo():

R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ggpmisc_0.5.2 ggpp_0.5.1    ggplot2_3.4.1

loaded via a namespace (and not attached):
 [1] magrittr_2.0.3     splines_4.2.2      MASS_7.3-58.2      tidyselect_1.2.0
 [5] munsell_0.5.0      colorspace_2.1-0   lattice_0.20-45    R6_2.5.1
 [9] rlang_1.0.6        fansi_1.0.4        dplyr_1.1.0        tools_4.2.2
[13] grid_4.2.2         gtable_0.3.1       quantreg_5.94      utf8_1.2.3
[17] cli_3.6.0          withr_2.5.0        MatrixModels_0.5-1 survival_3.5-3
[21] tibble_3.1.8       lifecycle_1.0.3    Matrix_1.5-1       vctrs_0.5.2
[25] glue_1.6.2         compiler_4.2.2     pillar_1.8.1       generics_0.1.3
[29] scales_1.2.1       SparseM_1.81       pkgconfig_2.0.3

Small typo in manual

Hi @aphalo,

many thanks for developing this very helpful package!

I just wanted to make you aware of a very small typo (duplicated punctuation) in the ggpmisc manual that I just stumbled across when trying to add p.values using stat_poly_eq with aes(label = paste(..p.value..label..).

As one would guess, the computed variable name is p.value.label not p.value..label.

Have a nice day!

Support comma as decimal marker

Make stat_poly_eq(), stat_quant_eq() and stat_ma_eq() fully respect R option options(OutDec = ","). It seems the model equation respects it but not R2 and possibly other labels. See https://stackoverflow.com/questions/74813032/modifying-the-decimal-mark-of-stat-poly-eq
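
A base R sketch of how such an inconsistency can arise: format() honours the OutDec option, while sprintf() always uses "." as the decimal mark. Whether this is the actual cause in the package code is an assumption; the value 0.97 is illustrative.

```r
old <- options(OutDec = ",")

r2 <- 0.97
with_format  <- format(r2)            # honours getOption("OutDec")
with_sprintf <- sprintf("%.2f", r2)   # always uses "." as decimal mark

c(format = with_format, sprintf = with_sprintf)

options(old)  # restore the previous setting
```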

stat_quant_line() and consistency with stat_quant_eq()

I based stat_quant_line() partly on ggplot2::stat_quantile() while I wrote stat_quant_eq() almost from scratch. The current code for stat_quant_line() makes it impossible to retrieve the formula and class from the fitted model. As I have made it possible to use methods that modify the formula or call user-defined functions that in turn call a model fit function returning an acceptable fitted model object, this represents a problematic inconsistency.

A rewrite based on stat_quant_eq() should not be too difficult.