tukeyedar's Introduction

tukeyedar

Lifecycle: experimental R-CMD-check

The tukeyedar package houses data exploration tools. Many functions are inspired by work published by Tukey (1977), Hoaglin (1983), Velleman and Hoaglin (1981), and Cleveland (1993). Note that this package is in beta mode, so use at your own discretion.

Installation

You can install the development version of tukeyedar from GitHub with:

# install.packages("devtools")
devtools::install_github("mgimond/tukeyedar")

The above command does not build the vignettes automatically. The vignettes are available on this website (see next section). If you want a local copy of the vignettes, add the build_vignettes = TRUE parameter.

devtools::install_github("mgimond/tukeyedar", build_vignettes = TRUE)

If for some reason the vignettes are not created, try re-installing the package with the force = TRUE parameter.

devtools::install_github("mgimond/tukeyedar", build_vignettes = TRUE, force=TRUE)

Vignettes

It’s strongly recommended that you read the vignettes. These can be accessed from this website:

If you chose to build the vignettes when you installed the package, you can view them locally via vignette("RLine", package = "tukeyedar"). If you use a dark-themed IDE, the vignettes may not render well, in which case you can view them in a web browser via the function RShowDoc("RLine", package = "tukeyedar").

Using the functions

All functions start with eda_. For example, to generate a three-point summary plot of mpg vs. disp from the mtcars dataset, type:

library(tukeyedar)
eda_3pt(mtcars, disp, mpg)

Note that most functions are pipe friendly. For example, the following works:

# Using R >= 4.1
mtcars |>  eda_3pt(disp, mpg)

# Using magrittr (or any of the tidyverse packages)
library(magrittr)
mtcars %>% eda_3pt(disp, mpg)
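
The pipe works because the data frame is the first argument and the column names are passed unquoted. The following is a minimal base R sketch of that calling pattern. The helper med_pair is hypothetical and is not part of tukeyedar; it only illustrates why the direct and piped calls are equivalent.

```r
# Hypothetical helper (not part of tukeyedar): the data frame comes first,
# followed by unquoted column names, which is what makes it pipe friendly.
med_pair <- function(dat, x, y) {
  # Evaluate the unquoted column names inside the data frame
  xv <- eval(substitute(x), dat)
  yv <- eval(substitute(y), dat)
  c(x = median(xv), y = median(yv))
}

direct <- med_pair(mtcars, disp, mpg)
piped  <- mtcars |> med_pair(disp, mpg)   # native pipe, requires R >= 4.1

identical(direct, piped)
```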

Cleveland, William. 1993. Visualizing Data. Hobart Press.

Hoaglin, D. C., F. Mosteller, and J. W. Tukey, eds. 1983. Understanding Robust and Exploratory Data Analysis. Wiley.

Tukey, John W. 1977. Exploratory Data Analysis. Addison-Wesley.

Velleman, P. F., and D. C. Hoaglin. 1981. Applications, Basics and Computing of Exploratory Data Analysis. Boston: Duxbury Press.

tukeyedar's People

Contributors

mgimond

tukeyedar's Issues

Add a Normal QQ option to eda_qq

A proposed feature is to add a Normal option to eda_qq so that it generates a Normal QQ plot. In that case, x is a vector, y is NULL, and a parameter norm is set to TRUE.
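
For context, the computation behind a Normal QQ plot can be sketched in base R as follows. This is only an illustration of the idea, not the eda_qq implementation. The helper name normal_qq_points is hypothetical, and the choice of plotting positions (stats::ppoints) is an assumption.

```r
# Hypothetical sketch of the points behind a Normal QQ plot;
# this is NOT how eda_qq is (or will be) implemented.
normal_qq_points <- function(x) {
  x <- sort(x[!is.na(x)])
  # Plotting positions from stats::ppoints, mapped to Normal quantiles
  p <- ppoints(length(x))
  data.frame(theoretical = qnorm(p), sample = x)
}

qq <- normal_qq_points(mtcars$mpg)
head(qq)
```

Plotting qq$sample against qq$theoretical gives the usual Normal QQ display; points that fall close to a straight line suggest an approximately Normal batch.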

show.par not working in qq plot

The show.par parameter is ignored when generating a QQ plot with the eda_qq function, but it is respected when md = TRUE is set.

eda_rline ignores the iter argument

eda_rline appears to ignore the iter argument when iterating. For example,

eda_rline(neoplasms, Temp, Mortality, iter =1)

will still iterate until a stable slope is achieved, as the iter value in the output shows:

...
$px
[1] 1
$py
[1] 1
$iter
[1] 4
attr(,"class")
[1] "eda_rline"

Add option to assess symmetry in a batch using eda_qq

Split the batch into two halves at the median, then compare the two halves with eda_qq:

med <- median(x)
len <- length(x)
x <- sort(x)
n2 <- ifelse( len%%2 == 0, len/2, (len + 1)/2)
vi <- med - x[1:n2]
ui <- x[ (len + 1) - (1:n2) ] - med

eda_qq(vi,ui)

ref: Sections 2.3 and 2.8 of Graphical methods for data analysis (Chambers et al.)
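
As a quick check of the steps above, they can be wrapped in a small function and run on a batch whose symmetry is known. The function name symmetry_halves is hypothetical and is used here for illustration only.

```r
# Hypothetical helper wrapping the steps above; the name is illustrative.
symmetry_halves <- function(x) {
  x   <- sort(x)
  len <- length(x)
  med <- median(x)
  n2  <- ceiling(len / 2)   # equivalent to the ifelse() expression above
  list(lower = med - x[1:n2],                  # distances below the median
       upper = x[(len + 1) - (1:n2)] - med)    # distances above the median
}

# A symmetric batch: both halves match
h_sym <- symmetry_halves(c(1, 2, 3, 4, 5))
h_sym$lower   # 2 1 0
h_sym$upper   # 2 1 0

# A right-skewed batch: the upper half stretches further from the median
h_skew <- symmetry_halves(c(1, 2, 3, 4, 10))
h_skew$lower  # 2 1 0
h_skew$upper  # 7 1 0
```

In a symmetric batch the paired values fall on the 1:1 line when passed to eda_qq; departures from that line point to skew.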

eda_boxls has redundant argument

eda_boxls has an unused argument, outliers. This is not to be confused with the outlier argument which is used in the function.

Eliminate dplyr from the eda_sl routine

The use of dplyr in eda_sl raises a NOTE in the package check (which fails the check on Travis CI). A possible base R solution follows:

  # Sort by group and by y within group, then split the sorted data by group
  df2 <- df1[order(df1$grp, df1$y),]
  df3 <- split(df2, df2$grp)
  y <- lapply(df3, function(x) as.vector(x[,2]) )
  n <- lapply(df3, FUN = function(x)nrow(x) )
  M <- lapply(n, function(x) floor((x - 1) / 2))
  H <- lapply(M, function(x) (floor(x) - 1 ) / 2)
  med <- lapply(1:length(y), FUN=function(x, lst1, lst2) log(lst1[[x]][lst2[[x]]]) ,
                lst1=y, lst2=M)
  Hlo <- lapply(1:length(y), FUN=function(x, lst1, lst2) lst1[[x]][floor(lst2[[x]])] ,
                lst1=y, lst2=H)
  n_hi <- lapply(1:length(M), function(x, lst1, lst2) ceiling(lst1[[x]] + 1 - lst2[[x]]),
                 lst1 = n, lst2 = H)
  Hhi <- lapply(1:length(y), FUN=function(x, lst1, lst2) lst1[[x]][lst2[[x]] + 1] ,
                lst1=y, lst2=n_hi)

  df4 <- data.frame( grp = unique(df2$grp), med = unlist(med), Hhi = unlist(Hhi),
                    Hlo = unlist(Hlo))
  df4$sprd <- log(df4$Hhi - df4$Hlo)
  return(df4)
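
A shorter alternative, sketched here under stated assumptions, is to let stats::fivenum compute the median and hinges for each group. This is not the approach proposed above: fivenum's hinges may not match the depth-based indexing in that snippet exactly. The function name sl_summary is hypothetical, and df1 is assumed to have the columns grp and y.

```r
# Hypothetical base R sketch; df1 is assumed to have columns grp and y.
sl_summary <- function(df1) {
  # fivenum() returns: minimum, lower hinge, median, upper hinge, maximum
  fn <- lapply(split(df1$y, df1$grp), fivenum)
  out <- data.frame(
    grp = names(fn),
    med = log(sapply(fn, `[`, 3)),   # log of the group median
    Hlo = sapply(fn, `[`, 2),        # lower hinge
    Hhi = sapply(fn, `[`, 4),        # upper hinge
    row.names = NULL
  )
  out$sprd <- log(out$Hhi - out$Hlo) # log of the hinge spread
  out
}

df1 <- data.frame(grp = as.character(mtcars$cyl), y = mtcars$mpg)
res <- sl_summary(df1)
res
```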

Modify mtext for y-axis

Many functions use side=3 in the mtext function to add the y-axis label. This needs to be changed to side=2, for example:

mtext(ylab, side = 2,  col = plotcol, padj = -0.5, at = par('usr')[4], las = 2)
