
corrplot's Introduction


Summary

The R package corrplot provides a visual exploratory tool for correlation matrices. It supports automatic variable reordering to help detect hidden patterns among variables.

corrplot is very easy to use and provides a rich array of plotting options in visualization method, graphic layout, color, legend, text labels, etc. It also provides p-values and confidence intervals to help users determine the statistical significance of the correlations.

For examples, see its online vignette.

This package is licensed under the MIT license, and available on CRAN: https://cran.r-project.org/package=corrplot.

Basic example

library(corrplot)
M = cor(mtcars)
corrplot(M, order = 'hclust', addrect = 2)


Download and Install

To install the release version of the package from CRAN, type the following at the R command line:

install.packages('corrplot')

To install the development version of the package, type the following at the R command line:

devtools::install_github('taiyun/corrplot', build_vignettes = TRUE)

How to cite

To cite corrplot properly, call the R built-in command citation('corrplot') as follows:

citation('corrplot')

Reporting bugs and other issues

If you encounter a clear bug, please file a minimal reproducible example on GitHub.

corrplot's People

Contributors

alexchristensen, caijun, cos-editor, druedin, gitter-badger, jeffzemla, jyfeather, michaellevy, protivinsky, taiyun, vsimko, yihui


corrplot's Issues

How to have Corrplot title position correct?

Code

library('corrplot')

# http://www.sthda.com/english/wiki/visualize-correlation-matrix-using-correlogram
cor.mtest <- function(mat, ...) {
    mat <- as.matrix(mat)
    n <- ncol(mat)
    p.mat<- matrix(NA, n, n)
    diag(p.mat) <- 0
    for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
            tmp <- cor.test(mat[, i], mat[, j], ...)
            p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
        }
    }
  colnames(p.mat) <- rownames(p.mat) <- colnames(mat) 
  p.mat
}

M <- cor(mtcars)

p.mat <- cor.mtest(mtcars)  # test the raw data, not the correlation matrix

title <- "ECG p-value significance"
col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(M, method="color", col=col(200),  
     diag=FALSE, # tl.pos="d", 
         type="upper", order="hclust", 
         title=title, 
         addCoef.col = "black", # Add coefficient of correlation
         # Combine with significance
         p.mat = p.mat, sig.level = 0.05, insig = "blank" 
         )

Output: see the related thread. About half of the title falls outside the plot window.

R: 3.3.1
OS: Debian 8.5
Related thread: http://stackoverflow.com/q/40509217/54964

restore `par$mar` settings

mar is set to c(0,0,0,0) by default, and this change persists in the graphics environment after the corrplot function returns, which is usually not expected. It would be better to store the par settings and restore them after plotting.
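The save-and-restore pattern suggested here can be sketched in base R as follows. This is only an illustration of the idea, not corrplot's actual code; the wrapper name plot_with_zero_margins and its draw argument are hypothetical.

```r
# Hypothetical wrapper (not part of corrplot): draw with zero margins,
# then give the caller back their original par() settings.
plot_with_zero_margins <- function(draw) {
  old_par <- par(mar = c(0, 0, 0, 0))  # par() returns the previous values
  on.exit(par(old_par), add = TRUE)    # restored even if draw() fails
  draw()
  invisible(NULL)
}

pdf(NULL)                               # off-screen device, no file is written
mar_before <- par("mar")
plot_with_zero_margins(function() {
  plot.new()
  text(0.5, 0.5, "demo")
})
mar_after <- par("mar")                 # same as mar_before
dev.off()
```

Using on.exit() rather than a trailing par(old_par) call means the settings are restored even when the plotting code stops with an error.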

lim.segment parameter default value

Currently, the default value for lim.segment parameter in colorlegend function is NULL.
I would like to propose a better, more intuitive value "auto".

Is there a way to print only the top labels?

I found this question on stackoverflow:
http://stackoverflow.com/questions/30207260/corrplot-label-printing

df <- data.frame(
    x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20), x4 = rnorm(20),
    x5 = rnorm(20), x6 = rnorm(20), x7 = rnorm(20), x8 = rnorm(20),
    x9 = rnorm(20), x10 = rnorm(20), x11 = rnorm(20), x12 = rnorm(20))
cormatx <- cor(df)
corrplot(cormatx, method="color")

Now I can alter the position of the labels by adding tl.pos=... which, according to the package manual, only takes "lt", "ld", "td", "d" or "n" as arguments. These are "left and top", "left and diagonal", "top and diagonal", "diagonal" and "NULL" respectively. (To my knowledge all the arguments involving the "diagonal" option won't even work with method="color").

Is there a way to print only the top labels? I tried tl.pos="t" without any luck. I think that argument just isn't supported, so it fell back to the default.

Size and number of decimals of coefficients of correlation

Hi,

Thanks for the great library. I have a correlation plot with 46 variables. When plotting with method="number", the numbers are larger than the squares. Is it possible to change the font size of the correlation coefficients? Also, is it possible to change the number of decimals?

If not, these would be two great features!

Best,
Kristoffer

NA errors when is.corr = F

It seems this bug #7 hasn't been fixed in the case when is.corr = F.

x <- matrix(0, ncol = 5, nrow = 5)
x[1,1] <- NA
corrplot(x) #this works
corrplot(x, is.corr = F) #this does not

Error with correlation plot using insig argument when all values are significant

Please take a look at this error, which occurs when you plot a correlation matrix whose p-values are all lower than or equal to the significance level.
Thanks in advance,
Juan

require(Hmisc, quietly = TRUE)
require(corrplot)
p=rcorr(as.matrix(attitude[,1:3]))$P
cor_=cor(attitude[,1:3])
corrplot.mixed(cor_, upper='number',lower='ellipse', p.mat=p,
insig='blank')

Error in symbols(pos.pNew[, 1][ind.p], pos.pNew[, 2][ind.p], inches = FALSE

ind.p has length zero because no value is greater than the significance level.
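One way to avoid this kind of failure is to skip the drawing call when the index is empty. The following base R sketch shows the guard in isolation; mark_insignificant and its arguments are hypothetical names, and it does not reproduce corrplot's internal code.

```r
# Hypothetical helper, not corrplot's internal code: blank out cells only
# when at least one p-value exceeds the significance level.
mark_insignificant <- function(pos, p_values, sig_level = 0.05) {
  ind <- which(p_values > sig_level)
  if (length(ind) == 0) {
    return(invisible(0L))  # nothing to mark, so symbols() is never called
  }
  symbols(pos[ind, 1], pos[ind, 2], squares = rep(1, length(ind)),
          inches = FALSE, add = TRUE, bg = "white", fg = "white")
  invisible(length(ind))
}

pdf(NULL)                                   # off-screen device
plot(c(0, 4), c(0, 4), type = "n")
pos <- cbind(x = c(1, 2, 3), y = c(1, 2, 3))
n_all_significant   <- mark_insignificant(pos, c(0.001, 0.01, 0.02))
n_one_insignificant <- mark_insignificant(pos, c(0.001, 0.2, 0.02))
dev.off()
```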

Allow separate colors for X-label and Y-label

I saw that we have a non-mergeable pull request from @zhilongjia who suggested to add separate parameters tl.x.col and tl.y.col instead of the current tl.col. This looks like a reasonable feature.
However, it would also break the current API.

Therefore, I would suggest keeping the tl.col parameter and letting it contain either a single color name as a string, e.g. "red", or separate colors as a pair, e.g. c("red", "blue"): red for the X label, blue for the Y label.

Any suggestions?
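The proposal could be handled with a small normalisation step like the one below. This is a sketch of the idea only; resolve_tl_col is a hypothetical helper and is not an existing corrplot function.

```r
# Hypothetical helper: accept one colour (used for both axes) or a pair
# (first for the X labels, second for the Y labels).
resolve_tl_col <- function(tl.col) {
  stopifnot(length(tl.col) %in% c(1, 2))
  cols <- rep_len(tl.col, 2)   # recycles a single colour to both axes
  c(x = cols[1], y = cols[2])
}

single <- resolve_tl_col("red")              # same colour for both axes
pair   <- resolve_tl_col(c("red", "blue"))   # separate colours
```

Because a single string still works, existing calls that pass tl.col = "red" would keep their current behaviour.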

Using plotmath in variable names

Is there a way of changing the variables names such that plotmath expressions could be used? I've tried adding a labels argument with a vector of new labels thinking that it may be passed through to text, but that didn't seem to work.

White bg in Spearman cor diagonal?

Code

library("psych")
library("corrplot")   

M <- mtcars 
M.cor <- cor(M)

p.mat.all <- psych::corr.test(M.cor, adjust = "none", ci = F)
alpha <- 0.05
col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))  

lapply(
  c("r","p","t"),
  function(ID) { # http://stackoverflow.com/a/40531043/54964
    x <- p.mat.all[[ID]]    
      corrplot( M.cor, 
                p.mat = x, 
                sig.level = alpha, 
                insig = "blank", 
      ) 
  })

Output: spearman r diagonal is white, please see the linked thread for the figure

R: 3.3.1
OS: Debian 8.5
Related: http://stackoverflow.com/q/40533069/54964

Plotted NAs should be different from zeroes

As pointed out by @taiyun we should distinguish between 0 and NA in the plot.

It is not a good idea to plot nothing when a cell value is NA:

M1 = M2 = cor(mtcars)
diag(M1) = 0
diag(M2) = NA

# should not be the same
corrplot(M1) 
corrplot(M2) 

How about using '?' instead?

ability to define plot ranges

Hi,

Not sure if I've missed this in the documentation, but it would be nice to be able to define plot ranges.

For example, if I'm not interested in extreme correlations +1/-1 and, for the purposes of clarity, I wanted to remove those from the plot and concentrate on, say, the 0.6/-0.6 range there is currently no mechanism in corrplot to do that (as far as I can tell).
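Until such an option exists, one possible workaround is to mask the values outside the range of interest before plotting. The base R sketch below only prepares the matrix; the threshold 0.6 comes from the example above, the name M_masked is made up for illustration, and whether a plotting function accepts the resulting NA cells depends on the function and its version.

```r
# Workaround sketch: replace correlations outside [-0.6, 0.6] with NA.
M <- cor(mtcars)
M_masked <- M
M_masked[abs(M_masked) > 0.6] <- NA   # the diagonal (all 1s) is masked too

# Count how many cells remain inside the range of interest.
n_kept <- sum(!is.na(M_masked))
```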

Large dataset = diagonal not straight

I am trying to plot the correlations of a large dataset (~1300 variables) and saw that the diagonal doesn't appear straight. I was not able to reproduce this using random matrices. Maybe you can give me some hints on how to debug this:
corrplot

Here is the same correlation matrix with image:
image

Support for multiple characters when rendering NAs

There is a parameter na.label. However, it currently only allows for a single character (default "?").
It might be useful to support more characters with some reasonable upper limit.
Example:

corrplot(M2, na.label = "NA", number.cex = .7) # works from v0.76

image

@taiyun Before I implement this, let's discuss the following:

  • Should the limit be 2 chars or more? (I'm voting for 2)
  • Should the default value be "?" or "NA" ? (I'm voting for "?")

error when the matrix(corr) contains NA values.

Hi, I would like to plot some other pairwise statistics (e.g. odds ratio) with corrplot() using the is.corr = F option.

I found that your code raises the following error when the matrix (corr) contains NA values.

# Error in if (max(corr) * min(corr) < 0) { : 
#   missing value where TRUE/FALSE needed

I tried the following modification, adding na.rm = T to your corrplot() code:

    if (!is.corr) {
        if (max(corr, na.rm = T) * min(corr, na.rm = T) < 0) {
            intercept <- 0
            zoom <- 1/max(abs(cl.lim))
        }
        if (min(corr, na.rm = T) >= 0) {
            intercept <- -cl.lim[1]
            zoom <- 1/(diff(cl.lim))
        }
        if (max(corr, na.rm = T) <= 0) {
            intercept <- -cl.lim[2]
            zoom <- 1/(diff(cl.lim))
        }

However, after this modification, I got a new error:

# # Error in (n + 1 - n2):(n + 1 - n1) : result would be too long a vector
# # In addition: Warning messages:
# # 1: In min(corr, na.rm = TRUE) :
# #   no non-missing arguments to min; returning Inf
# # 2: In max(corr, na.rm = TRUE) :
# #   no non-missing arguments to max; returning -Inf
# # 3: In max(Pos[, 2]) : no non-missing arguments to max; returning -Inf
# # 4: In min(Pos[, 2]) : no non-missing arguments to min; returning Inf
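The warnings above show min() and max() returning Inf and -Inf when no non-missing values are left. A guard for that case might look like the following base R sketch; safe_range is a hypothetical helper and is not taken from the corrplot sources.

```r
# Hypothetical helper: range of the non-missing values, with an explicit
# error instead of the Inf / -Inf that min() and max() return on empty input.
safe_range <- function(corr) {
  values <- corr[!is.na(corr)]
  if (length(values) == 0) {
    stop("matrix contains only NA values")
  }
  range(values)
}

x <- matrix(c(0.2, NA, NA, 3.5), nrow = 2)
rng <- safe_range(x)

# An all-NA matrix now fails early with a clear message.
all_na_result <- try(safe_range(matrix(NA_real_, 2, 2)), silent = TRUE)
```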

Please let me know your advice.

Thanks.

Checks for [-1; 1] interval are too strict

When computing a weighted correlation matrix with wt.cor(.., cor=TRUE)$cor, diagonal correlations can be slightly above 1 or slightly below -1, but corrplot raises an error which is likely to confuse users since the printed values are either 1 or -1. The checks could be made a bit more tolerant by checking for x <= 1 + 2 * .Machine$double.eps and x >= -1 - 2 * .Machine$double.eps (which worked in my case, but one could be a bit more tolerant...).
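A tolerant bounds check along these lines can be sketched in base R. The function name within_corr_bounds is hypothetical; the default tolerance .Machine$double.eps^0.75 is the one that appears in an error message quoted in another issue on this page, and any other small value could be substituted.

```r
# Hypothetical check: accept values that overshoot [-1, 1] only by
# floating-point noise.
within_corr_bounds <- function(x, tol = .Machine$double.eps^0.75) {
  all(x >= -1 - tol & x <= 1 + tol, na.rm = TRUE)
}

noisy_ok  <- within_corr_bounds(c(-1, 0.3, 1 + 2 * .Machine$double.eps))
clear_bad <- within_corr_bounds(c(0.5, 1.001))
```

Here noisy_ok is TRUE because the overshoot is far below the tolerance, while clear_bad is FALSE because 1.001 is a genuine violation.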

rounded r values

I am trying to show the correlation r values with at least two decimals instead of rounded (e.g. 0.99 instead of 1), as corrplot always rounds the r values.

Is there any parameter in the corrplot package that allows this?
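Independently of what corrplot itself offers, the formatting being asked for is easy to express in base R. The sketch below only shows how fixed two-decimal labels can be produced; it does not change how corrplot draws its numbers.

```r
# Format correlation values with exactly two decimals.
r <- c(0.994, 0.5, -0.128)
labels <- sprintf("%.2f", r)
labels
# → "0.99" "0.50" "-0.13"
```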

Not support R3.3.2

Hi, there.
Thank you very much for developing such a good package for plotting the correlation matrix!
Unfortunately, I found that this package is unavailable for R 3.3.2.
So I am wondering whether you will update this package soon.
If not, maybe I need to use a previous version of R.

Thanks again.

CRAN deployment

Describe somewhere the process of deployment to CRAN

  • e.g. in a separate section in README.md or a separate document

NAs still causing problems ?

@CrHvG:

Thanks for the great package. I've installed the update, but I still get an error when running a corr matrix with NAs. When will this be available to the public? Also, why not just display "NA" rather than "?"? The value may not be in question ("?"), but may be genuinely not applicable ("NA"). Thanks.

documentation for corrRect.hclust

The function corrRect.hclust is exported (see NAMESPACE) but doesn't have any documentation.
Some parameters, such as k or method are still documented within corrRect.

Corrplot.mixed & plotCI

Currently, corrplot.mixed and plotCI are not really compatible.

corrplot::corrplot.mixed(M,lower='number',upper='circle',low=L,upp=U,plotCI='rect')

results in the following error:

Error in corrplot(corr, add = TRUE, type = "lower", method = lower, diag = (diag ==  : 
method should be circle or square if draw confidence interval!

And only draws the upper half with confidence plots.
It should also draw the lower triangle with numbers.

It is also currently not clear how to specify which triangle (upper or lower) should be used by "rect".

hclust method ward.D2 not supported

Newer versions of R support an hclust method called ward.D2, which implements Ward's minimum variance method. The original method, simply called ward, is now deprecated and has been renamed ward.D. Its implementation was flawed in that it did not square the dissimilarity scores.

When using hclust.method="ward" in corrplot, a warning is flagged:

The "ward" method has been renamed to "ward.D"; note new "ward.D2"

However, using hclust.method="ward.D2" (or ward.D) results in an error:

Error in match.arg(hclust.method) :
'arg' should be one of “complete”, “ward”, “single”, “average”, “mcquitty”, “median”, “centroid”
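As a workaround, the ordering can be computed directly with stats::hclust, which accepts "ward.D2". The sketch below assumes 1 - correlation as the dissimilarity, which is a common choice but not necessarily what corrplot uses internally; the variable names are illustrative.

```r
# Cluster the variables of a correlation matrix with Ward's method (ward.D2).
M <- cor(mtcars)
d <- as.dist(1 - M)                    # assumed dissimilarity: 1 - correlation
tree <- hclust(d, method = "ward.D2")

# Reorder the matrix according to the dendrogram before plotting it.
ordered_names <- colnames(M)[tree$order]
M_ordered <- M[ordered_names, ordered_names]
```

The reordered matrix M_ordered could then be passed to a plotting call that keeps the original order.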

Size of correlation coefficients

Thanks for a very versatile package!

I'm trying to adjust the size of the displayed correlation coefficient. On this issue:
#36
it is suggested to use number.cex=0.5
I have v0.73 installed from the repos, and when I add this to a basic call, as suggested there:
corrplot(cor(mtcars), method='number', number.cex=0.5)
I get the following warnings:

Warning messages:
1: "number.cex" is not a graphical parameter 
2: "number.cex" is not a graphical parameter 
3: "number.cex" is not a graphical parameter 
4: "number.cex" is not a graphical parameter 
5: "number.cex" is not a graphical parameter 
6: "number.cex" is not a graphical parameter 
7: In text.default(pos.xlabel[, 1], pos.xlabel[, 2], newcolnames, srt = tl.srt,  :
  "number.cex" is not a graphical parameter
8: In text.default(pos.ylabel[, 1], pos.ylabel[, 2], newrownames, col = tl.col,  :
  "number.cex" is not a graphical parameter
9: In title(title, ...) : "number.cex" is not a graphical parameter

and there is no change to the displayed text size.

Also, if I remove the repo version and try to install the developer version:
devtools::install_github("taiyun/corrplot")
I get the following error message:

Downloading github repo taiyun/corrplot@master
Error in system(full, intern = quiet, ignore.stderr = quiet, ...) : 
  error in running command

Any suggestions as to what to do about these?

Thanks.

Enable to plot a matrix with NA

Hi, I have started using corrplot and appreciate this nice package, but it would be better if corrplot() could plot a matrix with NA values.
It throws an error like the one below:

> M
     [,1] [,2]
[1,]    1   NA
[2,]   NA    1
> corrplot(M)
Error in if (min(corr) < -1 - .Machine$double.eps^0.75 || max(corr) >  : 
  missing value where TRUE/FALSE needed

A simple solution may be to plot nothing if a cell value is NA.
Thanks in advance.

add example for corrplot with NAs

Support for NAs in corrplot has been discussed in issues #55, #49, #46 and #7.
Some code utilizing the NAs is already located in tests/testthat/test-corrplot.R.
However, we also need some demo code snippets and plots in examples and vignettes, i.e.

  • in vignettes/example-corrplot.R
  • in vignettes/corrplot-intro.Rmd

Changing aspect ratio for the plot

Fabian Roger asked per email:

Is there any way to for the output of corrplot (with is.corr = F) to be quadratic for a matrix that is not?

Return value should be the same as corrplot function

Return value from corrplot.mixed differs from corrplot.
The main corrplot function now returns invisible(corr) which is useful for testing.
However, the function corrplot.mixed does not return anything.

I suggest returning the same: invisible(corr)

title position and pie corrplot background circle

  1. When type="lower", is it possible to put title in upper section, instead of top? Sometimes, the title is far from the figure itself.
  2. Make the circle perimeter nonvisual (see figure) or add a parameter.
    corrplot

line 241 in corrplot.R.

original:

symbols(Pos, add = TRUE, inches = FALSE, circles = rep(0.5, len.DAT) * 0.85)

nonvisual circle perimeter:

symbols(Pos, add = TRUE, inches = FALSE, circles = rep(0.5, len.DAT) * 0.85, fg = bg)

Add an example plot to your readme

For a package that creates a visual it makes a lot of sense to have at least an example of those visuals on the frontpage of your repo. If you add something to your readme that could help get people excited about your package.

How to set corrplot diagonal numeric labels?

I am trying to have numeric diagonal labels on corrplot. I have the correlation matrix M and the labels ids. I also opened a thread on SO about this case; see the link below. I think the line colnames(p.mat) <- rownames(p.mat) <- colnames(mat) <- c(ages) in the thread's function cor.mtest should associate ids with the diagonal labels.

ids <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)

cor.mtest <- function(mat, ...) {
    mat <- as.matrix(mat)
    n <- ncol(mat)
    p.mat<- matrix(NA, n, n)
    diag(p.mat) <- 0
    for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
            tmp <- cor.test(mat[, i], mat[, j], ...)
            p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
        }
    }
  colnames(p.mat) <- rownames(p.mat) <- colnames(mat) <- diag.labels
  p.mat
}

M<-cor(mtcars, diag.labels=ids)
corrplot(M, type="upper", order="hclust", tl.pos=c("td"), method="circle",  tl.cex = 0.5, tl.col = 'black', diag = FALSE,
         p.mat = p.mat, sig.level = 0.05)

OS: Debian 8.5
R: 3.3.1
Related: http://stackoverflow.com/q/40494979/54964
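Since cor() has no diag.labels argument, one possible approach is to rename the rows and columns of the correlation matrix itself before plotting. This base R sketch only prepares the matrix and assumes that the plotting function takes its labels from the dimnames.

```r
# Replace the variable names of the correlation matrix with numeric ids.
ids <- 1:11                       # one id per mtcars column
M <- cor(mtcars)
dimnames(M) <- list(as.character(ids), as.character(ids))

colnames(M)
```

The same dimnames would need to be applied to p.mat so that both matrices stay aligned.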

Publish the project to OpenHub

https://www.openhub.net/

The Black Duck Open Hub (formerly Ohloh.net) is an online community and public directory of free and open source software (FOSS), offering analytics and search services for discovering, evaluating, tracking, and comparing open source code and projects.

corrplot with type = "upper" and long colname strings cuts off top labels

If you corrplot a correlation matrix where variables in the data have very long names, then the plot cuts off the top labels.

For example, the following code

set.seed(123)
rmat <- matrix(runif(100), ncol = 10)
colnames(rmat) <- c("the quick brown fox jumps over the lazy dog", "and then went to get ice cream", "A", "B", "C", "D", "E", "F", "G", "H")
M <- cor(rmat)
corrplot(M, type = "upper", tl.pos = "td",
         method = "circle", tl.cex = 0.5, tl.col = 'black',
         order = "hclust", diag = FALSE)

Produces the following plot:
image

Removing the tl.cex still has the problem, yielding the following plot:
image

Note that both plots also have a fairly excessive amount of white space to the left of the plot, but that is not the issue here.

As a workaround, I found that if I comment out line 183 of corrplot.R then the problem is reduced or resolved, although the colorlegend last value (-1) gets cut off the bottom of the plot.

How to hide grid when plotting a large matrix

This might be a bug, inconsistent API, missing feature or bug in the documentation.

In corrplot.mixed.R the parameter addgrid.col is documented as:

#' @param addgrid.col The color of grid, if \code{NULL}, don't add grid.

In corrplot.R the parameter addgrid.col is documented as:

#' @param addgrid.col The color of grid. The default value is depends on
#'   \code{method}, if \code{method} is \code{color} or \code{shade}, the
#'   default values is \code{"white"}, otherwise \code{"grey"}.

It looks like it should be possible to use addgrid.col to disable the rendering of the grid.
But it does not work. See the following code snippet:

M <- matrix(runif(2500, 0.5, 1), nrow = 50)
corrplot(M, method = "color", cl.pos = "n", tl.pos = "n", addgrid.col = NULL)

image

However, I would like to generate something like this:
image

github wiki not needed

I think that we actually don't need the github wiki.
I suggest to turn it off in: Settings->Features->Wiki checkbox

Warning from lintr about corrMatOrder on Travis

I'm not sure why, but lintr always reports the following warning when built on Travis.
R/corrplot.R:310:12: warning: no visible global function definition for ‘corrMatOrder’

ord <- corrMatOrder(corr, order = order, hclust.method = hclust.method)
           ^~~~~~~~~~~~

If I run lintr locally in RStudio, I don't see any lints:

> lintr::lint_package()
