
visualizer's Introduction

Hi there, I'm Ashish

I love applied maths, programming, data science, and books

  • I'm addicted to learning and growing every day

  • I am currently sharing a little bit of my knowledge with the world through my blog.

  • I am currently working on mixed data clustering

visualizer's Issues

How to rotate axis labels in ggplot2?

If the x-axis labels are very long, for example when the x-axis is a factor variable with long level names, the labels are cramped and overlap when the data is plotted. How can this problem be overcome?

Consider the following example:

library(ggplot2)
data(diamonds)
diamonds$cut <- paste("Super Dee-Duper",as.character(diamonds$cut))
q <- qplot(cut,carat,data=diamonds,geom="boxplot")
q 
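One common way to handle this is to rotate the tick labels through the theme. The sketch below rebuilds the same boxplot with ggplot() instead of qplot() and rotates the x-axis text by 90 degrees; the object names d and q_rot are only illustrative.

```r
library(ggplot2)

# Copy the data so the packaged diamonds data set is left untouched
d <- as.data.frame(diamonds)
d$cut <- paste("Super Dee-Duper", as.character(d$cut))

# Rotate the x-axis tick labels; hjust/vjust keep them aligned with the ticks
q_rot <- ggplot(d, aes(x = cut, y = carat)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
```

An angle of 45 with hjust = 1 is a frequent alternative when vertical labels take too much space.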

Change the shape of the legend in density plots with ggplot2

I want to change the boxes shown in the density plot legend into lines, and I understand that I need to use guides. However, the following code does not work:

set.seed(0)
library(ggplot2)
df <- data.frame(Income=c(rnorm(500,1000,200),rnorm(500,900,10)),Type=c(rep("A",500),rep("K",500)))
p2p <- ggplot(df,aes(x=Income))+geom_density(aes(colour=Type))+
  guides(colour = guide_legend(override.aes = list(linetype = 1, shape = 3)))
p2p
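The legend shows boxes because geom_density() draws its keys as rectangles, which override.aes does not change. A possible workaround, sketched here under the assumption that only the outline of each density is needed, is to draw the density with stat_density() and a line geom, whose legend keys are lines.

```r
library(ggplot2)

set.seed(0)
df <- data.frame(
  Income = c(rnorm(500, 1000, 200), rnorm(500, 900, 10)),
  Type   = rep(c("A", "K"), each = 500)
)

# A line geom gives line-shaped legend keys;
# position = "identity" keeps the two curves from being stacked
p_lines <- ggplot(df, aes(x = Income, colour = Type)) +
  stat_density(geom = "line", position = "identity")
```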

How to create a multiple kernel density plot with ggplot2?

I would like to add a kernel density estimate for 2 types of data to a ggplot. If I use the following code, it displays a kernel density estimate for the second factor level only. How do I get a kernel density estimate for both factor levels (preferably in different colors)?

ggplot(mtcars, aes(x = disp, y=mpg, color=factor(vs))) +
   theme_bw() +
   geom_point(size=.5) +
   geom_smooth(method = 'loess', se = FALSE) +
   stat_density_2d(geom = "raster", aes(fill = ..density.., alpha = ..density..), contour = FALSE) +
   scale_alpha(range = c(0,1)) + 
   guides(alpha=FALSE)

See the plot
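With geom = "raster", every group fills the whole panel, so the last group drawn hides the others. One hedged alternative is to draw filled contour polygons per group and map the fill to the factor, so both levels stay visible. The alpha range below is an arbitrary choice.

```r
library(ggplot2)

p_dens <- ggplot(mtcars, aes(x = disp, y = mpg, colour = factor(vs))) +
  theme_bw() +
  geom_point(size = 0.5) +
  # One set of contour polygons per level of vs; denser regions are more opaque
  stat_density_2d(
    aes(fill = factor(vs), alpha = after_stat(level)),
    geom = "polygon", colour = NA
  ) +
  scale_alpha(range = c(0.05, 0.4), guide = "none")
```

after_stat() needs ggplot2 3.3.0 or later; older versions used the ..level.. notation seen in the question.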

How to overlay histogram with a density curve?

What I am trying to do is make a histogram of density values and overlay it with the curve of a density function (not the density estimate).

  • Density curve: a graph of a probability density function. The total area under the curve equals 1, that is, 100% of the probability.

  • Density estimate: an estimate, constructed from observed data, of an unobservable underlying probability density function. The unobservable density function is thought of as the density according to which a large population is distributed; the data are usually thought of as a random sample from that population.

Using a simple standard normal example, here is some data:

library(ggplot2)
# sample data
x <- rnorm(1000)
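A minimal sketch of the overlay: the histogram is scaled to density rather than counts, and stat_function() draws the theoretical standard normal density with dnorm(). The bin count and colours are arbitrary choices.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(1000))

p_hist <- ggplot(df, aes(x = x)) +
  # Scale bar heights to density so they are comparable with the curve
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", colour = "white") +
  # Theoretical N(0, 1) density, not an estimate from the data
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1), colour = "red")
```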

How to save each graph to its directory with its name on it

This question was originally asked on SO. The question is as follows:
I want to add one more feature: create a directory named after each graph and store the corresponding figure in that folder. For instance, if the graph name is setosa, create a folder named setosa and store the setosa graph inside that new directory.

See the following similar questions, 1, 2, 3

Here is my current working code for saving the graphs to the working directory.

library(ggplot2)
library(dplyr)
plot_list = list() # Initialize an empty list
for (i in unique(iris$Species)) {
  p = ggplot(iris[iris$Species == i, ], aes(x=Sepal.Length, y=Sepal.Width)) +
    geom_point(size = 3, aes(colour = Species))
  plot_list[[i]] = p
}

for (i in unique(iris$Species)) {
  file_name = paste( i, ".tiff", sep="")
  tiff(file_name)
  print(plot_list[[i]])
  dev.off()
}
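One way to get a folder per graph is to build the path with file.path() and create the folder with dir.create() before saving. The sketch below writes under tempdir() and saves PDF files through ggsave(); both are assumptions made to keep the example portable, and out_root can be replaced with the working directory and the device with tiff().

```r
library(ggplot2)

out_root <- tempdir()  # assumption: swap for getwd() or any target folder

for (sp in unique(as.character(iris$Species))) {
  p <- ggplot(iris[iris$Species == sp, ], aes(x = Sepal.Length, y = Sepal.Width)) +
    geom_point(size = 3, aes(colour = Species))

  # Create a folder named after the species, e.g. <out_root>/setosa
  out_dir <- file.path(out_root, sp)
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  # Save the plot inside its own folder, e.g. <out_root>/setosa/setosa.pdf
  ggsave(file.path(out_dir, paste0(sp, ".pdf")), plot = p, width = 5, height = 4)
}
```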

How to set limits for axes in ggplot2 plots?

This question was originally asked on SO

library(ggplot2)    

carrots <- data.frame(length = rnorm(500000, 10000, 10000))
cukes <- data.frame(length = rnorm(50000, 10000, 20000))
carrots$veg <- 'carrot'
cukes$veg <- 'cuke'
vegLengths <- rbind(carrots, cukes)

ggplot(vegLengths, aes(length, fill = veg)) +
 geom_density(alpha = 0.2)

Now say I only want to plot the region between x = -5000 and x = 5000, instead of the entire range.
How can I do that?
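Two approaches are commonly used, and they behave differently. coord_cartesian() zooms into the region without touching the data, so the density is still estimated from all values. xlim() drops values outside the range before the density is computed. The sketch uses smaller samples than the question, only to keep it quick to run.

```r
library(ggplot2)

set.seed(1)
carrots <- data.frame(length = rnorm(5000, 10000, 10000), veg = "carrot")
cukes   <- data.frame(length = rnorm(500, 10000, 20000), veg = "cuke")
vegLengths <- rbind(carrots, cukes)

base_plot <- ggplot(vegLengths, aes(length, fill = veg)) +
  geom_density(alpha = 0.2)

# Zoom only: the density is computed on the full data
p_zoom <- base_plot + coord_cartesian(xlim = c(-5000, 5000))

# Filter then plot: values outside the limits are removed (with a warning)
p_clip <- base_plot + xlim(-5000, 5000)
```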

Annotation in ggplot2

  • What is annotation?
  • How does it work in ggplot2?
  • How to annotate a plot?
  • How to annotate text on individual facets in ggplot2?
  • How to add a single annotation on a facet?
  • How to add different annotation on facets?
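The questions above can be illustrated with a small sketch on mtcars. annotate() repeats the same text in every panel, while a geom_text() layer fed with its own data frame places text only in the panels named in the facetting column. The label positions and label texts are arbitrary.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_wrap(~cyl)

# Same annotation on every facet
p_all <- p + annotate("text", x = 4.5, y = 30, label = "all panels")

# Annotation on a single facet: the data frame carries the facet variable
lab_one <- data.frame(wt = 4.5, mpg = 30, cyl = 4, label = "only cyl = 4")
p_one <- p + geom_text(data = lab_one, aes(label = label))

# A different annotation on each facet: one row per panel
lab_each <- data.frame(wt = 4.5, mpg = 30, cyl = c(4, 6, 8),
                       label = c("four", "six", "eight"))
p_each <- p + geom_text(data = lab_each, aes(label = label))
```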

How to center align the plot title if using ggplot2?

I was trying to center-align the plot title using the following code snippet, but it did not work:

library(ggplot2)  
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species))+
  ggtitle("Sepal & Petal length in Iris flower")+
  theme(plot.title = element_text(hjust = 0.5))+
  geom_point() +
  scale_color_brewer(palette = "Set2")+
  ggsave("plot.pdf")+ 
  theme_classic()+
  theme(plot.title = element_text(hjust = 0.5))

The system specs are R version 3.3.3, ggplot2 version 2.2.1, RStudio v1.0.136, Win 7 64 bit Service Pack 1
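Two things in the snippet are likely to interfere. A complete theme such as theme_classic() resets earlier theme() settings, so the hjust adjustment has to come after it. ggsave() is a function call, not a layer, so it should not be added to the plot with +. A hedged rewrite follows, saving to a temporary file as an assumption.

```r
library(ggplot2)

p_title <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
  geom_point() +
  scale_color_brewer(palette = "Set2") +
  ggtitle("Sepal & Petal length in Iris flower") +
  theme_classic() +                                  # complete theme first
  theme(plot.title = element_text(hjust = 0.5))      # then the centering tweak

# Save as a separate step; the temporary path is only for illustration
out_file <- tempfile(fileext = ".pdf")
ggsave(out_file, plot = p_title, width = 6, height = 4)
```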

How to show data values on a stacked bar chart?

This question was originally asked on SO

The OP wanted to show data values on a stacked bar chart, such that the data values are in the middle of each portion.

# dummy data
Year      <- c(rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4))
Category  <- c(rep(c("A", "B", "C", "D"), times = 4))
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368, 423, 645, 234, 685, 166, 467, 274, 251)
Data      <- data.frame(Year, Category, Frequency)
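A common way to centre the values is to give geom_text() the same stacking as the bars, with position_stack(vjust = 0.5) placing each label in the middle of its segment. The sketch repeats the dummy data so that it runs on its own.

```r
library(ggplot2)

Year      <- rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)
Category  <- rep(c("A", "B", "C", "D"), times = 4)
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368,
               423, 645, 234, 685, 166, 467, 274, 251)
Data      <- data.frame(Year, Category, Frequency)

p_stack <- ggplot(Data, aes(x = Year, y = Frequency, fill = Category,
                            label = Frequency)) +
  geom_col() +
  # Stack the labels like the bars and centre them within each segment
  geom_text(position = position_stack(vjust = 0.5), size = 3)
```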

Don't know how to automatically pick scale for object of type data.frame. Defaulting to continuous. Error: Aesthetics must be either length 1 or the same as the data (234): x, y

Given the data frame mpg_data <- as.data.frame(mpg), plotting it with ggplot(mpg_data, aes(x = mpg_data[1:10,], y = mpg_data[1:10,])) + geom_point() generates the following error:

Don't know how to automatically pick scale for object of type data.frame. Defaulting to continuous. Error: Aesthetics must be either length 1 or the same as the data (234): x, y
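The error arises because aes() is handed whole data frames (ten rows of every column) instead of column names. A likely fix is to subset the rows in the data argument and map individual columns in aes(). The columns cty and hwy are picked here only as an example.

```r
library(ggplot2)

mpg_data <- as.data.frame(mpg)

# Subset rows in `data`; refer to columns by name in aes()
p_first10 <- ggplot(mpg_data[1:10, ], aes(x = cty, y = hwy)) +
  geom_point()
```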

Understanding Facets

Let's say you want to split up your data by one or more variables and plot the subsets together. This can be achieved by faceting. A minimal reproducible example is given below:

library(ggplot2)
mpg_data <- as.data.frame(mpg)

ggplot(data = mpg_data[1:10,], aes(x = cty, y = cyl)) + 
  geom_point()+
  facet_wrap(~model)

How to add adjusted p-value to ggplot with comparison?

This question was originally asked on SO. The OP question goes like this;

"I need help to add the adjusted p value (bonferroni for example) on ggplot boxplot instead of p value. I've try do it with stat_compare_means from ggpub package by using the ..p.adj.. on the aesthetics but it doesn't work when add the comparison list."

library(ggplot2)
library(ggpubr) # provides stat_compare_means()
mydf <- data.frame(A=1:300, B=rep(c("x","y","z"),100))
ggplot(data= mydf, aes(x=B,y=A)) + geom_boxplot() + 
stat_compare_means(aes(label=..p.adj..),
                   comparisons = list(c("x","y"),c("x","z"),c("y","z")))

How to alter horizontal and/or vertical spacing between facets in ggplot2?

This question was originally asked on SO here and here

ggplot2 has the ability to change the margins between a faceted plot using the argument panel.margin in opts. This seems to change both horizontal and vertical spacing. Is there a way to change the spacing of either horizontal or vertical without changing the other?

library(ggplot2)
# dummy data
mtcars[, c("cyl", "am", "gear")] <- lapply(mtcars[, c("cyl", "am", "gear")], as.factor)
# create the plot
example <- ggplot(mtcars, aes(mpg, wt, group = cyl)) + 
  geom_line(aes(color=cyl)) +
  geom_point(aes(shape=cyl)) + 
  facet_grid(gear ~ am) +
  theme_bw()        
# visualize the plot
example + theme(panel.spacing.x=unit(2, "lines"))

See the Plot

Specifying different x-tick labels for two facet groups in ggplot2

This question was originally asked on SO

I have boxplots representing results of two methods, each with two validation approaches and three scenarios, to be plotted using ggplot2. Everything works fine, but I want to change the x-axis tick label to differentiate between the type of technique used in each group.

library(ggplot2)
set.seed(1)
data <- data.frame(
  Method = rep(c("Method 1", "Method 2"), each = 100),
  Validation = rep(c("Iterations", "Recursive"), times = 100),
  Scenario = sample(c("Scenario 1", "Scenario 2", "Scenario 3"), 200, replace = TRUE),
  Accuracy = runif(200)
)

I just want to change the first x-tick label (Iterations) in Method 1 and Method 2 into 100-iterations and 10-iterations, respectively.

I tried to add this code but that changes the labels for both groups.

+ scale_x_discrete(name = "Validation",  labels = c("100-iterations", "Recursive", 
                              "10-iterations", "Recursive"))

See the plot
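scale_x_discrete() applies one set of labels to all panels, so it cannot relabel per facet. A possible workaround is to build the label as a data column and free the x scale per panel. The sketch assumes the plot is faceted by Method, which the question does not show, and the Label column is a name introduced here.

```r
library(ggplot2)

set.seed(1)
data <- data.frame(
  Method = rep(c("Method 1", "Method 2"), each = 100),
  Validation = rep(c("Iterations", "Recursive"), times = 100),
  Scenario = sample(c("Scenario 1", "Scenario 2", "Scenario 3"), 200, replace = TRUE),
  Accuracy = runif(200)
)

# Method-specific tick label stored in the data (hypothetical column)
data$Label <- ifelse(
  data$Validation == "Iterations",
  ifelse(data$Method == "Method 1", "100-iterations", "10-iterations"),
  "Recursive"
)

p_facet <- ggplot(data, aes(x = Label, y = Accuracy, fill = Scenario)) +
  geom_boxplot() +
  # free_x lets each panel show only its own labels
  facet_wrap(~Method, scales = "free_x") +
  labs(x = "Validation")
```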

How to sort by Year and then by month in ggplot?

By default ggplot2 sorts this particular graph (code below) in a format that places the months numerically in order (1,2,3,4...). How do I get the graph to place the 2017 months before the 2018 months?
This question was originally posted on SO

library(ggplot2)
library(dplyr)
library(lubridate)

df <- data.frame(date = today() + days(1:300), value = runif(300))

df_summary_mo <- df %>%
  mutate(year = format(date, "%Y"), month = format(date, "%m")) %>%
  group_by(year, month) %>%
  summarise(total = sum(value))

ggplot(df_summary_mo, aes(month, total, fill=year, reorder(year,month))) + geom_col()
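One simple approach is to put year and month into a single "YYYY-MM" key, which sorts chronologically as text. The sketch uses base R aggregation and a fixed start date (an assumption, chosen so the result is reproducible) instead of today().

```r
library(ggplot2)

set.seed(1)
df <- data.frame(date = as.Date("2017-06-01") + 0:299, value = runif(300))

df$year       <- format(df$date, "%Y")
df$year_month <- format(df$date, "%Y-%m")  # sorts by year, then by month

df_summary_mo <- aggregate(value ~ year + year_month, data = df, FUN = sum)

p_months <- ggplot(df_summary_mo, aes(x = year_month, y = value, fill = year)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
```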

How to plot numeric values in text format as legend?

This question was originally asked at SO where the OP had numeric values in a column and wanted them to be plotted as text in the legend. Essentially the variable, daysofweek had values 1,2,3,4,5,6,7. The question was how to plot these values as Monday, Tuesday,...,Sunday?
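A typical way to do this is to convert the numeric column into a factor with text labels before plotting. The sketch assumes that 1 corresponds to Monday; the data frame and its x and y columns are made up for illustration.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(daysofweek = rep(1:7, each = 5), x = rnorm(35), y = rnorm(35))

day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")

# Map the codes 1..7 to day names (assumes 1 = Monday)
df$day <- factor(df$daysofweek, levels = 1:7, labels = day_names)

p_days <- ggplot(df, aes(x, y, colour = day)) +
  geom_point() +
  labs(colour = "Day of week")
```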

Error: ggplot2 doesn't know how to deal with data of class mts/ts?

When using the filter() verb from the dplyr library, this error message came up.

set.seed(123)
library(dplyr)
df <- data.frame(year = 1960:2006,
                 Weekly_Hours_Per_Person = c(2:10, 9:0, 1:10, 9:1, 2:10),
                 GDP_Per_Hour = 1:47 + rnorm(n = 47, mean = 0))

str(df)

# Only label selected years
df_label <- filter(df, df$year %in% c(1960, 1968, 1978, 1988, 1997, 2006))

library(ggplot2)
library(ggrepel)

ggplot(df, aes(Weekly_Hours_Per_Person, GDP_Per_Hour)) +
  geom_path() +
  geom_point(data = df_label) +
  geom_text_repel(data = df_label, aes(label = year)) +
  scale_x_continuous(limits = c(-2, 12))

Executing the code above can produce this error.
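A frequent cause of this message is that filter() resolves to stats::filter() (for example when dplyr is not attached), which returns a time series object rather than a data frame. The sketch below shows that behaviour and a base R subset that avoids the ambiguity; writing dplyr::filter() explicitly is the other usual remedy.

```r
set.seed(123)
df <- data.frame(year = 1960:2006,
                 Weekly_Hours_Per_Person = c(2:10, 9:0, 1:10, 9:1, 2:10),
                 GDP_Per_Hour = 1:47 + rnorm(n = 47, mean = 0))

# stats::filter() applies a linear filter and returns a "ts" object
smoothed <- stats::filter(df$GDP_Per_Hour, rep(1 / 3, 3))
class(smoothed)  # "ts"

# Row selection without any filter() call stays a data frame
# (with dplyr installed: dplyr::filter(df, year %in% c(...)))
df_label <- df[df$year %in% c(1960, 1968, 1978, 1988, 1997, 2006), ]
```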

How to add labels over barplot?

This morning, a student asked me this question: "Sir, how do I add labels over a barplot?". She did not want to use the ggplot2 library.
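In base R, barplot() returns the x positions of the bar midpoints, which can be passed to text() to place a label above each bar. A minimal sketch with cylinder counts from mtcars (chosen only as sample data) follows; the pdf(NULL) and dev.off() calls just keep it from writing a file when run non-interactively.

```r
counts <- table(mtcars$cyl)

pdf(NULL)  # null device; omit in an interactive session

# Leave head room above the tallest bar for the labels
bp <- barplot(counts, ylim = c(0, max(counts) * 1.2),
              xlab = "Cylinders", ylab = "Number of cars")

# bp holds the bar midpoints; pos = 3 puts the text just above each bar
text(x = bp, y = as.numeric(counts), labels = as.numeric(counts), pos = 3)

dev.off()
```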

How to annotate ggplot2 facets with number of observations per facet?

Occasionally when faceting data in ggplot, I think it would be nice to annotate each facet with the number of observations that fell into each facet. This is particularly important when faceting may result in relatively few observations per facet.

What would be the best / simplest way to add an "n=X" to each facet of this plot?

require(ggplot2)
mms <- data.frame(deliciousness = rnorm(100),
                  type=sample(as.factor(c("peanut", "regular")), 100, replace=TRUE),
                  color=sample(as.factor(c("red", "green", "yellow", "brown")), 100, replace=TRUE))
plot <- ggplot(data=mms, aes(x=deliciousness)) + geom_density() + facet_grid(type ~ color)
plot # view the plot
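One straightforward way is to count the observations per facet combination into a small data frame and draw it with geom_text(). Setting x and y to Inf pins the label to the top-right corner of each panel. The counts data frame and its label column are names introduced for this sketch.

```r
library(ggplot2)

set.seed(42)
mms <- data.frame(
  deliciousness = rnorm(100),
  type  = sample(c("peanut", "regular"), 100, replace = TRUE),
  color = sample(c("red", "green", "yellow", "brown"), 100, replace = TRUE)
)

# Number of observations in each type x color cell
counts <- aggregate(deliciousness ~ type + color, data = mms, FUN = length)
names(counts)[3] <- "n"
counts$label <- paste0("n=", counts$n)

p_n <- ggplot(mms, aes(x = deliciousness)) +
  geom_density() +
  facet_grid(type ~ color) +
  # One label per panel, anchored in the top-right corner
  geom_text(data = counts, aes(x = Inf, y = Inf, label = label),
            hjust = 1.1, vjust = 1.5, inherit.aes = FALSE)
```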

Warning message: font family not found in Windows font database

The following code snippet,

survey_set %>%
  group_by(Country, TabsSpaces) %>%
  summarize(MedianSalary = median(Salary)) %>%
  ungroup() %>%
  mutate(Country = factor(Country, countries)) %>%
  ggplot(aes(TabsSpaces, MedianSalary, fill = TabsSpaces)) +
  geom_col(alpha = 0.9, show.legend = FALSE) +
  theme(strip.text.x = element_text(size = 11, family = "Roboto-Bold")) +
  facet_wrap(~ Country, scales = "free") +
  labs(x = '"Do you use tabs or spaces?"',
       y = "Median annual salary (US Dollars)",
       title = "Salary differences between developers who use tabs and spaces",
       subtitle = paste("From", comma(nrow(survey_set)), "respondents in the 2017 Developer Survey results")) +
  scale_y_continuous(labels = dollar_format(), expand = c(0,0))

gives warning messages like these on execution:

Warning messages:
1: In grid.Call(L_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not found in Windows font database
2: In grid.Call(L_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not found in Windows font database
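On Windows, a font family generally has to be registered with the graphics device before it can be used, and the weight is set through face rather than inside the family name. The sketch below assumes a font named Roboto is installed on the system and falls back to the generic "sans" family elsewhere. The mpg data stands in for the survey data, which is not available here.

```r
library(ggplot2)

if (.Platform$OS.type == "windows") {
  # Register the installed font under a family name (assumes Roboto is installed)
  windowsFonts(Roboto = windowsFont("Roboto"))
  strip_family <- "Roboto"
} else {
  strip_family <- "sans"  # generic family known to every device
}

p_font <- ggplot(mpg, aes(drv, hwy, fill = drv)) +
  geom_boxplot(show.legend = FALSE) +
  facet_wrap(~ class, scales = "free") +
  # Family names the typeface; face = "bold" requests the bold weight
  theme(strip.text.x = element_text(size = 11, family = strip_family, face = "bold"))
```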

How to put the title of the legend on top, whereas the values should be distributed horizontally ?

This question was initially asked on SO

Question

I am trying to put the title of the legend on top, whereas the values are distributed horizontally but I cannot.

library(tidyverse)

df1 <- data.frame(
  sex = factor(c("Female","Female","Male","Male")),
  time = factor(c("Lunch","Dinner","Lunch","Dinner"), levels=c("Lunch","Dinner")),
  total_bill = c(13.53, 16.81, 16.24, 17.42))

ggplot(data=df1, 
       aes(x=time, y=total_bill, group=sex, shape=sex, colour=sex)) + 
  geom_line() + 
  geom_point() +
  theme_bw() +
  theme(
    legend.direction = "horizontal"
  ) +     
  scale_color_manual(values=c("#0000CC", "#CC0000"),
                     name = 'Gender')
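A possible solution is to set the title position inside guide_legend(). Because both colour and shape are mapped to sex, both guides need the same title and settings so that they merge into one legend. Placing the legend at the bottom is an extra assumption; recent ggplot2 releases also offer a theme setting for the title position and may report title.position as deprecated.

```r
library(ggplot2)

df1 <- data.frame(
  sex = factor(c("Female", "Female", "Male", "Male")),
  time = factor(c("Lunch", "Dinner", "Lunch", "Dinner"), levels = c("Lunch", "Dinner")),
  total_bill = c(13.53, 16.81, 16.24, 17.42)
)

p_legend <- ggplot(df1, aes(x = time, y = total_bill, group = sex,
                            shape = sex, colour = sex)) +
  geom_line() +
  geom_point() +
  theme_bw() +
  theme(legend.position = "bottom", legend.direction = "horizontal") +
  scale_colour_manual(values = c("#0000CC", "#CC0000")) +
  # Same title on both scales so the two guides merge into one legend
  labs(colour = "Gender", shape = "Gender") +
  # Title above the keys, keys laid out horizontally
  guides(colour = guide_legend(title.position = "top"),
         shape  = guide_legend(title.position = "top"))
```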

Coloring ggplot2 axis tick labels based on data displayed at axis tick positions

How to make a column graph that has the y-axis text mirror the fill color in the graph itself?

# create dummy data
top_bot_5_both <- structure(list(
  name = structure(c(20L, 19L, 18L, 17L, 16L, 15L, 14L, 13L, 12L, 
                     11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), 
                   .Label = c("Michele Bachmann", "Donald Trump", "Ted Cruz", 
                              "Newt Gingrich", "Rick Santorum", "Terry McAuliffe", 
                              "Nancy Pelosi", "Debbie Wasserman Schultz", 
                              "Tammy Baldwin", "Joe Biden", "Rand Paul", "Jeb Bush", 
                              "John Kasich", "Barack Obama", "Bill Clinton", 
                              "Hillary Clinton", "Nathan Deal", "Tim Kaine", 
                              "Rob Portman", "Sherrod Brown"), 
                   class = "factor"),
  Party = c("Democratic", "Republican", "Democratic", "Republican", "Democratic", 
            "Democratic", "Democratic", "Republican", "Republican", "Republican", 
            "Republican", "Republican", "Republican", "Republican", "Republican", 
            "Democratic", "Democratic", "Democratic", "Democratic", "Democratic"), 
  total_ratings = c(35L, 48L, 51L, 49L, 296L, 41L, 599L, 64L, 80L, 55L, 
                    61L, 472L, 123L, 82L, 61L, 31L, 35L, 48L, 33L, 75L), 
  sum = c(22, 29, 21, 18, 96, 12, 172, 16, 18, 2, -86, -525, -94, -57, 
          -42, -19, -14, -7, -4, -1), 
  score = c(0.628571428571429, 0.604166666666667, 0.411764705882353, 
            0.36734693877551, 0.324324324324324, 0.292682926829268, 
            0.287145242070117, 0.25, 0.225, 0.0363636363636364, -1.40983606557377, 
            -1.11228813559322, -0.764227642276423, -0.695121951219512, 
            -0.688524590163934, -0.612903225806452, -0.4, -0.145833333333333, 
            -0.121212121212121, -0.0133333333333333)), 
  class = c("tbl_df", "tbl", "data.frame"), 
  row.names = c(NA, -20L), 
  .Names = c("name", "Party", "total_ratings", "sum", "score"))
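One approach that is often used, although ggplot2 warns that vectorised input to element_text() is not officially supported, is to pass a colour vector ordered like the axis levels. The sketch uses a small made-up data frame in place of top_bot_5_both; the names and scores are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for top_bot_5_both
df <- data.frame(
  name  = c("A", "B", "C", "D"),
  Party = c("Democratic", "Republican", "Democratic", "Republican"),
  score = c(0.6, 0.4, -0.5, -1.1),
  stringsAsFactors = FALSE
)

# Order the bars by score; the factor levels define the axis order
df$name <- factor(df$name, levels = df$name[order(df$score)])

party_cols <- c(Democratic = "blue", Republican = "red")

# One colour per axis label, in the same order as the factor levels
axis_cols <- party_cols[as.character(df$Party[match(levels(df$name), df$name)])]

p_axis <- ggplot(df, aes(x = name, y = score, fill = Party)) +
  geom_col() +
  coord_flip() +
  scale_fill_manual(values = party_cols) +
  # After coord_flip() the names are drawn on the y axis
  theme(axis.text.y = element_text(colour = unname(axis_cols)))
```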

Error in pmin(y, 0) : object 'y' not found when plotting a bar plot

When creating a bar plot, if the code is written as

p1 <- ggplot(data = xapi.data, x=gender, y=raisedhands) + geom_bar(stat = "identity") + coord_flip() + ylab("Y LABEL") + xlab("X LABEL") + ggtitle("TITLE OF THE FIGURE")

the plot object is created without complaint, but drawing it fails with an error like:

Error in pmin(y, 0) : object 'y' not found
In addition: Warning messages:
1: In min(x, na.rm = na.rm) : no non-missing arguments to min; returning Inf
2: In max(x, na.rm = na.rm) : no non-missing arguments to max; returning -Inf
3: In min(diff(sort(x))) : no non-missing arguments to min; returning Inf
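The x and y arguments are passed directly to ggplot() instead of being wrapped in aes(), so no aesthetics are mapped. A corrected sketch follows; since xapi.data is not available here, a tiny made-up data frame with the same two column names is used.

```r
library(ggplot2)

# Hypothetical stand-in for xapi.data
xapi_demo <- data.frame(gender = c("F", "M", "F", "M"),
                        raisedhands = c(10, 20, 30, 5))

# Column mappings belong inside aes()
p1 <- ggplot(data = xapi_demo, aes(x = gender, y = raisedhands)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  ylab("Y LABEL") +
  xlab("X LABEL") +
  ggtitle("TITLE OF THE FIGURE")
```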

Error when making a simple boxplot in ggplot2

Given the code,

attach(mpg)
ggplot(data = mpg, aes(y= displ))+
  geom_boxplot()

generates the following error;

Error: stat_boxplot requires the following missing aesthetics: y
In addition: Warning message:
Continuous x aesthetic -- did you forget aes(group=...)? 
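The message complains about a missing y together with a continuous x, which is what older ggplot2 versions reported when a boxplot had only one numeric aesthetic mapped. A way that works across versions is to supply a discrete x and a continuous y explicitly. Using class as the grouping column is just an example.

```r
library(ggplot2)

# One box per vehicle class: discrete x, continuous y
p_box <- ggplot(mpg, aes(x = class, y = displ)) +
  geom_boxplot()

# A single box for the whole column: use a constant as the discrete x
p_box_single <- ggplot(mpg, aes(x = "", y = displ)) +
  geom_boxplot() +
  xlab(NULL)
```

attach() is not needed, because ggplot() looks the columns up in the data argument.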
