ggplot2-book's Introduction

ggplot2 book

This is the code and text behind the book ggplot2: Elegant Graphics for Data Analysis. Please help us make it better by contributing!

Installing dependencies

Install the R packages used by the book with:

# install.packages("devtools")
devtools::install_deps()

Build the book

In RStudio, press Cmd/Ctrl + Shift + B. Or run:

bookdown::render_book("index.Rmd")

ggplot2-book's People

Contributors

abinashbunty, alex-trueman, cpsievert, davechilders, dgromer, djmurphy420, djnavarro, excelsior7, gokceneraslan, hadley, howardbaik, jimhester, jonas-hag, lindbrook, marher90, mmhamdy, pitmonticone, pursuitofdatascience, robinlovelace, seaaan, shitao5, sidrahussain, statsrhian, thomasp85, tklebel, tomjemmett, xiaochi-liu, yiluheihei, yutannihilation, zekiakyol

ggplot2-book's Issues

Fix reference placement

How can we get references to appear at the end of the book instead of at the end of each chapter?

Starting with qplot makes learning ggplot2 harder

I've read the original edition cover to cover a few times. Not kidding. I feel that starting with qplot makes it harder to learn ggplot because much of qplot doesn't transfer. Someone advised me to pretend that qplot doesn't exist and to ignore the examples in Chapter 2 when trying to make graphs of a certain type. My learning really took off after that. I also observe that qplot is not that popular in online code repos, which suggests to me that many people have learned to ignore it. I would say put it at the end as a topic for advanced users who want the occasional shortcut.

ggplot2 book; pitfall for the unwary reader

On page 22 (printed), faceting is introduced and contrasted with aesthetics. The example is given of the diamonds data, which has the aesthetics "colour" and "shape" and on the following page the examples of two qplots faceted by color ~ geom ... are given

qplot(carat, data = diamonds, facets = color ~ geom ...

I spent quite a bit of time thinking that facets was a new aesthetic for diamonds because that is one of the ways that diamond gems are classified. It was finally a big forehead slap, but still.

BTW: the aesthetic should have the Yankee spelling color
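
To make the distinction concrete, here is a minimal sketch in current ggplot2 syntax. It is not the book's original qplot() call, which is truncated above. Here `color` is a column of the diamonds data set (the gem's colour grade), not an aesthetic, and faceting by it splits the plot into one panel per grade. The bin width is an arbitrary choice for illustration.

```r
library(ggplot2)

# `color` is a variable in `diamonds` (colour grade of the gem),
# not the colour aesthetic of the plot.
p <- ggplot(diamonds, aes(carat)) +
  geom_histogram(binwidth = 0.1) +  # bin width chosen arbitrarily
  facet_grid(color ~ .)             # one row of panels per colour grade

p
```

Mapping the same variable to an aesthetic instead, for example `aes(carat, fill = color)`, would keep everything in a single panel.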

Build error: mapply ... eval -> eval -> <Anonymous> -> structure -> package_info

I am getting the following build error during make:

stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
Quitting from lines 112-120 (toolbox.rmd) 
Error: Results must be all atomic, or all data frames
In addition: Warning messages:
1: In loop_apply(n, do.ply) :
  position_stack requires constant width: output may be incorrect
2: In loop_apply(n, do.ply) :
  position_stack requires constant width: output may be incorrect
3: In loop_apply(n, do.ply) :
  position_stack requires constant width: output may be incorrect
4: In loop_apply(n, do.ply) :
  position_stack requires constant width: output may be incorrect
5: In loop_apply(n, do.ply) :
  position_stack requires constant width: output may be incorrect
6: In loop_apply(n, do.ply) :
  position_stack requires constant width: output may be incorrect
7: In loop_apply(n, do.ply) :
  position_fill requires constant width: output may be incorrect
Execution halted
Makefile:32: recipe for target 'book/tex/toolbox.tex' failed
make: *** [book/tex/toolbox.tex] Error 1

I am using R 3.2 with latest CRAN versions of ggplot2 and dplyr.

Build fails

@cpsievert have you seen this before?

LaTeX Warning: Reference `fig:diamond-dim' on page 12 undefined on input line 7
4.

<use  "diagrams/diamond-dimensions.pdf" > [12]

! LaTeX Error: File `figures/qplotqscatter-1' not found.

Environment problems (knitr vs. make)

qplot.rmd works if I knit it in RStudio, but if I build it with the make file, I get an error about the year() function defined in line 430. Any ideas why the environments might be different between the two ways of building?

Think about colour wheel

@cpsievert I think my trigonometry is correct and I've tweaked the display a bit to make it more obvious that the colours are in the right place. They're just not evenly spaced in the way that I expected, which means I need to think about this a bit

Convert latex to markdown

The conversation in #7 is headed in this direction, so I created a new issue.

How do you want to handle figure/table referencing? There doesn't seem to be a supported Markdown equivalent for the \ref{}/\label{} combo, but I found this hack.

I'm leaning toward leaving the LaTeX-specific referencing alone for now and waiting (and hoping) for a better alternative to surface. What do you think?

Create diagrams directory

Move diagram pdfs out of the top-level directory, and update the \includegraphics{} commands. This will make it less likely that we accidentally ignore (or fail to ignore) the wrong file.

Build error

Hello, I am getting the following error on Mac Yosemite using RStudio and R (version 3.2.0). The dev versions of ggplot2 and dplyr are installed apart from the other dependencies. The system has MacTex distribution with the Inconsolata font and the hyperref package installed.

Error in mapply(bumper, bump_n, chap_ord[chap_id], SIMPLIFY = FALSE) :
zero-length inputs cannot be mixed with those of non-zero length
Execution halted
make: *** [book/tex/extending.tex] Error 1

Exited with status 2.

Eliminate use of qplot

replace with sidebar (from @garrettgman):

I think it would only be worth a sidebar: "In some cases, you will want to create a quick, simple plot that uses all of the ggplot2 defaults. In these cases you may prefer to use qplot() over ggplot(). qplot() lets you define a plot in a single call. To use it, provide a set of aesthetics, a data set, and a geom.

Example.

It is possible to use qplot() to access all of the customizability of ggplot2, but I do not recommend it. If you find yourself making a complicated graph, such as one that uses different aesthetics for different layers, or manually sets visual properties, then you should use ggplot()."

This would also put qplot() in the index and give a "response" to people who look for qplot() in the book and would be surprised if they didn't find it. I think there may be a few.
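
As a rough illustration of what the sidebar's example could look like, the sketch below draws the same scatterplot both ways. The choice of the `mpg` data set and its `displ` and `hwy` columns is an assumption made here, not something proposed in the issue. Note that recent ggplot2 releases mark qplot() as deprecated, so the first call may emit a warning.

```r
library(ggplot2)

# Quick plot: aesthetics, data set, and geom in a single call.
quick <- qplot(displ, hwy, data = mpg, geom = "point")

# The equivalent ggplot() call, which extends more naturally to
# multiple layers and manually set visual properties.
full <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()
```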

git workflow

Should I always push to master? Or do you want to set up some type of branching workflow?

File structure

What do you think about putting each chapter into its own folder? That way we won't have to worry about chunk labels causing naming conflicts for figure files.
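
A possible alternative to separate folders, sketched under the assumption that every chapter is knitted with knitr: give each chapter its own `fig.path` so that identical chunk labels in different chapters write to different files. The helper `set_chapter_fig_path()` and the `figures/` directory are hypothetical names.

```r
library(knitr)

# Hypothetical helper: send all figures for one chapter to its own
# subdirectory, e.g. figures/qplot/<chunk-label>-1.pdf
set_chapter_fig_path <- function(chapter) {
  opts_chunk$set(fig.path = paste0("figures/", chapter, "/"))
}

set_chapter_fig_path("qplot")
```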

fail to build pdf

I'm on a Mac. It took me hours to install all the packages, fonts, and whatever else the error messages said was missing (and the error messages were different every time), but I still failed to build a pdf. Is it possible to upload a pdf version? Thanks!

Use native R build?

@hadley it looks like a native R build fails since devtools::install_deps(dependencies = TRUE) fails when there is no DESCRIPTION file.

Maybe it makes sense to add a check for a DESCRIPTION, and if none exists, exit without error (to support projects such as this one)? cc @craigcitro

In the meantime, I'll go back to the old r-travis approach to avoid calling devtools::install_deps()

BTW, I think we've been displaying a false positive build, which I identified in ee3d5cb. I'll get on that right away.
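
The check suggested above could look roughly like the following. This is only a sketch of the idea at the R level, not how the Travis build script is actually implemented. `install_deps_if_described()` is a hypothetical wrapper, and the installer is passed in as an argument so the check can be tried without installing anything.

```r
# Hypothetical wrapper: only install dependencies when a DESCRIPTION file
# exists; otherwise return quietly instead of failing the build.
install_deps_if_described <- function(
    path = ".",
    installer = function(p) devtools::install_deps(p, dependencies = TRUE)) {
  if (!file.exists(file.path(path, "DESCRIPTION"))) {
    message("No DESCRIPTION file in '", path, "'; skipping dependency install.")
    return(invisible(FALSE))
  }
  installer(path)
  invisible(TRUE)
}
```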

Discarding values

From @ijlyttle:

At the bottom of page 96, there is a description of limits.

Any value not in the domain of the scale is discarded ... This discarding occurs before statistics are calculated.

I made a star, and a note that this is "very important". Until I understood this point I was (frustratingly) getting results that I did not understand. Perhaps this point could be given more prominence.
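
A small sketch of why this matters, using a made-up five-value data set. Setting limits on the scale drops the out-of-range value before the boxplot statistics are computed, whereas zooming with coord_cartesian() (the usual alternative, not necessarily what page 96 discusses) keeps all values in the calculation.

```r
library(ggplot2)

df <- data.frame(y = c(1, 2, 3, 4, 100))  # toy data with one extreme value
p <- ggplot(df, aes(x = "all", y = y)) + geom_boxplot()

# Scale limits: 100 is outside the domain, so it is discarded *before*
# the statistics are calculated (ggplot2 warns about the removed row).
clipped <- p + scale_y_continuous(limits = c(0, 10))

# Coordinate limits: all five values feed the statistics; only the
# visible region changes.
zoomed <- p + coord_cartesian(ylim = c(0, 10))

layer_data(clipped)$middle  # median of 1, 2, 3, 4
layer_data(zoomed)$middle   # median of 1, 2, 3, 4, 100
```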

Add colophon to introduction

Something like this:

## Colophon

This book was written in [Rmarkdown](http://rmarkdown.rstudio.com/) inside 
[RStudio](http://www.rstudio.com/ide/). [knitr](http://yihui.name/knitr/) and 
[pandoc](http://johnmacfarlane.net/pandoc/) converted the raw Rmarkdown to html and pdf. 
The [website](http://adv-r.had.co.nz) was made with [jekyll](http://jekyllrb.com/), styled with 
[bootstrap](http://getbootstrap.com/), and automatically published to Amazon's 
[S3](http://aws.amazon.com/s3/) by [travis-ci](https://travis-ci.org/). The complete 
source is available from [github](https://github.com/hadley/r-pkgs).

This version of the book was built with:

    devtools::session_info()

Push pdf results back to GitHub

I don't think it'd be too hard to push the final pdf back to a different branch of this repo after it's done building on Travis. Is this something you'd like to see, @hadley?

Turn caching on?

It would be useful to enable caching when building individual chapters
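
With knitr this could be as simple as the following sketch. The cache directory name is an assumption, and where the call should live (each chapter or a shared setup file) is left open.

```r
library(knitr)

# Cache chunk results so that unchanged chunks are not re-run the next
# time the chapter is built. "_cache/" is a made-up directory name.
opts_chunk$set(cache = TRUE, cache.path = "_cache/")
```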

Use lubridate?

From @ijlyttle:

Bottom of page 101:

plot <- qplot(date, psavert, data = economics, geom = "line") + 
  ylab("Personal savings rate") + 
  geom_hline(yintercept = 0, colour = "grey50")

plot

plot + scale_x_date(breaks = "10 years")
plot + scale_x_date(
  limits = as.Date(c("2004-01-01", "2005-01-01")),
  labels = date_format("%Y-%m-%d")  
)

This brings up another point that I'll come back to later. Since the book's publication in 2009, lubridate (not to mention dplyr) has been introduced. Is there a plan to update the examples accordingly (I know this is a lot of work)? Is it even a good idea?

library(lubridate)

plot + scale_x_date(
  limits = as.Date(c("2004-01-01", "2005-01-01")), # can this be made easier?
  labels = stamp("2004-12-31")  
)

Think about makefile

  • So we could bind to Cmd + Shift + B in RStudio

Here are a couple of example make files from other projects:

HTML_FILES := $(patsubst %.md, %.html, $(shell find . -name '*.md'  \! -name 'navbar.md'))

all: $(HTML_FILES)

$(HTML_FILES): %.html: %.md navbar.html ../template.html
    pandoc '$<' -o '$@' --smart --template ../template.html --include-before navbar.html

navbar.html: navbar.md
    pandoc '$<' -o '$@' --smart

.PHONY: clean
clean:
    $(RM) $(HTML_FILES)

Add exercises

It would be a lot of work, but would considerably add to the value of the book.

Common code options at top of each file

Rather than setting options in render-tex.R (which are only included when turned into a pdf), I think it would be better for each chapter to source a common file at the top. This could set common options, a seed and load common packages.

It also might be better to move the captioner code in here - however I don't quite understand what bumper() is doing.
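
A minimal sketch of what such a shared file might contain. The file name `common.R`, the seed value, and the particular chunk options are all assumptions for illustration, not the repository's actual settings.

```r
# common.R (hypothetical): sourced at the top of every chapter,
# e.g. with source("common.R") in the first chunk.
library(ggplot2)

set.seed(1014)  # arbitrary seed so that random examples are reproducible

# Options shared by every chapter, whatever the output format.
knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE,
  fig.align = "center"
)
```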

Convert chapters to Rmarkdown

Probably need to do this one-by-one, checking as you go. Steps:

  1. Replace \ggplot with ggplot2. Replace \code{x} with \texttt{x}, and \f{g} with \texttt{g}.
  2. pandoc introduction.tex -t markdown -o "introduction.rmd" --atx-headers --no-wrap
  3. Add standard rmarkdown header
  4. Search for <span> and replace with correct tag.
    • Functions should always end in ().
    • Package names appear as is in text
    • Ignore small caps
  5. Check citations and links.
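
Step 1 could be scripted rather than done by hand. Below is a rough sketch in base R, assuming the macros only ever appear in the forms shown in the list above. `convert_macros()` is a hypothetical helper, not part of the repository.

```r
# Hypothetical helper for step 1: rewrite the book's custom LaTeX macros
# so that pandoc sees only standard commands.
convert_macros <- function(tex) {
  tex <- gsub("\\\\ggplot\\b", "ggplot2", tex, perl = TRUE)         # \ggplot  -> ggplot2
  tex <- gsub("\\\\code\\{", "\\\\texttt{", tex, perl = TRUE)       # \code{x} -> \texttt{x}
  tex <- gsub("\\\\f\\{", "\\\\texttt{", tex, perl = TRUE)          # \f{g}    -> \texttt{g}
  tex
}

convert_macros("\\ggplot uses \\code{aes()} and \\f{qplot}")
```

It would be applied to a whole chapter with something like `writeLines(convert_macros(readLines("introduction.tex")), "introduction.tex")` before running pandoc.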

Can't build, Mac OS X 10.10.4, R 3.2.1

I can't build the book using Mac OS X Yosemite 10.10.4 with the latest R-version 3.2.1.
I always get this error:

processing file: duplication.rmd
Quitting from lines 36-48 (duplication.rmd) 
Error in do.call("layer", list(mapping = mapping, data = data, stat = stat,  : 
  could not find function "alpha"
Calls: mapply ... geom_smooth -> <Anonymous> -> <Anonymous> -> do.call
Execution halted
make: *** [book/tex/duplication.tex] Error 1

Exited with status 2.

So I've changed this file to call scales::alpha. But then I run into the next error.

Rscript render-tex.R extending.rmd
Warning message:
Detected a difference in actual and expected chapters.
Error in mapply(bumper, bump_n, chap_ord[chap_id], SIMPLIFY = FALSE) :
  zero-length inputs cannot be mixed with those of non-zero length
Execution halted
make: *** [book/tex/extending.tex] Error 1

I don't see what's wrong. So my question is which versions of R and R-packages do I need to compile the book using Mac OS X Yosemite?

Role of touch in the makefile

@cpsievert can you explain why you're using touch in the makefile? This seems like an anti-pattern to me, because it will force recomputation even when the source file hasn't changed

Translate plyr to dplyr?

Here is a good example where I could update from plyr code to dplyr (from 4.7 Drawing maps):

library(plyr)
library(ggplot2)
ia <- map_data("county", "iowa")
mid_range <- function(x) mean(range(x, na.rm = TRUE))
centres <- ddply(ia, .(subregion), 
  colwise(mid_range, .(lat, long)))
ggplot(ia, aes(long, lat)) + 
  geom_polygon(aes(group = group), 
    fill = NA, colour = "grey60") +
  geom_text(aes(label = subregion), data = centres, 
    size = 2, angle = 45)

to

library(dplyr)
centres2 <-ia %>%
  group_by(subregion) %>%
  summarise(lat = mid_range(lat), long = mid_range(long))

Would you prefer to see these updates done now? Is it OK to leave plyr code if there isn't an elegant dplyr equivalent?

More makefile woes

I'm seeing multiple calls to

Rscript render-tex.R data-transformation.rmd data.rmd introduction.rmd qplot.rmd
Rscript render-tex.R data-transformation.rmd data.rmd introduction.rmd qplot.rmd

Any ideas?

Package missing, and 'binwidth = x' error

Hi guys,

A couple of things, first thanks for posting this, it's great to have a resource like this available for the latest version of ggplot2.

I ran into a couple of issues when trying to make this project on a Fedora 21 workstation machine. First, the package 'readr' was not in the list of R packages to install in the readme.md.

Next, ran into this error which I can't figure out:
stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
Quitting from lines 112-120 (toolbox.rmd)
Error: arguments imply differing number of rows: 133, 102
In addition: Warning messages:
1: In loop_apply(n, do.ply) :
position_stack requires constant width: output may be incorrect
2: In loop_apply(n, do.ply) :
position_stack requires constant width: output may be incorrect
3: In loop_apply(n, do.ply) :
position_stack requires constant width: output may be incorrect
4: In loop_apply(n, do.ply) :
position_stack requires constant width: output may be incorrect
Execution halted
Makefile:32: recipe for target 'book/tex/toolbox.tex' failed
make: *** [book/tex/toolbox.tex] Error 1

Also, it would be pretty handy if in the readme there was also an indication that pandoc and pandoc-citeproc were pre-reqs

Thanks,
~josh

3d graphics

From @ijlyttle:

There is a discussion (on pg 129) including the quote:

one day I hope to add 3d graphics too.

Given that ggplot2 is now in maintenance mode, could/should this passage be rephrased?

I don't know if this is the place for the larger discussion on the dangers of 3d. As I understand the argument, 3d is a great way to hide difficult parts of your data, accidentally or otherwise. I remember one talk Hadley gave that touched on using heuristics, such as size and layering, to give the impression of 3d while still respecting the two dimensions being shown.

I also don't know if this is a good place (or if there is a good place) to talk about some common graphing techniques, such as a 2nd y-axis or 3d, that (by definition) are contrary to the grammar of graphics, and what grammar-of-graphics techniques can be used to convey the same information, perhaps not quite as concisely, but with less ambiguity.

Load xtable in .Rmd documents that use the package

I mentioned this on a different issue, but I thought I would put it here to make sure it got caught. With commit 2c1c0a2, there was a switch from pander::pandoc.table to xtable for creating tables. However, xtable is not loaded into the session in the .Rmd files, and I believe that this prevents any chapter that calls xtable from knitting properly. I would make the changes myself and open a pull request, but when I made the changes something odd happened and make stopped working properly, so I thought I would just file an issue.

Use broom over effects?

In 4.8 Revealing uncertainty, the effects package is used to extract model components. Would you prefer to use broom instead?
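
For reference, this is roughly what extracting model components with broom looks like. The model on `mtcars` is a stand-in chosen here, not the example from section 4.8.

```r
library(broom)

mod <- lm(mpg ~ wt, data = mtcars)  # placeholder model, not the book's

tidy(mod)            # one row per coefficient, with estimate and std.error
glance(mod)          # one-row summary of the whole model
aug <- augment(mod)  # original data plus .fitted, .resid, and similar columns
```

Because each function returns a data frame, the results can be passed straight to ggplot().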

modelling.rmd has an Rscript error when I try to build the book

I will be happy to leave more info if desired; I am not an experienced R user at all. Error is:

Rscript render-tex.R modelling.rmd


processing file: modelling.rmd

Attaching package: 'dplyr'

The following object is masked from 'package:stats':

    filter

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Quitting from lines 176-183 (modelling.rmd)
Error in resid(mod) : object 'mod' not found
Calls: mapply ... lapply -> eval.quoted -> lapply -> FUN -> eval -> resid
In addition: Warning messages:
1: In loop_apply(n, do.ply) :
  Removed 1192 rows containing missing values (geom_path).
2: In loop_apply(n, do.ply) :
  Removed 1192 rows containing missing values (geom_path).
Execution halted
make: *** [book/tex/modelling.tex] Error 1

Here are the offending lines, which look to me like they actually define mod:

```{r, fig.keep="hold"}
abilene <- tx %>% filter(city == "Abilene")
ggplot(abilene, aes(date, log(sales))) + 
  geom_line()

mod <- lm(log(sales) ~ factor(month), data = abilene)
ggplot(abilene, aes(date, resid(mod))) + 
  geom_line()
```

R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin14.3.0 (64-bit)
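
One workaround that sidesteps the question of where `mod` is looked up, offered as a sketch rather than a diagnosis: compute the residuals outside aes() and store them as a column, so the plot refers only to variables in the data frame. This assumes `tx` corresponds to the `txhousing` data set shipped with recent ggplot2 releases. The column name `rel_sales` is made up for illustration.

```r
library(ggplot2)

abilene <- subset(txhousing, city == "Abilene")

# na.exclude keeps the residuals aligned with the rows of `abilene`
# even if some sales values are missing.
mod <- lm(log(sales) ~ factor(month), data = abilene, na.action = na.exclude)
abilene$rel_sales <- resid(mod)

ggplot(abilene, aes(date, rel_sales)) +
  geom_line()
```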