The readxl package is part of the tidyverse, a set of R packages for doing data science. These packages help R users get, clean, analyze and visualize data, and they are purposely designed to work nicely together and help users develop tidy workflows.
I have become a regular user and fan of readxl, an R package that makes it easy to get data out of Excel and into R. Here are a few reasons why:
readxl does one thing well
It can be challenging to navigate the world of R packages—finding a package to do what you want to do with your data, or deciding which package to use, especially if there are many packages that have similar functionality. I like that readxl does one thing—import .xls and .xlsx files—and does it well. Close your eyes and imagine an Excel file with many, many sheets, or with graphs and pivot tables embedded in a sheet with raw data (we have all seen one)—readxl has the functions to import even that Excel data into R. You can specify a sheet, by name or number, or a set of cells, and you can control how R deals with blank cells in an Excel file. All you need and no more.
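As a minimal sketch of those options, the example below reads from datasets.xlsx, an example workbook that ships with readxl (it is not a file from this post). The object names are only illustrative, and the na values shown are an assumption about how blanks might be marked in a real file.

```r
library(readxl)

# Path to an example workbook bundled with readxl
path <- readxl_example("datasets.xlsx")

# Pick a sheet by name or by position (mtcars is the second sheet)
mtcars_by_name <- read_excel(path, sheet = "mtcars")
mtcars_by_pos  <- read_excel(path, sheet = 2)

# Read only a rectangle of cells: a header row plus four data rows
first_cells <- read_excel(path, sheet = "mtcars", range = "A1:C5")

# Treat empty cells and the literal text "NA" as missing values
iris_clean <- read_excel(path, sheet = "iris", na = c("", "NA"))
```

The range argument is what helps when raw data shares a sheet with charts or pivot tables: only the cells you name are read.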
readxl is a bridge for many users
In a perfect world, in my opinion, we would blink and all raw, tabular data would be made available in flat, rectangular, machine-readable files. Tidy data, sigh. However, many students, data analysts and data producers work in organizations and communities (the private sector, academia, governments and non-government groups) with strong ties to the Excel world. Having easy-to-use, reliable tools for bridging R and Excel makes learning easier and faster for new R users.
readxl & Clippy
readxl has a great logo. Hex logos and stickers are fun and kind of a thing for R packages now. I love stickers, and the readxl logo with the paperclip Clippy's sad face makes me smile every time I see it. I was a big Excel user in Clippy's time: Clippy was the Microsoft Office "intelligent" assistant (cough cough) in the late 1990s and early 2000s. While I never really understood the widespread animosity for poor Clippy, I also didn't even notice when the talking paperclip was gone. I like readxl's homage to Clippy, and I think the paperclip assistant deserves to be remembered.
Data import packages are sometimes easy to overlook when gushing about R packages in general. They are outshone by packages that help users really make data sing, like generating beautiful graphics using ggplot2 or turning super messy data into a tidy data frame using tidyr, both so satisfying. The readxl package, however, deserves some limelight too, in my opinion. It is an easy-to-use, reliable data import package for .xls and .xlsx files, creating a bridge to the tidyverse for new R users and for the many data users and communities with ties to the Excel world.
You can install the readxl package with the whole tidyverse suite:
install.packages("tidyverse")
Or just the readxl package itself:
install.packages("readxl")
Either way, you have to load the readxl package explicitly, because it is not one of the core packages attached by library(tidyverse). You can then type ?readxl in the console and navigate to the package index to see the set of functions available for getting that Excel file, tab or section of data into R.
library(readxl)
?readxl
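If you prefer to explore from the console rather than the help pages, something like the following sketch may help. It again uses datasets.xlsx, an example workbook bundled with readxl rather than a file from this post, and the object names are only illustrative.

```r
library(readxl)

# Functions exported by readxl
ls("package:readxl")

# Example workbooks that ship with the package
readxl_example()

# List the sheet (tab) names in one of them
path <- readxl_example("datasets.xlsx")
sheet_names <- excel_sheets(path)

# Read every sheet into a named list of data frames
all_sheets <- lapply(sheet_names, function(s) read_excel(path, sheet = s))
names(all_sheets) <- sheet_names
```

Checking the sheet names with excel_sheets() first is a handy habit for workbooks with many tabs, since you can see what is there before deciding what to import.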
Here are just a few links and resources on readxl, the tidyverse and why to take the bridge to R:
- readxl documentation & vignettes
- tidyverse website
- tidyverse packages on GitHub
- @tidyverse on Twitter
- a few reasons to cross the bridge from Excel to R: scary Excel stories curated by Jenny Bryan