This presentation shows how you can use Amazon Web Services (AWS) in R with the Paws package. The Paws package provides access to 150+ services on AWS.
One use case is to run large, complex analyses on dedicated servers. The example code here runs an R script on a large server which starts on command and stops when done, using AWS Batch.
The example is based on Amazon's blog post, Creating a Simple "Fetch & Run" AWS Batch Job.
This presentation was prepared for the Government Advances in Statistical Programming (GASP!) conference, held on September 23, 2019 in Washington, DC.
- Install Paws
- Make a Docker container (optional)
- Set up AWS Batch
- Run an R job on Batch!
Run `install.packages("paws")` to install from CRAN.
If you are using Linux, you'll need to install development packages for cURL, OpenSSL, and libxml2. In Debian/Ubuntu, install `libcurl4-openssl-dev`, `libssl-dev`, and `libxml2-dev`.
The example also assumes that you have AWS credentials saved in OS environment variables or in a shared credentials file. See this document for more info on authenticating with AWS.
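Before running the example, it can help to confirm which credential source your R session can see. The following is a minimal sketch, not part of the repo: `credential_source` is a hypothetical helper, and it only checks the two locations mentioned above (the standard `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` environment variables and the default `~/.aws/credentials` file). Paws may find credentials in other places that this check ignores.

```r
# Hypothetical helper (not part of Paws): report which of the two credential
# sources described above is visible to this R session.
credential_source <- function(
  env = Sys.getenv(c("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")),
  credentials_file = file.path(path.expand("~"), ".aws", "credentials")
) {
  if (all(nzchar(env))) {
    # Both environment variables are set to non-empty values.
    "environment"
  } else if (file.exists(credentials_file)) {
    # Fall back to the default shared credentials file location.
    "shared credentials file"
  } else {
    "none"
  }
}

message("AWS credentials found in: ", credential_source())
```

If this reports `none`, set the environment variables or create the shared credentials file before continuing.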
Your batch job runs in a Docker container, which is a self-contained environment with an OS and other software, such as R. The example in this repo uses a pre-built Docker container with R installed, which is hosted on Docker Hub.
You can make your own Docker container using the `Dockerfile` in the `docker` folder. The Creating a Simple "Fetch & Run" AWS Batch Job blog post shows how to do that.
To use Batch, you must set up a compute environment (e.g. max CPUs), a job queue, and a job definition (e.g. what container to use). See the Batch user guide for more info about what each of these is for.
The script `1_aws_batch_setup.R` will create these for you.
You can also follow the instructions in Creating a Simple "Fetch & Run" AWS Batch Job.
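To make the three pieces concrete, here is a rough sketch of creating them with the Paws Batch client. It is not the contents of `1_aws_batch_setup.R`: `setup_batch` and `example_cfg` are hypothetical, and every name, ARN, subnet, security group, and image in `example_cfg` is a placeholder you would replace with values from your own account.

```r
# Placeholder configuration: none of these values refer to real resources.
example_cfg <- list(
  compute_env     = "r-compute-env",
  job_queue       = "r-job-queue",
  job_definition  = "r-fetch-and-run",
  max_vcpus       = 16,
  vcpus           = 2,
  memory_mib      = 4096,
  subnets         = list("subnet-PLACEHOLDER"),
  security_groups = list("sg-PLACEHOLDER"),
  instance_role   = "PLACEHOLDER-instance-role",
  service_role    = "PLACEHOLDER-service-role-arn",
  job_role        = "PLACEHOLDER-job-role-arn",
  image           = "PLACEHOLDER/r-image"
)

# `batch` is a Paws Batch client, e.g. the result of paws::batch().
setup_batch <- function(batch, cfg) {
  # 1. Compute environment: what kind of servers to use and the vCPU ceiling.
  batch$create_compute_environment(
    computeEnvironmentName = cfg$compute_env,
    type = "MANAGED",
    state = "ENABLED",
    computeResources = list(
      type = "EC2",
      minvCpus = 0,  # allow scaling down to zero when no jobs are waiting
      maxvCpus = cfg$max_vcpus,
      instanceTypes = list("optimal"),
      subnets = cfg$subnets,
      securityGroupIds = cfg$security_groups,
      instanceRole = cfg$instance_role
    ),
    serviceRole = cfg$service_role
  )
  # 2. Job queue: where submitted jobs wait for capacity.
  # Note: the compute environment must be valid before a queue can use it,
  # so a real script would likely wait or poll between steps 1 and 2.
  batch$create_job_queue(
    jobQueueName = cfg$job_queue,
    state = "ENABLED",
    priority = 1,
    computeEnvironmentOrder = list(
      list(order = 1, computeEnvironment = cfg$compute_env)
    )
  )
  # 3. Job definition: which container to run and with what resources.
  batch$register_job_definition(
    jobDefinitionName = cfg$job_definition,
    type = "container",
    containerProperties = list(
      image = cfg$image,
      vcpus = cfg$vcpus,
      memory = cfg$memory_mib,
      jobRoleArn = cfg$job_role
    )
  )
}

# Usage, once the placeholders are filled in:
# setup_batch(paws::batch(), example_cfg)
```

Passing the client in as an argument keeps the function easy to try against a stand-in object before touching a real AWS account.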
The example in `2_run_batch_job.R` copies an R script to an S3 file storage bucket, then runs an AWS Batch job which fetches the R script and runs it.
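The copy-then-submit flow might look roughly like the sketch below. This is an illustration, not the code in `2_run_batch_job.R`: `run_script_on_batch` is a hypothetical helper, the bucket, queue, and job definition names are placeholders, and the `BATCH_FILE_TYPE` / `BATCH_FILE_S3_URL` environment variables are the ones used by the fetch-and-run script in Amazon's blog post, which this example assumes the container also reads. Any command or arguments the container needs are left out.

```r
# `s3` and `batch` are Paws clients, e.g. paws::s3() and paws::batch().
run_script_on_batch <- function(s3, batch, script_path, bucket,
                                job_queue, job_definition,
                                job_name = "r-fetch-and-run") {
  key <- basename(script_path)

  # Step 1: copy the R script to the S3 bucket as raw bytes.
  s3$put_object(
    Bucket = bucket,
    Key = key,
    Body = readBin(script_path, "raw", n = file.size(script_path))
  )

  # Step 2: submit the job, telling the container where to fetch the script.
  # The variable names are assumed from the AWS "Fetch & Run" blog post.
  batch$submit_job(
    jobName = job_name,
    jobQueue = job_queue,
    jobDefinition = job_definition,
    containerOverrides = list(
      environment = list(
        list(name = "BATCH_FILE_TYPE", value = "script"),
        list(name = "BATCH_FILE_S3_URL",
             value = sprintf("s3://%s/%s", bucket, key))
      )
    )
  )
}

# Usage with placeholder names:
# job <- run_script_on_batch(paws::s3(), paws::batch(), "analysis.R",
#                            "my-bucket", "r-job-queue", "r-fetch-and-run")
```

The function returns whatever `submit_job` returns, which includes the job ID you can use to check on the job later.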
- `install.packages("paws")`: Install Paws from CRAN.
- Paws home page: See online documentation.
- GitHub: See getting started guide and examples; submit issues.