pgdump-aws-lambda

An AWS Lambda function that runs pg_dump and streams the output to S3.

It can be configured to run periodically using CloudWatch events.

Quick start

  1. Create an AWS Lambda function:

    • Author from scratch
    • Runtime: Node.js 16.x
    • Architecture: x86_64
  2. tab "Code" -> "Upload from" -> ".zip file":

    • Upload (pgdump-aws-lambda.zip)
    • tab "Configuration" -> "General Configuration" -> "Edit"
      • Timeout: 15 minutes
      • Edit the role and attach the policy "AmazonS3FullAccess"
    • Save
  3. Give your Lambda function permission to write to S3:

    • tab "Configuration" -> "Permissions"
    • click the existing Execution role
    • "Add permissions" -> "Attach policies"
    • select "AmazonS3FullAccess" and click "Attach policies"
  4. Test

    • Create new test event, e.g.:
    {
        "PGDATABASE": "dbname",
        "PGUSER": "postgres",
        "PGPASSWORD": "password",
        "PGHOST": "host",
        "S3_BUCKET": "db-backups",
        "ROOT": "hourly-backups"
    }
    • Test and check the output
  5. Create a CloudWatch rule:

    • Event Source: Schedule -> Fixed rate of 1 hour
    • Targets: Lambda Function (the one created in step #1)
    • Configure input -> Constant (JSON text) and paste your config (as per previous step)

File Naming

This function will store your backup under the following S3 key:

s3://${S3_BUCKET}${ROOT}/YYYY-MM-DD/<backup-file>
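
As a rough illustration of the key layout above, the sketch below builds the object key from a root folder, a per-day folder, and a file name. The function name `buildBackupKey`, the use of UTC for the day folder, and the `/` placed after the root are assumptions made for this example, not confirmed behaviour of the function.

```javascript
// Hypothetical helper, not taken from the project source.
// Builds "<root>/<YYYY-MM-DD>/<fileName>" for a given point in time.
function buildBackupKey(root, fileName, now = new Date()) {
    // toISOString() is always UTC; the first 10 characters are YYYY-MM-DD.
    // Using UTC here is an assumption of this sketch.
    const day = now.toISOString().slice(0, 10)
    return `${root}/${day}/${fileName}`
}
```

Called with "hourly-backups" as the root, the key starts with hourly-backups/ followed by the day folder, so all backups taken on the same day share one prefix.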

AWS Firewall

  • If you run the Lambda function outside a VPC, you must enable public access to your database instance, because a non-VPC Lambda function executes on the public internet.
  • If you run the Lambda function inside a VPC, you must allow access from the Lambda security group to your database instance. You must also either add a NAT gateway (chargeable) to your VPC so the Lambda can connect to S3 over the internet, or add an S3 VPC endpoint (free) and allow traffic to the appropriate S3 prefix list.

Encryption

You can add an encryption key to your event, e.g.

{
    "PGDATABASE": "dbname",
    "PGUSER": "postgres",
    "PGPASSWORD": "password",
    "PGHOST": "host",
    "S3_BUCKET": "db-backups",
    "ROOT": "hourly-backups",
    "ENCRYPT_KEY": "c0d71d7ae094bdde1ef60db8503079ce615e71644133dc22e9686dc7216de8d0"
}

The key should be exactly 64 hex characters (32 bytes).
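
A minimal sketch of how such a key could be checked before use. The helper name `parseEncryptKey` and the error message are hypothetical; the function itself may validate the key differently.

```javascript
// Hypothetical helper, not taken from the project source.
// Accepts only a string of exactly 64 hex characters and returns the raw key.
function parseEncryptKey(hexKey) {
    if (typeof hexKey !== 'string' || !/^[0-9a-fA-F]{64}$/.test(hexKey)) {
        throw new Error('ENCRYPT_KEY must be exactly 64 hex characters')
    }
    // 64 hex characters decode to 32 bytes, the key size of AES-256.
    return Buffer.from(hexKey, 'hex')
}
```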

When this key is present the function will do streaming encryption directly from pg_dump -> S3.

It uses the aes-256-cbc encryption algorithm with a random IV for each backup file. The IV is stored alongside the backup in a separate file with the .iv extension.
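
The scheme described above can be sketched with Node's built-in crypto and stream modules. This is an illustration under assumptions, not the project's actual code: `encryptStream` is a hypothetical name, and `source` and `destination` stand in for the pg_dump output stream and the S3 upload stream.

```javascript
const crypto = require('crypto')
const { pipeline } = require('stream/promises')

// Hypothetical helper, not taken from the project source.
// Pipes `source` through an aes-256-cbc cipher into `destination` and
// returns the hex-encoded IV so the caller can store it next to the backup.
async function encryptStream(source, destination, hexKey) {
    const key = Buffer.from(hexKey, 'hex') // 32 bytes for AES-256
    const iv = crypto.randomBytes(16) // fresh random IV, one AES block long
    const cipher = crypto.createCipheriv('aes-256-cbc', key, iv)
    // Data is encrypted chunk by chunk, so the dump is never held in memory.
    await pipeline(source, cipher, destination)
    return iv.toString('hex')
}
```

Because the cipher is a transform stream, the encrypted bytes reach the destination as they are produced. The IV is not secret, but it is required for decryption, which is why it has to be kept with the backup.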

You can decrypt such a backup with the following bash command:

openssl enc -aes-256-cbc -d \
-in <backup-file> \
-out <decrypted-file> \
-K c0d71d7ae094bdde1ef60db8503079ce615e71644133dc22e9686dc7216de8d0 \
-iv $(< <backup-file>.iv)

IAM-based Postgres authentication

Your context may require that you use IAM-based authentication to log into the Postgres service. Support for this can be enabled by making your CloudWatch event look like this:

{
    "PGDATABASE": "dbname",
    "PGUSER": "postgres",
    "PGHOST": "host",
    "S3_BUCKET": "db-backups",
    "ROOT": "hourly-backups",
    "USE_IAM_AUTH": true
}

If you supply USE_IAM_AUTH with a value of true, the PGPASSWORD var may be omitted in the CloudWatch event. If you still provide it, it will be ignored.
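
The rule above can be sketched as a small decision function. Both `resolvePassword` and the `getIamToken` callback are hypothetical names used for illustration; how the function actually obtains an IAM token is not shown in this document.

```javascript
// Hypothetical helper, not taken from the project source.
// `getIamToken` is an async callback supplied by the caller that returns a
// temporary auth token for the given event.
async function resolvePassword(event, getIamToken) {
    if (event.USE_IAM_AUTH === true) {
        // Any PGPASSWORD in the event is ignored in this branch.
        return getIamToken(event)
    }
    return event.PGPASSWORD
}
```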

SecretsManager-based Postgres authentication

If you prefer not to send DB details/credentials in the event parameters, you can store them in SecretsManager and provide only the SecretId. The function will then fetch your DB details/credentials from the secret value.

NOTE: the execution role for the Lambda function must have access to GetSecretValue for the given secret.

Support for this can be enabled by setting SECRETS_MANAGER_SECRET_ID, so that your CloudWatch event looks like this:

{
    "SECRETS_MANAGER_SECRET_ID": "my/secret/id",
    "S3_BUCKET": "db-backups",
    "ROOT": "hourly-backups"
}

If you supply SECRETS_MANAGER_SECRET_ID, you can omit the 'PG*' keys; they will be fetched from your SecretsManager secret value instead, with the following mapping:

Secret Value    PG-Key
username        PGUSER
password        PGPASSWORD
dbname          PGDATABASE
host            PGHOST
port            PGPORT

You can override any PG* key in your event, as event parameters take precedence over secret values.
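
A minimal sketch of the mapping and precedence described above, assuming the secret value has already been fetched and parsed into an object. The names `SECRET_TO_PG` and `mergeSecretIntoEvent` are hypothetical.

```javascript
// Mapping from secret fields to PG* keys, as listed above.
const SECRET_TO_PG = {
    username: 'PGUSER',
    password: 'PGPASSWORD',
    dbname: 'PGDATABASE',
    host: 'PGHOST',
    port: 'PGPORT'
}

// Hypothetical helper, not taken from the project source.
// Returns a new config in which PG* keys come from the secret unless the
// event already provides them.
function mergeSecretIntoEvent(event, secret) {
    const fromSecret = {}
    for (const [secretKey, pgKey] of Object.entries(SECRET_TO_PG)) {
        if (secret[secretKey] !== undefined) {
            fromSecret[pgKey] = secret[secretKey]
        }
    }
    // Spreading the event last lets event parameters win over secret values.
    return { ...fromSecret, ...event }
}
```

For example, an event that sets only PGHOST keeps its own host while the user, password, database, and port still come from the secret.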

Developer

Bundling a new pg_dump binary

  1. Launch an EC2 instance with the Amazon Linux 2 AMI
  2. Connect via SSH and:
# install postgres 15
sudo amazon-linux-extras install epel

sudo tee /etc/yum.repos.d/pgdg.repo<<EOF
[pgdg15]
name=PostgreSQL 15 for RHEL/CentOS 7 - x86_64
baseurl=https://download.postgresql.org/pub/repos/yum/15/redhat/rhel-7-x86_64
enabled=1
gpgcheck=0
EOF

sudo yum install postgresql15 postgresql15-server

exit

  3. Download the binaries:

scp ec2-user@your-ec2-hostname:/usr/bin/pg_dump ./bin/postgres-15.0/pg_dump
scp ec2-user@your-ec2-hostname:/usr/lib64/{libcrypt.so.1,libnss3.so,libsmime3.so,libssl3.so,libsasl2.so.3,liblber-2.4.so.2,libldap_r-2.4.so.2} ./bin/postgres-15.0/
scp ec2-user@your-ec2-hostname:/usr/pgsql-15/lib/libpq.so.5 ./bin/postgres-15.0/libpq.so.5
  4. To use the new postgres binary pass PGDUMP_PATH in the event:
{
    "PGDUMP_PATH": "bin/postgres-15.0"
}
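
As a sketch of how PGDUMP_PATH might be turned into the path of the executable: the directory layout follows the scp commands above, where pg_dump sits directly inside the versioned folder. The helper name `resolvePgDumpPath` and the `defaultDir` parameter are assumptions for illustration only.

```javascript
const path = require('path')

// Hypothetical helper, not taken from the project source.
// Uses the event's PGDUMP_PATH when present, otherwise a caller-supplied
// default directory, and appends the binary name.
function resolvePgDumpPath(event, defaultDir) {
    const dir = event.PGDUMP_PATH || defaultDir
    return path.join(dir, 'pg_dump')
}
```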

Creating a new function zip

npm run makezip

Contributing

Please submit issues and PRs.
