RKE Backup and Restore Example

Setting up recurring backups for Rancher server is an important part of making Rancher production-ready. Rancher relies on etcd for persisting its configuration and state. Although etcd is a distributed data store, it is not impervious to hardware failure. Consistent backups are your best defense against this type of failure, so this example will help you understand how to set them up in your own instance of Rancher.

Below is an example of automation to produce recurring backups in Rancher, as well as the automation to perform a recovery from backup. First we are going to create an RKE cluster using Terraform and EC2, and then we'll configure the backup job.

Requirements

  • Terraform
  • AWS
  • RKE

If you are on a Mac you can install Terraform and RKE with Homebrew by running: brew install terraform && brew install rke.
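Whichever way you install them, it is worth confirming that both tools are on your PATH before continuing. A minimal sketch, where have is a helper name made up for this example:

```shell
#!/bin/sh
# have is a hypothetical helper: succeeds when the named command is on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

for tool in terraform rke; do
  if have "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: not found on PATH"
  fi
done
```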

Configuring Backup

Step 1 - Start two EC2 Servers

First create a file called terraform.tfvars that contains the following:

ssh_keypair = ""
vpc_id = ""
resource_prefix = "my-name-or-unique-id"

Replace the variables with values that correspond to your environment. These values will be used by Terraform to build EC2 instances in your AWS account. You should also have a set of AWS credentials defined per the Terraform module's instructions.
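Before running Terraform, it can help to confirm that credentials are actually visible to your shell. The sketch below assumes credentials are supplied through the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables; the module's instructions may call for a named profile or a shared credentials file instead, in which case this check does not apply. The function name aws_creds_present is made up for illustration.

```shell
#!/bin/sh
# Assumption: credentials are supplied through the standard AWS environment
# variables. aws_creds_present is a hypothetical helper, not part of any tool.
aws_creds_present() {
  [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]
}

if aws_creds_present; then
  echo "AWS credentials found in the environment"
else
  echo "AWS credentials missing: export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY"
fi
```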

Then run:

terraform init

and subsequently

terraform apply

You should now have two Ubuntu servers running in EC2. Take note of their IP addresses for the next step.

Step 2 - Configure RKE cluster.yml

Now we are going to configure RKE to connect to these servers to build a Kubernetes cluster. This is also where we will configure the backup job. Edit the file called cluster.yml in the root of this repo to reflect the correct IP addresses of the two servers you just created.

Next we want to configure the backup settings for the cluster. Our Terraform plan from above created an S3 bucket we can use for this called demo-rke-backup-bucket. Update the section under the key etcd in the cluster.yml to reflect the correct settings for your AWS account. You'll need to provide it with AWS credentials that the automated job can use to upload. We recommend you create a narrow-scoped IAM user profile for this purpose.
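For orientation, the relevant part of cluster.yml might look roughly like the fragment printed by the sketch below. The key names follow RKE's documented backup_config schema as best we understand it; the credential values are placeholders, and the cluster.yml in this repo may use additional keys (for example a region), so treat it as an illustration and not as the file's actual contents.

```shell
#!/bin/sh
# Sketch only: key names follow RKE's documented backup_config schema; the
# credential values are placeholders, and this repo's cluster.yml may differ.
etcd_backup_example=$(cat <<'EOF'
services:
  etcd:
    backup_config:
      interval_hours: 12
      retention: 6
      s3backupconfig:
        access_key: S3_ACCESS_KEY
        secret_key: S3_SECRET_KEY
        bucket_name: demo-rke-backup-bucket
        endpoint: s3.amazonaws.com
EOF
)

# Print the fragment so it can be compared against the real cluster.yml.
printf '%s\n' "$etcd_backup_example"
```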

You'll notice the etcd section of cluster.yml has a couple of parameters pertaining to backup behavior:

      interval_hours: 12
      retention: 6

These settings control how often backups occur (in hours) and how many snapshots are kept in storage. Modify these to suit your needs. An active cluster might benefit from a 6-hour backup interval to capture changes to the state.
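The two values together determine how far back you can restore. As a rough sketch, assuming RKE keeps exactly the configured number of snapshots and prunes the oldest one each time a new one is taken, the oldest retained snapshot is at most about interval_hours multiplied by retention hours old:

```shell
#!/bin/sh
# Values copied from the etcd section of cluster.yml.
interval_hours=12
retention=6

# Upper bound on the age of the oldest snapshot still in storage.
window_hours=$((interval_hours * retention))
echo "Oldest retained snapshot is at most about ${window_hours} hours old"
```

Dropping the interval to 6 hours while leaving retention at 6 halves that window, so you may want to raise retention at the same time.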

Step 3 - Deploying Kubernetes and starting Backup Job

Now we can apply our settings with RKE:

rke up --ssh-agent-auth

This will install Kubernetes on these nodes and configure automated backup. You may need to add the SSH key you are using in AWS to your SSH agent by running ssh-add /path/to/key.

That's it. Now you have an automated backup job running periodically to protect your cluster from data loss.

Configuring Restore

In the event of an etcd failure or corruption, we can use our snapshot to restore the Rancher cluster state. Note that this step is destructive to the existing etcd running in your cluster, so only do this if you want to replace the running state with the data in the backup.

To perform a restore, we'll reuse much of the information from the previous example to construct the restore command:

rke etcd snapshot-restore --config cluster.yml --name snapshot-name \
--s3 --access-key S3_ACCESS_KEY --secret-key S3_SECRET_KEY \
--bucket-name demo-rke-backup-bucket --s3-endpoint s3.amazonaws.com

Be sure to replace "snapshot-name" with the name of an actual snapshot. Also replace S3_ACCESS_KEY and S3_SECRET_KEY with the same keys we used previously to write the snapshots.
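Because the restore overwrites the running etcd data, one option is to wrap the command so that it only runs when explicitly confirmed. The following is a sketch under stated assumptions: run_restore and the CONFIRM variable are hypothetical names invented for this example, and the keys are assumed to be exported as S3_ACCESS_KEY and S3_SECRET_KEY environment variables.

```shell
#!/bin/sh
# Hypothetical wrapper: describes the restore by default and runs it only
# when CONFIRM=yes is set, since the restore replaces the running etcd data.
run_restore() {
  snapshot="$1"
  if [ "${CONFIRM:-no}" != "yes" ]; then
    echo "Dry run: would restore snapshot '${snapshot}' from demo-rke-backup-bucket"
    return 0
  fi
  # Assumption: the keys are exported as environment variables.
  rke etcd snapshot-restore --config cluster.yml --name "$snapshot" \
    --s3 --access-key "${S3_ACCESS_KEY:?set S3_ACCESS_KEY}" \
    --secret-key "${S3_SECRET_KEY:?set S3_SECRET_KEY}" \
    --bucket-name demo-rke-backup-bucket --s3-endpoint s3.amazonaws.com
}

# Without CONFIRM=yes this only prints what would happen.
run_restore "snapshot-name"
```

Once the dry-run output names the snapshot you expect, set CONFIRM=yes in the environment and run the script again to perform the restore.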
