Rsync Snapshot

A Node.js implementation of incremental full system backups using rsync based on rsync - Arch Linux Wiki. Currently only tested on Linux.

See Incremental Backups with rsync for details on environment configuration.

Features

  • Full System Backup
    • Backup of / (or any other folder) with permissions and other attributes preserved
    • Encrypted networked backups over the SSH Protocol
    • Transfer file deltas only (No need to transfer entire files, just the changes)
  • Incremental History
    • Incremental history is stored using hard links, so unchanged files take no extra space
    • Auto deletion of oldest snapshots after specified number of snapshots is exceeded
  • Logging
    • Multiple output modes (json, text and raw rsync output)
    • Log to file
    • Multiple logging levels supported
  • Script Hooks
    • Execute scripts before or after backup

Requirements

  • NodeJS v7.6 or later - async/await is used in codebase
  • Rsync must be installed on the client and the server
  • When backing up over the network, one machine (the client) must have passwordless (public key) SSH access to the other (the server)
    • This script is designed to be run from the machine data is being backed up from (the client)
    • This script requires root access on the client to back up / and root access on the server to be able to set correct file ownership
      • To run this script, the SSH user needs passwordless sudo for rsync, rm and mv. This must be manually configured on the server. See Editing /etc/sudoers for details
      • If root is not used on server all files will be owned by the user logged in via ssh which will lead to errors on restore
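
The Node.js version requirement above can be checked before a scheduled run. This is a minimal sketch; the helper name `meetsMinimumNode` is hypothetical and not part of rsync-snapshot.

```javascript
// Hypothetical helper: true when a Node.js version string is at least 7.6.
function meetsMinimumNode(version, minMajor = 7, minMinor = 6) {
  // process.version looks like "v18.17.0"; strip the leading "v" if present.
  const [major, minor] = version.replace(/^v/, '').split('.').map(Number);
  return major > minMajor || (major === minMajor && minor >= minMinor);
}

if (!meetsMinimumNode(process.version)) {
  console.error(`Node.js v7.6 or later is required, found ${process.version}`);
  process.exitCode = 1;
}
```

The check only compares the major and minor numbers, which is enough for the v7.6 threshold named in the requirements.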

Usage

  • Install Globally npm install -g rsync-snapshot
  • Execute the backup
    • Locally rsync-snapshot --dst /media/MyBackup
    • Remotely rsync-snapshot --shell ssh --dst [email protected]:/media/MyBackup
  • It is recommended to schedule this command to run regularly using cron or a similar scheduler
    • When running this script on a schedule, it is best to also update rsync-snapshot regularly
    • Execute npm install -g rsync-snapshot to update to the latest version
  • Backups will be in a folder named by time and date (ex: 2018-02-21.19-06-26)
    • Anything may be appended (manually) to folder names to add user friendly info (ex: 2018-02-21.19-06-26.createdDatabase) as long as .incomplete is not appended (which is reserved for backups in progress, failed or canceled)
    • ls -1 | sort -r can be used to sort backups (most recent to least recent)
  • A symbolic link latest will always point to the most recent backup
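
The folder naming convention above can be sketched as follows. The function names are hypothetical, and formatting the timestamp in UTC is an assumption made for this example rather than confirmed behavior of the package.

```javascript
// Hypothetical helpers; the names are illustrative, not part of rsync-snapshot.
function pad(n) {
  return String(n).padStart(2, '0');
}

// Format a Date as YYYY-MM-DD.HH-MM-SS (using UTC is an assumption here).
function snapshotName(date) {
  const day = [date.getUTCFullYear(), pad(date.getUTCMonth() + 1), pad(date.getUTCDate())].join('-');
  const time = [pad(date.getUTCHours()), pad(date.getUTCMinutes()), pad(date.getUTCSeconds())].join('-');
  return `${day}.${time}`;
}

// Keep only timestamped folders that are not marked .incomplete, newest first.
function completedSnapshots(names) {
  const stamped = /^\d{4}-\d{2}-\d{2}\.\d{2}-\d{2}-\d{2}/;
  return names
    .filter((name) => stamped.test(name) && !name.endsWith('.incomplete'))
    .sort()
    .reverse();
}
```

Because the timestamp is zero padded and ordered from year down to second, a plain string sort matches chronological order, which is why `ls -1 | sort -r` works on the backup directory.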

Breaking Changes

Although this package will attempt to create very few breaking changes, if said changes do occur the Breaking Changes Notification Thread will be updated. Subscribe to the thread in order to be notified of breaking changes.

Parameters

Note: To wrap strings double quotes must be used. Ex: --shell "ssh -p 2222" must be used to specify ssh parameters. Single quotes will not be parsed correctly.

Rsync
  • --src PATH Default: /*
    • Source path to backup
  • --dst PATH
    • Destination folder path for backup
    • If using --shell ssh format is username@server:destinationPath
    • Folders will be created in this directory for incremental backup history
  • --shell SHELL
    • Remote shell to use
    • Note: Remote shell is assumed to be a ssh compatible client if specified
      • Ex: ssh or "ssh -p 2222"
  • --exclude PATH Can be used multiple times
    • Note: Unless --excludeFile is set default exclude list will be used in addition to specified excludes
    • Syntax
      • Include empty folder in destination: /dev/*
      • Do not include folder in destination: /dev
      • Glob style syntax: */steam/steamapps (Will exclude any file/folder ending with /steam/steamapps)
      • See Filter Rules for more information
  • --excludeFile EXCLUDEFILE Default: defaultExclude.txt
    • Similar to --exclude but is passed a text file with an exclude rule per line
    • For exclude rule syntax see --exclude documentation
  • --checksum
    • Change default transfer criteria from comparing modification date and file size to just comparing file size
    • This means the file size being the same is the only requirement needed to generate a checksum and transfer potential file differences
      • Enabling this flag will incur a performance penalty as many more checksums may be generated
  • --accurateProgress
    • Recurse all directories before transferring any files to generate a more accurate file tree
    • Note: This will increase memory usage substantially (10x increase is possible)
  • --noDelete
    • Don't delete existing files in --dst
  • --noDeleteExcludes
    • Don't delete existing files in --dst that are excluded
    • Can be useful for restores where excluded files are hardware specific
  • --rsyncPath PATH Defaults to "sudo rsync"
    • Command to execute rsync
    • If using SSH "sudo rsync" is recommended however it requires additional setup as a password prompt can not be asked (/etc/sudoers file must be modified to set NOPASSWD for rsync, see Rsync over ssh without root)
  • --setRsyncArg ARGUMENT=VALUE Can be used multiple times
    • Specify an argument to be passed to rsync and (optionally) its value
    • Ex: --setRsyncArg checksum or --setRsyncArg block-size=1024
  • --unsetRsyncArg ARGUMENT Can be used multiple times
    • Unset an argument which was already passed to rsync
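
One way to picture how --setRsyncArg and --unsetRsyncArg interact is as edits to a map of rsync long options. The sketch below is an illustration only: `buildRsyncArgs` and the default option set passed to it are hypothetical and do not reflect the package's actual internals.

```javascript
// Hypothetical sketch: merge set/unset requests into rsync-style long options.
function buildRsyncArgs(defaults, setArgs = [], unsetArgs = []) {
  const args = new Map(Object.entries(defaults));
  for (const entry of setArgs) {
    const eq = entry.indexOf('=');
    // "checksum" is a flag without a value, "block-size=1024" carries a value.
    if (eq === -1) args.set(entry, true);
    else args.set(entry.slice(0, eq), entry.slice(eq + 1));
  }
  for (const name of unsetArgs) args.delete(name);
  return [...args].map(([name, value]) => (value === true ? `--${name}` : `--${name}=${value}`));
}

// Example with an assumed default set of { archive, delete }:
console.log(buildRsyncArgs({ archive: true, delete: true }, ['checksum', 'block-size=1024'], ['delete']));
```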
Snapshot Management
  • --maxSnapshots NUMBER
    • Maximum number of snapshots
    • Once number is exceeded, oldest snapshots will be deleted until the condition is met
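
The pruning rule above can be sketched in a few lines. `snapshotsToDelete` is a hypothetical name, and the sketch assumes snapshot folders use the timestamped naming described under Usage.

```javascript
// Hypothetical sketch: pick the snapshots to delete once maxSnapshots is exceeded.
function snapshotsToDelete(names, maxSnapshots) {
  const oldestFirst = [...names].sort(); // timestamped names sort chronologically
  const excess = Math.max(0, oldestFirst.length - maxSnapshots);
  return oldestFirst.slice(0, excess);
}
```

For example, with three snapshots and a limit of two, only the oldest folder name is returned.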
Script Hooks

Script Hooks can be used to run scripts before or after backup on the client while using the same log file as the backup process. Script hooks are not run in parallel.

  • --runBefore EXECUTABLE Can be used multiple times
    • Script to run on client before backup (file will be executed directly and output will be logged)
    • Can be useful for taking backups of data that requires consistency (ex: running pg_dump) and putting it in a folder that will be transferred by rsync in the backup
  • --runAfter EXECUTABLE Can be used multiple times
    • Script to run on client after backup
    • Hook will only trigger if backup is successful
    • Can be useful for deleting temporary data after it is successfully transferred
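
Running hooks one at a time, as described above, might look like the sketch below. `runExecutable` and `runHooks` are hypothetical names, and the package may implement this differently.

```javascript
const { execFile } = require('child_process');

// Run one executable directly (no shell) and resolve with its stdout.
function runExecutable(file) {
  return new Promise((resolve, reject) => {
    execFile(file, (error, stdout) => (error ? reject(error) : resolve(stdout)));
  });
}

// Hypothetical sketch: await each hook before starting the next one.
async function runHooks(hooks, run = runExecutable) {
  const outputs = [];
  for (const hook of hooks) {
    outputs.push(await run(hook)); // a rejected hook stops the remaining hooks
  }
  return outputs;
}
```

The `run` parameter exists only so the ordering can be exercised without real scripts. A caller would run the "before" hooks, then the backup, then the "after" hooks only if the backup succeeded.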
Logging
  • --logFormat FORMAT Default: text
    • Format used to log output
    • Supported formats:
      • json - Rsync process output in JSON format
      • text - An easy to read rsync process output
      • raw - Output directly from rsync process
  • --logFile PATH
    • Path to file used to write output in logFormat
    • If file already exists it will be appended, otherwise it will be created
  • --logFileLevel LEVEL Default: ALL
    • Level of output to write to log file
    • Supported levels:
      • ALL Log Progress, Warnings, Errors and Summary
      • WARN Log Warnings, Errors and Summary
      • ERROR Log Errors and Summary
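
The log levels above amount to a filter over message types. In this sketch the `warning` type matches the `msgType` seen in the JSON output, while `progress`, `error` and `summary` are assumed names chosen for illustration.

```javascript
// Hypothetical mapping from --logFileLevel to the message types it lets through.
const LEVELS = {
  ALL: ['progress', 'warning', 'error', 'summary'],
  WARN: ['warning', 'error', 'summary'],
  ERROR: ['error', 'summary'],
};

function shouldLog(level, msgType) {
  const allowed = LEVELS[level];
  if (!allowed) throw new Error(`Unsupported log level: ${level}`);
  return allowed.includes(msgType);
}
```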
Restore
  • --restore
    • Clone files from --src to --dst
      • Note: Since there is no snapshot management done in restore mode, restores can be done from server to client or client to server by switching the destination and source arguments
    • If this flag is used snapshots are not used, this flag enables a simple rsync copy from the source to destination
      • All snapshot management flags will be ignored
Debug Info
  • --version
    • Print package version and exit
  • --printCommand
    • Print rsync command used

Data Consistency & Integrity

Rsync Does NOT Ensure Consistency | Rsync May Ensure Integrity

  • Consistency
    • Rsync can not take snapshots the way certain file systems and volume managers can (ZFS, LVM...). This means that if files change between this script starting and finishing, they could have been copied in any state
    • Rsync first builds a list of files then transfers only the deltas of each file from the client to the server
    • This means that files created after rsync has built a list of files will not be transferred
    • Because consistency is not ensured, this backup solution is not sufficient for database backups
      • It is recommended that database dumps are taken using database specific technology (ex: pg_dump for postgres)
      • The same applies to any write heavy application
  • Integrity
    • Rsync will always ensure transferred files are correctly reconstructed in memory
    • Rsync will then write the data in memory to the disk
    • If the OS indicates a successful write, rsync will proceed
      • No checksum is done after the write to disk, as write correctness is expected to be handled by the OS
    • Rsync determines files to be transferred by default by comparing file size and modification date
      • Checksums are only generated for potentially transferred files
      • The criteria to potentially transfer files can be changed to comparing file size only using the --checksum flag
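
The selection criteria above can be modelled as a small decision function. This is a simplified sketch of the decision, not rsync's code: the file objects (`size`, `mtimeMs`, `content`) are hypothetical, and MD5 is used purely for illustration.

```javascript
const crypto = require('crypto');

function digest(buffer) {
  return crypto.createHash('md5').update(buffer).digest('hex');
}

// src and dst are hypothetical records: { size, mtimeMs, content: Buffer }.
function needsTransfer(src, dst, useChecksum = false) {
  if (!dst) return true;                   // nothing at the destination yet
  if (src.size !== dst.size) return true;  // a size mismatch always transfers
  if (useChecksum) return digest(src.content) !== digest(dst.content);
  return src.mtimeMs !== dst.mtimeMs;      // default: size and modification time
}
```

The sketch shows the trade-off: without the checksum mode, a file whose content changed while its size and modification time stayed the same is skipped, and with it every same-size file must be read and hashed.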

Common Warnings/Errors

  • rsync warning: some files vanished before they could be transferred (code 24)
  • file has vanished
    • These warnings indicate a file has been deleted between the time rsync started and stopped executing
    • This does not mean the backup has failed. It is an expected warning, as rsync does not take system level snapshots and data will not always be consistent
      • This message should be treated as a warning that said file may need to be backed up using a different method in order to ensure its consistency
      • See Data Consistency & Integrity section for more information
  • opendir failed: Permission Denied
  • send_files failed to open: Permission Denied
  • Any other permission related error
    • These errors indicate there is an issue with the permissions rsync is being run with.
  • An error occurred connecting to server while preparing for backup: sudo: no tty present and no askpass program specified
    • The backup snapshot management needs access to sudo rm and sudo mv without a password. If this error occurs the sudoers file (on the server) needs to be modified to allow rm and mv without a password.
  • An error occurred connecting to server while preparing for backup: mkdir: cannot create directory: Permission denied
    • The backup user does not have permission to create directories in --dst
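
A wrapper script that calls rsync directly may want to treat the vanished-files case as a warning rather than a failure. The sketch below is one possible policy, with `classifyRsyncExit` as a hypothetical name. Exit code 24 is the code rsync reports for vanished source files, as shown in the warning text above.

```javascript
// Hypothetical policy: map an rsync exit code to a coarse outcome.
function classifyRsyncExit(code) {
  if (code === 0) return 'success';
  if (code === 24) return 'warning'; // some source files vanished during the run
  return 'error';
}
```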

Recovery

  • Partial Recovery
    • Since the backups are not compressed partial recovery is as easy as using SFTP (Filezilla works great if you want a GUI) and copying files over from the desired dated snapshot
      • Note: SFTP does not preserve all file attributes, if this is desired it is recommended to write a rsync script to transfer files using rsync parameters found in this script
        • At some point in the future I could make a flag for restores, if this would be useful to you feel free to open an issue
  • Complete Recovery
    • Recovering an entire installation is very similar to partial recovery
    • You will need to boot on a Live CD with access to networking, install rsync and then use it to rsync the files to the desired partition(s) (of course you will have to make the partition(s) first if they don't already exist)
    • Note that you may have to update things like /etc/fstab if disk names have changed or regenerate the bootloader
    • For more details see Recovering entire systems from backups
    • If the server with backups has ssh access to the client, see --restore

rsync-snapshot's People

Contributors

mattlyons0

rsync-snapshot's Issues

Enhancement - Add smart remove like BIT

enhancement

The maxSnapshots option is OK for management for now, but it would be nice to have something more like what Back In Time has

For starters, just a --smartremove which would keep all in the current day, one a day for the last 7 days, one a week for the last 4 weeks, one a month for the last 12 months and one a year (by default).

then later allow those defaults to be customized
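
The retention rule proposed in this issue could be sketched with buckets, keeping the newest snapshot in each one. Everything here is an assumption made for illustration: the function names, the use of UTC, counting the four weekly buckets after the first 7 days (days 8 to 35), and approximating 12 months as 365 days.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Midnight (UTC) of the given date, in milliseconds.
function startOfUtcDay(date) {
  return Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate());
}

// Map a snapshot date to the retention bucket it competes in.
function bucketKey(date, now) {
  const ageDays = Math.round((startOfUtcDay(now) - startOfUtcDay(date)) / DAY_MS);
  const iso = date.toISOString(); // e.g. 2018-02-21T19:06:26.000Z
  if (ageDays <= 0) return `all:${iso}`;                 // keep everything from today
  if (ageDays <= 7) return `day:${iso.slice(0, 10)}`;    // one per day
  if (ageDays <= 35) return `week:${Math.floor((ageDays - 1) / 7)}`; // one per week
  if (ageDays <= 365) return `month:${iso.slice(0, 7)}`; // one per month
  return `year:${iso.slice(0, 4)}`;                      // one per year
}

// Keep the newest snapshot of every bucket, mark the rest for removal.
function smartRemove(dates, now) {
  const newestFirst = [...dates].sort((a, b) => b - a);
  const seen = new Set();
  const keep = [];
  const remove = [];
  for (const date of newestFirst) {
    const key = bucketKey(date, now);
    if (seen.has(key)) {
      remove.push(date);
    } else {
      seen.add(key);
      keep.push(date);
    }
  }
  return { keep, remove };
}
```

Making the thresholds (7, 35, 365) parameters would be a natural way to allow the defaults to be customized later.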

Allow machine/local time in logs and on snapshot dir name

Enhancement

Logs use 24hr GMT for timestamps. That might be fine for some uber sysop running dozens of machines across multiple time zones, but for little guys like me I much prefer the machine/local time. Yes, I can subtract/add, but it is sooo much easier to glance at local timestamps to see if things came out ok. Can you please add an option to use machine time in log timestamps and for the directory name of the snapshot? Thx

Unknown Output Logged via Text Logger

Unknown Output {"msgType":"warning","warning":"rsync warning: some files vanished before they could be transferred (code 24) at main.c(1196) [sender=3.1.2]"}

Refactor Codebase

I am unhappy with how this codebase turned out as a result of sort of throwing features in haphazardly.

It could be cleaned up a lot, especially how the loggers are implemented. Loggers should probably be split into their own package.

Feature Completion Status

The status of planned features can be seen below:

  • Basic Backup Functionality
  • JSON Output Mode
  • Text Output Mode
  • Incremental Backups
  • Logging to file
  • Logging levels
  • Deletion of Backups after more than N exist
  • Script Hooks

Test Local Backups

Local Backups need to be tested. I have only been testing over SSH so far.

Warnings and Errors false positives

Warning and error detection is flaky

Logged as warnings:

  • sending incremental file list
  • Empty Folders Ex: var/spool/postfix/private/ifmail
  • created directory (at the start after sending incremental file list when creating snapshot dir)
  • Symlinking? Ex:
    • usr/lib/i386-linux-gnu/libtasn1.so.6 -> libtasn1.so.6.5.4
    • snap/mailspring/140/usr/share/icons/Humanity/emblems/24/emblem-shared.svg -> ../../apps/24/gnome-session-switch.svg
    • usr/src/linux-headers-4.13.0-32/scripts/dtc/include-prefixes/arc -> ../../../arch/arc/boot/dts
    • var/snap/mailspring/current -> 140

Logged as errors:

  • Directories/files ending in error Ex: var/spool/postfix/private/error

Correctly detected warnings:

  • skipping non-regular file "var/spool/postfix/dev/random"
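
One way to avoid the false positives listed above is to match known rsync message prefixes instead of searching lines for words such as "error". The sketch below is a possible approach, not the current logger. `classifyRsyncLine` is a hypothetical name, and treating every line starting with `rsync: ` as an error is an assumption.

```javascript
// Hypothetical sketch: classify output lines by message prefix, not by keyword.
function classifyRsyncLine(line) {
  if (/^rsync error: /.test(line)) return 'error';
  if (/^rsync warning: /.test(line)) return 'warning';
  if (/^rsync: /.test(line)) return 'error'; // assumed: e.g. "rsync: send_files failed to open ..."
  if (/^skipping non-regular file /.test(line)) return 'warning';
  if (/^file has vanished: /.test(line)) return 'warning';
  return 'info'; // file names, symlinks, "sending incremental file list", ...
}
```

With prefix matching, a path such as var/spool/postfix/private/error or a symlink line containing "->" falls through to `info`.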

Don't mkdir -p

This behavior can have an unintended side effect of creating entire directory trees (ex: when a drive isn't mounted).

Throw an error instead.
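
For a local destination, the behavior requested here could be sketched with a non-recursive mkdir. `createSnapshotDir` is a hypothetical name. A remote destination would need the equivalent plain `mkdir` (without `-p`) on the server.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical sketch: create only the final directory and fail if the parent is missing.
function createSnapshotDir(parent, name) {
  const target = path.join(parent, name);
  fs.mkdirSync(target); // non-recursive: throws ENOENT when `parent` does not exist
  return target;
}
```

If the backup drive is not mounted, the mount point's subdirectories are missing, so the call throws instead of silently building the tree on the wrong disk.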

Deal with time jumping around

Rsync --info=progress2 seems to specify 2 times, elapsed time and some wildly incorrect eta?

Decide what to do about this, might just use my own calculation of elapsed time
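
Computing elapsed time independently of rsync's progress output could look like the sketch below. `formatElapsed` is a hypothetical helper and the h:mm:ss format is an arbitrary choice.

```javascript
// Hypothetical sketch: format a duration in milliseconds as h:mm:ss.
function formatElapsed(ms) {
  const totalSeconds = Math.floor(ms / 1000);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  return `${hours}:${String(minutes).padStart(2, '0')}:${String(seconds).padStart(2, '0')}`;
}

const startedAt = Date.now();
// ... the backup would run here ...
console.log(`Elapsed: ${formatElapsed(Date.now() - startedAt)}`);
```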

Breaking Changes

A comment will be made on this thread every time a breaking change is made (although I do not plan on making any).

Subscribe to be notified.
