
pgBackRest
Reliable PostgreSQL Backup & Restore

Introduction

pgBackRest is a reliable backup and restore solution for PostgreSQL that seamlessly scales up to the largest databases and workloads.

pgBackRest v2.52 is the current stable release. Release notes are on the Releases page.

Please find us on GitHub and give us a star if you like pgBackRest!

Features

Parallel Backup & Restore

Compression is usually the bottleneck during backup operations so pgBackRest solves this problem with parallel processing and more efficient compression algorithms such as lz4 and zstd.
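
As a minimal sketch, parallelism and the compression algorithm are both set in pgbackrest.conf (option names are from the pgBackRest configuration reference; the stanza name demo, paths, and values are illustrative):

```ini
# pgbackrest.conf -- illustrative values
[global]
# Run 4 parallel processes for backup/restore/archive commands
process-max=4
# Use zstd instead of the default gzip (lz4 is also supported)
compress-type=zst
compress-level=3

[demo]
pg1-path=/var/lib/postgresql/16/demo
```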

Local or Remote Operation

A custom protocol allows pgBackRest to backup, restore, and archive locally or remotely via TLS/SSH with minimal configuration. An interface to query PostgreSQL is also provided via the protocol layer so that remote access to PostgreSQL is never required, which enhances security.
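
For example, pointing backups at a repository host takes only a few options (hostname and user here are illustrative; option names are from the configuration reference):

```ini
# pgbackrest.conf on the PostgreSQL host
[global]
repo1-host=backup.example.com
repo1-host-user=pgbackrest
# Transport for the custom protocol: ssh (the default) or tls,
# the latter requiring a pgbackrest TLS server on the repo host
repo1-host-type=tls
```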

Multiple Repositories

Multiple repositories allow, for example, a local repository with minimal retention for fast restores and a remote repository with a longer retention for redundancy and access across the enterprise.
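
A sketch of that arrangement, with illustrative paths and retention values (repositories are numbered, so repo2-* options mirror repo1-*):

```ini
[global]
# repo1: local, minimal retention for fast restores
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2

# repo2: remote/off-site, longer retention for redundancy
repo2-host=backup.example.com
repo2-retention-full=8
```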

Full, Differential, & Incremental Backups (at File or Block Level)

Full, differential, and incremental backups are supported. pgBackRest is not susceptible to the time resolution issues of rsync, making differential and incremental backups safe without the requirement to checksum each file. Block-level backups save space by only copying the parts of files that have changed.
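
Block-level backups are enabled per repository; per the configuration reference they require file bundling to be enabled as well:

```ini
# pgbackrest.conf -- enable block incremental (requires bundling)
[global]
repo1-bundle=y
repo1-block=y
```

The backup type is then chosen per run, e.g. `pgbackrest --stanza=demo --type=diff backup` (stanza name illustrative).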

Backup Rotation & Archive Expiration

Retention policies can be set for full and differential backups to create coverage for any time frame. The WAL archive can be maintained for all backups or strictly for the most recent backups. In the latter case the WAL required to make older backups consistent will be maintained in the archive.
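
A sketch of such a policy using options from the configuration reference (counts are illustrative):

```ini
[global]
# Keep 4 full backups; differentials expire with their full backup
repo1-retention-full=4
repo1-retention-diff=3
# Keep WAL only for the 2 most recent full backups; WAL needed to
# make the older retained backups consistent is still preserved
repo1-retention-archive=2
repo1-retention-archive-type=full
```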

Backup Integrity

Checksums are calculated for every file in the backup and rechecked during a restore or verify. After a backup finishes copying files, it waits until every WAL segment required to make the backup consistent reaches the repository.

Backups in the repository may be stored in the same format as a standard PostgreSQL cluster (including tablespaces). If compression is disabled and hard links are enabled it is possible to snapshot a backup in the repository and bring up a PostgreSQL cluster directly on the snapshot. This is advantageous for terabyte-scale databases that are time consuming to restore in the traditional way.

All operations utilize file and directory level fsync to ensure durability.

Page Checksums

If page checksums are enabled pgBackRest will validate the checksums for every file that is copied during a backup. All page checksums are validated during a full backup and checksums in files that have changed are validated during differential and incremental backups.

Validation failures do not stop the backup process, but warnings with details of exactly which pages have failed validation are output to the console and file log.

This feature allows page-level corruption to be detected early, before backups that contain valid copies of the data have expired.

Backup Resume

An interrupted backup can be resumed from the point where it was stopped. Files that were already copied are compared with the checksums in the manifest to ensure integrity. Since this operation can take place entirely on the repository host, it reduces load on the PostgreSQL host and saves time since checksum calculation is faster than compressing and retransmitting data.

Streaming Compression & Checksums

Compression and checksum calculations are performed in stream while files are being copied to the repository, whether the repository is located locally or remotely.

If the repository is on a repository host, compression is performed on the PostgreSQL host and files are transmitted in a compressed format and simply stored on the repository host. When compression is disabled a lower level of compression is utilized to make efficient use of available bandwidth while keeping CPU cost to a minimum.
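
This behavior maps onto two options (a sketch; values illustrative):

```ini
[global]
# No compression in the repository...
compress-type=none
# ...but still lightly compress protocol traffic to the repository
# host to use bandwidth efficiently at minimal CPU cost
compress-level-network=1
```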

Delta Restore

The manifest contains checksums for every file in the backup so that during a restore it is possible to use these checksums to speed processing enormously. On a delta restore any files not present in the backup are first removed and then checksums are generated for the remaining files. Files that match the backup are left in place and the rest of the files are restored as usual. Parallel processing can lead to a dramatic reduction in restore times.
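
Delta restore can be requested per invocation or made the default in configuration (a sketch; command-qualified sections like this are described in the configuration reference):

```ini
# pgbackrest.conf -- always use checksum-based delta on restore
[global:restore]
delta=y
```

Equivalently, pass it on the command line: `pgbackrest --stanza=demo --delta restore` (stanza name illustrative).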

Parallel, Asynchronous WAL Push & Get

Dedicated commands are included for pushing WAL to the archive and getting WAL from the archive. Both commands support parallelism to accelerate processing and run asynchronously to provide the fastest possible response time to PostgreSQL.

WAL push automatically detects WAL segments that are pushed multiple times and de-duplicates when the segment is identical, otherwise an error is raised. Asynchronous WAL push allows transfer to be offloaded to another process which compresses WAL segments in parallel for maximum throughput. This can be a critical feature for databases with extremely high write volume.

Asynchronous WAL get maintains a local queue of WAL segments that are decompressed and ready for replay. This reduces the time needed to provide WAL to PostgreSQL which maximizes replay speed. Higher-latency connections and storage (such as S3) benefit the most.

The push and get commands both ensure that the database and repository match by comparing PostgreSQL versions and system identifiers. This virtually eliminates the possibility of misconfiguring the WAL archive location.
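
Wiring this into PostgreSQL is a matter of two archive commands plus the async options (stanza name, paths, and queue sizes are illustrative; the command forms follow the pgBackRest user guide):

```ini
# postgresql.conf -- hand WAL to pgBackRest
archive_mode = on
archive_command = 'pgbackrest --stanza=demo archive-push %p'
restore_command = 'pgbackrest --stanza=demo archive-get %f "%p"'

# pgbackrest.conf -- enable asynchronous, parallel archiving
[global]
archive-async=y
spool-path=/var/spool/pgbackrest
# Bound the local push/get queues
archive-push-queue-max=1GiB
archive-get-queue-max=128MiB
```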

Tablespace & Link Support

Tablespaces are fully supported and on restore tablespaces can be remapped to any location. It is also possible to remap all tablespaces to one location with a single command which is useful for development restores.

File and directory links are supported for any file or directory in the PostgreSQL cluster. When restoring it is possible to restore all links to their original locations, remap some or all links, or restore some or all links as normal files or directories within the cluster directory.
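
Remapping is done with restore options, which can also live in a command-qualified config section (a sketch; tablespace and link names and paths are illustrative):

```ini
# pgbackrest.conf -- remap on restore
[demo:restore]
# Remap a single tablespace to a new location
tablespace-map=ts_hot=/fast/ssd/ts_hot
# Restore every link to its original location...
link-all=y
# ...or remap an individual link instead
# link-map=pg_wal=/ssd/pg_wal
```

For development restores, `tablespace-map-all` sends every tablespace to one directory in a single option.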

S3, Azure, and GCS Compatible Object Store Support

pgBackRest repositories can be located in S3, Azure, and GCS compatible object stores to allow for virtually unlimited capacity and retention.
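
An S3 repository is selected with the repository type options (a sketch; bucket, endpoint, region, and the bracketed credentials are placeholders):

```ini
[global]
repo1-type=s3
repo1-path=/pgbackrest
repo1-s3-bucket=demo-bucket
repo1-s3-endpoint=s3.us-east-1.amazonaws.com
repo1-s3-region=us-east-1
repo1-s3-key=<access-key>
repo1-s3-key-secret=<secret-key>
```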

Encryption

pgBackRest can encrypt the repository to secure backups wherever they are stored.
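
Repository encryption is enabled per repository (a sketch; the passphrase is a placeholder and should come from secure storage):

```ini
[global]
# Encrypt everything written to the repository; the passphrase is
# only ever used on the hosts running pgBackRest
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=<strong-passphrase>
```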

Compatibility with ten versions of PostgreSQL

pgBackRest includes support for ten versions of PostgreSQL, the five supported versions and the last five EOL versions. This allows ample time to upgrade to a supported version.

Getting Started

pgBackRest strives to be easy to configure and operate.
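
As a sketch of a minimal configuration (stanza name and paths are illustrative; see the user guide for a full walkthrough):

```ini
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
start-fast=y

[demo]
pg1-path=/var/lib/postgresql/16/demo
```

With archiving configured in PostgreSQL, the stanza is then initialized and backed up with `pgbackrest --stanza=demo stanza-create` followed by `pgbackrest --stanza=demo backup`.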

Documentation for v1 can be found here. No further releases are planned for v1 because v2 is backward-compatible with v1 options and repositories.

Contributions

Contributions to pgBackRest are always welcome! Please see our Contributing Guidelines for details on how to contribute features, improvements or issues.

Support

pgBackRest is completely free and open source under the MIT license. You may use it for personal or commercial purposes without any restrictions whatsoever. Bug reports are taken very seriously and will be addressed as quickly as possible.

Creating a robust disaster recovery policy with proper replication and backup strategies can be a very complex and daunting task. You may find that you need help during the architecture phase and ongoing support to ensure that your enterprise continues running smoothly.

Crunchy Data provides packaged versions of pgBackRest for major operating systems and expert full life-cycle commercial support for pgBackRest and all things PostgreSQL. Crunchy Data is committed to providing open source solutions with no vendor lock-in, ensuring that cross-compatibility with the community version of pgBackRest is always strictly maintained.

Please visit Crunchy Data for more information.

Recognition

Primary recognition goes to Stephen Frost for all his valuable advice and criticism during the development of pgBackRest.

Crunchy Data has contributed significant time and resources to pgBackRest and continues to actively support development. Resonate also contributed to the development of pgBackRest and allowed early (but well tested) versions to be installed as their primary PostgreSQL backup solution.

Armchair graphic by Sandor Szabo.

pgBackRest's Issues

Remove db start/stop times from manifest.

These should really come from the .backup file so they are accurate. For now I think it's enough to remove them and then recommend checking the .backup file in the restore docs.

When resuming backups, do not resume from a backup of a different type.

Check to be sure that the backup being resumed is the same type that is being attempted:

  • If full, previous backup should have been full
  • If differential, resumed backup should have been differential and full backup should be the same
  • If incremental, resumed backup should have been incremental and previous backup should be the same.

ALSO: If the version number has changed do not resume. If there is no manifest or version file do not resume.

The backup type will need to be written into each manifest for this to work.

Abstract manifest build.

This is still contained in backup code and needs to be moved to manifest code. Once this is done, the rest of the synthetic restore unit tests can be written.

Log expiration

Since backrest maintains its own log directory it should also expire logs automatically.

Default to 30 days and make it user configurable. [Per Russell's suggestion make the default equal to retention-full?]

Info command

Some kind of info command to see when the last backup was done; this can be run per stanza or globally.
It would be nice to see:

  1. Last backup
  2. Number of backups of each type and date
  3. What backups are currently running
  4. Failed backups in temp

The internal format for this should be JSON so it can easily be formatted into text, Nagios alerts - whatever.

db::path command-line override rules

If db::path is specified on the command line and tablespaces are not, then put the tablespaces under the base pg_tblspc path without linking. If tablespaces are mapped in the config, make sure a warning is thrown to show that they have been remapped.
This is to prevent an existing installation from being clobbered if the base path is remapped for the purpose of testing.

Move backups to be removed to temp before deleting

This will play nicer with backup software and prevent anyone from trying to restore a backup that is currently being deleted.

Maybe instead have a backup manifest that lists the valid backups.
Also add a feature to check the presence of all files before a restore.

Delete from backup.info before actually removing the directory. Move the directory deletion code to the end - scan the db path and delete anything that is not in backup.info.

Reverse ordering by file size

Process backup and restore files in reverse order. This is more efficient at the end since you're less likely to end up with a single thread processing the last big file.

Add format to manifest

Create a format setting that keeps track of when the on-disk/manifest format changes for backrest. This will determine whether backups are compatible.

Async archive-get with prefetch

Getting one archive file at a time can be tedious if the cluster is very far behind. An async get with some sort of prefetch would speed the process a lot.

Should be able to specify how many archive logs to prefetch.

Complete documentation.

This should probably be done in html so there is more flexibility.

  • Lots of work needed on restore.

Set backup dir name to date/time when backup completes

Set backup dir name to date/time when backup completes (after stop backup). This indicates the time from which recovery is possible. Right now the timestamp is from the beginning of the backup which can be misleading.

Allow backup on shutdown cluster

This is for cases when the cluster is already down:

  1. Don't do start/stop backup
  2. Preserve the contents of pg_xlog
  3. Don't run if postmaster.pid is present

Make sure duplicate WALs cannot be pushed

Duplicate WALs can be logged if checksums are turned on (since the filenames do not collide). Do not allow this as a misconfiguration would render the archive repo unusable.

Hook Scripts

Allow user-defined hook scripts to be run before and after backups.

Write pg_xlog/* into manifest

When archive logs are copied to pg_xlog, be sure to add them to the manifest.
This might be required to make some of the new unit tests work.

Create latest directory link

Create a link to the latest backup called latest. This makes automating restores off the most recent backup easier.

Return hard error for holes in the archive log

If an archive log is missing in the middle of an archive stream backrest will return a soft error (1), even though there is probably no chance of that archive log showing up.

If an archive log is missing then check to see if the next one is present - if so return a hard error.

This is tricky because there is a question of how long to wait. With parallel async push it's very possible that the WALs could arrive out of order.

Here's how to do it. BackRest on the database server knows the oldest WAL segment that is currently on the db server and not pushed. If this is reported to the backup server, then it can determine if a hole in the archive stream may be filled, or if it is a permanent condition.

Throttling

Add a throttling feature to limit the amount of disk I/O and/or network bandwidth being used by the backup.
To start, make this a per-thread limitation. That simplifies the problem quite a bit and most users who are throttling will probably be single-threaded.

Add checksum-delta option

Checksums are calculated during the backup process, but the delta is still done during diff/incr backups.
Add a new option checksum_delta (default n) that does the delta using checksums. Of course, if the timestamp or size has changed the checksum does not need to be calculated.

Option to call stop_backup with every backup

This is handy for backups that have failed. This may be tied to using DBD:Pg to maintain a connection during the backup for safety.

However, it seems that if a backup is started more than once the previous backup should fail. Might need to put in checking to make sure that the archive start position is the same when pg_stop_backup() is called.

Remove type=none

This feature can create duplicate WAL files - perhaps defer it to another time?

Abandon threads and go to processes

Even with the thread refactor they are not reliable on all platforms. It's time to move to processes. Performance testing has also shown that they are not very efficient.

Before starting:

  1. Run four processes from the command line in parallel and make sure this is actually faster than running four threads. RESULTS: Processes are not measurably faster than threads, but it's still a good idea to replace them for portability reasons. Threads are not well supported and in fact are officially deprecated.

There will be a few steps to this effort:

  1. Separate protocol from remote
  2. Create local module
  3. Create a ProcessGroup module to manage local helpers
  4. Replace ThreadGroup with process group

File->wait() function

Waits for a file or directory to exist with configurable retry and timeout. Replaces the wait_for_file function in the Utility module.
