
mv hot backup

Matthew Von-Maszewski edited this page Jul 7, 2016 · 13 revisions

Status

  • merged to master -
  • code complete -
  • development started -
  • concept RFC circulated - July 7, 2016

History / Context

Basho has previously relied upon various external methods for creating hot backups of the leveldb .sst table files. All of the solutions to date shared two problems:

  • the MANIFEST file was wrong and a leveldb "repair" operation was necessary before using the backup, and
  • there could be .sst table files present that were incomplete output from an active compaction and they were therefore corrupt.

The leveldb repair operation addressed both problems, but the repair could take tens of minutes to hours on terabyte-size datasets.

This branch addresses both problems. It creates "ready to run" backup images.

Usage

Trigger

The initial open source release of this branch uses an external trigger to initiate a hot backup. The expectation is that a later riak-admin command will initiate the process across all Riak nodes. The external trigger is sufficient for non-Riak users and for Riak users willing to initiate backups via a cron job or a similar operational mechanism.

The external trigger is the creation of the file /etc/riak/backup_now. leveldb has an independent thread that cycles every 60 seconds. It will detect the file upon its next cycle and initiate the backup. leveldb erases the file /etc/riak/backup_now upon completing the backup of all open databases (vnodes). For Riak, this implies all user vnodes, AAE vnodes, and management vnodes such as cluster data.

The user has a choice once /etc/riak/backup_now disappears: either leave the backup in place, or copy the backup elsewhere and then remove it from the production system.
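The trigger protocol above can be driven from a small script. The following is a minimal sketch, not part of leveldb or Riak: the function name `request_backup`, the polling interval, and the timeout are assumptions made for illustration. Only the trigger path and the "file disappears when the backup is complete" behavior come from the description above.

```python
import time
from pathlib import Path

# Trigger path taken from the description above.
TRIGGER = Path("/etc/riak/backup_now")


def request_backup(trigger=TRIGGER, poll_seconds=5.0, timeout_seconds=3600.0):
    """Create the trigger file, then wait until it has been erased.

    Returns True once the file is gone (backup of all open databases
    finished), or False if the timeout expires first. The polling
    interval and timeout are hypothetical defaults, not leveldb settings.
    """
    trigger = Path(trigger)
    trigger.touch()  # creating the file is the trigger; it is left empty here
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if not trigger.exists():
            return True
        time.sleep(poll_seconds)
    return False
```

A cron job could call `request_backup()` and, on a True result, copy the backup directories elsewhere. Because the leveldb thread cycles every 60 seconds, the backup may start up to a minute after the file is created.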

Backup actions

Basho's leveldb hot backup manages up to seven backup images. The hot backup creates a series of directories within each leveldb database (Riak vnode): backup, backup.0, backup.1, backup.2, backup.3, backup.4, and backup.5. Each new backup request renames the existing directories to the next higher number, deleting the old backup.5, and places the new backup in the directory "backup". Tiered storage configurations have the same directories on both the fast and slow tiers.
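The rotation can be pictured with a short sketch. This is an illustration of the renaming scheme only, written in Python rather than taken from the leveldb C++ source; the function name `rotate_backups` and the constant `BACKUP_NAMES` are hypothetical. The directory names are the ones listed above.

```python
import shutil
from pathlib import Path

# Directory names as listed in the text, newest first.
BACKUP_NAMES = ["backup", "backup.0", "backup.1", "backup.2",
                "backup.3", "backup.4", "backup.5"]


def rotate_backups(db_dir):
    """Shift existing backup directories up one slot and create a fresh 'backup'."""
    db_dir = Path(db_dir)
    oldest = db_dir / BACKUP_NAMES[-1]
    if oldest.exists():
        shutil.rmtree(oldest)  # the old backup.5 is deleted
    # Rename from oldest to newest so no rename lands on an existing directory.
    for i in range(len(BACKUP_NAMES) - 2, -1, -1):
        src = db_dir / BACKUP_NAMES[i]
        if src.exists():
            src.rename(db_dir / BACKUP_NAMES[i + 1])
    fresh = db_dir / BACKUP_NAMES[0]
    fresh.mkdir()
    return fresh
```

Renaming in oldest-to-newest order matters: shifting "backup" first would collide with an existing "backup.0".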

Hot backup then creates a parallel directory structure for the sst_? directories within the "backup" directory. This completes the setup phase.

Hot backup then:

  • momentarily pauses Write operations to ensure the secondary write buffer completely flushes to disk (if it exists),
  • requests a disk flush of the current write buffer, notes leveldb's sequence number at this point, and re-enables Write operations,
  • requests a leveldb snapshot (after write buffer flushes), and updates the snapshot based upon the saved sequence number,
  • creates a MANIFEST file and CURRENT file in the backup directory based upon the snapshot,
  • creates hard links in backup directory's sst_? directories to needed .sst files in live sst_? directories,
  • and copies the LOG and LOG.old files "as is" to the backup directory (LOG may contain events from after the backup was initiated).
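The filesystem side of the steps above (parallel sst_? directories, hard links, LOG copies) can be sketched as follows. This is an illustration under stated assumptions, not the leveldb implementation: the function name `link_backup_image` and the `needed` argument are hypothetical, and the list of needed files is simply passed in, whereas leveldb derives it from the snapshot. Writing the MANIFEST and CURRENT files is not shown.

```python
import os
import shutil
from pathlib import Path


def link_backup_image(db_dir, backup_dir, needed):
    """Populate backup_dir with hard links to needed .sst files plus LOG copies.

    `needed` is a hypothetical list of paths relative to db_dir, such as
    "sst_0/000012.sst". Files not in the list (for example, partial
    compaction output) are left out of the backup image.
    """
    db_dir, backup_dir = Path(db_dir), Path(backup_dir)
    # Setup phase: mirror every sst_? directory inside the backup directory.
    for level_dir in sorted(db_dir.glob("sst_?")):
        if level_dir.is_dir():
            (backup_dir / level_dir.name).mkdir(parents=True, exist_ok=True)
    # A hard link is a second name for the same file data; nothing is copied.
    for rel in needed:
        target = backup_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        os.link(db_dir / rel, target)
    # LOG and LOG.old are copied "as is" when present.
    for name in ("LOG", "LOG.old"):
        if (db_dir / name).exists():
            shutil.copy2(db_dir / name, backup_dir / name)
```

Hard links only work within a single filesystem, and a linked file's data stays on disk until every name for it is removed, so a later compaction that deletes the live .sst file does not invalidate the backup.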

Limitations

This backup method does NOT guarantee consistency across Riak vnodes and/or Riak AAE data. The vnodes and their AAE data are backed up close together in time, but not at the same instant. Standard Riak AAE and read repair logic will restore reasonable consistency. The only backup that guarantees full consistency across all vnodes and AAE data requires a Riak shutdown and a backup of the static files.
