doc(backup): multiple backup stores support
ref: longhorn/longhorn 5411, 10043, 10089

Signed-off-by: James Lu <[email protected]>
mantissahz authored and derekbit committed Jan 17, 2025
1 parent ef2aa7d commit 805a49e
Showing 14 changed files with 162 additions and 121 deletions.
@@ -34,25 +34,29 @@ Example of YAML code used to create a backup of the sample backing image:
apiVersion: longhorn.io/v1beta2
kind: BackupBackingImage
metadata:
name: parrot
name: parrot-backup
namespace: longhorn-system
spec:
backingImage: parrot
backupTargetName: default
userCreated: true
labels:
usecase: test
type: raw
```
> **IMPORTANT:**
> - `name`: Use the same name for the backing image and its backup. If the names are not identical, Longhorn will not be able to find the backing image.
> - `name`: If the names are not unique, Longhorn will not be able to create a backup of the backing image.
> - `backingImage`: The backing image for the backup.
> - `backupTargetName`: The backup target that is used to store the backup of the backing image.
> - `userCreated`: Set the value to `true` to indicate that you created the backup custom resource, which enabled the creation of the backup in the backupstore. The value `false` indicates that the backup custom resource was synced from the backupstore.
> - `labels`: You can add labels to the backing image backup.

### Create a Backup Using the Longhorn UI
1. Go to **Setting** > **Backing Image**.
2. Select the backing image that you want to back up, and then click **Back Up** in the **Operation** menu.

Longhorn creates the backup and adds the details to the **Backing Image Backup** list. The names of the backup and the source backing image are identical.
Longhorn creates the backup and adds the details to the **Backing Image Backup** list.

{{< figure src="/img/screenshots/backing-image/backup.png" >}}

@@ -38,11 +38,13 @@ It includes below resources associating with the Longhorn system:
- StorageClasses
- Volumes

> **Note:** Longhorn is unable to back up V2 Data Engine backing images.
> **Note:** Longhorn does not backup `Nodes`. The Longhorn manager on the target cluster is responsible for creating its own Longhorn `Node` custom resources.
> **Note:** Longhorn system backup bundle only includes resources operated by Longhorn.
> **Note:**
>
> - The default backup target (`default`) is always used to store system backups.
> - The Longhorn system backup bundle only includes resources operated by Longhorn.
> - Longhorn does not back up the `Nodes` resource. The Longhorn Manager on the target cluster is responsible for creating its own Longhorn `Node` custom resources.
> - Longhorn is unable to back up V2 Data Engine backing images.
>
> Here is an example of a cluster workload with a bare `Pod` workload. The system backup will collect the `PersistentVolumeClaim`, `PersistentVolume`, and `Volume`. The system backup will exclude the `Pod` during system backup resource collection.
## Create Longhorn System Backup
@@ -112,8 +112,6 @@ You can abort or remove a completed Longhorn system restore using Longhorn UI. O
Some settings are excluded as configurable before the Longhorn system restore.
- [Concurrent volume backup restore per node limit](../../../references/settings/#concurrent-volume-backup-restore-per-node-limit)
- [Concurrent replica rebuild per node limit](../../../references/settings/#concurrent-replica-rebuild-per-node-limit)
- [Backup Target](../../../references/settings/#backup-target)
- [Backup Target Credential Secret](../../../references/settings/#backup-target-credential-secret)
## Troubleshoot
12 changes: 6 additions & 6 deletions content/docs/1.8.0/concepts.md
@@ -267,7 +267,7 @@ A backup is an object in the backupstore, which is an NFS or S3 compatible objec

Because the volume replication is synchronized, and because of network latency, it is hard to do cross-region replication. The backupstore is also used as a medium to address this problem.

When the backup target is configured in the Longhorn settings, Longhorn can connect to the backupstore and show you a list of existing backups in the Longhorn UI.
When the backup target is configured on the Longhorn UI (**Setting > Backup Target**), Longhorn can connect to the backupstore and display a list of existing backups on the **Backup** screen.

If Longhorn runs in a second Kubernetes cluster, it can also sync disaster recovery volumes to the backups in secondary storage, so that your data can be recovered more quickly in the second Kubernetes cluster.

@@ -326,9 +326,9 @@ Because the main purpose of a DR volume is to restore data from backup, this typ
- Creating persistent volumes
- Creating persistent volume claims

A DR volume can be created from a volume’s backup in the backup store. After the DR volume is created, Longhorn will monitor its original backup volume and incrementally restore from the latest backup. A backup volume is an object in the backupstore that contains multiple backups of the same volume.
A DR volume can be created from a volume’s backup in the backupstore. After the DR volume is created, Longhorn will monitor its original backup volume and incrementally restore from the latest backup. A backup volume is an object in the backupstore that contains multiple backups of the same volume.

If the original volume in the main cluster goes down, the DR volume can be immediately activated in the backup cluster, so it can greatly reduce the time needed to restore the data from the backup store to the volume in the backup cluster.
If the original volume in the main cluster goes down, the DR volume can be immediately activated in the backup cluster, reducing the time needed to restore the data from the backupstore to the volume in the backup cluster.

When a DR volume is activated, Longhorn will check the last backup of the original volume. If that backup has not already been restored, the restoration will be started, and the activate action will fail. Users need to wait for the restoration to complete before retrying.

@@ -338,16 +338,16 @@ After a DR volume is activated, it becomes a normal Longhorn volume and it canno

## 3.4. Backupstore Update Intervals, RTO, and RPO

Typically incremental restoration is triggered by the periodic backup store update. Users can set backup store update interval in Setting - General - Backupstore Poll Interval.
Incremental restoration is usually triggered by the periodic backupstore update. You can set the update interval on the backup target settings screen (**Setting > Backup Target**).

Notice that this interval can potentially impact Recovery Time Objective (RTO). If it is too long, there may be a large amount of data for the disaster recovery volume to restore, which will take a long time.

As for Recovery Point Objective (RPO), it is determined by recurring backup scheduling of the backup volume. If recurring backup scheduling for normal volume A creates a backup every hour, then the RPO is one hour. You can check here to see how to set recurring backups in Longhorn.

The following analysis assumes that the volume creates a backup every hour, and that incrementally restoring data from one backup takes five minutes:

- If the Backupstore Poll Interval is 30 minutes, then there will be at most one backup worth of data since the last restoration. The time for restoring one backup is five minutes, so the RTO would be five minutes.
- If the Backupstore Poll Interval is 12 hours, then there will be at most 12 backups worth of data since last restoration. The time for restoring the backups is 5 * 12 = 60 minutes, so the RTO would be 60 minutes.
- If the backupstore poll interval is 30 minutes, then there will be at most one backup worth of data since the last restoration. The time for restoring one backup is five minutes, so the RTO would be five minutes.
- If the backupstore poll interval is 12 hours, then there will be at most 12 backups worth of data since the last restoration. The time for restoring the backups is 5 * 12 = 60 minutes, so the RTO would be 60 minutes.
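
The two cases above follow the same arithmetic. As a sketch only (the symbols are introduced here for illustration and do not appear in the Longhorn documentation), let $P$ be the backupstore poll interval, $B$ the recurring backup interval, and $t$ the time needed to incrementally restore one backup. Assuming backups are created at a fixed interval and restoration time grows linearly with the number of pending backups, the worst-case RTO is approximately:

$$
\mathrm{RTO} \approx \left\lceil \frac{P}{B} \right\rceil \cdot t
$$

With $B = 60$ minutes and $t = 5$ minutes, $P = 30$ minutes gives $\lceil 0.5 \rceil \cdot 5 = 5$ minutes, and $P = 12$ hours gives $12 \cdot 5 = 60$ minutes, matching the figures above.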

# Appendix: How Persistent Storage Works in Kubernetes

24 changes: 21 additions & 3 deletions content/docs/1.8.0/important-notes/_index.md
@@ -17,6 +17,7 @@ Please see [here](https://github.com/longhorn/longhorn/releases/tag/v{{< current
- [Change in Engine Replica Timeout Behavior](#change-in-engine-replica-timeout-behavior)
- [Talos Linux](#talos-linux)
- [Backup](#backup)
- [Multiple Backupstores Support](#multiple-backupstores-support)
- [Backup Data On The Remote Backup Server Might Be Deleted](#backup-data-on-the-remote-backup-server-might-be-deleted)
- [System Backup And Restore](#system-backup-and-restore)
- [Volume Backup Policy](#volume-backup-policy)
@@ -63,16 +64,33 @@ Longhorn v1.8.0 and later versions support usage of V2 volumes in Talos Linux cl

## Backup

### Multiple Backupstores Support

Starting with v1.8.0, Longhorn supports usage of multiple backupstores. You can configure backup targets to access backupstores on the **Setting/Backup Target** screen of the Longhorn UI. v1.8.0 improves on earlier Longhorn versions, which only allow you to use a single backup target for accessing a backupstore. Earlier versions also require you to configure the settings `backup-target`, `backup-target-credential-secret`, and `backupstore-poll-interval` for backup target management.

> **IMPORTANT:**
> The settings `backup-target`, `backup-target-credential-secret`, and `backupstore-poll-interval` were removed from the global settings because backup targets can be configured on the **Setting/Backup Target** screen of the Longhorn UI. Longhorn also creates a default backup target (`default`) during installation and upgrades.
Longhorn creates a default backup target (`default`) during installation and upgrades. The default backup target is used for the following:

- System backups
- Volumes that were created without a specific backup target name

> **Tip:**
> Set the [default backup target](../snapshots-and-backups/backup-and-restore/set-backup-target#default-backup-target) before creating a new one.
For more information, see [Setting a Backup Target](../snapshots-and-backups/backup-and-restore/set-backup-target), [Issue #5411](https://github.com/longhorn/longhorn/issues/5411) and [Issue #10089](https://github.com/longhorn/longhorn/issues/10089).
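
As an illustration of how a volume can be associated with a non-default backup target, the following StorageClass sketch sets the `backupTargetName` parameter that this commit documents for StorageClasses. The StorageClass name `longhorn-offsite` and the backup target name `offsite-nfs` are hypothetical, and the sketch assumes that a backup target with that name has already been created on the **Setting/Backup Target** screen.

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-offsite            # hypothetical name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"       # 48 hours in minutes
  fsType: "ext4"
  backupTargetName: "offsite-nfs"   # hypothetical; omit to fall back to "default"
```

If `backupTargetName` is omitted, volumes provisioned from the StorageClass use the default backup target (`default`).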

### Backup Data On The Remote Backup Server Might Be Deleted

Longhorn may unintentionally delete backup-related custom resources (such as `BackupVolume`, `BackupBackingImage`, `SystemBackup`, and `Backup`) and backup data on the remote backup server before Longhorn v{{< current-version >}} in the following scenarios:
Earlier Longhorn versions may unintentionally delete data in the backupstore and backup-related custom resources (such as `BackupVolume`, `BackupBackingImage`, `SystemBackup`, and `Backup`) in the following scenarios:

- An empty response from the NFS server due to server downtime.
- A race condition could delete the remote backup volume and its corresponding backups when the backup target is reset within a short period.

Starting with v{{< current-version >}}, Longhorn handles backup-related custom resources in the following manner:
Starting with v1.8.0, Longhorn handles backup-related custom resources in the following manner:

- If there are discrepancies between the backup information in the cluster and on the remote backup server, Longhorn deletes only the backup-related custom resources in the cluster.
- If there are discrepancies between the backup information in the cluster and in the backupstore, Longhorn deletes only the backup-related custom resources in the cluster.
- The backup-related custom resources in the cluster may be deleted unintentionally while the remote backup data remains safely stored. The deleted resources are resynchronized from the remote backup server during the next polling period (if the backup target is available).

For more information, see [#9530](https://github.com/longhorn/longhorn/issues/9530).
@@ -35,6 +35,7 @@ When the Pod is deployed, the Kubernetes master will check the PersistentVolumeC
staleReplicaTimeout: "2880" # 48 hours in minutes
fromBackup: ""
fsType: "ext4"
# backupTargetName: "default"
# mkfsParams: "-I 256 -b 4096 -O ^metadata_csum,^64bit"
# diskSelector: "ssd,fast"
# nodeSelector: "storage,fast"
@@ -51,6 +52,7 @@ When the Pod is deployed, the Kubernetes master will check the PersistentVolumeC
```
In particular, starting with v1.4.0, the parameter `mkfsParams` can be used to specify filesystem format options for each StorageClass.
Starting with v1.8.0, the parameter `backupTargetName` can be used to specify the backup target. The name of the default backup target (`default`) is used if `backupTargetName` is not specified.
2. Create a Pod that uses Longhorn volumes by running this command:
7 changes: 4 additions & 3 deletions content/docs/1.8.0/references/helm-values.md
@@ -192,6 +192,10 @@ For more details, see the [ocp-readme](https://github.com/longhorn/longhorn/blob
| Key | Default | Description |
|-----|---------|-------------|
| annotations | `{}` | Annotation for the Longhorn Manager DaemonSet pods. This setting is optional. |
| defaultBackupStore | `{"backupTarget":null,"backupTargetCredentialSecret":null,"pollInterval":null}` | Setting that allows you to update the default backupstore. |
| defaultBackupStore.backupTarget | `""` | Endpoint used to access the default backupstore. (Options: "NFS", "CIFS", "AWS", "GCP", "AZURE") |
| defaultBackupStore.backupTargetCredentialSecret | `""` | Name of the Kubernetes secret associated with the default backup target. |
| defaultBackupStore.pollInterval | `""` | Number of seconds that Longhorn waits before checking the default backupstore for new backups. The default value is "300". When the value is "0", polling is disabled. |
| enableGoCoverDir | `false` | Setting that allows Longhorn to generate code coverage profiles. |
| enablePSP | `false` | Setting that allows you to enable pod security policies (PSPs) that allow privileged Longhorn pods to start. This setting applies only to clusters running Kubernetes 1.25 and earlier, and with the built-in Pod Security admission controller enabled. |
| namespaceOverride | `""` | Specify override namespace, specifically this is useful for using longhorn as sub-chart and its release namespace is not the `longhorn-system`. |
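
A minimal `values.yaml` sketch for the `defaultBackupStore` keys listed above. The S3 URL and secret name reuse the example values from the settings reference that this commit removes; they are placeholders rather than recommended values, and the secret is assumed to already exist in the Longhorn namespace.

```yaml
defaultBackupStore:
  # Endpoint of the default backupstore (placeholder URL)
  backupTarget: "s3://backupbucket@us-east-1/backupstore"
  # Kubernetes secret holding the credentials for the target (placeholder name)
  backupTargetCredentialSecret: "s3-secret"
  # Seconds between checks of the default backupstore; "0" disables polling
  pollInterval: "300"
```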
@@ -217,9 +221,6 @@ During installation, you can either allow Longhorn to use the default system set
| defaultSettings.backingImageRecoveryWaitInterval | Number of seconds that Longhorn waits before downloading a backing image file again when the status of all image disk files changes to "failed" or "unknown". |
| defaultSettings.backupCompressionMethod | Setting that allows you to specify a backup compression method. |
| defaultSettings.backupConcurrentLimit | Maximum number of worker threads that can concurrently run for each backup. |
| defaultSettings.backupTarget | Endpoint used to access the backupstore. (Options: "NFS", "CIFS", "AWS", "GCP", "AZURE") |
| defaultSettings.backupTargetCredentialSecret | Name of the Kubernetes secret associated with the backup target. |
| defaultSettings.backupstorePollInterval | Number of seconds that Longhorn waits before checking the backupstore for new backups. The default value is "300". When the value is "0", polling is disabled. |
| defaultSettings.concurrentAutomaticEngineUpgradePerNodeLimit | Maximum number of engines that are allowed to concurrently upgrade on each node after Longhorn Manager is upgraded. When the value is "0", Longhorn does not automatically upgrade volume engines to the new default engine image version. |
| defaultSettings.concurrentReplicaRebuildPerNodeLimit | Maximum number of replicas that can be concurrently rebuilt on each node. |
| defaultSettings.concurrentVolumeBackupRestorePerNodeLimit | Maximum number of volumes that can be concurrently restored on each node using a backup. When the value is "0", restoration of volumes using a backup is disabled. |
26 changes: 0 additions & 26 deletions content/docs/1.8.0/references/settings.md
@@ -56,9 +56,6 @@ weight: 1
- [Orphaned Data Automatic Deletion](#orphaned-data-automatic-deletion)
- [Backups](#backups)
- [Allow Recurring Job While Volume Is Detached](#allow-recurring-job-while-volume-is-detached)
- [Backup Target](#backup-target)
- [Backup Target Credential Secret](#backup-target-credential-secret)
- [Backupstore Poll Interval](#backupstore-poll-interval)
- [Failed Backup Time To Live](#failed-backup-time-to-live)
- [Cronjob Failed Jobs History Limit](#cronjob-failed-jobs-history-limit)
- [Cronjob Successful Jobs History Limit](#cronjob-successful-jobs-history-limit)
@@ -604,29 +601,6 @@ If this setting is enabled, Longhorn automatically attaches the volume and takes

> **Note:** During the time the volume was attached automatically, the volume is not ready for the workload. The workload will have to wait until the recurring job finishes.
#### Backup Target

> Examples:
> `s3://backupbucket@us-east-1/backupstore`
> `nfs://longhorn-test-nfs-svc.default:/opt/backupstore`
> `nfs://longhorn-test-nfs-svc.default:/opt/backupstore?nfsOptions=soft,timeo=330,retrans=3`
Endpoint used to access a backupstore. Longhorn supports AWS S3, Azure, GCP, CIFS and NFS. See [Setting a Backup Target](../../snapshots-and-backups/backup-and-restore/set-backup-target) for details.

#### Backup Target Credential Secret

> Example: `s3-secret`
The Kubernetes secret associated with the backup target. See [Setting a Backup Target](../../snapshots-and-backups/backup-and-restore/set-backup-target) for details.

#### Backupstore Poll Interval

> Default: `300`
The interval in seconds to poll the backup store for updating volumes' **Last Backup** field. Set to 0 to disable the polling. See [Setting up Disaster Recovery Volumes](../../snapshots-and-backups/setup-disaster-recovery-volumes) for details.

For more information on how the backupstore poll interval affects the recovery time objective and recovery point objective, refer to the [concepts section.](../../concepts/#34-backupstore-update-intervals-rto-and-rpo)

#### Failed Backup Time To Live

> Default: `1440`
@@ -53,7 +53,7 @@ spec:
frontend: blockdev
```
Users can override the setting `restore-volume-recurring-jobs` by the volume spec property `spec.restoreVolumeRecurringJob`.
Users can override the setting `restore-volume-recurring-jobs` by the volume spec property `spec.restoreVolumeRecurringJob`.

- **ignored**. This is the default option that instructs Longhorn to inherit from the global setting.
- **enabled**. This option instructs Longhorn to restore volume recurring jobs from the backup target forcibly.