Blog post for metadata rework
jbaublitz committed Jul 26, 2024
1 parent 7edfc0a commit e1f287b
Showing 1 changed file with 71 additions and 0 deletions.
71 changes: 71 additions & 0 deletions content/metadata-rework.md
@@ -0,0 +1,71 @@
+++
title = "Metadata rework"
date = 2024-07-25
weight = 39
template = "page.html"
render = true
+++

*John Baublitz, Stratis Team*

Until now, Stratis has had a metadata format that, with a few exceptions, works
remarkably well for our current feature set. However, as we plan for larger features
like RAID, integrity, and online reencryption, we've found that we need to rework
how our layers are stacked and to reserve space ahead of time for new devicemapper
layers. The metadata rework has been a large effort to pave the way for Stratis moving forward.
Once these features are added, it will give users more flexibility to enable and disable certain
layers after pool creation time.

### Changes to encryption

Encryption might be the single largest change in the metadata rework PR. The crypt device now
sits above the Stratis metadata and above the point where RAID and integrity will sit in the stack.

Let's dive into the practical applications of this.

#### FDE

V1 of the metadata provided full disk encryption, but V2 will not. Data, cache, and
thinpool/filesystem metadata will all still be encrypted, but integrity metadata, RAID
metadata, and the Stratis metadata, which is generally just a record of the devicemapper
tables for the Stratis pool, will not be encrypted.
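
As a rough illustration of the split described above, the following sketch tabulates which regions are encrypted under V2. The region labels are hypothetical names chosen for this post, not identifiers from stratisd.

```python
# Illustrative only: these labels are made up for this post and are
# not identifiers used by stratisd.
V2_ENCRYPTED = {
    "data": True,
    "cache": True,
    "thinpool/filesystem metadata": True,
    "integrity metadata": False,
    "RAID metadata": False,
    "Stratis metadata": False,
}


def unencrypted_regions(layout):
    """Return the names of the regions that are stored unencrypted."""
    return sorted(name for name, encrypted in layout.items() if not encrypted)


for region in unencrypted_regions(V2_ENCRYPTED):
    print(f"not encrypted in V2: {region}")
```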

We discussed this at length with the cryptsetup maintainers and the security team, and the
general consensus is that leaving this data unencrypted will not leak any
information about the encrypted data stored above it in the stack.

#### Higher layer in the stack

The switch of encryption's position in the stack also has a surprising number of benefits.

Firstly, the operation time on encrypted pools no longer scales linearly with the
number of devices. Because Stratis can add devices to a single pool, FDE required
encrypting each new device with a separate crypt device, which meant passing
through PBKDF once for each device. As a result, execution times for operations like unlock
grew with the number of devices in the pool. In V2 of the metadata, each encryption operation
that needs to go through PBKDF runs in constant time regardless of the number of devices.
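
To make the scaling concrete, here is a minimal sketch that times one key derivation per crypt device. It uses PBKDF2-HMAC-SHA256 from Python's `hashlib` purely as a stand-in; the actual key derivation function and its parameters are chosen by cryptsetup, and the function names and iteration count below are assumptions for illustration.

```python
import hashlib
import os
import time


def unlock(passphrase, salts, iterations=50_000):
    """Derive one key per salt; each salt stands in for one crypt device."""
    return [
        hashlib.pbkdf2_hmac("sha256", passphrase, salt, iterations)
        for salt in salts
    ]


def timed_unlock(num_crypt_devices):
    """Return the seconds spent deriving keys for the given number of crypt devices."""
    salts = [os.urandom(16) for _ in range(num_crypt_devices)]
    start = time.perf_counter()
    unlock(b"example passphrase", salts)
    return time.perf_counter() - start


# V1-style: one crypt device per block device in the pool.
# V2-style: a single crypt device for the whole pool.
for devices in (1, 4, 8):
    v1 = timed_unlock(devices)
    v2 = timed_unlock(1)
    print(f"{devices} devices: V1-style {v1:.3f}s, V2-style {v2:.3f}s")
```

The V1-style timing should grow roughly in proportion to the device count, while the V2-style timing stays about the same.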

Secondly, the cache is encrypted without additional work simply because encryption is layered
on top of the caching layer. This simplifies cache handling, since cached data needs no special
treatment to be encrypted.

Thirdly, this avoids the encryption amplification problem we would otherwise have run into with RAID.
Because each leg would previously have had a separate crypt device, the CPU load of encryption would
have been multiplied by the number of legs in the RAID array, reducing throughput.
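
The amplification can be sketched with simple arithmetic. The example below assumes a mirrored array where every leg holds a full copy of the data; the function name and parameters are hypothetical and only model the amount of data passed through the cipher.

```python
def bytes_encrypted(write_size, legs, crypt_below_raid):
    """Bytes passed through the cipher for one logical write to a mirrored array."""
    if crypt_below_raid:
        # One crypt device per leg: every copy is encrypted separately.
        return write_size * legs
    # Crypt device above RAID: encrypt once, then replicate the ciphertext.
    return write_size


write_size = 4096
for legs in (2, 3):
    below = bytes_encrypted(write_size, legs, crypt_below_raid=True)
    above = bytes_encrypted(write_size, legs, crypt_below_raid=False)
    print(f"{legs} legs: {below} bytes encrypted below RAID, {above} above RAID")
```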

### Changes to metadata reservation

In V2 of the metadata layout, we've begun reserving space for the crypt header, integrity metadata, and md-raid
metadata. As a result, we will increasingly be able to toggle certain features
on and off after pool creation time. Work has already begun on supporting this for encryption, and we intend to
do this at least for integrity as well. We are still discussing the best way to support this for RAID given
certain restrictions in the case of converting from a multi-disk pool to a RAID array.

### Backports

Moving forward, we intend to port all features that are compatible with V1 of the metadata layout back to V1. Some
features, like turning encryption on and off, cannot be backported due to the differences outlined here. For those,
we recommend that users migrate data to a V2 pool to take advantage of these upcoming features.

Please let us know what you think of the new metadata layout in a GitHub issue!
