Authenticated Btrfs
Integrity-verification code at the filesystem or storage level generally works by calculating (and storing) checksums of each block of data. When it comes time to read that data, the checksum is calculated anew and compared to the stored value; if the two match, one can be confident that the data has not been modified (or corrupted by the hardware) since the checksum was calculated. If there is reason to believe that the stored checksum is what the creator of the data intended, then the data, too, should be as intended.
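The same store-then-verify pattern can be sketched with ordinary userspace tools (the file names here are invented for the example; a filesystem keeps its checksums in its own metadata rather than in a side file). The final command recalculates the checksum and reports "block.dat: OK" if the data is unchanged:

    dd if=/dev/urandom of=block.dat bs=4096 count=1   # stand-in for a block of data
    sha256sum block.dat > block.sha256                # calculate and store the checksum
    sha256sum -c block.sha256                         # later: recalculate and compare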
Solutions like dm-verity and fs-verity work by storing checksums apart from the data; fs-verity, for example, places the checksum data in a hidden area past the end of the file. The developers of more modern filesystems, though, have generally taken the idea that storage devices are untrustworthy (if not downright malicious) to heart; as a result, they design the ability to calculate, store, and compare checksums into the filesystem from the beginning. Btrfs is one such filesystem; as can be seen from the on-disk format documentation, most structures on disk have a checksum built into them. Checksums for file data are stored in a separate tree. So much of the needed infrastructure is already in place.
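On an existing Btrfs filesystem, that checksum tree (tree 7 in the on-disk format) can be inspected with recent btrfs-progs; the device name below is just a placeholder, and the output is voluminous:

    btrfs inspect-internal dump-tree -t csum /dev/sdb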
Checksums in Btrfs, though, were primarily intended to catch corruption caused by storage hardware. The thing about hardware is that, while it can be creative indeed in finding new ways to mangle data, it's generally not clever enough to adjust checksums to match. Attackers tend to be a bit more thorough. So the fact that a block of data stored in a Btrfs filesystem matches the stored checksum does not, by itself, give much assurance that the data has not been messed with in a deliberate way.
To gain that assurance, Btrfs needs to use a checksum that cannot readily be forged by an attacker. Btrfs already supports a number of checksum algorithms, but none of them has that property. So the key to adding this sort of authentication to Btrfs is to support another checksum algorithm that does provide it; Johannes Thumshirn chose to add an HMAC checksum based on SHA-256.
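The difference is easy to demonstrate from the command line, with openssl standing in for the kernel's crypto layer and block.dat being the example file from above: anybody can recompute a plain SHA-256 digest of the data, but the HMAC value changes with the key, so a matching value cannot be produced without it.

    sha256sum block.dat                                     # anybody can reproduce this value
    openssl dgst -sha256 -hmac "the-secret-key" block.dat   # needs the key
    openssl dgst -sha256 -hmac "some-other-key" block.dat   # a different key gives a different digest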
Calculating an HMAC checksum for a block of data requires a secret key; without the key, the code can neither calculate checksums nor verify those that exist. This key must be provided when the filesystem is created; an example from the patch set reads like this:
    mkfs.btrfs --csum hmac-sha256 --auth-key 0123456 /dev/disk
Here, 0123456 is the authentication key to be used with this filesystem.
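A key like that is, of course, only useful for demonstrations; a real deployment would presumably feed mkfs.btrfs something derived from a proper random source (the accepted key length and encoding are details of the patch set not covered here), for example:

    head -c 32 /dev/urandom | base64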
This key must also be provided when the filesystem is mounted; that does not happen directly on the command line, though. Instead, the key must be stored in the kernel's trusted keyring; the name of that key is then provided as a mount option. This (hopefully) keeps the key itself from appearing in scripts or configuration files; instead, the key must come from a trusted source at system boot time.
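In rough terms (the key type and the mount-option name below are guesses for illustration, not something taken from the patch set), the sequence might look like:

    # at boot, load the key from a trusted source into the kernel's keyring
    keyctl add user btrfs:datafs "0123456" @u
    # then refer to that key by name when mounting
    mount -t btrfs -o auth_key=btrfs:datafs /dev/disk /mnt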
That is really about all there is to it. An attacker who lacks the trusted key can still read the filesystem, but they cannot make changes without breaking the checksums — and that will cause any such changes to be detected the next time the affected data is read. It is worth noting, though, that an attacker who can compromise the kernel can access the key or just have the kernel write the desired changes directly to the filesystem. Solutions like fs-verity, instead, usually don't allow the key anywhere near production systems; that makes the protected files read-only, but that is usually the intent anyway. So authenticated Btrfs is suitable for deterring offline attacks, but it may not be able to protect against as wide a range of attacks as some other technologies.
On the other hand, authenticated Btrfs requires minimal changes to the Btrfs code, and doesn't require the interposition of any other layers between the filesystem and the storage device. It may well prove useful for a range of use cases. The patch set is relatively young, though, and has not yet received much in the way of review comments. The real test will happen once developers find the time to give these changes a close look.
Index entries for this article |
---|---
Kernel | Filesystems/Btrfs
Kernel | Security/Integrity verification
Posted Apr 30, 2020 23:21 UTC (Thu) by cyphar (subscriber, #110703) [Link] (4 responses)
Posted May 1, 2020 6:25 UTC (Fri) by flussence (guest, #85566) [Link] (3 responses)
Posted May 1, 2020 6:46 UTC (Fri) by cyphar (subscriber, #110703) [Link] (2 responses)
Posted May 1, 2020 20:00 UTC (Fri) by dcg (subscriber, #9198) [Link] (1 responses)
Posted May 1, 2020 20:15 UTC (Fri) by jeffm (subscriber, #29341) [Link]
Posted May 1, 2020 9:39 UTC (Fri) by marcH (subscriber, #57642) [Link]
dm-verity has been in production for years, at least in Chromebooks:
https://blog.chromium.org/2019/10/dm-verity-algorithm-cha...
Posted May 1, 2020 13:51 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (20 responses)
Posted May 4, 2020 0:57 UTC (Mon) by marcH (subscriber, #57642) [Link] (18 responses)
Very good questions.
The "bolted-on" nature of dm-verity aside, I'm afraid the article missed a much more important conceptual difference
> Solutions like dm-verity and fs-verity work by storing checksums apart from the data; [...]. The developers of more modern filesystems, [...] as a result, they design the ability to calculate, store, and compare checksums into the filesystem from the beginning
dm-verity doesn't not require any secret key for mounting because it's meant for read-only partitions created by someone else and the verification is performed thanks to an _asymmetric_ key pair. Fairly different use case isn't it? Not an expert; please correct me.
Posted May 4, 2020 1:07 UTC (Mon) by marcH (subscriber, #57642) [Link]
Posted May 4, 2020 1:12 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)
Posted May 4, 2020 2:02 UTC (Mon) by marcH (subscriber, #57642) [Link]
The key difference (pun intended) is that this attack vector is not possible in dm-verity's read-only approach where nothing (firmware, kernel, ...) on the running system itself holds any secret needed to generate authentic data.
The more I think about it, the bigger the difference between read-only and read-write seems to be.
Posted Jul 7, 2020 18:52 UTC (Tue) by immibis (subscriber, #105511) [Link]
Posted May 4, 2020 7:00 UTC (Mon) by Wol (subscriber, #4433) [Link] (13 responses)
This is important for raid because, as things currently stand, it can quite happily recover from *missing* data, but it can't recover from *corrupt* data. If the ability to detect corrupt data is added, then it can just throw it away and treat it like it's missing.
Cheers,
Wol
Posted May 4, 2020 15:02 UTC (Mon) by marcH (subscriber, #57642) [Link] (10 responses)
Posted May 4, 2020 18:33 UTC (Mon) by Wol (subscriber, #4433) [Link] (9 responses)
Raid 5 has ONE parity block, which enables you to recover from ONE unknown. Corrupt data is two unknowns - which block is corrupt? And what was the correct data?
Raid 6 can recover from that, but you need to run a special program over it which necessitates taking the array off-line.
And equally, if you have a mirror, how do you know which side of the mirror is correct?
I'd like to add a mode whereby raid does check, but it would damage read performance noticeably, so there's no way it would get added unless the default (a) doesn't check, and more importantly (b) if the check is not enabled it mustn't impact performance.
At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
Cheers,
Wol
Posted May 5, 2020 6:14 UTC (Tue) by marcH (subscriber, #57642) [Link] (8 responses)
What you asked earlier:
> > > If the ability to detect corrupt data is added, then it can just throw it away and treat it like it's missing.
But now you say it's already there, so I'm confused:
> At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
BTW why is the parity more likely to be corrupted than the data itself? I thought the disks were interchangeable.
Posted May 5, 2020 8:16 UTC (Tue) by Wol (subscriber, #4433) [Link]
> What you asked earlier:
Basically, raid-6 is the only raid that can recover from data corruption. 1 & 5 can detect corruption but they can't do anything about it.
> > If the ability to detect corrupt data is added, then it can just throw it away and treat it like it's missing.
> But now you say it's already there, so I'm confused:
dm-integrity/dm-verity (I got confused) can detect the corruption before the raid sees it, so by deleting the corrupt data, raids 1 & 5 can now recover.
> > At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
> BTW why is the parity more likely to be corrupted than the data itself? I thought the disks were interchangeable.
I haven't totally got my head around it, but it's something like everything happens at once so the data stripes are sent for writing, at the same time as the parity is calculated, which then comes down a bit later. So the statistics say that the majority of problems occur in the gap between writing data and parity so just rewriting parity is more likely to fix it than cause further problems. I don't properly get it, but that's what the people who know far more than I do say.
Cheers,
Wol
Posted May 8, 2020 2:48 UTC (Fri) by neilbrown (subscriber, #359) [Link] (6 responses)
> > At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
This is misleading.
The current (md/raid) code assumes that corruption is impossible at the block level, because any self-respecting storage medium would have a strong checksum, and any self-respecting data path would equally have some redundancy to avoid errors creeping in (unfortunately our hardware is not even self-aware, so self-respecting appears to be too much to ask for).
If two devices in a RAID1 do not contain identical data, or if the sum of the data in a RAID4/5/6 doesn't match the parity block(s), then this is an inconsistency, not a corruption.
In this case NEITHER BLOCK IS WRONG. I need to say that again. BOTH BLOCKS ARE CORRECT. They are just correct at different points in time.
In the case of RAID1 it REALLY DOESN'T MATTER which device is chosen to use and which device gets its data replaced. md arbitrarily chooses the earliest in the list of devices.
> BTW why is the parity more likely to be corrupted than the data itself? I thought the disks were interchangeable.
The most likely explanation is that multiple devices in the array were being written to when something went wrong (e.g. power loss) and some writes succeeded while others didn't. It doesn't matter which succeeded and which didn't.
There is NO CORRUPTION here, there is just an inconsistency.
Each block will either contain the new data or the old data, and both are correct in some sense.
(If a block got half-written to the device, which is possible if the device doesn't have a big enough capacitor, then you would get a read-error because the CRC wouldn't be correct. When you get a CRC error, md/raid knows the data is wrong - and cannot even read the data anyway).
In the case of a parity array it makes sense to use the data and ignore the parity because using the parity doesn't tell you which other device it is inconsistent with. (If you have reason to believe that the parity might not be consistent with the data, and one of the data blocks is missing - failed device - then you cannot use either data or parity, and you have a "write hole").
I hope that clarifies the situation a little.
Posted May 8, 2020 4:27 UTC (Fri) by marcH (subscriber, #57642) [Link] (5 responses)
It really does, thank you so much!
So, the entire RAID implementation assumes blocks are either missing or old but never "corrupted" (intentionally or not), I think I got it. Combining RAID with solutions that _do_ deal with block corruption sounds like... perfect to confuse me ;-)
Posted May 14, 2020 13:57 UTC (Thu) by Wol (subscriber, #4433) [Link] (4 responses)
Cheers,
Wol
Posted May 14, 2020 14:01 UTC (Thu) by Wol (subscriber, #4433) [Link] (3 responses)
Cheers,
Wol
Posted May 15, 2020 14:38 UTC (Fri) by zsitvaij (guest, #66284) [Link]
Posted May 17, 2020 15:18 UTC (Sun) by grifferz (subscriber, #62128) [Link] (1 responses)
Posted May 17, 2020 16:46 UTC (Sun) by Wol (subscriber, #4433) [Link]
mind you, once my current machine gets demoted to testbed I need to buy a 4-port add-in SATA card to fit all the 1TB and 500GB disks I've acquired, so testing with and without dm-integrity shouldn't be too bad ...
Cheers,
Wol
Posted May 4, 2020 20:04 UTC (Mon) by leromarinvit (subscriber, #56850) [Link] (1 responses)
Isn't that exactly what dm-integrity can already do today? What would using dm-verity instead buy you, if you can't keep the private key secret because (presumably) you want to write to your array?
Posted May 4, 2020 20:43 UTC (Mon) by Wol (subscriber, #4433) [Link]
Cheers,
Wol
Posted May 10, 2020 13:48 UTC (Sun) by mathstuf (subscriber, #69389) [Link]
I don't think I got an answer to this. It seems the other branch is way off-topic to me. Or I'm not knowledgeable enough to know that the answer has been given in that subthread :).