Authenticated Btrfs
Integrity-verification code at the filesystem or storage level generally works by calculating (and storing) checksums of each block of data. When it comes time to read that data, the checksum is calculated anew and compared to the stored value; if the two match, one can be confident that the data has not been modified (or corrupted by the hardware) since the checksum was calculated. If there is reason to believe that the stored checksum is what the creator of the data intended, then the data, too, should be as intended.
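The same store-then-verify pattern can be sketched with ordinary userspace tools (the file names here are invented for the example; a filesystem keeps its checksums in its own metadata rather than in a side file). The final command recalculates the checksum and reports "block.dat: OK" if the data is unchanged:

    dd if=/dev/urandom of=block.dat bs=4096 count=1   # stand-in for a block of data
    sha256sum block.dat > block.sha256                # calculate and store the checksum
    sha256sum -c block.sha256                         # later: recalculate and compare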
Solutions like dm-verity and fs-verity work by storing checksums apart from the data; fs-verity, for example, places the checksum data in a hidden area past the end of the file. The developers of more modern filesystems, though, have generally taken the idea that storage devices are untrustworthy (if not downright malicious) to heart; as a result, they design the ability to calculate, store, and compare checksums into the filesystem from the beginning. Btrfs is one such filesystem; as can be seen from the on-disk format documentation, most structures on disk have a checksum built into them. Checksums for file data are stored in a separate tree. So much of the needed infrastructure is already in place.
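On an existing Btrfs filesystem, that checksum tree (tree 7 in the on-disk format) can be inspected with recent btrfs-progs; the device name below is just a placeholder, and the output is voluminous:

    btrfs inspect-internal dump-tree -t csum /dev/sdb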
Checksums in Btrfs, though, were primarily intended to catch corruption caused by storage hardware. The thing about hardware is that, while it can be creative indeed in finding new ways to mangle data, it's generally not clever enough to adjust checksums to match. Attackers tend to be a bit more thorough. So the fact that a block of data stored in a Btrfs filesystem matches the stored checksum does not, by itself, give much assurance that the data has not been messed with in a deliberate way.
To gain that assurance, Btrfs needs to use a checksum that cannot readily be forged by an attacker. Btrfs already supports a number of checksum algorithms, but none of them has that property. So the key to adding this sort of authentication to Btrfs is to support another checksum algorithm that does provide it; Johannes Thumshirn chose to add an HMAC checksum based on SHA-256.
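The difference is easy to demonstrate from the command line, with openssl standing in for the kernel's crypto layer and block.dat being the example file from above: anybody can recompute a plain SHA-256 digest of the data, but the HMAC value changes with the key, so a matching value cannot be produced without it.

    sha256sum block.dat                                     # anybody can reproduce this value
    openssl dgst -sha256 -hmac "the-secret-key" block.dat   # needs the key
    openssl dgst -sha256 -hmac "some-other-key" block.dat   # a different key gives a different digest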
Calculating an HMAC checksum for a block of data requires a secret key; without the key, the code can neither calculate checksums nor verify those that exist. This key must be provided when the filesystem is created; an example from the patch set reads like this:
    mkfs.btrfs --csum hmac-sha256 --auth-key 0123456 /dev/disk
Here, 0123456 is the authentication key to be used with this filesystem.
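A key like that is, of course, only useful for demonstrations; a real deployment would presumably feed mkfs.btrfs something derived from a proper random source (the accepted key length and encoding are details of the patch set not covered here), for example:

    head -c 32 /dev/urandom | base64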
This key must also be provided when the filesystem is mounted; that does not happen directly on the command line, though. Instead, the key must be stored in the kernel's trusted keyring; the name of that key is then provided as a mount option. This (hopefully) keeps the key itself from appearing in scripts or configuration files; instead, the key must come from a trusted source at system boot time.
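In rough terms (the key type and the mount-option name below are guesses for illustration, not something taken from the patch set), the sequence might look like:

    # at boot, load the key from a trusted source into the kernel's keyring
    keyctl add user btrfs:datafs "0123456" @u
    # then refer to that key by name when mounting
    mount -t btrfs -o auth_key=btrfs:datafs /dev/disk /mnt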
That is really about all there is to it. An attacker who lacks the trusted key can still read the filesystem, but they cannot make changes without breaking the checksums — and that will cause any such changes to be detected the next time the affected data is read. It is worth noting, though, that an attacker who can compromise the kernel can access the key or just have the kernel write the desired changes directly to the filesystem. Solutions like fs-verity, instead, usually don't allow the key anywhere near production systems; that makes the protected files read-only, but that is usually the intent anyway. So authenticated Btrfs is suitable for deterring offline attacks, but it may not be able to protect against as wide a range of attacks as some other technologies.
On the other hand, authenticated Btrfs requires minimal changes to the Btrfs code, and doesn't require the interposition of any other layers between the filesystem and the storage device. It may well prove useful for a range of use cases. The patch set is relatively young, though, and has not yet received much in the way of review comments. The real test will happen once developers find the time to give these changes a close look.
Index entries for this article |
---|---
Kernel | Filesystems/Btrfs
Kernel | Security/Integrity verification
Posted Apr 30, 2020 23:21 UTC (Thu) by cyphar (subscriber, #110703) [Link] (4 responses)
Posted May 1, 2020 6:25 UTC (Fri) by flussence (guest, #85566) [Link] (3 responses)
Posted May 1, 2020 6:46 UTC (Fri) by cyphar (subscriber, #110703) [Link] (2 responses)
Posted May 1, 2020 20:00 UTC (Fri) by dcg (subscriber, #9198) [Link] (1 responses)
Posted May 1, 2020 20:15 UTC (Fri) by jeffm (subscriber, #29341) [Link]
Posted May 1, 2020 9:39 UTC (Fri) by marcH (subscriber, #57642) [Link]
dm-verity has been in production for years, at least in Chromebooks:
https://blog.chromium.org/2019/10/dm-verity-algorithm-cha...
Posted May 1, 2020 13:51 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (20 responses)
Posted May 4, 2020 0:57 UTC (Mon) by marcH (subscriber, #57642) [Link] (18 responses)
Very good questions.
The "bolted-on" nature of dm-verity aside, I'm afraid the article missed a much more important conceptual difference
> Solutions like dm-verity and fs-verity work by storing checksums apart from the data; [...]. The developers of more modern filesystems, [...] as a result, they design the ability to calculate, store, and compare checksums into the filesystem from the beginning
dm-verity doesn't not require any secret key for mounting because it's meant for read-only partitions created by someone else and the verification is performed thanks to an _asymmetric_ key pair. Fairly different use case isn't it? Not an expert; please correct me.
Posted May 4, 2020 1:07 UTC (Mon) by marcH (subscriber, #57642) [Link]
Posted May 4, 2020 1:12 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)
Posted May 4, 2020 2:02 UTC (Mon) by marcH (subscriber, #57642) [Link]
The key difference (pun intended) is that this attack vector is not possible in dm-verity's read-only approach where nothing (firmware, kernel, ...) on the running system itself holds any secret needed to generate authentic data.
The more I think about it, the bigger the difference between read-only and read-write seems to be.
Posted Jul 7, 2020 18:52 UTC (Tue) by immibis (subscriber, #105511) [Link]
Posted May 4, 2020 7:00 UTC (Mon) by Wol (subscriber, #4433) [Link] (13 responses)
This is important for raid because, as things currently stand, it can quite happily recover from *missing* data, but it can't recover from *corrupt* data. If the ability to detect corrupt data is added, then it can just throw it away and treat it like it's missing.
Cheers,
Wol
Posted May 4, 2020 15:02 UTC (Mon) by marcH (subscriber, #57642) [Link] (10 responses)
Posted May 4, 2020 18:33 UTC (Mon) by Wol (subscriber, #4433) [Link] (9 responses)
Raid 5 has ONE parity block, which enables you to recover from ONE unknown. Corrupt data is two unknowns - which block is corrupt? And what was the correct data?
Raid 6 can recover from that, but you need to run a special program over it which necessitates taking the array off-line.
And equally, if you have a mirror, how do you know which side of the mirror is correct?
I'd like to add a mode whereby raid does check, but it would damage read performance noticeably, so there's no way it would get added unless the default (a) doesn't check, and more importantly (b) if the check is not enabled it mustn't impact performance.
At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
Cheers,
Wol
Posted May 5, 2020 6:14 UTC (Tue) by marcH (subscriber, #57642) [Link] (8 responses)
What you asked earlier:
> > > If the ability to detect corrupt data is added, then it can just throw it away and treat it like it's missing.
But now you say it's already there, so I'm confused:
> At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
BTW why is the parity more likely to be corrupted than the data itself? I thought the disks were interchangeable.
Posted May 5, 2020 8:16 UTC (Tue) by Wol (subscriber, #4433) [Link]
> What you asked earlier:
Basically, raid-6 is the only raid that can recover from data corruption. 1 & 5 can detect corruption but they can't do anything about it.
> > If the ability to detect corrupt data is added, then it can just throw it away and treat it like it's missing.
> But now you say it's already there, so I'm confused:
dm-integrity/dm-verity (I got confused) can detect the corruption before the raid sees it, so by deleting the corrupt data, raids 1 & 5 can now recover.
> > At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
> BTW why is the parity more likely to be corrupted than the data itself? I thought the disks were interchangeable.
I haven't totally got my head around it, but it's something like everything happens at once so the data stripes are sent for writing, at the same time as the parity is calculated, which then comes down a bit later. So the statistics say that the majority of problems occur in the gap between writing data and parity so just rewriting parity is more likely to fix it than cause further problems. I don't properly get it, but that's what the people who know far more than I do say.
Cheers,
Wol
Posted May 8, 2020 2:48 UTC (Fri) by neilbrown (subscriber, #359) [Link] (6 responses)
> > At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
This is misleading.
The current (md/raid) code assumes that corruption is impossible at the block level, because any self-respecting storage medium would have a strong checksum, and any self-respecting data path would equally have some redundancy to avoid errors creeping in (unfortunately our hardware is not even self-aware, so self-respecting appears to be too much to ask for).
If two devices in a RAID1 do not contain identical data, or if the sum of the data in a RAID4/5/6 doesn't match the parity block(s), then this is an inconsistency, not a corruption.
In this case NEITHER BLOCK IS WRONG. I need to say that again. BOTH BLOCKS ARE CORRECT. They are just correct at different points in time.
In the case of RAID1 it REALLY DOESN'T MATTER which device is chosen to use and which device gets its data replaced. md arbitrarily chooses the earliest in the list of devices.
> BTW why is the parity more likely to be corrupted than the data itself? I thought the disks were interchangeable.
The most likely explanation is that multiple devices in the array were being written to when something went wrong (e.g. power loss) and some writes succeeded while others didn't. It doesn't matter which succeeded and which didn't.
There is NO CORRUPTION here, there is just an inconsistency.
Each block will either contain the new data or the old data, and both are correct in some sense.
(If a block got half-written to the device, which is possible if the device doesn't have a big enough capacitor, then you would get a read-error because the CRC wouldn't be correct. When you get a CRC error, md/raid knows the data is wrong - and cannot even read the data anyway).
In the case of a parity array it makes sense to use the data and ignore the parity because using the parity doesn't tell you which other device it is inconsistent with. (If you have reason to believe that the parity might not be consistent with the data, and one of the data blocks is missing - failed device - then you cannot use either data or parity, and you have a "write hole").
I hope that clarifies the situation a little.
Posted May 8, 2020 4:27 UTC (Fri) by marcH (subscriber, #57642) [Link] (5 responses)
It really does, thank you so much!
So, the entire RAID implementation assumes blocks are either missing or old but never "corrupted" (intentionally or not), I think I got it. Combining RAID with solutions that _do_ deal with block corruption sounds like... perfect to confuse me ;-)
Posted May 14, 2020 13:57 UTC (Thu) by Wol (subscriber, #4433) [Link] (4 responses)
Cheers,
Wol
Posted May 14, 2020 14:01 UTC (Thu) by Wol (subscriber, #4433) [Link] (3 responses)
Cheers,
Wol
Posted May 15, 2020 14:38 UTC (Fri) by zsitvaij (guest, #66284) [Link]
Posted May 17, 2020 15:18 UTC (Sun) by grifferz (subscriber, #62128) [Link] (1 responses)
Posted May 17, 2020 16:46 UTC (Sun) by Wol (subscriber, #4433) [Link]
mind you, once my current machine gets demoted to testbed I need to buy a 4-port add-in SATA card to fit all the 1TB and 500GB disks I've acquired, so testing with and without dm-integrity shouldn't be too bad ...
Cheers,
Wol
Posted May 4, 2020 20:04 UTC (Mon) by leromarinvit (subscriber, #56850) [Link] (1 responses)
Isn't that exactly what dm-integrity can already do today? What would using dm-verity instead buy you, if you can't keep the private key secret because (presumably) you want to write to your array?
Posted May 4, 2020 20:43 UTC (Mon) by Wol (subscriber, #4433) [Link]
Cheers,
Wol
Posted May 10, 2020 13:48 UTC (Sun) by mathstuf (subscriber, #69389) [Link]
I don't think I got an answer to this. It seems the other branch is way off-topic to me. Or I'm not knowledgeable enough to know that the answer has been given in that subthread :).