Authenticated Btrfs
Posted May 1, 2020 13:51 UTC (Fri)
by mathstuf (subscriber, #69389)
Parent article: Authenticated Btrfs
Posted May 4, 2020 0:57 UTC (Mon)
by marcH (subscriber, #57642)
[Link] (18 responses)
Very good questions.
The "bolted-on" nature of dm-verity aside, I'm afraid the article missed a much more important conceptual difference
> Solutions like dm-verity and fs-verity work by storing checksums apart from the data; [...]. The developers of more modern filesystems, [...] as a result, they design the ability to calculate, store, and compare checksums into the filesystem from the beginning
dm-verity doesn't not require any secret key for mounting because it's meant for read-only partitions created by someone else and the verification is performed thanks to an _asymmetric_ key pair. Fairly different use case isn't it? Not an expert; please correct me.
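To make the contrast concrete, here is a minimal Python sketch of the two trust models. It mimics neither dm-verity's Merkle-tree format nor btrfs's metadata; the block size, function names, and single-level hash are illustrative assumptions. The point it shows: a read-only image can be verified against nothing more secret than a trusted root hash, while an authenticated read-write scheme must hold its secret key on the running system to produce valid tags for new writes.

```python
# Minimal sketch, not real dm-verity/btrfs formats: read-only verification needs
# only a public root hash; authenticated read-write needs the secret key online.
import hashlib
import hmac

BLOCK = 4096  # illustrative block size

def root_hash(blocks):
    # Hash each block, then hash the concatenation of those hashes.
    # (Real dm-verity builds a multi-level Merkle tree; one level is enough here.)
    leaves = [hashlib.sha256(b).digest() for b in blocks]
    return hashlib.sha256(b"".join(leaves)).hexdigest()

def verify_read_only(blocks, trusted_root):
    # No secret involved: anyone can recompute the hash, but nobody can craft a
    # different image that matches it without breaking SHA-256.
    return root_hash(blocks) == trusted_root

def tag_for_write(block, secret_key):
    # The read-write case: producing a valid tag for *new* data requires the
    # secret key, so the key must be present on the running system at mount time.
    return hmac.new(secret_key, block, hashlib.sha256).digest()

image = [bytes([i]) * BLOCK for i in range(8)]   # stand-in for a read-only partition
trusted = root_hash(image)                        # delivered via a trusted path (e.g. signed)
print(verify_read_only(image, trusted))           # True
image[3] = b"\xff" * BLOCK                        # offline tampering
print(verify_read_only(image, trusted))           # False
```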
Posted May 4, 2020 1:07 UTC (Mon)
by marcH (subscriber, #57642)
[Link]
Posted May 4, 2020 1:12 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
Posted May 4, 2020 2:02 UTC (Mon)
by marcH (subscriber, #57642)
[Link]
The key difference (pun intended) is that this attack vector is not possible in dm-verity's read-only approach where nothing (firmware, kernel, ...) on the running system itself holds any secret needed to generate authentic data.
The more I think about it, the bigger the difference between read-only and read-write seems to be.
Posted Jul 7, 2020 18:52 UTC (Tue)
by immibis (subscriber, #105511)
[Link]
Posted May 4, 2020 7:00 UTC (Mon)
by Wol (subscriber, #4433)
[Link] (13 responses)
This is important for raid because, as things currently stand, it can quite happily recover from *missing* data, but it can't recover from *corrupt* data. If the ability to detect corrupt data is added, then it can just throw it away and treat it like it's missing.
Cheers,
Wol
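As a toy illustration of "missing is recoverable, corrupt is not" (plain Python, nothing to do with the md code; xor_blocks and the four-byte blocks are made up for the example): RAID-5-style XOR parity rebuilds a block whose position is known to be missing, but has nothing to say about a block that is present and silently wrong.

```python
# Toy RAID-5-style stripe: data blocks plus one XOR parity block.
from functools import reduce

def xor_blocks(blocks):
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

data = [b"AAAA", b"BBBB", b"CCCC"]   # one stripe's data blocks
parity = xor_blocks(data)            # its parity block

# A device dies: its block is *missing* and we know which position is gone,
# so XOR-ing the survivors reconstructs it.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])            # True

# Silent corruption is different: the block is still readable, just wrong, and
# nothing in the XOR math points at which of the four blocks to distrust.
```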
Posted May 4, 2020 15:02 UTC (Mon)
by marcH (subscriber, #57642)
[Link] (10 responses)
Posted May 4, 2020 18:33 UTC (Mon)
by Wol (subscriber, #4433)
[Link] (9 responses)
Raid 5 has ONE parity block, which enables you to recover from ONE unknown. Corrupt data is two unknowns - which block is corrupt? And what was the correct data?
Raid 6 can recover from that, but you need to run a special program over it which necessitates taking the array off-line.
And equally, if you have a mirror, how do you know which side of the mirror is correct?
I'd like to add a mode whereby raid does check, but it would damage read performance noticeably, so there's no way it would get added unless (a) the default doesn't check and, more importantly, (b) when the check is not enabled it doesn't impact performance.
At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
Cheers,
Wol
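The "two unknowns" argument above can be played through with the same sort of toy XOR stripe (again plain Python, not md): a parity mismatch proves the stripe is inconsistent, but several different single-block "repairs" would each make it consistent again, so the math alone cannot locate the bad block.

```python
from functools import reduce

def xor_blocks(blocks):
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)            # consistent stripe

data[1] = b"BxBB"                    # silent corruption, location unknown to us
print(xor_blocks(data) != parity)    # True: we can tell *something* is wrong...

# ...but each of these single-block changes would restore consistency:
fix_parity = xor_blocks(data)                          # recompute parity
fix_data0  = xor_blocks([data[1], data[2], parity])    # overwrite data[0] instead
fix_data1  = xor_blocks([data[0], data[2], parity])    # overwrite data[1] (the real culprit)
# One parity block gives one equation, so it cannot answer both "which block?"
# and "what should it contain?" - that needs the extra redundancy of RAID-6 or
# an external checksum.
```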
Posted May 5, 2020 6:14 UTC (Tue)
by marcH (subscriber, #57642)
[Link] (8 responses)
What you asked earlier:
> > > If the ability to detect corrupt data is added, then it can just throw it away and treat it like it's missing.
But now you say it's already there, so I'm confused:
> At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
BTW why is the parity more likely to be corrupted than the data itself? I thought the disks were interchangeable.
Posted May 5, 2020 8:16 UTC (Tue)
by Wol (subscriber, #4433)
[Link]
> What you asked earlier:
Basically, raid-6 is the only raid that can recover from data corruption. 1 & 5 can detect corruption but they can't do anything about it.
> > If the ability to detect corrupt data is added, then it can just throw it away and treat it like it's missing.
> But now you say it's already there, so I'm confused:
dm-integrity/dm-verity (I got confused) can detect the corruption before the raid sees it, so by deleting the corrupt data, raids 1 & 5 can now recover.
> > At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
> BTW why is the parity more likely to be corrupted than the data itself? I thought the disks were interchangeable.
I haven't totally got my head around it, but it's something like: everything happens at once, so the data stripes are sent for writing at the same time as the parity is calculated, and the parity then comes down a bit later. So the statistics say that the majority of problems occur in the gap between writing data and parity, so just rewriting parity is more likely to fix it than cause further problems. I don't properly get it, but that's what the people who know far more than I do say.
Cheers,
Wol
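The dm-integrity idea in the comment above can be sketched like this (plain Python; IntegrityDevice and raid1_read are invented names, and the real dm-integrity metadata layout is nothing like this): each block carries a checksum, a mismatching block comes back as a read error instead of bad data, and the RAID layer can then treat it exactly like a missing block and fetch the good copy.

```python
import hashlib

class IntegrityDevice:
    """Toy stand-in for dm-integrity: keep a checksum per block and refuse to
    return a block whose checksum no longer matches."""
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.tags = [hashlib.sha256(b).digest() for b in blocks]

    def read(self, i):
        if hashlib.sha256(self.blocks[i]).digest() != self.tags[i]:
            raise IOError(f"checksum mismatch on block {i}")   # looks like a failed read
        return self.blocks[i]

def raid1_read(i, mirrors):
    # Toy RAID-1 read path: a checksum failure is handled like a dead sector,
    # so the data comes from whichever mirror still checks out.
    for dev in mirrors:
        try:
            return dev.read(i)
        except IOError:
            continue
    raise IOError("no good copy left")

left = IntegrityDevice([b"one", b"two", b"three"])
right = IntegrityDevice([b"one", b"two", b"three"])
left.blocks[1] = b"tw0"                   # silent corruption on one mirror
print(raid1_read(1, [left, right]))       # b'two' - recovered from the other mirror
```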
Posted May 8, 2020 2:48 UTC (Fri)
by neilbrown (subscriber, #359)
[Link] (6 responses)
> > At present, if there is corruption, the raid code assumes the parity is corrupt (which is usually true) and just recalculates it.
This is misleading.
The current (md/raid) code assumes that corruption is impossible at the block level, because any self-respecting storage medium would have a strong checksum, and any self-respecting data path would equally have some redundancy to avoid errors creeping in (unfortunately our hardware is not even self-aware, so self-respecting appears to be too much to ask for).
If two devices in a RAID1 do not contain identical data, or if the sum of the data in a RAID4/5/6 doesn't match the parity block(s), then this is an inconsistency, not a corruption.
In this case NEITHER BLOCK IS WRONG. I need to say that again. BOTH BLOCKS ARE CORRECT. They are just correct at different points in time.
In the case of RAID1 it REALLY DOESN'T MATTER which device is chosen to use and which device gets its data replaced. md arbitrarily chooses the earliest in the list of devices.
> BTW why is the parity more likely to be corrupted than the data itself? I thought the disks were interchangeable.
The most likely explanation is that multiple devices in the array were being written to when something went wrong (e.g. power loss) and some writes succeeded while others didn't. It doesn't matter which succeeded and which didn't.
There is NO CORRUPTION here, there is just an inconsistency.
Each block will either contain the new data or the old data, and both are correct in some sense.
(If a block got half-written to the device, which is possible if the device doesn't have a big enough capacitor, then you would get a read error because the CRC wouldn't be correct. When you get a CRC error, md/raid knows the data is wrong - and cannot even read the data anyway.)
In the case of a parity array it makes sense to use the data and ignore the parity because using the parity doesn't tell you which other device it is inconsistent with. (If you have reason to believe that the parity might not be consistent with the data, and one of the data blocks is missing - failed device - then you cannot use either data or parity, and you have a "write hole".)
I hope that clarifies the situation a little.
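The "inconsistent, not corrupt" situation and the closing write-hole remark can also be walked through with the toy XOR stripe (plain Python, not the md resync code): after an interrupted stripe update the data and the parity are each individually valid, just from different moments; resync recomputes the parity from the data, but if a data disk is also missing before that happens, rebuilding it from the stale parity silently produces wrong contents.

```python
from functools import reduce

def xor_blocks(blocks):
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

old = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(old)                 # consistent stripe on disk

# Power fails mid-update: the new data block reaches the disk, the new parity doesn't.
data = [b"DDDD", old[1], old[2]]
print(xor_blocks(data) != parity)        # True: inconsistent, yet no block is "corrupt"

# Resync on a dirty array: trust the data, recompute the parity.
parity = xor_blocks(data)
print(xor_blocks(data) == parity)        # True: consistent again

# The write hole: had disk 1 failed *before* the resync, rebuilding its block
# from the stale parity would have produced silently wrong data.
stale_parity = xor_blocks(old)
rebuilt = xor_blocks([data[0], data[2], stale_parity])
print(rebuilt == old[1])                 # False
```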
Posted May 8, 2020 4:27 UTC (Fri)
by marcH (subscriber, #57642)
[Link] (5 responses)
It really does, thank you so much!
So, the entire RAID implementation assumes blocks are either missing or old but never "corrupted" (intentionally or not), I think I got it. Combining RAID with solutions that _do_ deal with block corruption sounds like... perfect to confuse me ;-)
Posted May 14, 2020 13:57 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (4 responses)
Cheers,
Wol
Posted May 14, 2020 14:01 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (3 responses)
Cheers,
Wol
Posted May 15, 2020 14:38 UTC (Fri)
by zsitvaij (guest, #66284)
[Link]
Posted May 17, 2020 15:18 UTC (Sun)
by grifferz (subscriber, #62128)
[Link] (1 response)
Posted May 17, 2020 16:46 UTC (Sun)
by Wol (subscriber, #4433)
[Link]
mind you, once my current machine gets demoted to testbed I need to buy a 4-port add-in SATA card to fit all the 1TB and 500GB disks I've acquired, so testing with and without dm-integrity shouldn't be too bad ...
Cheers,
Wol
Posted May 4, 2020 20:04 UTC (Mon)
by leromarinvit (subscriber, #56850)
[Link] (1 response)
Isn't that exactly what dm-integrity can already do today? What would using dm-verity instead buy you, if you can't keep the private key secret because (presumably) you want to write to your array?
Posted May 4, 2020 20:43 UTC (Mon)
by Wol (subscriber, #4433)
[Link]
Cheers,
Wol
Posted May 10, 2020 13:48 UTC (Sun)
by mathstuf (subscriber, #69389)
[Link]
I don't think I got an answer to this. It seems the other branch is way off-topic to me. Or I'm not knowledgeable enough to know that the answer has been given in that subthread :).