Content-Length: 18690 | pFad | http://lwn.net/Articles/990918/

Getting PCI driver abstractions upstream [LWN.net]
|
|
Subscribe / Log in / New account

Getting PCI driver abstractions upstream

By Daroc Alden
September 26, 2024
Kangrejos 2024

Danilo Krummrich gave a talk at Kangrejos 2024 focusing on the question of how the Rust-for-Linux project could improve at getting device and driver abstractions upstream. As a case study, he used some of his recent work that attempts to make it possible to write a PCI driver entirely in Rust. There wasn't time to go into as much detail as he would have liked, but he did demonstrate that it is possible to interface with the kernel's module loader in a way that is much harder to screw up than the current standard approach in C.

To give context to the discussion, he started by explaining that his goal was to make development of Nova (the new NVIDIA driver he has been working on) go smoothly. He opined that Nova would probably be the first "more complex" thing to go upstream. Luckily, other in-progress efforts to write kernel components in Rust need some of the same abstractions as Nova, including rvkms, rnvme, Apple AGX, and rcpufreq-dt. Ultimately, he would like to provide driver infrastructure that integrates with the kernel, but that takes advantage of Rust's capabilities where possible.

Initially, there was confusion on the mailing list, Krummrich said, about what the Rust abstractions he wanted to discuss were meant to represent. The patch set that spawned the discussion is not really about abstracting drivers he explained, but about permitting drivers to access bus types safely — abstractions needed for drivers. A fact which had proved difficult to communicate via email. At the end of the discussion, Greg Kroah-Hartman had called for writing the driver-registration code itself separately, in C.

At Kangrejos, Kroah-Hartman took responsibility for half of the misunderstanding, but said that he did still think it made sense to implement driver registration itself in C. He said that it would be a small amount of "safe C" — a comment that prompted good-natured chuckles from the attending Rust programmers. Krummrich was sympathetic, but wanted to go through an example of what he was trying to achieve with the driver abstractions and why.

Krummrich showed the code for a simple PCI driver written in C, and then went through a series of modifications to incrementally rewrite it in Rust. The simplest change was just to replace the probe and remove functions with versions written in Rust:

    #[no_mangle]
    unsafe extern "C" fn rust_pci_driver_probe(
        _pdev: *mut bindings::pci_dev,
        _ent: bindings::pci_device_id,
    ) -> core::ffi::c_int {
        pr_info!("Probe Rust PCI driver sample.\n");

        0
    }

    #[no_mangle]
    unsafe extern "C" fn rust_pci_driver_remove(_pdev: *mut bindings::pci_dev) {
        pr_info!("Remove Rust PCI driver sample.\n");
    }

The code was fairly simple, but after showing it, Krummrich explained that there was actually already a bug in this example that he hadn't noticed until later: the signature of the Rust functions was incorrect. It didn't cause problems, because the incorrect parameter was unused, but it still served to illustrate why he wanted to make interfacing with the kernel's driver system less error-prone.

He also mentioned an issue that didn't show up in this simple example — binding object lifetimes to driver and module lifetimes, instead of device lifetimes, such as is sometimes necessary for complex drivers. Kroah-Hartman thought that it was good for that to be hard, because "we don't want you to do that". He explained that the kernel tried hard to ensure that data is bound to data, not to code, and that drivers should use per-device storage. Krummrich said that the same problem applied to trying to set up per-device storage, in this case. Ultimately, they both agreed that what they wanted was simple lifetime handling that bound data (such as cache data) to other data (such as a specific device), and that the simplest way to integrate Rust and C code did not make that easy.

Krummrich did include the caveat that Nova had some buffers that needed to live for the lifetime of the module itself, because of how NVIDIA's GPU system processor handles debugging information. Kroah-Hartman replied that GPUs were "the most crazy complex hardware out there". So he was fine with GPU drivers needing to do something out of the ordinary — but normal drivers shouldn't need to.

This prompted some discussion of which subsystems might need to do things differently. Kroah-Hartman agreed that some of the core infrastructure would need more complicated lifetime management as well. Most drivers are simple I2C or PCI drivers, however, he said. So it's only a handful of complicated drivers that need more complex lifetime management.

Krummrich then showed how the bindings could be made safer by moving some of the implementation into Rust; he demonstrated this in stages. The first stage was to use a macro to declare the module in Rust instead of C. That still left unsafe functions, the possibility that the driver could forget to unregister correctly, and some awkwardness with constructing the PCI device ID table.

Next, he showed how the uses of unsafe could be centralized to a single helper library, such that each driver only needs safe functions, with signatures type-checked by the compiler. The new version involves creating a type that implements the pci::Driver trait, which then takes care of the details of registering and unregistering the driver, and storing device-specific information in a type-safe way.

There were some more questions about the specific details. Kroah-Hartman asked why Krummrich had reimplemented a structure in Rust instead of using the version defined in C; Krummrich said that he didn't want to expose the structure directly to drivers. Benno Lossin suggested using a type alias, so that it's still not defined in two places.

The most involved part of the code concerned the machinery to make the PCI device ID table automatically add an appropriate sentinel value to the end at compile time, so that the whole device ID table can be stored in the binary's data section — a much more elegant solution than creating a copy of the table at run time. That ended up requiring the use of some additional intermediate traits, and made the whole code more than twice as large.

Kroah-Hartman said that the code was great, but that the complexity needed to add the sentinel "sounds crazy". He asked whether there wasn't a simpler way to do it, but nobody present volunteered a way.

Krummrich's example did show that it is definitely possible to build a driver-binding abstraction that requires no unsafe code in drivers. Such a driver cannot accidentally forget to unregister, automatically has type-safe storage associated with each device that it is registered for, does not need to touch raw pointers (instead receiving bounds-checked smart pointers for the appropriate PCI buses), and should generally be quite difficult to misuse.

The session ran out of time before Krummrich could finish his presentation. The overall conclusion from Kroah-Hartman was that he wants to see the full example in a patch set, but that it was a promising start. Krummrich wants this driver abstraction to be available for Nova and the other Rust drivers that use the same features, so it seems likely that he'll provide that example soon.


Index entries for this article
KernelDevelopment tools/Rust
ConferenceKangrejos/2024


to post comments

Device ID table with less craziness

Posted Sep 27, 2024 9:53 UTC (Fri) by garyguo (subscriber, #173367) [Link]

Some recent development on this topic. I was experimenting a different approach that would not require the macros. The const functions within there are still a bit crazy though, because what can be done at CTFE time is a bit limited.

Link on R4L zulip about this: https://rust-for-linux.zulipchat.com/#narrow/stream/288089-General/topic/.60DeviceId.60.20without.20macros

This progress is only made possible very recently by Rust allowing mutable references to be used in CTFE (https://github.com/rust-lang/rust/pull/129195) and therefore a lot more (crazy stuff) can be done during const evaluation.


Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds









ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://lwn.net/Articles/990918/

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy