On Fri, May 31, 2024 at 04:50:32PM +0900, FUJITA Tomonori wrote:> On Thu, 30 May 2024 08:47:25 +0200 > Danilo Krummrich <dakr at redhat.com> wrote: > > >> >> >> For a Rust PHY driver, you know that you have a valid pointer to C's > >> >> >> device object of C's PHY device during the probe callback. The driver > >> >> >> creates a Rust device object to wrap the C pointer to the C's device > >> >> >> object and passes it to the firmware abstractions. The firmware > >> >> >> abstractions gets the C's pointer from the Rust object and calls C's > >> >> >> function to load firmware, returns the result. > >> >> >> > >> >> >> You have concerns about the simple code like the following? > >> >> >> > >> >> >> > >> >> >> diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs > >> >> >> new file mode 100644 > >> >> >> index 000000000000..6144437984a9 > >> >> >> --- /dev/null > >> >> >> +++ b/rust/kernel/device.rs > >> >> >> @@ -0,0 +1,30 @@ > >> >> >> +// SPDX-License-Identifier: GPL-2.0 > >> >> >> + > >> >> >> +//! Generic devices that are part of the kernel's driver model. > >> >> >> +//! > >> >> >> +//! C header: [`include/linux/device.h`](srctree/include/linux/device.h) > >> >> >> + > >> >> >> +use crate::types::Opaque; > >> >> >> + > >> >> >> +#[repr(transparent)] > >> >> >> +pub struct Device(Opaque<bindings::device>); > >> >> >> + > >> >> >> +impl Device { > >> >> >> + /// Creates a new [`Device`] instance from a raw pointer. > >> >> >> + /// > >> >> >> + /// # Safety > >> >> >> + /// > >> >> >> + /// For the duration of 'a, the pointer must point at a valid `device`. > >> >> > > >> >> > If the following rust code does what this comment says, then sure, I'm > >> >> > ok with it for now if it helps you all out with stuff like the firmware > >> >> > interface for the phy rust code. > >> >> > >> >> Great, thanks a lot! > >> >> > >> >> Danilo and Wedson, are there any concerns about pushing this patch [1] > >> >> for the firmware abstractions? > >> > > >> > Well, if everyone is fine with this one I don't see why we can't we go with [1] > >> > directly? AFAICS, we'd only need the following fix: > >> > > >> > -//! C header: [`include/linux/device.h`](../../../../include/linux/device.h) > >> > +//! C header: [`include/linux/device.h`](srctree/include/linux/device.h) > >> > > >> > [1] https://lore.kernel.org/rust-for-linux/20240520172554.182094-2-dakr at redhat.com/ > >> > >> The difference is that your patch touches the reference count of a > >> struct device. My patch doesn't. > >> > >> The following part in your patch allows the rust code to freely play > >> with the reference count of a struct device. Your Rust drm driver > >> interact with the reference count in a different way than C's drm > >> drivers if I understand correctly. I'm not sure that Greg will be > >> convinenced with that approach. > > > > I don't see how this is different than what we do in C. If you (for whatever > > reason) want to keep a pointer to a struct device you should make sure to > > increase the reference count of this device, such that it can't get freed for > > the time being. > > > > This is a 1:1 representation of that and conceptually identical. > > A drm driver does such? If a drm driver allocates two types of > driver-specific data and keep a pointer to the pci device, then the > driver calls get_device() twice to increase the reference count of the > pci's device?Think about it more generically. If you store a pointer to a device in some structure (driver private data, generic subsystem structure, etc.), can you guarantee that without acquiring a reference that the device' reference count doesn't drop to zero while your structure is alive? There are cases where you can't, e.g. with Arc<device::Data>. How do you guarantee (generically for every driver) the last reference of your device data is dropped with the driver's remove callback? If we make it the driver's responsibility to care about this the whole thing is unsafe as in C. In some cases you can, mostly the ones where you free a structure in the driver's remove callback. But again, then there is no advantage over what we do in C. What if you change the lifetime of your structure later on, then you may introduce a bug.> > At the very least, network device drivers don't. a driver doesn't play > with the reference count. The network subsystem does. And, the networkA quick 'grep' shows 23 occurrences of get_device() in network drivers.> subsystem doesn't care about how many pointers to a pci device a > driver keeps.If none of the drivers has structures storing a pointer to a pci device that out live the driver's remove callback that's fine.> > With the patch [1], a driver plays with the reference count of a > device and directly calls get/put_device. It's a different than C > drivers for me (not sure about drm subsystem though).It's not different. Again, when a C driver stores a device pointer in a structure whose lifetime is not bound a special boundary, like the device remove callback, it must increase the reference count. It's just that most of the time we are moving within this boundary and in C we just rely on that. In Rust we can only get this safe by taking a reference for every pointer we store. And making it safe to use seems to be the goal. Generally, the whole point of a reference count is that everyone who owns a pointer to this shared structure increases / decreases the reference count accordingly.> > But I might be misunderstanding that Greg isn't convinced with this > reference count thing. We need to wait for his response. > > > >> +// SAFETY: Instances of `Device` are always ref-counted. > >> +unsafe impl crate::types::AlwaysRefCounted for Device { > >> + fn inc_ref(&self) { > >> + // SAFETY: The existence of a shared reference guarantees that the refcount is nonzero. > >> + unsafe { bindings::get_device(self.as_raw()) }; > >> + } > >> + > >> + unsafe fn dec_ref(obj: ptr::NonNull<Self>) { > >> + // SAFETY: The safety requirements guarantee that the refcount is nonzero. > >> + unsafe { bindings::put_device(obj.cast().as_ptr()) } > >> + } > >> +} > >> > >> The following comments give the impression that Rust abstractions > >> wrongly interact with the reference count; callers check out the > >> reference counter. Nobody should do that. > > > > No, saying that the caller must ensure that the device "has a non-zero reference > > count" is a perfectly valid requirement. > > > > It doensn't mean that the calling code has to peek the actual reference count, > > but it means that it must be ensured that while we try to increase the reference > > count it can't drop to zero unexpectedly. > > Yeah, the same requirements but expressed differently.Well, instead of "ensure that `ptr` is valid, non-null, and has a non-zero reference count" you propose "the pointer must point at a valid `device`". When I ask you to specify what "pointer to valid device" means, what would be the answer? Since we're still discussing this in the thread of the firmware abstraction, if you have any further concerns, let's discuss them in the thread for the corresponding patch. Once we get to a conclusion I can send a series with only the device and firmare abstractions such that we can get them in outside of the scope of the reset of both series to get your driver going. - Danilo> > But again, I might be misunderstanding. Greg might have a different reason. > > > Your patch, as a subset of this one, has the same requirements as listed below. > > > >> > >> + /// Callers must ensure that `ptr` is valid, non-null, and has a non-zero reference count. > >> + pub unsafe fn from_raw(ptr: *mut bindings::device) -> ARef<Self> { > >> > >> + /// Callers must ensure that `ptr` is valid, non-null, and has a non-zero reference count for > >> + /// the entire duration when the returned reference exists. > >> + pub unsafe fn as_ref<'a>(ptr: *mut bindings::device) -> &'a Self { > >> + // SAFETY: Guaranteed by the safety requirements of the function. > >> + unsafe { &*ptr.cast() } > >> + } > >> > > > > >
Hi, On Fri, 31 May 2024 11:59:47 +0200 Danilo Krummrich <dakr at redhat.com> wrote:> Once we get to a conclusion I can send a series with only the device and firmare > abstractions such that we can get them in outside of the scope of the reset of > both series to get your driver going.Since your discussion with Greg seems to continue for a while, let me include the following patch that Greg approved with the next version of the PHY driver patchset. You have the new version of the firmware patch? The unused functions will not be merged so only request_firmware() and release_firmware() can be included. If not, I can include my firmware patch that I wrote before. From: Danilo Krummrich <dakr at redhat.com> Date: Fri, 7 Jun 2024 20:14:59 +0900 Subject: [PATCH] add abstraction for struct device Add abstraction for struct device. This implements only the minimum necessary functions required for the abstractions of firmware API; that is, wrapping C's pointer to a device object with Rust struct only during a caller knows the pointer is valid (e.g., the probe callback). Co-developed-by: Wedson Almeida Filho <wedsonaf at gmail.com> Signed-off-by: Wedson Almeida Filho <wedsonaf at gmail.com> Signed-off-by: Danilo Krummrich <dakr at redhat.com> Co-developed-by: FUJITA Tomonori <fujita.tomonori at gmail.com> Signed-off-by: FUJITA Tomonori <fujita.tomonori at gmail.com> Acked-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org> --- rust/kernel/device.rs | 31 +++++++++++++++++++++++++++++++ rust/kernel/lib.rs | 1 + 2 files changed, 32 insertions(+) create mode 100644 rust/kernel/device.rs diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs new file mode 100644 index 000000000000..55ec4f364628 --- /dev/null +++ b/rust/kernel/device.rs @@ -0,0 +1,31 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Generic devices that are part of the kernel's driver model. +//! +//! C header: [`include/linux/device.h`](srctree/include/linux/device.h) + +use crate::types::Opaque; + +/// Wraps the kernel's `struct task_struct`. +#[repr(transparent)] +pub struct Device(Opaque<bindings::device>); + +impl Device { + /// Creates a new [`Device`] instance from a raw pointer. + /// + /// # Safety + /// + /// For the duration of 'a, the pointer must point at a valid `device`. + pub unsafe fn from_raw<'a>(ptr: *mut bindings::device) -> &'a Self { + // CAST: `Self` is a `repr(transparent)` wrapper around `bindings::device`. + let ptr = ptr.cast::<Self>(); + // SAFETY: by the function requirements the pointer is valid and we have unique access for + // the duration of `'a`. + unsafe { &mut *ptr } + } + + /// Returns the raw pointer to the device. + pub(crate) fn as_raw(&self) -> *mut bindings::device { + self.0.get() + } +} diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index fbd91a48ff8b..dd1207f1a873 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -28,6 +28,7 @@ pub mod alloc; mod build_assert; +pub mod device; pub mod error; pub mod init; pub mod ioctl; -- 2.34.1
On Fri, Jun 07, 2024 at 09:11:32PM +0900, FUJITA Tomonori wrote:> Hi, > > On Fri, 31 May 2024 11:59:47 +0200 > Danilo Krummrich <dakr at redhat.com> wrote: > > > Once we get to a conclusion I can send a series with only the device and firmare > > abstractions such that we can get them in outside of the scope of the reset of > > both series to get your driver going. > > Since your discussion with Greg seems to continue for a while, let me > include the following patch that Greg approved with the next version > of the PHY driver patchset.Yes, please take this one now. We can build on it from there. I had a meeting yesterday with a lot of rust kernel and userspace people at Microsoft and talked a bunch about this and how to move forward. I think we need to take much smaller "steps" and even encourage most drivers to start out as a mix of c and rust as there is no real "requirement" that a driver be "pure" rust at all. This should both make the interface logic simpler to start with, AND provide a base so that people can just write the majority of their driver logic in rust, which is where the language "safety" issues are most needed, not in the lifecycle rules involving the internal driver model infrastructure. Anyway, that's all hand-wavy right now, sorry, to get back to the point here, again, let's take this, which will allow the firmware bindings to be resubmitted and hopefully accepted, and we can move forward from there to "real" things like a USB or PCI or even platform device and driver binding stuff. thanks, greg k-h
On Fri, Jun 07, 2024 at 09:11:32PM +0900, FUJITA Tomonori wrote:> Hi, > > On Fri, 31 May 2024 11:59:47 +0200 > Danilo Krummrich <dakr at redhat.com> wrote: > > > Once we get to a conclusion I can send a series with only the device and firmare > > abstractions such that we can get them in outside of the scope of the reset of > > both series to get your driver going. > > Since your discussion with Greg seems to continue for a while, let me > include the following patch that Greg approved with the next version > of the PHY driver patchset.This all doesn't make a lot of sense. If that's the case, Greg approved something the he keeps arguing about in the discussion of the original patch. Please see the discussion about the NULL pointer check [1]. Besides that, I really don't think it's the correct approach to just (partially) pick up a patch from the mailing list that is actively discussed and submit versions that are slightly altered in the points that are actively discussed. This really just complicates the situation and does not help getting to an agreement.> > You have the new version of the firmware patch? The unused functions > will not be merged so only request_firmware() and release_firmware() > can be included. If not, I can include my firmware patch that I wrote > before. >Please find the updated firmware patch in [2]. However, I propose to just send a new series with just the device abstraction and this firmware patch and proceed from there. [1] https://lore.kernel.org/rust-for-linux/Zl8_bXqK-T24y1kp at cassiopeiae/ [2] https://gitlab.freedesktop.org/drm/nova/-/commit/0683e186929c4922d565e5315525925aa2d8d686 NAK for the patch below, for the reasons above. Please also see comments inline.> > From: Danilo Krummrich <dakr at redhat.com> > Date: Fri, 7 Jun 2024 20:14:59 +0900 > Subject: [PATCH] add abstraction for struct device > > Add abstraction for struct device. This implements only the minimum > necessary functions required for the abstractions of firmware API; > that is, wrapping C's pointer to a device object with Rust struct only > during a caller knows the pointer is valid (e.g., the probe callback). > > Co-developed-by: Wedson Almeida Filho <wedsonaf at gmail.com> > Signed-off-by: Wedson Almeida Filho <wedsonaf at gmail.com> > Signed-off-by: Danilo Krummrich <dakr at redhat.com> > Co-developed-by: FUJITA Tomonori <fujita.tomonori at gmail.com> > Signed-off-by: FUJITA Tomonori <fujita.tomonori at gmail.com> > Acked-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org> > --- > rust/kernel/device.rs | 31 +++++++++++++++++++++++++++++++ > rust/kernel/lib.rs | 1 + > 2 files changed, 32 insertions(+) > create mode 100644 rust/kernel/device.rs > > diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs > new file mode 100644 > index 000000000000..55ec4f364628 > --- /dev/null > +++ b/rust/kernel/device.rs > @@ -0,0 +1,31 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +//! Generic devices that are part of the kernel's driver model. > +//! > +//! C header: [`include/linux/device.h`](srctree/include/linux/device.h) > + > +use crate::types::Opaque; > + > +/// Wraps the kernel's `struct task_struct`. > +#[repr(transparent)] > +pub struct Device(Opaque<bindings::device>); > + > +impl Device { > + /// Creates a new [`Device`] instance from a raw pointer. > + /// > + /// # Safety > + /// > + /// For the duration of 'a, the pointer must point at a valid `device`.The original patch says: "Callers must ensure that `ptr` is valid, non-null, and has a non-zero reference count for the entire duration when the returned reference exists." This is way more precise than just saying "For the duration of 'a, the pointer must point at a valid `device`.", hence we should keep the original comment.> + pub unsafe fn from_raw<'a>(ptr: *mut bindings::device) -> &'a Self { > + // CAST: `Self` is a `repr(transparent)` wrapper around `bindings::device`. > + let ptr = ptr.cast::<Self>(); > + // SAFETY: by the function requirements the pointer is valid and we have unique access for > + // the duration of `'a`. > + unsafe { &mut *ptr }Why not just + // SAFETY: Guaranteed by the safety requirements of the function. + unsafe { &*ptr.cast() } as in the original commit?> + } > + > + /// Returns the raw pointer to the device. > + pub(crate) fn as_raw(&self) -> *mut bindings::device { > + self.0.get() > + } > +} > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs > index fbd91a48ff8b..dd1207f1a873 100644 > --- a/rust/kernel/lib.rs > +++ b/rust/kernel/lib.rs > @@ -28,6 +28,7 @@ > > pub mod alloc; > mod build_assert; > +pub mod device; > pub mod error; > pub mod init; > pub mod ioctl; > -- > 2.34.1 >