Michael S. Tsirkin
2022-Jul-04 06:22 UTC
[PATCH V3] virtio: disable notification hardening by default
On Mon, Jul 04, 2022 at 12:23:27PM +0800, Jason Wang wrote:> > So if there are not examples of callbacks not ready after kick > > then let us block callbacks until first kick. That is my idea. > > Ok, let me try. I need to drain my queue of fixes first. > > ThanksIf we do find issues, another option is blocking callbacks until the first add. A bit higher overhead as add is a more common operation but it has even less of a chance to introduce regressions.> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >I couldn't ... except maybe bluetooth > > > > > > > > > > > but that's just maintainer nacking fixes saying he'll fix it > > > > > > > > > > > his way ... > > > > > > > > > > > > > > > > > > > > > > > And during remove(), we get another window: > > > > > > > > > > > > > > > > > > > > > > > > subsysrem_unregistration() > > > > > > > > > > > > /* the window */ > > > > > > > > > > > > virtio_device_reset() > > > > > > > > > > > > > > > > > > > > > > Same here. > > > > > > > > > > > > > > > > > > Basically for the drivers that set driver_ok before registration, > > > > > > > > > > > > > > > > I don't see what does driver_ok have to do with it. > > > > > > > > > > > > > > I meant for those driver, in probe they do() > > > > > > > > > > > > > > virtio_device_ready() > > > > > > > subsystem_register() > > > > > > > > > > > > > > In remove() they do > > > > > > > > > > > > > > subsystem_unregister() > > > > > > > virtio_device_reset() > > > > > > > > > > > > > > for symmetry > > > > > > > > > > > > Let's leave remove alone for now. I am close to 100% sure we have *lots* > > > > > > of issues around it, but while probe is unavoidable remove can be > > > > > > avoided by blocking hotplug. > > > > > > > > > > Unbind can trigger this path as well. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > so > > > > > > > > > we have a lot: > > > > > > > > > > > > > > > > > > blk, net, mac80211_hwsim, scsi, vsock, bt, crypto, gpio, gpu, i2c, > > > > > > > > > iommu, caif, pmem, input, mem > > > > > > > > > > > > > > > > > > So I think there's no easy way to harden the notification without > > > > > > > > > auditing the driver one by one (especially considering the driver may > > > > > > > > > use bh or workqueue). The problem is the notification hardening > > > > > > > > > depends on a correct or race-free probe/remove. So we need to fix the > > > > > > > > > issues in probe/remove then do the hardening on the notification. > > > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > So if drivers kick but are not ready to get callbacks then let's fix > > > > > > > > that first of all, these are racy with existing qemu even ignoring > > > > > > > > spec compliance. > > > > > > > > > > > > > > Yes, (the patches I've posted so far exist even with a well-behaved device). > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > patches you posted deal with DRIVER_OK spec compliance. > > > > > > I do not see patches for kicks before callbacks are ready to run. > > > > > > > > > > Yes. > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > MST > > > > > > > > > > > > > > > > > > > >
Jason Wang
2022-Jul-04 06:40 UTC
[PATCH V3] virtio: disable notification hardening by default
On Mon, Jul 4, 2022 at 2:22 PM Michael S. Tsirkin <mst at redhat.com> wrote:> > On Mon, Jul 04, 2022 at 12:23:27PM +0800, Jason Wang wrote: > > > So if there are not examples of callbacks not ready after kick > > > then let us block callbacks until first kick. That is my idea. > > > > Ok, let me try. I need to drain my queue of fixes first. > > > > Thanks > > If we do find issues, another option is blocking callbacks until the > first add. A bit higher overhead as add is a more common operation > but it has even less of a chance to introduce regressions.So I understand that the case of blocking until first kick but if we block until add it means for drivers: virtqueue_add() virtio_device_ready() virtqueue_kick() We probably enlarge the window in this case. Thanks> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >I couldn't ... except maybe bluetooth > > > > > > > > > > > > but that's just maintainer nacking fixes saying he'll fix it > > > > > > > > > > > > his way ... > > > > > > > > > > > > > > > > > > > > > > > > > And during remove(), we get another window: > > > > > > > > > > > > > > > > > > > > > > > > > > subsysrem_unregistration() > > > > > > > > > > > > > /* the window */ > > > > > > > > > > > > > virtio_device_reset() > > > > > > > > > > > > > > > > > > > > > > > > Same here. > > > > > > > > > > > > > > > > > > > > Basically for the drivers that set driver_ok before registration, > > > > > > > > > > > > > > > > > > I don't see what does driver_ok have to do with it. > > > > > > > > > > > > > > > > I meant for those driver, in probe they do() > > > > > > > > > > > > > > > > virtio_device_ready() > > > > > > > > subsystem_register() > > > > > > > > > > > > > > > > In remove() they do > > > > > > > > > > > > > > > > subsystem_unregister() > > > > > > > > virtio_device_reset() > > > > > > > > > > > > > > > > for symmetry > > > > > > > > > > > > > > Let's leave remove alone for now. I am close to 100% sure we have *lots* > > > > > > > of issues around it, but while probe is unavoidable remove can be > > > > > > > avoided by blocking hotplug. > > > > > > > > > > > > Unbind can trigger this path as well. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > so > > > > > > > > > > we have a lot: > > > > > > > > > > > > > > > > > > > > blk, net, mac80211_hwsim, scsi, vsock, bt, crypto, gpio, gpu, i2c, > > > > > > > > > > iommu, caif, pmem, input, mem > > > > > > > > > > > > > > > > > > > > So I think there's no easy way to harden the notification without > > > > > > > > > > auditing the driver one by one (especially considering the driver may > > > > > > > > > > use bh or workqueue). The problem is the notification hardening > > > > > > > > > > depends on a correct or race-free probe/remove. So we need to fix the > > > > > > > > > > issues in probe/remove then do the hardening on the notification. > > > > > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > So if drivers kick but are not ready to get callbacks then let's fix > > > > > > > > > that first of all, these are racy with existing qemu even ignoring > > > > > > > > > spec compliance. > > > > > > > > > > > > > > > > Yes, (the patches I've posted so far exist even with a well-behaved device). > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > patches you posted deal with DRIVER_OK spec compliance. > > > > > > > I do not see patches for kicks before callbacks are ready to run. > > > > > > > > > > > > Yes. > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > MST > > > > > > > > > > > > > > > > > > > > > > > > >