On Tue, Oct 12, 2021 at 2:35 PM Michael S. Tsirkin <mst at redhat.com>
wrote:>
> On Tue, Oct 12, 2021 at 02:11:10PM +0800, Jason Wang wrote:
> > On Tue, Oct 12, 2021 at 1:44 PM Michael S. Tsirkin <mst at
redhat.com> wrote:
> > >
> > > On Tue, Oct 12, 2021 at 10:43:57AM +0800, Jason Wang wrote:
> > > > On Mon, Oct 11, 2021 at 8:36 PM Michael S. Tsirkin <mst
at redhat.com> wrote:
> > > > >
> > > > > On Mon, Oct 11, 2021 at 03:36:51PM +0800, Jason Wang
wrote:
> > > > > > On Tue, Oct 5, 2021 at 3:42 PM Michael S. Tsirkin
<mst at redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Sep 13, 2021 at 01:53:44PM +0800,
Jason Wang wrote:
> > > > > > > > Hi All:
> > > > > > > >
> > > > > > > > This series treis to do more hardening
for virito.
> > > > > > > >
> > > > > > > > patch 1 validates the num_queues for
virio-blk device.
> > > > > > > > patch 2-4 validates max_nr_ports for
virito-console device.
> > > > > > > > patch 5-7 harden virtio-pci interrupts
to make sure no exepcted
> > > > > > > > interrupt handler is tiggered. If this
makes sense we can do similar
> > > > > > > > things in other transport drivers.
> > > > > > > > patch 8-9 validate used ring length.
> > > > > > > >
> > > > > > > > Smoking test on blk/net with
packed=on/off and iommu_platform=on/off.
> > > > > > > >
> > > > > > > > Please review.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > > So I poked at console at least, and I think I
see
> > > > > > > an issue: if interrupt handler queues a
work/bh,
> > > > > > > then it can still run while reset is in
progress.
> > > > > >
> > > > > > Looks like a bug which is unrelated to the
hardening?
> > > > >
> > > > > Won't preventing use after free be relevant?
> > > >
> > > > Oh right.
> > > >
> > > > > I frankly don't know what does hardening means
then.
> > > > > > E.g the driver
> > > > > > should sync with work/bh before reset.
> > > > >
> > > > > No, there's no way to fix it ATM without extra
locks and state which I
> > > > > think we should strive to avoid or make it generic, not
per-driver,
> > > > > since sync before reset is useless, new interrupts will
just arrive and
> > > > > queue more work. And a sync after reset is too late
since driver will
> > > > > try to add buffers.
> > > >
> > > > Can we do something like
> > > >
> > > > 1) disable interrupt
> > > > 2) sync bh
> > > >
> > > > Or I guess this is somehow you meant in the following steps.
> > >
> > > So that would mean a new API to disable vq interrupts.
> > > reset will re-enable.
> > > E.g. virtqueue_cancel_cb_before_reset()?
> > >
> > > Then drivers can sync, then reset.
> > > This means maintaining more state though, which I don't like.
> > >
> > > An alternative is something like this:
> > >
> > > static void (*virtio_flush_device)(struct virtio_device *dev);
> > >
> > > void virtio_reset_device(struct virtio_device *dev,
virtio_flush_device flush)
> > > {
> > > might_sleep();
> > > if (flush) {
> > > dev->config->disable_interrupts(dev);
> > > flush(dev);
> > > dev->config->reset(dev);
> > > dev->config->enable_interrupts(dev);
> >
> > I wonder whether this is needed. As done in this series,
> > enable_interrupt should be done in virtio_device_ready().
> >
> > Others should work.
> >
> > > } else {
> > > dev->config->reset(dev);
> > > }
> > > }
> > >
> > > I have patches wrapping all reset calls in virtio_reset_device
> > > (without the flush parameter but that's easy to tweak).
> >
> > Does it work if I post V2 and you post those patches on top?
>
> The reset things? Sure.
Ok.
>
> > >
> > >
> > > > >
> > > > > Maybe we can break device. Two issues with that
> > > > > - drivers might not be ready to handle add_buf failures
> > > > > - restore needs to unbreak then and we don't have a
way to do that yet
> > > > >
> > > > > So .. careful reading of all device drivers and hoping
we don't mess
> > > > > things up even more ... here we come.
> > > >
> > > > Yes.
> > >
> > > The biggest issue with this trick is drivers not handling add_buf
> > > errors, adding a failure path here risks creating memory leaks.
> > > OTOH with e.g. bounce buffers maybe it's possible for add buf
to
> > > fail anyway?
> >
> > I'm not sure I get this, a simple git grep told me at least the
return
> > value of add_inbuf() were all checked.
> >
> > Thanks
>
> Checked locally, but not always error is handled all the way
> to the top. E.g. add_port in console returns an error code
> but that is never checked. Well, console is a mess generally.
I see. I can try to audit all virtio drivers for the add_inbuf() case.
Thanks
>
> > >
> > > > >
> > > > > > >
> > > > > > > I sent a patch to fix it for console removal
specifically,
> > > > > > > but I suspect it's not enough e.g. freeze
is still broken.
> > > > > > > And note this has been reported without any
TDX things -
> > > > > > > it's not a malicious device issue, can be
triggered just
> > > > > > > by module unload.
> > > > > > >
> > > > > > > I am vaguely thinking about new APIs to
disable/enable callbacks.
> > > > > > > An alternative:
> > > > > > >
> > > > > > > 1. adding new remove_nocb/freeze_nocb calls
> > > > > > > 2. disabling/enabling interrupts
automatically around these
> > > > > > > 3. gradually moving devices to using these
> > > > > > > 4. once/if all device move, removing the old
callbacks
> > > > > > >
> > > > > > > the advantage here is that we'll be sure
calls are always
> > > > > > > paired correctly.
> > > > > >
> > > > > > I'm not sure I get the idea, but my feeling is
that it doesn't
> > > > > > conflict with the interrupt hardening here (or at
least the same
> > > > > > method is required e.g NO_AUTO_EN).
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > Right. It's not that it conflicts, it's that I
was hoping that
> > > > > since you are working on hardening you can take up
fixing that.
> > > > > Let me know whether you have the time. Thanks!
> > > >
> > > > I can do that.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Jason Wang (9):
> > > > > > > > virtio-blk: validate num_queues during
probe
> > > > > > > > virtio: add doc for validate() method
> > > > > > > > virtio-console: switch to use
.validate()
> > > > > > > > virtio_console: validate max_nr_ports
before trying to use it
> > > > > > > > virtio_config: introduce a new ready
method
> > > > > > > > virtio_pci: harden MSI-X interrupts
> > > > > > > > virtio-pci: harden INTX interrupts
> > > > > > > > virtio_ring: fix typos in
vring_desc_extra
> > > > > > > > virtio_ring: validate used buffer
length
> > > > > > > >
> > > > > > > > drivers/block/virtio_blk.c | 3
+-
> > > > > > > > drivers/char/virtio_console.c | 51
+++++++++++++++++++++---------
> > > > > > > > drivers/virtio/virtio_pci_common.c | 43
+++++++++++++++++++++----
> > > > > > > > drivers/virtio/virtio_pci_common.h | 7
++--
> > > > > > > > drivers/virtio/virtio_pci_legacy.c | 5
+--
> > > > > > > > drivers/virtio/virtio_pci_modern.c | 6
++--
> > > > > > > > drivers/virtio/virtio_ring.c | 27
++++++++++++++--
> > > > > > > > include/linux/virtio.h | 1
+
> > > > > > > > include/linux/virtio_config.h | 6
++++
> > > > > > > > 9 files changed, 118 insertions(+), 31
deletions(-)
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.25.1
> > > > > > >
> > > > >
> > >
>