thr3ads.net - Virtualization - [PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue [Jan 2023]

Jason Wang

2023-Jan-29 05:48 UTC

[PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue

On Fri, Jan 27, 2023 at 6:35 PM Michael S. Tsirkin <mst at redhat.com> wrote:
>
> On Fri, Dec 30, 2022 at 11:43:08AM +0800, Jason Wang wrote:
> > On Thu, Dec 29, 2022 at 4:10 PM Michael S. Tsirkin <mst at redhat.com> wrote:
> > >
> > > On Thu, Dec 29, 2022 at 04:04:13PM +0800, Jason Wang wrote:
> > > > On Thu, Dec 29, 2022 at 3:07 PM Michael S. Tsirkin <mst at redhat.com> wrote:
> > > > >
> > > > > On Wed, Dec 28, 2022 at 07:53:08PM +0800, Jason Wang wrote:
> > > > > > On Wed, Dec 28, 2022 at 2:34 PM Jason Wang <jasowang at redhat.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > > On 2022/12/27 17:38, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Dec 27, 2022 at 05:12:58PM +0800, Jason Wang wrote:
> > > > > > > >> On 2022/12/27 15:33, Michael S. Tsirkin wrote:
> > > > > > > >>> On Tue, Dec 27, 2022 at 12:30:35PM +0800, Jason Wang wrote:
> > > > > > > >>>>> But device is still going and will later use the buffers.
> > > > > > > >>>>>
> > > > > > > >>>>> Same for timeout really.
> > > > > > > >>>> Avoiding infinite wait/poll is one of the goals, another is to sleep.
> > > > > > > >>>> If we think the timeout is hard, we can start from the wait.
> > > > > > > >>>>
> > > > > > > >>>> Thanks
> > > > > > > >>> If the goal is to avoid disrupting traffic while CVQ is in use,
> > > > > > > >>> that sounds more reasonable. E.g. someone is turning on promisc,
> > > > > > > >>> a spike in CPU usage might be unwelcome.
> > > > > > > >>
> > > > > > > >> Yes, this would be more obvious if UP is used.
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>> things we should be careful to address then:
> > > > > > > >>> 1- debugging. Currently it's easy to see a warning if CPU is stuck
> > > > > > > >>>      in a loop for a while, and we also get a backtrace.
> > > > > > > >>>      E.g. with this - how do we know who has the RTNL?
> > > > > > > >>>      We need to integrate with kernel/watchdog.c for good results
> > > > > > > >>>      and to make sure policy is consistent.
> > > > > > > >>
> > > > > > > >> That's fine, will consider this.
> > > > > >
> > > > > > So after some investigation, it seems the watchdog.c doesn't help. The
> > > > > > only exported helper is touch_softlockup_watchdog() which tries to avoid
> > > > > > triggering the lockup warning for the known slow path.
> > > > >
> > > > > I never said you can just use existing exported APIs. You'll have to
> > > > > write new ones :)
> > > >
> > > > Ok, I thought you wanted to trigger similar warnings as a watchdog.
> > > >
> > > > Btw, I wonder what kind of logic you want here. If we switch to using
> > > > sleep, there won't be soft lockup anymore. A simple wait + timeout +
> > > > warning seems sufficient?
> > > >
> > > > Thanks
> > >
> > > I'd like to avoid need to teach users new APIs. So watchdog setup to apply
> > > to this driver. The warning can be different.
> >
> > Right, so it looks to me the only possible setup is the
> > watchdog_thresh. I plan to trigger the warning every watchdog_thresh * 2
> > seconds (as softlockup did).
> >
> > And I think it would still make sense to fail, we can start with a
> > very long timeout like 1 minute and break the device. Does this make
> > sense?
> >
> > Thanks
>
> I'd say we need to make this manageable then.

Did you mean something like sysfs or module parameters?

> Can't we do it normally
> e.g. react to an interrupt to return to userspace?

I didn't get the meaning of this. Sorry.

Thanks
>
>
>
> > >
> > >
> > > > >
> > > > > > And before the patch, we end up with a real infinite loop which could
> > > > > > be caught by RCU stall detector which is not the case of the sleep.
> > > > > > What we can do is probably do a periodic netdev_err().
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > Only with a bad device.
> > > > >
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>> 2- overhead. In a very common scenario when device is in hypervisor,
> > > > > > > >>>      programming timers etc has a very high overhead, at bootup
> > > > > > > >>>      lots of CVQ commands are run and slowing boot down is not nice.
> > > > > > > >>>      let's poll for a bit before waiting?
> > > > > > > >>
> > > > > > > >> Then we go back to the question of choosing a good timeout for poll. And
> > > > > > > >> poll seems problematic in the case of UP, scheduler might not have the
> > > > > > > >> chance to run.
> > > > > > > > Poll just a bit :) Seriously I don't know, but at least check once
> > > > > > > > after kick.
> > > > > > >
> > > > > > >
> > > > > > > I think it is what the current code did where the condition will be
> > > > > > > checked before trying to sleep in the wait_event().
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >>> 3- surprise removal. need to wake up thread in some way. what about
> > > > > > > >>>      other cases of device breakage - is there a chance this
> > > > > > > >>>      introduces new bugs around that? at least enumerate them please.
> > > > > > > >>
> > > > > > > >> The current code did:
> > > > > > > >>
> > > > > > > >> 1) check for vq->broken
> > > > > > > >> 2) wakeup during BAD_RING()
> > > > > > > >>
> > > > > > > >> So we won't end up with a never woken up process which should be fine.
> > > > > > > >>
> > > > > > > >> Thanks
> > > > > > > >
> > > > > > > > BTW BAD_RING on removal will trigger dev_err. Not sure that is a good
> > > > > > > > idea - can cause crashes if kernel panics on error.
> > > > > > >
> > > > > > >
> > > > > > > Yes, it's better to use __virtqueue_break() instead.
> > > > > > >
> > > > > > > But consider we will start from a wait first, I will limit the changes
> > > > > > > in virtio-net without bothering virtio core.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >>>
> > > > >
> > >
>
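
The plan quoted above (warn periodically while the command is outstanding, then give up after a long timeout and break the device) can be modelled in userspace. The sketch below is only an illustration and is not code from the patch: `wait_warn_then_fail`, `flag_is_set` and `sleep_ms` are hypothetical names, and `nanosleep()` stands in for whatever sleeping primitive the driver would use (presumably something in the `wait_event_timeout()` family). The condition is checked before every sleep, a warning is printed after each full warning interval, and `-ETIMEDOUT` is returned once the overall timeout expires.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Sleep for ms milliseconds, resuming if a signal cuts the sleep short. */
static void sleep_ms(unsigned int ms)
{
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (long)(ms % 1000) * 1000000L,
    };

    while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
        ;
}

/* Example predicate (hypothetical): the "command completed" flag is *ctx. */
static bool flag_is_set(void *ctx)
{
    return *(volatile bool *)ctx;
}

/*
 * Wait until done(ctx) is true, warning every warn_ms milliseconds and
 * failing after timeout_ms milliseconds in total.
 * Returns 0 on completion or -ETIMEDOUT on timeout; *warnings receives
 * the number of warnings that were printed.
 */
static int wait_warn_then_fail(bool (*done)(void *), void *ctx,
                               unsigned int warn_ms, unsigned int timeout_ms,
                               unsigned int *warnings)
{
    unsigned int waited = 0;

    assert(warnings != NULL);
    *warnings = 0;
    if (warn_ms == 0)
        warn_ms = timeout_ms;   /* avoid zero-length sleep slices */

    for (;;) {
        unsigned int slice;

        if (done(ctx))          /* always check before sleeping */
            return 0;
        if (waited >= timeout_ms)
            return -ETIMEDOUT;  /* a driver would mark the device broken here */

        slice = timeout_ms - waited < warn_ms ? timeout_ms - waited : warn_ms;
        sleep_ms(slice);
        waited += slice;

        if (waited < timeout_ms && !done(ctx)) {
            fprintf(stderr, "still waiting after %u ms\n", waited);
            (*warnings)++;
        }
    }
}
```

In the thread's terms, `warn_ms` would correspond to watchdog_thresh * 2 seconds and `timeout_ms` to roughly one minute; both values are the ones floated in the discussion, not settled defaults.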

Michael S. Tsirkin

2023-Jan-29 07:30 UTC

[PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue

On Sun, Jan 29, 2023 at 01:48:49PM +0800, Jason Wang wrote:
> On Fri, Jan 27, 2023 at 6:35 PM Michael S. Tsirkin <mst at redhat.com> wrote:
> >
> > On Fri, Dec 30, 2022 at 11:43:08AM +0800, Jason Wang wrote:
> > > On Thu, Dec 29, 2022 at 4:10 PM Michael S. Tsirkin <mst at redhat.com> wrote:
> > > >
> > > > On Thu, Dec 29, 2022 at 04:04:13PM +0800, Jason Wang wrote:
> > > > > On Thu, Dec 29, 2022 at 3:07 PM Michael S. Tsirkin <mst at redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Dec 28, 2022 at 07:53:08PM +0800, Jason Wang wrote:
> > > > > > > On Wed, Dec 28, 2022 at 2:34 PM Jason Wang <jasowang at redhat.com> wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > On 2022/12/27 17:38, Michael S. Tsirkin wrote:
> > > > > > > > > On Tue, Dec 27, 2022 at 05:12:58PM +0800, Jason Wang wrote:
> > > > > > > > >> On 2022/12/27 15:33, Michael S. Tsirkin wrote:
> > > > > > > > >>> On Tue, Dec 27, 2022 at 12:30:35PM +0800, Jason Wang wrote:
> > > > > > > > >>>>> But device is still going and will later use the buffers.
> > > > > > > > >>>>>
> > > > > > > > >>>>> Same for timeout really.
> > > > > > > > >>>> Avoiding infinite wait/poll is one of the goals, another is to sleep.
> > > > > > > > >>>> If we think the timeout is hard, we can start from the wait.
> > > > > > > > >>>>
> > > > > > > > >>>> Thanks
> > > > > > > > >>> If the goal is to avoid disrupting traffic while CVQ is in use,
> > > > > > > > >>> that sounds more reasonable. E.g. someone is turning on promisc,
> > > > > > > > >>> a spike in CPU usage might be unwelcome.
> > > > > > > > >>
> > > > > > > > >> Yes, this would be more obvious if UP is used.
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>> things we should be careful to address then:
> > > > > > > > >>> 1- debugging. Currently it's easy to see a warning if CPU is stuck
> > > > > > > > >>>      in a loop for a while, and we also get a backtrace.
> > > > > > > > >>>      E.g. with this - how do we know who has the RTNL?
> > > > > > > > >>>      We need to integrate with kernel/watchdog.c for good results
> > > > > > > > >>>      and to make sure policy is consistent.
> > > > > > > > >>
> > > > > > > > >> That's fine, will consider this.
> > > > > > >
> > > > > > > So after some investigation, it seems the watchdog.c doesn't help. The
> > > > > > > only exported helper is touch_softlockup_watchdog() which tries to avoid
> > > > > > > triggering the lockup warning for the known slow path.
> > > > > >
> > > > > > I never said you can just use existing exported APIs. You'll have to
> > > > > > write new ones :)
> > > > >
> > > > > Ok, I thought you wanted to trigger similar warnings as a watchdog.
> > > > >
> > > > > Btw, I wonder what kind of logic you want here. If we switch to using
> > > > > sleep, there won't be soft lockup anymore. A simple wait + timeout +
> > > > > warning seems sufficient?
> > > > >
> > > > > Thanks
> > > >
> > > > I'd like to avoid need to teach users new APIs. So watchdog setup to apply
> > > > to this driver. The warning can be different.
> > >
> > > Right, so it looks to me the only possible setup is the
> > > watchdog_thresh. I plan to trigger the warning every watchdog_thresh * 2
> > > seconds (as softlockup did).
> > >
> > > And I think it would still make sense to fail, we can start with a
> > > very long timeout like 1 minute and break the device. Does this make
> > > sense?
> > >
> > > Thanks
> >
> > I'd say we need to make this manageable then.
>
> Did you mean something like sysfs or module parameters?

No, I'd say pass it with an ioctl.

> > Can't we do it normally
> > e.g. react to an interrupt to return to userspace?
>
> I didn't get the meaning of this. Sorry.
>
> Thanks

The standard way to handle things that can time out, where userspace
did not supply the timeout, is to block until interrupted by a signal
and then return EINTR. Userspace controls the timeout by
using e.g. alarm(2).
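
As an illustration of the convention described here (the call blocks, a signal interrupts it, the call fails with EINTR, and userspace picks the timeout with alarm(2)), a minimal userspace sketch follows. It is not taken from the patch: `read_with_alarm` and `on_alarm` are hypothetical names, and a blocking `read()` on a pipe stands in for whatever system call would end up waiting on the control virtqueue. The handler is installed without `SA_RESTART`, since with that flag the kernel would restart the `read()` instead of returning EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Empty handler: its only job is to interrupt the blocking call. */
static void on_alarm(int signo)
{
    (void)signo;
}

/*
 * Block in read() on fd for at most timeout_sec seconds.
 * Returns the read() result; if the alarm fires first, read() is
 * interrupted and this returns -1 with errno set to EINTR.
 */
static ssize_t read_with_alarm(int fd, void *buf, size_t len,
                               unsigned int timeout_sec)
{
    struct sigaction sa, old;
    ssize_t n;
    int saved_errno;

    assert(timeout_sec > 0);    /* alarm(0) would cancel instead of arming */

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: read() must fail with EINTR */
    if (sigaction(SIGALRM, &sa, &old) < 0)
        return -1;

    alarm(timeout_sec);         /* userspace chooses the timeout */
    n = read(fd, buf, len);
    saved_errno = errno;

    alarm(0);                   /* disarm and restore the previous handler */
    sigaction(SIGALRM, &old, NULL);
    errno = saved_errno;
    return n;
}
```

Calling `read_with_alarm()` on the read end of an empty pipe with a one-second timeout returns -1 with `errno == EINTR`; once a byte has been written to the pipe, the same call returns 1. A production version would also have to consider the window between `alarm()` and `read()`, which this sketch ignores.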

> >
> >
> >
> > > >
> > > >
> > > > > >
> > > > > > > And before the patch, we end up with a real infinite loop which could
> > > > > > > be caught by RCU stall detector which is not the case of the sleep.
> > > > > > > What we can do is probably do a periodic netdev_err().
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > Only with a bad device.
> > > > > >
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>> 2- overhead. In a very common scenario when device is in hypervisor,
> > > > > > > > >>>      programming timers etc has a very high overhead, at bootup
> > > > > > > > >>>      lots of CVQ commands are run and slowing boot down is not nice.
> > > > > > > > >>>      let's poll for a bit before waiting?
> > > > > > > > >>
> > > > > > > > >> Then we go back to the question of choosing a good timeout for poll. And
> > > > > > > > >> poll seems problematic in the case of UP, scheduler might not have the
> > > > > > > > >> chance to run.
> > > > > > > > > Poll just a bit :) Seriously I don't know, but at least check once
> > > > > > > > > after kick.
> > > > > > > >
> > > > > > > >
> > > > > > > > I think it is what the current code did where the condition will be
> > > > > > > > checked before trying to sleep in the wait_event().
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >>> 3- surprise removal. need to wake up thread in some way. what about
> > > > > > > > >>>      other cases of device breakage - is there a chance this
> > > > > > > > >>>      introduces new bugs around that? at least enumerate them please.
> > > > > > > > >>
> > > > > > > > >> The current code did:
> > > > > > > > >>
> > > > > > > > >> 1) check for vq->broken
> > > > > > > > >> 2) wakeup during BAD_RING()
> > > > > > > > >>
> > > > > > > > >> So we won't end up with a never woken up process which should be fine.
> > > > > > > > >>
> > > > > > > > >> Thanks
> > > > > > > > >
> > > > > > > > > BTW BAD_RING on removal will trigger dev_err. Not sure that is a good
> > > > > > > > > idea - can cause crashes if kernel panics on error.
> > > > > > > >
> > > > > > > >
> > > > > > > > Yes, it's better to use __virtqueue_break() instead.
> > > > > > > >
> > > > > > > > But consider we will start from a wait first, I will limit the changes
> > > > > > > > in virtio-net without bothering virtio core.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >>>
> > > > > >
> > > >
> >
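
Point 2 in the quoted discussion ("poll for a bit before waiting", or at least "check once after kick") amounts to a two-phase wait: a short, bounded busy poll that catches fast completions without paying for a sleep, followed by a sleeping wait. The userspace sketch below shows that shape only. `poll_then_sleep`, `struct fake_device` and `fake_device_done` are hypothetical names invented for the illustration, and `nanosleep()` stands in for a real sleeping wait on a waitqueue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

enum completion_path { DONE_WHILE_POLLING, DONE_AFTER_SLEEP, NOT_DONE };

/* Hypothetical stand-in for "has the device used the buffer yet?". */
struct fake_device {
    unsigned int checks_until_done; /* failed checks left before completion */
};

static bool fake_device_done(void *ctx)
{
    struct fake_device *dev = ctx;

    if (dev->checks_until_done == 0)
        return true;
    dev->checks_until_done--;
    return false;
}

/*
 * Phase 1: check the condition up to max_polls times without sleeping.
 * Phase 2: sleep sleep_ns nanoseconds between checks, up to max_sleeps times.
 */
static enum completion_path poll_then_sleep(bool (*done)(void *), void *ctx,
                                            unsigned int max_polls,
                                            unsigned int max_sleeps,
                                            long sleep_ns)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = sleep_ns };
    unsigned int i;

    assert(sleep_ns >= 0 && sleep_ns < 1000000000L);

    for (i = 0; i < max_polls; i++)     /* bounded busy poll */
        if (done(ctx))
            return DONE_WHILE_POLLING;

    for (i = 0; i < max_sleeps; i++) {  /* fall back to sleeping */
        nanosleep(&ts, NULL);
        if (done(ctx))
            return DONE_AFTER_SLEEP;
    }
    return NOT_DONE;
}
```

With `max_polls` set to 1 this degenerates to the "check once after kick" behaviour that, per the reply in the thread, a `wait_event()` style wait already provides by testing the condition before sleeping. How large the polling budget should be is exactly the open question raised in the discussion, and the sketch does not answer it.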
