thr3ads.net - Virtualization - [PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue [Dec 2022]

Michael S. Tsirkin

2022-Dec-27 07:33 UTC

[PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue

On Tue, Dec 27, 2022 at 12:30:35PM +0800, Jason Wang wrote:
> > But device is still going and will later use the buffers.
> >
> > Same for timeout really.
> 
> Avoiding infinite wait/poll is one of the goals, another is to sleep.
> If we think the timeout is hard, we can start from the wait.
> 
> Thanks
If the goal is to avoid disrupting traffic while CVQ is in use,
that sounds more reasonable. E.g. someone is turning on promisc,
a spike in CPU usage might be unwelcome.

things we should be careful to address then:
1- debugging. Currently it's easy to see a warning if CPU is stuck
   in a loop for a while, and we also get a backtrace.
   E.g. with this - how do we know who has the RTNL?
   We need to integrate with kernel/watchdog.c for good results
   and to make sure policy is consistent.
2- overhead. In a very common scenario when device is in hypervisor,
   programming timers etc has a very high overhead, at bootup
   lots of CVQ commands are run and slowing boot down is not nice.
   let's poll for a bit before waiting?
3- surprise removal. need to wake up thread in some way. what about
   other cases of device breakage - is there a chance this
   introduces new bugs around that? at least enumerate them please.

-- 
MST

Jason Wang

2022-Dec-27 09:12 UTC

[PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue

On 2022/12/27 15:33, Michael S. Tsirkin wrote:
> On Tue, Dec 27, 2022 at 12:30:35PM +0800, Jason Wang wrote:
>>> But device is still going and will later use the buffers.
>>>
>>> Same for timeout really.
>> Avoiding infinite wait/poll is one of the goals, another is to sleep.
>> If we think the timeout is hard, we can start from the wait.
>>
>> Thanks
> If the goal is to avoid disrupting traffic while CVQ is in use,
> that sounds more reasonable. E.g. someone is turning on promisc,
> a spike in CPU usage might be unwelcome.

Yes, this would be more obvious if UP is used.

>
> things we should be careful to address then:
> 1- debugging. Currently it's easy to see a warning if CPU is stuck
>     in a loop for a while, and we also get a backtrace.
>     E.g. with this - how do we know who has the RTNL?
>     We need to integrate with kernel/watchdog.c for good results
>     and to make sure policy is consistent.

That's fine, will consider this.

> 2- overhead. In a very common scenario when device is in hypervisor,
>     programming timers etc has a very high overhead, at bootup
>     lots of CVQ commands are run and slowing boot down is not nice.
>     let's poll for a bit before waiting?

Then we go back to the question of choosing a good timeout for the poll.
And polling seems problematic in the case of UP: the scheduler might not
get a chance to run.

> 3- surprise removal. need to wake up thread in some way. what about
>     other cases of device breakage - is there a chance this
>     introduces new bugs around that? at least enumerate them please.

The current code does:

1) check for vq->broken
2) wakeup during BAD_RING()

So we won't end up with a process that is never woken up, which should be fine.
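
As a rough userspace analogue of those two points (not the patch code),
the sketch below makes the wait condition cover both "buffer used" and
"queue broken", and has the break path wake the waiter. struct demo_vq and
the demo_vq_*() helpers are hypothetical names; the -EIO return value is
also an assumption made for the example.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a virtqueue with its own waitqueue. */
struct demo_vq {
	pthread_mutex_t lock;
	pthread_cond_t wq;
	bool used;	/* a buffer has been used by the device */
	bool broken;	/* the queue was marked broken */
};

static void demo_vq_init(struct demo_vq *vq)
{
	pthread_mutex_init(&vq->lock, NULL);
	pthread_cond_init(&vq->wq, NULL);
	vq->used = false;
	vq->broken = false;
}

/* Normal completion: record it and wake the waiter. */
static void demo_vq_mark_used(struct demo_vq *vq)
{
	pthread_mutex_lock(&vq->lock);
	vq->used = true;
	pthread_cond_broadcast(&vq->wq);
	pthread_mutex_unlock(&vq->lock);
}

/* Breakage path (point 2): set the flag AND wake the waiter. */
static void demo_vq_mark_broken(struct demo_vq *vq)
{
	pthread_mutex_lock(&vq->lock);
	vq->broken = true;
	pthread_cond_broadcast(&vq->wq);
	pthread_mutex_unlock(&vq->lock);
}

/* Wait side (point 1): the broken flag is part of the wait condition. */
static int demo_vq_wait_for_used(struct demo_vq *vq)
{
	int ret;

	pthread_mutex_lock(&vq->lock);
	while (!vq->used && !vq->broken)
		pthread_cond_wait(&vq->wq, &vq->lock);
	ret = vq->broken ? -EIO : 0;
	pthread_mutex_unlock(&vq->lock);
	return ret;
}
```

Without the wakeup in demo_vq_mark_broken(), a waiter that went to sleep
before the flag was set would never re-evaluate the condition.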

Thanks

