thr3ads.net - Virtualization - [syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work

If this information is useful, please help other people find it:
Share via:

Stefano Garzarella

2023-May-30 16:17 UTC

[syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue

On Tue, May 30, 2023 at 11:09:09AM -0500, Mike Christie
wrote:>On 5/30/23 11:00 AM, Stefano Garzarella wrote:
>> I think it is partially related to commit 6e890c5d5021 ("vhost:
use
>> vhost_tasks for worker threads") and commit 1a5f8090c6de
("vhost: move
>> worker thread fields to new struct"). Maybe that commits just
>> highlighted the issue and it was already existing.
>
>See my mail about the crash. Agree with your analysis about worker->vtsk
>not being set yet. It's a bug from my commit where I should have not set
>it so early or I should be checking for
>
>if (dev->worker && worker->vtsk)
>
>instead of
>
>if (dev->worker)
Yes, though, in my opinion the problem may persist depending on how the
instructions are reordered.

Should we protect dev->worker() with an RCU to be safe?
>
>One question about the behavior before my commit though and what we want in
>the end going forward. Before that patch we would just drop work if
>vhost_work_queue was called before VHOST_SET_OWNER. Was that
correct/expected?
I think so, since we ask the guest to call VHOST_SET_OWNER, before any
other command.
>
>The call to vhost_work_queue in vhost_vsock_start was only seeing the
>works queued after VHOST_SET_OWNER. Did you want works queued before that?
>
Yes, for example if an application in the host has tried to connect and
is waiting for a timeout, we already have work queued up to flush as
soon as we start the device. (See commit 0b841030625c ("vhost: vsock:
kick send_pkt worker once device is started")).

Thanks,
Stefano

michael.christie at oracle.com

2023-May-30 16:30 UTC

head link

[syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue

On 5/30/23 11:17 AM, Stefano Garzarella wrote:> On Tue, May 30, 2023 at 11:09:09AM -0500, Mike Christie wrote:
>> On 5/30/23 11:00 AM, Stefano Garzarella wrote:
>>> I think it is partially related to commit 6e890c5d5021
("vhost: use
>>> vhost_tasks for worker threads") and commit 1a5f8090c6de
("vhost: move
>>> worker thread fields to new struct"). Maybe that commits just
>>> highlighted the issue and it was already existing.
>>
>> See my mail about the crash. Agree with your analysis about
worker->vtsk
>> not being set yet. It's a bug from my commit where I should have
not set
>> it so early or I should be checking for
>>
>> if (dev->worker && worker->vtsk)
>>
>> instead of
>>
>> if (dev->worker)
> 
> Yes, though, in my opinion the problem may persist depending on how the
> instructions are reordered.
Ah ok.
> 
> Should we protect dev->worker() with an RCU to be safe?
For those multiple worker patchsets Jason had asked me about supporting
where we don't have a worker while we are swapping workers around. To do
that I had added rcu around the dev->worker. I removed it in later patchsets
because I didn't think anyone would use it.

rcu would work for your case and for what Jason had requested.

Virtualization - May 2023 - [syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue

[syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue

[syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue