On 2/18/22 11:53 AM, Mike Christie wrote:> On 2/17/22 3:48 AM, Stefano Garzarella wrote:
>>
>> On Thu, Feb 17, 2022 at 8:50 AM Michael S. Tsirkin <mst at
redhat.com> wrote:
>>>
>>> On Thu, Feb 17, 2022 at 03:39:48PM +0800, Jason Wang wrote:
>>>> On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <mst at
redhat.com> wrote:
>>>>>
>>>>> On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote:
>>>>>> On Thu, Feb 17, 2022 at 10:01 AM syzbot
>>>>>> <syzbot+1e3ea63db39f2b4440e0 at
syzkaller.appspotmail.com> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> syzbot found the following issue on:
>>>>>>>
>>>>>>> HEAD commit: c5d9ae265b10 Merge tag
'for-linus' of git://git.kernel.org..
>>>>>>> git tree: upstream
>>>>>>> console output:
https://urldefense.com/v3/__https://syzkaller.appspot.com/x/log.txt?x=132e687c700000__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9YisMihn$
>>>>>>> kernel config:
https://urldefense.com/v3/__https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9RjOhplp$
>>>>>>> dashboard link:
https://urldefense.com/v3/__https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9bBf5tv0$
>>>>>>> compiler: gcc (Debian 10.2.1-6) 10.2.1
20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>>>>>
>>>>>>> Unfortunately, I don't have any reproducer for
this issue yet.
>>>>>>>
>>>>>>> IMPORTANT: if you fix the issue, please add the
following tag to the commit:
>>>>>>> Reported-by: syzbot+1e3ea63db39f2b4440e0 at
syzkaller.appspotmail.com
>>>>>>>
>>>>>>> WARNING: CPU: 1 PID: 10828 at
drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0
drivers/vhost/vhost.c:715
>>>>>>> Modules linked in:
>>>>>>> CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted
5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
>>>>>>> Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
>>>>>>> RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0
drivers/vhost/vhost.c:715
>>>>>>
>>>>>> Probably a hint that we are missing a flush.
>>>>>>
>>>>>> Looking at vhost_vsock_stop() that is called by
vhost_vsock_dev_release():
>>>>>>
>>>>>> static int vhost_vsock_stop(struct vhost_vsock *vsock)
>>>>>> {
>>>>>> size_t i;
>>>>>> int ret;
>>>>>>
>>>>>> mutex_lock(&vsock->dev.mutex);
>>>>>>
>>>>>> ret =
vhost_dev_check_owner(&vsock->dev);
>>>>>> if (ret)
>>>>>> goto err;
>>>>>>
>>>>>> Where it could fail so the device is not actually
stopped.
>>>>>>
>>>>>> I wonder if this is something related.
>>>>>>
>>>>>> Thanks
>>>>>
>>>>>
>>>>> But then if that is not the owner then no work should be
running, right?
>>>>
>>>> Could it be a buggy user space that passes the fd to another
process
>>>> and changes the owner just before the mutex_lock() above?
>>>>
>>>> Thanks
>>>
>>> Maybe, but can you be a bit more explicit? what is the set of
>>> conditions you see that can lead to this?
>>
>> I think the issue could be in the vhost_vsock_stop() as Jason
mentioned,
>> but not related to fd passing, but related to the do_exit() function.
>>
>> Looking the stack trace, we are in exit_task_work(), that is called
>> after exit_mm(), so the vhost_dev_check_owner() can fail because
>> current->mm should be NULL at that point.
>>
>> It seems the fput work is queued by fput_many() in a worker queue, and
>> in some cases (maybe a lot of files opened?) the work is still queued
>> when we enter in do_exit().
> It normally happens if userspace doesn't do a close() when the VM
Just one clarification. I meant to say it "always" happens when
userspace
doesn't do a close.
It doesn't have anything to do with lots of files or something like that.
We are actually running the vhost device's release function from
do_exit->task_work_run and so all those __fputs are done from something
like qemu's context (current == that process).
We are *not* hitting the case:
do_exit->exit_files->put_files_struct->filp_close->fput->fput_many
and then in there hitting the schedule_delayed_work path. For that
the last __fput would be done from a workqueue thread and so the current
pointer would point to a completely different thread.
> is shutdown and instead let's the kernel's reaper code cleanup. The
qemu
> vhost-scsi code doesn't do a close() during shutdown and so this is our
> normal code path. It also happens when something like qemu is not
> gracefully shutdown like during a crash.
>
> So fire up qemu, start IO, then crash it or kill 9 it while IO is still
> running and you can hit it.
>
>>
>> That said, I don't know if we can simply remove that check in
>> vhost_vsock_stop(), or check if current->mm is NULL, to understand
if
>> the process is exiting.
>>
>
> Should the caller do the vhost_dev_check_owner or tell vhost_vsock_stop
> when to check?
>
> - vhost_vsock_dev_ioctl always wants to check for ownership right?
>
> - For vhost_vsock_dev_release ownership doesn't matter because we
> always want to clean up or it doesn't hurt too much.
>
> For the case where we just do open then close and no ioctls then
> running vhost_vq_set_backend in vhost_vsock_stop is just a minor
> hit of extra work. If we've done ioctls, but are now in
> vhost_vsock_dev_release then we know for the graceful and ungraceful
> case that nothing is going to be accessing this device in the future
> and it's getting completely freed so we must completely clean it up.
>
>
>
>
>
> _______________________________________________
> Virtualization mailing list
> Virtualization at lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization