thr3ads.net - Linux Virtualization - [CFT][PATCH v3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression [Jun 2023]

If this information is useful, please help other people find it:
Share via:

Oleg Nesterov

2023-Jun-06 12:16 UTC

[CFT][PATCH v3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

On 06/05, Mike Christie wrote:>
> On 6/5/23 10:10 AM, Oleg Nesterov wrote:
> > On 06/03, michael.christie at oracle.com wrote:
> >>
> >> On 6/2/23 11:15 PM, Eric W. Biederman wrote:
> >> The problem is that as part of the flush the drivers/vhost/scsi.c
code
> >> will wait for outstanding commands, because we can't free the
device and
> >> it's resources before the commands complete or we will hit the
accessing
> >> freed memory bug.
> >
> > ignoring send-fd/clone issues, can we assume that the final
fput/release
> > should always come from vhost_worker's sub-thread (which shares
mm/etc) ?
>
> I think I'm misunderstanding the sub-thread term.
>
> - Is it the task_struct's context that we did the
> kernel/vhost_taskc.c:vhost_task_create() from? Below it would be the
> thread we did VHOST_SET_OWNER from.
Yes,
> So it works like if we were using a kthread still:
>
> 1. Userapce thread0 opens /dev/vhost-$something.
> 2. thread0 does VHOST_SET_OWNER ioctl. This calls vhost_task_create() to
> create the task_struct which runs the vhost_worker() function which handles
> the work->fns.
> 3. If userspace now does a SIGKILL or just exits without doing a close() on
> /dev/vhost-$something, then when thread0 does exit_files() that will do the
> fput that does vhost-$something's file_operations->release.
So, at least in this simple case vhost_worker() can just exit after SIGKILL,
and thread0 can flush the outstanding commands when it calls vhost_dev_flush()
rather than wait for vhost_worker().

Right?

not that I think this can help in the general case ...

Oleg.

Mike Christie

2023-Jun-06 15:57 UTC

head link

[CFT][PATCH v3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

On 6/6/23 7:16 AM, Oleg Nesterov wrote:> On 06/05, Mike Christie wrote:
>>
>> On 6/5/23 10:10 AM, Oleg Nesterov wrote:
>>> On 06/03, michael.christie at oracle.com wrote:
>>>>
>>>> On 6/2/23 11:15 PM, Eric W. Biederman wrote:
>>>> The problem is that as part of the flush the
drivers/vhost/scsi.c code
>>>> will wait for outstanding commands, because we can't free
the device and
>>>> it's resources before the commands complete or we will hit
the accessing
>>>> freed memory bug.
>>>
>>> ignoring send-fd/clone issues, can we assume that the final
fput/release
>>> should always come from vhost_worker's sub-thread (which shares
mm/etc) ?
>>
>> I think I'm misunderstanding the sub-thread term.
>>
>> - Is it the task_struct's context that we did the
>> kernel/vhost_taskc.c:vhost_task_create() from? Below it would be the
>> thread we did VHOST_SET_OWNER from.
> 
> Yes,
> 
>> So it works like if we were using a kthread still:
>>
>> 1. Userapce thread0 opens /dev/vhost-$something.
>> 2. thread0 does VHOST_SET_OWNER ioctl. This calls vhost_task_create()
to
>> create the task_struct which runs the vhost_worker() function which
handles
>> the work->fns.
>> 3. If userspace now does a SIGKILL or just exits without doing a
close() on
>> /dev/vhost-$something, then when thread0 does exit_files() that will do
the
>> fput that does vhost-$something's file_operations->release.
> 
> So, at least in this simple case vhost_worker() can just exit after
SIGKILL,
> and thread0 can flush the outstanding commands when it calls
vhost_dev_flush()
> rather than wait for vhost_worker().
> 
> Right?
With the current code, the answer is no. We would hang like I mentioned here:

https://lore.kernel.org/lkml/ae250076-7d55-c407-1066-86b37014c69c at oracle.com/

We need to add code like I mentioned in that reply because we don't have a
way to call into the layers below us to flush those commands. We need more
like an abort and don't call back into us type of operation. Or, I'm
just trying
to add a check where we detect what happened then instead of trying to use
the vhost_task we try to complete in the context the lower level completes us
in.

Linux Virtualization - Jun 2023 - [CFT][PATCH v3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

[CFT][PATCH v3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

[CFT][PATCH v3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression