thr3ads.net - Virtualization - [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression [May 2023]

If this information is useful, please help other people find it:
Share via:

Mike Christie

2023-May-25 16:15 UTC

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

On 5/23/23 7:15 AM, Oleg Nesterov wrote:> 
> Now the main question. Whatever we do, SIGKILL/SIGSTOP/etc can come right
> before we call work->fn(). Is it "safe" to run this callback
with
> signal_pending() or fatal_signal_pending() ?
The questions before this one I'll leave for the core vhost devs since
they know best.

For this question and the one below, when we get SIGKILL we think it's ok
to fail the operation as in it's ok to not execute it like normal (send
it down to the net/target/scsi/block layers for execution). However, we
need to do some processing of the work to release mem, refcounts, etc.

For example we use the vhost_task to handle completions from the layers
we interact with. So when we get a SIGKILL, we could have N commands in
the target/block/scsi/nvme layers below the vhost layer. When those
complete, the current code assumes we have the vhost_task to handle the
responses. So for pending works on that work_list, we can pretty easily
kill them. However, we don't have a way to kill those outstanding
requests to some other layer, so we have to wait and handle them
somewhere.

If you are saying that once we get SIGKILL then we just can't use the
task anymore and we have to drop down to workqueue/kthread or change up
the completion paths so they can execute in the context they are completed
from by lower levels, then I can code this. On the vhost side, it's just
really ugly vs the code we have now that used to use kthreads (or in
6.4-rc looks like a process) so we couldn't get signals and just had the
one code path that removes us.
>From the vhost point of view signals are not useful to us and it's justadding complexity for something that I don't think is useful to users.
However after discussing this with guys for a week and going over the
signal code, I can see your point of views where you guys are thinking its
ugly for the signal code to try and support what we want :)

> 
> 
> Finally. I never looked into drivers/vhost/ before so I don't
understand
> this code at all, but let me ask anyway... Can we change vhost_dev_flush()
> to run the pending callbacks rather than wait for vhost_worker() ?
> I guess we can't, ->mm won't be correct, but can you confirm?
> 
> Oleg.
>

Eric W. Biederman

2023-May-28 01:41 UTC

head link

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

Mike Christie <michael.christie at oracle.com> writes:
> On 5/23/23 7:15 AM, Oleg Nesterov wrote:
>> 
>> Now the main question. Whatever we do, SIGKILL/SIGSTOP/etc can come
right
>> before we call work->fn(). Is it "safe" to run this
callback with
>> signal_pending() or fatal_signal_pending() ?
>
> The questions before this one I'll leave for the core vhost devs since
> they know best.
Let me ask a clarifying question:

Is it only the call to schedule() in vhost_worker that you are worried
about not sleeping if signal_pending() or fatal_signal_pending()?

Is there concern that the worker functions aka "work->fn()" will
also
have killable or interruptible sleeps that also will misbehave.

We can handle schedule() in vhost_worker without problem.

If a worker function has interruptible or killable sleeps that will turn
into busy waits or worse not sleeping long enough that seems like a
problem.  There is no way to guarantee that the outer loop of
vhost_worker will protect the worker functions from signal_pending()
or fatal_signal_pending() becoming true.

Eric

Virtualization - May 2023 - [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression