Mike Christie
2023-May-25 16:15 UTC
[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
On 5/23/23 7:15 AM, Oleg Nesterov wrote:> > Now the main question. Whatever we do, SIGKILL/SIGSTOP/etc can come right > before we call work->fn(). Is it "safe" to run this callback with > signal_pending() or fatal_signal_pending() ?The questions before this one I'll leave for the core vhost devs since they know best. For this question and the one below, when we get SIGKILL we think it's ok to fail the operation as in it's ok to not execute it like normal (send it down to the net/target/scsi/block layers for execution). However, we need to do some processing of the work to release mem, refcounts, etc. For example we use the vhost_task to handle completions from the layers we interact with. So when we get a SIGKILL, we could have N commands in the target/block/scsi/nvme layers below the vhost layer. When those complete, the current code assumes we have the vhost_task to handle the responses. So for pending works on that work_list, we can pretty easily kill them. However, we don't have a way to kill those outstanding requests to some other layer, so we have to wait and handle them somewhere. If you are saying that once we get SIGKILL then we just can't use the task anymore and we have to drop down to workqueue/kthread or change up the completion paths so they can execute in the context they are completed from by lower levels, then I can code this. On the vhost side, it's just really ugly vs the code we have now that used to use kthreads (or in 6.4-rc looks like a process) so we couldn't get signals and just had the one code path that removes us.>From the vhost point of view signals are not useful to us and it's justadding complexity for something that I don't think is useful to users. However after discussing this with guys for a week and going over the signal code, I can see your point of views where you guys are thinking its ugly for the signal code to try and support what we want :)> > > Finally. I never looked into drivers/vhost/ before so I don't understand > this code at all, but let me ask anyway... Can we change vhost_dev_flush() > to run the pending callbacks rather than wait for vhost_worker() ? > I guess we can't, ->mm won't be correct, but can you confirm? > > Oleg. >
Eric W. Biederman
2023-May-28 01:41 UTC
[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
Mike Christie <michael.christie at oracle.com> writes:> On 5/23/23 7:15 AM, Oleg Nesterov wrote: >> >> Now the main question. Whatever we do, SIGKILL/SIGSTOP/etc can come right >> before we call work->fn(). Is it "safe" to run this callback with >> signal_pending() or fatal_signal_pending() ? > > The questions before this one I'll leave for the core vhost devs since > they know best.Let me ask a clarifying question: Is it only the call to schedule() in vhost_worker that you are worried about not sleeping if signal_pending() or fatal_signal_pending()? Is there concern that the worker functions aka "work->fn()" will also have killable or interruptible sleeps that also will misbehave. We can handle schedule() in vhost_worker without problem. If a worker function has interruptible or killable sleeps that will turn into busy waits or worse not sleeping long enough that seems like a problem. There is no way to guarantee that the outer loop of vhost_worker will protect the worker functions from signal_pending() or fatal_signal_pending() becoming true. Eric