thr3ads.net - Virtualization - [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression [May 2023]

If this information is useful, please help other people find it:
Share via:

Oleg Nesterov

2023-May-29 17:46 UTC

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

Mike, sorry, I don't understand your email.

Just in case, let me remind I know nothing about drivers/vhost/

On 05/29, michael.christie at oracle.com wrote:>
> On 5/29/23 6:19 AM, Oleg Nesterov wrote:
> > On 05/27, Eric W. Biederman wrote:
> >>
> >> Looking forward I don't see not asking the worker threads to
stop
> >> for the coredump right now causing any problems in the future.
> >> So I think we can use this to resolve the coredump issue I
spotted.
> >
> > But we have almost the same problem with exec.
> >
> > Execing thread will wait for vhost_worker() while vhost_worker will
wait for
> > .release -> vhost_task_stop().
>
> For this type of case, what is the goal or correct behavior in the end?
>
> When get_signal returns true we can code things like you mention below
and you have mentioned in the next email that you have already coded something
like this, so perhaps we can delay the further discussions until you send the
new code?
> and
> clean up the task_struct.
Hmm... If we you CLONE_THREAD the exiting vhost_worker() will auto-reap itself,
> However, we now have a non-functioning vhost device
> open and just sitting around taking up memory and it can't do any IO.
can't comment, see above.
> For this type of case, do we expect just not to crash/hang, or was this new
> exec'd thread suppose to be able to use the vhost device?
I just tried to point out that (unless I missed something) there are more corner
cases, not just coredump.
> > Or suppose that vhost_worker's sub-thread forks a child with
CLONE_FILES...
>
> You mean the vhost_task's task/thread doing a function that does a
copy_process
> right?
I meant that the vhost_task's sub-thread can do sys_clone(CLONE_FILES) from
userspace. Yes, this implies copy_process() but I still can't understand
you.
> That type of thing is not needed.
Do you mean that userspace should never do this? But this doesn't matter,
the
kernel should handle this case anyway.

Or what?

In short let me repeat that I don't understand you and - of course! - quite
possibly I missed something.

Oleg.

Oleg Nesterov

2023-May-29 17:54 UTC

head link

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

On 05/29, Oleg Nesterov wrote:>
> Mike, sorry, I don't understand your email.
>
> Just in case, let me remind I know nothing about drivers/vhost/
And... this is slightly off-topic but you didn't answer my previous
question and I am just curious. Do you agree that, even if we forget
about CLONE_THREAD/signals, vhost_worker() needs fixes anyway because
it can race with vhost_work_queue() ?

Oleg.

Mike Christie

2023-May-29 19:35 UTC

head link

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

On 5/29/23 12:46 PM, Oleg Nesterov wrote:> Mike, sorry, I don't understand your email.
> 
> Just in case, let me remind I know nothing about drivers/vhost/
> 
No problem. I get it. I don't know the signal/thread code so it's
one of those things where I'm also learning as I go.
> On 05/29, michael.christie at oracle.com wrote:
>>
>> On 5/29/23 6:19 AM, Oleg Nesterov wrote:
>>> On 05/27, Eric W. Biederman wrote:
>>>>
>>>> Looking forward I don't see not asking the worker threads
to stop
>>>> for the coredump right now causing any problems in the future.
>>>> So I think we can use this to resolve the coredump issue I
spotted.
>>>
>>> But we have almost the same problem with exec.
>>>
>>> Execing thread will wait for vhost_worker() while vhost_worker will
wait for
>>> .release -> vhost_task_stop().
>>
>> For this type of case, what is the goal or correct behavior in the end?
>>
>> When get_signal returns true we can code things like you mention below
> 
> and you have mentioned in the next email that you have already coded
something
> like this, so perhaps we can delay the further discussions until you send
the
> new code?
> 
Ok. Let me post that. You guys and the vhost devs can argue about if it's
too ugly to merge :)

>> and
>> clean up the task_struct.
> 
> Hmm... If we you CLONE_THREAD the exiting vhost_worker() will auto-reap
itself,
Oh wait, are you saying that when we get auto-reaped then we would do the last
fput and call the file_operations->release function right? We actually set
task_struct->files = NULL for the vhost_task task_struct, so I think we call
release a little sooner than you think.

vhost_task_create() sets kernel_clone_args->no_files, so the vhost_task
task_struc
that gets created works like kthreads where it doesn't do a CLONE_FILES and
it
doesn't do a dup_fd.

So when we do de_thread() -> zap_other_threads(), that will kill all the
threads
in the group right? So when they exit, it will call our release function since
we don't have refcount on ourself.

> 
>> However, we now have a non-functioning vhost device
>> open and just sitting around taking up memory and it can't do any
IO.
> 
> can't comment, see above.
> 
>> For this type of case, do we expect just not to crash/hang, or was this
new
>> exec'd thread suppose to be able to use the vhost device?
> 
> I just tried to point out that (unless I missed something) there are more
corner
> cases, not just coredump.
Ok. I see.
> 
>>> Or suppose that vhost_worker's sub-thread forks a child with
CLONE_FILES...
>>
>> You mean the vhost_task's task/thread doing a function that does a
copy_process
>> right?
> 
> I meant that the vhost_task's sub-thread can do sys_clone(CLONE_FILES)
from
> userspace.
I think we were talking about different things. I was saying that when the vhost
layer does vhost_task_create() -> copy_process(), the kernel_clone_args.fn is
vhost_task_fn() -> vhost_worker(). I thought you were concerned about
vhost_worker()
or some function it calls, calling copy_process() with CLONE_FILES.

Virtualization - May 2023 - [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression