Oleg Nesterov
2023-May-22 12:30 UTC
[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
Confused, please help...

On 05/21, Mike Christie wrote:
>
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -338,6 +338,7 @@ static int vhost_worker(void *data)
>  	struct vhost_worker *worker = data;
>  	struct vhost_work *work, *work_next;
>  	struct llist_node *node;
> +	bool dead = false;
>
>  	for (;;) {
>  		/* mb paired w/ kthread_stop */
> @@ -349,8 +350,22 @@ static int vhost_worker(void *data)
>  		}
>
>  		node = llist_del_all(&worker->work_list);
> -		if (!node)
> +		if (!node) {
>  			schedule();
> +			/*
> +			 * When we get a SIGKILL our release function will
> +			 * be called. That will stop new IOs from being queued
> +			 * and check for outstanding cmd responses. It will then
> +			 * call vhost_task_stop to tell us to return and exit.
> +			 */

But who will call the release function / vhost_task_stop(), and when will
this happen after this thread gets SIGKILL?

> +			if (!dead && signal_pending(current)) {
> +				struct ksignal ksig;
> +
> +				dead = get_signal(&ksig);
> +				if (dead)
> +					clear_thread_flag(TIF_SIGPENDING);

If you do clear_thread_flag(TIF_SIGPENDING), then why do we need 1/3?

Also. Suppose that vhost_worker() dequeues SIGKILL and clears
TIF_SIGPENDING. SIGSTOP, PTRACE_INTERRUPT, or the freezer can then come
along and set TIF_SIGPENDING again. In this case, will the main for (;;)
loop spin without sleeping until vhost_task_should_stop() becomes true?

Oleg.
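[To make the failure mode Oleg describes concrete, here is a simplified
schematic of the worker loop -- not the real vhost_worker(); the
worker->vtsk field name and the loop structure are assumptions for
illustration, with only the patch's names reused:]

	/* schematic only -- simplified from the patch above */
	for (;;) {
		if (vhost_task_should_stop(worker->vtsk))
			break;			/* the only way out */

		node = llist_del_all(&worker->work_list);
		if (!node) {
			set_current_state(TASK_INTERRUPTIBLE);
			/*
			 * SIGKILL was already dequeued and TIF_SIGPENDING
			 * cleared, but SIGSTOP/PTRACE_INTERRUPT/the freezer
			 * set TIF_SIGPENDING again.  With a signal pending,
			 * an interruptible schedule() returns immediately,
			 * so this loop busy-spins until the owner's exit
			 * path finally calls vhost_task_stop().
			 */
			schedule();
			continue;
		}
		/* ... process queued vhost_work items ... */
	}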
Mike Christie
2023-May-22 17:00 UTC
[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
On 5/22/23 7:30 AM, Oleg Nesterov wrote:
> Confused, please help...
>
> On 05/21, Mike Christie wrote:
>>
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -338,6 +338,7 @@ static int vhost_worker(void *data)
>>  	struct vhost_worker *worker = data;
>>  	struct vhost_work *work, *work_next;
>>  	struct llist_node *node;
>> +	bool dead = false;
>>
>>  	for (;;) {
>>  		/* mb paired w/ kthread_stop */
>> @@ -349,8 +350,22 @@ static int vhost_worker(void *data)
>>  		}
>>
>>  		node = llist_del_all(&worker->work_list);
>> -		if (!node)
>> +		if (!node) {
>>  			schedule();
>> +			/*
>> +			 * When we get a SIGKILL our release function will
>> +			 * be called. That will stop new IOs from being queued
>> +			 * and check for outstanding cmd responses. It will then
>> +			 * call vhost_task_stop to tell us to return and exit.
>> +			 */
>
> But who will call the release function / vhost_task_stop(), and when will
> this happen after this thread gets SIGKILL?

When we get a SIGKILL, the thread that owns the device/vhost_task will
also exit, since it's in the same thread group, and it does:

	do_exit -> exit_files -> put_files_struct -> close_files -> fput

which eventually (the actual put is done via the delayed work path in
there) runs the file_operations->release. So the release function is
called in parallel, more or less, with the code above.

>> +			if (!dead && signal_pending(current)) {
>> +				struct ksignal ksig;
>> +
>> +				dead = get_signal(&ksig);
>> +				if (dead)
>> +					clear_thread_flag(TIF_SIGPENDING);
>
> If you do clear_thread_flag(TIF_SIGPENDING), then why do we need 1/3?

You're right. I don't need it. I thought __fatal_signal_pending() checked
the shared pending set as well, but it only checks the per-thread pending.

> Also. Suppose that vhost_worker() dequeues SIGKILL and clears
> TIF_SIGPENDING. SIGSTOP, PTRACE_INTERRUPT, or the freezer can then come
> along and set TIF_SIGPENDING again. In this case, will the main for (;;)
> loop spin without sleeping until vhost_task_should_stop() becomes true?

I see. So I either have to be able to call get_signal() after SIGKILL, or
at that point work like a kthread and ignore signals with something like:

	if (dead && signal_pending(current))
		flush_signals(current);

?
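[A rough sketch of the second option Mike floats -- consume SIGKILL once,
then drop any later signals the way a kthread would, so schedule() can
sleep again. An untested illustration only: the else-branch placement is
an assumption, not what any final patch does:]

	if (!node) {
		schedule();
		if (!dead && signal_pending(current)) {
			struct ksignal ksig;

			/*
			 * Dequeue SIGKILL; the owner's exit runs our
			 * ->release and vhost_task_stop() in parallel.
			 */
			dead = get_signal(&ksig);
			if (dead)
				clear_thread_flag(TIF_SIGPENDING);
		} else if (dead && signal_pending(current)) {
			/*
			 * SIGSTOP/PTRACE_INTERRUPT/the freezer re-set
			 * TIF_SIGPENDING after SIGKILL was consumed;
			 * discard them so the loop does not spin.
			 */
			flush_signals(current);
		}
	}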