Mike Christie
2023-May-22 17:00 UTC
[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
On 5/22/23 7:30 AM, Oleg Nesterov wrote:> Confused, please help... > > On 05/21, Mike Christie wrote: >> >> --- a/drivers/vhost/vhost.c >> +++ b/drivers/vhost/vhost.c >> @@ -338,6 +338,7 @@ static int vhost_worker(void *data) >> struct vhost_worker *worker = data; >> struct vhost_work *work, *work_next; >> struct llist_node *node; >> + bool dead = false; >> >> for (;;) { >> /* mb paired w/ kthread_stop */ >> @@ -349,8 +350,22 @@ static int vhost_worker(void *data) >> } >> >> node = llist_del_all(&worker->work_list); >> - if (!node) >> + if (!node) { >> schedule(); >> + /* >> + * When we get a SIGKILL our release function will >> + * be called. That will stop new IOs from being queued >> + * and check for outstanding cmd responses. It will then >> + * call vhost_task_stop to tell us to return and exit. >> + */ > > But who will call the release function / vhost_task_stop() and when this > will happen after this thread gets SIGKILL ?When we get a SIGKILL, the thread that owns the device/vhost_task will also exit since it's the same thread group and it does: do_exit -> exit_files -> put_files_struct -> close_files -> fput which eventually (the actual put is done via the delayed work path in there) runs the file_operation->release. So the release function is being called in parallel more or less as the code above.> >> + if (!dead && signal_pending(current)) { >> + struct ksignal ksig; >> + >> + dead = get_signal(&ksig); >> + if (dead) >> + clear_thread_flag(TIF_SIGPENDING); > > If you do clear_thread_flag(TIF_SIGPENDING), then why do we need 1/3 ?You're right. I don't need it. I thought __fatal_signal_pending checked the shared pending as well but it only checks pending.> Also. Suppose that vhost_worker() dequeues SIGKILL and clears TIF_SIGPENDING. > > SIGSTOP, PTRACE_INTERRUPT, freezer can come and set TIF_SIGPENDING again. > In this case the main for (;;) loop will spin without sleeping until > vhost_task_should_stop() becomes true?I see. So I either have to be able to call get_signal after SIGKILL or at this time work like a kthread and ignore signals like a if (dead && signal_pending()) flush_signals() ?
Oleg Nesterov
2023-May-22 17:47 UTC
[PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
On 05/22, Mike Christie wrote:> > On 5/22/23 7:30 AM, Oleg Nesterov wrote: > >> + /* > >> + * When we get a SIGKILL our release function will > >> + * be called. That will stop new IOs from being queued > >> + * and check for outstanding cmd responses. It will then > >> + * call vhost_task_stop to tell us to return and exit. > >> + */ > > > > But who will call the release function / vhost_task_stop() and when this > > will happen after this thread gets SIGKILL ? > > When we get a SIGKILL, the thread that owns the device/vhost_task will > also exit since it's the same thread group and it does: > > do_exit -> exit_files -> put_files_struct -> close_files -> fputAh. thanks. I confused CLONE_FS in vhost_task_create() with CLONE_FILES.> > Also. Suppose that vhost_worker() dequeues SIGKILL and clears TIF_SIGPENDING. > > > > SIGSTOP, PTRACE_INTERRUPT, freezer can come and set TIF_SIGPENDING again. > > In this case the main for (;;) loop will spin without sleeping until > > vhost_task_should_stop() becomes true? > > I see. So I either have to be able to call get_signal after SIGKILL or > at this time work like a kthread and ignore signals like a > > if (dead && signal_pending()) > flush_signals() > ?Right now I think that "int dead" should die, and you should simply do get_signal() + clear(SIGPENDING) if signal_pending() == T , but let me think tomorrow. Oleg.
Reasonably Related Threads
- [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
- [PATCH 1/1] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
- [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
- [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
- [PATCH 1/1] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression