On 12/22/21 12:24 PM, Eric W. Biederman wrote:
> All I am certain of is that you need to set
> "args->exit_signal = -1;". This prevents having to play games with
> do_notify_parent.
Hi Eric,
I have all your review comments handled except this one. It's looking like it's
more difficult than just setting the exit_signal=-1, so I wanted to check that
I understood you.
Here is what I'm currently looking at:
1. I can't just set args->exit_signal to -1, because we end up with a
task_struct that's partially set up like a CLONE_THREAD task. What happens is
copy_process will set the task's exit_signal to -1 and then
thread_group_leader() will return false. When code like the
thread_group_leader check in copy_process runs, we will then go down the
CLONE_THREAD paths, which are not set up, and hit crashes.
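To illustrate why exit_signal and leader detection are tied together, here is a minimal userspace sketch. It assumes thread_group_leader() is the plain "exit_signal >= 0" test from include/linux/sched/signal.h; struct task_model and model_thread_group_leader() are hypothetical names made up for this example, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for task_struct: only the field that matters here. */
struct task_model {
	int exit_signal;	/* -1 on non-leader threads, >= 0 on a group leader */
};

/*
 * Assumed to mirror the kernel helper: leadership is inferred from
 * exit_signal alone, so a task created with exit_signal == -1 is never
 * treated as a leader, even if it was not created with CLONE_THREAD.
 */
static bool model_thread_group_leader(const struct task_model *p)
{
	return p->exit_signal >= 0;
}
```

With this model, a task with exit_signal set to SIGCHLD counts as a leader, while the same task with exit_signal set to -1 does not, which is the mismatch described in point 1.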
We would need changes like the following, which no longer crash but are not
correct for many reasons. I am just posting this code as an example of the issue
I am hitting.
@@ -1637,11 +1637,13 @@ static void posix_cpu_timers_init_group(struct signal_struct *sig)
 	posix_cputimers_group_init(pct, cpu_limit);
 }
-static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
+static int copy_signal(unsigned long clone_flags, struct task_struct *tsk,
+		       struct kernel_clone_args *args)
 {
 	struct signal_struct *sig;
-	if (clone_flags & CLONE_THREAD)
+	if (clone_flags & CLONE_THREAD || args->exit_signal == -1)
 		return 0;
 	sig = kmem_cache_zalloc(signal_cachep, GFP_KERNEL);
@@ -2194,7 +2244,7 @@ static __latent_entropy struct task_struct *copy_process(
 	retval = copy_sighand(clone_flags, p);
 	if (retval)
 		goto bad_fork_cleanup_fs;
-	retval = copy_signal(clone_flags, p);
+	retval = copy_signal(clone_flags, p, args);
 	if (retval)
 		goto bad_fork_cleanup_sighand;
 	retval = copy_mm(clone_flags, p);
@@ -2277,6 +2327,9 @@ static __latent_entropy struct task_struct *copy_process(
 	if (clone_flags & CLONE_THREAD) {
 		p->group_leader = current->group_leader;
 		p->tgid = current->tgid;
+	} else if (args->exit_signal == -1) {
+		p->group_leader = current->group_leader;
+		p->tgid = p->pid;
 	} else {
 		p->group_leader = p;
 		p->tgid = p->pid;
2. Instead of #1, I could add some code where we just set
task_struct->exit_signal to -1. We could do this towards the end of
copy_process or after it has returned, but before we do do_exit. However, that
will have similar issues as #1 during the exit handling.
For example, __exit_signal will call thread_group_leader, which would return
false. __unhash_process would then not detach the pid and we would later hit
crashes due to the task_struct being freed already. I could add code like above
to the exit related code paths, but it gets messy like above.
3. I thought I could separate the leader detection from the exit signal by
adding a flag/field to kernel_clone_args and task_struct. But then I get to the
point where I just need a check for USER/VHOST_WORKER tasks in exit_notify,
which is similar to the patch you didn't like where I added the check in
do_notify_parent. So I thought you might not like this approach.
Note:
We can't set our task's exit_signal to SIGCHLD and get autoreaped like
suggested in another mail. The original idea for the do_notify_parent change
was that we wanted the behavior that kthreads have, where they get autoreaped
on exit. kthreads get autoreaped there because the kthreadd task that is the
parent ignores all signals and so we hit the parent SIG_IGN check:
	psig = tsk->parent->sighand;
	spin_lock_irqsave(&psig->siglock, flags);
	if (!tsk->ptrace && sig == SIGCHLD &&
	    (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
	     (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT))) {
Our parent, the qemu task, does not ignore SIGCHLD and so will not hit the code
above.
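The difference between the two parents can be sketched in userspace as below. This is only a model of the condition quoted above, not the kernel implementation: struct parent_model, enum model_handler, MODEL_SIGCHLD and model_autoreap() are hypothetical names, and the signal number is hard-coded purely for illustration.

```c
#include <stdbool.h>

#define MODEL_SIGCHLD 17	/* illustrative value, not taken from kernel headers */

/* Hypothetical summary of the parent's SIGCHLD disposition. */
enum model_handler { MODEL_SIG_DFL, MODEL_SIG_IGN, MODEL_SIG_HANDLER };

struct parent_model {
	enum model_handler sigchld_handler;
	bool sa_nocldwait;	/* models SA_NOCLDWAIT being set for SIGCHLD */
};

/*
 * Models the quoted check: the child is autoreaped only if it is not
 * ptraced, exits with SIGCHLD, and the parent either ignores SIGCHLD
 * or has SA_NOCLDWAIT set.
 */
static bool model_autoreap(bool ptraced, int sig, const struct parent_model *parent)
{
	return !ptraced && sig == MODEL_SIGCHLD &&
	       (parent->sigchld_handler == MODEL_SIG_IGN || parent->sa_nocldwait);
}
```

A kthreadd-like parent (SIGCHLD ignored) satisfies the condition, while a qemu-like parent that leaves SIGCHLD at its default or installs a handler does not, so the worker would stay around until someone waits on it.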
4. Maybe I am going in the wrong direction and we need kthreads. I could add a:
	if (!is_ucounts_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC,
				  rlimit(RLIMIT_NPROC)))
		inc_rlimit_ucounts(task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
to vhost.c or to kthread.c when some new arg is passed in.
What do you think?