Tetsuo Handa
2020-Jul-02 04:26 UTC
[Bridge] linux-next: umh: fix processed error when UMH_WAIT_PROC is used seems to break linux bridge on s390x (bisected)
On 2020/07/02 0:38, Luis Chamberlain wrote:> @@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info) > */ > if (KWIFEXITED(ret)) > sub_info->retval = KWEXITSTATUS(ret); > + /* > + * Do we really want to be passing the signal, or do we pass > + * a single error code for all cases? > + */ > + else if (KWIFSIGNALED(ret)) > + sub_info->retval = KWTERMSIG(ret);No, this is bad. Caller of usermode helper is unable to distinguish exit(9) and e.g. SIGKILL'ed by the OOM-killer. Please pass raw exit status value. I feel that caller of usermode helper should not use exit status value. For example, call_sbin_request_key() is checking test_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags) || key_validate(key) < 0 condition (if usermode helper was invoked) in order to "ignore any errors from userspace if the key was instantiated".> + /* Same here */ > + else if (KWIFSTOPPED((ret))) > + sub_info->retval = KWSTOPSIG(ret); > + /* And are we really sure we want this? */ > + else if (KWIFCONTINUED((ret))) > + sub_info->retval = 0; > } > > /* Restore default kernel sig handler */ >
Luis Chamberlain
2020-Jul-02 19:47 UTC
[Bridge] linux-next: umh: fix processed error when UMH_WAIT_PROC is used seems to break linux bridge on s390x (bisected)
On Thu, Jul 02, 2020 at 01:26:53PM +0900, Tetsuo Handa wrote:> On 2020/07/02 0:38, Luis Chamberlain wrote: > > @@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info) > > */ > > if (KWIFEXITED(ret)) > > sub_info->retval = KWEXITSTATUS(ret); > > + /* > > + * Do we really want to be passing the signal, or do we pass > > + * a single error code for all cases? > > + */ > > + else if (KWIFSIGNALED(ret)) > > + sub_info->retval = KWTERMSIG(ret); > > No, this is bad. Caller of usermode helper is unable to distinguish exit(9) > and e.g. SIGKILL'ed by the OOM-killer.Right, the question is: do we care?> Please pass raw exit status value. > > I feel that caller of usermode helper should not use exit status value. > For example, call_sbin_request_key() is checking > > test_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags) || key_validate(key) < 0 > > condition (if usermode helper was invoked) in order to "ignore any errors from > userspace if the key was instantiated".For those not familiar with this code path, or if you cannot decipher the above, the code path in question was: static int call_sbin_request_key(struct key *authkey, void *aux) { ... /* do it */ ret = call_usermodehelper_keys(request_key, argv, envp, keyring, UMH_WAIT_PROC); kdebug("usermode -> 0x%x", ret); if (ret >= 0) { /* ret is the exit/wait code */ if (test_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags) || key_validate(key) < 0) ret = -ENOKEY; /* ignore any errors from userspace if the key was * instantiated */ ret = 0; } ... } And the umh patch "umh: fix processed error when UMH_WAIT_PROC is used" changed this to: - if (ret >= 0) { + if (ret != 0) { Prior to the patch negative return values from userspace were still being captured, and likewise signals, but the error value was not raw, not the actual value. After the patch, since we check for ret != 0 we still upkeep the sanity check for any error, correct the error value, but as you noted signals were ignored as I made the wrong assumption we would ignore them. The umh sub_info->retval is set after my original patch only if KWIFSIGNALED(ret)), and ignored signals, and so that would be now capitured with the additional KWIFSIGNALED(ret)) check. The question still stands: Do we want to open code all these checks or simply wrap them up in the umh. If we do the later, as you note exit(9) and a SIGKILL will be the same to the inspector in the kernel. But do we care? Do we really want umh callers differntiatin between signals and exit values? The alternative to making a compromise is using generic wrappers for things which make sense and letting the callers use those. Luis> > + /* Same here */ > > + else if (KWIFSTOPPED((ret))) > > + sub_info->retval = KWSTOPSIG(ret); > > + /* And are we really sure we want this? */ > > + else if (KWIFCONTINUED((ret))) > > + sub_info->retval = 0; > > } > > > > /* Restore default kernel sig handler */ > > >
Tetsuo Handa
2020-Jul-03 00:52 UTC
[Bridge] linux-next: umh: fix processed error when UMH_WAIT_PROC is used seems to break linux bridge on s390x (bisected)
On 2020/07/03 4:46, Luis Chamberlain wrote:> On Thu, Jul 02, 2020 at 01:26:53PM +0900, Tetsuo Handa wrote: >> On 2020/07/02 0:38, Luis Chamberlain wrote: >>> @@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info) >>> */ >>> if (KWIFEXITED(ret)) >>> sub_info->retval = KWEXITSTATUS(ret); >>> + /* >>> + * Do we really want to be passing the signal, or do we pass >>> + * a single error code for all cases? >>> + */ >>> + else if (KWIFSIGNALED(ret)) >>> + sub_info->retval = KWTERMSIG(ret); >> >> No, this is bad. Caller of usermode helper is unable to distinguish exit(9) >> and e.g. SIGKILL'ed by the OOM-killer. > > Right, the question is: do we care?Yes, we have to care.> And the umh patch "umh: fix processed error when UMH_WAIT_PROC is used" > changed this to: > > - if (ret >= 0) { > + if (ret != 0) { > > Prior to the patch negative return values from userspace were still > being captured, and likewise signals, but the error value was not > raw, not the actual value. After the patch, since we check for ret != 0 > we still upkeep the sanity check for any error, correct the error value, > but as you noted signals were ignored as I made the wrong assumption > we would ignore them. The umh sub_info->retval is set after my original > patch only if KWIFSIGNALED(ret)), and ignored signals, and so that > would be now capitured with the additional KWIFSIGNALED(ret)) check."call_usermodehelper_keys() == 0" (i.e. usermode helper was successfully started and successfully terminated via exit(0)) is different from "there is nothing to do". call_sbin_request_key() == 0 case still has to check for possibility of -ENOKEY case.> > The question still stands: > > Do we want to open code all these checks or simply wrap them up in > the umh. If we do the later, as you note exit(9) and a SIGKILL will > be the same to the inspector in the kernel. But do we care?Yes, we do care.> > Do we really want umh callers differntiatin between signals and exit values?Yes, we do.> > The alternative to making a compromise is using generic wrappers for > things which make sense and letting the callers use those.I suggest just introducing KWIFEXITED()/KWEXITSTATUS()/KWIFSIGNALED()/KWTERMSIG() macros and fixing the callers, for some callers are not aware of possibility of KWIFSIGNALED() case. For example, conn_try_outdate_peer() in drivers/block/drbd/drbd_nl.c misbehaves if drbd_usermode_helper process was terminated by a signal, for the switch() statement after returning from conn_helper() is assuming that the return value of conn_helper() is a KWEXITSTATUS() value if drbd_usermode_helper process was successfully started. If drbd_usermode_helper process was terminated by SIGQUIT (which is 3), conn_try_outdate_peer() will by error hit "case P_INCONSISTENT:" (which is 3); conn_try_outdate_peer() should hit "default: /* The script is broken ... */" unless KWIFEXITED() == true. Your patch is trying to obnubilate the return code.