thr3ads.net - openssh unix dev - openssh 7.6 and 7.7 on Oracle Linux 7 (compiled from source) doesn't start correctly with systemd [Aug 2018]

Peter Stuge

2018-Aug-23 17:49 UTC

Damien Miller wrote:
> I agree: what is happening here seems to be mostly bad assumptions and
> inflexibility inside systemd.

I didn't say that, and I don't agree with it. To me it's welcome
ambition rather than bad assumptions.

Consider this:

How could systemd determine whether startup of a foreground daemon
completed successfully or failed?

Other than explicit notification (like an AF_UNIX message) systemd
could only use time; it could wait for the daemon to exit(EXIT_FAILURE)
after exec() - but how long is long enough? Every answer is incorrect.

Since systemd can't know when sshd has successfully started I find it
really reasonable to assume "immediately" in the Type=simple case.

> I'm surprised that systemd made these design decisions, because sshd is
> not doing anything historically unique with regards to startup or reload
> behaviour and "works with existing daemons" seems to be requirement #0
> if you're writing an init system.

That's not fair.

systemd works with sshd just as well as if I added sshd to my inittab
on a SysV init system, but that's not so useful.

systemd works well with sshd using Type=forking, but if the config
file breaks and a reload is issued (and sshd exits, because bad config)
then systemd detects that sshd exited, but it can't know why, so it
can't output a status message.

systemd is indeed more ambitious than e.g. SysV init, and for service
management I consider that a leap in the right direction. (For many other
things which systemd wants to do, not so much; I don't use those.)

> Maybe the other daemon vendors didn't push back against this, but I'm
> willing to.

Please don't push back just for the sake of it.

Did you look at the code I sent?

Would you take a patch with essentially that code, without any
libsystemd dependency, to make sshd work as a Type=notify service,
enabling maximum usability with systemd?
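
For illustration, the readiness protocol that Type=notify relies on is small: the service manager exports a socket path in $NOTIFY_SOCKET, and the service sends it a datagram such as READY=1. What follows is a minimal sketch under those assumptions, written in Python rather than sshd's C for brevity. The function name notify_ready is hypothetical, and this is not the code referred to above.

```python
import os
import socket


def notify_ready(state: bytes = b"READY=1") -> bool:
    """Send a readiness datagram to the service manager, if one is listening.

    Returns False (and does nothing) when NOTIFY_SOCKET is unset or malformed,
    so the same code is harmless when not started by systemd.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path or path[0] not in ("/", "@"):
        return False

    addr = os.fsencode(path)
    if addr.startswith(b"@"):
        # A leading '@' denotes a Linux abstract-namespace socket,
        # which is addressed with a leading NUL byte instead.
        addr = b"\0" + addr[1:]

    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state, addr)
    return True
```

A daemon would call notify_ready() once, at the point where it considers startup complete; without NOTIFY_SOCKET in the environment the call is a no-op.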

//Peter

Jochen Bern

2018-Aug-24 12:04 UTC

On 08/23/2018 07:49 PM, Peter Stuge wrote:
> How could systemd determine whether startup of a foreground daemon
> completed successfully or failed?
> Other than explicit notification (like an AF_UNIX message) systemd
> could only use time; it could wait for the daemon to exit(EXIT_FAILURE)
> after exec() - but how long is long enough? Every answer is incorrect.

If we can agree that neither systemd nor "legacy" methods(*) of getting
feedback from daemon processes will cease to exist just because the
other side wishes them to hard enough, then complementing either side
(but preferably systemd) with a (general, configurable, contrib/ subdir
based) wrapper to translate as needed would seem a pragmatic solution.
</?.02>

(*) PID file, lookup in the process table, check for a LISTEN, pattern
match in a logfile, running a dedicated *client* executable / Nagios
plugin / ${DAEMON}ctl tool for a test, throwing the daemon a
SIGAREYOUWELL/shmem/semaphore/... request, you name it
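
As one concrete instance of the "check for a LISTEN" method, a wrapper could poll the daemon's port until a connection succeeds or a deadline passes. This is only a sketch: the function name wait_for_listen and the timeout and interval defaults are assumptions chosen for illustration, and a real wrapper would still have to translate the result into whatever its service manager expects.

```python
import socket
import time


def wait_for_listen(host: str, port: int,
                    timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # Not listening yet (or unreachable); retry until the deadline.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The inherent trade-off is visible in the parameters: a short interval wastes cycles, a long one delays detection, and any timeout is a guess.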

Regards,
-- 
Jochen Bern
Systemingenieur

Binect GmbH

Colin Watson

2018-Aug-24 17:19 UTC

On Fri, Aug 24, 2018 at 02:04:13PM +0200, Jochen Bern wrote:
> On 08/23/2018 07:49 PM, Peter Stuge wrote:
> > How could systemd determine whether startup of a foreground daemon
> > completed successfully or failed?
> > Other than explicit notification (like an AF_UNIX message) systemd
> > could only use time; it could wait for the daemon to exit(EXIT_FAILURE)
> > after exec() - but how long is long enough? Every answer is incorrect.
> 
> If we can agree that neither systemd nor "legacy" methods(*) of getting
> feedback from daemon processes will cease to exist just because the
> other side wishes them to hard enough, then complementing either side
> (but preferably systemd) with a (general, configurable, contrib/ subdir
> based) wrapper to translate as needed would seem a pragmatic solution.
> </?.02>
> 
> (*) PID file, lookup in the process table, check for a LISTEN, pattern
> match in a logfile, running a dedicated *client* executable / Nagios
> plugin / ${DAEMON}ctl tool for a test, throwing the daemon a
> SIGAREYOUWELL/shmem/semaphore/... request, you name it

I doubt that anyone using OpenSSH with systemd would want to use a
polling-based (and thus inefficient) hack like that when they could just
apply the tiny patch to slot in an sd_notify call between listen and
accept.  (And I definitely see the logic behind notifying the service
manager at that point; I've dealt with complex services built on top of
OpenSSH that needed to arrange the boot sequence so that they started
only once sshd was actually ready to accept connections, and without
this kind of approach they had to settle for arbitrary delays and race
conditions.)
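
To show why the point between listen and accept is the right one, here is a small sketch in Python (not sshd's C, and not the actual patch). The names serve_one and on_ready are hypothetical; on_ready stands in for the sd_notify call. Once listen() has returned, the kernel queues incoming connections, so a client started in response to the notification can connect before the server reaches accept().

```python
import socket
from typing import Callable


def serve_one(on_ready: Callable[[int], None]) -> bytes:
    """Bind, listen, report readiness, then accept one connection."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        srv.listen()
        port = srv.getsockname()[1]

        # Readiness is reported here: after listen(), before accept().
        # Connections made from now on wait in the listen backlog.
        on_ready(port)

        conn, _ = srv.accept()
        with conn:
            return conn.recv(1024)
```

Anything started in response to on_ready can connect immediately, with no arbitrary delay and no race.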

systemd has its structural problems, but this is one thing it gets
right.  To my mind, the reasons for avoiding linking against libsystemd
with a configure-time switch are essentially political; if you're
running on a systemd-based system then it's paged in anyway so the
runtime cost is negligible, if you're not then sd_notify is already
careful to do nothing and do so cheaply, and in general I think it makes
more sense to use common code to notify the service manager than to
duplicate it.  (I still have a soft spot for the hacky "SIGSTOP yourself
and have init send you SIGCONT when it notices" approach to this problem
that we took in upstart, but I can understand why systemd preferred to
do something else.)
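
For readers unfamiliar with the SIGSTOP approach mentioned above, it can be sketched roughly as follows. This is an illustration only, not upstart's code, and the function name is hypothetical. The child plays the daemon and stops itself once its setup is done; the parent plays init, notices the stop via waitpid() with WUNTRACED, and resumes the child with SIGCONT.

```python
import os
import signal


def start_and_wait_until_ready() -> int:
    """Fork a 'daemon' that signals readiness by stopping itself."""
    pid = os.fork()
    if pid == 0:
        # Child ("daemon"): setup would happen here, then stop to signal
        # readiness. Execution resumes after this line once SIGCONT arrives.
        os.kill(os.getpid(), signal.SIGSTOP)
        os._exit(0)

    # Parent ("init"): WUNTRACED makes waitpid() report stopped children too.
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status):
        raise RuntimeError("daemon exited before signalling readiness")

    # The daemon is ready; let it continue running.
    os.kill(pid, signal.SIGCONT)
    return pid
```

The parent needs no socket and no timeout, but the daemon must be the service manager's direct child for waitpid() to see the stop.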

Obviously it's better to get patches upstream wherever possible.  But
honestly, speaking as a downstream who maintains a patch that calls
sd_notify in the right place, I'd rather have to maintain that patch
indefinitely than have a worse hack upstream that I'd then have to undo
or otherwise work around.

-- 
Colin Watson                                       [cjwatson at debian.org]
