thr3ads.net - openssh unix dev - openssh 7.6 and 7.7 on Oracle Linux 7 (compiled from source) doesn't start correctly with systemd [Aug 2018]


kevin martin

2018-Aug-23 14:53 UTC

openssh 7.6 and 7.7 on Oracle Linux 7 (compiled from source) doesn't start correctly with systemd

I'm not sure I agree with Peter in respect to his comment about "building a
dependency to systemd".  The only time a "dependency" would be created is
when the end-user would configure it to be there with a configure time flag
of --with-systemd.  Just having the code available and dormant without that
flag being provided builds in no dependency whatsoever and gives the
end-user their option to choose.

---


Regards,

Kevin Martin


On Wed, Aug 22, 2018 at 11:02 PM Damien Miller <djm at mindrot.org> wrote:
> On Wed, 22 Aug 2018, Peter Stuge wrote:
>
> > kevin martin wrote:
> > > not sure why having the systemd notify code in openssh as a
> > > configure time option would be such a bad thing.
> >
> > At the very least it introduces a dependency on libsystemd into sshd,
> > which is undesirable for reasons of security and convenience. The
> > principle of "you are done when you can not remove any more" confirms
> > that it is unwise to add dependencies without very careful consideration.
> >
> >
> > I've read through the debian and Red Hat bug reports.
> >
> > There are two different but related problems here:
> >
> > 1. For systemctl [re]start, when a .service file has Type=simple,
> > systemd assumes that service startup can never fail, and immediately
> > considers this service successfully started when the exec() of sshd
> > has succeeded.
> >
> > That's debatable design within systemd, but it's hard for systemd to
> > know when a given service has actually started successfully, and
> > services which fit that assumption do exist.
> >
> > So when sshd detects an error on startup and exits with an error code
> > shortly after being started, systemd considers the service to first
> > have started successfully and then to have exited with an error, so
> > it then restarts the service. Repeat.
> >
> > When service limits are exhausted the service ends up in a failed state.
> >
> > Meanwhile, the systemctl [re]start command doesn't report any error
> > to the administrator, because systemd considers the service to have
> > [re]started successfully once. This is "error messages are lost".
> >
> >
> > 2. For systemctl reload, systemd can and arguably should send SIGHUP
> > to sshd. More uncertainty and assumptions within systemd follow;
> > sshd re-exec:s, meaning that the PID stays the same, so systemd
> > doesn't receive SIGCHLD and so even if 1. is fixed, here systemd will
> > not understand that an error during startup of the new sshd is
> > to be considered a failed reload. I.e. the above problems apply here
> > again. The systemctl reload sshd command is always immediately
> > successful, even if re-exec:ed sshd detects an error in the config
> > file.
>
> Thanks for the detailed write up, Peter.
>
> I agree: what is happening here seems to be mostly bad assumptions and
> inflexibility inside systemd.
>
> I'm surprised that systemd made these design decisions, because sshd is
> not doing anything historically unique with regards to startup or reload
> behaviour and "works with existing daemons" seems to be requirement #0
> if you're writing an init system.
>
> Maybe the other daemon vendors didn't push back against this, but I'm
> willing to.
>
> -d
> _______________________________________________
> openssh-unix-dev mailing list
> openssh-unix-dev at mindrot.org
> https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
>
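
Peter's first point, quoted above, can be illustrated outside systemd. The sketch below is not systemd code; it only mimics the assumption he describes, that a service counts as started once the exec has succeeded. The function name `start_simple` and the stand-in "daemon" command are hypothetical, and a POSIX system is assumed.

```python
import subprocess
import sys


def start_simple(argv):
    """Mimic the Type=simple assumption: 'started' means the exec succeeded.

    On POSIX, subprocess.Popen raises OSError if the program cannot be
    executed, so reaching the return statement means the exec worked. It
    says nothing about whether the program then starts up correctly.
    """
    proc = subprocess.Popen(argv)
    started = True  # decided before the child has done any real work
    return proc, started


if __name__ == "__main__":
    # Hypothetical stand-in for a daemon that rejects its configuration
    # and exits with an error right after being executed.
    failing_daemon = [sys.executable, "-c", "import sys; sys.exit(1)"]
    proc, started = start_simple(failing_daemon)
    print("reported as started:", started)
    print("actual exit status:", proc.wait())
```

The caller is told the start succeeded and only learns of the failure later, from the exit status, which is the gap between "exec succeeded" and "service is up" that the thread is arguing about.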

Emmanuel Deloget

2018-Aug-23 16:21 UTC

Hello,

On Thu, Aug 23, 2018 at 4:53 PM kevin martin <ktmdms at gmail.com> wrote:
>
> I'm not sure I agree with Peter in respect to his comment about "building a
> dependency to systemd".  The only time a "dependency" would be created is
> when the end-user would configure it to be there with a configure time flag
> of --with-systemd.  Just having the code available and dormant without that
> flag being provided builds in no dependency whatsoever and gives the
> end-user their option to choose.

Not sure I should step in, but the code to deal with the user
selection and to notify systemd is a dependency - even if it's
compiled out. The fact is that you still have to maintain it and to
test it regularly.

The problem looks like a systemd configuration error. systemd allows
you to start a non-systemd-aware daemon. You need to look at [Service]
/ Type (notify is used for systemd-aware daemons).

BR,

-- Emmanuel Deloget

kevin martin

2018-Aug-23 16:48 UTC

While I appreciate the need to code it and test it regularly, Peter wrote a
bit of notify code and provided it to Damien that essentially does what the
systemd API code already does. That seems like remaking the wheel to me, and
it would still require ongoing maintenance and testing.  The systemd API is
developed and maintained external to openssh and is there specifically to
make it easier for apps that want to become daemons to be used effectively
in the systemd environment.  I hated the fact that most flavors of Linux
moved to systemd from the init system, but it's what we, the end users
(companies with 100's of thousands of Linux instances running), have to
live with. Having Redhat (and other vendors that don't necessarily take
their codebase from Redhat) make changes to *your* code to include systemd
enhancements could, I would think, lead to ongoing issues like this one.  If
*you*, the developers, included the API access as a configurable option,
then *we*, the consumers, could move to your newer codebase products sooner
and get the enhancements that you folks work so diligently to make in your
application, which is a win-win for all of us.

---

Regards,

Kevin Martin

On Thu, Aug 23, 2018 at 11:21 AM Emmanuel Deloget <logout at free.fr> wrote:
> Hello,
>
> On Thu, Aug 23, 2018 at 4:53 PM kevin martin <ktmdms at gmail.com> wrote:
> >
> > I'm not sure I agree with Peter in respect to his comment about "building a
> > dependency to systemd".  The only time a "dependency" would be created is
> > when the end-user would configure it to be there with a configure time flag
> > of --with-systemd.  Just having the code available and dormant without that
> > flag being provided builds in no dependency whatsoever and gives the
> > end-user their option to choose.
>
> Not sure I should step in, but the code to deal with the user
> selection and to notify systemd is a dependency - even if it's
> compiled out. The fact is that you still have to maintain it and to
> test it regularly.
>
> The problem looks like a systemd configuration error. systemd allows
> you to start a non-systemd-aware daemon. You need to look at [Service]
> / Type (notify is used for systemd-aware daemons).
>
> BR,
>
> -- Emmanuel Deloget
>

Damien Miller

2018-Aug-23 23:50 UTC

On Thu, 23 Aug 2018, kevin martin wrote:
> I'm not sure I agree with Peter in respect to his comment about "building a
> dependency to systemd".  The only time a "dependency" would be created is
> when the end-user would configure it to be there with a configure time flag
> of --with-systemd.  Just having the code available and dormant without that
> flag being provided builds in no dependency whatsoever and gives the
> end-user their option to choose.

If it's in the code that we maintain then it's a dependency. I don't
think any other definition makes sense.

Peter Stuge

2018-Aug-24 18:06 UTC

Damien Miller wrote:
> If it's in the code that we maintain then it's a dependency. I don't
> think any other definition makes sense.

I define dependency as external library or component, required at
compile- and runtime.

Emmanuel Deloget wrote:
> Not sure I should step in, but the code to deal with the user selection

What user selection?

> and to notify systemd is a dependency - even if it's compiled out. The
> fact is that you still have to maintain it and to test it regularly.

Did you read the code I sent? It does six system calls in about 85 lines,
three of which can't fail. I wrote it exactly because something so short
is sufficient.

The existing code in OpenSSH which sends messages with ancillary data
was last touched in 2010, 8 years ago. This isn't complicated, making
a compile- and runtime dependency on libsystemd doubly undesirable.

> The problem looks like a systemd configuration error. systemd allows
> you to start a non-systemd-aware daemon. You need to look at [Service]
> / Type (notify is used for systemd-aware daemons).
The discussion isn't about the basic functionality known from sysvinit,
using init scripts or even inittab directly. sshd of course already runs
fine on many systemd systems, and it runs fine also without debian's
patch to depend on libsystemd, when only considering the bare minimum
of a service manager.

But that ignores the additional functionality offered by systemd to its
users. I think it makes sense for sshd to support that functionality if
the cost of doing so is low. I suggest that my proposed code is low cost.

The implemented API (AF_UNIX message with ancillary data) is documented
and there's no technical reason for it to change, so the maintenance burden
will likely be similar to monitor_fdpass.c - little change in 8 years.
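
To give a sense of how small the daemon side of the notify socket can be, here is a minimal Python sketch. It is not the C code Peter sent to the list (which is not reproduced in this thread); it assumes the documented convention that the service manager passes a socket address in the NOTIFY_SOCKET environment variable and expects an AF_UNIX datagram containing "READY=1". The function name `notify_ready` is hypothetical, and the sketch sends a plain datagram without building any explicit ancillary data.

```python
import os
import socket


def notify_ready(message=b"READY=1", env=os.environ):
    """Tell the service manager that startup has finished.

    Returns True if a datagram was sent and False if NOTIFY_SOCKET is
    not set, i.e. the process was not started by a notify-aware manager.
    """
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" names a Linux abstract-namespace socket, which is
    # addressed with a leading NUL byte instead.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, path)
    return True
```

A daemon would call `notify_ready()` once, after it has parsed its configuration and bound its listening sockets. Run outside such a service manager, the call does nothing and returns False.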

kevin martin wrote:
> seems like remaking the wheel to me

It is, but I think avoiding the library dependency is a good reason to
do so, especially considering how little code is needed.

> would still require ongoing maintenance and testing
Did you read the code?

> If *you as the developers included the API access as a configurable option
> then *we the consumer could move to your newer codebase products sooner and
> get the enhancements that you folks work so diligently to make in your
> application which is a win-win for all of us.
You imply an obligation for developers to enable consumers to "move
to newer codebase products" in their systems - please re-read the
software license, and please remember that no such obligation exists.

Over the years I've learned that open source software only works if you
take responsibility for it yourself. If you fail to do so then you will
inevitably have a bad experience.

One way is to get a support contract (like Red Hat in this case) and
then they are of course obliged to honor that. You'll get something,
(a dependency on libsystemd) but since they too optimize for cost you
can be pretty sure that it's not the best solution - that would come
from upstream.

David Newall wrote:
> I'm old school and think systemd is an overly complicated abomination.

I agree completely that the systemd implementation and overall ambition
are overly complicated.

But the unit data model is very very good, and I think systemd is a great
improvement in service management compared to anything on Linux before.

While we argue, Windows has had proper service management since the 1990s.

> The more projects that do support it, the more legitimacy it is lent.
That ship has sailed. Please think more about the role of Red Hat in
the Linux ecosystem.

Jochen Bern wrote:
> wrapper to translate as needed would seem a pragmatic solution

I consider that wrapper to be Type=forking in the service file, but
anything short of explicit notification from daemon to service manager
leaves the service manager without complete information about the state
of the daemon. That's not great.

> (*) PID file, lookup in the process table, check for a LISTEN, pattern
> match in a logfile, running a dedicated *client* executable / Nagios
> plugin / ${DAEMON}ctl tool for a test, throwing the daemon a
> SIGAREYOUWELL/shmem/semaphore/... request, you name it
None of these is both explicit and generic (across services).

I think the notify socket is a good, simple solution, and one that is not
tied to systemd. I think it is worth supporting.

//Peter
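
Peter's remark that the notify socket is not tied to systemd can be made concrete with a sketch of the receiving side: any supervisor can bind an AF_UNIX datagram socket, export its path as NOTIFY_SOCKET to the daemon it starts, and wait for "READY=1". The code below is a simplified assumption of how such a supervisor might look, not systemd's implementation; `open_notify_socket` and `wait_for_ready` are hypothetical names.

```python
import socket
import time


def open_notify_socket(path):
    """Bind the datagram socket that daemons will send notifications to."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(path)
    return sock


def wait_for_ready(sock, timeout=5.0):
    """Wait until a READY=1 notification arrives.

    Returns True on READY=1 and False if the timeout expires first. A
    datagram may carry several newline-separated assignments, so other
    lines such as STATUS=... are skipped rather than treated as errors.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        sock.settimeout(remaining)
        try:
            data = sock.recv(4096)
        except socket.timeout:
            return False
        if b"READY=1" in data.split(b"\n"):
            return True
```

With this in place the supervisor can report a start as failed when the daemon exits or the timeout expires before READY=1 arrives, instead of guessing from the exec result, a PID file, or a log pattern.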
