thr3ads.net - openssh bugs - [Bug 2641] New: Add systemd notify code to to track running server [Nov 2016]

If this information is useful, please help other people find it:
Share via:

bugzilla-daemon at bugzilla.mindrot.org

2016-Nov-21 14:26 UTC

[Bug 2641] New: Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

            Bug ID: 2641
           Summary: Add systemd notify code to to track running server
           Product: Portable OpenSSH
           Version: 7.3p1
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: sshd
          Assignee: unassigned-bugs at mindrot.org
          Reporter: jjelen at redhat.com
                CC: cjwatson at debian.org

Created attachment 2893
  --> https://bugzilla.mindrot.org/attachment.cgi?id=2893&action=edit
add systemd bits

Ever since systemd, there are problems with keeping track of the
running daemon process [1] and errors [2]. We were trying to avoid this
solution, but after several bugs and many failed attempts from various
people, it looks like there is no other working way than to use their
API to notify them that the daemon is running. Also Debian moved to
this solution [3].

Unfortunately Colin didn't manage to send the patch upstream, even
though it would be useful for others (adding to CC, sorry if it was
intentional) so I am trying to do so.

The code is very simple, guarded by the ifdefs and enabled only if
--with-systemd switch is added to configure. If it will not get to
OpenBSD, I thing carrying it in portable would certainly make a sense.
I also added a sample systemd service file to contrib directory to
reflect its usage (not particularly needed).

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1381997
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1291172
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778913

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2016-Nov-28 23:09 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

Darren Tucker <dtucker at zip.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dtucker at zip.com.au

--- Comment #1 from Darren Tucker <dtucker at zip.com.au> ---
Created attachment 2896
  --> https://bugzilla.mindrot.org/attachment.cgi?id=2896&action=edit
do not call daemon again when restarting on sighup

I'm not wild about letting that particular camel's nose into the tent. 
How about we fix the sighup behaviour to not call daemon, keep the same
pid and not touch the pidfile?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2016-Dec-31 12:10 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

--- Comment #2 from Jakub Jelen <jjelen at redhat.com> ---
Sorry for a quite late answer.

I got bitten by this while trying to rebase to the OpenSSH 7.4. Your
patch got into this release in 7fc4766a, but was partially reverted in
f2398eb7.

But to my understanding, the daemonized() function returns 1 even for
the first call running from systemd service file and therefore prevents
call to daemon() and therefore systemd does not keep track of the
service again.

sshd[12655]: debug3: already daemonized
systemd[1]: Starting OpenSSH server daemon...

Reverting the chunk with the condition

        if (!(debug_flag || inetd_flag || no_daemon_flag ||
already_daemon)) {

makes it working again.

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2016-Dec-31 22:39 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

--- Comment #3 from Colin Watson <cjwatson at debian.org> ---
Jakub: That's only a problem because for some reason Fedora doesn't use
the -D option when running from systemd (if you did, then
no_daemon_flag would be true so the value of already_daemon would be
irrelevant).  When you have a smart service supervisor, IMO -D makes
much more sense.

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Jan-02 07:43 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

--- Comment #4 from Jakub Jelen <jjelen at redhat.com> ---
Colin, the -D was omitted (and Type=Forking added) because systemd was
unable to report failures (when configuration file had an error), as
described in the Red Hat bugzilla [1].

Certainly, the -D makes more sense these days, but with that, typing

  systemctl restart sshd

is not producing any error with -D option. Also no error goes into the
journal (as in the description of the linked bug).

The one solution is "letting that particular camel's nose into the
tent" as used in Ubuntu/Debian, but I still hope, there is a better way
to do that.

Clearly, what I am trying to point out here is that the applied change
is modifying the behavior not only for re-exec (reload), but also for
the "start", which was not intentional to my understanding.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1291172

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Feb-03 15:16 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

--- Comment #5 from Jakub Jelen <jjelen at redhat.com> ---
The test if the process is daemon passes for every process that is run
as a systemd service so it does not solve the problem for us. As the
result with the pushed change the 7.4p1 will not write PID file when
running as a systemd service.

I would consider a bit different approach. We can remember if we
already called daemon() using environment variable and prevent the
repeated calls if this variable is already set. This finally made it
working for me. The patch can look somehow like this:

diff -up openssh-7.4p1/misc.c.daemon openssh-7.4p1/misc.c
--- openssh-7.4p1/misc.c.daemon 2017-02-03 13:08:14.751282516 +0100
+++ openssh-7.4p1/misc.c        2017-02-03 13:08:14.778282474 +0100
@@ -1273,6 +1273,9 @@ daemonized(void)
                return 0;       /* parent is not init */
        if (getsid(0) != getpid())
                return 0;       /* not session leader */
+       if (getenv("_SSH_DAEMONIZED") == NULL)
+               return 0;       /* already reexeced */
+
        debug3("already daemonized");
        return 1;
 }
diff -up openssh-7.4p1/sshd.c.daemon openssh-7.4p1/sshd.c
--- openssh-7.4p1/sshd.c.daemon 2017-02-03 13:08:14.755282510 +0100
+++ openssh-7.4p1/sshd.c        2017-02-03 13:09:29.765164356 +0100
@@ -1866,6 +1866,7 @@ main(int ac, char **av)
                if (daemon(0, 0) < 0)
                        fatal("daemon() failed: %.200s",
strerror(errno));

+               setenv("_SSH_DAEMONIZED", "1", 1);
                disconnect_controlling_tty();
        }
        /* Reinitialize the log (because of the fork above). */

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Feb-09 04:17 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

daniel.black at au.ibm.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.black at au.ibm.com

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Mar-01 13:39 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

Jakub Jelen <jjelen at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #2893|0                           |1
        is obsolete|                            |

--- Comment #6 from Jakub Jelen <jjelen at redhat.com> ---
Created attachment 2950
  --> https://bugzilla.mindrot.org/attachment.cgi?id=2950&action=edit
fixed patch

Never mind. Nothing from above resolves the race condition between
systemd reading PID file and sshd after reexec writing it, except
adding SD_NOTIFY code so I gave up.

I updated the patch to move the SD_NOTIFY to the place where it will
not be called by children for every connection (as currently used in
Debian).

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Mar-18 09:26 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

--- Comment #7 from Darren Tucker <dtucker at zip.com.au> ---
(In reply to Jakub Jelen from comment #6)> Created attachment 2950 [details]
> fixed patch
> 
> Never mind. Nothing from above resolves the race condition between
> systemd reading PID file and sshd after reexec writing it, except
> adding SD_NOTIFY code so I gave up.
What about reading the pidfile first to see if it has the correct PID
before rewriting it?  I did something like that to work around a
problem in pam_loginuid (LinuxPAM ticket #23, I'd link to it but
fedorahosted.org seems to have imploded).

Does systemd even need a pidfile?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Mar-20 09:21 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

--- Comment #8 from Jakub Jelen <jjelen at redhat.com> ---
(In reply to Darren Tucker from comment #7)> (In reply to Jakub Jelen from comment #6)
> > Created attachment 2950 [details]
> > fixed patch
> > 
> > Never mind. Nothing from above resolves the race condition between
> > systemd reading PID file and sshd after reexec writing it, except
> > adding SD_NOTIFY code so I gave up.
> 
> What about reading the pidfile first to see if it has the correct
> PID before rewriting it?  I did something like that to work around a
> problem in pam_loginuid (LinuxPAM ticket #23, I'd link to it but
> fedorahosted.org seems to have imploded).
I don't think that would help the initial race condition, when systemd
tries to read the PID file before it is written after the daemon().
> Does systemd even need a pidfile?
>From man systemd.service:
> If this setting [Type=forking] is used, it is recommended to also use the
PIDFile= option, so that systemd can identify the main process of the daemon.
systemd will proceed with starting follow-up units as soon as the parent process
exits.
There is also GuessMainPID= option, but from the documentation in the
same manual page I am not convinced that it is something that we would
like to use in case we can specify the PID file reliably.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Mar-20 22:27 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

--- Comment #9 from daniel.black at au.ibm.com ---
In a non-forking type=notify case systemd keeps track of the sshd
service as it is a child process. Any termination of the child process
is a signal to the system launching process and all operations are
handled in this way. I wouldn't worry about the pid file at all.

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Apr-20 08:26 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

Mario Trangoni <mjtrangoni at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mjtrangoni at gmail.com

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Nov-27 16:29 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

Michal Koutn? <mkoutny at suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkoutny at suse.com

--- Comment #10 from Michal Koutn? <mkoutny at suse.com> ---
Created attachment 3099
  --> https://bugzilla.mindrot.org/attachment.cgi?id=3099&action=edit
notify systemd when daemon reloading

I observed a race similar to restarts during SSH daemon reloads.

Environment:
SLES12 SP2
openssh-7.2p2-74.1
systemd-228-150.15.2

Reproducer:
ssh $REMOTE_HOST systemctl reload sshd.service ; ssh $REMOTE_HOST

Actual behavior:
The second SSH connection fails in roughly 50% cases.

Expected behavior:
First SSH blocks until daemon reloads and second connection succeeds.

I propose to extend the suggested patch in comment 6 with another
sd_notify() call (attached) so that systemd can properly track
reloading state of the daemon.

The two patches fixed the behavior in my testcase (I admit I didn't
check actual systemd handling of the notification).

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Nov-27 22:55 UTC

head link

[Bug 2641] Add systemd notify code to to track running server

https://bugzilla.mindrot.org/show_bug.cgi?id=2641

Damien Miller <djm at mindrot.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |djm at mindrot.org

--- Comment #11 from Damien Miller <djm at mindrot.org> ---
I don't understand the problem here.

Can you not arrange for systemd to run sshd with the -D flag to begin
with? When sshd restarts for SIGHUP it will reexec itself with the same
commandline argument and will therefore not daemonise.

wrt reporting errors when sshd is run with the -D flag, I don't
understand the problem there either: sshd reports config parsing errors
to stderr, and this isn't affected by the presence of the -D flag at
all.

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.

bugzilla-daemon at bugzilla.mindrot.org

2017-Nov-28 14:17 UTC

head link