bugzilla-daemon at bugzilla.mindrot.org
2016-Nov-21 14:26 UTC
[Bug 2641] New: Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Bug ID: 2641 Summary: Add systemd notify code to to track running server Product: Portable OpenSSH Version: 7.3p1 Hardware: Other OS: Linux Status: NEW Severity: enhancement Priority: P5 Component: sshd Assignee: unassigned-bugs at mindrot.org Reporter: jjelen at redhat.com CC: cjwatson at debian.org Created attachment 2893 --> https://bugzilla.mindrot.org/attachment.cgi?id=2893&action=edit add systemd bits Ever since systemd, there are problems with keeping track of the running daemon process [1] and errors [2]. We were trying to avoid this solution, but after several bugs and many failed attempts from various people, it looks like there is no other working way than to use their API to notify them that the daemon is running. Also Debian moved to this solution [3]. Unfortunately Colin didn't manage to send the patch upstream, even though it would be useful for others (adding to CC, sorry if it was intentional) so I am trying to do so. The code is very simple, guarded by the ifdefs and enabled only if --with-systemd switch is added to configure. If it will not get to OpenBSD, I thing carrying it in portable would certainly make a sense. I also added a sample systemd service file to contrib directory to reflect its usage (not particularly needed). [1] https://bugzilla.redhat.com/show_bug.cgi?id=1381997 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1291172 [3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778913 -- You are receiving this mail because: You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Nov-28 23:09 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Darren Tucker <dtucker at zip.com.au> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |dtucker at zip.com.au --- Comment #1 from Darren Tucker <dtucker at zip.com.au> --- Created attachment 2896 --> https://bugzilla.mindrot.org/attachment.cgi?id=2896&action=edit do not call daemon again when restarting on sighup I'm not wild about letting that particular camel's nose into the tent. How about we fix the sighup behaviour to not call daemon, keep the same pid and not touch the pidfile? -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-31 12:10 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #2 from Jakub Jelen <jjelen at redhat.com> --- Sorry for a quite late answer. I got bitten by this while trying to rebase to the OpenSSH 7.4. Your patch got into this release in 7fc4766a, but was partially reverted in f2398eb7. But to my understanding, the daemonized() function returns 1 even for the first call running from systemd service file and therefore prevents call to daemon() and therefore systemd does not keep track of the service again. sshd[12655]: debug3: already daemonized systemd[1]: Starting OpenSSH server daemon... Reverting the chunk with the condition if (!(debug_flag || inetd_flag || no_daemon_flag || already_daemon)) { makes it working again. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2016-Dec-31 22:39 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #3 from Colin Watson <cjwatson at debian.org> --- Jakub: That's only a problem because for some reason Fedora doesn't use the -D option when running from systemd (if you did, then no_daemon_flag would be true so the value of already_daemon would be irrelevant). When you have a smart service supervisor, IMO -D makes much more sense. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Jan-02 07:43 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #4 from Jakub Jelen <jjelen at redhat.com> --- Colin, the -D was omitted (and Type=Forking added) because systemd was unable to report failures (when configuration file had an error), as described in the Red Hat bugzilla [1]. Certainly, the -D makes more sense these days, but with that, typing systemctl restart sshd is not producing any error with -D option. Also no error goes into the journal (as in the description of the linked bug). The one solution is "letting that particular camel's nose into the tent" as used in Ubuntu/Debian, but I still hope, there is a better way to do that. Clearly, what I am trying to point out here is that the applied change is modifying the behavior not only for re-exec (reload), but also for the "start", which was not intentional to my understanding. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1291172 -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Feb-03 15:16 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #5 from Jakub Jelen <jjelen at redhat.com> --- The test if the process is daemon passes for every process that is run as a systemd service so it does not solve the problem for us. As the result with the pushed change the 7.4p1 will not write PID file when running as a systemd service. I would consider a bit different approach. We can remember if we already called daemon() using environment variable and prevent the repeated calls if this variable is already set. This finally made it working for me. The patch can look somehow like this: diff -up openssh-7.4p1/misc.c.daemon openssh-7.4p1/misc.c --- openssh-7.4p1/misc.c.daemon 2017-02-03 13:08:14.751282516 +0100 +++ openssh-7.4p1/misc.c 2017-02-03 13:08:14.778282474 +0100 @@ -1273,6 +1273,9 @@ daemonized(void) return 0; /* parent is not init */ if (getsid(0) != getpid()) return 0; /* not session leader */ + if (getenv("_SSH_DAEMONIZED") == NULL) + return 0; /* already reexeced */ + debug3("already daemonized"); return 1; } diff -up openssh-7.4p1/sshd.c.daemon openssh-7.4p1/sshd.c --- openssh-7.4p1/sshd.c.daemon 2017-02-03 13:08:14.755282510 +0100 +++ openssh-7.4p1/sshd.c 2017-02-03 13:09:29.765164356 +0100 @@ -1866,6 +1866,7 @@ main(int ac, char **av) if (daemon(0, 0) < 0) fatal("daemon() failed: %.200s", strerror(errno)); + setenv("_SSH_DAEMONIZED", "1", 1); disconnect_controlling_tty(); } /* Reinitialize the log (because of the fork above). */ -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Feb-09 04:17 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 daniel.black at au.ibm.com changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |daniel.black at au.ibm.com -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Mar-01 13:39 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Jakub Jelen <jjelen at redhat.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #2893|0 |1 is obsolete| | --- Comment #6 from Jakub Jelen <jjelen at redhat.com> --- Created attachment 2950 --> https://bugzilla.mindrot.org/attachment.cgi?id=2950&action=edit fixed patch Never mind. Nothing from above resolves the race condition between systemd reading PID file and sshd after reexec writing it, except adding SD_NOTIFY code so I gave up. I updated the patch to move the SD_NOTIFY to the place where it will not be called by children for every connection (as currently used in Debian). -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Mar-18 09:26 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #7 from Darren Tucker <dtucker at zip.com.au> --- (In reply to Jakub Jelen from comment #6)> Created attachment 2950 [details] > fixed patch > > Never mind. Nothing from above resolves the race condition between > systemd reading PID file and sshd after reexec writing it, except > adding SD_NOTIFY code so I gave up.What about reading the pidfile first to see if it has the correct PID before rewriting it? I did something like that to work around a problem in pam_loginuid (LinuxPAM ticket #23, I'd link to it but fedorahosted.org seems to have imploded). Does systemd even need a pidfile? -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Mar-20 09:21 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #8 from Jakub Jelen <jjelen at redhat.com> --- (In reply to Darren Tucker from comment #7)> (In reply to Jakub Jelen from comment #6) > > Created attachment 2950 [details] > > fixed patch > > > > Never mind. Nothing from above resolves the race condition between > > systemd reading PID file and sshd after reexec writing it, except > > adding SD_NOTIFY code so I gave up. > > What about reading the pidfile first to see if it has the correct > PID before rewriting it? I did something like that to work around a > problem in pam_loginuid (LinuxPAM ticket #23, I'd link to it but > fedorahosted.org seems to have imploded).I don't think that would help the initial race condition, when systemd tries to read the PID file before it is written after the daemon().> Does systemd even need a pidfile?>From man systemd.service:> If this setting [Type=forking] is used, it is recommended to also use the PIDFile= option, so that systemd can identify the main process of the daemon. systemd will proceed with starting follow-up units as soon as the parent process exits.There is also GuessMainPID= option, but from the documentation in the same manual page I am not convinced that it is something that we would like to use in case we can specify the PID file reliably. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Mar-20 22:27 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #9 from daniel.black at au.ibm.com --- In a non-forking type=notify case systemd keeps track of the sshd service as it is a child process. Any termination of the child process is a signal to the system launching process and all operations are handled in this way. I wouldn't worry about the pid file at all. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Apr-20 08:26 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Mario Trangoni <mjtrangoni at gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |mjtrangoni at gmail.com -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Nov-27 16:29 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Michal Koutn? <mkoutny at suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |mkoutny at suse.com --- Comment #10 from Michal Koutn? <mkoutny at suse.com> --- Created attachment 3099 --> https://bugzilla.mindrot.org/attachment.cgi?id=3099&action=edit notify systemd when daemon reloading I observed a race similar to restarts during SSH daemon reloads. Environment: SLES12 SP2 openssh-7.2p2-74.1 systemd-228-150.15.2 Reproducer: ssh $REMOTE_HOST systemctl reload sshd.service ; ssh $REMOTE_HOST Actual behavior: The second SSH connection fails in roughly 50% cases. Expected behavior: First SSH blocks until daemon reloads and second connection succeeds. I propose to extend the suggested patch in comment 6 with another sd_notify() call (attached) so that systemd can properly track reloading state of the daemon. The two patches fixed the behavior in my testcase (I admit I didn't check actual systemd handling of the notification). -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Nov-27 22:55 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |djm at mindrot.org --- Comment #11 from Damien Miller <djm at mindrot.org> --- I don't understand the problem here. Can you not arrange for systemd to run sshd with the -D flag to begin with? When sshd restarts for SIGHUP it will reexec itself with the same commandline argument and will therefore not daemonise. wrt reporting errors when sshd is run with the -D flag, I don't understand the problem there either: sshd reports config parsing errors to stderr, and this isn't affected by the presence of the -D flag at all. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at bugzilla.mindrot.org
2017-Nov-28 14:17 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Petr Cerny [:hrosik] <pcerny at suse.cz> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |pcerny at suse.cz --- Comment #12 from Petr Cerny [:hrosik] <pcerny at suse.cz> --- (In reply to Damien Miller from comment #11)> I don't understand the problem here. > > Can you not arrange for systemd to run sshd with the -D flag to > begin with? When sshd restarts for SIGHUP it will reexec itself with > the same commandline argument and will therefore not daemonise.That is an option, yet it has other drawbacks and one needs to carefully step among all the combinations that have some kind of race condition or similar problems (those were actually the reason for the Debian/RH/SUSE issues referenced above). That said, putting in an ifdeffed init system integration doesn't need to be a bad idea (I intentionally avoided writing 'systemd', as I personally consider this to be a broader issue, which would be present with any other init system that would want to keep track of the state the services are in.) Would it help if the state reporting was init-system agnostic - say split into a separate file(s) the way auditing is? (From my point of view that would be the right thing.) -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2023-Aug-28 19:06 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 github at kalvdans.no-ip.org changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |github at kalvdans.no-ip.org -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-29 22:22 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #2896|0 |1 is obsolete| | Attachment #2950|0 |1 is obsolete| | Attachment #3099|0 |1 is obsolete| | --- Comment #13 from Damien Miller <djm at mindrot.org> --- Created attachment 3798 --> https://bugzilla.mindrot.org/attachment.cgi?id=3798&action=edit standalone systemd notifications This implements the equivalent of sd_notify() without bringing in the rest of systemd bloat. It it also signal-handler safe, which is not the case for the originally proposed diffs. Lightly tested. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-29 23:31 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Luca Boccassi <luca.boccassi at gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |luca.boccassi at gmail.com --- Comment #14 from Luca Boccassi <luca.boccassi at gmail.com> --- Thanks for working on that, will be great to have native support for the readiness protocol. One review comment: unless I'm missing it because it's handled outside of the patch context, after a RELOADING=1, when the reload operation is complete, a READY=1 needs to be sent too: https://www.freedesktop.org/software/systemd/man/latest/sd_notify.html#RELOADING=1 While there, it would be really nice if the RELOADING=1 message also included MONOTONIC_USEC=<timestamp> (CLOCK_MONOTONIC in usec as a decimal string), which is used for accurate synchronization. IE, write a string like "RELOADING=1\nMONOTONIC_USEC=1234...". This will enable the unit to be of Type=notify-reload which adds some nice features. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-29 23:43 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Sam James <sam at gentoo.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |sam at gentoo.org -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-29 23:47 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #15 from Damien Miller <djm at mindrot.org> --- I think the READY=1 will be sent implicitly after sshd restarts -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-30 01:43 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #16 from Damien Miller <djm at mindrot.org> --- (In reply to Luca Boccassi from comment #14)> While there, it would be really nice if the RELOADING=1 message also > included MONOTONIC_USEC=<timestamp> (CLOCK_MONOTONIC in usec as a > decimal string), which is used for accurate synchronization. IE, > write a string like "RELOADING=1\nMONOTONIC_USEC=1234...". This will > enable the unit to be of Type=notify-reload which adds some nice > features.That's more tricky as the reload is called from signal handler context and we can't use snprint() there to format the usec part of the message. We'd have to refactor how sshd manages SIGHUP restarts. That would make some other things easier, but it's still a bigger change. Anyway, if some of the distro people on this bug can report on whether the patch is okay, then we can move forward with this and finesse it later. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-30 12:05 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Arkadiusz Mi?kiewicz <arekm at maven.pl> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |arekm at maven.pl -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-30 16:10 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Richard W.M. Jones <rjones at redhat.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |rjones at redhat.com -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-31 02:47 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #17 from Colin Watson <cjwatson at debian.org> --- I don't see any problems from eyeballing the patch. I've pushed a version of the Debian packaging with this (and consequent modifications; we also have a socket activation patch from Ubuntu, but reworking that to avoid libsystemd wasn't too hard) to https://salsa.debian.org/ssh-team/openssh/-/tree/without-libsystemd, though so far I've only checked that it passes the regression tests. https://salsa.debian.org/ssh-team/openssh/-/jobs/5521815 has .debs for people who feel comfortable installing things from random CI jobs. Obviously I don't recommend installing those on production, but it's probably OK to do so in a container/VM. I'll look more once I've had some sleep. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-31 10:53 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #18 from Colin Watson <cjwatson at debian.org> --- I've done some testing and this does seem to basically work. The one thing I'd point out is following on from Luca's comment: RELOADING=1 is ignored if you don't also send MONOTONIC_USEC=. So if you're not going to send that (and I understand the reasons), you might as well not bother sending RELOADING=1 either; we'll just have to stick with Type=notify rather than Type=notify-reload for now, which wouldn't be a regression. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-31 10:57 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #19 from Luca Boccassi <luca.boccassi at gmail.com> --- (In reply to Colin Watson from comment #18)> I've done some testing and this does seem to basically work. > > The one thing I'd point out is following on from Luca's comment: > RELOADING=1 is ignored if you don't also send MONOTONIC_USEC=. So > if you're not going to send that (and I understand the reasons), you > might as well not bother sending RELOADING=1 either; we'll just have > to stick with Type=notify rather than Type=notify-reload for now, > which wouldn't be a regression.Mmmh hang on I don't think that should be the case. The MONOTONIC_USEC is for the Type=notify-reload workflow, that automatically hooks sighup to the service, and is newer. But RELOADING=1 -> READY=1 by itself should work with the older workflow where you manually specify an ExecReload=kill -HUP $MAINPID in the unit. Let me get your packages and test this. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-31 11:19 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #20 from Colin Watson <cjwatson at debian.org> --- Actually, I noticed a slight race here. You're sending the readiness notification from platform_pre_listen; but, as the name implies, this is called _before_ the server has started listening. The point of the readiness protocol is that the notification is only sent once the server is ready to accept connections. The notification should be moved to after the listen sockets are bound. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-31 11:20 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #21 from Colin Watson <cjwatson at debian.org> --- (In reply to Luca Boccassi from comment #19)> Mmmh hang on I don't think that should be the case. The > MONOTONIC_USEC is for the Type=notify-reload workflow, that > automatically hooks sighup to the service, and is newer. But > RELOADING=1 -> READY=1 by itself should work with the older workflow > where you manually specify an ExecReload=kill -HUP $MAINPID in the > unit.Ah, you may be right. I was just going by looking at the code and hadn't actually tested removing RELOADING=1. Probably best to leave it in then. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-31 13:36 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #22 from Luca Boccassi <luca.boccassi at gmail.com> --- (In reply to Colin Watson from comment #21)> (In reply to Luca Boccassi from comment #19) > > Mmmh hang on I don't think that should be the case. The > > MONOTONIC_USEC is for the Type=notify-reload workflow, that > > automatically hooks sighup to the service, and is newer. But > > RELOADING=1 -> READY=1 by itself should work with the older workflow > > where you manually specify an ExecReload=kill -HUP $MAINPID in the > > unit. > > Ah, you may be right. I was just going by looking at the code and > hadn't actually tested removing RELOADING=1. Probably best to leave > it in then.I have tested the packages you published and the reloading notification is working: Mar 31 14:34:28 localhost systemd[1]: ssh.service: Trying to enqueue job ssh.service/reload/replace Mar 31 14:34:28 localhost systemd[1]: ssh.service: Installed new job ssh.service/reload as 1333 Mar 31 14:34:28 localhost systemd[1]: ssh.service: Enqueued job ssh.service/reload as 1333 Mar 31 14:34:28 localhost systemd[1]: ssh.service: Will spawn child (service_enter_reload): /usr/sbin/sshd Mar 31 14:34:28 localhost systemd[1]: ssh.service: About to execute: /usr/sbin/sshd -t Mar 31 14:34:28 localhost (sshd)[3824]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy Mar 31 14:34:28 localhost systemd[1]: ssh.service: Forked /usr/sbin/sshd as 3824 (without CLONE_INTO_CGROUP) Mar 31 14:34:28 localhost systemd[1]: ssh.service: Changed running -> reload Mar 31 14:34:28 localhost systemd[1]: Reloading ssh.service - OpenBSD Secure Shell server... Mar 31 14:34:28 localhost (sshd)[3824]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy Mar 31 14:34:28 localhost systemd[1]: ssh.service: Child 3824 belongs to ssh.service. Mar 31 14:34:28 localhost systemd[1]: ssh.service: Control process exited, code=exited, status=0/SUCCESS (success) Mar 31 14:34:28 localhost systemd[1]: ssh.service: Running next control command for state reload. Mar 31 14:34:28 localhost systemd[1]: ssh.service: Will spawn child (service_run_next_control): /bin/kill Mar 31 14:34:28 localhost systemd[1]: ssh.service: About to execute: /bin/kill -HUP "\$MAINPID" Mar 31 14:34:28 localhost (kill)[3826]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy Mar 31 14:34:28 localhost systemd[1]: ssh.service: Forked /bin/kill as 3826 (without CLONE_INTO_CGROUP) Mar 31 14:34:28 localhost (kill)[3826]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy Mar 31 14:34:28 localhost sshd[3812]: Received SIGHUP; restarting. Mar 31 14:34:28 localhost systemd[1]: ssh.service: Got notification message from PID 3812 (RELOADING=1) Mar 31 14:34:28 localhost systemd[1]: ssh.service: Child 3826 belongs to ssh.service. Mar 31 14:34:28 localhost systemd[1]: ssh.service: Control process exited, code=exited, status=0/SUCCESS (success) Mar 31 14:34:28 localhost systemd[1]: ssh.service: Got final SIGCHLD for state reload. Mar 31 14:34:28 localhost systemd[1]: ssh.service: Changed reload -> reload-notify Mar 31 14:34:28 localhost systemd[1]: ssh.service: Got notification message from PID 3812 (READY=1) Mar 31 14:34:28 localhost systemd[1]: ssh.service: Changed reload-notify -> running Mar 31 14:34:28 localhost systemd[1]: ssh.service: Job 1333 ssh.service/reload finished, result=done Mar 31 14:34:28 localhost systemd[1]: Reloaded ssh.service - OpenBSD Secure Shell server. Mar 31 14:34:28 localhost sshd[3812]: Server listening on 0.0.0.0 port 22. Mar 31 14:34:28 localhost sshd[3812]: Server listening on :: port 22. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-31 13:38 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #23 from Luca Boccassi <luca.boccassi at gmail.com> --- (In reply to Colin Watson from comment #20)> Actually, I noticed a slight race here. You're sending the > readiness notification from platform_pre_listen; but, as the name > implies, this is called _before_ the server has started listening. > The point of the readiness protocol is that the notification is only > sent once the server is ready to accept connections. > > The notification should be moved to after the listen sockets are > bound.Yes, good catch, this should be fixed as it's important to avoid races that the notification is delivered after everything is up and running and ready to process requests. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-31 19:16 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #24 from Luca Boccassi <luca.boccassi at gmail.com> --- Created attachment 3801 --> https://bugzilla.mindrot.org/attachment.cgi?id=3801&action=edit standalone notify patch The attached patch fixes the issue by creating a platform_post_listen() hook, as suggested by Colin. Tested in a Debian testing VM, seems to do the right thing, including on reloading. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Mar-31 20:00 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Luca Boccassi <luca.boccassi at gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #3801|0 |1 is obsolete| | --- Comment #25 from Luca Boccassi <luca.boccassi at gmail.com> --- Created attachment 3802 --> https://bugzilla.mindrot.org/attachment.cgi?id=3802&action=edit standalone notify patch Thinking about it, given there's no external dependency and the runtime behaviour is a no-op unless the NOTIFY_SOCKET env var is set (which is only set by systemd or systemd-compatible managers), I don't think the new autoconf option is needed? There's no downside to always including the implementation when building on Linux, like it's done with the OOM adjustments. New revision of the patch attached does just that. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-01 13:55 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #26 from Colin Watson <cjwatson at debian.org> --- Either version of Luca's patch looks fine to me. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-01 23:21 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Luca Boccassi <luca.boccassi at gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #3802|0 |1 is obsolete| | --- Comment #27 from Luca Boccassi <luca.boccassi at gmail.com> --- Created attachment 3804 --> https://bugzilla.mindrot.org/attachment.cgi?id=3804&action=edit standalone notify and timestamp patch> That's more tricky as the reload is called from signal handler context and we can't use snprint() there to format the usec part of the message. We'd have to refactor how sshd manages SIGHUP restarts. > > That would make some other things easier, but it's still a bigger change.I went back and had a look at this, and unless I am missing something the reloading message is not being sent from the signal handler? The handler is sighup_handler which just sets a boolean and returns, following the usual pattern: https://anongit.mindrot.org/openssh.git/tree/sshd.c#n298 but the notification message is sent from the platform_pre_restart() hook, which is called from the main context from the main loop via sighup_restart(): https://anongit.mindrot.org/openssh.git/tree/sshd.c#n304 This already does some logging, which uses format strings. Also platform_pre_restart() already calls oom_adjust_restore() which also uses format strings. So I went ahead and did the necessary modifications in the latest version, which also simplified the message handling as it can log unconditionally now, and added the timestamp too. I've tested this and seems to work just fine on Debian testing, I can change ssh.service to Type=notify-reload and reloading works just fine, including the state transitions. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-02 01:28 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #3798|0 |1 is obsolete| | Attachment #3804|0 |1 is obsolete| | --- Comment #28 from Damien Miller <djm at mindrot.org> --- Created attachment 3805 --> https://bugzilla.mindrot.org/attachment.cgi?id=3805&action=edit simplified further Good catch about the sighup restart no longer running in a signal handler. We can simplify further if we make ssh_systemd_notify() accept a format string. We also have code to get the CLOCK_MONOTONIC timer that we can reuse. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-02 01:41 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #29 from Luca Boccassi <luca.boccassi at gmail.com> --- (In reply to Damien Miller from comment #28)> Created attachment 3805 [details] > simplified further > > Good catch about the sighup restart no longer running in a signal > handler. > > We can simplify further if we make ssh_systemd_notify() accept a > format string. We also have code to get the CLOCK_MONOTONIC timer > that we can reuse.Looks good to me, tested on Debian testing as before, works as expected. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-02 01:58 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #3805| |ok?(dtucker at dtucker.net) Flags| | -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-02 09:50 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #30 from Michal Koutn? <mkoutny at suse.com> --- (In reply to Damien Miller from comment #28)> Good catch about the sighup restart no longer running in a signal > handler.(In reply to Damien Miller from comment #13)> ... > It it also signal-handler safe, which is not the case for the originally proposed diffs.The original diff (comment 10) already put the notification in sighup_restart() not in sighup_handler(), i.e. still the same place where platform_pre_restart() is called now, not a signal handler context AFAICS. platform_* hooks look like the appropriate places for these calls. -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-02 23:48 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #31 from Luca Boccassi <luca.boccassi at gmail.com> --- Created attachment 3809 --> https://bugzilla.mindrot.org/attachment.cgi?id=3809&action=edit standalone notify and timestamp patch One more change, to support abstract namespace sockets (for containers) as per protocol defined at https://www.freedesktop.org/software/systemd/man/devel/sd_notify.html#Notes -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-03 03:38 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #3805|ok?(dtucker at dtucker.net) | Flags| | Attachment #3805|0 |1 is obsolete| | -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-03 03:38 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #32 from Damien Miller <djm at mindrot.org> --- Comment on attachment 3809 --> https://bugzilla.mindrot.org/attachment.cgi?id=3809 standalone notify and timestamp patch This looks fine to me. I'll commit it. Thanks for you help! -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-03 03:43 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 Damien Miller <djm at mindrot.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #33 from Damien Miller <djm at mindrot.org> --- Committed as 08f579231cd38 and will be in OpenSSH-9.8, due around June/July. -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug.
bugzilla-daemon at mindrot.org
2024-Apr-07 16:19 UTC
[Bug 2641] Add systemd notify code to to track running server
https://bugzilla.mindrot.org/show_bug.cgi?id=2641 --- Comment #34 from Luca Boccassi <luca.boccassi at gmail.com> --- (In reply to Damien Miller from comment #33)> Committed as 08f579231cd38 and will be in OpenSSH-9.8, due around > June/July.Thank you! -- You are receiving this mail because: You are watching the assignee of the bug. You are watching someone on the CC list of the bug.
Reasonably Related Threads
- openssh 7.6 and 7.7 on Oracle Linux 7 (compiled from source) doesn't start correctly with systemd
- D-bus integration
- openssh 7.6 and 7.7 on Oracle Linux 7 (compiled from source) doesn't start correctly with systemd
- [Bug 3558] New: Spelling "yes" as "Yes" in sshd_config has a fatal result
- openssh 7.6 and 7.7 on Oracle Linux 7 (compiled from source) doesn't start correctly with systemd