Hi all,

I'm running NSD 4.9.1 on OpenBSD 7.6. I recently upgraded from OpenBSD 7.5, which I believe had NSD 4.8.0 in base, and did not see this behavior prior.

When I try to reload a zone using nsd-control, I am seeing an error message in my logfile: "error: reload: old-main quit during quit sync"

This error does not appear to happen every time I run reload, but it does get recorded to the logfile "more often than not" after a reload. When it does happen, it appears that the zone still successfully reloads. The error can show up when I reload any of my zones, and it does not appear to matter if I modified the zone. I cranked up verbosity to "3" but did not get much more detail. Below example is for my testing zone:

'''
$ nsd-control reload testing.internal
ok
$ echo $?
0
'''

and the relevant entries in nsd.log:

'''
[2025-01-04 14:55:30.652] nsd[69351]: info: new control connection from 127.0.0.1
[2025-01-04 14:55:30.711] nsd[69351]: info: remote control connection authenticated
[2025-01-04 14:55:30.711] nsd[69351]: info: control cmd: reload testing.internal
[2025-01-04 14:55:30.712] nsd[69351]: info: remote control operation completed
[2025-01-04 14:55:30.713] nsd[40839]: info: zonefile testing.internal.forward is not modified
[2025-01-04 14:55:30.723] nsd[40839]: error: reload: old-main quit during quit sync
'''

relevant snippets from nsd.conf:

'''
server:
        server-count: 1
        verbosity: 3
        zonesdir: "/var/nsd/etc/zones/"
        logfile: "/var/log/nsd.log"
        zonefiles-write: 60
zone:
        name: "testing.internal"
        zonefile: "testing.internal.forward"
'''

Anyone else seeing similar behavior or can tell me if I'm doing something wrong? Happy to compile latest or follow other suggested troubleshooting steps.

Thanks,
Otto

On 04-01-2025 at 17:10, Otto Retter via nsd-users wrote:
> Hi all,
>
> I'm running NSD 4.9.1 on OpenBSD 7.6. I recently upgraded from OpenBSD
> 7.5, which I believe had NSD 4.8.0 in base, and did not see this
> behavior prior.

Thanks Otto,

Indeed, NSD 4.8.0 did not log this condition as an error message and just proceeded if old-main quit. With 4.9.0, reloading was refactored to reap exited old serve children in order to reduce the number of "defunct" or "zombie" processes that can emerge (for example when one old serve child is still busy serving an AXFR).

When old-main is done with its job during reload (killing the old serve children), it informs the reload process and then immediately exits. On some platforms, the detection of the closed pipe (because old-main has exited) could very well come before the information on that pipe that old-main is done.

So I consider this "reporting of an exited old-main" at this point in the code a bug, and changed it into a debugging warning level message here: github.com/NLnetLabs/nsd/pull/421

Thanks!
-- Willem

PS. For completeness, a strip of a successful NSD reload is below. Your issue would occur if old-main(5) would exit before the load(2) process received the NSD_RELOAD in picture 6.

[Attachment: NSD successful reload.svg (image/svg+xml, 398139 bytes)
URL: <lists.nlnetlabs.nl/pipermail/nsd-users/attachments/20250107/a9035f64/attachment-0001.svg>]
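
The ordering problem described in the reply (a "done" message written to a pipe, immediately followed by the writer exiting) can be illustrated with a small sketch. This is not NSD's code (NSD is written in C, and the fix in the linked pull request was to lower the log level). The marker byte `DONE` and the function names are hypothetical, chosen only for illustration. The sketch shows one general way a reader can avoid misclassifying such an exit: drain whatever is still buffered in the pipe before deciding what the end-of-file means.

```python
import os

# Hypothetical one-byte "I am finished" marker; not the value NSD uses.
DONE = b"D"


def peer_exit_status(read_fd):
    """Read the pipe until EOF, then decide how the peer exited.

    Data written before the writer closed its end stays readable,
    so draining first means a final message is never missed.
    """
    saw_done = False
    while True:
        chunk = os.read(read_fd, 64)
        if not chunk:  # EOF: the writer closed its end (it exited)
            break
        if DONE in chunk:
            saw_done = True
    return "clean exit" if saw_done else "unexpected exit"


def run_case(send_done):
    """Simulate a peer that optionally reports DONE and then exits."""
    read_fd, write_fd = os.pipe()
    if send_done:
        os.write(write_fd, DONE)
    os.close(write_fd)  # the peer "exits" right after reporting
    try:
        return peer_exit_status(read_fd)
    finally:
        os.close(read_fd)


if __name__ == "__main__":
    print(run_case(True))   # clean exit
    print(run_case(False))  # unexpected exit
```

A reader that reacts to the closed pipe first, without draining, would treat both cases as an unexpected exit, which matches the harmless-but-alarming log line discussed in this thread.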