Anand Buddhdev
2020-Jan-07 06:11 UTC
[nsd-users] Repeated crashes of NSD, without a clear explanation
On 06/01/2020 22:39, Stephane Bortzmeyer via nsd-users wrote:

Hi Stephane,

> For now one week, one machine has NSD crashing after a few hours of
> running, corrupting nsd.db.
>
> The log (verbosity 4) says:
>
> Jan 06 20:31:30 ada nsd[1974]: process 1975 exited with status 9
> Jan 06 20:31:30 ada nsd[1974]: [2020-01-06 19:31:30.892] nsd[1974]: error: process 1975 exited with status 9
> Jan 06 20:31:30 ada nsd[1974]: rmdir /tmp/nsd-xfr-1974 failed: Directory not empty

This suggests that an incoming XFR is triggering a bug. Have you saved
the contents of the nsd-xfr-1974 directory? If not, perhaps you can save
it the next time it happens. This may help the developers in figuring
out what causes the crash. Also, is there any log above this, to
indicate which zone it might be?

Note that there are several newer versions of NSD since 4.1.26, so this
bug may also have been fixed in a newer version. If you can upgrade, you
may want to do that.

Finally, the database mode is no longer recommended. Could you try
running your instance of NSD with:

database: ""

Regards,
Anand Buddhdev
Stephane Bortzmeyer
2020-Jan-07 16:04 UTC
[nsd-users] Repeated crashes of NSD, without a clear explanation
On Tue, Jan 07, 2020 at 09:11:56AM +0300,
Anand Buddhdev via nsd-users <nsd-users at lists.nlnetlabs.nl> wrote
a message of 36 lines which said:

> This suggests that an incoming XFR is triggering a bug. Have you saved
> the contents of the nsd-xfr-1974 directory? If not, perhaps you can save
> it the next time it happens. This may help the developers in figuring
> out what causes the crash.

Apparently, it was a lack of memory:

[8374219.385014] Out of memory: Kill process 10677 (nsd) score 66 or sacrifice child
[8374219.385758] Killed process 10678 (nsd) total-vm:37552kB, anon-rss:676kB, file-rss:0kB, shmem-rss:27344kB
[8374219.386779] oom_reaper: reaped process 10678 (nsd), now anon-rss:0kB, file-rss:0kB, shmem-rss:27344kB

> Finally, the database mode is no longer recommended. Could you try
> running your instance of NSD with:
>
> database: ""

Currently under test and no problem yet (anyway, I'll add RAM).
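
For anyone hitting similar unexplained exits, the kernel log is worth checking before suspecting a bug in the transfer code. Below is a minimal sketch of pulling OOM kills out of kernel log text. It assumes the "Killed process" line format shown in the message above (the format can differ between kernel versions), and the names `find_oom_kills` and `SAMPLE_LOG` are made up for illustration; they are not part of NSD or of any kernel tooling.

```python
import re

# Matches the kernel's "Killed process" line in the format quoted in this
# thread. Assumption: other kernel versions may word this line differently.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\),?\s+"
    r"total-vm:(?P<total_vm_kb>\d+)kB"
)


def find_oom_kills(log_text):
    """Return a list of (pid, process_name, total_vm_kb) per OOM kill found."""
    return [
        (int(m.group("pid")), m.group("name"), int(m.group("total_vm_kb")))
        for m in KILLED_RE.finditer(log_text)
    ]


# Sample input copied from the kernel log lines quoted in this thread.
SAMPLE_LOG = (
    "[8374219.385014] Out of memory: Kill process 10677 (nsd) score 66 "
    "or sacrifice child\n"
    "[8374219.385758] Killed process 10678 (nsd) total-vm:37552kB, "
    "anon-rss:676kB, file-rss:0kB, shmem-rss:27344kB\n"
    "[8374219.386779] oom_reaper: reaped process 10678 (nsd), "
    "now anon-rss:0kB, file-rss:0kB, shmem-rss:27344kB\n"
)

if __name__ == "__main__":
    for pid, name, total_vm_kb in find_oom_kills(SAMPLE_LOG):
        print(f"OOM kill: pid={pid} name={name} total-vm={total_vm_kb}kB")
```

In practice the text would come from the output of `dmesg` or the system journal rather than from a hard-coded sample. Only the "Killed process" line is matched, so the preceding "Kill process ... or sacrifice child" line and the `oom_reaper` line are ignored.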