Hi Franky,
You may want to drop "--enable-checking": it enables debug
information and negatively impacts performance. recvmmsg, on the other
hand, is something you do want enabled (so replace --disable-recvmmsg
with --enable-recvmmsg), because it receives multiple UDP messages
with one syscall and thus improves performance.
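
As a rough sketch, this is how the compile line quoted below could look with those two changes applied. All other flags and paths are copied unchanged from the quoted mail (including --disable-largefile, which is left as-is here). It is worth confirming the exact spelling of the recvmmsg switch with ./configure --help for your NSD version. The snippet only prints the command so it can be reviewed first.

```shell
# Flags copied from the quoted compile line, with two changes:
#   - --enable-checking removed
#   - --disable-recvmmsg replaced by --enable-recvmmsg
set -- \
    --prefix=/usr \
    --with-configdir=/etc/nsd \
    --with-nsd_conf_file=/etc/nsd/nsd.conf \
    --with-pidfile=/run/nsd/nsd.pid \
    --with-dbfile=/var/lib/nsd/nsd.db \
    --with-zonesdir=/etc/nsd \
    --with-xfrdfile=/var/lib/nsd/xfrd.state \
    --disable-largefile \
    --enable-recvmmsg \
    --enable-root-server \
    --enable-mmap \
    --enable-ratelimit \
    --enable-dnstap \
    --enable-systemd

# Print the resulting command for review; run it from the NSD source
# tree once it looks right.
echo ./configure "$@"
```
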
It may also help to set the reload timeout (xfrd-reload-timeout) a bit
higher. With the information provided, it's hard to tell what can be
changed to keep the server from becoming unresponsive. Could you share
the configuration? You may also want to have a look at the tuning
section of the manual
(https://nsd.docs.nlnetlabs.nl/en/latest/running/tuning.html). I
wouldn't bother with Processor Affinity just yet; the first section may
already do wonders for your setup.
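
For illustration, a higher reload timeout could be tried along these lines. The option name xfrd-reload-timeout is the one documented in nsd.conf(5). The value 10 and the scratch file name nsd-tuning.conf are assumptions picked only for this example, not recommendations. The snippet just writes the fragment to a scratch file for review; the setting itself would go into the existing server: section of /etc/nsd/nsd.conf.

```shell
# Write an example fragment to a scratch file (hypothetical file name).
cat > nsd-tuning.conf <<'EOF'
server:
    # Wait at least this many seconds between reloads triggered by zone
    # transfers (currently 1 on this server). 10 is an arbitrary example.
    xfrd-reload-timeout: 10
EOF

# Show the fragment so it can be copied into nsd.conf by hand.
cat nsd-tuning.conf
```
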
Best,
Jeroen
On Fri, 2022-09-16 at 10:34 +0200, Franky Van Liedekerke via nsd-users
wrote:
> Hi,
>
> I seem to have an issue with one nameserver (the one running nsd
> 4.6.0, but it also happened with the nsd package that came with
> Ubuntu itself):
>
> On a regular basis the server just hangs. No coredumps (the server is
> configured to coredump), nothing in the nsd logs, nothing in syslog
> except always the same final message, which happens to arrive on the
> central logserver just before the OS hangs:
> "TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending
> cookies."
>
> After that message, it's game over for that server: not even the
> console is responsive anymore. It's a vm, so we see the cpu spiking
> in the vm stats on the host so I'm assuming something is taking up
> all cpu causing a huge load, but I'm unable to pinpoint it since ...
> it hangs :-). Other dns servers (running bind) with the same kernel
> parameters for flooding (burst) don't show the message (so maybe
> just 1 server is being targeted, but it still shouldn't crash like
> that).
> Any hints on how to debug this? If someone thinks it might be related
> to nsd, this is the compile line:
> ./configure --prefix=/usr --with-configdir=/etc/nsd \
>   --with-nsd_conf_file=/etc/nsd/nsd.conf --with-pidfile=/run/nsd/nsd.pid \
>   --with-dbfile=/var/lib/nsd/nsd.db --with-zonesdir=/etc/nsd \
>   --with-xfrdfile=/var/lib/nsd/xfrd.state --disable-largefile \
>   --disable-recvmmsg --enable-root-server --enable-mmap --enable-ratelimit \
>   --enable-checking --enable-dnstap --enable-systemd
>
> (I see there's an option for tcp_fastopen, but it was not used by the
> person who compiled it, and I can't really explain the reason for
> --disable-largefile and --disable-recvmmsg, but those two shouldn't
> have any impact.)
> server-count is 2 (the server has 2 vCPUs); no memory issues seen.
> The server is serving (as secondary) more than 7000 zones (so many xfr
> requests, but for now we have left xfrd-reload-timeout at 1 second).
>
> With friendly regards,
> Franky