Kabindra Shrestha
2016-Apr-05 16:28 UTC
[nsd-users] NSD4 goes unresponsive with lots of TCP connection!
Hi,

We are seeing a large number of TCP connections to our DNS servers (in the thousands). After some time NSD becomes unresponsive and does not recover; it stops responding to UDP as well. We tried increasing tcp-count, but it doesn't help.

I noticed that the TCP backlog is hardcoded to 256 in the NSD build configuration, so even with customised TCP backlog settings on the system it is still throttled at around 256. Is there any way we can change this value without recompiling NSD?

    [kabindra at 05 nsd-4.1.8]$ grep BACKLOG *
    config.h.in:#undef TCP_BACKLOG
    configure:#define TCP_BACKLOG 256
    configure.ac:AC_DEFINE_UNQUOTED([TCP_BACKLOG], [256], [Define to the backlog to be used with listen.])

We are using NSD 4.1.8.

(On one of the servers that went unresponsive, we have seen the number of TCP connections approach 10k.)

    # ss -s
    Total: 5591 (kernel 5640)
    TCP:   5067 (estab 4968, closed 4, orphaned 0, synrecv 0, timewait 3/0), ports 28

    Transport  Total  IP    IPv6
    *          5640   -     -
    RAW        0      0     0
    UDP        122    63    59
    TCP        5063   5017  46
    INET       5185   5080  105
    FRAG       0      0     0

Thanks.

Regards,
Kabindra Shrestha
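To make the backlog question concrete: on Linux, the value an application passes to listen() is capped by the net.core.somaxconn sysctl, so the accept queue is bounded by the smaller of the two. The following is a minimal sketch of that rule, not NSD code. The function names effective_backlog and read_somaxconn are made up for illustration, and the /proc path is a Linux-specific assumption.

```python
from pathlib import Path
from typing import Optional

# Linux-specific location of the system-wide listen() cap (assumption: Linux host).
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def effective_backlog(requested: int, somaxconn: int) -> int:
    """Accept-queue bound: the smaller of the listen() argument and somaxconn."""
    return min(requested, somaxconn)


def read_somaxconn(path: Path = SOMAXCONN_PATH) -> Optional[int]:
    """Return the current somaxconn value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    nsd_backlog = 256  # TCP_BACKLOG as shown by the grep output above
    limit = read_somaxconn()
    if limit is None:
        print("somaxconn is not readable on this system")
    else:
        print(f"somaxconn={limit}, "
              f"effective backlog={effective_backlog(nsd_backlog, limit)}")
```

Under this rule, raising somaxconn above 256 has no effect while the application still passes 256. Note that the backlog only bounds connections waiting to be accepted; it does not limit the number of established connections reported by ss -s.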
W.C.A. Wijngaards
2016-Apr-06 09:04 UTC
[nsd-users] NSD4 goes unresponsive with lots of TCP connection!
Hi Kabindra,

I have not heard of this before. How is TCP affecting NSD?

NSD has a fixed number of TCP connections, configured with tcp-count: 100 in the nsd.conf file. That should be what is serviced. You should increase that count to increase responsiveness to TCP. UDP should be unaffected.

The backlog is for TCP connections waiting to be accepted. 256 is reasonably portable and reasonably large. I don't see how that value is your problem.

Is your kernel or networking subsystem failing? The OS can return EMFILE or ENFILE to accept(), in which case NSD stops accepting TCP connections to relieve buffer stress on the OS. But again, UDP should not have been impacted.

Are you using so-reuseport: yes? I have had reports that it disrupts connectivity (depending on the OS and the particular version of the OS; more recent versions of NSD no longer use reuseport on TCP).

Best regards,
Wouter
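The EMFILE/ENFILE behaviour described above can be sketched as follows. This is an illustration of the general pattern only, not NSD's actual implementation; the helper name accept_or_pause and the FD_EXHAUSTION set are hypothetical.

```python
import errno
import socket
from typing import Optional, Tuple

# errno values that mean "no file descriptors left":
# EMFILE is the per-process limit, ENFILE the system-wide limit.
FD_EXHAUSTION = {errno.EMFILE, errno.ENFILE}


def accept_or_pause(listener) -> Tuple[Optional[socket.socket], bool]:
    """Try to accept one connection.

    Returns (conn, False) on success, or (None, True) when the OS reports
    descriptor exhaustion, signalling the caller to stop accepting for a
    while. Any other OSError is re-raised unchanged.
    """
    try:
        conn, _addr = listener.accept()
        return conn, False
    except OSError as exc:
        if exc.errno in FD_EXHAUSTION:
            return None, True
        raise
```

A caller would typically remove the listening socket from its event loop while the second value is True and re-add it once existing connections have closed. The UDP socket is a separate descriptor and is not touched by this path, which is consistent with the expectation that UDP should keep working.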
Daisuke HIGASHI
2016-Apr-06 11:56 UTC
[nsd-users] NSD4 goes unresponsive with lots of TCP connection!
Hi,

I have seen the opposite (or perhaps the same?) situation with a BIND9 nameserver: many UDP queries, and the server became almost unresponsive to both UDP and TCP queries. That was not a BIND9 issue; the firewall (iptables) state table was full.
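One way to check for a full firewall state table on Linux is to compare the connection-tracking counters exposed under /proc. The sketch below assumes a Linux host with the nf_conntrack module loaded; the function names and the 0.9 warning threshold are illustrative choices, not taken from the thread.

```python
from pathlib import Path
from typing import Optional

# Assumption: Linux with nf_conntrack loaded; these files are absent otherwise.
CONNTRACK_DIR = Path("/proc/sys/net/netfilter")


def conntrack_usage(count: int, maximum: int) -> float:
    """Fraction of the connection-tracking table currently in use."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return count / maximum


def is_near_full(usage: float, threshold: float = 0.9) -> bool:
    """True when usage reaches the (arbitrary, illustrative) threshold."""
    return usage >= threshold


def read_int(path: Path) -> Optional[int]:
    """Read a single integer from a /proc file, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    count = read_int(CONNTRACK_DIR / "nf_conntrack_count")
    maximum = read_int(CONNTRACK_DIR / "nf_conntrack_max")
    if count is None or maximum is None:
        print("conntrack counters not available on this system")
    else:
        usage = conntrack_usage(count, maximum)
        state = "near full" if is_near_full(usage) else "ok"
        print(f"conntrack: {count}/{maximum} ({usage:.0%}) {state}")
```

When the table is full the kernel drops new packets regardless of protocol, which would explain UDP and TCP failing together even though the nameserver itself is healthy.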