All,
We had NSD misconfigured: it was checking TSIG keys on the notify
messages that we received from our masters. This was a
misconfiguration because the masters are BIND, which does not support
securing notify messages with TSIG.
We changed the configuration to use NOKEY, and now we accept
notifies. Yay!
But, NSD now leaks memory. :(
For our large zones, we get updates once a minute or so. It is
possible NSD leaks memory even without notifies, but since updates
would then arrive about 15x less often (or whatever the refresh time
is on the zone), we don't notice it so much. It is also possible that
it has something to do with starting a notify when another is in
progress. Or something else. (I have no clue, just guessing.)
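One rough way to tell these guesses apart would be to sample the
resident memory of an NSD process at fixed intervals and compare the
growth rate with notifies enabled and disabled. A minimal sketch in
Python, assuming `ps -o rss= -p PID` is available (it should be on both
FreeBSD and Linux); the function names are made up for illustration:

```python
import subprocess
import time


def rss_kib(pid):
    """Return the resident set size of `pid` in KiB, as reported by ps."""
    out = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())


def growth_kib_per_min(samples):
    """Average growth between the first and last (unix_time, rss_kib) sample."""
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (r1 - r0) * 60.0 / (t1 - t0)


def watch(pid, interval=60, count=10):
    """Take `count` samples, sleeping `interval` seconds after each one."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), rss_kib(pid)))
        time.sleep(interval)
    return growth_kib_per_min(samples)
```

Running `watch()` against the same process with and without notifies
should show whether the growth rate follows the update rate.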
We see this on both FreeBSD (with 64-bit NSD) and Linux (with both
32-bit and 64-bit NSD). Eventually the processes run out of memory
and die. Usually they first stop accepting new IXFR because they
don't have enough memory, like this:
[1214352780] nsd[12973]: warning: signal received, reloading...
[1214352780] nsd[14463]: info: memory recyclebin holds 571968 bytes
[1214352780] nsd[14463]: error: malloc failed: Cannot allocate memory
[1214352781] nsd[12973]: error: handle_reload_cmd: reload closed cmd channel
[1214352781] nsd[12973]: warning: Reload process 14463 failed with status 256, continuing with old database
[1214352781] nsd[20195]: error: xfrd: zone org: soa serial 2008220593 update failed restarting transfer (notified zone)
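To line the failures up against time, log lines in this format can be
pulled apart with a short script. This is only a sketch: the regular
expressions are inferred from the excerpt above, not from any NSD
documentation, and the function names are hypothetical.

```python
import re

# "[unix_time] nsd[pid]: level: message", as seen in the excerpt above.
LINE_RE = re.compile(r"^\[(\d+)\] nsd\[(\d+)\]: (\w+): (.*)$")
BIN_RE = re.compile(r"memory recyclebin holds (\d+) bytes")


def parse(lines):
    """Yield (timestamp, pid, level, message), rejoining wrapped lines."""
    current = None
    for raw in lines:
        line = raw.strip()
        m = LINE_RE.match(line)
        if m:
            if current:
                yield tuple(current)
            current = [int(m.group(1)), int(m.group(2)), m.group(3), m.group(4)]
        elif current and line:
            # A line without the "[time] nsd[pid]:" prefix continues the previous one.
            current[3] += " " + line
    if current:
        yield tuple(current)


def recyclebin_sizes(lines):
    """Return (timestamp, bytes) for every recyclebin report in the log."""
    sizes = []
    for ts, _pid, _level, msg in parse(lines):
        m = BIN_RE.search(msg)
        if m:
            sizes.append((ts, int(m.group(1))))
    return sizes


def failures(lines):
    """Return the entries logged at level 'error'."""
    return [entry for entry in parse(lines) if entry[2] == "error"]
```

Feeding a whole log file through `recyclebin_sizes()` and `failures()`
gives a quick view of when reloads started failing.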
So, my questions are:
1. Is this a known bug?
2. If it is not a previously known bug, has anybody else seen this?
If this is something new, I welcome advice on how to debug it. My
current thinking is to simply try valgrind.
Cheers,
--
Shane