On Mon, 2 May 2016 00:01+0200, Wolfgang Zenker wrote:
> Hi,
>
> after updating some 10-STABLE systems a few days ago, I noticed that on
> two of those systems bsnmpd started to use up a lot of cpu time, and the
> available memory shrank until the system became unusable. Killing
> bsnmpd stops the cpu usage but does not free up memory.
> Both affected systems are amd64, one having moved from r297555 to
> r298723, the other from r297555 to r298722. Another amd64 system
> that went from r297555 to r298722 appears to be unaffected.
> The two affected systems are on an internal LAN segment and there
> is currently no application connecting to snmp on those machines.
>
> What would be useful debugging data to collect in this case?

I believe I've seen the very same on my systems. All of them got
updated last Friday due to the recent NTP fix. Prior to last Friday,
they all ran stable/10 from early March, r296648-ish. None of them
runs bsnmpd, but they offer a lot of network services.

Three of my i386 systems, each with 1 GiB of memory, ran out of swap
space on Sunday afternoon.

Last night a mail server running i386 with 4 GiB of memory died while
handling mail. What I could glean from the messages on /dev/ttyvb (due
to custom logging) before rebooting is that it's all networking
related. SpamAssassin and syslogd on the mail server managed to
transmit these lines to the central log host before dying:

May 2 00:05:17 <mail.err> [HOSTNAME] spamc[63613]: connect to spamd on ::1 failed, retrying (#1 of 3): Connection refused
May 2 00:05:17 <mail.err> [HOSTNAME] spamc[63613]: connect to spamd on 127.0.0.1 failed, retrying (#1 of 3): Connection refused
May 2 00:05:18 <mail.err> [HOSTNAME] spamc[63613]: connect to spamd on ::1 failed, retrying (#2 of 3): Connection refused
May 2 00:05:18 <mail.err> [HOSTNAME] spamc[63613]: connect to spamd on 127.0.0.1 failed, retrying (#2 of 3): Connection refused
May 2 00:05:19 <mail.err> [HOSTNAME] spamc[63613]: connect to spamd on ::1 failed, retrying (#3 of 3): Connection refused
May 2 00:05:19 <mail.err> [HOSTNAME] spamc[63613]: connect to spamd on 127.0.0.1 failed, retrying (#3 of 3): Connection refused
May 2 00:05:19 <mail.err> [HOSTNAME] spamc[63613]: connection attempt to spamd aborted after 3 retries
May 2 00:52:17 <mail.err> [HOSTNAME] sm-mta[63740]: u41Mp86h063740: Milter (spamassassin): error creating socket: No buffer space available
May 2 00:52:17 <mail.err> [HOSTNAME] sm-mta[63739]: u41Mp8r9063739: Milter (spamassassin): error creating socket: No buffer space available
May 2 00:52:17 <mail.info> [HOSTNAME] sm-mta[63740]: u41Mp86h063740: Milter (spamassassin): to error state
May 2 00:52:17 <mail.info> [HOSTNAME] sm-mta[63739]: u41Mp8r9063739: Milter (spamassassin): to error state
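
For reference, "No buffer space available" in the sm-mta lines above is the standard error string for errno ENOBUFS, returned when the kernel cannot allocate the buffers a new socket needs. The following is only a minimal Python sketch of a probe that tells that condition apart from a plain descriptor shortage. It is not something that was run on the affected machines, and the function names describe_socket_failure and probe_socket are made up for illustration.

```python
import errno
import socket


def describe_socket_failure(exc: OSError) -> str:
    """Map an OSError from socket creation to a short diagnosis (hypothetical helper)."""
    if exc.errno == errno.ENOBUFS:
        # Same condition sm-mta logged as "No buffer space available".
        return "ENOBUFS: kernel could not allocate socket buffers"
    if exc.errno in (errno.EMFILE, errno.ENFILE):
        # Per-process or system-wide file descriptor table is full.
        return "descriptor limit reached"
    return f"other error: {exc.strerror}"


def probe_socket() -> str:
    """Try to create, then immediately close, a TCP socket (hypothetical helper)."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM):
            return "ok"
    except OSError as exc:
        return describe_socket_failure(exc)


if __name__ == "__main__":
    print(probe_socket())
```

Run periodically from cron, a probe like this might show roughly when socket creation starts failing, which could then be lined up against the memory and swap figures.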

All of the amd64 systems with 4 GiB or 8 GiB of memory are apparently
unaffected.

Maybe it's time to convert the remaining i386 systems to amd64
systems, and add some memory while I'm at it.

The bug is either in the kernel or in libc, or both.

--
+-------------------------------+------------------------------------+
| Vennlig hilsen,               | Best regards,                      |
| Trond Endrestøl,              | Trond Endrestøl,                   |
| IT-ansvarlig,                 | System administrator,              |
| Fagskolen Innlandet,          | Gjøvik Technical College, Norway,  |
| tlf. mob. 952 62 567,         | Cellular...: +47 952 62 567,       |
| sentralbord 61 14 54 00.      | Switchboard: +47 61 14 54 00.      |
+-------------------------------+------------------------------------+