On Sat, Nov 26, 2011 at 04:05:58PM -0500, Mike Andrews wrote:
> I have a Supermicro 5015A-H (Intel Atom 330) server with two Realtek
> RTL8111C-GR gigabit NICs on it. As far as I can tell, these support
> jumbo frames up to 7422 bytes. When running them at an MTU of 5000 on

Actually the maximum size is 6KB for RTL8111C, not 7422.
RTL8111C and newer PCIe based gigabit controllers no longer support
scattering a jumbo frame into multiple RX buffers, so a single RX
buffer has to receive an entire jumbo frame. This puts more burden
on the system because it has to allocate a jumbo buffer even when it
receives a pure TCP ACK.
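
To put rough numbers on that burden, here is a back-of-the-envelope
sketch. It assumes the driver fills each RX descriptor with one 9k jumbo
cluster (MJUM9BYTES, the pool that kern.ipc.nmbjumbo9 limits); the ring
size of 256 is a hypothetical figure picked for illustration, not a
value taken from the driver.

```python
# Rough cost of receiving small frames into whole jumbo clusters.
# Assumption: one 9k jumbo cluster per RX descriptor.
MJUM9BYTES = 9 * 1024      # bytes per 9k jumbo cluster
MCLBYTES = 2 * 1024        # bytes per standard mbuf cluster
RX_RING_SLOTS = 256        # hypothetical number of RX descriptors


def ring_memory(cluster_size, slots=RX_RING_SLOTS):
    """Bytes pinned by an RX ring holding one cluster per descriptor."""
    return cluster_size * slots


def wasted_fraction(frame_len, cluster_size):
    """Fraction of a cluster left unused by a frame of frame_len bytes."""
    return 1.0 - frame_len / cluster_size


ack_len = 64  # approximate size of a pure TCP ACK at the Ethernet minimum

print(ring_memory(MJUM9BYTES))   # 2359296 bytes for a jumbo ring
print(ring_memory(MCLBYTES))     # 524288 bytes for a standard ring
print(round(wasted_fraction(ack_len, MJUM9BYTES) * 100, 1))  # 99.3 (percent)
```

Under those assumptions a pure ACK leaves over 99% of its buffer
unused, and every refill of the ring has to come out of the jumbo pool.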
> FreeBSD 9.0-RC2, after a week or so of uptime, with fairly light network
> activity, the interfaces die with "no memory for jumbo buffers" errors
> on the console. Unloading and reloading the driver (via serial console)
> doesn't help; only rebooting seems to clear it up.
>
The jumbo code path is the same as the normal MTU sized one, so I think
the possibility of leaking mbufs in the driver is very low. And the
message "no memory for jumbo RX buffers" can only appear either
when you bring the interface up again or when the interface is
restarted by the watchdog timeout handler. I don't think you're seeing
watchdog timeouts though.
When you saw the "no memory for jumbo RX buffers" message, did you
check the available mbuf pool?
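
Since it takes about a week to reproduce, it may help to record the
pool state periodically so the trend is visible afterwards. Below is a
minimal sketch, assuming Python is available on the box; the helper
names and the log path in the usage note are hypothetical. netstat -m
and vmstat -z are the stock tools that report mbuf and UMA zone usage.

```python
import subprocess
import time


def snapshot(cmd):
    """Run cmd and return its stdout prefixed with a timestamp header."""
    out = subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return "=== %s %s ===\n%s" % (stamp, " ".join(cmd), out)


def log_snapshots(cmds, path):
    """Append one timestamped snapshot per command to the file at path."""
    with open(path, "a") as f:
        for cmd in cmds:
            f.write(snapshot(cmd))
```

Calling log_snapshots([["netstat", "-m"], ["vmstat", "-z"]],
"/var/log/mbuf-trend.log") from an hourly cron job would show whether
the jumbo cluster counts or the denied-request counters climb steadily
before the interfaces die.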
> I don't have this issue with any of my em(4) based systems that are also
> using a 5000 byte MTU -- and they push considerably more traffic.
>
> I don't really consider this a regression from FreeBSD 8.2 because 8.2
> didn't support jumbos at all on this hardware... :)
>
> What's the best way to go about debugging this... which sysctl's should
> I be looking at first? I have already tried raising kern.ipc.nmbjumbo9
> to 16384 and it doesn't seem to help things... maybe prolonging it
> slightly, but not by much. The problem is it takes a week or so to
> reproduce the problem each time...
>
My vague guess is that it could be related to some other subsystem that
leaks mbufs, such that the driver was not able to get more jumbo RX
buffers from the system. For instance, r228016 would be worth trying on
your box. I can't clearly explain why em(4) does not suffer from
the issue though.