On one of my servers (RELENG_6 as of yesterday), I am seeing what
appears to be RX overruns. Load avg does not seem to be high, and the
only odd thing I have done to the kernel is to define

#define EM_FAST_INTR 1

The man page talks about setting hw.em.* vars, but does not discuss
any of the tunables via dev.em.*. Is there anything that can be tuned
there to improve performance? Also, the man page talks about various
controllers having different max values. How do I know what this
particular card has available, as it seems to have a controller
(82572GI) not mentioned in the man page.

# sysctl -a dev.em.2
dev.em.2.%desc: Intel(R) PRO/1000 Network Connection Version - 6.2.9
dev.em.2.%driver: em
dev.em.2.%location: slot=0 function=0
dev.em.2.%pnpinfo: vendor=0x8086 device=0x105e subvendor=0x8086 subdevice=0x115e class=0x020000
dev.em.2.%parent: pci1
dev.em.2.debug_info: -1
dev.em.2.stats: -1
dev.em.2.rx_int_delay: 0
dev.em.2.tx_int_delay: 66
dev.em.2.rx_abs_int_delay: 66
dev.em.2.tx_abs_int_delay: 66
dev.em.2.rx_processing_limit: 100

Jan 30 11:04:31 FW4a-tor kernel: em2: Adapter hardware address = 0xc4b6f948
Jan 30 11:04:31 FW4a-tor kernel: em2: CTRL = 0x80c0241 RCTL = 0x8002
Jan 30 11:04:31 FW4a-tor kernel: em2: Packet buffer = Tx=16k Rx=32k
Jan 30 11:04:31 FW4a-tor kernel: em2: Flow control watermarks high = 30720 low = 29220
Jan 30 11:04:31 FW4a-tor kernel: em2: tx_int_delay = 66, tx_abs_int_delay = 66
Jan 30 11:04:31 FW4a-tor kernel: em2: rx_int_delay = 0, rx_abs_int_delay = 66
Jan 30 11:04:31 FW4a-tor kernel: em2: fifo workaround = 0, fifo_reset_count = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: hw tdh = 246, hw tdt = 246
Jan 30 11:04:31 FW4a-tor kernel: em2: Num Tx descriptors avail = 231
Jan 30 11:04:31 FW4a-tor kernel: em2: Tx Descriptors not avail1 = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Tx Descriptors not avail2 = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Std mbuf failed = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Std mbuf cluster failed = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Driver dropped packets = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Driver tx dma failure in encap = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Excessive collisions = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Sequence errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Defer count = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Missed Packets = 47990
Jan 30 11:04:40 FW4a-tor kernel: em2: Receive No Buffers = 2221
Jan 30 11:04:40 FW4a-tor kernel: em2: Receive Length Errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Receive errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Crc errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Alignment errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Carrier extension errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: RX overruns = 61
Jan 30 11:04:40 FW4a-tor kernel: em2: watchdog timeouts = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: XON Rcvd = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: XON Xmtd = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: XOFF Rcvd = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: XOFF Xmtd = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Good Packets Rcvd = 126019287
Jan 30 11:04:40 FW4a-tor kernel: em2: Good Packets Xmtd = 78181054

em2@pci1:0:0:   class=0x020000 card=0x115e8086 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = 'PRO/1000 PT'
    class    = network
    subclass = ethernet
em3@pci1:0:1:   class=0x020000 card=0x115e8086 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = 'PRO/1000 PT'
    class    = network
    subclass = ethernet

em2: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x9000-0x901f mem 0xd1020000-0xd103ffff,0xd1000000-0xd101ffff irq 18 at device 0.0 on pci1
em2: Ethernet address: 00:15:17:0b:46:7c
em2: [FAST]
em3: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x9400-0x941f mem 0xd1040000-0xd105ffff,0xd1060000-0xd107ffff irq 19 at device 0.1 on pci1
em3: Ethernet address: 00:15:17:0b:46:7d
em3: [FAST]

--------------------------------------------------------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet since 1994    www.sentex.net
Cambridge, Ontario Canada        www.sentex.net/mike
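(For anyone following along: the two namespaces asked about above are reached
differently. The sketch below shows the usual way each is set on a 6.x box;
the particular values are placeholders for illustration only, not settings
taken from this machine.)

# hw.em.* are loader tunables: read once at boot and applied to every em(4)
# adapter, so they go in /boot/loader.conf, for example:
hw.em.rxd="512"

# dev.em.N.* are per-device sysctl nodes created when the driver attaches;
# they can be read at runtime, and changed where the driver marks them
# writable:
# sysctl dev.em.2.rx_processing_limit
# sysctl dev.em.2.rx_processing_limit=200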
On 1/30/07, Mike Tancsa <mike@sentex.net> wrote:
> On one of my servers (RELENG_6 as of yesterday), I am seeing what
> appears to be RX overruns. Load avg does not seem to be high, and the
> only odd thing I have done to the kernel is defined
>
> #define EM_FAST_INTR 1
>
> The man page talks about setting hw.em.* vars, but does not
> discuss any of the tunables via dev.em.*. Is there anything that can
> be tuned there to improve performance? Also, the man page talks
> about various controllers having different max values. How do I know
> what this particular card has available as it seems to have a
> controller (82572GI) not mentioned in the man page.
>
> [sysctl output, per-interface counters, and probe messages snipped;
> see the original message above]

Performance tuning is not something that I have yet had time to focus
on, our Linux team is able to do a lot more of that. Just at a glance,
try increasing your mbuf pool size and the number of receive descriptors
for a start. Oh, and try increasing your processing limit to 200 and see
what effect that has.

Jack
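Purely as a sketch of what those three suggestions could look like on a
RELENG_6 box (tunable names from em(4) and the stock kernel; the numbers are
example values, not recommendations from this thread):

# /boot/loader.conf -- picked up at the next boot
kern.ipc.nmbclusters="32768"   # enlarge the mbuf cluster pool
hw.em.rxd="1024"               # more receive descriptors per adapter (default 256)

# at runtime, assuming the per-device sysctl is writable on this driver version:
# sysctl dev.em.2.rx_processing_limit=200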
At 12:30 PM 1/30/2007, Jack Vogel wrote:
>Performance tuning is not something that I have yet had time to focus
>on, our Linux team is able to do a lot more of that. Just at a glance,
>try increasing your mbuf pool size and the number of receive descriptors
>for a start. Oh, and try increasing your processing limit to 200 and see
>what effect that has.

Hi, thanks for the info. What is the limit on processing_limit, and apart
from crashing the box, how do I know if I set it too high? ;-)

I am not sure which mbuf setting you mean? From netstat -m, I don't seem
to be hitting any max values.

# netstat -m
838/2237/3075 mbufs in use (current/cache/total)
836/578/1414/25600 mbuf clusters in use (current/cache/total/max)
836/572 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
1881K/1715K/3596K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/5/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

As for hw.em.rxd, how do I know what this chip can handle? It says the
current default is 256, but I don't know what I can set that to, based on
this adaptor?

WRT hw.em.rx_int_delay:

     This value delays the generation of receive interrupts in units of
     1.024 microseconds.  The default value is 0, since adapters may hang
     with this feature being enabled.

Do you know which adaptors have this issue? Also, for
hw.em.rx_abs_int_delay:

     If hw.em.rx_int_delay is non-zero, this tunable limits the maximum
     delay in which a receive interrupt is generated.

I take it this is for interrupt moderation? Am I right in thinking that if
my rx buffers are filling, the box is not processing interrupts fast
enough, so I should move this value closer to zero? How do I find what the
current value is?

Thanks for any pointers you can provide.

        ---Mike
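As a rough sketch of the arithmetic behind those delay tunables (the
1.024 us granularity is straight from the man page text quoted above, and
the values shown are the ones reported for dev.em.2 earlier in the thread):

# the values the driver is actually using can be read back per device:
# sysctl dev.em.2.rx_int_delay dev.em.2.rx_abs_int_delay
dev.em.2.rx_int_delay: 0
dev.em.2.rx_abs_int_delay: 66

# 66 units x 1.024 us/unit is roughly 67.6 us; per the man page that cap only
# applies when rx_int_delay is non-zero, in which case it works out to at most
# roughly 14,800 receive interrupts per second if nothing fires them sooner.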
At 12:30 PM 1/30/2007, Jack Vogel wrote:
>Performance tuning is not something that I have yet had time to focus
>on, our Linux team is able to do a lot more of that. Just at a glance,
>try increasing your mbuf pool size and the number of receive descriptors
>for a start.

OK, I set up a test box to pass packets through, and I am getting results
I don't understand. Increasing hw.em.rxd in loader.conf (and rebooting
each time), I am getting worse results.

With hw.em.rxd=4096
Jan 30 17:19:10 em-test kernel: em0: Receive No Buffers = 5707564

With hw.em.rxd=1024
Jan 30 17:22:31 em-test kernel: em0: Receive No Buffers = 351

With hw.em.rxd=512
Jan 30 17:27:24 em-test kernel: em0: Receive No Buffers = 230

With the default of 256
Jan 30 16:55:44 em-test kernel: em0: Receive No Buffers = 77

With 128, it gets much worse.

This is with a stock UP kernel, no INET6, net.inet.ip.fastforwarding=1

Box A ------ Box B (with dual Intel NIC) ------ Box C

Box A is generating packets routed through firewall Box B towards Box C.
They are connected together with 2 crossover cables.

>Oh, and try increasing your processing limit to 200 and see
>what effect that has.

I will try that next.

        ---Mike
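For clarity, each iteration above boils down to something like the following
(sketch only; I am assuming that writing 1 to dev.em.0.stats is what produces
the "Receive No Buffers" lines in the kernel log on this driver version):

# /boot/loader.conf on Box B, changed and followed by a reboot for each run
hw.em.rxd="1024"

# /etc/sysctl.conf (or set by hand before the run)
net.inet.ip.fastforwarding=1

# after pushing traffic from Box A towards Box C, dump the counters to the
# console and pull out the drop counter:
# sysctl dev.em.0.stats=1
# dmesg | grep "Receive No Buffers"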