On one of my servers (RELENG_6 as of yesterday), I am seeing what
appears to be RX overruns. Load avg does not seem to be high, and the
only odd thing I have done to the kernel is to define

#define EM_FAST_INTR 1

The man page talks about setting hw.em.* vars, but does not discuss
any of the tunables via dev.em.*. Is there anything that can be tuned
there to improve performance? Also, the man page talks about various
controllers having different max values. How do I know what this
particular card has available, as it seems to have a controller
(82572GI) not mentioned in the man page.

# sysctl -a dev.em.2
dev.em.2.%desc: Intel(R) PRO/1000 Network Connection Version - 6.2.9
dev.em.2.%driver: em
dev.em.2.%location: slot=0 function=0
dev.em.2.%pnpinfo: vendor=0x8086 device=0x105e subvendor=0x8086 subdevice=0x115e class=0x020000
dev.em.2.%parent: pci1
dev.em.2.debug_info: -1
dev.em.2.stats: -1
dev.em.2.rx_int_delay: 0
dev.em.2.tx_int_delay: 66
dev.em.2.rx_abs_int_delay: 66
dev.em.2.tx_abs_int_delay: 66
dev.em.2.rx_processing_limit: 100

Jan 30 11:04:31 FW4a-tor kernel: em2: Adapter hardware address = 0xc4b6f948
Jan 30 11:04:31 FW4a-tor kernel: em2: CTRL = 0x80c0241 RCTL = 0x8002
Jan 30 11:04:31 FW4a-tor kernel: em2: Packet buffer = Tx=16k Rx=32k
Jan 30 11:04:31 FW4a-tor kernel: em2: Flow control watermarks high = 30720 low = 29220
Jan 30 11:04:31 FW4a-tor kernel: em2: tx_int_delay = 66, tx_abs_int_delay = 66
Jan 30 11:04:31 FW4a-tor kernel: em2: rx_int_delay = 0, rx_abs_int_delay = 66
Jan 30 11:04:31 FW4a-tor kernel: em2: fifo workaround = 0, fifo_reset_count = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: hw tdh = 246, hw tdt = 246
Jan 30 11:04:31 FW4a-tor kernel: em2: Num Tx descriptors avail = 231
Jan 30 11:04:31 FW4a-tor kernel: em2: Tx Descriptors not avail1 = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Tx Descriptors not avail2 = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Std mbuf failed = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Std mbuf cluster failed = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Driver dropped packets = 0
Jan 30 11:04:31 FW4a-tor kernel: em2: Driver tx dma failure in encap = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Excessive collisions = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Sequence errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Defer count = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Missed Packets = 47990
Jan 30 11:04:40 FW4a-tor kernel: em2: Receive No Buffers = 2221
Jan 30 11:04:40 FW4a-tor kernel: em2: Receive Length Errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Receive errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Crc errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Alignment errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Carrier extension errors = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: RX overruns = 61
Jan 30 11:04:40 FW4a-tor kernel: em2: watchdog timeouts = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: XON Rcvd = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: XON Xmtd = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: XOFF Rcvd = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: XOFF Xmtd = 0
Jan 30 11:04:40 FW4a-tor kernel: em2: Good Packets Rcvd = 126019287
Jan 30 11:04:40 FW4a-tor kernel: em2: Good Packets Xmtd = 78181054

em2@pci1:0:0:   class=0x020000 card=0x115e8086 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = 'PRO/1000 PT'
    class    = network
    subclass = ethernet
em3@pci1:0:1:   class=0x020000 card=0x115e8086 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = 'PRO/1000 PT'
    class    = network
    subclass = ethernet

em2: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x9000-0x901f mem 0xd1020000-0xd103ffff,0xd1000000-0xd101ffff irq 18 at device 0.0 on pci1
em2: Ethernet address: 00:15:17:0b:46:7c
em2: [FAST]
em3: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x9400-0x941f mem 0xd1040000-0xd105ffff,0xd1060000-0xd107ffff irq 19 at device 0.1 on pci1
em3: Ethernet address: 00:15:17:0b:46:7d
em3: [FAST]

--------------------------------------------------------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet since 1994    www.sentex.net
Cambridge, Ontario Canada        www.sentex.net/mike
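(For anyone following along: the two namespaces asked about above are reached
differently. The sketch below shows the usual way each is set on a 6.x box;
the particular values are placeholders for illustration only, not settings
taken from this machine.)

# hw.em.* are loader tunables: read once at boot and applied to every em(4)
# adapter, so they go in /boot/loader.conf, for example:
hw.em.rxd="512"

# dev.em.N.* are per-device sysctl nodes created when the driver attaches;
# they can be read at runtime, and changed where the driver marks them
# writable:
# sysctl dev.em.2.rx_processing_limit
# sysctl dev.em.2.rx_processing_limit=200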
On 1/30/07, Mike Tancsa <mike@sentex.net> wrote:
> On one of my servers (RELENG_6 as of yesterday), I am seeing what
> appears to be RX overruns. Load avg does not seem to be high, and the
> only odd thing I have done to the kernel is defined
>
> #define EM_FAST_INTR 1
>
> The man page talks about setting hw.em.* vars, but does not
> discuss any of the tunables via dev.em.*. Is there anything that can
> be tuned there to improve performance? Also, the man page talks
> about various controllers having different max values. How do I know
> what this particular card has available as it seems to have a
> controller (82572GI) not mentioned in the man page.
>
> [sysctl output, per-interface counters, and probe messages snipped;
> see the original message above]

Performance tuning is not something that I have yet had time to focus
on, our Linux team is able to do a lot more of that. Just at a glance,
try increasing your mbuf pool size and the number of receive descriptors
for a start. Oh, and try increasing your processing limit to 200 and see
what effect that has.

Jack
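Purely as a sketch of what those three suggestions could look like on a
RELENG_6 box (tunable names from em(4) and the stock kernel; the numbers are
example values, not recommendations from this thread):

# /boot/loader.conf -- picked up at the next boot
kern.ipc.nmbclusters="32768"   # enlarge the mbuf cluster pool
hw.em.rxd="1024"               # more receive descriptors per adapter (default 256)

# at runtime, assuming the per-device sysctl is writable on this driver version:
# sysctl dev.em.2.rx_processing_limit=200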
At 12:30 PM 1/30/2007, Jack Vogel wrote:
>Performance tuning is not something that I have yet had time to focus
>on, our Linux team is able to do a lot more of that. Just at a glance,
>try increasing your mbuf pool size and the number of receive descriptors
>for a start. Oh, and try increasing your processing limit to 200 and see
>what effect that has.

Hi, thanks for the info. What is the limit on processing_limit, and apart
from crashing the box, how do I know if I set it too high? ;-)

I am not sure which mbuf setting you mean? From netstat -m, I don't seem
to be hitting any max values.

# netstat -m
838/2237/3075 mbufs in use (current/cache/total)
836/578/1414/25600 mbuf clusters in use (current/cache/total/max)
836/572 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
1881K/1715K/3596K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/5/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

As for hw.em.rxd, how do I know what this chip can handle? It says the
current default is 256, but I don't know what I can set that to, based on
this adaptor?

WRT hw.em.rx_int_delay:

     This value delays the generation of receive interrupts in units of
     1.024 microseconds.  The default value is 0, since adapters may hang
     with this feature being enabled.

Do you know which adaptors have this issue? Also, for
hw.em.rx_abs_int_delay:

     If hw.em.rx_int_delay is non-zero, this tunable limits the maximum
     delay in which a receive interrupt is generated.

I take it this is for interrupt moderation? Am I right in thinking that if
my rx buffers are filling, the box is not processing interrupts fast
enough, so I should move this value closer to zero? How do I find what the
current value is?

Thanks for any pointers you can provide.

        ---Mike
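As a rough sketch of the arithmetic behind those delay tunables (the
1.024 us granularity is straight from the man page text quoted above, and
the values shown are the ones reported for dev.em.2 earlier in the thread):

# the values the driver is actually using can be read back per device:
# sysctl dev.em.2.rx_int_delay dev.em.2.rx_abs_int_delay
dev.em.2.rx_int_delay: 0
dev.em.2.rx_abs_int_delay: 66

# 66 units x 1.024 us/unit is roughly 67.6 us; per the man page that cap only
# applies when rx_int_delay is non-zero, in which case it works out to at most
# roughly 14,800 receive interrupts per second if nothing fires them sooner.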
At 12:30 PM 1/30/2007, Jack Vogel wrote:
>Performance tuning is not something that I have yet had time to focus
>on, our Linux team is able to do a lot more of that. Just at a glance,
>try increasing your mbuf pool size and the number of receive descriptors
>for a start.

OK, I set up a test box to pass packets through, and I am getting results
I don't understand. Increasing hw.em.rxd in loader.conf (and rebooting
each time), I am getting worse results.

With hw.em.rxd=4096
Jan 30 17:19:10 em-test kernel: em0: Receive No Buffers = 5707564

With hw.em.rxd=1024
Jan 30 17:22:31 em-test kernel: em0: Receive No Buffers = 351

With hw.em.rxd=512
Jan 30 17:27:24 em-test kernel: em0: Receive No Buffers = 230

With the default of 256
Jan 30 16:55:44 em-test kernel: em0: Receive No Buffers = 77

With 128, it gets much worse.

This is with a stock UP kernel, no INET6, net.inet.ip.fastforwarding=1

Box A ------ Box B (with dual Intel NIC) ------ Box C

Box A is generating packets routed through firewall Box B towards Box C.
They are connected together with 2 crossover cables.

>Oh, and try increasing your processing limit to 200 and see
>what effect that has.

I will try that next.

        ---Mike
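For clarity, each iteration above boils down to something like the following
(sketch only; I am assuming that writing 1 to dev.em.0.stats is what produces
the "Receive No Buffers" lines in the kernel log on this driver version):

# /boot/loader.conf on Box B, changed and followed by a reboot for each run
hw.em.rxd="1024"

# /etc/sysctl.conf (or set by hand before the run)
net.inet.ip.fastforwarding=1

# after pushing traffic from Box A towards Box C, dump the counters to the
# console and pull out the drop counter:
# sysctl dev.em.0.stats=1
# dmesg | grep "Receive No Buffers"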