This patch is an evolution of the last one I sent out. It has a couple of minor corrections, like a bad forward decl in the header.

The last patch has had quite a bit of testing and all reports have been positive. The only complaint was from Gleb, who says he needs to keep his beloved infinite for loop in the interrupt handler. Well, I have a better one for you Gleb, keep reading.

I have also been doing some extreme stress testing using SmartBits, and discovered that the driver as it stands is really not able to take extreme receive-side pounding. Scott pointed out that this is why the FAST_INTR work was done :)

There were some people that had stability issues with that work, but there were also many that did not. I actually merged the FAST code onto my last patch, ran the SB stress, and found it really was able to gracefully handle that load. Way to go Scott :)

I've pondered this situation, and this patch I'm including here today is the result. Here's what it does:

If you drop it in place, compile it, and go... you will get the code that has been tested for a week. It uses the older style interrupts, and it has the watchdog and other SMP fixes, so it's been proven.

BUT, I've added the FAST_INTR changes back into the code, so if you go into your Makefile and add -DEM_FAST_INTR you will then get the taskqueue stuff.

So, Gleb, rather than replace the infinite for loop that no one thinks is a good idea, you can just define FAST_INTR again, and you should be good to go.

I see this as the best thing for the 6.2 RELEASE. It lets us keep moving forward: people that want max performance can define EM_FAST_INTR and help us wring out any problems, and it also means that I will have our Intel test group start using this code. But for those that just want a stable driver, the standard compile will still give them that.

The patch I'm including is against BETA3. Let me know of your concerns or issues.
Cheers,

Jack
-------------- next part --------------
A non-text attachment was scrubbed...
Name: proposed-6.2.patch
Type: text/x-patch
Size: 21823 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20061109/f28bb79c/proposed-6.2.bin
Without applying this new patch, can I still use sysctl to maximise its performance the way FAST_INTR does?

S

On 11/9/06, Jack Vogel <jfvogel@gmail.com> wrote:
> BUT, I've added the FAST_INTR changes back into the code, so
> if you go into your Makefile and add -DEM_FAST_INTR you will
> then get the taskqueue stuff.
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
At 08:19 PM 11/8/2006, Jack Vogel wrote:
>BUT, I've added the FAST_INTR changes back into the code, so
>if you go into your Makefile and add -DEM_FAST_INTR you will
>then get the taskqueue stuff.

It certainly does make a difference performance wise. I did some quick testing with netperf and netrate. Back to back boxes, using an AMD x2 with bge nic and one intel box

CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2009.27-MHz 686-class CPU)
CPU: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz (2144.01-MHz 686-class CPU)

The intel is a DG965SS with integrated em nic, the AMD a Tyan with integrated bge. Both running SMP kernels with pf built in, no inet6.

Intel box as sender. This test is with the patch from yesterday. In each test, the first result is with the patch as is, the second with -DEM_FAST_INTR.

TCP STREAM TEST to 192.168.44.1 : +/-2.5% @ 99% conf.
Recv   Send   Send
Socket Socket Message  Elapsed
Size   Size   Size     Time     Throughput
bytes  bytes  bytes    secs.    10^6bits/sec

57344  57344  4096     62.19    858.16
57344  57344  4096     62.19    934.58

TCP STREAM TEST to 192.168.44.1 : +/-2.5% @ 99% conf.
Recv   Send   Send
Socket Socket Message  Elapsed
Size   Size   Size     Time     Throughput
bytes  bytes  bytes    secs.    10^6bits/sec

32768  32768  4096     62.27    551.46
32768  32768  4096     62.26    788.56

TCP REQUEST/RESPONSE TEST to 192.168.44.1 : +/-2.5% @ 99% conf.
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

32768  65536  1        1       62.26    2999.88
32768  65536
32768  65536  1        1       62.31    6165.46
32768  65536

UDP REQUEST/RESPONSE TEST to 192.168.44.1 : +/-2.5% @ 99% conf.
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

9216   41600  1        1       62.30    3170.25
9216   41600
9216   41600  1        1       62.34    6170.81
9216   41600

UDP REQUEST/RESPONSE TEST to 192.168.44.1 : +/-2.5% @ 99% conf.
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

9216   41600  516      4       62.28    2999.17
9216   41600
9216   41600  516      4       62.33    6031.56
9216   41600

UDP UNIDIRECTIONAL SEND TEST to 192.168.44.1 : +/-2.5% @ 99% conf.
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay     Errors   Throughput
bytes   bytes    secs            #          #   10^6bits/sec

32768   4096     60.00     1743632   24778919   952.25
41600            60.00     1742801              951.79
32768   4096     60.00     1743633   24722456   952.25
41600            60.00     1742828              951.81

UDP UNIDIRECTIONAL SEND TEST to 192.168.44.1 : +/-2.5% @ 99% conf.
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay     Errors   Throughput
bytes   bytes    secs            #          #   10^6bits/sec

32768   1024     60.00     6831370   28639884   932.70
41600            60.00     6828166              932.27
32768   1024     60.00     6831369   28465662   932.70
41600            60.00     6828086              932.26

Intel box as receiver, bge0/AMD as sender
First set of results using stock em driver from 6.2beta2
second set of results using first patch
3rd set using taskqueue enabled

/usr/local/netperf/netperf -t TCP_STREAM -l 60 -H 192.168.44.244 -i 10,3 -I 99,5 -- -s 57344 -S 57344 -m 4096

TCP STREAM TEST to 192.168.44.244 : +/-2.5% @ 99% conf.
Recv   Send   Send
Socket Socket Message  Elapsed
Size   Size   Size     Time     Throughput
bytes  bytes  bytes    secs.    10^6bits/sec

57344  57344  4096     60.00    680.24
57344  57344  4096     60.00    680.34
57344  57344  4096     60.00    680.54

TCP STREAM TEST to 192.168.44.244 : +/-2.5% @ 99% conf.
Recv   Send   Send
Socket Socket Message  Elapsed
Size   Size   Size     Time     Throughput
bytes  bytes  bytes    secs.    10^6bits/sec

32768  32768  4096     60.00    496.72
32768  32768  4096     60.00    499.87
32768  32768  4096     60.00    677.63

TCP REQUEST/RESPONSE TEST to 192.168.44.244 : +/-2.5% @ 99% conf.
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

32768  65536  1        1       60.00    2999.61
32768  65536
32768  65536  1        1       60.00    2999.50
32768  65536
32768  65536  1        1       60.00    6163.75
32768  65536

UDP REQUEST/RESPONSE TEST to 192.168.44.244 : +/-2.5% @ 99% conf.
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

9216   41600  1        1       60.00    3099.52
9216   41600
9216   41600  1        1       60.00    3102.97
9216   41600
9216   41600  1        1       60.00    6178.13
9216   41600

UDP REQUEST/RESPONSE TEST to 192.168.44.244 : +/-2.5% @ 99% conf.
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

9216   41600  516      4       60.00    2956.58
9216   41600
9216   41600  516      4       60.00    2956.15
9216   41600
9216   41600  516      4       60.00    6075.79
9216   41600

/usr/local/netperf/netperf -t UDP_STREAM -l 60 -H 192.168.44.244 -i 10,3 -I 99,5 -- -s 32768 -S 32768 -m 4096

UDP UNIDIRECTIONAL SEND TEST to 192.168.44.244 : +/-2.5% @ 99% conf.
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay     Errors   Throughput
bytes   bytes    secs            #          #   10^6bits/sec

32768   4096     60.00     1340178   20058972   731.91
41600            60.00     1340178              731.90
32768   4096     60.00     1340076   19963473   731.85
41600            60.00     1340076              731.85
32768   4096     60.00     1340497   20167227   732.09
41600            60.00     1340497              732.09

UDP UNIDIRECTIONAL SEND TEST to 192.168.44.244 : +/-2.5% @ 99% conf.
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay     Errors   Throughput
bytes   bytes    secs            #          #   10^6bits/sec

32768   1024     60.00     5468540   29141343   746.64
41600            60.00     5468538              746.63
32768   1024     60.00     5469132   29805133   746.71
41600            60.00     5469132              746.71
32768   1024     60.00     5468372   30181335   746.61
41600            60.00     5468372              746.61
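The gain from -DEM_FAST_INTR is easier to judge as a ratio. The following is a minimal sketch that uses only the Intel-as-sender figures reported above; the dictionary labels and the `speedup` helper are illustrative names and are not part of netperf.

```python
# Figures copied from the netperf runs above (Intel box as sender).
# Each pair is (patch as is, patch built with -DEM_FAST_INTR).
# Stream tests are in 10^6 bits/sec, request/response tests in
# transactions/sec; the ratio is unitless either way.
results = {
    "TCP_STREAM 57344": (858.16, 934.58),
    "TCP_STREAM 32768": (551.46, 788.56),
    "TCP_RR 1/1": (2999.88, 6165.46),
    "UDP_RR 1/1": (3170.25, 6170.81),
    "UDP_RR 516/4": (2999.17, 6031.56),
}


def speedup(baseline, fast):
    """Return how many times faster the EM_FAST_INTR run was."""
    return fast / baseline


for name, (base, fast) in results.items():
    print(f"{name:18s} {base:9.2f} -> {fast:9.2f}  x{speedup(base, fast):.2f}")
```

On these numbers the request/response rates roughly double, while the bulk stream tests gain less.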
At 08:19 PM 11/8/2006, Jack Vogel wrote:
>BUT, I've added the FAST_INTR changes back into the code, so
>if you go into your Makefile and add -DEM_FAST_INTR you will
>then get the taskqueue stuff.

Not sure why you would want FAST_INTR and polling in at the same time, but I found that the two are mutually exclusive: the kernel build fails.

cd /usr/obj/usr/src/sys/pioneer; MAKEOBJDIRPREFIX=/usr/obj MACHINE_ARCH=i386 MACHINE=i386 CPUTYPE= GROFF_BIN_PATH=/usr/obj/usr/src/tmp/legacy/usr/bin GROFF_FONT_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/groff_font GROFF_TMAC_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/tmac _SHLIBDIRPREFIX=/usr/obj/usr/src/tmp INSTALL="sh /usr/src/tools/install.sh" PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/usr/games:/usr/obj/usr/src/tmp/usr/sbin:/usr/obj/usr/src/tmp/usr/bin:/usr/obj/usr/src/tmp/usr/games:/sbin:/bin:/usr/sbin:/usr/bin make KERNEL=kernel all -DNO_MODULES_OBJ
cc -c -O -pipe -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -fformat-extensions -std=c99 -g -nostdinc -I- -I. -I/usr/src/sys -I/usr/src/sys/contrib/altq -I/usr/src/sys/contrib/ipfilter -I/usr/src/sys/contrib/pf -I/usr/src/sys/dev/ath -I/usr/src/sys/contrib/ngatm -I/usr/src/sys/dev/twa -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -ffreestanding -Werror /usr/src/sys/dev/em/if_em.c
/usr/src/sys/dev/em/if_em.c: In function `em_ioctl':
/usr/src/sys/dev/em/if_em.c:931: error: `em_poll' undeclared (first use in this function)
/usr/src/sys/dev/em/if_em.c:931: error: (Each undeclared identifier is reported only once
/usr/src/sys/dev/em/if_em.c:931: error: for each function it appears in.)
/usr/src/sys/dev/em/if_em.c: At top level:
/usr/src/sys/dev/em/if_em.c:1164: warning: 'em_poll' defined but not used
*** Error code 1

Stop in /usr/obj/usr/src/sys/pioneer.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
Should fiddling with the interrupt-coalescing stuff in the em driver via sysctl be tried?
None of the recent tests in reply to your email indicate any particular tx/rx threshold settings.

adrian

--
Adrian Chadd - adrian@freebsd.org
At 08:45 AM 11/12/2006, Adrian Chadd wrote:
>Should fiddling with the interrupt-coalescing stuff in the em driver
>via sysctl be tried?
>None of the recent tests in reply to your email indicate any
>particular tx/rx threshold settings.
>

I was using whatever is the default. What would you like me to test ?

# sysctl -A dev.em
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection Version - 6.2.9
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x105e subvendor=0x8086 subdevice=0x115e class=0x020000
dev.em.0.%parent: pci6
dev.em.0.debug_info: -1
dev.em.0.stats: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.1.%desc: Intel(R) PRO/1000 Network Connection Version - 6.2.9
dev.em.1.%driver: em
dev.em.1.%location: slot=0 function=1
dev.em.1.%pnpinfo: vendor=0x8086 device=0x105e subvendor=0x8086 subdevice=0x115e class=0x020000
dev.em.1.%parent: pci6
dev.em.1.debug_info: -1
dev.em.1.stats: -1
dev.em.1.rx_int_delay: 0
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 66
dev.em.1.tx_abs_int_delay: 66
dev.em.1.rx_processing_limit: 100

>adrian
>
>--
>Adrian Chadd - adrian@freebsd.org
>_______________________________________________
>freebsd-stable@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
At 11:05 PM 11/12/2006, Scott Long wrote:
>Mike Tancsa wrote:
>>However, if I turn on fastforwarding, its back to the old behavior
>>with it locking up. This was with the stock driver. I will try the
>>same test with
>>#define EM_FAST_INTR 1
>>as well as taking out the nfs option from the kernel
>>driver. Anything else to tune with ?
>> ---Mike
>
>Would you mind going so far as to adding NO_ADAPTIVE_MUTEXES? This is

Wow, it totally survives 2 streams (2 in one direction or 2 in opposite directions)... AND it almost can survive a 3rd! It does seem to be dropping packets on em1 when I add in a second stream blasting in the same direction, but it survives. You can see at ** I add in the second stream, and the outbound on em1 drops, even though more packets are going through. I stop the second stream, then add a stream going in the other direction.

Oddly enough, with a UP kernel it can't do 2 streams, just the one. I thought the extra locking on SMP would make it worse ? In this case it seems no.
        em0                 em1                 bge1
   Kbps in  Kbps out   Kbps in  Kbps out   Kbps in  Kbps out
      0.00      0.00      0.00      0.00      0.47      1.17
      0.00      0.00      0.00      0.00      0.47      1.17
      0.00      0.00      0.00      0.00      0.94      1.17
  184837.3      0.00      0.00  184634.5      0.47      1.17
  416856.2      0.00      0.00  416232.3      0.94      1.17
  414080.2      0.00      0.00  413829.7      0.47      1.17
  415656.3      0.00      0.00  415449.1      1.40      1.17
  413015.0      0.00      0.00  412519.6      0.94      1.17
  415094.8      0.00      0.00  414795.6      1.87      1.17
  415548.6      0.00      0.00  415323.7      0.47      1.17
  416629.6      0.00      0.00  416046.7      1.40      1.17
  417032.5      0.00      0.00  416854.1      0.47      1.17
  416606.9      0.00      0.00  416217.9      0.94      1.17
  591084.4      0.00      0.00  307308.9      0.47      1.17**
  692232.6      0.00      0.00  248915.3      1.41      2.34
  692837.6      0.00      0.00  248765.0      0.47      1.17
  690216.7      0.00      0.00  245400.2      1.26      1.59
  697517.9      0.00      0.00  245689.7      0.94      2.34
  697039.0      0.00      0.00  245553.8      1.40      3.09
  697424.1      0.00      0.00  244360.9      0.47      1.17
  692166.5      0.00      0.00  245316.4      0.94      1.17
  697102.3      0.00      0.00  244969.0      0.47      1.17
  560005.0      0.00      0.00  329704.8      0.94      3.09
  405189.2      0.00      0.00  404969.8      0.94      1.17
  417511.0      0.00      0.00  416902.5      0.47      1.17
  415029.0      0.00      0.00  414745.3      0.94      1.17
  411610.7  15798.46  49760.74  347870.8      0.94      1.17
  417520.1  55251.77  174393.0  187379.1      1.40      2.34
  414477.5  56203.05  174188.7  185302.4      0.47      1.17
        em0                 em1                 bge1
   Kbps in  Kbps out   Kbps in  Kbps out   Kbps in  Kbps out
  416985.7  55063.17  174285.5  186733.5      0.94      2.34
  415291.6  55358.65  174493.7  184466.1      1.41      5.01
  409410.6  57657.47  174494.3  186945.9      0.94      1.17
  417197.4  55271.79  173405.6  189667.3      0.94      3.09
  415310.8  55503.47  174283.4  186058.0      0.94      1.17
  417290.4  55983.99  174279.4  185274.7      0.94      4.59
  418032.6  55648.61  174424.4  186057.0      0.94      1.17
  416777.8  55283.73  174886.1  185034.6      0.47      1.17
  414412.4  33989.34  105397.4  276853.1      0.94      1.17
  416477.8      0.00      0.00  415984.6      0.47      1.17
  416656.7      0.00      0.00  416117.0      0.94      1.17
  416527.4      0.00      0.00  416372.2      0.47      1.17
  250260.8      0.00      0.00  250029.2      0.94      1.17
      0.00      0.00      0.00      0.00      0.47      1.17
      0.00      0.00      0.00      0.00      0.94      1.17
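One way to read the ifstat samples above is to compare what arrives on one em interface with what leaves on the other. The sketch below assumes that this ratio is a fair proxy for how much of the offered traffic the box forwards; `forwarded_fraction` is a hypothetical helper, and the sample values are copied from the table.

```python
def forwarded_fraction(kbps_in, kbps_out):
    """Fraction of inbound traffic that shows up on the outbound interface.

    Returns None when nothing is coming in, so idle samples are not
    reported as 0% forwarded.
    """
    if kbps_in == 0:
        return None
    return kbps_out / kbps_in


# (label, Kbps in on the receiving interface, Kbps out on the sending one)
samples = [
    ("one stream, em0 -> em1", 416856.2, 416232.3),
    ("two streams same direction, em0 -> em1", 692232.6, 248915.3),
    ("opposite streams, em0 -> em1", 417520.1, 187379.1),
    ("opposite streams, em1 -> em0", 174393.0, 55251.77),
]

for label, kbps_in, kbps_out in samples:
    frac = forwarded_fraction(kbps_in, kbps_out)
    print(f"{label:40s} {frac:6.1%}")
```

Read this way, a single stream passes through almost untouched, while the second same-direction stream pushes the forwarded share of em0's input well below half.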
Mike Tancsa wrote:
> At 11:05 PM 11/12/2006, Scott Long wrote:
>> Mike Tancsa wrote:
>>> However, if I turn on fastforwarding, its back to the old behavior
>>> with it locking up. This was with the stock driver. I will try the
>>> same test with
>>> #define EM_FAST_INTR 1
>>> as well as taking out the nfs option from the kernel driver.
>>> Anything else to tune with ?
>>> ---Mike
>>
>> Would you mind going so far as to adding NO_ADAPTIVE_MUTEXES? This is
>
> Wow, it totally survives 2 streams (2 in one direction or 2 in opposite
> directions)... AND it almost can survive a 3rd! It does seem to be
> dropping packets on em1 when I add in a second stream blasting in the
> same direction, but its survives. You can see at ** I add in the second
> stream, and the outbound on em1 drops, even though more packets are
> going through. I stop the second stream then add a stream going in the
> other direction. Oddly enough, with a UP kernel it cant do 2 streams,
> just the one. I thought the extra locking on SMP would make it worse ?
> In this case it seems no.
>

Is this with EM_INTR_FAST enabled also?

Scott
At 12:15 AM 11/13/2006, Scott Long wrote:
>Is this with EM_INTR_FAST enabled also?

Yes. Haven't done the stock case yet, but will do so later today.

---Mike
Mike Tancsa wrote:
> At 12:15 AM 11/13/2006, Scott Long wrote:
>
>> Is this with EM_INTR_FAST enabled also?
>
> Yes. Havent done the stock case yet, but will do so later today.

Do you have a comparison with Linux under the same circumstances?
At 12:15 AM 11/13/2006, Scott Long wrote:
>Is this with EM_INTR_FAST enabled also?

Without it, the 2 streams are definitely lossy on the management interface

---Mike
Mike Tancsa wrote:
> At 12:15 AM 11/13/2006, Scott Long wrote:
>
>> Is this with EM_INTR_FAST enabled also?
>
> Without it, the 2 streams are definitely lossy on the management interface
>
> ---Mike
>

Ok, and would you be able to test the polling options as well?

Scott
At 12:50 PM 11/13/2006, Ivan Voras wrote:
>Mike Tancsa wrote:
> > At 12:15 AM 11/13/2006, Scott Long wrote:
> >
> >> Is this with EM_INTR_FAST enabled also?
> >
> > Yes. Havent done the stock case yet, but will do so later today.
>
>Do you have a comparison with Linux under the same circumstances?

I had a disk with 64bit already installed. I will try with 32bit tomorrow. I can also try FreeBSD AMD64 on the box to see how it does.

ifstat gives a bit of an odd output, but it's the same sort of pattern, where adding a second stream in the same direction slows down the first one. On the box R2

[root@amd64 ifstat-1.1]# ifstat -b
        eth0                eth1                eth3                eth4
   Kbps in  Kbps out   Kbps in  Kbps out   Kbps in  Kbps out   Kbps in  Kbps out
      0.00      0.00      0.00      0.00      0.00      0.00      4.89      3.74
      0.00      0.00      0.00      0.00      0.00      0.00      0.50      1.45
      0.00      0.00      0.00      0.00      0.00      0.00      1.00      1.45
  160965.0      0.00      0.00      0.00      0.00      0.00      0.83      1.95
      0.00      0.00      0.00  272056.4      0.00      0.00      1.00      1.45
  393994.2      0.00      0.00      0.00      0.00      0.00      5.47      1.45
      0.00      0.00      0.00  393543.7      0.00      0.00      4.25      1.45
  392911.0      0.00      0.00      0.00      0.00      0.00      2.50      1.45
      0.00      0.00      0.50  392756.4      0.00      0.00      1.25      1.45
  392626.7      0.00      0.00      0.00      0.00      0.00      1.75      1.45
      0.00      0.00      0.00  393233.9      0.00      0.00      6.44      1.45
  424068.1      0.00      0.00      0.00      0.00      0.00      1.74      1.45**
      0.00      0.00      0.00  460503.1      0.00      0.00      2.72      1.45
  509218.1      0.00      0.00      0.00      0.00      0.00      0.99      1.45
      0.00      0.00      0.00  507800.4      0.00      0.00      0.50      1.45
  502649.5      0.00      0.00      0.00      0.00      0.00      1.00      1.45
      0.00      0.00      0.50  507537.1      0.00      0.00      0.50      1.46
  519717.9      0.00      0.00      0.00      0.00      0.00      1.00      1.45
      0.00      0.00      0.00  525973.4      0.00      0.00      0.50      1.46
  520609.0      0.00      0.00      0.00      0.00      0.00      1.00      1.45
      0.00      0.00      0.00  517888.6      0.00      0.00      0.50      1.45
  525957.3      0.00      0.00      0.00      0.00      0.00      1.00      1.46
      0.00      0.00      0.00  524119.9      0.00      0.00      0.50      1.45
  522671.1      0.00      0.00      0.00      0.00      0.00      0.99      1.44
      0.00      0.00      0.00  494008.7      0.00      0.00      0.50      1.45
  390666.3      0.00      0.00      0.00      0.00      0.00      1.00      1.45
      0.00      0.00      0.00  273779.6      0.00      0.00      0.50      1.45
      0.00      0.00      0.00      0.00      0.00      0.00      1.00      1.45
      0.00      0.00      0.00      0.00      0.00      0.00      0.50      1.45
[root@amd64 ifstat-1.1]#

I added the second stream, going in the same direction, at **

On one of the targets running netreceive you can see the impact.

[tyan-1u]# ifstat -b
        rl0                 bge0
   Kbps in  Kbps out   Kbps in  Kbps out
      0.94      1.42  182716.2      0.00
      0.47      1.05  182299.5      0.00
      0.94      1.05  182493.4      0.33
      0.94      2.09  182588.7      0.00
      0.94      1.05  181959.8      0.00
      0.47      1.05  104949.7      0.00
      0.94      1.05  95674.27      0.00
      0.47      1.05  95930.79      0.00
      0.94      1.05  98329.93      0.00
      0.94      1.05  97940.21      0.00
      0.94      1.05  100636.9      0.00
      0.47      1.05  99879.34      0.00
^C
[tyan-1u]#

When the packets are bi-directional, the impact is not as great in LINUX as it is on FreeBSD

[root@amd64 ifstat-1.1]# ifstat -b
        eth0                eth1                eth3                eth4
   Kbps in  Kbps out   Kbps in  Kbps out   Kbps in  Kbps out   Kbps in  Kbps out
      0.00      0.00      0.00      0.00      0.00      0.00      3.65     10.81
      0.00      0.00      0.00      0.00      0.00      0.00      0.50      1.45
      0.00      0.00      0.00      0.00      0.00      0.00      0.83      1.95
      0.00      0.00      0.00      0.00      0.00      0.00      1.50      8.03
      0.00      0.00      0.00      0.00      0.00      0.00      0.50      1.45
      0.00      0.00      0.00      0.00      0.00      0.00      1.00      1.45
      0.00  230009.2      0.00      0.00      0.00      0.00      2.83     51.22
      0.00      0.00  334969.3      0.00      0.00      0.00      1.00      1.45
      0.00  369184.5      0.00      0.00      0.00      0.00      0.50      1.45
      0.00      0.00  369294.2      0.00      0.00      0.00      3.33     51.10
      0.00  367348.7      0.00      0.00      0.00      0.00      0.50      1.45
      0.00      0.00  367185.5      0.00      0.00      0.00      1.00      1.45
   2541.17  368707.6      0.00      0.00      0.00      0.00      2.82     51.12
      0.00      0.00  363265.6  95798.38      0.00      0.00      0.99      1.44
  330239.4  357706.3      0.00      0.00      0.00      0.00      0.50      1.45
      0.00      0.00  354181.1  326599.7      0.00      0.00      4.11     51.17
  328691.7  356129.1      0.00      0.00      0.00      0.00      0.50      1.44
      0.00      0.00  358321.6  330567.1      0.00      0.00      1.50      1.45
  329516.7  342389.2      0.00      0.00      0.00      0.00      0.99     14.99
      0.00      0.00  334539.9  330647.5      0.00      0.00      0.99      1.44
  330982.0  326772.6      0.00      0.00      0.00      0.00      0.50      1.44
      0.00      0.00  329472.7  333109.3      0.00      0.00      2.32     14.45
  324457.4  327537.4      0.00      0.00      0.00      0.00      0.50      1.44
      0.00      0.00  329367.2  317784.0      0.00      0.00      0.99      1.44
  308120.8  333789.8      0.00      0.00      0.00      0.00      1.80     20.78
      0.00      0.00  331200.2  316116.3      0.00      0.00      1.00      1.45
  370504.6  88001.99      0.00      0.00      0.00      0.00      0.50      1.44
      0.00      0.00      0.50  392417.6      0.00      0.00      2.82     21.76
  394057.2      0.00      0.00      0.00      0.00      0.00      0.83      1.95
      0.00      0.00      0.00  394048.2      0.00      0.00      1.00      1.45
  394306.3      0.00      0.00      0.00      0.00      0.00      3.66     52.56
      0.00      0.00      0.00  393960.8      0.00      0.00      1.00      1.45
  373321.8      0.00      0.00      0.00      0.00      0.00      0.50      1.45
      0.00      0.00      0.00  261093.7      0.00      0.00      2.33      9.66
      0.00      0.00      0.00      0.00      0.00      0.00      0.50      1.45
      0.00      0.00      0.00      0.00      0.00      0.00      0.50      1.45

The box is totally responsive throughout with no packet loss on the management interface.... However, it seems quite a bit slower than FreeBSD when it's tweaked with ADAPTIVE_GIANT removed... But again, this is 64bit so not quite apples to apples yet. Also, I need to check the default driver config to see if their NAPI or whatever it's called is enabled.

More tests to come.

---Mike
Mike Tancsa wrote:
> At 12:50 PM 11/13/2006, Ivan Voras wrote:
>> Mike Tancsa wrote:
>> > At 12:15 AM 11/13/2006, Scott Long wrote:
>> >
>> >> Is this with EM_INTR_FAST enabled also?
>> >
>> > Yes. Havent done the stock case yet, but will do so later today.
>>
>> Do you have a comparison with Linux under the same circumstances?
>
> I had a disk with 64bit already installed. I will try with 32bit
> tomorrow. I can also try FreeBSD AMD64 on the box to see how it does.
>
> ifstat gives a bit of an odd output, but its the same sort of pattern
> where adding a second stream in the same direction, slows down the first
> one. On the box R2...
>
> The box is totally responsive throughout with no packet loss on the
> management interface.... However, it seems quite a bit slower than
> FreeBSD when its tweaked with ADAPTIVE_GIANT removed... But again, this
> is 64bit so not quite apples to apples yet. Also, I need to check the
> default driver config to see if their NAPI or whatever its called is
> enabled. More tests to come.
>
> ---Mike

More excellent data, thanks. I have some changes on the drawing board that should significantly improve forwarding and bridging in the em driver. Do you have a limit on how much more time you can spend on these tests? It might be a week or more before I have anything that can be tested.

Scott
At 04:30 PM 11/13/2006, Scott Long wrote:
>Mike Tancsa wrote:
>>At 12:15 AM 11/13/2006, Scott Long wrote:
>>
>>>Is this with EM_INTR_FAST enabled also?
>>
>>Without it, the 2 streams are definitely lossy on the management interface
>>
>> ---Mike
>
>Ok, and would you be able to test the polling options as well?

Here are some more results. I am still going through testing with firewall rules as well as testing with the size of the routing table. Should get through that tomorrow. Again, this is the same setup as described at http://www.tancsa.com/blast.jpg

Note about platforms. The HEAD w Patch is a patch glebius@freebsd.org asked me to test. FastFWD is with net.inet.ip.fastforwarding on. Also, with FastFWD set to one, I always used kernels with the option ADAPTIVE_GIANT commented out and NO_ADAPTIVE_MUTEXES added. INET6 was removed from all kernels as well. With these kernel changes, and fast forwarding on, I was able to keep the box r2 responsive from the console while blasting packets across its 2 interfaces. Otherwise, the box seemingly livelocked.

For the linux kernel config, it was pretty well the default, except I removed INET6, IPSEC and disabled iptables. The LINUX kernel was 2.6.18.2 on FC5.

The first test is with UDP netperf.
/usr/local/netperf/netperf -l 60 -H 192.168.44.1 -i 10,2 -I 99,10 -t UDP_STREAM -- -m 10 -s 32768 -S 32768
/usr/local/netperf/netperf -l 60 -H 192.168.44.1 -i 10,2 -I 99,10 -t UDP_STREAM -- -m 64 -s 32768 -S 32768
/usr/local/netperf/netperf -l 60 -H 192.168.44.1 -i 10,2 -I 99,10 -t UDP_STREAM -- -m 128 -s 32768 -S 32768
/usr/local/netperf/netperf -l 60 -H 192.168.44.1 -i 10,2 -I 99,10 -t UDP_STREAM -- -m 200 -s 32768 -S 32768

Not much difference

UDP STREAM TEST
Platform (columns are the -m message size)   10      64      128     200
Linux 2.6.18.2 NAPI                          46.79   297.65  531.00  706.00
FreeBSD HEAD                                 46.75   297.82  530.70  728.01
RELENG6 i386                                 46.70   296.32  529.12  721.80
RELENG6 i386 FastFWD                         46.37   295.88  529.72  722.02
FreeBSD HEAD w Patch                         46.39   293.78  529.41  728.17
FreeBSD HEAD w Patch FastFWD                 46.52   295.71  529.81  718.32
AMD64 RELENG6 w FastFWD                      46.27   295.85  529.44  721.96

Next test was one box blasting packets across using netrate, as measured at the receiving end of the blast, i.e. what made it through the 2 interfaces on R2. I would sample the rate for 10 seconds and then record the average. The values were pretty tight with little variation. LINUX was faster, but the difference between it and the top values for FreeBSD is uninteresting.

Straight Routing test
One Stream                          pps
Linux                               581,309.81
FreeBSD HEAD                        441,559.50
RELENG6 i386                        407,403.00
RELENG6 i386 FastFWD                557,589.25
FreeBSD HEAD w Patch                422,294.13
FreeBSD HEAD w Patch FastFWD        567,290.00
AMD64 RELENG6 w FastFWD             574,591.88
AMD64 RELENG6 polling               285,917.13
AMD64 RELENG6 polling FastFWD       512,042.00
RELENG6 i386 polling FastFWD        558,603.00

The differences between LINUX and FreeBSD were a bit larger in this test.

Straight Routing test
2 streams opposite direction        pps
Linux                               473,814
FreeBSD HEAD                        204,043
RELENG6 i386                        165,461
RELENG6 i386 FastFWD                368,967
FreeBSD HEAD w Patch                127,832
FreeBSD HEAD w Patch FastFWD        346,220
AMD64 RELENG6 w Polling             155,659
AMD64 RELENG6 w Polling FastFWD     231,541

More data to come....

---Mike
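To compare platforms at a glance, the packet rates above can be expressed relative to the Linux figure. This is a rough sketch using a subset of the reported numbers; `relative_to` is an illustrative helper name, and the choice of Linux as the baseline is only for convenience.

```python
# Packets per second copied from the straight routing tests above.
one_stream_pps = {
    "Linux": 581309.81,
    "FreeBSD HEAD": 441559.50,
    "RELENG6 i386": 407403.00,
    "RELENG6 i386 FastFWD": 557589.25,
    "AMD64 RELENG6 w FastFWD": 574591.88,
}
two_stream_pps = {
    "Linux": 473814,
    "FreeBSD HEAD": 204043,
    "RELENG6 i386": 165461,
    "RELENG6 i386 FastFWD": 368967,
}


def relative_to(table, baseline="Linux"):
    """Return each platform's rate as a fraction of the baseline's rate."""
    base = table[baseline]
    return {name: pps / base for name, pps in table.items()}


for title, table in (("one stream", one_stream_pps),
                     ("two streams, opposite direction", two_stream_pps)):
    print(title)
    for name, frac in relative_to(table).items():
        print(f"  {name:26s} {frac:6.1%}")
```

With one stream, the FastFWD configurations land within a few percent of Linux; with two opposing streams the gap widens noticeably, which matches the remarks in the message.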