thr3ads.net - crossbow discuss - [crossbow-discuss] truly awful GLDv3 performance [Jan 2009]

Paul Durrant

2009-Jan-13 12:34 UTC

[crossbow-discuss] truly awful GLDv3 performance

I just thought I'd give my GLDv3 driver a simple netperf test now that
crossbow has integrated, and I find that, whereas I could achieve 9.3Gbps
before, I can now only get ~3Gbps with the same driver code (barring
necessary interface changes for crossbow) and the same hardware.
The main crux of the problem, I think, is the sheer quantity of
processing going on per received packet. This theory is supported by the
fact that, when I turn on LRO, I get 8.6Gbps for the same test with
the same CPU bindings.
Is there any way to turn off crossbow's huge bump-in-the-stack, since I
have no vnics and therefore am not remotely interested in flow
classification or resource control?

   Paul
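
A rough way to see why per-packet cost dominates here is to turn the quoted
rates into a per-packet time budget. The sketch below assumes roughly
1500 bytes per packet and a single CPU handling every received packet;
neither figure is stated in the thread, and the function names are made up
for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Packets per second needed to sustain a given throughput.
 * bytes_per_packet is an assumption; the thread does not state a size.
 */
static double packets_per_second(double gbps, double bytes_per_packet)
{
    assert(bytes_per_packet > 0.0);
    return (gbps * 1e9) / (bytes_per_packet * 8.0);
}

/* Time available per packet, in nanoseconds, if one CPU sees every packet. */
static double ns_per_packet(double gbps, double bytes_per_packet)
{
    return 1e9 / packets_per_second(gbps, bytes_per_packet);
}

/* Print the budget for the three rates quoted in the message above. */
static void print_budget(double bytes_per_packet)
{
    const double rates[] = { 9.3, 8.6, 3.0 };

    for (size_t i = 0; i < sizeof (rates) / sizeof (rates[0]); i++) {
        printf("%.1f Gbps: %.0f packets/s, %.0f ns per packet\n",
            rates[i],
            packets_per_second(rates[i], bytes_per_packet),
            ns_per_packet(rates[i], bytes_per_packet));
    }
}
```

Calling print_budget(1500.0) shows the budget growing from about 1.3
microseconds per packet at 9.3Gbps to 4 microseconds at 3Gbps. Under these
assumptions that is consistent with the stack spending several times longer
on each packet, and with LRO helping because it hands fewer, larger packets
up the stack.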

Sunay Tripathi

2009-Jan-13 19:00 UTC

Paul,

It's due to the fact that so far the focus has been virtualization and
1 gigE NICs without multiple Rx rings etc. So even for 10gigE
NICs that have multiple Rx/Tx rings, we do try to scale them using
S/W fanout, which might be the issue you are facing. We are working on
the problem right now. In the meantime, disable some of the S/W
scaling and you should get some of your performance back. Do this
in /etc/system:

set mac:mac_soft_ring_enable=0
set mac:mac_rx_soft_ring_count=0
set mac:mac_rx_soft_ring_10gig_count=0

You will have to reboot the machine after setting this. BTW, this is
only a partial workaround and in no way a supported feature. The better
solution is being worked on.

Cheers,
Sunay

-- 
Sunay Tripathi
Distinguished Engineer
Solaris Core Operating System
Sun MicroSystems Inc.

Solaris Networking:     http://www.opensolaris.org/os/community/networking
Project Crossbow:       http://www.opensolaris.org/os/project/crossbow

Sunay Tripathi

2009-Jan-14 01:52 UTC

Paul,

I forgot to make one thing clear though - for your kind of NIC
(which has multiple Rx/Tx rings) a partial port to Crossbow
is going to be very harmful. Keep in mind, that unless crossbow
sees all the Rx/Tx rings, it will not use them. Also, dynamic
interrupt blanking has been removed (don't need heuristic based
approach) and replaced by dynamic polling. So unless you expose
interfaces for dynamic polling, you will see huge differences.

So do enable dynamic polling and then do the things I mentioned
below to give us some data.

Thanks,
Sunay

Paul Durrant

2009-Jan-14 11:08 UTC

Sunay Tripathi wrote:
>
> I forgot to make one thing clear though - for your kind of NIC
> (which has multiple Rx/Tx rings) a partial port to Crossbow
> is going to be very harmful. Keep in mind, that unless crossbow
> sees all the Rx/Tx rings, it will not use them. Also, dynamic
> interrupt blanking has been removed (don't need heuristic based
> approach) and replaced by dynamic polling. So unless you expose
> interfaces for dynamic polling, you will see huge differences.
> 
> So do enable dynamic polling and then do the things I mentioned
> below to give us some data.
> 
Sunay,

   Thanks for the info. From looking at the new API I don't believe I
can expose my multiple RX/TX rings to crossbow because my traffic
steering algorithm is fixed (it's an LFSR hash based on TCP/IP headers)
and the current h/w does not support multiple MAC addresses (although I
could do this in s/w of course). Is there a way I can take advantage of
the polling API without needing to claim levels of virtualization that
the h/w does not support?

   Paul

PS: In general I don't think it's a good idea to assume that h/w that
can steer traffic based on TCP/IP headers can also steer based on MAC
address/VLAN tag.
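
For readers unfamiliar with this kind of fixed steering, here is a minimal
sketch of hashing TCP/IP header fields with an LFSR and using the result to
pick an RX ring. It is not the algorithm of the hardware discussed here: the
flow_key_t layout, the function names, and the choice of a CRC-16 style LFSR
(polynomial 0x1021, initial value 0xFFFF) are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key; real hardware hashes selected TCP/IP header fields. */
typedef struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
} flow_key_t;

/*
 * Feed the input through a 16-bit LFSR one bit at a time. This is the
 * bitwise CRC-16 construction; the polynomial is an illustrative choice.
 */
static uint16_t lfsr_hash(const uint8_t *buf, size_t len)
{
    uint16_t reg = 0xFFFF;

    for (size_t i = 0; i < len; i++) {
        reg ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (reg & 0x8000)
                reg = (uint16_t)((reg << 1) ^ 0x1021);
            else
                reg = (uint16_t)(reg << 1);
        }
    }
    return reg;
}

/*
 * Map a flow to a ring. The key is serialized field by field so that
 * struct padding and host byte order never influence the hash.
 */
static unsigned int select_rx_ring(const flow_key_t *key, unsigned int nrings)
{
    uint8_t buf[12];

    assert(key != NULL && nrings > 0);
    buf[0] = (uint8_t)(key->src_ip >> 24);
    buf[1] = (uint8_t)(key->src_ip >> 16);
    buf[2] = (uint8_t)(key->src_ip >> 8);
    buf[3] = (uint8_t)key->src_ip;
    buf[4] = (uint8_t)(key->dst_ip >> 24);
    buf[5] = (uint8_t)(key->dst_ip >> 16);
    buf[6] = (uint8_t)(key->dst_ip >> 8);
    buf[7] = (uint8_t)key->dst_ip;
    buf[8] = (uint8_t)(key->src_port >> 8);
    buf[9] = (uint8_t)key->src_port;
    buf[10] = (uint8_t)(key->dst_port >> 8);
    buf[11] = (uint8_t)key->dst_port;

    return lfsr_hash(buf, sizeof (buf)) % nrings;
}
```

Every packet of a given connection lands on the same ring, but nothing in the
input depends on the destination MAC address or VLAN tag, which is why this
kind of steering cannot stand in for L2 classification.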

Nicolas Droux

2009-Jan-14 17:42 UTC

Paul,

You can simply expose one RX hardware ring group with multiple RX rings 
inside that group. Then you do the steering in hardware (RSS) to the 
multiple RX rings as you did before. This is how our model maps to your 
type of NIC.

If someone then creates multiple VNICs on your NIC, we'll do L2
classification to these VNICs in software in mac; you don't have to do
that yourself.

To recap, there are roughly three combinations of L2 hardware
classification and RSS that we support:

1 - Some NICs can support multiple hardware groups with single rings per 
group, where they do L2 hardware classification between the groups.

2 - Some NICs can support only one hardware group with multiple rings 
and RSS between these rings, but no L2 hardware classification, which is 
the case of your hardware.

3 - Some other NICs can do both at the same time, multiple hardware 
groups with L2 hardware classification between the groups, then RSS 
between multiple RX rings in each group.

Some NICs can do 1 or 2 but not a combination of both at the same time.

What might be confusing is the name of the flag "MAC_VIRT_LEVEL1", which
you need to raise in your driver in this case, even though you support only
one hardware group and no layer 2 hardware classification assist for
virtualization.

Nicolas.
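
The three combinations in this message can be summarized as a small decision
over how many RX groups a NIC exposes and how many rings each group holds.
The sketch below is only a restatement of that list; the enum, the function
names, and the extra "plain" case for a single group with a single ring are
assumptions, not part of the mac API.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum rx_model {
    RX_MODEL_PLAIN = 0,          /* one group, one ring (not in the list) */
    RX_MODEL_GROUPS_ONLY = 1,    /* case 1: L2 classification between groups */
    RX_MODEL_RSS_ONLY = 2,       /* case 2: one group, RSS between its rings */
    RX_MODEL_GROUPS_AND_RSS = 3  /* case 3: both at the same time */
} rx_model_t;

/* Pick the case from the number of groups and the rings inside each group. */
static rx_model_t rx_model(unsigned int ngroups, unsigned int rings_per_group)
{
    bool multiple_groups;
    bool multiple_rings;

    assert(ngroups > 0 && rings_per_group > 0);
    multiple_groups = ngroups > 1;
    multiple_rings = rings_per_group > 1;

    if (multiple_groups && multiple_rings)
        return RX_MODEL_GROUPS_AND_RSS;
    if (multiple_groups)
        return RX_MODEL_GROUPS_ONLY;
    if (multiple_rings)
        return RX_MODEL_RSS_ONLY;
    return RX_MODEL_PLAIN;
}

/* True when the hardware itself separates traffic by L2 address. */
static bool hw_l2_classification(rx_model_t model)
{
    return model == RX_MODEL_GROUPS_ONLY || model == RX_MODEL_GROUPS_AND_RSS;
}
```

A NIC like the one discussed in this thread, with one group and several
RSS-steered rings, comes out as RX_MODEL_RSS_ONLY, so any L2 classification
between VNICs is left to software.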

Paul Durrant

2009-Jan-14 18:01 UTC

Nicolas Droux wrote:
[snip]

Nicolas,

   Thanks for the detailed explanation. I'll update my driver and try
again.

> What might be confusing is the name of the flag "MAC_VIRT_LEVEL1" which
> you need to raise in your driver in this case, although you support only
> one hardware group but no layer 2 hardware classification assist for
> virtualization.

   That was what was throwing me; it looked like I could not claim
MAC_VIRT_LEVEL1 without being able to steer traffic to rings based on
MAC address.

   Paul

Paul Durrant

2009-Jan-19 12:43 UTC

Nicolas Droux wrote:
>
> You can simply expose one RX hardware ring group with multiple RX rings 
> inside that group. Then you do the steering in hardware (RSS) to the 
> multiple RX rings as you did before. This is how our model maps to your 
> type of NIC.
> 
Nicolas,

   I'm trying to code this up now and I'm confused as to how I set up my
single group. I've opted for MAC_GROUP_TYPE_STATIC (which I think is
correct) but in my mr_gget() method I apparently need to set up
mgi_addmac() and mgi_remmac() entry points (looking at the code, it
doesn't seem possible to leave these NULL); how do I implement these given
that my h/w only supports a single MAC address and thus I do not steer
traffic based on MAC address? Do I implement them and just fail any call
to mgi_addmac()?

   Paul

Nicolas Droux

2009-Jan-20 23:16 UTC

Paul,

On Jan 19, 2009, at 5:43 AM, Paul Durrant wrote:
> Nicolas Droux wrote:
>>
>> You can simply expose one RX hardware ring group with multiple RX rings
>> inside that group. Then you do the steering in hardware (RSS) to the
>> multiple RX rings as you did before. This is how our model maps to your
>> type of NIC.
>>
>
> Nicolas,
>
>   I'm trying to code this up now and I'm confused as to how I set up my
> single group. I've opted for MAC_GROUP_TYPE_STATIC (which I think is
> correct) but in my mr_gget() method I apparently need to set up
> mgi_addmac() and mgi_remmac() entry points (looking at the code, it
> doesn't seem possible to leave these NULL); how do I implement these given
> that my h/w only supports a single MAC address and thus I do not steer
> traffic based on MAC address? Do I implement them and just fail any call
> to mgi_addmac()?

For drivers that support ring groups, all unicast MAC addresses,  
including the primary MAC address, are always programmed through the  
addmac/remmac entry points of the rings capability. Your single  
address will be programmed through these entry points on your ring  
group instead of the mc_unicst entry point.

Nicolas.
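
One way to read this answer for hardware with a single unicast filter is:
accept the first address handed to addmac (that will be the primary address)
and refuse any further one, rather than failing every call. The sketch below
models that with stand-in types so it can be compiled outside the kernel. The
sketch_dev_t structure, the function names, and the ENOSPC/EINVAL return
values are assumptions; the (void *, const uint8_t *) argument shape is
modeled on the addmac/remmac entry points as the editor recalls them and
should be checked against the mac headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETHERADDRL 6

/* Hypothetical per-device state: the h/w filters on exactly one address. */
typedef struct sketch_dev {
    uint8_t sd_addr[SKETCH_ETHERADDRL];
    bool    sd_addr_set;
} sketch_dev_t;

/* addmac: take the single slot if it is free, otherwise report no space. */
static int sketch_group_addmac(void *arg, const uint8_t *mac_addr)
{
    sketch_dev_t *sd = arg;

    assert(sd != NULL && mac_addr != NULL);
    if (sd->sd_addr_set)
        return ENOSPC;

    memcpy(sd->sd_addr, mac_addr, SKETCH_ETHERADDRL);
    sd->sd_addr_set = true;
    /* A real driver would program the hardware unicast filter here. */
    return 0;
}

/* remmac: release the slot only if the address matches the stored one. */
static int sketch_group_remmac(void *arg, const uint8_t *mac_addr)
{
    sketch_dev_t *sd = arg;

    assert(sd != NULL && mac_addr != NULL);
    if (!sd->sd_addr_set ||
        memcmp(sd->sd_addr, mac_addr, SKETCH_ETHERADDRL) != 0)
        return EINVAL;

    sd->sd_addr_set = false;
    /* A real driver would clear the hardware unicast filter here. */
    return 0;
}
```

With this shape the primary address still reaches the hardware through the
group entry points, and a request for a second address fails cleanly instead
of being silently accepted.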
-- 
Nicolas Droux - Solaris Kernel Networking - Sun Microsystems, Inc.
droux at sun.com - http://blogs.sun.com/droux

Paul Durrant

2009-Jan-21 09:38 UTC

Nicolas Droux wrote:
>>
>>   I'm trying to code this up now and I'm confused as to how I set up my
>> single group. I've opted for MAC_GROUP_TYPE_STATIC (which I think is
>> correct) but in my mr_gget() method I apparently need to set up
>> mgi_addmac() and mgi_remmac() entry points (looking at the code, it
>> doesn't seem possible to leave these NULL); how do I implement these given
>> that my h/w only supports a single MAC address and thus I do not steer
>> traffic based on MAC address? Do I implement them and just fail any call
>> to mgi_addmac()?
>
> For drivers that support ring groups, all unicast MAC addresses,
> including the primary MAC address, are always programmed through the
> addmac/remmac entry points of the rings capability. Your single address
> will be programmed through these entry points on your ring group instead
> of the mc_unicst entry point.
>
Thanks. I just found the bit of code that fails mac_register() if 
mc_unicst is set... that was the cause of my mac_register() panic. I 
seem to be up and running now; just need to implement the addmac/remmac 
entry points properly and then I can do some more performance runs.

   Cheers,

     Paul
