Krishna Yenduri
2010-Mar-10 19:14 UTC
[crossbow-discuss] Fwd: Re: [osol-code] GLDv3 NIC driver Performance on sparc
-------- Original Message --------
Subject: Re: [osol-code] GLDv3 NIC driver Performance on sparc
Date: Wed, 10 Mar 2010 11:10:12 -0800
From: Garrett D'Amore <garrett at damore.org>
To: opensolaris-code at opensolaris.org

On 03/10/10 10:48 AM, Mahesh wrote:
> Hi all,
>
> I need some help in debugging a GLDv3 driver performance issue on sparc. The driver has a single Tx queue and 4 Rx queues and performs at almost line rate (10G) on Sun Intel boxes, but the same driver performs very badly on sparc. The code is identical for sparc and Intel except for the byte swapping involved, since the hardware is little endian. Any idea how to debug this issue?
> The machine I tried to benchmark is a T5440, and I have tried setting ip_soft_rings_count = 16 on it, but the result is the same.

What is "badly"?

Note that T5440 hardware uses individual cores which are probably quite a bit slower than an x86 core. Additionally, there could be resource contention (caches, etc.) due to the different Niagara architecture here.

Note also that I've been told that "bcopy" performs a bit slower on Niagara than on other SPARC or x86 architectures -- are you using bcopy to copy packet data, or are you using direct DMA? (Also, unless you take care, DMA setup and teardown on SPARC systems -- which use an IOMMU -- is quite expensive. In order to get good performance with direct DMA you really have to use loan-up or something like it. It's tricky to get this right.)

Some other questions: what size MTU are you using? Are you sure that you're hitting each of your 4 RX rings basically "equally" by using different streams and making sure that traffic from a single stream stays on the same h/w ring?

Is there a significant difference between TX and RX performance?

- Garrett

> Thanks
> Mahesh
Mahesh.Vardhamanaiah at Emulex.Com
2010-Mar-13 05:29 UTC
[crossbow-discuss] Fwd: Re: [osol-code] GLDv3 NIC driver Performance on sparc
On 03/10/10, Garrett D'Amore wrote:
> ... are you using bcopy to copy packet data, or are you using direct DMA?
> Is there a significant difference between TX and RX performance?

Tx: we use bcopy mode if the packet/fragment size is less than 512 bytes, and direct DMA for other sizes.

Rx: we use bcopy if the packet size is less than 128 bytes, and the preallocated, premapped driver buffer pool for other packet sizes.

There is a significant difference between Tx and Rx: Tx is normally 6+ G but Rx is only 2+ G.

Do you think using DVMA will help here?

-Mahesh
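The split Mahesh describes (bcopy below a size threshold, DMA binding above it) can be sketched as a plain decision function. The thresholds are the ones from the message; the function names are hypothetical illustrations, not Emulex driver code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Thresholds from the message: Tx copies below 512 bytes, Rx below 128. */
#define TX_BCOPY_THRESHOLD 512
#define RX_BCOPY_THRESHOLD 128

/*
 * Decide whether a Tx fragment should be bcopy'd into a preallocated,
 * premapped buffer (cheap for small packets) or DMA-bound directly
 * (avoids the copy, but pays per-packet IOMMU map/unmap cost).
 */
static bool
tx_use_bcopy(size_t frag_len)
{
	return (frag_len < TX_BCOPY_THRESHOLD);
}

/* Same decision on the Rx side, with the smaller 128-byte cutoff. */
static bool
rx_use_bcopy(size_t pkt_len)
{
	return (pkt_len < RX_BCOPY_THRESHOLD);
}
```

The asymmetry in the cutoffs (512 on Tx, 128 on Rx) is one place to look when Tx and Rx throughput differ this much.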
Garrett D'Amore
2010-Mar-13 07:39 UTC
[crossbow-discuss] Fwd: Re: [osol-code] GLDv3 NIC driver Performance on sparc
On 03/12/10 09:29 PM, Mahesh.Vardhamanaiah at Emulex.Com wrote:
> Tx: we use bcopy mode if the packet/fragment size is less than 512 bytes, and direct DMA for other sizes.
> Rx: we use bcopy if the packet size is less than 128 bytes, and the preallocated, premapped driver buffer pool for other packet sizes.
>
> There is a significant difference between Tx and Rx: Tx is normally 6+ G but Rx is only 2+ G.
>
> Do you think using DVMA will help here?

I suspect your tradeoff at 128 bytes is too small. I'd bcopy all the way up to ~1K, maybe even up to full 1500 byte frames.

- Garrett
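Garrett's point, that the copy threshold should rise when per-packet DMA setup is expensive, follows from a simple break-even model: bcopy cost grows with length, while an IOMMU map/unmap is a roughly fixed per-packet overhead. The model and its cost inputs below are invented placeholders for illustration, not measured Niagara numbers:

```c
#include <stddef.h>

/*
 * Break-even model (illustrative): bcopy costs roughly
 * len * copy_cost_per_byte, while a direct DMA bind costs a roughly
 * fixed map + unmap overhead through the IOMMU. Copying wins for all
 * lengths below the break-even point.
 */
static size_t
bcopy_breakeven(size_t dma_setup_cost_ns, size_t copy_cost_ns_per_byte)
{
	/* Lengths below this are cheaper to copy than to DMA-bind. */
	return (dma_setup_cost_ns / copy_cost_ns_per_byte);
}
```

If the fixed bind cost is equivalent to copying ~1500 bytes, then every standard-MTU frame is cheaper to copy, which is consistent with bcopying "up to full 1500 byte frames" on hardware with expensive DMA setup.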
rajagopal kunhappan
2010-Mar-14 21:41 UTC
[crossbow-discuss] Fwd: Re: [osol-code] GLDv3 NIC driver Performance on sparc
On 03/12/10 21:29, Mahesh.Vardhamanaiah at Emulex.Com wrote:
> Tx: we use bcopy mode if the packet/fragment size is less than 512 bytes, and direct DMA for other sizes.
> Rx: we use bcopy if the packet size is less than 128 bytes, and the preallocated, premapped driver buffer pool for other packet sizes.
>
> There is a significant difference between Tx and Rx: Tx is normally 6+ G but Rx is only 2+ G.

So you have 1 Tx ring and 4 Rx rings. Have you ported your driver to the Crossbow framework? If not, the Rx/Tx rings won't be exposed to the mac layer, and the Rx side won't be able to take part in Crossbow features like polling. The Rx/Tx ring capability is exposed via the MAC_CAPAB_RINGS capability.

Also, the ip_soft_rings_cnt tunable you mention is no longer available in OpenSolaris.

-krgopi
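Advertising the rings krgopi mentions happens in the driver's getcapab entry point, which fills in a rings capability structure when asked for MAC_CAPAB_RINGS. The sketch below uses stand-in type definitions so it compiles on its own; a real driver gets these from <sys/mac_provider.h>, and the field names here are illustrative and should be checked against the actual OpenSolaris/illumos headers:

```c
#include <stdbool.h>

/*
 * Stand-in definitions mirroring the shape of the MAC_CAPAB_RINGS
 * capability. In a real GLDv3 driver these come from
 * <sys/mac_provider.h>; the names below are illustrative stubs.
 */
typedef enum { MAC_RING_TYPE_RX, MAC_RING_TYPE_TX } mac_ring_type_t;

typedef struct {
	mac_ring_type_t	mr_type;	/* RX or TX rings being described */
	unsigned int	mr_rnum;	/* number of rings of this type */
	unsigned int	mr_gnum;	/* number of ring groups */
} mac_capab_rings_t;

#define	MAC_CAPAB_RINGS	1

/*
 * Sketch of a getcapab-style handler: advertise 4 Rx rings and 1 Tx
 * ring (the counts from this thread) so the mac layer can drive
 * per-ring polling instead of treating the NIC as single-queued.
 */
static bool
mydrv_getcapab(unsigned int cap, void *cap_data)
{
	if (cap != MAC_CAPAB_RINGS)
		return (false);	/* capability not supported */

	mac_capab_rings_t *cr = cap_data;
	if (cr->mr_type == MAC_RING_TYPE_RX) {
		cr->mr_rnum = 4;	/* four hardware Rx rings */
		cr->mr_gnum = 1;	/* in a single group */
	} else {
		cr->mr_rnum = 1;	/* single Tx ring */
		cr->mr_gnum = 1;
	}
	return (true);
}
```

Until the driver answers this capability query, the mac layer sees only the default single pseudo-ring, which matches the symptom of Rx lagging far behind Tx on a machine that depends on spreading Rx load across cores.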