Jeremy Mann
2006-Dec-14 15:20 UTC
[Lustre-discuss] Using Lustre on channel bonded interfaces
First of all, thanks to everyone for an awesome product! My first usage of Lustre has been on a small cluster with gigabit interfaces. I can saturate the network and achieve 100 MB/s accessing the Lustre filesystem. My problem arises when I channel bond 2 gigabit interfaces: my speed drops to 54 MB/s. A recent thread I googled says that this problem is due to the round-robin trunking of a default channel bonded system, which I now understand.

My question is, is anybody else out there using Lustre with channel bonded interfaces? If so, what trunking and kernel options for bonding are you using?

Thanks for any hints or tips!

--
Jeremy Mann
jeremy@biochem.uthscsa.edu
University of Texas Health Science Center Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672
Felix, Evan J
2006-Dec-14 15:27 UTC
[Lustre-discuss] Using Lustre on channel bonded interfaces
Can you tell us more about the version of Lustre you are using, what options you are setting for the bonding module, and possibly what type of switch you are using? Those things may help.

If you are able to set up two separate networks on the two interfaces, it is possible (correct me here, guys, if I have this wrong) that Lustre can just run over the two interfaces and be perfectly happy. For example, can you put all the eth0 interfaces on the 10.10.10.0/24 network and all the eth1 interfaces on the 10.10.20.0/24 network? Then just use the lnet module parameters (>=1.4.6 only) to set up two channels for traffic.

Evan
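Editor's sketch of the two-network idea above, not something posted in the thread. It assumes the LNet `networks` module parameter accepts a comma-separated list such as `tcp0(eth0),tcp1(eth1)`. The helper name `lnet_networks_option` is hypothetical, and the subnets are the ones Evan suggests. It only builds the modprobe line after checking that the subnets do not overlap; it does not touch any system configuration.

```python
import ipaddress


def lnet_networks_option(iface_subnets):
    """Build an 'options lnet' line with one LNet TCP network per interface.

    iface_subnets maps an interface name to the subnet it lives on.
    The subnets are only used for a sanity check; LNet itself is told
    which interface backs which network (tcp0, tcp1, ...).
    """
    nets = [ipaddress.ip_network(s) for s in iface_subnets.values()]
    # Separate channels only make sense if the subnets are disjoint.
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a.overlaps(b):
                raise ValueError(f"subnets {a} and {b} overlap")
    entries = [f"tcp{n}({iface})" for n, iface in enumerate(iface_subnets)]
    return "options lnet networks=" + ",".join(entries)


# Hypothetical layout from the message above: eth0 and eth1 on separate /24s.
print(lnet_networks_option({"eth0": "10.10.10.0/24", "eth1": "10.10.20.0/24"}))
```

The resulting line would go wherever the distribution keeps module options (the thread later mentions /etc/modules.conf). Whether traffic is actually balanced over both networks depends on the Lustre version and is not confirmed here.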
Jeremy Mann
2006-Dec-14 15:31 UTC
[Lustre-discuss] Using Lustre on channel bonded interfaces
We are using Lustre 1.6beta5 with no options for the bonding driver, and the switch is an HP ProCurve 2848. I wasn't aware of the magnitude of options for bonding, and now, after reading bonding.txt, I don't know what sort of trunking to use.

--
Jeremy Mann
Felix, Evan J
2006-Dec-14 15:35 UTC
[Lustre-discuss] Using Lustre on channel bonded interfaces
Here at PNNL, we have been happy using the mode=4 option to the bonding driver to get 802.3ad aggregation. This option requires a switch that can handle this type of aggregation, and our Cisco switch does. If your HP switch supports it, it would be interesting to see what you can get out of your configuration.

Evan
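Editor's sketch, not part of the thread: why 802.3ad (mode=4) behaves differently from the default round-robin mode. Round-robin stripes individual packets over the slaves, which can reorder the segments of a single TCP stream. In 802.3ad mode the driver instead picks one slave per flow with a transmit hash. The two formulas below are the ones given in the kernel's bonding.txt of that era, reproduced from memory, so treat them as an assumption (the real driver code may differ by kernel version). The MAC addresses, IP addresses and ports are made up for illustration.

```python
import ipaddress


def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def layer2_slave(src_mac, dst_mac, slave_count):
    """Documented 'layer2' policy: (src MAC XOR dst MAC) modulo slave count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % slave_count


def layer34_slave(src_ip, src_port, dst_ip, dst_port, slave_count):
    """Documented 'layer3+4' policy for unfragmented TCP/UDP:
    ((src port XOR dst port) XOR ((src IP XOR dst IP) AND 0xffff))
    modulo slave count."""
    ip_bits = (int(ipaddress.ip_address(src_ip))
               ^ int(ipaddress.ip_address(dst_ip))) & 0xFFFF
    return ((src_port ^ dst_port) ^ ip_bits) % slave_count


# Hypothetical server with two bonded slaves and a handful of clients.
server_mac, server_ip = "00:11:22:33:44:02", "10.10.10.2"
for n in range(1, 5):
    client_mac = f"00:11:22:33:55:{n:02x}"
    client_ip = f"10.10.10.{10 + n}"
    l2 = layer2_slave(server_mac, client_mac, 2)
    l34 = layer34_slave(server_ip, 988, client_ip, 1020 + n, 2)
    print(f"client {n}: layer2 -> slave {l2}, layer3+4 -> slave {l34}")
```

Under either policy every packet of a given flow leaves on the same slave, so a single stream cannot exceed one link's bandwidth, but many clients spread across both links without reordering.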
Serge Ryabchun
2006-Dec-17 17:17 UTC
[Lustre-discuss] Using Lustre on channel bonded interfaces
Lustre 1.6b5, linux-2.6.18, HP ProCurve 2848 and 3400 with LACP

OSS::/etc/modules.conf
    options bonding mode=802.3ad xmit_hash_policy=layer3+4

1xMDS/1xGigE, 5xOSS/2xGigE/3xSW-RAID1-OST, 42xNODE/1xGigE

FILETEST/NFS/MPI of 2006/11/10 is running with following parameters:
    Unit is 1Mb = 1048576 bytes
    File size: 4096Mb to 4096Mb, buffer size: 1Mb
    Number of processes: 42, number of files: 4, total: 168
    Using direct I/O calls (read/write).
    Testing file system: /mnt/work/test
    Nodes: n01 n02 n03 n04 n05 n06 n07 n08 n09 n10 n11 n12 n13 n14
           n15 n16 n17 n18 n19 n20 n21 n22 n23 n24 n25 n26 n27 n28
           n29 n30 n31 n32 n33 n34 n35 n36 n37 n38 n39 n40 n41 n42

Starting test iteration = 0 for FS = 4096Mb.
--- TFS=688128Mb WTT=1000.7s AWT= 6.0s LWR=16.4Mb/s TWR=687.6Mb/s
--- TFS=688128Mb RTT=626.0s ART= 3.7s LRR=26.2Mb/s TRR=1099.2Mb/s
Testing complete in 1634.3 seconds
--- Best writing rate achieved: TWR=687.6 Mb/s (at TFS=688128Mb)
--- Best reading rate achieved: TRR=1099.2 Mb/s (at TFS=688128Mb)

Nprocs, Average TWR, Std deviation of TWR, Average TRR, Std deviation of TRR
42      687.6        0.0                   1099.2       0.0

--
Serge Ryabchun
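Editor's note, not part of Serge's message: the aggregate figures in the report can be re-derived from its own parameters. The sketch below assumes TFS is the total data volume, WTT/RTT are the total write and read times, and TWR/TRR are the aggregate rates. These readings of the abbreviations are a guess that happens to match the numbers, since the tool's documentation is not quoted in the thread. The wire-speed ceiling is a rough estimate that ignores all protocol overhead.

```python
MIB = 1048576  # the report's "1Mb" unit, in bytes

processes = 42
files_per_process = 4
file_size_mib = 4096

total_files = processes * files_per_process   # reported as "total: 168"
total_mib = total_files * file_size_mib       # reported as TFS

write_seconds = 1000.7  # reported WTT
read_seconds = 626.0    # reported RTT

write_rate = total_mib / write_seconds        # compare with reported TWR
read_rate = total_mib / read_seconds          # compare with reported TRR

# Rough ceiling: 5 OSS nodes x 2 bonded GigE links, no protocol overhead.
oss_nodes, links_per_oss, link_bits_per_s = 5, 2, 1_000_000_000
wire_mib_per_s = oss_nodes * links_per_oss * link_bits_per_s / 8 / MIB

print(f"total data  : {total_mib} MiB")
print(f"write rate  : {write_rate:.1f} MiB/s")
print(f"read rate   : {read_rate:.1f} MiB/s")
print(f"per process : {write_rate / processes:.1f} write, "
      f"{read_rate / processes:.1f} read MiB/s")
print(f"raw ceiling : {wire_mib_per_s:.1f} MiB/s "
      f"(reads reach {read_rate / wire_mib_per_s:.0%} of it)")
```

Under these assumptions the read rate sits close to the combined raw capacity of the ten bonded OSS links, which suggests that both slaves of each bond are carrying traffic.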