Hi,

Is it possible to gain any speed advantage out of bonded NICs at the XCP
host end, when accessing a single NFS Storage Repository?

As I understand most bonding mechanisms, they use MAC addresses, or at best
IP addresses and port numbers, as input to the hash to decide which link to
use for a given packet. Which means that all traffic between a given host
and the NFS server will hash to the same value every time, and go over just
one link.

Am I missing some obvious (or not so obvious) trickery or options?

Thanks,

--
Craig Miskell
Senior Systems Administrator
Opus International Consultants
Phone: +64 4 471 7209

> Why do I have to justify a system change to someone who is
> barely aware that "their" computer doesn't contain a magic elf
> who can draw pictures very fast and in reverse?
Because they're the ones holding the check-signing pen.
 - Satya P, on A.S.R
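As a side note, the pinning described above falls straight out of the
transmit-hash arithmetic. The Linux bonding driver's layer3+4 policy is
documented as roughly ((source port XOR dest port) XOR ((source IP XOR dest
IP) AND 0xffff)) modulo the number of slaves, and none of those inputs ever
change for a single NFS connection. A quick sketch with entirely made-up
values:

    # layer3+4 hash sketch; all values below are illustrative, not from the thread
    SRC_IP=0xc0a8010a   # 192.168.1.10 (hypothetical XCP host)
    DST_IP=0xc0a80114   # 192.168.1.20 (hypothetical NFS filer)
    SPORT=785           # arbitrary client-side port of the NFS TCP connection
    DPORT=2049          # nfsd
    SLAVES=2
    echo $(( ( (SPORT ^ DPORT) ^ ((SRC_IP ^ DST_IP) & 0xffff) ) % SLAVES ))
    # Prints the same slave index every time, because the 4-tuple of a single
    # NFS connection never changes; a layer2 (MAC-based) policy pins even harder.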
On Monday 01 November 2010 00:30:52 Craig Miskell wrote:
> Is it possible to gain any speed advantage out of bonded NICs at the XCP
> host end, when accessing a single NFS Storage Repository?
>
> As I understand most bonding mechanisms, they use MAC addresses, or at
> best IP addresses and port numbers, as input to the hash to decide which
> link to use for a given packet. Which means that all traffic between a
> given host and the NFS server will hash to the same value every time, and
> go over just one link.

Somebody correct me if I'm wrong, but if you use the same switch to get to
the NFS server, you need to pick 802.3ad bonding and make sure that your
switch supports it.

B.
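For anyone setting this up by hand, a minimal 802.3ad bond on the Linux side
looks roughly like the sketch below (placeholder NIC names and address); the
corresponding switch ports must be configured as an LACP aggregation group,
and on XCP the bond would normally be created through the xe/XAPI tooling
rather than directly like this.

    # Sketch only: classic module-parameter setup for an LACP (802.3ad) bond.
    modprobe bonding mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4
    ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up   # placeholder address
    ifenslave bond0 eth0 eth1                              # placeholder NICs
    # Note: even with layer3+4 hashing, one host talking to one NFS server over
    # one TCP connection still uses a single slave, which is Craig's point.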
Bart Coninckx wrote:
> On Monday 01 November 2010 00:30:52 Craig Miskell wrote:
>> Is it possible to gain any speed advantage out of bonded NICs at the XCP
>> host end, when accessing a single NFS Storage Repository?
>>
>> As I understand most bonding mechanisms, they use MAC addresses, or at
>> best IP addresses and port numbers, as input to the hash to decide which
>> link to use for a given packet. Which means that all traffic between a
>> given host and the NFS server will hash to the same value every time,
>> and go over just one link.
>
> Somebody correct me if I'm wrong, but if you use the same switch to get to
> the NFS server, you need to pick 802.3ad bonding and make sure that your
> switch supports it.

Sure, that's a given, but I'm not sure it answers my question. Getting
bonding going isn't the issue; what I'm interested in is whether there are
any tricks to actually utilise both links from a host talking to just a
single NFS storage device.

To quote from http://en.wikipedia.org/wiki/802.3ad:

"When balancing traffic, network administrators often wish to avoid
reordering Ethernet frames. For example, TCP suffers additional overheads
when dealing with out-of-order packets. This goal is approximated by sending
all frames associated with a particular session across the same link. The
most common implementations use L3 hashes (i.e. based on the IP address),
ensuring that the same flow is always sent via the same physical link."

So I guess I don't *want* to balance a single flow across multiple links.
So if I want to balance NFS traffic across links I need more than one
distinct TCP connection to the NFS server, which I'd guess needs at least
two distinct mounted shares. Is anyone doing anything like that?

Thanks,

--
Craig Miskell
Senior Systems Administrator
Opus International Consultants
Phone: +64 4 471 7209

Squawk - Pieces of eight! Squawk - Pieces of eight!
Squawk - Pieces of eight! Squawk - Pieces of nine!
System Halt 3248 - Parroty error.
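One trick along those lines, sketched below with made-up addresses and export
paths: give the NFS server a second IP address and mount one share against
each address. Each mount then rides its own TCP connection, so a layer3+4
hash at least has two flows it can spread across the slaves. (Whether two
mounts to the *same* server address get separate connections depends on the
NFS client implementation, so two server addresses is the safer bet.)

    # Sketch: two shares mounted against two hypothetical addresses on the filer.
    mount -t nfs 192.168.1.20:/export/sr1 /mnt/sr1
    mount -t nfs 192.168.1.21:/export/sr2 /mnt/sr2
    # Two distinct (src IP, dst IP, ports) tuples -> two flows that a layer3+4
    # hash *can* spread across both slaves (no guarantee; it is still a hash).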
On Tue, 2 Nov 2010, Craig Miskell wrote:
> So I guess I don't *want* to balance a single flow across multiple links.
> So if I want to balance NFS traffic across links I need more than one
> distinct TCP connection to the NFS server, which I'd guess needs at least
> two distinct mounted shares.

You could use balance-rr mode perhaps? I'm using this on a point-to-point
drbd replication link between two systems with MTU=9000 and have seen
throughput regularly around 165 MB/sec with peaks of up to 185 MB/sec.

Steve
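For reference, a point-to-point balance-rr bond of the kind described above
can be set up roughly as follows; the NIC names, address, and netmask are
assumptions, not details from Steve's setup.

    # Sketch: round-robin bond over two back-to-back GbE links, jumbo frames.
    # Assumes the bonding module is not already loaded with different options.
    modprobe bonding mode=balance-rr miimon=100
    ifconfig bond0 10.0.0.1 netmask 255.255.255.252 mtu 9000 up  # placeholder
    ifenslave bond0 eth2 eth3                                    # placeholder NICs
    # balance-rr stripes successive packets across both slaves, so a single TCP
    # flow can exceed 1 Gb/s, at the price of possible reordering at the receiver.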
Steve Thompson wrote:
> On Tue, 2 Nov 2010, Craig Miskell wrote:
>> So I guess I don't *want* to balance a single flow across multiple
>> links. So if I want to balance NFS traffic across links I need more
>> than one distinct TCP connection to the NFS server, which I'd guess
>> needs at least two distinct mounted shares.
>
> You could use balance-rr mode perhaps? I'm using this on a point-to-point
> drbd replication link between two systems with MTU=9000 and have seen
> throughput regularly around 165 MB/sec with peaks of up to 185 MB/sec.

Ahhh, thanks, that's very helpful. It pointed me in the right direction to
search, and has clarified things for me immensely. If I've got a switch
involved, then it's going to have to do part of the job as well; balance-rr
through a switch (not point-to-point) will handle outbound, but not inbound.
balance-alb will do both, at the cost of being (to my mind) a little ugly in
implementation (while I have little doubt it works, switching MAC addresses
around seems kinda hacky to me, and just asking for a picky switch to do
something strange).

So, I still have lots to think about. Thanks for the input.

--
Craig Miskell
Senior Systems Administrator
Opus International Consultants
Phone: +64 4 471 7209

Come to think of it, there are already a million monkeys on a million
typewriters, and Usenet is NOTHING like Shakespeare.
 -- Blair Houghton
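Whichever mode ends up being chosen, it is easy to check whether both slaves
really carry traffic, using only standard tools (bond0/eth0/eth1 are
placeholder names):

    # Sketch: verify the bond mode and watch per-slave counters during a transfer.
    cat /proc/net/bonding/bond0      # shows mode, slaves, link state
    ip -s link show eth0             # per-NIC RX/TX byte counters
    ip -s link show eth1
    # Re-run the 'ip -s link' commands during a large NFS copy; if only one
    # slave's TX counter moves, the traffic is still pinned to a single link.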
On Monday 01 November 2010 21:58:08 Steve Thompson wrote:
> On Tue, 2 Nov 2010, Craig Miskell wrote:
>> So I guess I don't *want* to balance a single flow across multiple
>> links. So if I want to balance NFS traffic across links I need more
>> than one distinct TCP connection to the NFS server, which I'd guess
>> needs at least two distinct mounted shares.
>
> You could use balance-rr mode perhaps? I'm using this on a point-to-point
> drbd replication link between two systems with MTU=9000 and have seen
> throughput regularly around 165 MB/sec with peaks of up to 185 MB/sec.

This is what I do as well, except by either not using a switch (direct
links, like for DRBD) or by using two separate ones. If I understand
correctly, balance-rr spoofs MAC addresses, which will confuse the connected
switch. Anyway, it gives nice speed results. I do have to tweak
tcp_reordering in my SLES machines, though. If I don't, I hardly get more
speed than a single link.
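The knob Bart refers to is the net.ipv4.tcp_reordering sysctl (default 3),
which roughly controls how much reordering TCP tolerates before it treats
out-of-order segments as loss; round-robin striping tends to exceed the
default. A sketch, with 20 as a purely illustrative value since the thread
doesn't give one:

    # Sketch: raise the initial TCP reordering tolerance (default is 3).
    sysctl net.ipv4.tcp_reordering                # check the current value
    sysctl -w net.ipv4.tcp_reordering=20          # 20 is an illustrative value
    echo "net.ipv4.tcp_reordering = 20" >> /etc/sysctl.conf   # persist across reboots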
On Mon, 1 Nov 2010, Bart Coninckx wrote:
> This is what I do as well, except by either not using a switch (direct
> links, like for DRBD) or by using two separate ones. If I understand
> correctly, balance-rr spoofs MAC addresses, which will confuse the
> connected switch. Anyway, it gives nice speed results. I do have to tweak
> tcp_reordering in my SLES machines, though. If I don't, I hardly get more
> speed than a single link.

Interesting. My balance-rr setup for DRBD uses two dedicated GbE links
between two Intel cards on each of two Dell PE2900 servers, straight through
with no switches, and using MTU=9000. I have had no problems with packet
reordering at all (all TCP parameters are stock). I _do_ see some out of
order packets, of course, but the counts are very small. An iperf test
between the two systems gives 1.97 Gb/sec. I'm using CentOS 5.5 x86_64.

In different applications, I use balance-rr with two links to a single Dell
48-port switch. No problems with that either, although it is less traffic.

Steve
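A throughput figure like that is easy to reproduce with a plain iperf run
across the bond; the single-stream case is the interesting one here, since a
single NFS mount is essentially one TCP flow. Addresses below are
placeholders:

    # Sketch: measure single-stream throughput across the bond with iperf.
    iperf -s                          # on the receiving host
    iperf -c 10.0.0.1 -t 30           # on the sending host; placeholder address
    # Add -P 2 for two parallel streams to see what a hash-based mode (802.3ad,
    # balance-alb) can do when there is more than one flow to spread around.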
On Monday 01 November 2010 23:14:22 Steve Thompson wrote:
> Interesting. My balance-rr setup for DRBD uses two dedicated GbE links
> between two Intel cards on each of two Dell PE2900 servers, straight
> through with no switches, and using MTU=9000. I have had no problems with
> packet reordering at all (all TCP parameters are stock). I _do_ see some
> out of order packets, of course, but the counts are very small. An iperf
> test between the two systems gives 1.97 Gb/sec. I'm using CentOS 5.5
> x86_64.
>
> In different applications, I use balance-rr with two links to a single
> Dell 48-port switch. No problems with that either, although it is less
> traffic.

That's odd; then the switch must in some way understand the mode 0
bonding...

Silly question maybe, but where do you see the out of order packet count?

thx,

B.