Rick Friedman
2011-Jul-30 19:23 UTC
[Lustre-discuss] Lustre-discuss Digest, Vol 66, Issue 40
Sent from my mobile
Apologies for typos

-----Original Message-----
From: lustre-discuss-request at lists.lustre.org
Received: Saturday, 30 Jul 2011, 2:00pm
To: lustre-discuss at lists.lustre.org
Subject: Lustre-discuss Digest, Vol 66, Issue 40

Send Lustre-discuss mailing list submissions to
    lustre-discuss at lists.lustre.org

To subscribe or unsubscribe via the World Wide Web, visit
    http://lists.lustre.org/mailman/listinfo/lustre-discuss
or, via email, send a message with subject or body 'help' to
    lustre-discuss-request at lists.lustre.org

You can reach the person managing the list at
    lustre-discuss-owner at lists.lustre.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Lustre-discuss digest..."

Today's Topics:

   1. Re: Line rate performance for clients (Andreas Dilger)
   2. Re: Line rate performance for clients (Brock Palen)
   3. Random OST Numbers chosen in a stripe (Roger Spellman)

----------------------------------------------------------------------

Message: 1
Date: Fri, 29 Jul 2011 12:01:40 -0600
From: Andreas Dilger <adilger at whamcloud.com>
Subject: Re: [Lustre-discuss] Line rate performance for clients
To: Brock Palen <brockp at umich.edu>
Cc: lustre-discuss discuss <lustre-discuss at lists.lustre.org>
Message-ID: <FA55B2A9-A027-4982-A3FA-4BFFA8B5E5CE at whamcloud.com>
Content-Type: text/plain; charset=us-ascii

On 2011-07-29, at 11:33 AM, Brock Palen wrote:
> I think this is a networking question.
>
> We have lustre 1.8 clients with 1gig-e interfaces that according to ethtool are running full duplex.
>
> If I do the following:
>
> cp /lustre/largeilfe.h5 /tmp/
>
> I get 117MB/s
>
> If I then use globus-url-copy to move that file from /tmp/ to -> remote tape archive I get 117MB/s
>
> If I go directly from /lustre -> archive I get 50MB/s,

Strace your globus-url-copy and see what IO size it is using. "cp" was modified long ago to use the blocksize reported by stat(2) for copying, and Lustre reports a 2MB IO size for striped files (1MB for unstriped). If your globus tool is using e.g. 4kB reads then it will be very inefficient for Lustre, but much less so when reading from /tmp.

> this is consistently reproducible. It doesn't matter if I just copy a large file on lustre to lustre, or scp, or globus. If I try to ingest and outgest data I get what looks like half duplex performance.
>
> Anyone have ideas why I cannot do 1Gig-e full duplex?

I don't think this has anything to do with "full duplex". 117MB/s is pretty much the maximum line rate for GigE (and pretty good for Lustre, if I do say so myself) in one direction. There is presumably no data moving in the other direction at that time.

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.

------------------------------

Message: 2
Date: Fri, 29 Jul 2011 14:15:42 -0400
From: Brock Palen <brockp at umich.edu>
Subject: Re: [Lustre-discuss] Line rate performance for clients
To: Andreas Dilger <adilger at whamcloud.com>
Cc: lustre-discuss discuss <lustre-discuss at lists.lustre.org>
Message-ID: <78BD437E-8F53-47DF-9D87-A98849B4A92D at umich.edu>
Content-Type: text/plain; charset=us-ascii

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985

On Jul 29, 2011, at 2:01 PM, Andreas Dilger wrote:
> On 2011-07-29, at 11:33 AM, Brock Palen wrote:
>> I think this is a networking question.
>>
>> We have lustre 1.8 clients with 1gig-e interfaces that according to ethtool are running full duplex.
>>
>> If I do the following:
>>
>> cp /lustre/largeilfe.h5 /tmp/
>>
>> I get 117MB/s
>>
>> If I then use globus-url-copy to move that file from /tmp/ to -> remote tape archive I get 117MB/s
>>
>> If I go directly from /lustre -> archive I get 50MB/s,
>
> Strace your globus-url-copy and see what IO size it is using. "cp" was modified long ago to use the blocksize reported by stat(2) for copying, and Lustre reports a 2MB IO size for striped files (1MB for unstriped). If your globus tool is using e.g. 4kB reads then it will be very inefficient for Lustre, but much less so when reading from /tmp.
>
>> this is consistently reproducible. It doesn't matter if I just copy a large file on lustre to lustre, or scp, or globus. If I try to ingest and outgest data I get what looks like half duplex performance.
>>
>> Anyone have ideas why I cannot do 1Gig-e full duplex?
>
> I don't think this has anything to do with "full duplex". 117MB/s is pretty much the maximum line rate for GigE (and pretty good for Lustre, if I do say so myself) in one direction. There is presumably no data moving in the other direction at that time.

Ah, I guess I wasn't clear. I only get 117MB/s when I go in one direction on the network, e.g. copy from lustre to /tmp (local drive), or send from /tmp out using globus. It's just when the client is reading from lustre and sending the data out at the same time that I only get 50MB/s.

Does that make sense? Is it even right for me to expect that I could combine the two and get full speed in and full speed out, given that I can consistently get each of them independently of the other?

>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Engineer
> Whamcloud, Inc.
>

------------------------------

Message: 3
Date: Fri, 29 Jul 2011 16:49:28 -0400
From: "Roger Spellman" <Roger.Spellman at terascala.com>
Subject: [Lustre-discuss] Random OST Numbers chosen in a stripe
To: <lustre-discuss at lists.lustre.org>, <wc-discuss at whamcloud.com>
Message-ID: <2C7DE72B9BD00F44BAECA5B0CBB87395013598BB at hermes.terascala.com>
Content-Type: text/plain; charset="iso-8859-1"

Suppose that I stripe a directory with the following command:

    lfs setstripe -c 4 .

On some of my systems, when I create a file in the directory, the list of OSTs for a particular file is sequential, e.g.
     obdidx           objid          objid            group
         12               2            0x2                0
         13               2            0x2                0
         14               2            0x2                0
         15               2            0x2                0

On another one of my systems, when I create files in a similarly striped directory, I get seemingly random assignment, e.g.

For one file:

     obdidx           objid          objid            group
         14            6884         0x1ae4                0
         46            6880         0x1ae0                0
          8            6883         0x1ae3                0
         29            6880         0x1ae0                0

For a different file:

     obdidx           objid          objid            group
         13            6884         0x1ae4                0
         28            6880         0x1ae0                0
         44            6880         0x1ae0                0
         27            6880         0x1ae0                0

Why is this? How can I control it to always be sequential?

Thanks.

Roger Spellman
Staff Engineer
Terascala, Inc.
508-588-1501
www.terascala.com <http://www.terascala.com/>

------------------------------

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

End of Lustre-discuss Digest, Vol 66, Issue 40
**********************************************
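
On the I/O size point in Message 1: the following is a minimal sketch, assuming Python 3 on a POSIX system, of what "use the blocksize reported by stat(2)" can look like in practice. The helper names preferred_io_size and copy_with_blocksize are hypothetical and are not part of cp, globus-url-copy, or Lustre; the sketch only shows how the read size determines the number of read calls a copy needs.

```python
import os


def preferred_io_size(path):
    """Return the preferred I/O block size that stat(2) reports for path.

    st_blksize is only available on POSIX-style platforms.
    """
    return os.stat(path).st_blksize


def copy_with_blocksize(src, dst, blocksize=None):
    """Copy src to dst with reads of `blocksize` bytes.

    If blocksize is None, use the size reported by stat(2) for src.
    Returns the number of read calls that returned data.
    """
    if blocksize is None:
        blocksize = preferred_io_size(src)
    reads = 0
    # Unbuffered input, so each read() maps to one read request of
    # the chosen size; output uses Python's normal buffered writer.
    with open(src, "rb", buffering=0) as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(blocksize)
            if not chunk:
                break
            fout.write(chunk)
            reads += 1
    return reads
```

For scale: a 1 GiB file copied with 4 KiB reads takes 262,144 read calls, while 2 MiB reads take 512. Running strace on the real tool, as suggested in Message 1, shows which case applies.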