Hello,

Could somebody give me some advice on how to improve gridftp performance with Lustre?

Currently, we are putting files onto Lustre through a gridftp server that has the Lustre file system mounted. I noticed the network traffic flows this way (using a put as an example):

remote client ----> gridftp server ----> Lustre OSS

The gridftp server is busy receiving and sending packets all the time. Is there a way for the control information to go to the gridftp server while the data goes to the Lustre OSS directly?

Thanks in advance for answering my question.

Regards,
Yujun
On Fri, 2009-05-22 at 11:24 -0400, Yujun Wu wrote:
> Hello,

Hi,

> remote client---->gridftp server--->Lustre OSS
>
> And the gridftp server is busy with receiving and sending packets
> all the time.

Is it so busy that it is unable to push the Lustre servers to full capacity?

> Is there a way for the control info goes to gridftp
> server, but the data go to Lustre OSS directly?

No. All access to the Lustre servers has to go through a Lustre client.

It sounds like you need to scale up with more Lustre clients (aka gridftp servers). Whether gridftp as a service supports growth through parallelization or not, I have no idea. Typically, with services that don't, something like round-robin DNS can be used to spread the load. Other techniques can also be used to make an IP service that is not natively parallel provide parallel access.

b.
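To make the round-robin idea above concrete, here is a minimal Python sketch of spreading transfers over several gridftp servers. It is only an illustration, not something from the thread: the function names, the example addresses, and the host name in the comment are all hypothetical, and port 2811 is assumed to be the GridFTP control port. With real round-robin DNS the rotation happens in the name server; this sketch just does the same rotation on the client side.

```python
import itertools
import socket


def resolve_all(hostname, port=2811):
    """Return every IPv4 address published for hostname (needs working DNS).

    Port 2811 is assumed here to be the GridFTP control port.
    """
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Deduplicate while keeping the order the resolver returned.
    return list(dict.fromkeys(info[4][0] for info in infos))


def assign_round_robin(transfers, servers):
    """Pair each transfer with the next server in rotation."""
    if not servers:
        raise ValueError("at least one server is required")
    rotation = itertools.cycle(servers)
    return [(transfer, next(rotation)) for transfer in transfers]


if __name__ == "__main__":
    # Hypothetical addresses; in practice they could come from something
    # like resolve_all("gridftp.example.org").
    servers = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
    files = ["a.dat", "b.dat", "c.dat", "d.dat"]
    for transfer, server in assign_round_robin(files, servers):
        print(transfer, "->", server)
```

Each server in the list would be a Lustre client running its own gridftp daemon, so adding a server adds both network and Lustre client capacity.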
The gridftp servers need to be on Lustre clients. There is no way to send your data directly from the gridftp client to the OSS, if this is what you're asking.

jab
Hi Jeffrey,

Thanks for your e-mail. Then the extra traffic is what moves from the gridftp servers to the OSSs, which I really don't like, as you may imagine.

I know a product called dCache. Its gridftp servers don't handle the data traffic directly, but redirect the data to the data servers (dCache pools).

Regards,
Yujun

On Fri, 22 May 2009, Jeffrey Bennett wrote:
> The gridftp servers need to be on lustre clients. No way you can send your data directly from the gridftp client to the OSS if this is what you're asking.
>
> jab
I guess you could implement some sort of gridftp plugin (a DSI module is what they call it) to write data directly to the OSS, and then you could install gridftp servers on the OSS. This is basically what the HPSS DSI module for GridFTP does: it writes data directly to the HPSS system using the HPSS API. However, I am not sure it's technically possible with Lustre.

I'll take a look at the dCache thing, thanks!

jab
On May 22, 2009 10:33 -0700, Jeffrey Bennett wrote:
> I guess you could implement some sort of gridftp plugin, a DSI module
> is what they call it, to write data directly to the OSS and then you
> could install gridftp servers on the OSS. This is basically what the HPSS
> DSI module for Gridftp does, it writes data directly to the HPSS system
> using the HPSS API. However, I am not sure it's technically possible
> with Lustre.

It is possible to mount Lustre clients on the OSS nodes, if you are willing to make at least the GridFTP port accessible to the outside network on those nodes. That said, this exposes the back-end storage to attack by a remote client, if the GridFTP server is not secure.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Hello Andreas,

Thanks for your e-mail. I think what Jeffrey recommends is this:

http://www.globus.org/toolkit/docs/4.2/4.2.0/data/gridftp/developer/gridftp-developer-dsi.html

As the frontend and backend are separated, the security risk is not big.

On the other hand, I have another question: if I install a gridftp server on an OSS, will it write to that OSS only, or can the data go to other OSSs as well?

Thanks,
Yujun

On Mon, 25 May 2009, Andreas Dilger wrote:
> It is possible to mount Lustre clients on the OSS nodes, if you are willing
> to make at least the GridFTP port accessible to the outside network on
> those nodes. That said, this exposes the back-end storage to attack by
> a remote client, if the GridFTP server is not secure.
On May 26, 2009 12:44 -0700, Jeffrey Bennett wrote:
> Is it technically possible to implement this without using the
> Lustre client on the OSS and accessing directly to the objects on the
> OSS? I guess this would imply the development of some sort of Lustre
> Client/Gridftp server all together in the same code. Not that I am going
> to implement it, but I am curious...

There is no such thing as "direct" access to the object. What you are describing is essentially just a GridFTP server accessing Lustre via the Lustre client. If you want to write a "DSI module" that just writes to a mounted client filesystem on the OSS, that is fine too, but I don't see how this is significantly different from just running multiple GridFTP servers, one on each OSS.

Yujun Wu <yujun at phys.ufl.edu> wrote:
> As the frontend and backend are separated, the security risk is
> not big. On the other hand, I have another question: if I install a
> gridftp server on an OSS, it will write to that OSS ONLY or data can go
> to other OSSs as well?

The client would be a normal client: it can access all of the OSS storage in the Lustre filesystem. While it might be possible to create a DSI that knows the internal structure of the Lustre striping and drives the traffic directly to a specific OST, this would require a non-trivial amount of ongoing effort.

It would be much simpler to just run multiple GridFTP servers on Lustre clients (whether local to the OSS nodes or on standalone clients). It would seem that even a small number of Lustre clients could accept all of the GridFTP traffic that your WAN link could handle (10Gbit/s?). The biggest obstacle to performance is the TCP data copy overhead, which is probably independent of Lustre itself.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
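As a rough illustration of why a client on one OSS still sends data to the other OSSs, here is a small Python sketch of plain round-robin (RAID-0 style) striping, which is how Lustre lays out a striped file across its objects. This is not code from the thread or from Lustre itself: the function names are made up, the 1 MiB stripe size and stripe count of 4 are assumed example values, and the index is relative to the file's own layout rather than an actual OST number.

```python
def stripe_index(offset, stripe_size=1 << 20, stripe_count=4):
    """Index (0..stripe_count-1) of the object holding byte `offset`.

    Assumes simple round-robin striping; the defaults are example values.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0, sizes and counts must be > 0")
    return (offset // stripe_size) % stripe_count


def bytes_per_stripe(file_size, stripe_size=1 << 20, stripe_count=4):
    """How many bytes of a file_size-byte file land on each object."""
    totals = [0] * stripe_count
    offset = 0
    while offset < file_size:
        # The last chunk may be shorter than a full stripe.
        chunk = min(stripe_size, file_size - offset)
        totals[stripe_index(offset, stripe_size, stripe_count)] += chunk
        offset += chunk
    return totals


if __name__ == "__main__":
    mib = 1 << 20
    # A 10 MiB file over 4 objects: consecutive 1 MiB chunks rotate 0,1,2,3,0,...
    print([n // mib for n in bytes_per_stripe(10 * mib)])
```

With a stripe count above 1, a single file written through any one client is spread over several OSTs, which is why a DSI that tried to steer traffic to one specific OST would have to track the layout of every file.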