I came across this: www.gluster.org
Has anyone tried it? Is it a true parallel file system, allowing concurrent
reads and writes to a file by many processes? Will it be suitable for HPC
applications?

--
Regards,
Rishi Pathak
rishi pathak wrote:
> I came across this: www.gluster.org <http://www.gluster.org>
> Has anyone tried it? Is it a true parallel file system, allowing
> concurrent reads and writes to a file by many processes? Will it be
> suitable for HPC applications?

I wouldn't call GlusterFS a parallel filesystem in the same sense as Lustre or
PVFS. GlusterFS is a distributed filesystem, where complete files live on one
of several servers. Although it supports striping, even the developers say
striping is a bad fit for their implementation
(http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F);
GlusterFS's modular architecture simply made it easy for them to implement.
They do have MPI-IO support on their roadmap, so perhaps they plan to work
around the issues described in that link in user space.

GlusterFS is much more like Ibrix or NetApp/GX than Lustre. It seems best
suited as a distributed NFS replacement. In my minimal testing, performance
scales linearly as you add data servers, and metadata performance is
reasonable (by feel, not by actual measurements).

Some of the more interesting features GlusterFS supports are automatic file
replication (AFR), layered performance translators on both the client and
server side, and support for heterogeneous storage servers; it is also really
easy to set up and maintain.

I don't have a Lustre setup ready for any apples-to-apples comparisons, but I
believe the two products fit two different needs. Also, file systems are hard
and take a long time to stabilize. Lustre has put in its time, and we are now
seeing the benefits; GlusterFS is less mature.

Note: the comments above are based on some basic testing I have done. I am not
a GlusterFS developer.

Craig

> --
> Regards,
> Rishi Pathak

--
Craig Tierney (craig.tierney at noaa.gov)
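To give a sense of what "easy to set up" means in practice, a minimal
server-side volfile in the 1.3-era syntax looks roughly like the following.
The exported directory is a made-up path and the option spellings are from
memory, so treat this as a sketch rather than a working config:

    # export one local directory over TCP (1.3-era translator names)
    volume brick
      type storage/posix
      option directory /data/export      # hypothetical local path to export
    end-volume

    volume server
      type protocol/server
      option transport-type tcp/server
      subvolumes brick
      option auth.ip.brick.allow *       # wide-open auth, for illustration only
    end-volume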
---- Craig Tierney <Craig.Tierney at noaa.gov> wrote:
> rishi pathak wrote:
>> I came across this: www.gluster.org <http://www.gluster.org>
>> Has anyone tried it? Is it a true parallel file system, allowing
>> concurrent reads and writes to a file by many processes? Will it be
>> suitable for HPC applications?
>
> I wouldn't call GlusterFS a parallel filesystem in the same sense as Lustre
> or PVFS. GlusterFS is a distributed filesystem, where complete files live on
> one of several servers.

This isn't quite accurate. Depending upon the translators you use, the files
can be striped across servers. For clusters it is almost always the case that
the files will be striped.

> Although it supports striping, even the developers say striping is a bad fit
> for their implementation
> (http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F);
> GlusterFS's modular architecture simply made it easy for them to implement.
> They do have MPI-IO support on their roadmap, so perhaps they plan to work
> around the issues described in that link in user space.
>
> GlusterFS is much more like Ibrix or NetApp/GX than Lustre. It seems best
> suited as a distributed NFS replacement. In my minimal testing, performance
> scales linearly as you add data servers, and metadata performance is
> reasonable (by feel, not by actual measurements).

One of the design ideas behind GlusterFS is that it doesn't have a metadata
server, so I'm not sure what you were measuring. It may have been the metadata
performance of the underlying file system rather than of GlusterFS.

I haven't tested it yet, but it has some interesting ideas (all in user space,
so there are no kernel mods to worry about; no metadata server; stackable
translators for tuning performance).

Jeff
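As a rough illustration of the translator stacking being discussed, a striped
client-side volfile in the 1.3-era syntax might look like the sketch below.
The hostnames, block size, and exported volume name are made up, and the
option spellings are from memory, so check the GlusterFS docs before copying:

    # two remote bricks, each exported by a server-side volfile
    volume client1
      type protocol/client
      option transport-type tcp/client
      option remote-host server1         # hypothetical storage server
      option remote-subvolume brick
    end-volume

    volume client2
      type protocol/client
      option transport-type tcp/client
      option remote-host server2         # hypothetical storage server
      option remote-subvolume brick
    end-volume

    # stripe files across both bricks in fixed-size blocks
    volume stripe0
      type cluster/stripe
      option block-size 1MB              # illustrative stripe unit
      subvolumes client1 client2
    end-volume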
laytonjb at charter.net wrote:
> ---- Craig Tierney <Craig.Tierney at noaa.gov> wrote:
>> rishi pathak wrote:
>>> I came across this: www.gluster.org <http://www.gluster.org>
>>> Has anyone tried it? Is it a true parallel file system, allowing
>>> concurrent reads and writes to a file by many processes? Will it be
>>> suitable for HPC applications?
>>
>> I wouldn't call GlusterFS a parallel filesystem in the same sense as Lustre
>> or PVFS. GlusterFS is a distributed filesystem, where complete files live
>> on one of several servers.
>
> This isn't quite accurate. Depending upon the translators you use, the files
> can be striped across servers. For clusters it is almost always the case
> that the files will be striped.
>
>> Although it supports striping, even the developers say striping is a bad
>> fit for their implementation
>> (http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F);
>> GlusterFS's modular architecture simply made it easy for them to implement.
>> They do have MPI-IO support on their roadmap, so perhaps they plan to work
>> around the issues described in that link in user space.

Yes, there is a translator that will stripe files. However, see the comment
above: even the developers say it isn't a good idea to use it.

I don't see why, for clusters, it would always be the case that files will be
striped. Are you implying that "clusters" means "large distributed HPC systems
that read/write very large files"? There is implicit overhead in
reconstructing a striped file that will impact performance (it could be
minimal; I haven't tested it). Streaming performance may be better, but what
about random I/O patterns? If my codes don't do parallel I/O, why would I
necessarily add the complexity?

I know Lustre does striping quite well, but not all applications require it.

>> GlusterFS is much more like Ibrix or NetApp/GX than Lustre. It seems best
>> suited as a distributed NFS replacement. In my minimal testing, performance
>> scales linearly as you add data servers, and metadata performance is
>> reasonable (by feel, not by actual measurements).
>
> One of the design ideas behind GlusterFS is that it doesn't have a metadata
> server, so I'm not sure what you were measuring. It may have been the
> metadata performance of the underlying file system rather than of GlusterFS.

By metadata performance, I meant IOPS. GlusterFS doesn't have a dedicated
metadata server; all servers perform that function. The streaming performance
is quite good, but what if I need to use NetCDF files, compile code, or use
the filesystem as a large distributed mail server?

When I say streaming performance is good: I have been able to get a single
server to push about 300 MB/s, and that is a limitation of my storage device,
not of the filesystem. I don't know how it performs over the IB transport when
a faster disk array is used.

> I haven't tested it yet, but it has some interesting ideas (all in user
> space, so there are no kernel mods to worry about; no metadata server;
> stackable translators for tuning performance).

Yes, these features are very nice. I liked that I could get it running on an
older kernel (one without Lustre server support) in only a few minutes. So far
it is meeting my needs for a small application. I haven't been using it long,
so I cannot comment on long-term stability. When I have some larger storage
servers, I plan to test it further (as well as Lustre).
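For anyone who wants to run a similar streaming test, simple dd runs through
the mount point are usually enough to show whether the disks or the filesystem
are the bottleneck. The mount point and sizes below are just examples:

    # write ~8 GB through the GlusterFS mount, flushing to disk before dd exits
    dd if=/dev/zero of=/mnt/glusterfs/ddtest bs=1M count=8192 conv=fsync

    # drop the page cache (as root), then time the read back
    echo 3 > /proc/sys/vm/drop_caches
    dd if=/mnt/glusterfs/ddtest of=/dev/null bs=1M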
Craig

> Jeff

--
Craig Tierney (craig.tierney at noaa.gov)
---- Craig Tierney <Craig.Tierney at noaa.gov> wrote:
> laytonjb at charter.net wrote:
>> ---- Craig Tierney <Craig.Tierney at noaa.gov> wrote:
>>> rishi pathak wrote:
>>>> I came across this: www.gluster.org <http://www.gluster.org>
>>>> Has anyone tried it? Is it a true parallel file system, allowing
>>>> concurrent reads and writes to a file by many processes? Will it be
>>>> suitable for HPC applications?
>>>
>>> I wouldn't call GlusterFS a parallel filesystem in the same sense as
>>> Lustre or PVFS. GlusterFS is a distributed filesystem, where complete
>>> files live on one of several servers.
>>
>> This isn't quite accurate. Depending upon the translators you use, the
>> files can be striped across servers. For clusters it is almost always the
>> case that the files will be striped.
>>
>>> Although it supports striping, even the developers say striping is a bad
>>> fit for their implementation

Hmm... the last time I talked to AB he suggested using striping for better
performance. But as you say below, it depends upon the stripe size and the
other translators in use (I've seen that drive performance).

>>> (http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F);
>>> GlusterFS's modular architecture simply made it easy for them to
>>> implement. They do have MPI-IO support on their roadmap, so perhaps they
>>> plan to work around the issues described in that link in user space.
>
> Yes, there is a translator that will stripe files. However, see the comment
> above: even the developers say it isn't a good idea to use it.
>
> I don't see why, for clusters, it would always be the case that files will
> be striped. Are you implying that "clusters" means "large distributed HPC
> systems that read/write very large files"?

I like the idea of striped files from the perspective that if I lose the
server where a file is located, I've lost access to the file until that server
is restored. I can mirror the file, but that wastes space. But, as you point
out, it depends upon the application(s). (I think I'll get a tattoo that says
that :) ).

> There is implicit overhead in reconstructing a striped file that will impact
> performance (it could be minimal; I haven't tested it).

Yep, good point. I haven't tested the reconstruction either, manual or AFR.

> Streaming performance may be better, but what about random I/O patterns? If
> my codes don't do parallel I/O, why would I necessarily add the complexity?
>
> I know Lustre does striping quite well, but not all applications require it.
>
>>> GlusterFS is much more like Ibrix or NetApp/GX than Lustre. It seems best
>>> suited as a distributed NFS replacement. In my minimal testing,
>>> performance scales linearly as you add data servers, and metadata
>>> performance is reasonable (by feel, not by actual measurements).
>>
>> One of the design ideas behind GlusterFS is that it doesn't have a metadata
>> server, so I'm not sure what you were measuring. It may have been the
>> metadata performance of the underlying file system rather than of
>> GlusterFS.
>
> By metadata performance, I meant IOPS. GlusterFS doesn't have a dedicated
> metadata server; all servers perform that function. The streaming
> performance is quite good, but what if I need to use NetCDF files, compile
> code, or use the filesystem as a large distributed mail server?
>
> When I say streaming performance is good: I have been able to get a single
> server to push about 300 MB/s, and that is a limitation of my storage
> device, not of the filesystem. I don't know how it performs over the IB
> transport when a faster disk array is used.
>
>> I haven't tested it yet, but it has some interesting ideas (all in user
>> space, so there are no kernel mods to worry about; no metadata server;
>> stackable translators for tuning performance).
>
> Yes, these features are very nice. I liked that I could get it running on an
> older kernel (one without Lustre server support) in only a few minutes. So
> far it is meeting my needs for a small application. I haven't been using it
> long, so I cannot comment on long-term stability. When I have some larger
> storage servers, I plan to test it further (as well as Lustre).

Jeff
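On the mirroring point: AFR is the translator that provides it. Assuming the
same client1/client2 volumes as in the striping sketch earlier in the thread,
a mirrored client-side volume in the 1.3-era syntax is roughly the following
(again, option spellings are from memory, so check the docs):

    # write every file to both bricks instead of striping it across them
    volume mirror0
      type cluster/afr
      subvolumes client1 client2         # each file is replicated on both bricks
    end-volume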