On Feb 15, 2006 15:03 -0700, Nathan Dauchy wrote:
> From this statement, we infer that every HDF4 I/O operation results in
> a system I/O request to the filesystem. Since Lustre doesn't cache data
> on the client side, this means that small HDF operations in particular
> will have a very adverse impact on performance. For now, we are trying
> to aggregate I/O operations as much as possible in order to get better
> performance.

I'm not sure why you say Lustre does not cache data on the client side.
It in fact does cache reads and writes on the client, but if e.g. there
are many processes doing IO to the same file in small, interlaced chunks
then the cache coherency is causing a lot of small writes to be done to
the storage (no easy way around this).

Beyond that, I'm not very familiar with HDF4 (have heard of HDF5 but not
much more than that), sorry.

> Does anyone have experience with using HDF and Lustre?
> Can you share your performance results?
> Have any tuning tips we can try?

Look at /proc/fs/lustre/obdfilter/*/brw_stats on the OSS nodes and this
will tell you the kind of IO that the client is doing. Ideally it is
doing 1MB reads and writes, but from your description I suspect not.
You can also check "llstat.pl /proc/fs/lustre/ost/OST/ost/stats 10",
which will tell you the OST RPCs being done in 10-second intervals.
See sec. 8.2: https://mail.clusterfs.com/wikis/lustre/LustreProc

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Andreas Dilger wrote:
>
> I'm not sure why you say Lustre does not cache data on the client side.

Oops, my mistake... but I don't think it is making a difference:

> It in fact does cache reads and writes on the client, but if e.g. there
> are many processes doing IO to the same file in small, interlaced chunks
> then the cache coherency is causing a lot of small writes to be done to
> the storage (no easy way around this).

Yes, this is indeed what happens with HDF. For example, the last run of
the problem application used 224 cpus, all accessing the same file.

> Look at /proc/fs/lustre/obdfilter/*/brw_stats on the OSS nodes and this
> will tell you the kind of IO that the client is doing. Ideally it is
> doing 1MB reads and writes, but from your description I suspect not.
> You can also check "llstat.pl /proc/fs/lustre/ost/OST/ost/stats 10" will
> tell you the OST RPCs being done in 10 second intervals. See sec 8.2:
> https://mail.clusterfs.com/wikis/lustre/LustreProc

Thanks for the pointers! I'll see what they turn up.

-Nathan
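A back-of-the-envelope sketch of why this access pattern defeats client-side write aggregation: with 224 clients writing small records round-robin into one shared file, every 1 MiB extent on the servers contains data from many different clients, so no single client can ever assemble a full 1 MiB RPC. (The record size below is an assumption for illustration; the thread does not state the actual HDF4 record size.)

```python
RPC_SIZE = 1 << 20  # ideal Lustre bulk RPC size: 1 MiB

def writers_per_extent(n_clients, record_size):
    # With records interleaved round-robin across clients, a 1 MiB
    # extent holds RPC_SIZE // record_size records, each from a
    # different client (until the round-robin pattern wraps around).
    return min(n_clients, RPC_SIZE // record_size)

# With 224 clients and an assumed 4 KiB record size, all 224 clients
# touch every 1 MiB extent, and each client's largest contiguous write
# is only one record -- so the servers see 4 KiB writes, not 1 MiB ones.
print(writers_per_extent(224, 4096))   # -> 224
print(RPC_SIZE // 4096)                # records per extent -> 256
```

Even with much larger (64 KiB) records, 16 different clients still share each 1 MiB extent, which is what shows up as small writes in brw_stats.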
Greetings,

We have a user whose application runs considerably slower on Lustre
than on NFS. (One routine in particular takes twice as long.) The
application in question performs a lot of HDF4 I/O. Specifically, the
HAregister_atom, HAremove_atom, HAPatom_object, Hfind, Hstartaccess,
and HTIfind_dd calls take considerably longer on the Lustre run.

The developer's documentation for HDF4 contains the following statement:
(http://wuarchive.wustl.edu/pub/FreeBSD/distfiles/HDF41r5_SpecDG.pdf)
-----------------------------------------
The Hsync routine has been defined and implemented to synchronize a
file with its image in memory. Currently it is not very useful because
the HDF software includes no buffering mechanism and the two images are
always identical. Hsync will become useful when buffering is
implemented.
-----------------------------------------

From this statement, we infer that every HDF4 I/O operation results in
a system I/O request to the filesystem. Since Lustre doesn't cache data
on the client side, this means that small HDF operations in particular
will have a very adverse impact on performance. For now, we are trying
to aggregate I/O operations as much as possible in order to get better
performance.

Does anyone have experience with using HDF and Lustre?
Can you share your performance results?
Have any tuning tips we can try?

Thanks,
Nathan
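For the aggregation Nathan describes, the general idea is to buffer small sequential writes in memory and flush them in large chunks. The sketch below is plain Python, not the HDF4 API; the class name and 1 MiB flush size are invented for illustration (1 MiB being the ideal Lustre bulk RPC size).

```python
import io

class CoalescingWriter:
    """Illustrative only: coalesce many small sequential writes into a
    few large ones, so the filesystem sees big writes instead of a
    stream of tiny system I/O requests."""

    def __init__(self, raw, flush_size=1 << 20):
        self.raw = raw                # underlying file-like object
        self.flush_size = flush_size  # bytes per flush (1 MiB default)
        self.buf = bytearray()
        self.flushes = 0              # count of writes actually issued

    def write(self, data):
        self.buf += data
        # Flush in full flush_size chunks; keep the remainder buffered.
        while len(self.buf) >= self.flush_size:
            self.raw.write(bytes(self.buf[:self.flush_size]))
            del self.buf[:self.flush_size]
            self.flushes += 1

    def close(self):
        # Write out whatever partial chunk remains.
        if self.buf:
            self.raw.write(bytes(self.buf))
            self.buf.clear()
            self.flushes += 1

# 10,000 writes of 512 bytes become just 5 writes to the backing store.
backing = io.BytesIO()
w = CoalescingWriter(backing)
for _ in range(10_000):
    w.write(b"x" * 512)
w.close()
print(w.flushes, len(backing.getvalue()))  # -> 5 5120000
```

This only helps when the small writes are sequential from one process; it does not fix the shared-file interleaving across 224 clients discussed earlier in the thread, which is a cache-coherency problem rather than a buffering one.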