Hello,
I have Lustre running on a ~300 node cluster, using 8 OSTs. The Lustre
file system is in test mode, so there are only 1-2 processes using it at a
time. I have noticed that 'ls -l' is taking a lot of time. Measured
elapsed time ranges from 37 seconds to 2 minutes, 38 seconds. This occurs
when I do an 'ls -l' on the same directory that another (parallel) job is
using. The parallel job doesn't have to be very large -- 24-48 processes,
each writing their own file to the directory. I am curious as to why the
'ls -l' is so slow. Does Lustre have to contact each client to get the
relevant attributes (atime/mtime/size/etc) or to make each client flush
its cache?
Note that it is not a huge issue; I am just curious.
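
For anyone who wants to narrow this down, here is a minimal sketch that
times the name listing and the per-file attribute lookups separately. It
is an illustration, not the measurement reported above: the function name
time_listing is made up, and it assumes that 'ls -l' amounts to one
directory read followed by one lstat per entry, as on a generic POSIX
system. It says nothing about what Lustre does internally.

```python
import os
import sys
import time


def time_listing(path):
    """Time the name listing and the per-entry lstat calls separately.

    time_listing is a hypothetical helper name, used only for illustration.
    """
    t0 = time.perf_counter()
    names = os.listdir(path)  # names only, roughly what a plain `ls` needs
    t1 = time.perf_counter()

    stat_ok = 0
    for name in names:
        try:
            # Attribute lookup (size, mtime, ...), roughly what `ls -l` adds.
            os.lstat(os.path.join(path, name))
            stat_ok += 1
        except FileNotFoundError:
            # The entry was removed between the listing and the lstat.
            pass
    t2 = time.perf_counter()

    return {
        "entries": len(names),
        "stat_ok": stat_ok,
        "list_s": t1 - t0,
        "stat_s": t2 - t1,
    }


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(time_listing(target))
```

Running it against the shared directory once while the parallel job is
writing and once while it is idle should show whether the time goes into
the listing or into the attribute lookups.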
Sonja Tideman