[Please note that the @lists.sourceforge.net lists are no longer active;
I have CCed the new list.]
Hawk Chen wrote:
> Hello,
> I have downloaded a reinterpreted version of the venerable Andrew file
> system benchmark. The author is Yasushi Saito, HP Labs, Storage Systems
> Department.
> I found that it would need about 5 minutes to execute this benchmark
> in a Linux ext3 filesystem, but about 25 minutes in a Lustre client
> filesystem with 4 OSTs, 1 MDS, and 1 client. The test environments were
> the same.
So far, we have focused our efforts primarily on scalable metadata --
excellent performance with many clients, but not necessarily with just
one client -- and extremely good medium/large file I/O performance.
If I am looking at the correct benchmark
(http://www.hpl.hp.com/personal/Yasushi_Saito/andrew-tcl-bench-0.3.tgz),
then this test appears to be mostly metadata (mkdir, stat, creations)
and small file I/O (compilation).
In general, because ext3 is a local filesystem and Lustre is a
distributed network filesystem, there will be more overhead for Lustre
to perform basic operations (distributed locking, RPC latency, etc.) than
comparable operations on a local filesystem.
That said, Lustre can provide better client performance than a single
ext3 filesystem for bulk I/O (provided network bandwidth is available)
because it allows parallel I/O to multiple storage devices. We have
benchmarked sustained single-client write performance at 660MB/s, which
is considerably faster than any local disk that we are aware of.
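
As a rough illustration of why striping helps bulk I/O, here is a
back-of-the-envelope sketch in Python. It is not Lustre code: the
function name and the simple "sum of devices, capped by the network"
model are my own assumptions, and the numbers in the example are made up
purely to show the arithmetic.

```python
def aggregate_bandwidth(n_osts, per_ost_mb_s, client_network_mb_s):
    """Estimate the bulk I/O rate one client could see (hypothetical model).

    Assumes a file striped evenly over n_osts storage targets, each able
    to sustain per_ost_mb_s, with the client's network link as the only
    other limit. Real systems have more bottlenecks than this.
    """
    storage_limit = n_osts * per_ost_mb_s
    return min(storage_limit, client_network_mb_s)


if __name__ == "__main__":
    # Illustrative values only, not measurements.
    print(aggregate_bandwidth(4, 50, 1000))  # storage-bound case
    print(aggregate_bandwidth(4, 50, 100))   # network-bound case
```

The point is only that a single local disk is stuck with its own
per-device rate, while a striped file can approach the sum of the
devices until the network becomes the limit.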
There are many metadata and small-file improvements around the corner:
Lustre 1.4 will have finer-granularity metadata locking and small file
I/O, which will likely help this benchmark a lot. Lustre 2.x will have
a write-back metadata cache, proxies, and clustered metadata servers.
> I also found that executing the "ls" command would need more time
> in the Lustre filesystem than in the ext3 filesystem.
> Can you tell me why this happens?
For the very same reason -- ls on a large directory will do dozens or
hundreds of system calls to fetch the attributes for the files, each of
which will require a trip to the server. This is something for which we
also plan improvements, in the form of pre-fetching the attributes when
we detect that a process is behaving like ls.
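
To make the access pattern concrete, here is a small Python sketch of
what an "ls -l" style listing does: one directory read, then one
lstat() per entry. The helper names (ls_long, estimated_attr_time) are
hypothetical, and the round-trip figure in the example is an assumed
value, not a measurement of Lustre.

```python
import os
import tempfile


def ls_long(path):
    """Mimic `ls -l`: read the directory, then lstat() every entry.

    Returns the (name, size) pairs and the number of lstat() calls made.
    """
    names = sorted(os.listdir(path))
    stat_calls = 0
    entries = []
    for name in names:
        st = os.lstat(os.path.join(path, name))  # one attribute fetch per file
        stat_calls += 1
        entries.append((name, st.st_size))
    return entries, stat_calls


def estimated_attr_time(stat_calls, round_trip_seconds):
    """If each attribute fetch costs one server round trip, the total
    grows linearly with the number of entries (simplified model)."""
    return stat_calls * round_trip_seconds


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for i in range(200):
            open(os.path.join(d, "f%03d" % i), "w").close()
        entries, calls = ls_long(d)
        # 0.001 s per round trip is an assumption for illustration only.
        print(calls, estimated_attr_time(calls, 0.001))
```

On a local filesystem each of those lstat() calls is served from local
caches or disk; on a network filesystem without attribute pre-fetching,
each one can turn into a separate request to the server.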
Thanks--
-Phil