thr3ads.net - Lustre discuss - [Lustre-discuss] performance tuning w/ dbench, bonnie++ [Dec 2009]


Jay Christopherson

2009-Dec-21 17:57 UTC

[Lustre-discuss] performance tuning w/ dbench, bonnie++

I have a relatively small Lustre environment. I'm running 4 OSSs with 4
OSTs, running version 1.8.1.1 (I recently upgraded from 1.6). The MDS is a
separate host with separate disk. I've been using Lustre to host shared
logs, files, and application queues, most recently a JMS shared queue. It
had been working well until we tuned our Java application and databases
to the point where throughput on Lustre became the bottleneck. When we
moved our application queues to local disk instead of shared disk, our
application throughput really shot through the roof. Up to that point, we had
been seeing a lot of "pauses" in IO: throughput (as measured by our
application) would be running right along and then, periodically, we would
see unexplained pauses, where IO would take nearly 2000ms to complete as
opposed to more normal times of under 200ms. This is IO as we define it in
terms of our application, not simply disk IO.
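
One way to make such pauses visible outside the application is to time many small synced writes and flag the slow ones. The following is only a minimal sketch, not the poster's method: the function names, the 4 KB write size, and the 200 ms threshold (taken from the "normal" figure above) are all illustrative assumptions.

```python
import os
import tempfile
import time


def time_synced_writes(directory, count=50, size=4096):
    """Return per-operation latencies (ms) for small write+fsync calls."""
    payload = b"x" * size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(count):
                start = time.perf_counter()
                f.write(payload)
                f.flush()
                # Force the data to storage so the timing includes the real IO.
                os.fsync(f.fileno())
                latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.unlink(path)
    return latencies


def find_pauses(latencies_ms, threshold_ms=200.0):
    """Return (index, latency) pairs for operations slower than threshold_ms."""
    return [(i, ms) for i, ms in enumerate(latencies_ms) if ms > threshold_ms]
```

Running `time_synced_writes` once against a directory on the Lustre mount and once against local disk, then passing each result to `find_pauses`, would show whether the stalls can be reproduced without the application in the loop.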

After eliminating everything else, we moved our application queues and
logging to local disk and throughput stabilized throughout the entire run of
a test at peak loads. Trying to explain this behavior, we started
benchmarking Lustre vs. local disk. At first, the times were really bad,
showing something like 15MB/s on Lustre vs. 400MB/s local, as reported by
dbench. After much tuning and re-architecting of our Lustre setup (which,
admittedly, was not ideal), we were able to see big improvements, as
evidenced by repeated Bonnie++ tests. However, no matter what our
configuration, dbench never shows any improvement. It never shows more than
25MB/sec, with 5 clients. I *KNOW* we are getting better throughput, but I
need to be able to prove it. Bonnie++ is showing improvement (100+MB/sec),
but I'd like to see two or three sources verify it for me
before I go back and start re-testing our application again.

I'm using Bonnie++ and dbench like so:

# bonnie++ -d /logstore/test

# dbench -t 60 -D /logstore/test 5

I'm hoping it's simply a matter of me not using the tests correctly or
something else that makes me the culprit. If there are other tests that I
should be running, that would be helpful too. I looked at IOR, which has been
a pain to get running since LAM really, really, really doesn't want to
compile or install correctly on my system (CentOS 5.2, x86_64).

Andreas Dilger

2009-Dec-21 18:24 UTC

On 2009-12-21, at 10:57, Jay Christopherson wrote:
> However, no matter what our configuration, dbench never shows any
> improvement. It never shows more than 25MB/sec, with 5 clients. I
> *KNOW* we are getting better throughput, but I need to be able to
> prove it.
Dbench should not be considered a performance benchmark, but rather
only a load testing tool.

The main reason dbench is not a good benchmark tool is that it creates
and deletes files fairly rapidly, and if the file has never been
written to disk before it is deleted, it will still count this IO in
the "bandwidth" number even though no bytes hit the disk. In real
world usage it is fairly uncommon that files are created and deleted
in such a short time. Compilers used to be the common usage example
for this, but today they dump temp files to tmpfs (RAM-based) and not
to disk filesystems.
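
The effect described above can be sketched with a small create-write-delete loop that reports "bandwidth" with and without forcing the data to storage. This is an illustrative approximation, not how dbench itself is implemented; the function name, file count, and file size are assumptions.

```python
import os
import tempfile
import time


def create_write_delete(directory, files=20, size=1 << 20, sync=False):
    """Create, write and delete `files` files; return apparent MB/s."""
    payload = b"\0" * size
    start = time.perf_counter()
    for _ in range(files):
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(payload)
                f.flush()
                if sync:
                    # Wait until the data has actually reached storage.
                    os.fsync(f.fileno())
        finally:
            # Without sync, the data may be dropped from cache unwritten.
            os.unlink(path)
    elapsed = time.perf_counter() - start
    return (files * size) / (1024 * 1024) / elapsed
```

On a filesystem that delays writes, the `sync=False` figure may largely reflect memory speed, while `sync=True` is closer to what the storage sustains. Comparing the two numbers on local disk and on Lustre gives a rough sense of how much of a short-lived-file benchmark result is cache.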

Lustre is an "eager writer" in that it submits data to the server as
soon as it has a full RPC worth of data. Local filesystems often
delay writes by 5-30s, which is fine because they control the VM
directly in conjunction with the disk IO. For Lustre, which sometimes
has 10000 clients, delaying the writes by 30s would mean interfering
with application traffic on the network, and would also waste
30*{server bandwidth} of data that could already have been written to
disk.
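
As a worked instance of the 30*{server bandwidth} figure, the sketch below multiplies a server's bandwidth by the delay. The 100 MB/s value is a hypothetical number chosen for illustration and does not come from the thread.

```python
def idle_capacity_mb(server_bandwidth_mb_s, delay_s=30):
    """Data volume (MB) a server could have written during a write delay."""
    return server_bandwidth_mb_s * delay_s


# Hypothetical server sustaining 100 MB/s, with writes held back for 30 s.
print(idle_capacity_mb(100))  # prints 3000
```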

In the end, the most important metric you need to look at is whether
your application has seen improvement now that you restructured your
Lustre config. If it has, then end of story. If it hasn't, then your
application is doing something that Lustre doesn't like very much
(e.g. small IO, concurrent appends to the same file). It might be
possible to tune Lustre further, depending on what it is your
application is doing, but it is pointless to optimize for dbench,
since it is unlikely to be doing exactly what your application is doing.
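
To check whether small IO is the pattern hurting the application, one rough approach is to write the same amount of data in different chunk sizes and compare the results. The following is a sketch under assumptions: the function name, the 4 MB total, and the chunk sizes are illustrative, and it only imitates the write-size aspect of an application's behaviour.

```python
import os
import tempfile
import time


def write_throughput(directory, total=4 << 20, chunk=4096):
    """Write `total` bytes in `chunk`-sized pieces, fsync once, return MB/s."""
    payload = b"\0" * chunk
    fd, path = tempfile.mkstemp(dir=directory)
    start = time.perf_counter()
    try:
        # buffering=0 makes every chunk its own write() system call,
        # so the chunk size is the IO size the filesystem actually sees.
        with os.fdopen(fd, "wb", buffering=0) as f:
            written = 0
            while written < total:
                written += f.write(payload)
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.unlink(path)
    return written / (1024 * 1024) / elapsed
```

Calling it with `chunk=4096` and again with `chunk=1 << 20` against the same directory shows how sensitive that filesystem is to write size. A large gap on Lustre but not on local disk would point toward the small-IO explanation.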

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

Peter Grandi

2009-Dec-24 13:18 UTC

[ ... ]
>> However, no matter what our configuration, dbench never shows
>> any improvement.  It never shows more than 25MB/sec, with 5
>> clients. I *KNOW* we are getting better throughput, but I
>> need to be able to prove it.
> Dbench should not be considered a performance benchmark, but
> rather only a load testing tool. The main reason dbench is
> not a good benchmark tool is that it creates and deletes files
> fairly rapidly, and if the file has never been written to disk
> before it is deleted, it will still count this IO in the
> "bandwidth" number even though no bytes hit the disk.
To add to these wise words, I reckon that "bonnie++" is also a
poor benchmarking tool, regrettably very popular; I usually
prefer Bonnie 1.4 for quick tests or FIO for bigger ones (both
with the right options), and 'lmdd' from 'lmbench':

 
http://www.linux-archive.org/ext3-users/284045-best-file-system-performance-analysis-tool.html

Unfortunately file system benchmarking requires great patience
and insight, and tool selection is part of that, including
knowing why 'dbench' or 'bonnie++' are not the best performance
testing tools.
