anhvu.q.le at exxonmobil.com
2007-Oct-11 11:50 UTC
[Lustre-discuss] Lustre 1.6.1 and Performance Issue
Our Lustre file system currently comprises 6 HP DL380s with dual, dual-core 2.66 GHz Intel Xeon processors and 8 GB of RAM. Two servers are used for Lustre metadata and the remaining 4 are used for object storage, each equipped with dual, dual-port Qlogic cards and an InfiniBand SDR card. Both metadata servers are directly attached to an HP SFS20 storage array (RAID 5). Two EMC CX-380 storage couplets front 144 TB of usable SATA II disk and are attached to the OSSs via 4 Gb Fibre Channel (FC). Each of the OSS's Qlogic ports functions as the primary path for three OSTs (3 TB LUNs).

The problem I am having is that regardless of how we stripe our Lustre file systems, our aggregate performance is always roughly that of a single OST, which is around 700 MB/s. Any help is greatly appreciated!

Thanks,

Anhvu Q. Le
ExxonMobil GSC Information Technology
Business Line Infrastructure, Technical Systems
Phone: 713-431-4739
Email: anhvu.q.le at exxonmobil.com
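For context, the stripe layout is what determines how many OSTs a single file can drive; on Lustre 1.6 it is controlled with the lfs utility. A minimal sketch, assuming a client mount at /mnt/lustre (a placeholder path; the exact option syntax differs between 1.6 releases, with older ones using the positional form "lfs setstripe <file> <stripe-size> <start-ost> <stripe-count>"):

    # Show the stripe count, stripe size and OST objects of an existing file
    lfs getstripe /mnt/lustre/somefile

    # Make new files under this directory stripe across all available OSTs
    # (-c -1 means "all OSTs"; -c 1 would pin each file to a single OST)
    lfs setstripe -c -1 /mnt/lustre/striped_dir

With the default stripe count of 1, each file lives on exactly one OST, so single-file bandwidth tops out at one OST's speed no matter how many OSTs exist; wide striping, or many concurrently written files, is what aggregates them.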
Niklas Edmundsson
2007-Oct-11 14:34 UTC
[Lustre-discuss] Lustre 1.6.1 and Performance Issue
On Thu, 11 Oct 2007, anhvu.q.le at exxonmobil.com wrote:

> Our Lustre file system currently comprises 6 HP DL380s with dual, dual-core
> 2.66 GHz Intel Xeon processors and 8 GB of RAM. Two servers are used for
> Lustre metadata and the remaining 4 are used for object storage, each
> equipped with dual, dual-port Qlogic cards and an InfiniBand SDR card. Both
> metadata servers are directly attached to an HP SFS20 storage array (RAID
> 5). Two EMC CX-380 storage couplets front 144 TB of usable SATA II disk and
> are attached to the OSSs via 4 Gb Fibre Channel (FC). Each of the OSS's
> Qlogic ports functions as the primary path for three OSTs (3 TB LUNs). The
> problem I am having is that regardless of how we stripe our Lustre file
> systems, our aggregate performance is always roughly that of a single OST,
> which is around 700 MB/s.

The first step would be to isolate where your bottleneck is... My first guess would be the hardware RAID boxes.

I'm assuming you're talking about large-file I/O performance, which would imply that the MDS is OK. However, I would strongly advise against having metadata on parity RAID (RAID5/6). Use mirroring instead.

Have you benchmarked your storage backends, for example with sgpdd_survey? In parallel, simulating Lustre load? The same goes for all other involved parts: benchmark them both individually and in parallel so you're sure they can take the load.

If you haven't read it already, I'd recommend reading the Lustre manual, since it answers a lot of hardware config/benchmark/tuning questions.

/Nikke

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se  |  nikke at hpc2n.umu.se
---------------------------------------------------------------------------
 I am Garfield of Borg. Hairballs are irrelevant. ..<HACK>..
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
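For reference, a rough client-side sketch of the "benchmark in parallel" idea: pin one file to each OST and write them all at once, so the aggregate rate shows whether the ceiling is the storage backends or something shared upstream of them. The mount point, OST count and sizes below are placeholders, and the lfs option letters vary between Lustre releases:

    # One file per OST, written in parallel with direct I/O to bypass the client cache
    for i in 0 1 2 3; do
        lfs setstripe -c 1 -i $i /mnt/lustre/bench.$i    # stripe count 1, starting on OST index $i
        dd if=/dev/zero of=/mnt/lustre/bench.$i bs=1M count=4096 oflag=direct &
    done
    wait
    # Sum the per-file MB/s reported by dd. If the total still stalls near
    # ~700 MB/s, the limit is somewhere shared (single client, LNET/IB link,
    # or a common FC path) rather than the individual OST backends.

Running the same kind of parallel test from several clients at once helps separate a single-client or single-link limit from a server-side or storage limit.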