I'm planning to hang 58 terabytes off of a PowerEdge 1950 with 4 CPUs and 8 gigs of memory. My MDS is a dual-core Opteron with a 250 GB RAID1 metadata volume and 2 GB of RAM. Do you think this hardware configuration is sane?

-Aaron
Aaron Knister wrote:
> I'm planning to hang 58 terabytes off of a PowerEdge 1950 with 4 CPUs
> and 8 gigs of memory. My MDS is a dual-core Opteron with a 250 GB
> RAID1 metadata volume and 2 GB of RAM. Do you think this hardware
> configuration is sane?
>
> -Aaron

Depends how much you push your disks. Your I/O wait will shoot up very quickly if the disks slow down even a bit. My experience is you'll probably want more machines, unless you're not pushing your disks (in which case, why run Lustre?).

We have about 85 TB of disk (in 24 LUNs) hanging off 4 PE2950s with those same specs. They are set up in failover pairs (each handles 6 LUNs), but I can't run too long on a single machine before it starts thrashing when it takes over the other node's 6 LUNs.

Daniel
In what sense does it start thrashing? I'm thinking about building a PowerEdge with 8 CPUs and 16 GB of memory to handle 3x 9 TB LUNs. Does that sound sane?

On Oct 5, 2007, at 1:02 PM, Daniel Leaberry wrote:
> We have about 85 TB of disk (in 24 LUNs) hanging off 4 PE2950s with
> those same specs. They are set up in failover pairs (each handles 6
> LUNs), but I can't run too long on a single machine before it starts
> thrashing when it takes over the other node's 6 LUNs.
Make that 6x 9.7 TB LUNs.

On Oct 5, 2007, at 1:11 PM, Aaron Knister wrote:
> In what sense does it start thrashing? I'm thinking about building a
> PowerEdge with 8 CPUs and 16 GB of memory to handle 3x 9 TB LUNs. Does
> that sound sane?
Hi,

I have 8 MD1000s (~90 TB of raw disk space) connected to 2 Dell servers (2 quad-core CPUs and 4 gigabit interfaces bonded) on an HPC cluster, with 6 Lustre volumes and parallel I/O. For bigger files, performance is more than adequate; with large numbers of small files, performance is close to terrible. It does the job, is fast enough, and is stable. If your storage requirement is for big files, one or two more PE1950s for parallel I/O help.

I have another Lustre installation with one OSS. Again, the throughput is better than NFS when mounted on 21 nodes. The best performance I have seen for large numbers of small files is with GFS and OCFS2. (Maybe you could have a hybrid, with a few volumes on Lustre and a few volumes on OCFS2? Never tried that!)

The onboard Broadcoms on the PE1950 have a tendency to drop frames; you just need to bump up the receive buffers with ethtool. I had a few occasions where the two Lustre OSSs had kernel panics under heavy I/O and large numbers of dropped frames. Once the dropped-frame problem was fixed, I have not had that problem since.

Regards
Balagopal

Aaron Knister wrote:
> I'm planning to hang 58 terabytes off of a PowerEdge 1950 with 4 CPUs
> and 8 gigs of memory. My MDS is a dual-core Opteron with a 250 GB
> RAID1 metadata volume and 2 GB of RAM. Do you think this hardware
> configuration is sane?
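For reference, the receive-buffer bump Balagopal describes is done through ethtool's ring settings. A minimal sketch, assuming a bnx2-driven onboard port called eth0 (the interface name and ring size are examples only; the usable maximum is whatever the hardware reports):

    # show current and maximum RX/TX ring sizes
    ethtool -g eth0

    # raise the RX ring toward the hardware maximum reported above
    ethtool -G eth0 rx 2048

    # check whether the NIC is still discarding frames afterwards
    ethtool -S eth0 | grep -i discard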
Daniel Leaberry
Systems Administrator
iArchives
Tel: 801-494-6528
Cell: 801-376-6411

Aaron Knister wrote:
> In what sense does it start thrashing? I'm thinking about building a
> PowerEdge with 8 CPUs and 16 GB of memory to handle 3x 9 TB LUNs. Does
> that sound sane?

We run heavy amounts of small files. We start thrashing when the disks can't keep up. CPU really doesn't have much to do with it. If you're not running a lot of small files, you'll probably be fine.

Daniel
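One way to watch for the thrashing Daniel describes is to keep an eye on per-device latency and utilization on the OSS while it is under load; a quick sketch using the standard sysstat/procps tools (nothing Lustre-specific assumed):

    # extended per-device stats every 5 seconds: %util pinned near 100
    # with rising await/avgqu-sz means the disks can no longer keep up
    iostat -x 5

    # the "wa" (I/O wait) column gives a cruder whole-box signal
    vmstat 5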
On Oct 05, 2007 13:14 -0400, Aaron Knister wrote:
> Make that 6x 9.7 TB LUNs.

Lustre (== ext3) doesn't support >= 8 TB LUNs.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
On Oct 05, 2007 11:02 -0600, Daniel Leaberry wrote:
> We have about 85 TB of disk (in 24 LUNs) hanging off 4 PE2950s with
> those same specs. They are set up in failover pairs (each handles 6
> LUNs), but I can't run too long on a single machine before it starts
> thrashing when it takes over the other node's 6 LUNs.

If you have 12 OSTs on a single node, that means up to 12 * 400 MB = 4.8 GB of RAM pinned just by the ext3 journals. Either you need a lot more RAM than this (8 GB, for example), or you need to shrink the journal size to something like 128 MB (use tune2fs to remove the journal and then re-add it). 128 MB should be fine unless you have many hundreds of clients doing concurrent I/O.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
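For concreteness, the journal shrink Andreas describes (remove the journal with tune2fs, then re-add it at a smaller size) looks roughly like this; the device name is a placeholder, and the OST must be unmounted first:

    # with the OST unmounted, drop the existing (large) journal
    tune2fs -O ^has_journal /dev/sdb

    # make sure the filesystem is clean before re-adding the journal
    e2fsck -f /dev/sdb

    # re-create the journal at 128 MB, then bring the OST back up as usual
    tune2fs -J size=128 /dev/sdb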
Oh, right, I forgot about that. Well... if I had an 8 TB LUN and split it into 2 volume groups using LVM, do you think the performance would be worse than making 2 RAIDs at the hardware level?

-Aaron

On Oct 5, 2007, at 6:18 PM, Andreas Dilger wrote:
> On Oct 05, 2007 13:14 -0400, Aaron Knister wrote:
>> Make that 6x 9.7 TB LUNs.
>
> Lustre (== ext3) doesn't support >= 8 TB LUNs.

Aaron Knister
Associate Systems Administrator/Web Designer
Center for Research on Environment and Water
(301) 595-7001
aaron at iges.org
On Oct 06, 2007 10:28 -0400, Aaron Knister wrote:
> Oh, right, I forgot about that. Well... if I had an 8 TB LUN and split
> it into 2 volume groups using LVM, do you think the performance would
> be worse than making 2 RAIDs at the hardware level?

Well, it won't be doing the disks any favours, since you'll now have contention between the OSTs, and the kernel will be doing a poor job with the I/O elevator decisions. I would suggest making 2 smaller RAID LUNs instead.

In the end it is up to you to decide if the I/O performance is acceptable. You can do some testing using lustre-iokit to see what the component device performance is.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
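The lustre-iokit surveys are the thorough way to measure this. As a very rough first pass on a single candidate LUN (not a substitute for lustre-iokit), a direct-I/O streaming read with dd gives a baseline figure; the device name is a placeholder, and a write test should only be run on a device that holds no data yet:

    # read 4 GB straight off the raw device, bypassing the page cache
    dd if=/dev/sdc of=/dev/null bs=1M count=4096 iflag=direct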
As RH 5.1 will support 16 TB ext3 partitions, will Lustre inherit that functionality?

> -----Original Message-----
> From: lustre-discuss-bounces at clusterfs.com On Behalf Of Andreas Dilger
> Sent: Wednesday, October 10, 2007 9:26 AM
> Subject: Re: [Lustre-discuss] Hardware Question
>
> Well, it won't be doing the disks any favours, since you'll now have
> contention between the OSTs, and the kernel will be doing a poor job
> with the I/O elevator decisions. I would suggest making 2 smaller RAID
> LUNs instead.
On Oct 10, 2007 09:40 -0600, Lundgren, Andrew wrote:
> As RH 5.1 will support 16 TB ext3 partitions, will Lustre inherit that
> functionality?

We haven't looked at this yet. The ldiskfs code is ext3 + patches, so there is some chance that it will work (more likely on 64-bit platforms), but we haven't audited the ldiskfs patches to check if they are 32-bit clean.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
So if I have arrays with 15 drives in them, should I just configure two smaller arrays? Also, if I make a giant 30-terabyte filesystem out of, say, underlying 6 TB disk arrays, and one of my disk arrays bites the dust, what happens to the rest of the filesystem, and how easy is it to recover from this situation?

-Aaron

On Oct 10, 2007, at 11:48 AM, Andreas Dilger wrote:
> We haven't looked at this yet. The ldiskfs code is ext3 + patches, so
> there is some chance that it will work (more likely on 64-bit
> platforms), but we haven't audited the ldiskfs patches to check if
> they are 32-bit clean.

Aaron Knister
Associate Systems Administrator/Web Designer
Center for Research on Environment and Water
(301) 595-7001
aaron at iges.org
If the intention is not size but to spread your I/Os over as many spindles as possible, you could still have these volume groups. Once you create these volumes, you could have them sliced into multiple LUNs whose individual sizes are acceptable to ext3.

Regards
-Peter

From: lustre-discuss-bounces at clusterfs.com [mailto:lustre-discuss-bounces at clusterfs.com] On Behalf Of Aaron Knister
Sent: Wednesday, October 17, 2007 7:30 PM
To: Andreas Dilger
Cc: lustre-discuss at clusterfs.com; Lundgren, Andrew
Subject: Re: [Lustre-discuss] Hardware Question

So if I have arrays with 15 drives in them, should I just configure two smaller arrays? Also, if I make a giant 30-terabyte filesystem out of, say, underlying 6 TB disk arrays, and one of my disk arrays bites the dust, what happens to the rest of the filesystem, and how easy is it to recover from this situation?
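A minimal sketch of the slicing Peter suggests, carving one large RAID LUN into sub-8 TB logical volumes that each become an OST (device, VG/LV names and sizes are examples only; note Andreas's earlier caveat about contention between OSTs that share the same spindles):

    # put the big RAID LUN under LVM
    pvcreate /dev/sdd
    vgcreate ostvg /dev/sdd

    # carve it into two logical volumes that stay under the 8 TB limit
    lvcreate -L 7T -n ost0 ostvg
    lvcreate -L 7T -n ost1 ostvg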