I have a filesystem with over 1M directories which are filled with hourly temperatures of a controller environment going back years. They are hosted on our Lustre filesystem, and I constantly do an fstat() and fstat64() to get each directory's create time. I was wondering if there is a way to speed this operation up? Is it possible for me to increase the MDS cache? Are there any tricks I can perform to speed this operation up?

TIA
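One way to cut down the per-directory stat() round-trips is to collect all the change times in a single tree walk instead of issuing fstat() calls from application code. A minimal sketch, assuming GNU find on the client (the directory layout here is hypothetical, not from the thread):

```shell
# Hypothetical layout standing in for the 1M-directory tree.
tmp=$(mktemp -d)
mkdir -p "$tmp/2008/01" "$tmp/2008/02"

# GNU find prints each directory's status-change time (%C@) as it
# walks the tree, so the metadata lookups are batched with readdir
# instead of being one syscall per directory from the application.
find "$tmp" -type d -printf '%C@ %p\n' | sort -n > /tmp/dir_ctimes.txt

wc -l < /tmp/dir_ctimes.txt   # one line per directory found
rm -rf "$tmp"
```

Note that stat()/fstat() report ctime (last status change), not a true creation time, so this gives the same information the per-directory calls would.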
> I have a filesystem with over 1m directories which are filled
> with hourly temperatures of a controller environment for
> years. [ ... ]

Using *any* filesystem as a database-of-record is usually an exceptionally bad and irrecoverable idea. Using a network filesystem with split metadata/data servers as such is beneath comment. Perhaps it may be too late for the case above, but perhaps not. Consider the difference here:

http://WWW.sabi.co.UK/blog/0802feb.html#080216

Most performance problems resulting from usage patterns like the above can only be fixed by serious amounts of cash or as-yet undiscovered research advances.
On Sep 11, 2008 06:28 -0400, Mag Gam wrote:
> I have a filesystem with over 1M directories which are filled with
> hourly temperatures of a controller environment for years. They are
> being hosted on our Lustre filesystem, and I constantly do an fstat()
> and fstat64() to get the directory's create time. I was wondering if
> there is a way to speed this operation? Is it possible for me to
> increase the MDS cache? Are there any tricks I can perform to speed
> this operation up?

To cache 1M directory entries would need in the range of 6GB of RAM.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
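That estimate works out to roughly 6KB of MDS cache per directory entry; a quick back-of-envelope check (the 6KB-per-entry figure is simple division, not a breakdown from the thread):

```shell
# 6GB of RAM spread over 1M cached directory entries.
entries=1000000
total=$((6 * 1024 * 1024 * 1024))
echo "$((total / entries)) bytes of cache per directory entry"
```

Scaled the other way, the poster's 32GB MDS would have room for roughly 5M entries at the same per-entry cost, which is why tuning rather than hardware is the next question.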
I have 32GB on the MDS. So, where do I start? :-)

On Thu, Sep 11, 2008 at 6:05 PM, Andreas Dilger <adilger at sun.com> wrote:
> On Sep 11, 2008 06:28 -0400, Mag Gam wrote:
> [ ... ]
>
> To cache 1M directory entries would need in the range of 6GB of RAM.
>
> Cheers, Andreas
While doing a large scan of a "large" Lustre filesystem (10TB) I noticed the client hung the host.

I did a simple 'find /oagre/lustre/fs' and it naturally took 36 hours since there are many small files. But we noticed the host crashed, with ll_socket<pid of find> and the 'find' process taking up 100% CPU. No commands were working, but I was able to ssh into the box. We are using Lustre 1.6.5.1. Is this a known issue? Could this be the statahead issue mentioned in the previous threads?

Sorry if this is redundant.

TIA

On Thu, Sep 11, 2008 at 7:50 PM, Mag Gam <magawake at gmail.com> wrote:
> I have 32GB on the MDS. So, where do I start? :-)
> [ ... ]
This happened again :-(

Anyone have any insight on a problem similar to this?

TIA

On Mon, Sep 15, 2008 at 8:18 PM, Mag Gam <magawake at gmail.com> wrote:
> While doing a large scan of a "large" Lustre filesystem (10TB) I
> noticed the client hung the host.
>
> I did a simple 'find /oagre/lustre/fs' and it naturally took 36 hours
> since there are many small files. But we noticed the host crashed,
> with ll_socket<pid of find> and the 'find' process taking up 100%
> CPU. No commands were working, but I was able to ssh into the box. We
> are using Lustre 1.6.5.1. Is this a known issue? Could this be the
> statahead issue mentioned in the previous threads?
> [ ... ]
On Sep 17, 2008 07:24 -0400, Mag Gam wrote:
> This happened again :-(
>
> Anyone have any insight on a problem similar to this?
>
> On Mon, Sep 15, 2008 at 8:18 PM, Mag Gam <magawake at gmail.com> wrote:
> > Is this a known issue? Could this be a statahead
> > issue mentioned in the previous threads?

Probably yes, so trying the workarounds previously mentioned in those emails will tell us whether this is the same problem or not.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
I am not sure what workaround you are speaking about. I can try to recreate the problem and submit a bug report, but what information do I need to submit (kernel logs, Lustre logs, etc.)? I am not sure how to generate these, and I don't think the default client settings have them enabled.

TIA

On Thu, Sep 18, 2008 at 5:35 PM, Andreas Dilger <adilger at sun.com> wrote:
> On Sep 17, 2008 07:24 -0400, Mag Gam wrote:
> [ ... ]
>
> Probably yes, so trying the workarounds previously mentioned in those
> emails will tell us whether this is the same problem or not.
>
> Cheers, Andreas
On Sep 19, 2008 20:46 -0400, Mag Gam wrote:
> I am not sure what workaround you are speaking about.

You mentioned in your email that this is likely the statahead issue from previous threads. Most of those had a proposed workaround:

    echo 0 > /proc/fs/lustre/llite/{fsname}/statahead_max

> On Thu, Sep 18, 2008 at 5:35 PM, Andreas Dilger <adilger at sun.com> wrote:
> [ ... ]

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
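A client may have more than one Lustre filesystem mounted, so the workaround can be looped over every matching /proc entry. A minimal sketch, assuming the Lustre 1.6 /proc layout quoted in the thread (the loop itself is not from the thread):

```shell
# Disable statahead on every mounted client filesystem.  The existence
# check makes this a harmless no-op on hosts with no Lustre mounts,
# where the glob stays unexpanded.
disabled=0
for f in /proc/fs/lustre/llite/*/statahead_max; do
    [ -e "$f" ] || continue
    echo 0 > "$f"
    disabled=$((disabled + 1))
done
echo "statahead disabled on $disabled filesystem(s)"
```

Run as root; the setting does not persist across remounts, so it would need to be reapplied (or scripted at mount time) after each client restart.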