Daniel Leaberry
2007-Aug-17 07:28 UTC
[Lustre-discuss] Sudden ost crashing appears to take > 100s to find free extents
I have an interesting problem. I've made no changes to the IB DDN
storage, yet I'm finding OSTs crashing left and right. The thread
watchdog gets triggered, and the most relevant part of the dump is the
following. It appears that it took > 100s to find a free extent. On the
OSS I watch with iostat as the LUN is saturated with small read requests.
We've just hit 80% full (we planned on going to 90% full) and we do
have a lot of small files (~75 million).
Is there any way to tune the extent searching code? Does my analysis seem
likely? Is this fixed in 1.6.1 such that I should upgrade immediately?
Thanks,
Daniel
Call Trace:<ffffffffa0024125>{:sd_mod:sd_iostats_bump+147}
<ffffffffa031429a>{:ib_srp:srp_host_qcommand+399}
<ffffffff80253ebf>{deadline_next_request+34}
<ffffffff8024b329>{elv_next_request+238}
<ffffffff80309843>{io_schedule+38}
<ffffffff8017843c>{__wait_on_buffer+125}
<ffffffff801782c2>{bh_wake_function+0}
<ffffffff801782c2>{bh_wake_function+0}
<ffffffffa05771d9>{:ldiskfs:ldiskfs_mb_init_cache+469}
<ffffffff80157ba2>{add_to_page_cache+167}
<ffffffffa0577792>{:ldiskfs:ldiskfs_mb_load_buddy+257}
<ffffffffa057a89f>{:ldiskfs:ldiskfs_mb_new_blocks+1946}
<ffffffffa05b480e>{:fsfilt_ldiskfs:ldiskfs_ext_new_extent_cb+729}
<ffffffffa0574362>{:ldiskfs:ldiskfs_ext_find_extent+205}
<ffffffffa0575a69>{:ldiskfs:ldiskfs_ext_walk_space+535}
<ffffffffa05b4535>{:fsfilt_ldiskfs:ldiskfs_ext_new_extent_cb+0}
<ffffffffa05b4b56>{:fsfilt_ldiskfs:fsfilt_map_nblocks+236}
<ffffffffa05b4d50>{:fsfilt_ldiskfs:fsfilt_ldiskfs_map_ext_inode_pages+457}
<ffffffffa05d59dc>{:obdfilter:filter_direct_io+892}
<ffffffffa05b36f2>{:fsfilt_ldiskfs:fsfilt_ldiskfs_brw_start+649}
<ffffffffa05d6fb9>{:obdfilter:filter_commitrw_write+3494}
<ffffffff80308ecd>{thread_return+0}
<ffffffff80308f25>{thread_return+88}
<ffffffffa0378bf2>{:lnet:lnet_send+2251}
<ffffffffa05d100e>{:obdfilter:filter_commitrw+84}
<ffffffff8013f23f>{del_timer+107}
<ffffffff8013f2fc>{del_singleshot_timer_sync+9}
<ffffffff803099f7>{schedule_timeout+375}
<ffffffffa05a2c47>{:ost:ost_brw_write+5119}
<ffffffff801331a5>{default_wake_function+0}
<ffffffffa059f513>{:ost:ost_bulk_timeout+0}
<ffffffffa043269f>{:ptlrpc:lustre_msg_get_version+64}
<ffffffffa05a637e>{:ost:ost_handle+6987}
<ffffffffa034cf41>{:libcfs:libcfs_debug_vmsg2+1713}
<ffffffff801e9c83>{vsnprintf+1406}
<ffffffff801e9d66>{snprintf+131}
<ffffffffa043906b>{:ptlrpc:ptlrpc_server_handle_request+2336}
<ffffffff8013f100>{__mod_timer+293}
<ffffffffa043ad29>{:ptlrpc:ptlrpc_main+2018}
<ffffffff801331a5>{default_wake_function+0}
<ffffffffa0439a47>{:ptlrpc:ptlrpc_retry_rqbds+0}
<ffffffffa0439a47>{:ptlrpc:ptlrpc_retry_rqbds+0}
<ffffffff80110e23>{child_rip+8}
<ffffffffa043a547>{:ptlrpc:ptlrpc_main+0}
<ffffffff80110e1b>{child_rip+0}
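The ldiskfs frames in the trace above (ldiskfs_mb_new_blocks,
ldiskfs_mb_load_buddy, ldiskfs_mb_init_cache) are the mballoc allocator
walking block groups and reading block bitmaps while it searches for free
space. Below is a minimal sketch of that kind of loop, with made-up
structures and helpers rather than the real ldiskfs code, showing why a
nearly full OST can turn a single allocation into a long stream of small
synchronous reads:

/*
 * Illustrative sketch only -- NOT the real ldiskfs code.  Every group
 * whose bitmap is not yet cached costs one small synchronous read
 * before the allocator can even look for a free extent there.
 */
#include <stdbool.h>
#include <stdio.h>

#define NGROUPS        8192    /* hypothetical number of block groups */
#define BLOCKS_PER_GRP 32768

struct blk_group {
    unsigned int free_blocks;   /* free blocks left in this group  */
    bool         bitmap_cached; /* bitmap/buddy already in memory? */
};

static unsigned long small_reads;  /* count of simulated bitmap reads */

/* Stand-in for reading the group's block bitmap from disk. */
static void load_bitmap(struct blk_group *g)
{
    small_reads++;              /* one more small read hits the LUN */
    g->bitmap_cached = true;
}

/* Stand-in for searching the cached bitmap for 'want' contiguous blocks.
 * Here we just pretend that fragmented, nearly full groups usually fail. */
static bool find_extent_in_group(const struct blk_group *g, unsigned int want)
{
    return g->free_blocks >= 4 * want;  /* crude "is it roomy enough" test */
}

/* Scan groups until one can satisfy the request, loading bitmaps as we go. */
static int find_free_extent(struct blk_group *groups, unsigned int want)
{
    for (int i = 0; i < NGROUPS; i++) {
        struct blk_group *g = &groups[i];

        if (g->free_blocks < want)
            continue;                   /* skip obviously full groups */
        if (!g->bitmap_cached)
            load_bitmap(g);             /* small synchronous disk read */
        if (find_extent_in_group(g, want))
            return i;
    }
    return -1;
}

int main(void)
{
    static struct blk_group groups[NGROUPS];

    /* ~80% full everywhere: few groups pass the "roomy enough" test,
     * so almost every group costs a bitmap read before being rejected. */
    for (int i = 0; i < NGROUPS; i++)
        groups[i].free_blocks = BLOCKS_PER_GRP / 5;

    int grp = find_free_extent(groups, 2048);
    if (grp < 0)
        printf("no extent found after %lu bitmap reads\n", small_reads);
    else
        printf("found in group %d after %lu bitmap reads\n", grp, small_reads);
    return 0;
}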
--
Daniel Leaberry
Systems Administrator
iArchives
Tel: 801-494-6528
Cell: 801-376-6411
Andreas Dilger
2007-Aug-17 12:37 UTC
[Lustre-discuss] Sudden ost crashing appears to take > 100s to find free extents
On Aug 17, 2007 07:28 -0600, Daniel Leaberry wrote:
> I have an interesting problem. I've made no changes to the IB DDN
> storage yet I'm finding OSTs crashing left and right. The thread
> watchdog gets triggered,

Note that a watchdog thread stack dump is NOT a crash, but rather a
debugging mechanism so we can see where the thread is stuck for such
a long time. It should be able to continue working even after this
happens.

> Is there any way to tune the extent searching code? Does my analysis seem
> likely? Is this fixed in 1.6.1 such that I should upgrade immediately?

You could increase the watchdog thread timeout (this is currently a
compile time constant), but that won't remove the fact that it is taking
100s to find free space.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Fegan, Joe
2007-Aug-20 07:19 UTC
[Lustre-discuss] Sudden ost crashing appears to take > 100s to find free extents
Hi Andreas,
When the watchdog fires (prints the stack of a thread), does this mean
that the thread has been hogging the CPU for 100 seconds? Or that the
thread has been *sleeping* for 100 seconds? Or something else?
The stack Daniel posted shows io_schedule called from __wait_on_buffer
which would be typical of a thread waiting for disk i/o completion.
Call Trace:<ffffffffa0024125>{:sd_mod:sd_iostats_bump+147}
<ffffffffa031429a>{:ib_srp:srp_host_qcommand+399}
<ffffffff80253ebf>{deadline_next_request+34}
<ffffffff8024b329>{elv_next_request+238}
<ffffffff80309843>{io_schedule+38}
<ffffffff8017843c>{__wait_on_buffer+125}
<ffffffff801782c2>{bh_wake_function+0}
<ffffffff801782c2>{bh_wake_function+0}
<ffffffffa05771d9>{:ldiskfs:ldiskfs_mb_init_cache+469}
<ffffffff80157ba2>{add_to_page_cache+167}
<ffffffffa0577792>{:ldiskfs:ldiskfs_mb_load_buddy+257}
<ffffffffa057a89f>{:ldiskfs:ldiskfs_mb_new_blocks+1946}
<ffffffffa05b480e>{:fsfilt_ldiskfs:ldiskfs_ext_new_extent_cb+729}
<ffffffffa0574362>{:ldiskfs:ldiskfs_ext_find_extent+205}
<ffffffffa0575a69>{:ldiskfs:ldiskfs_ext_walk_space+535}
<ffffffffa05b4535>{:fsfilt_ldiskfs:ldiskfs_ext_new_extent_cb+0}
<ffffffffa05b4b56>{:fsfilt_ldiskfs:fsfilt_map_nblocks+236}
And yes, ldiskfs_ext_new_extent_cb is higher up the stack, which means
that in the bigger picture this thread is searching for free blocks. But
did the watchdog appear because the thread has been searching for free
blocks for more than 100 seconds, or because *this specific I/O request*
has been pending for more than 100 seconds?
Joe.
Postal Address: Hewlett Packard Galway Ltd., Ballybrit Business Park,
Galway, Ireland
Registered Office: 63-74 Sir John Rogerson's Quay, Dublin 2, Ireland.
Registered Number: 361933
-----Original Message-----
From: Andreas Dilger [mailto:adilger@clusterfs.com]
Sent: 17 August 2007 19:38
To: Daniel Leaberry
Cc: lustre-discuss@clusterfs.com
Subject: Re: [Lustre-discuss] Sudden ost crashing appears to take > 100s
to find free extents
On Aug 17, 2007 07:28 -0600, Daniel Leaberry wrote:
> I have an interesting problem. I've made no changes to the IB DDN
> storage yet I'm finding OSTs crashing left and right. The thread
> watchdog gets triggered,

Note that a watchdog thread stack dump is NOT a crash, but rather a
debugging mechanism so we can see where the thread is stuck for such
a long time. It should be able to continue working even after this
happens.

> Is there any way to tune the extent searching code? Does my analysis seem
> likely? Is this fixed in 1.6.1 such that I should upgrade immediately?

You could increase the watchdog thread timeout (this is currently a
compile time constant), but that won't remove the fact that it is taking
100s to find free space.
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Daniel Leaberry
2007-Aug-20 08:15 UTC
[Lustre-discuss] Sudden ost crashing appears to take > 100s to find free extents
Thanks Andreas. We hit this problem at 79% full on each OST. After
deleting, we got the OSTs down to 77% full and the problem subsided. I
haven't found any information or rumors regarding full-filesystem Lustre
performance, but I know for our workload we're setting a 75% hard limit
on used space to avoid these issues. The biggest surprise for me was not
that it slowed down (all filesystems get slower as they approach 100%
full) but how suddenly the wall seemed to be hit.

Joe, I can't definitively answer your question, but I can tell you that
what I saw on the LUNs was one I/O thread dominating the LUN for 100s.
No other read/write requests would get through. This was with the
deadline scheduler. We tried with cfq as well and the same behavior was
exhibited. That indicates to me that the thread was *active* for 100
seconds.

Daniel

Andreas Dilger wrote:
> On Aug 17, 2007 07:28 -0600, Daniel Leaberry wrote:
>> I have an interesting problem. I've made no changes to the IB DDN
>> storage yet I'm finding OSTs crashing left and right. The thread
>> watchdog gets triggered,
>
> Note that a watchdog thread stack dump is NOT a crash, but rather a
> debugging mechanism so we can see where the thread is stuck for such
> a long time. It should be able to continue working even after this
> happens.
>
>> Is there any way to tune the extent searching code? Does my analysis seem
>> likely? Is this fixed in 1.6.1 such that I should upgrade immediately?
>
> You could increase the watchdog thread timeout (this is currently a
> compile time constant), but that won't remove the fact that it is taking
> 100s to find free space.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.
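As a rough illustration of the 75% ceiling Daniel mentions above, here is
a small standalone check built on statvfs(); the mount point and threshold
are assumptions for the example, not values taken from the thread.

/*
 * Minimal sketch: report how full a filesystem is and flag it once it
 * crosses a fixed used-space ceiling.  Not Lustre-specific.
 */
#include <stdio.h>
#include <sys/statvfs.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/mnt/ost0";  /* hypothetical OST mount */
    const double limit = 0.75;                             /* 75% used-space ceiling */
    struct statvfs st;

    if (statvfs(path, &st) != 0) {
        perror("statvfs");
        return 1;
    }

    double used = 1.0 - (double)st.f_bfree / (double)st.f_blocks;
    printf("%s is %.1f%% full\n", path, used * 100.0);

    if (used >= limit) {
        fprintf(stderr, "over the %.0f%% limit: stop writing / start pruning\n",
                limit * 100.0);
        return 2;
    }
    return 0;
}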
Oleg Drokin
2007-Aug-20 08:39 UTC
[Lustre-discuss] Sudden ost crashing appears to take > 100s to find free extents
Hello!

On Mon, Aug 20, 2007 at 02:18:43PM +0100, Fegan, Joe wrote:
> When the watchdog fires (prints the stack of a thread), does this mean
> that the thread has been hogging the CPU for 100 seconds? Or that the
> thread has been *sleeping* for 100 seconds? Or something else?

It can be both. What it actually means is that the process did not
re-prime the watchdog timer for 100 seconds after it was last primed;
whether it was eating CPU or sleeping, or any combination of those
states, does not matter.

Bye,
    Oleg
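A minimal pthread sketch of the semantics described here, assuming nothing
about the real Lustre implementation beyond what is said in this thread:
the service thread re-primes ("touches") a timestamp whenever it makes
progress, and a separate watchdog thread reports if the timestamp has gone
stale for longer than the compile-time timeout. Firing is only a report,
not a crash, and it happens whether the stuck thread was on the CPU or
asleep.

/* Illustrative sketch, not the actual Lustre watchdog code. */
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define WATCHDOG_TIMEOUT 100   /* seconds; a compile-time constant, as in Lustre */

static time_t last_primed;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Re-prime the watchdog: called by the worker each time it makes progress.
 * If a single request takes too long, this never runs in time. */
static void watchdog_touch(void)
{
    pthread_mutex_lock(&lock);
    last_primed = time(NULL);
    pthread_mutex_unlock(&lock);
}

/* The watchdog thread: wakes up periodically and checks how long it has
 * been since the worker last touched the timer.  Firing is only a report
 * (a stack dump in Lustre), not a crash; the worker keeps running. */
static void *watchdog_thread(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(10);
        pthread_mutex_lock(&lock);
        time_t idle = time(NULL) - last_primed;
        pthread_mutex_unlock(&lock);
        if (idle > WATCHDOG_TIMEOUT)
            fprintf(stderr, "watchdog: thread stuck for %ld s, dumping stack\n",
                    (long)idle);
    }
    return NULL;
}

/* Stand-in for a service thread handling requests; a slow request (here a
 * long sleep, but a CPU-bound loop would trip the watchdog just the same)
 * leaves the timer un-primed long enough for the watchdog to fire. */
static void *worker_thread(void *arg)
{
    (void)arg;
    for (;;) {
        watchdog_touch();
        sleep(120);            /* pretend one request took 120 s */
    }
    return NULL;
}

int main(void)
{
    pthread_t wd, wk;

    watchdog_touch();
    pthread_create(&wd, NULL, watchdog_thread, NULL);
    pthread_create(&wk, NULL, worker_thread, NULL);
    pthread_join(wk, NULL);   /* runs forever in this toy example */
    return 0;
}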