I'm new to Lustre and am investigating what would be required for production. One of the things we would require is backup of the MDS and OSS data.

For the OSSs, from what I can see, this can be achieved on a per-volume basis using LVM and snapshotting. However, I feel uncomfortable with this approach and would prefer something where I can restore an individual file from backup if lost. Is there another way other than find with mtime to get the list of modified files in the file system?

For the MDSs, I cannot see how to back up an online MDS. If I do a split mirror without quiesce and back up the ext3 file system of the split-off mirror, I get an inconsistent state of the database.

I've seen various references to Amanda-based systems, but I've not been able to find the code that implements it so I could evaluate implementing the same thing with our TSM system.

Tim
On Fri, 2008-10-31 at 19:38 +0100, Tim Bell wrote:
> For the OSSs, from what I can see, this can be achieved on a per-volume
> basis using LVM and snapshotting.

Yes.

> However, I feel uncomfortable with
> this approach and would prefer something where I can restore an
> individual file from backup if lost.

Then you need to do filesystem backups, not device backups. You do filesystem backups from a Lustre client rather than on individual server nodes. This is no different from how you would back up any other filesystem. I believe there is coverage in the manual about filesystem-level backups. You need to pay attention to being able to back up the EAs used for striping and so on if you wish to restore files with their striping attributes intact.

> Is there another way other than
> find with mtime to find the list of modified files in the file system ?

Some tools, such as GNU tar, have a "find all files newer than" feature built into them.

> For the MDSs, I cannot see how to back up an online MDS. If I do a
> split mirror without quiesce and backup the ext3 file system of the
> split-off mirror, I get an inconsistent state of database.

Right. The closest you can come is to synchronize, as best you can, taking the LVM snapshots of the MDT and all of the OSTs, and then back each of the snapshots up at your leisure.

> I've seen various references to Amanda based systems but I've not been
> able to find the code that implements it so I could evaluate
> implementing the same thing with our TSM system.

Indeed, there was an announcement here a while ago from Zmanda about supporting Lustre, but I have not seen nor heard anything since.

b.
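A minimal sketch of what the file-level approach described above could look like, run from a Lustre client with the filesystem mounted (all paths and backup filenames here are made-up placeholders, GNU tar and the getfattr/setfattr tools are assumed, and the EA backup/restore procedure documented in the Lustre manual should take precedence for your version):

    # run as root on a Lustre client, not on the OSS/MDS nodes
    cd /mnt/lustre

    # dump the extended attributes (including the striping EAs) so that
    # restored files can be recreated with the same layout
    getfattr -R -d -m '.*' -P . > /backup/lustre-ea-$(date +%Y%m%d).bak

    # incremental data backup: GNU tar's "newer than" cutoff picks up files
    # modified since the given date (the date is only an example); TSM or
    # Amanda would use their own incremental logic instead
    tar -czf /backup/lustre-incr-$(date +%Y%m%d).tar.gz \
        --newer-mtime="2008-10-30" .

    # on restore: extract the data first, then re-apply the saved EAs, e.g.
    #   setfattr --restore=/backup/lustre-ea-20081030.bak

One caveat with this kind of backup is that the trusted.* EAs carrying the striping information are only visible to root, so the EA dump generally has to be made with root privileges.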
Steden Klaus
2008-Oct-31 20:02 UTC
[Lustre-discuss] Odd performance issue with 1.4.x OSS ...
Hi folks,

Our Lustre started exhibiting some curious performance issues today ... basically, it slowed down dramatically and reliable I/O performance became impossible. I looked through the output of dmesg and saw a number of kernel 'oops' messages, but not being a Lustre kernel expert, I'm not exactly sure what they indicate. I stopped the OSTs on the node in question and ran e2fsck on the OST drives, but they've come up clean, so I don't think it's a hardware problem. I don't have physical access to the machine right now so it may in fact be something on the back end, but I'm working on verifying that with a technician on site. In the meantime ... can anyone help decipher this for me? There are a couple of messages like it:

-- cut --
ll_ost_215 S 00000100d2141808 0 8584 1 8585 8583 (L-TLB)
00000101184233e8 0000000000000046 000000000000000f ffffffffa059c3b8
00000000005c2616 0000000100000000 0000000000000000 00000100d21418b0
0000000000000013 0000000000000000
Call Trace:<ffffffffa059c3b8>{:ptlrpc:ptl_send_buf+824} <ffffffff801454bd>{__mod_timer+317}
<ffffffff8033860d>{schedule_timeout+381} <ffffffff801460a0>{process_timeout+0}
<ffffffffa0596e84>{:ptlrpc:ptlrpc_queue_wait+6932} <ffffffffa054227d>{:ptlrpc:l_has_lock+77}
<ffffffffa056f76c>{:ptlrpc:ldlm_add_waiting_lock+2156} <ffffffff80137be0>{default_wake_function+0}
<ffffffffa0592620>{:ptlrpc:expired_request+0} <ffffffffa05926e0>{:ptlrpc:interrupted_request+0}
<ffffffffa0574ae8>{:ptlrpc:ldlm_server_blocking_ast+4072} <ffffffffa0547f5a>{:ptlrpc:ldlm_run_ast_work+234}
<ffffffffa05421f8>{:ptlrpc:l_unlock+248} <ffffffffa055ac73>{:ptlrpc:ldlm_process_extent_lock+995}
<ffffffffa054bb4f>{:ptlrpc:ldlm_lock_enqueue+1087} <ffffffffa0574bb0>{:ptlrpc:ldlm_server_completion_ast+0}
<ffffffffa0575b50>{:ptlrpc:ldlm_server_glimpse_ast+0} <ffffffffa0578694>{:ptlrpc:ldlm_handle_enqueue+8836}
<ffffffffa0677485>{:obdfilter:filter_fmd_expire+53} <ffffffffa0611dc0>{:kviblnd:kibnal_cq_callback+0}
<ffffffffa0573b00>{:ptlrpc:ldlm_server_blocking_ast+0} <ffffffffa0574bb0>{:ptlrpc:ldlm_server_completion_ast+0}
<ffffffffa062cbb2>{:ost:ost_handle+20738} <ffffffff801f78d9>{number+233}
<ffffffff801f7f09>{vsnprintf+1321} <ffffffffa048e022>{:libcfs:libcfs_debug_msg+1554}
<ffffffffa05a593a>{:ptlrpc:ptlrpc_server_handle_request+3066} <ffffffff80116d35>{do_gettimeofday+101}
<ffffffffa049014e>{:libcfs:lcw_update_time+30} <ffffffff801454bd>{__mod_timer+317}
<ffffffffa05a6d17>{:ptlrpc:ptlrpc_main+2375} <ffffffff80137be0>{default_wake_function+0}
<ffffffffa05a63c0>{:ptlrpc:ptlrpc_retry_rqbds+0} <ffffffffa05a63c0>{:ptlrpc:ptlrpc_retry_rqbds+0}
<ffffffff801114ab>{child_rip+8} <ffffffffa05a63d0>{:ptlrpc:ptlrpc_main+0}
<ffffffff801114a3>{child_rip+0}
-- cut --

I can include debug logs as well, if needed. Any help is greatly appreciated.

cheers,
Klaus
Andreas Dilger
2008-Nov-03 17:47 UTC
[Lustre-discuss] Odd performance issue with 1.4.x OSS ...
On Oct 31, 2008 13:02 -0700, Steden Klaus wrote:
> Our Lustre started exhibiting some curious performance issues today
> ... basically, it slowed down dramatically and reliable I/O performance
> became impossible. I looked through the output of dmesg and saw a number
> of kernel 'oops' messages, but not being a Lustre kernel expert, I'm
> not exactly sure what they indicate. I stopped the OSTs on the node in
> question and ran e2fsck on the OST drives, but they've come up clean so
> I don't think it's a hardware problem. I don't have physical access to
> the machine right now so it may in fact be something on the back end,
> but I'm working on verifying that with a technician on site. In the
> meantime ... can anyone help decipher this for me? There are a couple
> of messages like it:

These kinds of messages are of relatively little use unless they include some of the preceding lines. Are you sure this is an oops, and not a watchdog timeout that is dumping the stack?

> -- cut --
> ll_ost_215 S 00000100d2141808 0 8584 1 8585 8583 (L-TLB)
> 00000101184233e8 0000000000000046 000000000000000f ffffffffa059c3b8
> 00000000005c2616 0000000100000000 0000000000000000 00000100d21418b0
> 0000000000000013 0000000000000000
> Call Trace:<ffffffffa059c3b8>{:ptlrpc:ptl_send_buf+824} <ffffffff801454bd>{__mod_timer+317}
> <ffffffff8033860d>{schedule_timeout+381} <ffffffff801460a0>{process_timeout+0}
> <ffffffffa0596e84>{:ptlrpc:ptlrpc_queue_wait+6932}

This looks like a network problem, but it is hard to say without more information. If you are a supported customer, you will have better service by filing a bugzilla bug. This list only gets "as available" replies, and that is doubly true for old 1.4 Lustre installations.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
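Since the reply above points at a possible network problem, a quick sanity check from the OSS can help narrow things down before filing anything. This is only a sketch, assuming an LNET-based 1.4.x install whose lctl supports the list_nids and ping subcommands; the NID shown is just an example taken from the style of the logs:

    # show the NIDs this server is exporting over LNET
    lctl list_nids

    # check raw LNET reachability of a client NID seen in the logs
    lctl ping 10.0.0.249@vib

    # peer connection state as tracked by LNET
    cat /proc/sys/lnet/peers

If the pings hang or fail while the rest of the IB fabric looks healthy, that points at the interconnect (switch, HCA, or the vib LND) rather than at the OSTs themselves.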
Steden Klaus
2008-Nov-03 19:39 UTC
[Lustre-discuss] Odd performance issue with 1.4.x OSS ...
Hello Andreas,

Thanks for the info. Here's some more information (basically cut and pasted from before and after a typical stack dump message):

-- cut --
LustreError: 20438:0:(service.c:648:ptlrpc_server_handle_request()) request 4777659 opc 101 from 12345-10.0.0.249 at vib processed in 118s trans 0 rc 0/0
Lustre: 20438:0:(watchdog.c:302:lcw_update_time()) Expired watchdog for pid 20438 disabled after 118.9719s
LustreError: 20257:0:(ldlm_lockd.c:579:ldlm_server_completion_ast()) ### enqueue wait took 118984007us from 1225493396 ns: filter-ost13_UUID lock: 0000010036031dc0/0x7bef85be5c78b145 lrc: 2/0,0 mode: PW/PW res: 15229751/0 rrc: 3 type: EXT [0->18446744073709551615] (req 8482816->8503295) flags: 20 remote: 0x84d9c08a99ff4b23 expref: 113 pid: 20438
LustreError: 20500:0:(client.c:940:ptlrpc_expire_one_request()) @@@ timeout (sent at 1225493514, 20s ago) req at 00000100cb900e00 x1151/t0 o104->@NET_0xb00000a0000fa_UUID:15 lens 176/64 ref 2 fl Rpc:/0/0 rc 0/0
LustreError: A client on nid 10.0.0.250 at vib was evicted from service ost5.
LustreError: 20500:0:(ldlm_lockd.c:423:ldlm_failed_ast()) ### blocking AST failed (-110): evicting client 36604_tiger_lov_7a82cb6234 at NET_0xb00000a0000fa_UUID NID 10.0.0.250 at vib (10.0.0.250 at vib) ns: filter-ost5_UUID lock: 000001008f840c40/0x7bef85be5c78c80c lrc: 2/0,0 mode: PW/PW res: 15158480/0 rrc: 57 type: EXT [4651483136->4802478079] (req 4651483136->4654473215) flags: 20 remote: 0x789bd2e5e1ada437 expref: 120 pid: 20579
Lustre: 20107:0:(lib-move.c:1624:lnet_parse_put()) Dropping PUT from 12345-10.0.0.250 at vib portal 16 match 1151 offset 0 length 64: 2
LustreError: 20269:0:(ldlm_lockd.c:1417:ldlm_cancel_handler()) operation 103 from 12345-10.0.0.250 at vib with bad export cookie 8930503638740276498
LustreError: 20580:0:(ldlm_lib.c:1272:target_send_reply_msg()) @@@ processing error (-107) req at 0000010060cc0000 x7055836/t0 o101-><?>@<?>:-1 lens 176/0 ref 0 fl Interpret:/0/0 rc -107/0
LustreError: 0:0:(ldlm_lockd.c:205:waiting_locks_callback()) ### lock callback timer expired: evicting client 7cb21_tiger_lov_7afb88fa94 at NET_0xb00000a0000f9_UUID nid 10.0.0.249 at vib ns: filter-ost13_UUID lock: 0000010036031dc0/0x7bef85be5c78b145 lrc: 1/0,0 mode: PW/PW res: 15229751/0 rrc: 2 type: EXT [0->18446744073709551615] (req 8482816->8503295) flags: 20 remote: 0x84d9c08a99ff4b23 expref: 117 pid: 20438
LustreError: 20452:0:(ldlm_lib.c:1272:target_send_reply_msg()) @@@ processing error (-107) req at 0000010064e22200 x4778674/t0 o101-><?>@<?>:-1 lens 176/0 ref 0 fl Interpret:/0/0 rc -107/0
Lustre: 0:0:(watchdog.c:121:lcw_cb()) Watchdog triggered for pid 20559: it was inactive for 100000ms
Lustre: 0:0:(linux-debug.c:155:libcfs_debug_dumpstack()) showing stack for process 20559
ll_ost_220 S 000001006e854008 0 20559 1 20560 20558 (L-TLB)
000001009739f3e8 0000000000000046 000000000000000f ffffffffa05b33b8
0000000000000548 0000000100000000 0000000000000000 000001006e8540b0
0000000000000013 0000000000000000
Call Trace:<ffffffffa05b33b8>{:ptlrpc:ptl_send_buf+824} <ffffffff801454bd>{__mod_timer+317}
<ffffffff8033860d>{schedule_timeout+381} <ffffffff801460a0>{process_timeout+0}
<ffffffffa05ade84>{:ptlrpc:ptlrpc_queue_wait+6932} <ffffffffa055927d>{:ptlrpc:l_has_lock+77}
<ffffffffa058676c>{:ptlrpc:ldlm_add_waiting_lock+2156} <ffffffff80137be0>{default_wake_function+0}
<ffffffffa05a9620>{:ptlrpc:expired_request+0} <ffffffffa05a96e0>{:ptlrpc:interrupted_request+0}
<ffffffffa058bae8>{:ptlrpc:ldlm_server_blocking_ast+4072} <ffffffffa055ef5a>{:ptlrpc:ldlm_run_ast_work+234}
<ffffffffa05591f8>{:ptlrpc:l_unlock+248} <ffffffffa0571c73>{:ptlrpc:ldlm_process_extent_lock+995}
<ffffffffa0562b4f>{:ptlrpc:ldlm_lock_enqueue+1087} <ffffffffa058bbb0>{:ptlrpc:ldlm_server_completion_ast+0}
<ffffffffa058cb50>{:ptlrpc:ldlm_server_glimpse_ast+0} <ffffffffa058f694>{:ptlrpc:ldlm_handle_enqueue+8836}
<ffffffffa068c485>{:obdfilter:filter_fmd_expire+53} <ffffffffa058ab00>{:ptlrpc:ldlm_server_blocking_ast+0}
<ffffffffa058bbb0>{:ptlrpc:ldlm_server_completion_ast+0} <ffffffffa0641bb2>{:ost:ost_handle+20738}
<ffffffff801f787e>{number+142} <ffffffff801f7f09>{vsnprintf+1321}
<ffffffffa049f022>{:libcfs:libcfs_debug_msg+1554} <ffffffffa05bc93a>{:ptlrpc:ptlrpc_server_handle_request+3066}
<ffffffff80116d35>{do_gettimeofday+101} <ffffffffa04a114e>{:libcfs:lcw_update_time+30}
<ffffffff801454bd>{__mod_timer+317} <ffffffffa05bdd17>{:ptlrpc:ptlrpc_main+2375}
<ffffffff80137be0>{default_wake_function+0} <ffffffffa05bd3c0>{:ptlrpc:ptlrpc_retry_rqbds+0}
<ffffffffa05bd3c0>{:ptlrpc:ptlrpc_retry_rqbds+0} <ffffffff801114ab>{child_rip+8}
<ffffffffa05bd3d0>{:ptlrpc:ptlrpc_main+0} <ffffffff801114a3>{child_rip+0}
LustreError: dumping log to /tmp/lustre-log-tiger-oss-0-0.local.1225493885.20559
LustreError: 20526:0:(client.c:940:ptlrpc_expire_one_request()) @@@ timeout (sent at 1225493865, 20s ago) req at 000001001986c000 x1346/t0 o106->@NET_0xb00000a0000ed_UUID:15 lens 176/112 ref 2 fl Rpc:/0/0 rc 0/0
LustreError: A client on nid 10.0.0.237 at vib was evicted from service ost1.
LustreError: 20526:0:(ldlm_lockd.c:423:ldlm_failed_ast()) ### glimpse AST failed (-110): evicting client 82ed7_tiger_lov_eea68aa0e0 at NET_0xb00000a0000ed_UUID NID 10.0.0.237 at vib (10.0.0.237 at vib) ns: filter-ost1_UUID lock: 000001010f17dac0/0x7bef85be5c792519 lrc: 2/0,0 mode: PW/PW res: 15168088/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0 remote: 0x3dd1a37046787550 expref: 182 pid: 20464
LustreError: 20702:0:(ost_handler.c:955:ost_brw_write()) @@@ Eviction on bulk GET req at 0000010061543400 x4033002/t0 o4->82ed7_tiger_lov_eea68aa0e0 at NET_0xb00000a0000ed_UUID:-1 lens 328/288 ref 0 fl Interpret:/0/0 rc 0/0
LustreError: 20866:0:(ost_handler.c:747:ost_brw_read()) @@@ timeout on bulk PUT req at 0000010055099050 x4032963/t0 o3->82ed7_tiger_lov_eea68aa0e0 at NET_0xb00000a0000ed_UUID:-1 lens 328/280 ref 0 fl Interpret:/0/0 rc 0/0
LustreError: 20866:0:(ost_handler.c:747:ost_brw_read()) Skipped 4 previous similar messages
Lustre: 20866:0:(ost_handler.c:813:ost_brw_read()) ost13: ignoring bulk IO comm error with 82ed7_tiger_lov_eea68aa0e0 at NET_0xb00000a0000ed_UUID id 12345-10.0.0.237 at vib - client will retry
Lustre: 20866:0:(ost_handler.c:813:ost_brw_read()) Skipped 4 previous similar messages
Lustre: 20702:0:(ost_handler.c:1050:ost_brw_write()) ost1: ignoring bulk IO comm error with 82ed7_tiger_lov_eea68aa0e0 at NET_0xb00000a0000ed_UUID id 12345-10.0.0.237 at vib - client will retry
Lustre: 20106:0:(lib-move.c:1728:lnet_parse_reply()) 10.0.0.230 at vib: Dropping REPLY from 12345-10.0.0.237 at vib for invalid MD 0x45a932db4532a.0xb4ef9
Lustre: 20106:0:(lib-move.c:1624:lnet_parse_put()) Dropping PUT from 12345-10.0.0.237 at vib portal 16 match 1346 offset 0 length 112: 2
LustreError: 20592:0:(ldlm_lib.c:1272:target_send_reply_msg()) @@@ processing error (-107) req at 000001010209d800 x4033039/t0 o101-><?>@<?>:-1 lens 176/0 ref 0 fl Interpret:/0/0 rc -107/0
LustreError: 20559:0:(service.c:648:ptlrpc_server_handle_request()) request 4510790 opc 101 from 12345-10.0.0.232 at vib processed in 109s trans 0 rc 0/0
-- cut --

I'm working on urging the clients to upgrade to 1.6, but it's slow going.

Any insight would be very helpful ... I can probably get on top of the fix if I have an inkling of where to start looking ... at this point, I don't, I just get reports that it's "slow".

thanks,
Klaus

PS Sorry for the massive paste to everyone else on the list.
On Oct 31, 2008 19:38 +0100, Tim Bell wrote:
> For the OSSs, from what I can see, this can be achieved on a per-volume
> basis using LVM and snapshotting. However, I feel uncomfortable with
> this approach and would prefer something where I can restore an
> individual file from backup if lost.

I would strongly recommend filesystem-level backups instead of doing device-level backups. This gives you full control of which files to back up, and allows restoring individual files as needed. It is also easiest to integrate into existing backup solutions.

It is probably also desirable to do a device-level backup of the MDS, because it is the central part of the Lustre filesystem, and if this device fails permanently then the entire filesystem would need to be restored. Lustre can generally handle the MDS backup being slightly out of date w.r.t. the OSTs, though care must be taken when restoring in such a scenario.

> Is there another way other than
> find with mtime to find the list of modified files in the file system ?

There is the "e2scan" program that is included with recent versions of the Lustre e2fsprogs RPM. It is a fast scanner of the MDS filesystem that returns a list of files that have changed, like "find". You can also use "find" or "lfs find" to traverse the filesystem, and this can be very fast if run on a client mounted on the MDS node.

In the 2.0 release there will be a changelog exported from Lustre which will catalog files that have changed in the filesystem since the last time an application processed the log.

> For the MDSs, I cannot see how to back up an online MDS. If I do a
> split mirror without quiesce and backup the ext3 file system of the
> split-off mirror, I get an inconsistent state of database.

You should use LVM mirroring to ensure the filesystem is coherent when the snapshot is made. There are patches going into the upstream kernel by Takashi Sato of NEC to allow a userspace process to freeze the filesystem while doing this sort of hardware mirror split (http://marc.info/?l=linux-fsdevel&m=122511248322724&w=4), but as yet this is not in the upstream kernel or any Lustre kernel.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
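To make the device-level MDS backup and the changed-file scan mentioned above concrete, a rough sketch might look like the following; the volume group, logical volume, snapshot size, and mount point are all invented for illustration, and it assumes the MDT already lives on an LVM logical volume:

    # on the MDS: take an LVM snapshot of the MDT volume, back up the image
    # at leisure, then drop the snapshot
    lvcreate --snapshot --size 2G --name mdt-snap /dev/vg_mds/mdt
    dd if=/dev/vg_mds/mdt-snap bs=4M | gzip > /backup/mdt-$(date +%Y%m%d).img.gz
    lvremove -f /dev/vg_mds/mdt-snap

    # on a client: list files modified within the last day as candidates
    # for an incremental file-level backup (e2scan or lfs find could be
    # used here instead, with options from their respective man pages)
    find /mnt/lustre -mtime -1 -type f > /backup/changed-files.list

The snapshot is a crash-consistent, point-in-time image of the journaled MDT filesystem; the journal is replayed when the backup image is eventually restored and mounted.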
Andreas Dilger
2008-Nov-07 22:46 UTC
[Lustre-discuss] Odd performance issue with 1.4.x OSS ...
On Nov 03, 2008 11:39 -0800, Steden Klaus wrote:
> Hello Andreas,
>
> Thanks for the info. Here's some more information (basically cut and pasted
> from before and after a typical stack dump message):
>
> -- cut --
> Lustre: 0:0:(watchdog.c:121:lcw_cb()) Watchdog triggered for pid 20559: it was inactive for 100000ms
> Lustre: 0:0:(linux-debug.c:155:libcfs_debug_dumpstack()) showing stack for process 20559
> ll_ost_220 S 000001006e854008 0 20559 1 20560 20558 (L-TLB)
> 000001009739f3e8 0000000000000046 000000000000000f ffffffffa05b33b8
> 0000000000000548 0000000100000000 0000000000000000 000001006e8540b0
> 0000000000000013 0000000000000000
> Call Trace:<ffffffffa05b33b8>{:ptlrpc:ptl_send_buf+824} <ffffffff801454bd>{__mod_timer+317}
> <ffffffff8033860d>{schedule_timeout+381} <ffffffff801460a0>{process_timeout+0}

This is a watchdog timeout, and should not be thought of as an oops or panic as you mentioned in your earlier comment. It is debugging output from Lustre.

> LustreError: 20438:0:(service.c:648:ptlrpc_server_handle_request()) request 4777659 opc 101 from 12345-10.0.0.249 at vib processed in 118s trans 0 rc 0/0
> Lustre: 20438:0:(watchdog.c:302:lcw_update_time()) Expired watchdog for pid 20438 disabled after 118.9719s
> LustreError: 20257:0:(ldlm_lockd.c:579:ldlm_server_completion_ast()) ### enqueue wait took 118984007us from 1225493396 ns: filter-ost13_UUID lock: 0000010036031dc0/0x7bef85be5c78b145 lrc: 2/0,0 mode: PW/PW res: 15229751/0 rrc: 3 type:

This looks like slow request processing on the OST. I would suggest increasing the Lustre timeout value to 150s or more. A timeout of 300s is fairly common in larger installations (1000 nodes or more). In the 1.8 release Lustre will automatically tune this value so that it can adapt to larger-scale systems better.

> I'm working on urging the clients to upgrade to 1.6 but it's slow going.
>
> Any insight would be very helpful ... I can probably get on top of the
> fix if I have an inkling of where to start looking ... at this point, I
> don't, I just get reports that it's "slow".
>
> thanks,
> Klaus
>
> PS Sorry for the massive paste to everyone else on the list.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
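For reference, a rough sketch of how the timeout bump suggested above is typically applied on installations of this vintage; the 300s value is only the example from the message, the /proc path is the one used by 1.4/1.6-era Lustre, and the setting generally needs to be raised consistently on the MDS, the OSSes, and the clients:

    # check the current obd timeout, then raise it
    cat /proc/sys/lustre/timeout
    echo 300 > /proc/sys/lustre/timeout

    # the same knob via sysctl, persisted across reboots
    sysctl -w lustre.timeout=300
    echo "lustre.timeout = 300" >> /etc/sysctl.conf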
Klaus Steden
2008-Nov-07 23:03 UTC
[Lustre-discuss] Odd performance issue with 1.4.x OSS ...
On 11/7/08 2:46 PM, "Andreas Dilger" <adilger at sun.com> etched on stone tablets:

> On Nov 03, 2008 11:39 -0800, Steden Klaus wrote:
>> Hello Andreas,
>>
>> Thanks for the info. Here's some more information (basically cut and pasted
>> from before and after a typical stack dump message):
>>
>> -- cut --
>> Lustre: 0:0:(watchdog.c:121:lcw_cb()) Watchdog triggered for pid 20559: it
>> was inactive for 100000ms
>> Lustre: 0:0:(linux-debug.c:155:libcfs_debug_dumpstack()) showing stack for
>> process 20559
>> ll_ost_220 S 000001006e854008 0 20559 1 20560 20558
>> (L-TLB)
>> 000001009739f3e8 0000000000000046 000000000000000f ffffffffa05b33b8
>> 0000000000000548 0000000100000000 0000000000000000 000001006e8540b0
>> 0000000000000013 0000000000000000
>> Call Trace:<ffffffffa05b33b8>{:ptlrpc:ptl_send_buf+824}
>> <ffffffff801454bd>{__mod_timer+317}
>> <ffffffff8033860d>{schedule_timeout+381}
>> <ffffffff801460a0>{process_timeout+0}
>
> This is a watchdog timeout, and should not be thought of as an oops or panic
> as you mentioned in your earlier comment. It is debugging output from Lustre.
>
>> LustreError: 20438:0:(service.c:648:ptlrpc_server_handle_request()) request
>> 4777659 opc 101 from 12345-10.0.0.249 at vib processed in 118s trans 0 rc 0/0
>> Lustre: 20438:0:(watchdog.c:302:lcw_update_time()) Expired watchdog for pid
>> 20438 disabled after 118.9719s
>> LustreError: 20257:0:(ldlm_lockd.c:579:ldlm_server_completion_ast()) ###
>> enqueue wait took 118984007us from 1225493396 ns: filter-ost13_UUID lock:
>> 0000010036031dc0/0x7bef85be5c78b145 lrc: 2/0,0 mode: PW/PW res: 15229751/0
>> rrc: 3 type:
>
> This looks like slow request processing on the OST. I would suggest
> increasing the Lustre timeout value to 150s or more. A timeout of
> 300s is fairly common in larger installations (1000 nodes or more). In
> the 1.8 release Lustre will automatically tune this value so that it
> can adapt to larger-scale systems better.

Hi Andreas,

That is good to know. I thought it was an oops message, not normal debugging. I'm used to Lustre "just working" without this kind of output, so I assumed the worst when I saw them.

As far as the issue itself, apparently the Voltaire switch needed a restart, which has cleared whatever condition was causing the problem.

thanks,
Klaus