Has anyone seen this kind of error while running IOR or other benchmarks?

I'm running Lustre 1.8.1 on CentOS 5.3 with the following configuration: 4 J4400 JBODs connected to 4 OSSs. Each OSS has 3 OSTs (RAID5, 8 disks each), connected using multipathd, with mdadm on the /dev/dm* devices and the mptfusion driver (for the J4400 JBODs).

Every time I run:

mpirun -hostfile ./lustre.hosts -np 20 /hpc/IOR -w -r -C -i 2 -b 1000M -t 128k -F -o /work/stripe12/teste

(especially with -b 1000M), one of my OSSs crashes, sometimes one, sometimes another, with the following error:

Sep 9 07:43:40 a01n00 kernel: ll_ost_io_64 D ffff81037fea80c0 0 20381 1 20382 20380 (L-TLB)
Sep 9 07:43:40 a01n00 kernel: ffff81036316b510 0000000000000046 0000000000000003 0000040000000282
Sep 9 07:43:40 a01n00 kernel: 0000000000000100 0000000000000009 ffff81037ac09100 ffff81037fea80c0
Sep 9 07:43:40 a01n00 kernel: 0000088160738e93 0000000000313ec1 ffff81037ac092e8 0000000328b65740
Sep 9 07:43:40 a01n00 kernel: Call Trace:
Sep 9 07:43:40 a01n00 kernel: [<ffffffff80033608>] submit_bio+0xcd/0xd4
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b14aac>] :obdfilter:filter_do_bio+0x95c/0xb60
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88ae0f24>] :fsfilt_ldiskfs:fsfilt_ldiskfs_write_record+0x464/0x4b0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b014f0>] :obdfilter:filter_commit_cb+0x0/0x2d0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88031749>] :jbd:journal_callback_set+0x2d/0x47
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8009daef>] autoremove_wake_function+0x0/0x2e
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b15974>] :obdfilter:filter_direct_io+0xcc4/0xd50
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8892ad70>] :lquota:filter_quota_acquire+0x0/0x120
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b17c08>] :obdfilter:filter_commitrw_write+0x1558/0x25b0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88730d23>] :lnet:lnet_send+0x973/0x9a0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88790c11>] :obdclass:class_handle2object+0xd1/0x160
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88abc048>] :ost:ost_checksum_bulk+0x358/0x590
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88ac2b1e>] :ost:ost_brw_write+0x1b8e/0x2310
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88837c88>] :ptlrpc:ptlrpc_send_reply+0x5c8/0x5e0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88803320>] :ptlrpc:target_committed_to_req+0x40/0x120
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88abe67c>] :ost:ost_brw_read+0x182c/0x19e0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8883c025>] :ptlrpc:lustre_msg_get_version+0x35/0xf0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8008a3ef>] default_wake_function+0x0/0xe
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8883c0e8>] :ptlrpc:lustre_msg_check_version_v2+0x8/0x20
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88ac60fb>] :ost:ost_handle+0x2e5b/0x5a70
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88735305>] :lnet:lnet_match_blocked_msg+0x375/0x390
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88811aea>] :ptlrpc:ldlm_resource_foreach+0x25a/0x390
Sep 9 07:43:40 a01n00 kernel: [<ffffffff80148d4f>] __next_cpu+0x19/0x28
Sep 9 07:43:40 a01n00 kernel: [<ffffffff80148d4f>] __next_cpu+0x19/0x28
Sep 9 07:43:40 a01n00 kernel: [<ffffffff80088f32>] find_busiest_group+0x20d/0x621
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88841a15>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8884672d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88848e67>] :ptlrpc:ptlrpc_server_handle_request+0xa97/0x1160
Sep 9 07:43:40 a01n00 kernel: [<ffffffff80063098>] thread_return+0x62/0xfe
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8003dc3f>] lock_timer_base+0x1b/0x3c
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8001ceb8>] __mod_timer+0xb0/0xbe
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8884c908>] :ptlrpc:ptlrpc_main+0x1218/0x13e0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8008a3ef>] default_wake_function+0x0/0xe
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8884b6f0>] :ptlrpc:ptlrpc_main+0x0/0x13e0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 9 07:43:40 a01n00 kernel:
Sep 9 07:43:40 a01n00 kernel: Lustre: 0:0:(watchdog.c:181:lcw_cb()) Watchdog triggered for pid 27733: it was inactive for 200.00s
Sep 9 07:43:40 a01n00 kernel: Lustre: 0:0:(linux-debug.c:264:libcfs_debug_dumpstack()) showing stack for process 27733
Sep 9 07:43:40 a01n00 kernel: ll_ost_io_159 D 0000000000000000 0 27733 1 27734 27732 (L-TLB)
Sep 9 07:43:40 a01n00 kernel: ffff810521239510 0000000000000046 0000000000000003 0000040000000282
Sep 9 07:43:40 a01n00 kernel: 0000000000000100 000000000000000a ffff81067e810860 ffff81033115a040
Sep 9 07:43:40 a01n00 kernel: 00000881604f2d64 00000000000d2465 ffff81067e810a48 000000061ced4140
Sep 9 07:43:40 a01n00 kernel: Call Trace:
Sep 9 07:43:40 a01n00 kernel: [<ffffffff80033608>] submit_bio+0xcd/0xd4
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b14aac>] :obdfilter:filter_do_bio+0x95c/0xb60
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88ae0f24>] :fsfilt_ldiskfs:fsfilt_ldiskfs_write_record+0x464/0x4b0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b014f0>] :obdfilter:filter_commit_cb+0x0/0x2d0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88031749>] :jbd:journal_callback_set+0x2d/0x47
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8009daef>] autoremove_wake_function+0x0/0x2e
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b15974>] :obdfilter:filter_direct_io+0xcc4/0xd50
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8892ad70>] :lquota:filter_quota_acquire+0x0/0x120
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b17c08>] :obdfilter:filter_commitrw_write+0x1558/0x25b0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88ac2b1e>] :ost:ost_brw_write+0x1b8e/0x2310
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88837c88>] :ptlrpc:ptlrpc_send_reply+0x5c8/0x5e0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88803320>] :ptlrpc:target_committed_to_req+0x40/0x120
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88abe67c>] :ost:ost_brw_read+0x182c/0x19e0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8883c025>] :ptlrpc:lustre_msg_get_version+0x35/0xf0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8008a3ef>] default_wake_function+0x0/0xe
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8883c0e8>] :ptlrpc:lustre_msg_check_version_v2+0x8/0x20
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88ac60fb>] :ost:ost_handle+0x2e5b/0x5a70
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88735305>] :lnet:lnet_match_blocked_msg+0x375/0x390
Sep 9 07:43:40 a01n00 kernel: [<ffffffff800d74d2>] __drain_alien_cache+0x51/0x66
Sep 9 07:43:40 a01n00 kernel: [<ffffffff80148d4f>] __next_cpu+0x19/0x28
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88841a15>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff80089d89>] enqueue_task+0x41/0x56
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8884672d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110
Sep 9 07:43:40 a01n00 kernel: [<ffffffff88848e67>] :ptlrpc:ptlrpc_server_handle_request+0xa97/0x1160
Sep 9 07:43:40 a01n00 kernel: [<ffffffff80088819>] __wake_up_common+0x3e/0x68
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8884c908>] :ptlrpc:ptlrpc_main+0x1218/0x13e0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8008a3ef>] default_wake_function+0x0/0xe
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8884b6f0>] :ptlrpc:ptlrpc_main+0x0/0x13e0
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 9 07:43:40 a01n00 kernel: ll_ost_io_195 D ffff81038ab8c860 0 27769 1 27770 27768 (L-TLB)
Sep 9 07:43:40 a01n00 kernel: ffff81028a541190 0000000000000046 ffff81028a541120 ffffffff8009daf8
Sep 9 07:43:40 a01n00 kernel: ffff810369dc3b18 000000000000000a ffff81028a524820 ffff81038ab8c860
Sep 9 07:43:40 a01n00 kernel: 00000881659b85ee 0000000000000429 ffff81028a524a08 0000000000000003
Sep 9 07:43:40 a01n00 kernel: Call Trace:
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8009daf8>] autoremove_wake_function+0x9/0x2e
Sep 9 07:43:40 a01n00 kernel: [<ffffffff8002e6ba>] __wake_up+0x38/0x4f
Sep 9 07:43:41 a01n00 kernel: [<ffffffff881b8b39>] :dm_mod:dm_table_unplug_all+0x33/0x42
Sep 9 07:43:41 a01n00 kernel: [<ffffffff886b5e62>] :raid456:get_active_stripe+0x247/0x4f0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8008a3ef>] default_wake_function+0x0/0xe
Sep 9 07:43:41 a01n00 kernel: [<ffffffff886bb4dd>] :raid456:make_request+0x472/0x9af
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8009daef>] autoremove_wake_function+0x0/0x2e
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8009daef>] autoremove_wake_function+0x0/0x2e
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8001c49b>] generic_make_request+0x1e7/0x1fe
Sep 9 07:43:41 a01n00 kernel: [<ffffffff80023342>] mempool_alloc+0x24/0xda
Sep 9 07:43:41 a01n00 kernel: [<ffffffff80033608>] submit_bio+0xcd/0xd4
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88788656>] :obdclass:lprocfs_oh_tally+0x26/0x50
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88adf7bc>] :fsfilt_ldiskfs:fsfilt_ldiskfs_send_bio+0xc/0x20
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88b14711>] :obdfilter:filter_do_bio+0x5c1/0xb60
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88ae0f24>] :fsfilt_ldiskfs:fsfilt_ldiskfs_write_record+0x464/0x4b0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88b014f0>] :obdfilter:filter_commit_cb+0x0/0x2d0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88031749>] :jbd:journal_callback_set+0x2d/0x47
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88adfad0>] :fsfilt_ldiskfs:fsfilt_ldiskfs_commit_async+0xd0/0x150
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88b15974>] :obdfilter:filter_direct_io+0xcc4/0xd50
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8892ad70>] :lquota:filter_quota_acquire+0x0/0x120
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88b17c08>] :obdfilter:filter_commitrw_write+0x1558/0x25b0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88730d23>] :lnet:lnet_send+0x973/0x9a0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88790c11>] :obdclass:class_handle2object+0xd1/0x160
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88abc02c>] :ost:ost_checksum_bulk+0x33c/0x590
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88ac2b1e>] :ost:ost_brw_write+0x1b8e/0x2310
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88837c88>] :ptlrpc:ptlrpc_send_reply+0x5c8/0x5e0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88803320>] :ptlrpc:target_committed_to_req+0x40/0x120
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88abe67c>] :ost:ost_brw_read+0x182c/0x19e0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8883c025>] :ptlrpc:lustre_msg_get_version+0x35/0xf0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8008a3ef>] default_wake_function+0x0/0xe
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8883c0e8>] :ptlrpc:lustre_msg_check_version_v2+0x8/0x20
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88ac60fb>] :ost:ost_handle+0x2e5b/0x5a70
Sep 9 07:43:41 a01n00 kernel: [<ffffffff800d7290>] free_block+0x126/0x143
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88735305>] :lnet:lnet_match_blocked_msg+0x375/0x390
Sep 9 07:43:41 a01n00 kernel: [<ffffffff800d74d2>] __drain_alien_cache+0x51/0x66
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88790c11>] :obdclass:class_handle2object+0xd1/0x160
Sep 9 07:43:41 a01n00 kernel: [<ffffffff80148d4f>] __next_cpu+0x19/0x28
Sep 9 07:43:41 a01n00 kernel: [<ffffffff80088f32>] find_busiest_group+0x20d/0x621
Sep 9 07:43:41 a01n00 kernel: [<ffffffff887f719a>] :ptlrpc:lock_res_and_lock+0xba/0xd0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88841a15>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff80089d89>] enqueue_task+0x41/0x56
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8884672d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110
Sep 9 07:43:41 a01n00 kernel: [<ffffffff88848e67>] :ptlrpc:ptlrpc_server_handle_request+0xa97/0x1160
Sep 9 07:43:41 a01n00 kernel: [<ffffffff80088819>] __wake_up_common+0x3e/0x68
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8884c908>] :ptlrpc:ptlrpc_main+0x1218/0x13e0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8008a3ef>] default_wake_function+0x0/0xe
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8884b6f0>] :ptlrpc:ptlrpc_main+0x0/0x13e0
Sep 9 07:43:41 a01n00 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 9 07:43:41 a01n00 kernel:
Sep 9 07:43:41 a01n00 kernel: ll_ost_io_68 D 0000000000000000 0 20385 1 20386 20384 (L-TLB)
Sep 9 07:43:41 a01n00 kernel: ffff810375ce5510 0000000000000046 0000000000000003 0000040000000282
Sep 9 07:43:41 a01n00 kernel: 0000000000000100 000000000000000a ffff81037a2ca080 ffff810365f9e860
Sep 9 07:43:41 a01n00 kernel: 000008815549e040 00000000000df2e2 ffff81037a2ca268 0000000730b8cd40
...

Any ideas? Tks,

Rafael Tinoco

Rafael David Tinoco - Sun Microsystems
Systems Engineer - High Performance Computing
Rafael.Tinoco at Sun.COM - 55.11.5187.2194
On Wed, 2009-09-09 at 14:31 -0300, Rafael David Tinoco wrote:
> Has anyone seen this kind of error while running IOR or other benchmarks?

On a note of e-mail formatting, so much vertical whitespace is not really needed and makes reading a bit more difficult. Also, personally, I don't wrap log file excerpts at ~80 columns. I think most people have a wide enough display to read that, and it makes reading things like stack dumps much, much easier. Not that MTAs make it all that easy to avoid wrapping, though.

> One of my OSSs crashes,

What do you mean by "crash"? Does it oops, need a reboot, etc.? You have not really provided enough log for me to determine what context the following is in:

> sometimes one, sometimes another, with the following error:
>
> Sep 9 07:43:40 a01n00 kernel: ll_ost_io_64 D ffff81037fea80c0 0 20381 1 20382 20380 (L-TLB)
> Sep 9 07:43:40 a01n00 kernel: ffff81036316b510 0000000000000046 0000000000000003 0000040000000282
> Sep 9 07:43:40 a01n00 kernel: 0000000000000100 0000000000000009 ffff81037ac09100 ffff81037fea80c0
> Sep 9 07:43:40 a01n00 kernel: 0000088160738e93 0000000000313ec1 ffff81037ac092e8 0000000328b65740
> Sep 9 07:43:40 a01n00 kernel: Call Trace:
> Sep 9 07:43:40 a01n00 kernel: [<ffffffff80033608>] submit_bio+0xcd/0xd4
> Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b14aac>] :obdfilter:filter_do_bio+0x95c/0xb60
> Sep 9 07:43:40 a01n00 kernel: [<ffffffff88ae0f24>] :fsfilt_ldiskfs:fsfilt_ldiskfs_write_record+0x464/0x4b0
> Sep 9 07:43:40 a01n00 kernel: [<ffffffff88b014f0>] :obdfilter:filter_commit_cb+0x0/0x2d0
> Sep 9 07:43:40 a01n00 kernel: [<ffffffff88031749>] :jbd:journal_callback_set+0x2d/0x47
> Sep 9 07:43:40 a01n00 kernel: [<ffffffff8009daef>] autoremove_wake_function+0x0/0x2e
...

Can you provide a bit more of the log before the above so we can see what the stack trace is in reference to? Also, try to eliminate the whitespace between lines. Are you getting any other errors or messages from Lustre prior to that? Perhaps you are getting some messages saying that various operations are "slow"?

Have you tuned these OSSes with respect to the number of OST threads needed to drive (and not over-drive) your disks? The lustre iokit is useful for that tuning.

b.
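As a concrete starting point, the iokit tool usually used for this is obdfilter-survey, run directly on an OSS against a single OST while sweeping the thread count. The invocation below is only a sketch based on the iokit documentation: the variable names and the target name "work-OST0000" are assumptions, so check the script shipped with your 1.8.1 iokit first, and run it only on a test filesystem since case=disk writes directly to the OST.

# Sweep 4..256 service threads against one OST on this OSS.
# "size" is the amount of data (in MB) written per object; thrlo/thrhi
# bound the thread-count sweep; nobjlo/nobjhi bound the object count.
size=8192 nobjlo=1 nobjhi=2 thrlo=4 thrhi=256 \
    case=disk targets="work-OST0000" sh obdfilter-survey

The sweep output shows throughput per thread count, which is what tells you where the disks stop scaling and start thrashing.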
Hello!

On Sep 9, 2009, at 1:31 PM, Rafael David Tinoco wrote:
> One of my OSSs crashes, sometimes one, sometimes another, with the following error:

That's not a crash. That's a watchdog timeout, indicative of Lustre spending too much time waiting on I/O. As such you need to somehow decrease the load on the system (e.g. by reducing the number of I/O threads, which was discussed on this list recently), increase obd_timeout, or get a faster disk subsystem.

Bye,
Oleg
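For reference, the obd_timeout side of that advice looks roughly like the sketch below on 1.8. The value 300 is only an example (the default is 100 seconds), the fsname "work" is taken from later in this thread, and the exact procedure should be checked against the manual for your release.

# Raise the Lustre RPC timeout on a running node:
echo 300 > /proc/sys/lustre/timeout
# ...or set it persistently for the whole filesystem, once, on the MGS:
lctl conf_param work.sys.timeout=300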
I'm attaching the messages file (only the error part) so we don't have these mail formatting problems.

------

Can you provide a bit more of the log before the above so we can see what the stack trace is in reference to? Also, try to eliminate the whitespace between lines. Are you getting any other errors or messages from Lustre prior to that? Perhaps you are getting some messages saying that various operations are "slow"?

>> Even being slow, the OST should still respond, right? It "hangs".

Have you tuned these OSSes with respect to the number of OST threads needed to drive (and not over-drive) your disks? The lustre iokit is useful for that tuning.

>> OK, tuning for performance is fine, but hanging with 20 nodes (IOR over MPI) is strange, right?

b.

-----

I'm using 3 RAID5 arrays with 8 disks each and 256 OST threads on each OSS.

root at a02n00:~# cat /etc/mdadm.conf
ARRAY /dev/md10 level=raid5 num-devices=8 devices=/dev/dm-0,/dev/dm-1,/dev/dm-2,/dev/dm-3,/dev/dm-4,/dev/dm-5,/dev/dm-6,/dev/dm-7
ARRAY /dev/md11 level=raid5 num-devices=8 devices=/dev/dm-8,/dev/dm-9,/dev/dm-10,/dev/dm-11,/dev/dm-12,/dev/dm-13,/dev/dm-14,/dev/dm-15
ARRAY /dev/md12 level=raid5 num-devices=8 devices=/dev/dm-16,/dev/dm-17,/dev/dm-18,/dev/dm-19,/dev/dm-20,/dev/dm-21,/dev/dm-22,/dev/dm-23

All my OSTs were created with an internal journal (for test purposes):

mkfs.lustre --r --ost --fsname=work --mkfsoptions="-b 4096 -E stride=32,stripe-width=224 -m 0" --mgsnid=a03n00@o2ib --mgsnid=b03n00@o2ib /dev/md[10|11|12]

I'm using a separate MDT and MGS:

# MGS
mkfs.lustre --fsname=work --r --mgs --mkfsoptions="-b 4096 -E stride=4,stripe-width=4 -m 0" --mountfsoptions=acl --failnode=b03n00@o2ib /dev/sdb1

# MDT
mkfs.lustre --fsname=work --r --mgsnid=a03n00@o2ib --mgsnid=b03n00@o2ib --mdt --mkfsoptions="-b 4096 -E stride=4,stripe-width=40 -m 0" --mountfsoptions=acl --failnode=b03n00@o2ib /dev/sdc1

I'm using these packages on the servers:
----------
root at a03n00:~# rpm -aq | grep -i lustre
lustre-modules-1.8.1-2.6.18_128.1.14.el5_lustre.1.8.1
lustre-client-modules-1.8.1-2.6.18_128.1.14.el5_lustre.1.8.1
lustre-ldiskfs-3.0.9-2.6.18_128.1.14.el5_lustre.1.8.1
kernel-lustre-headers-2.6.18-128.1.14.el5_lustre.1.8.1
kernel-lustre-2.6.18-128.1.14.el5_lustre.1.8.1
lustre-client-1.8.1-2.6.18_128.1.14.el5_lustre.1.8.1
kernel-lustre-devel-2.6.18-128.1.14.el5_lustre.1.8.1
lustre-1.8.1-2.6.18_128.1.14.el5_lustre.1.8.1
kernel-ib-1.4.1-2.6.18_128.1.14.el5_lustre.1.8.1
----------
On the clients I compiled kernel 2.6.18-128.el5 without InfiniBand support, then compiled OFED 1.4.1, and after that built the patchless client. The patchless client was configured with:
--ofa-kernel=/usr/src/ofa_kernel
----------

* THE ERROR

Running, for example:

root at b00n00:~# mpirun -hostfile ./lustre.hosts -np 20 /hpc/IOR -w -r -C -i 2 -b 1G -t 512k -F -o /work/stripe12/teste

starts "hanging" the OSTs, and the filesystem "hangs". Any attempt to rm or read a file (or run df -kh) hangs and stays there forever (not even kill -9 helps).

With that, I cannot umount my OSTs on the OSSs, and I have to reboot the server, after which my RAID arrays start resyncing.

Tinoco
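As a side note, the ldiskfs extended options listed above are internally consistent if the md chunk size is 128 KB; that chunk size is an assumption here, so it is worth confirming against the arrays themselves:

# With 4 KB blocks and an 8-disk RAID5 (7 data disks):
#   stride       = chunk / block = 128 KB / 4 KB = 32 blocks
#   stripe-width = stride * 7    = 224 blocks
# Confirm the chunk size mdadm actually used:
mdadm --detail /dev/md10 | grep -i 'chunk size'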
Forget the file.. sorry
Tinoco
Rafael,

Can you tell me what RAID stripe_cache_size you are using? Have you tuned your OSS nodes with something like:

echo 4096 > /sys/block/md0/md/stripe_cache_size

If you haven't, can you try using the command above to tune all your RAID arrays? (A loop covering all three arrays is sketched just below this message.)

Cheers,
-Atul

Rafael David Tinoco wrote:
> I'm using 3 RAID5 arrays with 8 disks each and 256 OST threads on each OSS.
>
> root at a02n00:~# cat /etc/mdadm.conf
> ARRAY /dev/md10 level=raid5 num-devices=8 devices=/dev/dm-0,/dev/dm-1,/dev/dm-2,/dev/dm-3,/dev/dm-4,/dev/dm-5,/dev/dm-6,/dev/dm-7
> ARRAY /dev/md11 level=raid5 num-devices=8 devices=/dev/dm-8,/dev/dm-9,/dev/dm-10,/dev/dm-11,/dev/dm-12,/dev/dm-13,/dev/dm-14,/dev/dm-15
> ARRAY /dev/md12 level=raid5 num-devices=8 devices=/dev/dm-16,/dev/dm-17,/dev/dm-18,/dev/dm-19,/dev/dm-20,/dev/dm-21,/dev/dm-22,/dev/dm-23
> [...]
> Running, for example:
>
> root at b00n00:~# mpirun -hostfile ./lustre.hosts -np 20 /hpc/IOR -w -r -C -i 2 -b 1G -t 512k -F -o /work/stripe12/teste
>
> starts "hanging" the OSTs, and the filesystem "hangs". Any attempt to rm or read a file (or run df -kh) hangs and stays there forever (not even kill -9 helps).
>
> With that, I cannot umount my OSTs on the OSSs, and I have to reboot the server, after which my RAID arrays start resyncing.
>
> Tinoco
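Following up on that suggestion, a minimal way to apply it to all three OST arrays on one OSS (device names taken from the mdadm.conf quoted above; the value does not survive a reboot, so it would also need to go into rc.local or an init script):

# Enlarge the raid456 stripe cache for each Lustre OST array.
for md in md10 md11 md12; do
    echo 4096 > /sys/block/$md/md/stripe_cache_size
    cat /sys/block/$md/md/stripe_cache_size    # verify the new value
done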
It seeeeeeeems that using 64 OST threads solved the problem. :D

But it's too early to celebrate; I'm still running all the block size and stripe width combinations.

Regards,
Tinoco

-----Original Message-----
From: Oleg.Drokin at Sun.COM [mailto:Oleg.Drokin at Sun.COM]
Sent: Wednesday, September 09, 2009 7:26 PM
To: Rafael David Tinoco
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [Lustre-discuss] OSTs hanging while running IOR

Hello!

On Sep 9, 2009, at 1:31 PM, Rafael David Tinoco wrote:
> One of my OSSs crashes, sometimes one, sometimes another, with the following error:

That's not a crash. That's a watchdog timeout, indicative of Lustre spending too much time waiting on I/O. As such you need to somehow decrease the load on the system (e.g. by reducing the number of I/O threads, which was discussed on this list recently), increase obd_timeout, or get a faster disk subsystem.

Bye,
Oleg
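For the archives: on 1.8 the OST service thread count is normally capped with a module option on each OSS, along the lines of the sketch below. The parameter name is the one documented for 1.8.x; verify it against your release before rolling it out.

# /etc/modprobe.conf on every OSS: start at most 64 OST I/O threads
options ost oss_num_threads=64
# The OSTs have to be unmounted and the ost module reloaded (or the
# node rebooted) before the new thread count takes effect.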
On Wed, 2009-09-09 at 21:38 -0300, Rafael David Tinoco wrote:
> It seeeeeeeems that using 64 OST threads solved the problem.

Ahhh. As I suspected, then. Good.

Thanx for updating the thread. The archives, at least, will like that. :-)

b.
On Sep 09, 2009 19:32 -0300, Rafael David Tinoco wrote:
> Forget the file.. sorry
>
> Lustre: 0:0:(watchdog.c:181:lcw_cb()) Watchdog triggered for pid 16372: it was inactive for 200.00s
>
> [stack trace]

Since lots of users are confused by this message and think there is a crash, I think we should make the message more useful here. I've filed bug 20722 on this; it would be a trivial bug for someone to fix if they have some time.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.