Hi list!

We have a very simple Lustre setup as follows:

Server1 (MGS/MDS): one MGS/MDS that contains 3 LUNs for 2 Lustre filesystems...
  1 LUN = MGS data
  1 LUN = home dirs for users
  1 LUN = research data
Server2 (currently unused)
Server3 (OSS for research data - no errors)
Server4 (OSS for mds1 that contains homedir data): 12 OSTs, approximately 1.1 TB each

All servers are running CentOS 5.1 with the 1.6.7.2 RPMs from Sun. We also have 5 clients that are running Ubuntu + 2.6.22.19/patchless.

Today client1 lost its Lustre mounts (a df -h hangs) but the other clients were all OK. On the OSS for the homedir data, I saw the following in /var/log/syslog:

Oct 5 13:07:48 maglustre04 kernel:
Oct 5 13:07:58 maglustre04 kernel: BUG: soft lockup - CPU#1 stuck for 10s! [ll_ost_35:13366]
Oct 5 13:07:58 maglustre04 kernel: CPU 1:
Oct 5 13:07:58 maglustre04 kernel: Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) sunrpc(U) dm_round_robin(U) dm_emc(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sg(U) pata_acpi(U) lpfc(U) ide_cd(U) hpwdt(U) bnx2(U) shpchp(U) cdrom(U) scsi_transport_fc(U) i5000_edac(U) serio_raw(U) edac_mc(U) pcspkr(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U)
Oct 5 13:07:58 maglustre04 kernel: Pid: 13366, comm: ll_ost_35 Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Oct 5 13:07:58 maglustre04 kernel: RIP: 0010:[<ffffffff8856caed>] [<ffffffff8856caed>] :ptlrpc:ptlrpc_queue_wait+0x93d/0x1690
Oct 5 13:07:58 maglustre04 kernel: RSP: 0018:ffff8101acd09780 EFLAGS: 00000202
Oct 5 13:07:58 maglustre04 kernel: RAX: ffff81051bf1cc00 RBX: ffff8103f1b76000 RCX: 0000000000080000
Oct 5 13:07:58 maglustre04 kernel: RDX: ffff81051bf1cca0 RSI: ffff81023e40fc08 RDI: ffff8103f1b76008
Oct 5 13:07:58 maglustre04 kernel: RBP: ffff8103f1b7605c R08: 00000000ffffffff R09: 0000000000000020
Oct 5 13:07:58 maglustre04 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8103f1b76000
Oct 5 13:07:58 maglustre04 kernel: R13: ffff8103f1b76000 R14: 0000000000000013 R15: ffffffff885657ec
Oct 5 13:07:58 maglustre04 kernel: FS: 00002b9509e77220(0000) GS:ffff81052ff9a640(0000) knlGS:0000000000000000
Oct 5 13:07:58 maglustre04 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 5 13:07:58 maglustre04 kernel: CR2: 0000003184c99a60 CR3: 0000000000201000 CR4: 00000000000006e0
Oct 5 13:07:58 maglustre04 kernel:
Oct 5 13:07:58 maglustre04 kernel: Call Trace:
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885768b5>] :ptlrpc:lustre_msg_set_opc+0x45/0x120
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88566e73>] :ptlrpc:ptlrpc_prep_req_pool+0x613/0x6b0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8008abbc>] default_wake_function+0x0/0xe
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88554a87>] :ptlrpc:ldlm_server_glimpse_ast+0x257/0x3a0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88561953>] :ptlrpc:interval_iterate_reverse+0x73/0x240
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88549700>] :ptlrpc:ldlm_process_extent_lock+0x0/0xad0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8881818c>] :obdfilter:filter_intent_policy+0x68c/0x7a0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88536d76>] :ptlrpc:ldlm_lock_enqueue+0x186/0xb00
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885518ef>] :ptlrpc:ldlm_export_lock_get+0x6f/0xe0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884ba688>] :obdclass:lustre_hash_add+0x208/0x2d0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8855a490>] :ptlrpc:ldlm_server_blocking_ast+0x0/0x833
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885585e9>] :ptlrpc:ldlm_handle_enqueue+0xc09/0x1200
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885751b8>] :ptlrpc:lustre_msg_check_version_v2+0x8/0x20
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff887d630a>] :ost:ost_handle+0x565a/0x5cd0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80143b75>] __next_cpu+0x19/0x28
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80143b75>] __next_cpu+0x19/0x28
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800898e6>] find_busiest_group+0x20d/0x621
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88574795>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8857ceea>] :ptlrpc:ptlrpc_server_request_get+0x6a/0x150
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8857ed6d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885812f3>] :ptlrpc:ptlrpc_server_handle_request+0xa93/0x1150
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80062f4b>] thread_return+0x0/0xdf
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8006d8a2>] do_gettimeofday+0x40/0x8f
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884247c6>] :libcfs:lcw_update_time+0x16/0x100
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800891f9>] __wake_up_common+0x3e/0x68
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885847e8>] :ptlrpc:ptlrpc_main+0x1218/0x13e0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8008abbc>] default_wake_function+0x0/0xe
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885835d0>] :ptlrpc:ptlrpc_main+0x0/0x13e0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct 5 13:07:58 maglustre04 kernel:
Oct 5 13:07:58 maglustre04 kernel: BUG: soft lockup - CPU#4 stuck for 10s! [ll_ost_90:13421]
Oct 5 13:07:58 maglustre04 kernel: CPU 4:
Oct 5 13:07:58 maglustre04 kernel: Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) sunrpc(U) dm_round_robin(U) dm_emc(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sg(U) pata_acpi(U) lpfc(U) ide_cd(U) hpwdt(U) bnx2(U) shpchp(U) cdrom(U) scsi_transport_fc(U) i5000_edac(U) serio_raw(U) edac_mc(U) pcspkr(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U)
Oct 5 13:07:58 maglustre04 kernel: Pid: 13421, comm: ll_ost_90 Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Oct 5 13:07:58 maglustre04 kernel: RIP: 0010:[<ffffffff8005475f>] [<ffffffff8005475f>] strrchr+0x19/0x24
Oct 5 13:07:58 maglustre04 kernel: RSP: 0018:ffff8101e534f358 EFLAGS: 00000212
Oct 5 13:07:58 maglustre04 kernel: RAX: ffffffff885a6497 RBX: ffffffff885ae804 RCX: 0000000000000039
Oct 5 13:07:58 maglustre04 kernel: RDX: ffffffff885a6460 RSI: 000000000000002f RDI: ffffffff885a6499
Oct 5 13:07:58 maglustre04 kernel: RBP: 0000010000000100 R08: ffffffff8859ebe0 R09: 00000000000007b7
Oct 5 13:07:58 maglustre04 kernel: R10: ffffffff885ae831 R11: ffffffff885ae804 R12: ffffffff00000107
Oct 5 13:07:58 maglustre04 kernel: R13: ffff8101b0531b58 R14: 00000000000000b4 R15: 000000a800000100
Oct 5 13:07:58 maglustre04 kernel: FS: 00002b9509e77220(0000) GS:ffff81052fe21b40(0000) knlGS:0000000000000000
Oct 5 13:07:58 maglustre04 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 5 13:07:58 maglustre04 kernel: CR2: 0000003184c6bf00 CR3: 0000000000201000 CR4: 00000000000006e0
Oct 5 13:07:58 maglustre04 kernel:
Oct 5 13:07:58 maglustre04 kernel: Call Trace:
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884232aa>] :libcfs:libcfs_debug_vmsg2+0x4a/0x980
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88576005>] :ptlrpc:_debug_req+0x4b5/0x4d0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8002e1d8>] __wake_up+0x38/0x4f
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88576005>] :ptlrpc:_debug_req+0x4b5/0x4d0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88566242>] :ptlrpc:ptlrpc_expire_one_request+0x1d2/0x530
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885657ec>] :ptlrpc:ptlrpc_unregister_reply+0x13c/0x9c0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8856929d>] :ptlrpc:ptlrpc_check_reply+0x18d/0x530
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8856caa0>] :ptlrpc:ptlrpc_queue_wait+0x8f0/0x1690
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885768b5>] :ptlrpc:lustre_msg_set_opc+0x45/0x120
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88566e73>] :ptlrpc:ptlrpc_prep_req_pool+0x613/0x6b0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8008abbc>] default_wake_function+0x0/0xe
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88554a87>] :ptlrpc:ldlm_server_glimpse_ast+0x257/0x3a0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88561953>] :ptlrpc:interval_iterate_reverse+0x73/0x240
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88549700>] :ptlrpc:ldlm_process_extent_lock+0x0/0xad0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8881818c>] :obdfilter:filter_intent_policy+0x68c/0x7a0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88536d76>] :ptlrpc:ldlm_lock_enqueue+0x186/0xb00
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885518ef>] :ptlrpc:ldlm_export_lock_get+0x6f/0xe0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884ba688>] :obdclass:lustre_hash_add+0x208/0x2d0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8855a490>] :ptlrpc:ldlm_server_blocking_ast+0x0/0x833
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885585e9>] :ptlrpc:ldlm_handle_enqueue+0xc09/0x1200
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885751b8>] :ptlrpc:lustre_msg_check_version_v2+0x8/0x20
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff887d630a>] :ost:ost_handle+0x565a/0x5cd0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80143b75>] __next_cpu+0x19/0x28
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800898e6>] find_busiest_group+0x20d/0x621
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88574795>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8857ceea>] :ptlrpc:ptlrpc_server_request_get+0x6a/0x150
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8857ed6d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885812f3>] :ptlrpc:ptlrpc_server_handle_request+0xa93/0x1150
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80062f4b>] thread_return+0x0/0xdf
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8006d8a2>] do_gettimeofday+0x40/0x8f
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884247c6>] :libcfs:lcw_update_time+0x16/0x100
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800891f9>] __wake_up_common+0x3e/0x68
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885847e8>] :ptlrpc:ptlrpc_main+0x1218/0x13e0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8008abbc>] default_wake_function+0x0/0xe
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885835d0>] :ptlrpc:ptlrpc_main+0x0/0x13e0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct 5 13:07:58 maglustre04 kernel:

After searching Bugzilla, it appears this may be bug #19785. Do you guys agree with this? The difference is that the "RIP" line there contains a reference to .text.lock.spinlock, while for us it contains strrchr (for one thread) and ptlrpc_queue_wait (for the other thread). In the meantime, server4 (maglustre04) has two hung threads (100% CPU) which appear to be OST I/O related. What is the correct way to resolve this?

Thank you,
Robert
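For anyone hitting the same symptom, a minimal triage sketch follows. It assumes a stock 1.6.x install and standard Linux tools (nothing here is specific to this site); verify each command exists on your build before relying on it.

    # On the hung client: check what the client thinks is unreachable.
    lfs check servers        # per-target status; errors point at the stuck OST/MDT
    lctl dl                  # device list; look for targets that are not UP
    dmesg | tail -50         # recent client-side Lustre/LNET messages

    # On the busy OSS: identify the spinning threads and capture their stacks.
    top -b -n 1 | grep ll_ost        # confirm which ll_ost_* threads are at 100%
    echo t > /proc/sysrq-trigger     # dump all kernel task stacks to syslog
    lctl dk /tmp/lustre-debug.txt    # dump the Lustre kernel debug buffer to a file

The sysrq and lctl dk output is typically what gets attached to a Bugzilla ticket alongside /var/log/messages.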
It looks like the threads finally died... The 2 CPU cores that were pegged at 100% are idle again.

That seems like one heck of a timeout...

=============
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request x6413848 took longer than estimated (100+5495s); client may timeout. req at ffff81009308c400 x6413848/t0 o101->1b9e4991-1d5e-814d-2607-8c52f432e68d@:0/0 lens 232/288 e 0 to 0 dl 1254764364 ref 1 fl Complete:/0/0 rc 301/301
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13421:0:(watchdog.c:330:lcw_update_time()) Expired watchdog for pid 13421 disabled after 5595.8041s
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(service.c:1317:ptlrpc_server_handle_request()) Skipped 1 previous similar message
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(watchdog.c:330:lcw_update_time()) Expired watchdog for pid 13366 disabled after 5595.8059s

Robert Hendelman Jr
Magnetar Capital LLC
Rob.Hendelman at magnetar.com
1-847-905-4557
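The numbers in those messages are consistent rather than two separate timeouts: the OST service estimated the request at 100 s and it overran by 5495 s, and 100 + 5495 = 5595 s matches the "disabled after 5595.80xxs" watchdog lines for pids 13366 and 13421, so both messages describe the same roughly 93-minute stall. For reference, a sketch of where the relevant timeout knobs live on a 1.6.x server; the /proc names below are taken from a stock 1.6.x layout and should be treated as assumptions to verify locally:

    cat /proc/sys/lustre/timeout            # static RPC timeout (obd_timeout), in seconds
    cat /proc/sys/lustre/at_min             # adaptive-timeout floor (1.6.5+ with AT enabled)
    cat /proc/sys/lustre/at_max             # adaptive-timeout ceiling
    cat /proc/fs/lustre/ost/OSS/*/timeouts  # per-service estimate history, if present on your build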
Hello!

On Oct 5, 2009, at 4:40 PM, Hendelman, Rob wrote:
> It looks like the threads finally died... The 2 CPU cores that were
> pegged at 100% are idle again.
>
> That seems like one heck of a timeout...

Was there a client eviction right before this message? The watchdog trace from your previous message is in the middle of waiting for a reply from a client to a glimpse AST. Granted, this should not occupy 100% CPU, so it is not completely clear what happened. The traces are not similar enough to 19785, I would say.

Bye,
    Oleg
Hello Mr. Drokin,

Thank you for your prior response. There was a client eviction just prior to the threads hanging and eating 100% CPU, but NOT prior to the OSS finally dropping CPU usage again. Here is a basic timeline (in hours:min "military" time):

09:07am->12:39: Client "6", which was cloned from "client1", is being worked on, rebooted, and connected/disconnected from the Lustre servers. No issues.
12:39: Final OSS message that says "haven't heard from <ip of client66> in 240 seconds, I think it's dead and I'm evicting it."
12:40: What appear to be stack dumps on the OSS server for 2 I/O threads (previously mentioned).
12:44: client1 has lost its Lustre mounts and is complaining in Nagios. All other clients are fine.
13:13: "Stack dumps" once a minute or so, but no LBUG. I leave the server up and finally reboot client1. The other clients 2-5 are not affected. All other clients seem to be working normally, so I don't touch the OSS.
14:10: Final messages on the OSS before the OSS calms down (no messages after this):

Oct 5 14:10:56 maglustre04 kernel:
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request x6413848 took longer than estimated (100+5495s); client may timeout. req at ffff81009308c400 x6413848/t0 o101->1b9e4991-1d5e-814d-2607-8c52f432e68d@:0/0 lens 232/288 e 0 to 0 dl 1254764364 ref 1 fl Complete:/0/0 rc 301/301
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13421:0:(watchdog.c:330:lcw_update_time()) Expired watchdog for pid 13421 disabled after 5595.8041s
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(service.c:1317:ptlrpc_server_handle_request()) Skipped 1 previous similar message
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(watchdog.c:330:lcw_update_time()) Expired watchdog for pid 13366 disabled after 5595.8059s

Should I file a new bug? Is there enough info in /var/log/messages to file a bug, or do I need to turn on some sort of more verbose debugging in case this happens again?

Thanks,
Robert
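On the verbose-debugging question, a sketch of what is commonly enabled before trying to reproduce; the /proc paths and debug-mask names below are based on a stock 1.6.x install and are assumptions to verify on the OSS first:

    echo 128 > /proc/sys/lnet/debug_mb                   # enlarge the in-kernel debug buffer (in MB)
    echo "+rpctrace +dlmtrace" > /proc/sys/lnet/debug    # add RPC and DLM lock tracing to the debug mask
    # ...reproduce the hang, then dump the buffer for the bug report:
    lctl dk /tmp/lustre-debug-$(hostname)-$(date +%s).log

That, together with /var/log/messages from the OSS and the affected client, is usually what gets attached to a new Bugzilla ticket.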