Hi list!

We have a very simple Lustre setup as follows:

Server1 (MGS/MDS): one MGS/MDS that contains 3 LUNs for 2 Lustre filesystems...
  1 LUN = MGS data
  1 LUN = home dirs for users
  1 LUN = research data
Server2 (currently unused)
Server3 (OSS for research data - no errors)
Server4 (OSS for mds1 that contains homedir data): 12 OSTs, approximately 1.1 TB each

All servers are running CentOS 5.1 with the 1.6.7.2 RPMs from Sun. We also have 5 clients that are running Ubuntu + 2.6.22.19/patchless.

Today client1 lost its Lustre mounts (a df -h hangs) but the other clients were all OK. On the OSS for the homedir data, I saw the following in /var/log/syslog:

Oct 5 13:07:48 maglustre04 kernel:
Oct 5 13:07:58 maglustre04 kernel: BUG: soft lockup - CPU#1 stuck for 10s! [ll_ost_35:13366]
Oct 5 13:07:58 maglustre04 kernel: CPU 1:
Oct 5 13:07:58 maglustre04 kernel: Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) sunrpc(U) dm_round_robin(U) dm_emc(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sg(U) pata_acpi(U) lpfc(U) ide_cd(U) hpwdt(U) bnx2(U) shpchp(U) cdrom(U) scsi_transport_fc(U) i5000_edac(U) serio_raw(U) edac_mc(U) pcspkr(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U)
Oct 5 13:07:58 maglustre04 kernel: Pid: 13366, comm: ll_ost_35 Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Oct 5 13:07:58 maglustre04 kernel: RIP: 0010:[<ffffffff8856caed>] [<ffffffff8856caed>] :ptlrpc:ptlrpc_queue_wait+0x93d/0x1690
Oct 5 13:07:58 maglustre04 kernel: RSP: 0018:ffff8101acd09780 EFLAGS: 00000202
Oct 5 13:07:58 maglustre04 kernel: RAX: ffff81051bf1cc00 RBX: ffff8103f1b76000 RCX: 0000000000080000
Oct 5 13:07:58 maglustre04 kernel: RDX: ffff81051bf1cca0 RSI: ffff81023e40fc08 RDI: ffff8103f1b76008
Oct 5 13:07:58 maglustre04 kernel: RBP: ffff8103f1b7605c R08: 00000000ffffffff R09: 0000000000000020
Oct 5 13:07:58 maglustre04 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8103f1b76000
Oct 5 13:07:58 maglustre04 kernel: R13: ffff8103f1b76000 R14: 0000000000000013 R15: ffffffff885657ec
Oct 5 13:07:58 maglustre04 kernel: FS: 00002b9509e77220(0000) GS:ffff81052ff9a640(0000) knlGS:0000000000000000
Oct 5 13:07:58 maglustre04 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 5 13:07:58 maglustre04 kernel: CR2: 0000003184c99a60 CR3: 0000000000201000 CR4: 00000000000006e0
Oct 5 13:07:58 maglustre04 kernel:
Oct 5 13:07:58 maglustre04 kernel: Call Trace:
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885768b5>] :ptlrpc:lustre_msg_set_opc+0x45/0x120
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88566e73>] :ptlrpc:ptlrpc_prep_req_pool+0x613/0x6b0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8008abbc>] default_wake_function+0x0/0xe
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88554a87>] :ptlrpc:ldlm_server_glimpse_ast+0x257/0x3a0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88561953>] :ptlrpc:interval_iterate_reverse+0x73/0x240
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88549700>] :ptlrpc:ldlm_process_extent_lock+0x0/0xad0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8881818c>] :obdfilter:filter_intent_policy+0x68c/0x7a0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88536d76>] :ptlrpc:ldlm_lock_enqueue+0x186/0xb00
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885518ef>] :ptlrpc:ldlm_export_lock_get+0x6f/0xe0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884ba688>] :obdclass:lustre_hash_add+0x208/0x2d0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8855a490>] :ptlrpc:ldlm_server_blocking_ast+0x0/0x833
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885585e9>] :ptlrpc:ldlm_handle_enqueue+0xc09/0x1200
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885751b8>] :ptlrpc:lustre_msg_check_version_v2+0x8/0x20
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff887d630a>] :ost:ost_handle+0x565a/0x5cd0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80143b75>] __next_cpu+0x19/0x28
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80143b75>] __next_cpu+0x19/0x28
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800898e6>] find_busiest_group+0x20d/0x621
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88574795>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8857ceea>] :ptlrpc:ptlrpc_server_request_get+0x6a/0x150
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8857ed6d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885812f3>] :ptlrpc:ptlrpc_server_handle_request+0xa93/0x1150
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80062f4b>] thread_return+0x0/0xdf
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8006d8a2>] do_gettimeofday+0x40/0x8f
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884247c6>] :libcfs:lcw_update_time+0x16/0x100
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800891f9>] __wake_up_common+0x3e/0x68
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885847e8>] :ptlrpc:ptlrpc_main+0x1218/0x13e0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8008abbc>] default_wake_function+0x0/0xe
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885835d0>] :ptlrpc:ptlrpc_main+0x0/0x13e0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct 5 13:07:58 maglustre04 kernel:
Oct 5 13:07:58 maglustre04 kernel: BUG: soft lockup - CPU#4 stuck for 10s! [ll_ost_90:13421]
Oct 5 13:07:58 maglustre04 kernel: CPU 4:
Oct 5 13:07:58 maglustre04 kernel: Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) sunrpc(U) dm_round_robin(U) dm_emc(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sg(U) pata_acpi(U) lpfc(U) ide_cd(U) hpwdt(U) bnx2(U) shpchp(U) cdrom(U) scsi_transport_fc(U) i5000_edac(U) serio_raw(U) edac_mc(U) pcspkr(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U)
Oct 5 13:07:58 maglustre04 kernel: Pid: 13421, comm: ll_ost_90 Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Oct 5 13:07:58 maglustre04 kernel: RIP: 0010:[<ffffffff8005475f>] [<ffffffff8005475f>] strrchr+0x19/0x24
Oct 5 13:07:58 maglustre04 kernel: RSP: 0018:ffff8101e534f358 EFLAGS: 00000212
Oct 5 13:07:58 maglustre04 kernel: RAX: ffffffff885a6497 RBX: ffffffff885ae804 RCX: 0000000000000039
Oct 5 13:07:58 maglustre04 kernel: RDX: ffffffff885a6460 RSI: 000000000000002f RDI: ffffffff885a6499
Oct 5 13:07:58 maglustre04 kernel: RBP: 0000010000000100 R08: ffffffff8859ebe0 R09: 00000000000007b7
Oct 5 13:07:58 maglustre04 kernel: R10: ffffffff885ae831 R11: ffffffff885ae804 R12: ffffffff00000107
Oct 5 13:07:58 maglustre04 kernel: R13: ffff8101b0531b58 R14: 00000000000000b4 R15: 000000a800000100
Oct 5 13:07:58 maglustre04 kernel: FS: 00002b9509e77220(0000) GS:ffff81052fe21b40(0000) knlGS:0000000000000000
Oct 5 13:07:58 maglustre04 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 5 13:07:58 maglustre04 kernel: CR2: 0000003184c6bf00 CR3: 0000000000201000 CR4: 00000000000006e0
Oct 5 13:07:58 maglustre04 kernel:
Oct 5 13:07:58 maglustre04 kernel: Call Trace:
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884232aa>] :libcfs:libcfs_debug_vmsg2+0x4a/0x980
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88576005>] :ptlrpc:_debug_req+0x4b5/0x4d0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8002e1d8>] __wake_up+0x38/0x4f
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88576005>] :ptlrpc:_debug_req+0x4b5/0x4d0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88566242>] :ptlrpc:ptlrpc_expire_one_request+0x1d2/0x530
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885657ec>] :ptlrpc:ptlrpc_unregister_reply+0x13c/0x9c0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8856929d>] :ptlrpc:ptlrpc_check_reply+0x18d/0x530
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8856caa0>] :ptlrpc:ptlrpc_queue_wait+0x8f0/0x1690
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885768b5>] :ptlrpc:lustre_msg_set_opc+0x45/0x120
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88566e73>] :ptlrpc:ptlrpc_prep_req_pool+0x613/0x6b0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8008abbc>] default_wake_function+0x0/0xe
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88554a87>] :ptlrpc:ldlm_server_glimpse_ast+0x257/0x3a0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88561953>] :ptlrpc:interval_iterate_reverse+0x73/0x240
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88549700>] :ptlrpc:ldlm_process_extent_lock+0x0/0xad0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8881818c>] :obdfilter:filter_intent_policy+0x68c/0x7a0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88536d76>] :ptlrpc:ldlm_lock_enqueue+0x186/0xb00
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885518ef>] :ptlrpc:ldlm_export_lock_get+0x6f/0xe0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884ba688>] :obdclass:lustre_hash_add+0x208/0x2d0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8855a490>] :ptlrpc:ldlm_server_blocking_ast+0x0/0x833
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885585e9>] :ptlrpc:ldlm_handle_enqueue+0xc09/0x1200
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885751b8>] :ptlrpc:lustre_msg_check_version_v2+0x8/0x20
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff887d630a>] :ost:ost_handle+0x565a/0x5cd0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80143b75>] __next_cpu+0x19/0x28
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800898e6>] find_busiest_group+0x20d/0x621
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff88574795>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8857ceea>] :ptlrpc:ptlrpc_server_request_get+0x6a/0x150
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8857ed6d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885812f3>] :ptlrpc:ptlrpc_server_handle_request+0xa93/0x1150
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff80062f4b>] thread_return+0x0/0xdf
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8006d8a2>] do_gettimeofday+0x40/0x8f
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff884247c6>] :libcfs:lcw_update_time+0x16/0x100
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800891f9>] __wake_up_common+0x3e/0x68
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885847e8>] :ptlrpc:ptlrpc_main+0x1218/0x13e0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8008abbc>] default_wake_function+0x0/0xe
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff885835d0>] :ptlrpc:ptlrpc_main+0x0/0x13e0
Oct 5 13:07:58 maglustre04 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct 5 13:07:58 maglustre04 kernel:

After searching Bugzilla, it appears this may be bug #19785. Do you guys agree with this? The difference is that the "RIP" line there contains a reference to .text.lock.spinlock, while for us it contains strrchr (for one thread) and ptlrpc_queue_wait (for the other thread). In the meantime, server4 (maglustre04) has two hung threads (100% CPU) which appear to be OST I/O related. What is the correct way to resolve this?

Thank you,
Robert
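For anyone hitting the same symptom, a minimal triage sketch follows. It assumes a stock 1.6.x install and standard Linux tools (nothing here is specific to this site); verify each command exists on your build before relying on it.

    # On the hung client: check what the client thinks is unreachable.
    lfs check servers        # per-target status; errors point at the stuck OST/MDT
    lctl dl                  # device list; look for targets that are not UP
    dmesg | tail -50         # recent client-side Lustre/LNET messages

    # On the busy OSS: identify the spinning threads and capture their stacks.
    top -b -n 1 | grep ll_ost        # confirm which ll_ost_* threads are at 100%
    echo t > /proc/sysrq-trigger     # dump all kernel task stacks to syslog
    lctl dk /tmp/lustre-debug.txt    # dump the Lustre kernel debug buffer to a file

The sysrq and lctl dk output is typically what gets attached to a Bugzilla ticket alongside /var/log/messages.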
It looks like the threads finally died... The 2 CPU cores that were pegged at 100% are idle again.

That seems like one heck of a timeout...

=============
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request x6413848 took longer than estimated (100+5495s); client may timeout. req at ffff81009308c400 x6413848/t0 o101->1b9e4991-1d5e-814d-2607-8c52f432e68d@:0/0 lens 232/288 e 0 to 0 dl 1254764364 ref 1 fl Complete:/0/0 rc 301/301
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13421:0:(watchdog.c:330:lcw_update_time()) Expired watchdog for pid 13421 disabled after 5595.8041s
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(service.c:1317:ptlrpc_server_handle_request()) Skipped 1 previous similar message
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(watchdog.c:330:lcw_update_time()) Expired watchdog for pid 13366 disabled after 5595.8059s

Robert Hendelman Jr
Magnetar Capital LLC
Rob.Hendelman at magnetar.com
1-847-905-4557
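The numbers in those messages are consistent rather than two separate timeouts: the OST service estimated the request at 100 s and it overran by 5495 s, and 100 + 5495 = 5595 s matches the "disabled after 5595.80xxs" watchdog lines for pids 13366 and 13421, so both messages describe the same roughly 93-minute stall. For reference, a sketch of where the relevant timeout knobs live on a 1.6.x server; the /proc names below are taken from a stock 1.6.x layout and should be treated as assumptions to verify locally:

    cat /proc/sys/lustre/timeout            # static RPC timeout (obd_timeout), in seconds
    cat /proc/sys/lustre/at_min             # adaptive-timeout floor (1.6.5+ with AT enabled)
    cat /proc/sys/lustre/at_max             # adaptive-timeout ceiling
    cat /proc/fs/lustre/ost/OSS/*/timeouts  # per-service estimate history, if present on your build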
Hello!

On Oct 5, 2009, at 4:40 PM, Hendelman, Rob wrote:
> It looks like the threads finally died... The 2 CPU cores that were
> pegged at 100% are idle again.
>
> That seems like one heck of a timeout...

Was there a client eviction right before this message? The watchdog trace from your previous message is in the middle of waiting for a reply from a client to a glimpse AST. Granted, this should not occupy 100% CPU, so it is not completely clear what happened. The traces are not similar enough to 19785, I would say.

Bye,
    Oleg
Hello Mr. Drokin,

Thank you for your prior response. There was a client eviction just prior to the threads hanging and eating 100% CPU, but NOT prior to the OSS finally dropping CPU usage again. Here is a basic timeline (in hours:min "military" time):

09:07am->12:39: Client "6", which was cloned from "client1", is being worked on, rebooted, and connected/disconnected from the Lustre servers. No issues.
12:39: Final OSS message that says "haven't heard from <ip of client66> in 240 seconds, I think it's dead and I'm evicting it."
12:40: What appear to be stack dumps on the OSS server for 2 I/O threads (previously mentioned).
12:44: client1 has lost its Lustre mounts and is complaining in Nagios. All other clients are fine.
13:13: "Stack dumps" once a minute or so, but no LBUG. I leave the server up and finally reboot client1. The other clients 2-5 are not affected. All other clients seem to be working normally, so I don't touch the OSS.
14:10: Final messages on the OSS before the OSS calms down (no messages after this):

Oct 5 14:10:56 maglustre04 kernel:
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(service.c:1317:ptlrpc_server_handle_request()) @@@ Request x6413848 took longer than estimated (100+5495s); client may timeout. req at ffff81009308c400 x6413848/t0 o101->1b9e4991-1d5e-814d-2607-8c52f432e68d@:0/0 lens 232/288 e 0 to 0 dl 1254764364 ref 1 fl Complete:/0/0 rc 301/301
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13421:0:(watchdog.c:330:lcw_update_time()) Expired watchdog for pid 13421 disabled after 5595.8041s
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(service.c:1317:ptlrpc_server_handle_request()) Skipped 1 previous similar message
Oct 5 14:10:59 maglustre04 kernel: Lustre: 13366:0:(watchdog.c:330:lcw_update_time()) Expired watchdog for pid 13366 disabled after 5595.8059s

Should I file a new bug? Is there enough info in /var/log/messages to file a bug, or do I need to turn on some sort of more verbose debugging in case this happens again?

Thanks,
Robert
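On the verbose-debugging question, a sketch of what is commonly enabled before trying to reproduce; the /proc paths and debug-mask names below are based on a stock 1.6.x install and are assumptions to verify on the OSS first:

    echo 128 > /proc/sys/lnet/debug_mb                   # enlarge the in-kernel debug buffer (in MB)
    echo "+rpctrace +dlmtrace" > /proc/sys/lnet/debug    # add RPC and DLM lock tracing to the debug mask
    # ...reproduce the hang, then dump the buffer for the bug report:
    lctl dk /tmp/lustre-debug-$(hostname)-$(date +%s).log

That, together with /var/log/messages from the OSS and the affected client, is usually what gets attached to a new Bugzilla ticket.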