Hendelman, Rob
2009-Dec-11 17:51 UTC
[Lustre-discuss] mdt threads "hanging" (?) until writes no longer permitted
Hello. We have tried both lustre 1.6.7.2 server & 1.6.7.2 client/1.8.1.1-56 (cvs) client (patchless) on Ubuntu with kernel 2.6.24-25.63 which maps to mainline 2.6.24.6 and have seen some strange behavior. MGS/MDS is a HPDL385G6 with 36G ram. It has two partitions on a a set of raid 73G 15k sas for mgs/mds: Filesystem Size Used Avail Use% Mounted on /dev/cciss/c0d0p1 61G 2.0G 56G 4% / tmpfs 18G 0 18G 0% /dev/shm /dev/lustre/mgs 1008M 34M 924M 4% /mnt/mgs /dev/lustre/mdt 7.0G 437M 6.2G 7% /mnt/mdt The OSS is a separate server with the same cpu/ram config but has 4 OST''s each a raid10 of 4 * 146G 15k sas drives on fc san Filesystem Size Used Avail Use% Mounted on /dev/cciss/c0d0p1 59G 2.0G 54G 4% / tmpfs 18G 0 18G 0% /dev/shm /dev/data/ost00 264G 80G 171G 32% /mnt/data-ost00 /dev/data/ost01 264G 76G 175G 31% /mnt/data-ost01 /dev/data/ost02 264G 74G 177G 30% /mnt/data-ost02 /dev/data/ost03 264G 73G 178G 30% /mnt/data-ost03 Our programmers here have created a short program that uses a 3rd party library (which we don''t have source code to) that actually performs the file I/O that creates our problem. The program does many many reads of very small files an no writes. What ends up happening is that sometime in the middle of the processing, the load goes up very very high 300-700 on the client running the program, however the system is still responsive except to a few commands ("w" or "ps" are good examples). On the MDT we see the following: BUG: soft lockup - CPU#1 stuck for 10s! [ll_evictor:3245] CPU 1: Modules linked in: sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) hidp(U) l2cap(U) bluetooth(U) osc(U) lov(U) mds(U) lquota(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ptlrpc(U) l diskfs(U) crc16(U) ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memho tplug(U) ac(U) parport_pc(U) lp(U) parport(U) pcspkr(U) hpwdt(U) serio_raw(U) bnx2(U) shpchp(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) usb_storage(U) lpfc(U) scsi_transpo rt_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U) Pid: 3245, comm: ll_evictor Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1 RIP: 0010:[<ffffffff800649e6>] [<ffffffff800649e6>] _write_lock+0x7/0xf RSP: 0018:ffff810493d89d68 EFLAGS: 00000246 RAX: ffff8103b4900c00 RBX: 000000000000c5e0 RCX: 00000000000120e6 RDX: 000000000000037e RSI: ffffffff802f1d80 RDI: ffffc20010861e0c RBP: 0000000000000282 R08: ffff8104afcb3360 R09: 0000000000000000 R10: ffff8103b49ed840 R11: 0000000000000150 R12: 0000000000000282 R13: ffff810493d89d10 R14: ffff8103b49ed898 R15: ffff8103b49ed840 FS: 00002b4db7ec2230(0000) GS:ffff8104a007ac40(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002ab671c5d0a0 CR3: 000000091d857000 CR4: 00000000000006e0 Call Trace: [<ffffffff883593fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290 [<ffffffff8835fc38>] :obdclass:class_disconnect+0x378/0x400 [<ffffffff884bd610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0 [<ffffffff885cc43b>] :mds:mds_disconnect+0x11b/0x310 [<ffffffff8835bce4>] :obdclass:class_fail_export+0x384/0x4c0 [<ffffffff8850daa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5 [<ffffffff8008abbc>] default_wake_function+0x0/0xe [<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336 [<ffffffff8005dfb1>] child_rip+0xa/0x11 [<ffffffff8850d5b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5 [<ffffffff8005dfa7>] child_rip+0x0/0x11 BUG: soft lockup - CPU#1 stuck for 10s! [ll_evictor:3245] CPU 1: Modules linked in: sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) hidp(U) l2cap(U) bluetooth(U) osc(U) lov(U) mds(U) lquota(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ptlrpc(U) l diskfs(U) crc16(U) ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memho tplug(U) ac(U) parport_pc(U) lp(U) parport(U) pcspkr(U) hpwdt(U) serio_raw(U) bnx2(U) shpchp(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) usb_storage(U) lpfc(U) scsi_transpo rt_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U) Pid: 3245, comm: ll_evictor Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1 RIP: 0010:[<ffffffff883593e0>] [<ffffffff883593e0>] :obdclass:lustre_hash_for_each_empty+0x1f0/0x290 RSP: 0018:ffff810493d89d70 EFLAGS: 00000206 RAX: ffff8103a8880200 RBX: 000000000000c479 RCX: 0000000000014eaf RDX: 000000000000025b RSI: ffffffff802f1d80 RDI: ffffc2001086079c RBP: ffffffff884bb912 R08: ffff8104afcb3360 R09: ffff8103b4f14a00 R10: 0000000c00000000 R11: 0000000000000150 R12: ffff8103a8880200 R13: 0000000000000282 R14: 0000000000000282 R15: ffff810493d89d10 FS: 00002b4db7ec2230(0000) GS:ffff8104a007ac40(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002ab671c5d0a0 CR3: 000000091d857000 CR4: 00000000000006e0 Call Trace: [<ffffffff8835fc38>] :obdclass:class_disconnect+0x378/0x400 [<ffffffff884bd610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0 [<ffffffff885cc43b>] :mds:mds_disconnect+0x11b/0x310 [<ffffffff8835bce4>] :obdclass:class_fail_export+0x384/0x4c0 [<ffffffff8850daa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5 [<ffffffff8008abbc>] default_wake_function+0x0/0xe [<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336 [<ffffffff8005dfb1>] child_rip+0xa/0x11 [<ffffffff8850d5b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5 [<ffffffff8005dfa7>] child_rip+0x0/0x11 As well as: BUG: soft lockup - CPU#3 stuck for 10s! [ll_mdt_21:3313] CPU 3: Modules linked in: sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) hidp(U) l2cap(U) bluetooth(U) osc(U) lov(U) mds(U) lquota(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ptlrpc(U) l diskfs(U) crc16(U) ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memho tplug(U) ac(U) parport_pc(U) lp(U) parport(U) pcspkr(U) hpwdt(U) serio_raw(U) bnx2(U) shpchp(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) usb_storage(U) lpfc(U) scsi_transpo rt_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U) Pid: 3313, comm: ll_mdt_21 Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1 RIP: 0010:[<ffffffff800649e6>] [<ffffffff800649e6>] _write_lock+0x7/0xf RSP: 0018:ffff81049346bb08 EFLAGS: 00000246 RAX: ffff8103fd4cd400 RBX: 0000000000007ff5 RCX: 000000000000c76e RDX: fffffffffff9d458 RSI: ffffffff802f1d80 RDI: ffffc20010865f5c RBP: 0000000000000282 R08: ffff8104afcbb8e0 R09: 0000000000000000 R10: ffff81043a9d6680 R11: 0000000000000150 R12: 0000000000000282 R13: ffff81049346bab0 R14: ffff81043a9d66d8 R15: ffff81043a9d6680 FS: 00002b000b9be230(0000) GS:ffff81091ff5bb40(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000037ff495770 CR3: 0000000000201000 CR4: 00000000000006e0 Call Trace: [<ffffffff883593fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290 [<ffffffff8835fc38>] :obdclass:class_disconnect+0x378/0x400 [<ffffffff884bd610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0 [<ffffffff885cc43b>] :mds:mds_disconnect+0x11b/0x310 [<ffffffff885010f9>] :ptlrpc:lustre_pack_reply+0x29/0xb0 [<ffffffff884c7c61>] :ptlrpc:target_handle_disconnect+0x461/0x560 [<ffffffff884fc005>] :ptlrpc:lustre_msg_get_opc+0x35/0xf0 [<ffffffff885c7734>] :mds:mds_handle+0xda4/0x4d20 [<ffffffff80143b75>] __next_cpu+0x19/0x28 [<ffffffff800898e6>] find_busiest_group+0x20d/0x621 [<ffffffff884fb795>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0 [<ffffffff88503eea>] :ptlrpc:ptlrpc_server_request_get+0x6a/0x150 [<ffffffff88505d6d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110 [<ffffffff885082f3>] :ptlrpc:ptlrpc_server_handle_request+0xa93/0x1150 [<ffffffff8006da18>] do_gettimeoffset_hpet+0x7/0x28 [<ffffffff8006d8a2>] do_gettimeofday+0x40/0x8f [<ffffffff882c37c6>] :libcfs:lcw_update_time+0x16/0x100 [<ffffffff8003d59a>] lock_timer_base+0x1b/0x3c [<ffffffff8001c5e4>] __mod_timer+0xb0/0xbe [<ffffffff8850b7e8>] :ptlrpc:ptlrpc_main+0x1218/0x13e0 [<ffffffff8008abbc>] default_wake_function+0x0/0xe [<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336 [<ffffffff8005dfb1>] child_rip+0xa/0x11 [<ffffffff8850a5d0>] :ptlrpc:ptlrpc_main+0x0/0x13e0 [<ffffffff8005dfa7>] child_rip+0x0/0x11 Writing from ANY client (e.g.: touch ./filename) to the lustre mount then returns "no space left on device." Note above that we are not over 30% capacity on any OST and 4/7% capacity on mgs/mdt. Any suggestions? We have another cluster with an entirely different I/O load with 1.6.7.2. servers & clients that show no problems at all. Thank you, Robert The information contained in this message and its attachments is intended only for the private and confidential use of the intended recipient(s). If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e- mail is strictly prohibited.
Brian J. Murrell
2009-Dec-11 19:03 UTC
[Lustre-discuss] mdt threads "hanging" (?) until writes no longer permitted
On Fri, 2009-12-11 at 11:51 -0600, Hendelman, Rob wrote:> BUG: soft lockup - CPU#1 stuck for 10s! [ll_evictor:3245] > CPU 1: > Modules linked in: sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) hidp(U) l2cap(U) bluetooth(U) osc(U) lov(U) mds(U) lquota(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ptlrpc(U) l > diskfs(U) crc16(U) ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memho > tplug(U) ac(U) parport_pc(U) lp(U) parport(U) pcspkr(U) hpwdt(U) serio_raw(U) bnx2(U) shpchp(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) usb_storage(U) lpfc(U) scsi_transpo > rt_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U) > Pid: 3245, comm: ll_evictor Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1 > RIP: 0010:[<ffffffff800649e6>] [<ffffffff800649e6>] _write_lock+0x7/0xf > RSP: 0018:ffff810493d89d68 EFLAGS: 00000246 > RAX: ffff8103b4900c00 RBX: 000000000000c5e0 RCX: 00000000000120e6 > RDX: 000000000000037e RSI: ffffffff802f1d80 RDI: ffffc20010861e0c > RBP: 0000000000000282 R08: ffff8104afcb3360 R09: 0000000000000000 > R10: ffff8103b49ed840 R11: 0000000000000150 R12: 0000000000000282 > R13: ffff810493d89d10 R14: ffff8103b49ed898 R15: ffff8103b49ed840 > FS: 00002b4db7ec2230(0000) GS:ffff8104a007ac40(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 00002ab671c5d0a0 CR3: 000000091d857000 CR4: 00000000000006e0 > > Call Trace: > [<ffffffff883593fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290This looks like bug 19557 with patches landed for 1.8.2 already and work still ongoing on making lustre_hash_for_each_empty() more efficient. b. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20091211/126ee0e7/attachment.bin