Hendelman, Rob
2009-Dec-11 17:51 UTC
[Lustre-discuss] mdt threads "hanging" (?) until writes no longer permitted
Hello.
We have tried both lustre 1.6.7.2 server & 1.6.7.2 client/1.8.1.1-56 (cvs)
client (patchless) on Ubuntu with kernel 2.6.24-25.63 which maps to mainline
2.6.24.6 and have seen some strange behavior.
MGS/MDS is a HPDL385G6 with 36G ram. It has two partitions on a a set of raid
73G 15k sas for mgs/mds:
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p1 61G 2.0G 56G 4% /
tmpfs 18G 0 18G 0% /dev/shm
/dev/lustre/mgs 1008M 34M 924M 4% /mnt/mgs
/dev/lustre/mdt
7.0G 437M 6.2G 7% /mnt/mdt
The OSS is a separate server with the same cpu/ram config but has 4
OST''s each a raid10 of 4 * 146G 15k sas drives on fc san
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p1 59G 2.0G 54G 4% /
tmpfs 18G 0 18G 0% /dev/shm
/dev/data/ost00
264G 80G 171G 32% /mnt/data-ost00
/dev/data/ost01
264G 76G 175G 31% /mnt/data-ost01
/dev/data/ost02
264G 74G 177G 30% /mnt/data-ost02
/dev/data/ost03
264G 73G 178G 30% /mnt/data-ost03
Our programmers here have created a short program that uses a 3rd party library
(which we don''t have source code to) that actually performs the file
I/O that creates our problem. The program does many many reads of very small
files an no writes. What ends up happening is that sometime in the middle of
the processing, the load goes up very very high 300-700 on the client running
the program, however the system is still responsive except to a few commands
("w" or "ps" are good examples). On the MDT we see the
following:
BUG: soft lockup - CPU#1 stuck for 10s! [ll_evictor:3245]
CPU 1:
Modules linked in: sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U)
hidp(U) l2cap(U) bluetooth(U) osc(U) lov(U) mds(U) lquota(U) fsfilt_ldiskfs(U)
mgs(U) mgc(U) ptlrpc(U) l
diskfs(U) crc16(U) ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U)
dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U)
battery(U) asus_acpi(U) acpi_memho
tplug(U) ac(U) parport_pc(U) lp(U) parport(U) pcspkr(U) hpwdt(U) serio_raw(U)
bnx2(U) shpchp(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U)
usb_storage(U) lpfc(U) scsi_transpo
rt_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U)
uhci_hcd(U)
Pid: 3245, comm: ll_evictor Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp
#1
RIP: 0010:[<ffffffff800649e6>] [<ffffffff800649e6>]
_write_lock+0x7/0xf
RSP: 0018:ffff810493d89d68 EFLAGS: 00000246
RAX: ffff8103b4900c00 RBX: 000000000000c5e0 RCX: 00000000000120e6
RDX: 000000000000037e RSI: ffffffff802f1d80 RDI: ffffc20010861e0c
RBP: 0000000000000282 R08: ffff8104afcb3360 R09: 0000000000000000
R10: ffff8103b49ed840 R11: 0000000000000150 R12: 0000000000000282
R13: ffff810493d89d10 R14: ffff8103b49ed898 R15: ffff8103b49ed840
FS: 00002b4db7ec2230(0000) GS:ffff8104a007ac40(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002ab671c5d0a0 CR3: 000000091d857000 CR4: 00000000000006e0
Call Trace:
[<ffffffff883593fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290
[<ffffffff8835fc38>] :obdclass:class_disconnect+0x378/0x400
[<ffffffff884bd610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
[<ffffffff885cc43b>] :mds:mds_disconnect+0x11b/0x310
[<ffffffff8835bce4>] :obdclass:class_fail_export+0x384/0x4c0
[<ffffffff8850daa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
[<ffffffff8008abbc>] default_wake_function+0x0/0xe
[<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336
[<ffffffff8005dfb1>] child_rip+0xa/0x11
[<ffffffff8850d5b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
[<ffffffff8005dfa7>] child_rip+0x0/0x11
BUG: soft lockup - CPU#1 stuck for 10s! [ll_evictor:3245]
CPU 1:
Modules linked in: sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U)
hidp(U) l2cap(U) bluetooth(U) osc(U) lov(U) mds(U) lquota(U) fsfilt_ldiskfs(U)
mgs(U) mgc(U) ptlrpc(U) l
diskfs(U) crc16(U) ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U)
dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U)
battery(U) asus_acpi(U) acpi_memho
tplug(U) ac(U) parport_pc(U) lp(U) parport(U) pcspkr(U) hpwdt(U) serio_raw(U)
bnx2(U) shpchp(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U)
usb_storage(U) lpfc(U) scsi_transpo
rt_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U)
uhci_hcd(U)
Pid: 3245, comm: ll_evictor Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp
#1
RIP: 0010:[<ffffffff883593e0>] [<ffffffff883593e0>]
:obdclass:lustre_hash_for_each_empty+0x1f0/0x290
RSP: 0018:ffff810493d89d70 EFLAGS: 00000206
RAX: ffff8103a8880200 RBX: 000000000000c479 RCX: 0000000000014eaf
RDX: 000000000000025b RSI: ffffffff802f1d80 RDI: ffffc2001086079c
RBP: ffffffff884bb912 R08: ffff8104afcb3360 R09: ffff8103b4f14a00
R10: 0000000c00000000 R11: 0000000000000150 R12: ffff8103a8880200
R13: 0000000000000282 R14: 0000000000000282 R15: ffff810493d89d10
FS: 00002b4db7ec2230(0000) GS:ffff8104a007ac40(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002ab671c5d0a0 CR3: 000000091d857000 CR4: 00000000000006e0
Call Trace:
[<ffffffff8835fc38>] :obdclass:class_disconnect+0x378/0x400
[<ffffffff884bd610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
[<ffffffff885cc43b>] :mds:mds_disconnect+0x11b/0x310
[<ffffffff8835bce4>] :obdclass:class_fail_export+0x384/0x4c0
[<ffffffff8850daa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
[<ffffffff8008abbc>] default_wake_function+0x0/0xe
[<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336
[<ffffffff8005dfb1>] child_rip+0xa/0x11
[<ffffffff8850d5b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
[<ffffffff8005dfa7>] child_rip+0x0/0x11
As well as:
BUG: soft lockup - CPU#3 stuck for 10s! [ll_mdt_21:3313]
CPU 3:
Modules linked in: sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U)
hidp(U) l2cap(U) bluetooth(U) osc(U) lov(U) mds(U) lquota(U) fsfilt_ldiskfs(U)
mgs(U) mgc(U) ptlrpc(U) l
diskfs(U) crc16(U) ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U)
dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U)
battery(U) asus_acpi(U) acpi_memho
tplug(U) ac(U) parport_pc(U) lp(U) parport(U) pcspkr(U) hpwdt(U) serio_raw(U)
bnx2(U) shpchp(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U)
usb_storage(U) lpfc(U) scsi_transpo
rt_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U)
uhci_hcd(U)
Pid: 3313, comm: ll_mdt_21 Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp
#1
RIP: 0010:[<ffffffff800649e6>] [<ffffffff800649e6>]
_write_lock+0x7/0xf
RSP: 0018:ffff81049346bb08 EFLAGS: 00000246
RAX: ffff8103fd4cd400 RBX: 0000000000007ff5 RCX: 000000000000c76e
RDX: fffffffffff9d458 RSI: ffffffff802f1d80 RDI: ffffc20010865f5c
RBP: 0000000000000282 R08: ffff8104afcbb8e0 R09: 0000000000000000
R10: ffff81043a9d6680 R11: 0000000000000150 R12: 0000000000000282
R13: ffff81049346bab0 R14: ffff81043a9d66d8 R15: ffff81043a9d6680
FS: 00002b000b9be230(0000) GS:ffff81091ff5bb40(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000037ff495770 CR3: 0000000000201000 CR4: 00000000000006e0
Call Trace:
[<ffffffff883593fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290
[<ffffffff8835fc38>] :obdclass:class_disconnect+0x378/0x400
[<ffffffff884bd610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
[<ffffffff885cc43b>] :mds:mds_disconnect+0x11b/0x310
[<ffffffff885010f9>] :ptlrpc:lustre_pack_reply+0x29/0xb0
[<ffffffff884c7c61>] :ptlrpc:target_handle_disconnect+0x461/0x560
[<ffffffff884fc005>] :ptlrpc:lustre_msg_get_opc+0x35/0xf0
[<ffffffff885c7734>] :mds:mds_handle+0xda4/0x4d20
[<ffffffff80143b75>] __next_cpu+0x19/0x28
[<ffffffff800898e6>] find_busiest_group+0x20d/0x621
[<ffffffff884fb795>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
[<ffffffff88503eea>] :ptlrpc:ptlrpc_server_request_get+0x6a/0x150
[<ffffffff88505d6d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110
[<ffffffff885082f3>] :ptlrpc:ptlrpc_server_handle_request+0xa93/0x1150
[<ffffffff8006da18>] do_gettimeoffset_hpet+0x7/0x28
[<ffffffff8006d8a2>] do_gettimeofday+0x40/0x8f
[<ffffffff882c37c6>] :libcfs:lcw_update_time+0x16/0x100
[<ffffffff8003d59a>] lock_timer_base+0x1b/0x3c
[<ffffffff8001c5e4>] __mod_timer+0xb0/0xbe
[<ffffffff8850b7e8>] :ptlrpc:ptlrpc_main+0x1218/0x13e0
[<ffffffff8008abbc>] default_wake_function+0x0/0xe
[<ffffffff800b4391>] audit_syscall_exit+0x31b/0x336
[<ffffffff8005dfb1>] child_rip+0xa/0x11
[<ffffffff8850a5d0>] :ptlrpc:ptlrpc_main+0x0/0x13e0
[<ffffffff8005dfa7>] child_rip+0x0/0x11
Writing from ANY client (e.g.: touch ./filename) to the lustre mount then
returns "no space left on device." Note above that we are not over
30% capacity on any OST and 4/7% capacity on mgs/mdt.
Any suggestions? We have another cluster with an entirely different I/O load
with 1.6.7.2. servers & clients that show no problems at all.
Thank you,
Robert
The information contained in this message and its attachments
is intended only for the private and confidential use of the
intended recipient(s). If you are not the intended recipient
(or have received this e-mail in error) please notify the
sender immediately and destroy this e-mail. Any unauthorized
copying, disclosure or distribution of the material in this e-
mail is strictly prohibited.
Brian J. Murrell
2009-Dec-11 19:03 UTC
[Lustre-discuss] mdt threads "hanging" (?) until writes no longer permitted
On Fri, 2009-12-11 at 11:51 -0600, Hendelman, Rob wrote:> BUG: soft lockup - CPU#1 stuck for 10s! [ll_evictor:3245] > CPU 1: > Modules linked in: sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) hidp(U) l2cap(U) bluetooth(U) osc(U) lov(U) mds(U) lquota(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ptlrpc(U) l > diskfs(U) crc16(U) ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memho > tplug(U) ac(U) parport_pc(U) lp(U) parport(U) pcspkr(U) hpwdt(U) serio_raw(U) bnx2(U) shpchp(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) usb_storage(U) lpfc(U) scsi_transpo > rt_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U) > Pid: 3245, comm: ll_evictor Tainted: G 2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1 > RIP: 0010:[<ffffffff800649e6>] [<ffffffff800649e6>] _write_lock+0x7/0xf > RSP: 0018:ffff810493d89d68 EFLAGS: 00000246 > RAX: ffff8103b4900c00 RBX: 000000000000c5e0 RCX: 00000000000120e6 > RDX: 000000000000037e RSI: ffffffff802f1d80 RDI: ffffc20010861e0c > RBP: 0000000000000282 R08: ffff8104afcb3360 R09: 0000000000000000 > R10: ffff8103b49ed840 R11: 0000000000000150 R12: 0000000000000282 > R13: ffff810493d89d10 R14: ffff8103b49ed898 R15: ffff8103b49ed840 > FS: 00002b4db7ec2230(0000) GS:ffff8104a007ac40(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 00002ab671c5d0a0 CR3: 000000091d857000 CR4: 00000000000006e0 > > Call Trace: > [<ffffffff883593fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290This looks like bug 19557 with patches landed for 1.8.2 already and work still ongoing on making lustre_hash_for_each_empty() more efficient. b. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20091211/126ee0e7/attachment.bin