Hello All,
I'm running Lustre v1.6.7.2 with one of each (MDS, OSS,
client).
The client serves apache off the lustre volume.
About an hour ago the client hung completely. The hosting company says
it was a kernel panic. I got no useful feedback in /var/log/messages on
the client or the MDS. However, the OST logged several complaints
(below).
Does anyone have any insight into the problem? Any help with how I can
fix this, or avoid it in future, would be greatly appreciated.
Thanks,
Nick
Sep 20 04:02:03 ssn1 syslogd 1.4.1: restart.
Sep 22 18:01:29 ssn1 : error getting update info: tuple index out of
range
Sep 24 19:09:30 ssn1 auditd[10084]: Audit daemon rotating log files
Sep 25 22:31:29 ssn1 kernel: Lustre: clients-OST0000: haven't heard from
client eaa25af9-d0f5-8d54-1644-9cdd7f978e05 (at 10.0.0.21@tcp1) in 269
seconds. I think it's dead, and I am evicting it.
Sep 25 22:31:39 ssn1 kernel: BUG: soft lockup - CPU#2 stuck for 10s!
[ll_evictor:3714]
Sep 25 22:31:39 ssn1 kernel: CPU 2:
Sep 25 22:31:39 ssn1 kernel: Modules linked in: ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) lockd(U) obdfilter(U) fsfilt_ldiskfs(U)
ost(U) mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) crc16(U)
autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U)
cpufreq_ondemand(U) dm_multipath(U) video(U) sbs(U) backlight(U)
i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U)
acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U)
cdrom(U) pata_acpi(U) serio_raw(U) tg3(U) sg(U) pcspkr(U) dm_snapshot(U)
dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) megaraid_sas(U)
shpchp(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U)
sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Sep 25 22:31:39 ssn1 kernel: Pid: 3714, comm: ll_evictor Tainted: G
2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Sep 25 22:31:39 ssn1 kernel: RIP: 0010:[<ffffffff800649e6>]
[<ffffffff800649e6>] _write_lock+0x7/0xf
Sep 25 22:31:39 ssn1 kernel: RSP: 0018:ffff810213aabd88 EFLAGS:
00000246
Sep 25 22:31:39 ssn1 kernel: RAX: ffff81012ac8a800 RBX: 0000000000002be2
RCX: 0000000000005c90
Sep 25 22:31:39 ssn1 kernel: RDX: fffffffffffff6ef RSI: ffffffff802f1d80
RDI: ffffc20010a1ee2c
Sep 25 22:31:39 ssn1 kernel: RBP: 0000000000000286 R08: ffff810001016e60
R09: 0000000000000000
Sep 25 22:31:39 ssn1 kernel: R10: ffff81012ac85240 R11: 0000000000000150
R12: 0000000000000286
Sep 25 22:31:39 ssn1 kernel: R13: ffff810213aabd30 R14: ffff81012ac85298
R15: ffff81012ac85240
Sep 25 22:31:39 ssn1 kernel: FS: 00002aaf1c86f220(0000)
GS:ffff810107b9cd40(0000) knlGS:0000000000000000
Sep 25 22:31:39 ssn1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Sep 25 22:31:39 ssn1 kernel: CR2: 000000001745b024 CR3: 000000022275b000
CR4: 00000000000006e0
Sep 25 22:31:39 ssn1 kernel:
Sep 25 22:31:39 ssn1 kernel: Call Trace:
Sep 25 22:31:39 ssn1 kernel:
[<ffffffff884b43fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290
Sep 25 22:31:39 ssn1 kernel:
[<ffffffff884bac38>] :obdclass:class_disconnect+0x378/0x400
Sep 25 22:31:39 ssn1 kernel:
[<ffffffff88530610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Sep 25 22:31:39 ssn1 kernel:
[<ffffffff887c442d>] :obdfilter:filter_disconnect+0x36d/0x4b0
Sep 25 22:31:39 ssn1 kernel:
[<ffffffff884b6ce4>] :obdclass:class_fail_export+0x384/0x4c0
Sep 25 22:31:39 ssn1 kernel:
[<ffffffff88580aa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
Sep 25 22:31:39 ssn1 kernel: [<ffffffff8008abbc>] default_wake_function
+0x0/0xe
Sep 25 22:31:39 ssn1 kernel: [<ffffffff800b4391>] audit_syscall_exit
+0x31b/0x336
Sep 25 22:31:39 ssn1 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 25 22:31:39 ssn1 kernel:
[<ffffffff885805b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
Sep 25 22:31:39 ssn1 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 25 22:31:39 ssn1 kernel:
Sep 25 22:31:49 ssn1 kernel: BUG: soft lockup - CPU#2 stuck for 10s!
[ll_evictor:3714]
Sep 25 22:31:49 ssn1 kernel: CPU 2:
Sep 25 22:31:49 ssn1 kernel: Modules linked in: ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) lockd(U) obdfilter(U) fsfilt_ldiskfs(U)
ost(U) mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) crc16(U)
autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U)
cpufreq_ondemand(U) dm_multipath(U) video(U) sbs(U) backlight(U)
i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U)
acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U)
cdrom(U) pata_acpi(U) serio_raw(U) tg3(U) sg(U) pcspkr(U) dm_snapshot(U)
dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) megaraid_sas(U)
shpchp(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U)
sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Sep 25 22:31:49 ssn1 kernel: Pid: 3714, comm: ll_evictor Tainted: G
2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Sep 25 22:31:49 ssn1 kernel: RIP: 0010:[<ffffffff884b43fe>]
[<ffffffff884b43fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290
Sep 25 22:31:49 ssn1 kernel: RSP: 0018:ffff810213aabd90 EFLAGS:
00000246
Sep 25 22:31:49 ssn1 kernel: RAX: ffff81012b1a7400 RBX: 0000000000004bea
RCX: 0000000000008468
Sep 25 22:31:49 ssn1 kernel: RDX: 0000000000000368 RSI: ffffffff802f1d80
RDI: ffffc20010a3eeac
Sep 25 22:31:49 ssn1 kernel: RBP: ffffffff8852e912 R08: ffff810001016e60
R09: 0000000000000000
Sep 25 22:31:49 ssn1 kernel: R10: ffff81012b1a6500 R11: 0000000000000150
R12: ffff81012b1a7400
Sep 25 22:31:49 ssn1 kernel: R13: 0000000000000286 R14: 0000000000000286
R15: ffff810213aabd30
Sep 25 22:31:49 ssn1 kernel: FS: 00002aaf1c86f220(0000)
GS:ffff810107b9cd40(0000) knlGS:0000000000000000
Sep 25 22:31:49 ssn1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Sep 25 22:31:49 ssn1 kernel: CR2: 000000001745b024 CR3: 000000022275b000
CR4: 00000000000006e0
Sep 25 22:31:49 ssn1 kernel:
Sep 25 22:31:49 ssn1 kernel: Call Trace:
Sep 25 22:31:49 ssn1 kernel:
[<ffffffff884bac38>] :obdclass:class_disconnect+0x378/0x400
Sep 25 22:31:49 ssn1 kernel:
[<ffffffff88530610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Sep 25 22:31:49 ssn1 kernel:
[<ffffffff887c442d>] :obdfilter:filter_disconnect+0x36d/0x4b0
Sep 25 22:31:49 ssn1 kernel:
[<ffffffff884b6ce4>] :obdclass:class_fail_export+0x384/0x4c0
Sep 25 22:31:49 ssn1 kernel:
[<ffffffff88580aa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
Sep 25 22:31:49 ssn1 kernel: [<ffffffff8008abbc>] default_wake_function
+0x0/0xe
Sep 25 22:31:49 ssn1 kernel: [<ffffffff800b4391>] audit_syscall_exit
+0x31b/0x336
Sep 25 22:31:49 ssn1 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 25 22:31:49 ssn1 kernel:
[<ffffffff885805b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
Sep 25 22:31:49 ssn1 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 25 22:31:49 ssn1 kernel:
Sep 25 22:31:59 ssn1 kernel: BUG: soft lockup - CPU#2 stuck for 10s!
[ll_evictor:3714]
Sep 25 22:31:59 ssn1 kernel: CPU 2:
Sep 25 22:31:59 ssn1 kernel: Modules linked in: ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) lockd(U) obdfilter(U) fsfilt_ldiskfs(U)
ost(U) mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) crc16(U)
autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U)
cpufreq_ondemand(U) dm_multipath(U) video(U) sbs(U) backlight(U)
i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U)
acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U)
cdrom(U) pata_acpi(U) serio_raw(U) tg3(U) sg(U) pcspkr(U) dm_snapshot(U)
dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) megaraid_sas(U)
shpchp(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U)
sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Sep 25 22:31:59 ssn1 kernel: Pid: 3714, comm: ll_evictor Tainted: G
2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Sep 25 22:31:59 ssn1 kernel: RIP: 0010:[<ffffffff884b43fe>]
[<ffffffff884b43fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290
Sep 25 22:31:59 ssn1 kernel: RSP: 0018:ffff810213aabd90 EFLAGS:
00000246
Sep 25 22:31:59 ssn1 kernel: RAX: ffff81012bd95e00 RBX: 0000000000006994
RCX: 000000000000a30c
Sep 25 22:31:59 ssn1 kernel: RDX: 00000000000002c4 RSI: ffffffff802f1d80
RDI: ffffc20010a5c94c
Sep 25 22:31:59 ssn1 kernel: RBP: ffffffff8852e912 R08: ffff810001016e60
R09: 0000000000000000
Sep 25 22:31:59 ssn1 kernel: R10: ffff81012bd986c0 R11: 0000000000000150
R12: ffff81012bd95e00
Sep 25 22:31:59 ssn1 kernel: R13: 0000000000000286 R14: 0000000000000286
R15: ffff810213aabd30
Sep 25 22:31:59 ssn1 kernel: FS: 00002aaf1c86f220(0000)
GS:ffff810107b9cd40(0000) knlGS:0000000000000000
Sep 25 22:31:59 ssn1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Sep 25 22:31:59 ssn1 kernel: CR2: 000000001745b024 CR3: 000000022275b000
CR4: 00000000000006e0
Sep 25 22:31:59 ssn1 kernel:
Sep 25 22:31:59 ssn1 kernel: Call Trace:
Sep 25 22:31:59 ssn1 kernel:
[<ffffffff884bac38>] :obdclass:class_disconnect+0x378/0x400
Sep 25 22:31:59 ssn1 kernel:
[<ffffffff88530610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Sep 25 22:31:59 ssn1 kernel:
[<ffffffff887c442d>] :obdfilter:filter_disconnect+0x36d/0x4b0
Sep 25 22:31:59 ssn1 kernel:
[<ffffffff884b6ce4>] :obdclass:class_fail_export+0x384/0x4c0
Sep 25 22:31:59 ssn1 kernel:
[<ffffffff88580aa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
Sep 25 22:31:59 ssn1 kernel: [<ffffffff8008abbc>] default_wake_function
+0x0/0xe
Sep 25 22:31:59 ssn1 kernel: [<ffffffff800b4391>] audit_syscall_exit
+0x31b/0x336
Sep 25 22:31:59 ssn1 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 25 22:31:59 ssn1 kernel:
[<ffffffff885805b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
Sep 25 22:31:59 ssn1 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 25 22:31:59 ssn1 kernel:
Sep 25 22:32:09 ssn1 kernel: BUG: soft lockup - CPU#2 stuck for 10s!
[ll_evictor:3714]
Sep 25 22:32:09 ssn1 kernel: CPU 2:
Sep 25 22:32:09 ssn1 kernel: Modules linked in: ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) lockd(U) obdfilter(U) fsfilt_ldiskfs(U)
ost(U) mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) crc16(U)
autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U)
cpufreq_ondemand(U) dm_multipath(U) video(U) sbs(U) backlight(U)
i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U)
acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U)
cdrom(U) pata_acpi(U) serio_raw(U) tg3(U) sg(U) pcspkr(U) dm_snapshot(U)
dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) megaraid_sas(U)
shpchp(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U)
sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Sep 25 22:32:09 ssn1 kernel: Pid: 3714, comm: ll_evictor Tainted: G
2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Sep 25 22:32:09 ssn1 kernel: RIP: 0010:[<ffffffff884b43e0>]
[<ffffffff884b43e0>] :obdclass:lustre_hash_for_each_empty+0x1f0/0x290
Sep 25 22:32:09 ssn1 kernel: RSP: 0018:ffff810213aabd90 EFLAGS:
00000206
Sep 25 22:32:09 ssn1 kernel: RAX: ffff81012c713a00 RBX: 0000000000005c43
RCX: 000000000000bcbf
Sep 25 22:32:09 ssn1 kernel: RDX: 0000000000000118 RSI: ffffffff802f1d80
RDI: ffffc20010a4f43c
Sep 25 22:32:09 ssn1 kernel: RBP: ffffffff8852e912 R08: ffff810001016e60
R09: 0000000000000000
Sep 25 22:32:09 ssn1 kernel: R10: ffff81012c7106c0 R11: 0000000000000150
R12: ffff81012c713a00
Sep 25 22:32:09 ssn1 kernel: R13: 0000000000000286 R14: 0000000000000286
R15: ffff810213aabd30
Sep 25 22:32:09 ssn1 kernel: FS: 00002aaf1c86f220(0000)
GS:ffff810107b9cd40(0000) knlGS:0000000000000000
Sep 25 22:32:09 ssn1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Sep 25 22:32:09 ssn1 kernel: CR2: 000000001745b024 CR3: 000000022275b000
CR4: 00000000000006e0
Sep 25 22:32:09 ssn1 kernel:
Sep 25 22:32:09 ssn1 kernel: Call Trace:
Sep 25 22:32:09 ssn1 kernel:
[<ffffffff884bac38>] :obdclass:class_disconnect+0x378/0x400
Sep 25 22:32:09 ssn1 kernel:
[<ffffffff88530610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Sep 25 22:32:09 ssn1 kernel:
[<ffffffff887c442d>] :obdfilter:filter_disconnect+0x36d/0x4b0
Sep 25 22:32:09 ssn1 kernel:
[<ffffffff884b6ce4>] :obdclass:class_fail_export+0x384/0x4c0
Sep 25 22:32:09 ssn1 kernel:
[<ffffffff88580aa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
Sep 25 22:32:09 ssn1 kernel: [<ffffffff8008abbc>] default_wake_function
+0x0/0xe
Sep 25 22:32:09 ssn1 kernel: [<ffffffff800b4391>] audit_syscall_exit
+0x31b/0x336
Sep 25 22:32:09 ssn1 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 25 22:32:09 ssn1 kernel:
[<ffffffff885805b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
Sep 25 22:32:09 ssn1 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 25 22:32:09 ssn1 kernel:
Sep 25 22:32:19 ssn1 kernel: BUG: soft lockup - CPU#2 stuck for 10s!
[ll_evictor:3714]
Sep 25 22:32:19 ssn1 kernel: CPU 2:
Sep 25 22:32:19 ssn1 kernel: Modules linked in: ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) lockd(U) obdfilter(U) fsfilt_ldiskfs(U)
ost(U) mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) crc16(U)
autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U)
cpufreq_ondemand(U) dm_multipath(U) video(U) sbs(U) backlight(U)
i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U)
acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U)
cdrom(U) pata_acpi(U) serio_raw(U) tg3(U) sg(U) pcspkr(U) dm_snapshot(U)
dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) megaraid_sas(U)
shpchp(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U)
sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Sep 25 22:32:19 ssn1 kernel: Pid: 3714, comm: ll_evictor Tainted: G
2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Sep 25 22:32:19 ssn1 kernel: RIP: 0010:[<ffffffff884b43d6>]
[<ffffffff884b43d6>] :obdclass:lustre_hash_for_each_empty+0x1e6/0x290
Sep 25 22:32:19 ssn1 kernel: RSP: 0018:ffff810213aabd90 EFLAGS:
00000246
Sep 25 22:32:19 ssn1 kernel: RAX: ffff81012cbb5c00 RBX: 000000000000b206
RCX: 000000000000d33d
Sep 25 22:32:19 ssn1 kernel: RDX: 0000000000000020 RSI: ffffffff802f1d80
RDI: ffffc20010aa506c
Sep 25 22:32:19 ssn1 kernel: RBP: 0000000000000150 R08: ffff810001016e60
R09: 0000000000000000
Sep 25 22:32:19 ssn1 kernel: R10: ffff81012c14a540 R11: 0000000000000150
R12: 0000000000004a00
Sep 25 22:32:19 ssn1 kernel: R13: 0000000000000286 R14: 0000000000000286
R15: ffff810213aabd30
Sep 25 22:32:19 ssn1 kernel: FS: 00002aaf1c86f220(0000)
GS:ffff810107b9cd40(0000) knlGS:0000000000000000
Sep 25 22:32:19 ssn1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Sep 25 22:32:19 ssn1 kernel: CR2: 000000001745b024 CR3: 000000022275b000
CR4: 00000000000006e0
Sep 25 22:32:19 ssn1 kernel:
Sep 25 22:32:19 ssn1 kernel: Call Trace:
Sep 25 22:32:19 ssn1 kernel:
[<ffffffff884bac38>] :obdclass:class_disconnect+0x378/0x400
Sep 25 22:32:19 ssn1 kernel:
[<ffffffff88530610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Sep 25 22:32:19 ssn1 kernel:
[<ffffffff887c442d>] :obdfilter:filter_disconnect+0x36d/0x4b0
Sep 25 22:32:19 ssn1 kernel:
[<ffffffff884b6ce4>] :obdclass:class_fail_export+0x384/0x4c0
Sep 25 22:32:19 ssn1 kernel:
[<ffffffff88580aa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
Sep 25 22:32:19 ssn1 kernel: [<ffffffff8008abbc>] default_wake_function
+0x0/0xe
Sep 25 22:32:19 ssn1 kernel: [<ffffffff800b4391>] audit_syscall_exit
+0x31b/0x336
Sep 25 22:32:19 ssn1 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 25 22:32:19 ssn1 kernel:
[<ffffffff885805b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
Sep 25 22:32:19 ssn1 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 25 22:32:19 ssn1 kernel:
Sep 25 22:32:29 ssn1 kernel: BUG: soft lockup - CPU#2 stuck for 10s!
[ll_evictor:3714]
Sep 25 22:32:29 ssn1 kernel: CPU 2:
Sep 25 22:32:29 ssn1 kernel: Modules linked in: ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) lockd(U) obdfilter(U) fsfilt_ldiskfs(U)
ost(U) mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) crc16(U)
autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U)
cpufreq_ondemand(U) dm_multipath(U) video(U) sbs(U) backlight(U)
i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U)
acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U)
cdrom(U) pata_acpi(U) serio_raw(U) tg3(U) sg(U) pcspkr(U) dm_snapshot(U)
dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) megaraid_sas(U)
shpchp(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U)
sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Sep 25 22:32:29 ssn1 kernel: Pid: 3714, comm: ll_evictor Tainted: G
2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Sep 25 22:32:29 ssn1 kernel: RIP: 0010:[<ffffffff884b43e0>]
[<ffffffff884b43e0>] :obdclass:lustre_hash_for_each_empty+0x1f0/0x290
Sep 25 22:32:29 ssn1 kernel: RSP: 0018:ffff810213aabd90 EFLAGS:
00000206
Sep 25 22:32:29 ssn1 kernel: RAX: ffff81013ef65600 RBX: 000000000000811a
RCX: 000000000000e770
Sep 25 22:32:29 ssn1 kernel: RDX: 00000000000000c6 RSI: ffffffff802f1d80
RDI: ffffc20010a741ac
Sep 25 22:32:29 ssn1 kernel: RBP: ffffffff8852e912 R08: ffff810001016e60
R09: 0000000000000000
Sep 25 22:32:29 ssn1 kernel: R10: ffff810137ee4e40 R11: 0000000000000150
R12: ffff81013ef65600
Sep 25 22:32:29 ssn1 kernel: R13: 0000000000000286 R14: 0000000000000286
R15: ffff810213aabd30
Sep 25 22:32:29 ssn1 kernel: FS: 00002aaf1c86f220(0000)
GS:ffff810107b9cd40(0000) knlGS:0000000000000000
Sep 25 22:32:29 ssn1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Sep 25 22:32:29 ssn1 kernel: CR2: 000000001745b024 CR3: 000000022275b000
CR4: 00000000000006e0
Sep 25 22:32:29 ssn1 kernel:
Sep 25 22:32:29 ssn1 kernel: Call Trace:
Sep 25 22:32:29 ssn1 kernel:
[<ffffffff884bac38>] :obdclass:class_disconnect+0x378/0x400
Sep 25 22:32:29 ssn1 kernel:
[<ffffffff88530610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Sep 25 22:32:29 ssn1 kernel:
[<ffffffff887c442d>] :obdfilter:filter_disconnect+0x36d/0x4b0
Sep 25 22:32:29 ssn1 kernel:
[<ffffffff884b6ce4>] :obdclass:class_fail_export+0x384/0x4c0
Sep 25 22:32:29 ssn1 kernel:
[<ffffffff88580aa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
Sep 25 22:32:29 ssn1 kernel: [<ffffffff8008abbc>] default_wake_function
+0x0/0xe
Sep 25 22:32:29 ssn1 kernel: [<ffffffff800b4391>] audit_syscall_exit
+0x31b/0x336
Sep 25 22:32:29 ssn1 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 25 22:32:29 ssn1 kernel:
[<ffffffff885805b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
Sep 25 22:32:29 ssn1 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 25 22:32:29 ssn1 kernel:
Sep 25 22:32:39 ssn1 kernel: BUG: soft lockup - CPU#2 stuck for 10s!
[ll_evictor:3714]
Sep 25 22:32:39 ssn1 kernel: CPU 2:
Sep 25 22:32:39 ssn1 kernel: Modules linked in: ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) lockd(U) obdfilter(U) fsfilt_ldiskfs(U)
ost(U) mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) crc16(U)
autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U)
cpufreq_ondemand(U) dm_multipath(U) video(U) sbs(U) backlight(U)
i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U)
acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U)
cdrom(U) pata_acpi(U) serio_raw(U) tg3(U) sg(U) pcspkr(U) dm_snapshot(U)
dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) megaraid_sas(U)
shpchp(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U)
sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Sep 25 22:32:39 ssn1 kernel: Pid: 3714, comm: ll_evictor Tainted: G
2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Sep 25 22:32:39 ssn1 kernel: RIP: 0010:[<ffffffff884b43e0>]
[<ffffffff884b43e0>] :obdclass:lustre_hash_for_each_empty+0x1f0/0x290
Sep 25 22:32:39 ssn1 kernel: RSP: 0018:ffff810213aabd90 EFLAGS:
00000206
Sep 25 22:32:39 ssn1 kernel: RAX: ffff81012e650c00 RBX: 000000000000da23
RCX: 000000000000f9e6
Sep 25 22:32:39 ssn1 kernel: RDX: 0000000000000398 RSI: ffffffff802f1d80
RDI: ffffc20010acd23c
Sep 25 22:32:39 ssn1 kernel: RBP: 0000000000000150 R08: ffff810001016e60
R09: 0000000000000000
Sep 25 22:32:39 ssn1 kernel: R10: ffff81012e971cc0 R11: 0000000000000150
R12: 0000000000007220
Sep 25 22:32:39 ssn1 kernel: R13: 0000000000000286 R14: 0000000000000286
R15: ffff810213aabd30
Sep 25 22:32:39 ssn1 kernel: FS: 00002aaf1c86f220(0000)
GS:ffff810107b9cd40(0000) knlGS:0000000000000000
Sep 25 22:32:39 ssn1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Sep 25 22:32:39 ssn1 kernel: CR2: 000000001745b024 CR3: 000000022275b000
CR4: 00000000000006e0
Sep 25 22:32:39 ssn1 kernel:
Sep 25 22:32:39 ssn1 kernel: Call Trace:
Sep 25 22:32:39 ssn1 kernel:
[<ffffffff884bac38>] :obdclass:class_disconnect+0x378/0x400
Sep 25 22:32:39 ssn1 kernel:
[<ffffffff88530610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Sep 25 22:32:39 ssn1 kernel:
[<ffffffff887c442d>] :obdfilter:filter_disconnect+0x36d/0x4b0
Sep 25 22:32:39 ssn1 kernel:
[<ffffffff884b6ce4>] :obdclass:class_fail_export+0x384/0x4c0
Sep 25 22:32:39 ssn1 kernel:
[<ffffffff88580aa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
Sep 25 22:32:39 ssn1 kernel: [<ffffffff8008abbc>] default_wake_function
+0x0/0xe
Sep 25 22:32:39 ssn1 kernel: [<ffffffff800b4391>] audit_syscall_exit
+0x31b/0x336
Sep 25 22:32:39 ssn1 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 25 22:32:39 ssn1 kernel:
[<ffffffff885805b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
Sep 25 22:32:39 ssn1 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 25 22:32:39 ssn1 kernel:
Sep 25 22:32:49 ssn1 kernel: BUG: soft lockup - CPU#2 stuck for 10s!
[ll_evictor:3714]
Sep 25 22:32:49 ssn1 kernel: CPU 2:
Sep 25 22:32:49 ssn1 kernel: Modules linked in: ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) lockd(U) obdfilter(U) fsfilt_ldiskfs(U)
ost(U) mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) crc16(U)
autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U)
cpufreq_ondemand(U) dm_multipath(U) video(U) sbs(U) backlight(U)
i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U)
acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U)
cdrom(U) pata_acpi(U) serio_raw(U) tg3(U) sg(U) pcspkr(U) dm_snapshot(U)
dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) libata(U) megaraid_sas(U)
shpchp(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U)
sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Sep 25 22:32:49 ssn1 kernel: Pid: 3714, comm: ll_evictor Tainted: G
2.6.18-92.1.26.el5_lustre.1.6.7.2smp #1
Sep 25 22:32:49 ssn1 kernel: RIP: 0010:[<ffffffff884b43eb>]
[<ffffffff884b43eb>] :obdclass:lustre_hash_for_each_empty+0x1fb/0x290
Sep 25 22:32:49 ssn1 kernel: RSP: 0018:ffff810213aabd90 EFLAGS:
00000202
Sep 25 22:32:49 ssn1 kernel: RAX: ffff81012b5e3600 RBX: 000000000000a6f4
RCX: 0000000000010b0e
Sep 25 22:32:49 ssn1 kernel: RDX: 0000000000000133 RSI: ffffffff802f1d80
RDI: ffffc20010a99f3c
Sep 25 22:32:49 ssn1 kernel: RBP: ffffffff8852e912 R08: ffff810001016e60
R09: 0000000000000000
Sep 25 22:32:49 ssn1 kernel: R10: ffff81012b5e03c0 R11: 0000000000000150
R12: ffff81012b5e3600
Sep 25 22:32:49 ssn1 kernel: R13: 0000000000000286 R14: 0000000000000286
R15: ffff810213aabd30
Sep 25 22:32:49 ssn1 kernel: FS: 00002aaf1c86f220(0000)
GS:ffff810107b9cd40(0000) knlGS:0000000000000000
Sep 25 22:32:49 ssn1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Sep 25 22:32:49 ssn1 kernel: CR2: 000000001745b024 CR3: 000000022275b000
CR4: 00000000000006e0
Sep 25 22:32:49 ssn1 kernel:
Sep 25 22:32:49 ssn1 kernel: Call Trace:
Sep 25 22:32:49 ssn1 kernel:
[<ffffffff884bac38>] :obdclass:class_disconnect+0x378/0x400
Sep 25 22:32:49 ssn1 kernel:
[<ffffffff88530610>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Sep 25 22:32:49 ssn1 kernel:
[<ffffffff887c442d>] :obdfilter:filter_disconnect+0x36d/0x4b0
Sep 25 22:32:49 ssn1 kernel:
[<ffffffff884b6ce4>] :obdclass:class_fail_export+0x384/0x4c0
Sep 25 22:32:49 ssn1 kernel:
[<ffffffff88580aa8>] :ptlrpc:ping_evictor_main+0x4f8/0x7d5
Sep 25 22:32:49 ssn1 kernel: [<ffffffff8008abbc>] default_wake_function
+0x0/0xe
Sep 25 22:32:49 ssn1 kernel: [<ffffffff800b4391>] audit_syscall_exit
+0x31b/0x336
Sep 25 22:32:49 ssn1 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 25 22:32:49 ssn1 kernel:
[<ffffffff885805b0>] :ptlrpc:ping_evictor_main+0x0/0x7d5
Sep 25 22:32:49 ssn1 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 25 22:32:49 ssn1 kernel:
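
Since the reports above repeat every ten seconds and differ mainly in the RIP line, a small script can condense them. This is only a rough sketch, assuming the log is available as plain text with the same line wrapping as in this mail; the function name `summarize_lockups` and both regular expressions are my own and not part of any Lustre or kernel tool.

```python
import re
from collections import Counter

# Header of a soft lockup report. \s+ tolerates the mail line wrap between
# "10s!" and "[ll_evictor:3714]".
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!\s+\[([^:\]]+):(\d+)\]"
)

# RIP line: segment, the address printed twice, an optional ":module:"
# prefix, then symbol+offset/size. Only the symbol name is captured.
RIP_RE = re.compile(
    r"RIP:\s+\w+:\[<[0-9a-f]+>\]\s+\[<[0-9a-f]+>\]\s+"
    r"(?::\w+:)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_lockups(log_text):
    """Count soft lockup reports per (task, pid, cpu) and per RIP symbol."""
    lockups = Counter()
    for cpu, _secs, comm, pid in LOCKUP_RE.findall(log_text):
        lockups[(comm, int(pid), int(cpu))] += 1
    rips = Counter(RIP_RE.findall(log_text))
    return lockups, rips


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize.py < messages.txt
    lockups, rips = summarize_lockups(sys.stdin.read())
    for (comm, pid, cpu), count in lockups.items():
        print(f"{comm} (pid {pid}) on CPU {cpu}: {count} report(s)")
    for symbol, count in rips.most_common():
        print(f"  RIP in {symbol}: {count}")
```

For the excerpt above, this boils down to eight reports for ll_evictor (pid 3714) on CPU 2, one with RIP in _write_lock and seven in lustre_hash_for_each_empty.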
-
Nick Jennings
Technical Director
Creative Motion Design
www.creativemotiondesign.com