Hi,

I'm seeing coordinated OSS crashes with Lustre 1.6.5.1.

our RHEL4 OSS have been stable for ~months with these kernels:
  kernel-lustre-smp-2.6.9-67.0.4.EL_lustre.1.6.4.3
  kernel-lustre-smp-2.6.9-55.0.9.EL_lustre.1.6.4.2

but have crashed hard, twice, about 10hrs apart as soon as we started
using this kernel:
  kernel-lustre-smp-2.6.9-67.0.7.EL_lustre.1.6.5.1

the weird thing is that as near as I can tell, both times all three
OSS's crashed at exactly the same time! couldn't even ping them, so it
was a pretty solid crash.

any ideas? I can't see anything similar in bugzilla.

no logs got out of the nodes (we only use remote syslog) except for the
below Oops from one node at the time of the first crash.

thanks for any help!

cheers,
robin

Jul 17 00:10:29 x17 kernel: ----------- [cut here ] --------- [please bite here ] ---------
Jul 17 00:10:29 x17 kernel: Kernel BUG at spinlock:76
Jul 17 00:10:29 x17 kernel: invalid operand: 0000 [1] SMP
Jul 17 00:10:29 x17 kernel: CPU 2
Jul 17 00:10:29 x17 kernel: Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) jbd(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) raid5(U) xor(U) rdma_ucm(U) qlgc_vnic(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) iw_cxgb3(U) cxgb3(U) ib_ipath(U) mlx4_ib(U) mlx4_core(U) dm_mod(U) button(U) battery(U) ac(U) uhci_hcd(U) ehci_hcd(U) hw_random(U) ib_mthca(U) ib_ipoib(U) md5(U) ipv6(U) ib_umad(U) ib_ucm(U) ib_uverbs(U) ib_cm(U) ib_sa(U) ib_mad(U) ib_core(U) sd_mod(U) qla2300(U) qla2xxx(U) scsi_transport_fc(U) ahci(U) ata_piix(U) libata(U) scsi_mod(U) nfs(U) nfs_acl(U) lockd(U) sunrpc(U) e1000(U)
Jul 17 00:10:29 x17 kernel: Pid: 0, comm: swapper Not tainted 2.6.9-67.0.7.EL_lustre.1.6.5.1smp
Jul 17 00:10:29 x17 kernel: RIP: 0010:[<ffffffff8030e97d>] <ffffffff8030e97d>{_spin_unlock_irqrestore+27}
Jul 17 00:10:29 x17 kernel: RSP: 0018:000001009fa03ee0 EFLAGS: 00010002
Jul 17 00:10:29 x17 kernel: RAX: 0000000000000001 RBX: 0000010198205680 RCX: 00001d6067022f90
Jul 17 00:10:29 x17 kernel: RDX: 00000000045ba580 RSI: 0000000000000246 RDI: 0000010254bae9c0
Jul 17 00:10:29 x17 kernel: RBP: 0000000000000000 R08: 0000000000000246 R09: 0000000000000000
Jul 17 00:10:30 x17 kernel: R10: ffffffffa009e88c R11: ffffffffa0147cc2 R12: 0000000000000012
Jul 17 00:10:30 x17 kernel: R13: 0000000000000001 R14: 0000000000000012 R15: 0000000000000000
Jul 17 00:10:30 x17 kernel: FS: 0000000000000000(0000) GS:ffffffff8048e800(0000) knlGS:0000000000000000
Jul 17 00:10:30 x17 kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jul 17 00:10:30 x17 kernel: CR2: 0000002a9556c000 CR3: 000000009fb6e000 CR4: 00000000000006e0
Jul 17 00:10:30 x17 kernel: Process swapper (pid: 0, threadinfo 000001009fb6c000, task 000001009f9de800)
Jul 17 00:10:30 x17 kernel: Stack: ffffffffa0147f1d 0000000000002002 0000010198205680 000000000000000a
Jul 17 00:10:30 x17 kernel: 0000000000000002 000001009fb6de98 ffffffffa009ed57 000001009fa03f18
Jul 17 00:10:30 x17 kernel: 000001009fa03f18 000001009f5e2fc0
Jul 17 00:10:30 x17 kernel: Call Trace:<IRQ> <ffffffffa0147f1d>{:sd_mod:sd_rw_intr+603} <ffffffffa009ed57>{:scsi_mod:scsi_softirq+213}
Jul 17 00:10:30 x17 kernel: <ffffffff8013c1e8>{__do_softirq+88} <ffffffff8013c291>{do_softirq+49}
Jul 17 00:10:30 x17 kernel: <ffffffff801130e3>{do_IRQ+328} <ffffffff801107d1>{ret_from_intr+0}
Jul 17 00:10:30 x17 kernel: <EOI> <ffffffff8010e80c>{mwait_idle+86} <ffffffff8010e79c>{cpu_idle+26}
Jul 17 00:10:30 x17 kernel:
Jul 17 00:10:30 x17 kernel:
Jul 17 00:10:30 x17 kernel: Code: 0f 0b 37 87 32 80 ff ff ff ff 4c 00 c7 07 01 00 00 00 56 9d
Jul 17 00:10:30 x17 kernel: RIP <ffffffff8030e97d>{_spin_unlock_irqrestore+27} RSP <000001009fa03ee0>
Jul 17 00:10:30 x17 kernel: <0>Kernel panic - not syncing: Oops

On Fri, 2008-07-18 at 05:52 -0400, Robin Humble wrote:
> Hi,
>
> I'm seeing coordinated OSS crashes with Lustre 1.6.5.1.
>
> our RHEL4 OSS have been stable for ~months with these kernels:
>   kernel-lustre-smp-2.6.9-67.0.4.EL_lustre.1.6.4.3
>   kernel-lustre-smp-2.6.9-55.0.9.EL_lustre.1.6.4.2
>
> but have crashed hard, twice, about 10hrs apart as soon as we started
> using this kernel:
>   kernel-lustre-smp-2.6.9-67.0.7.EL_lustre.1.6.5.1

Can you try rebuilding the kernel, disabling SD_IOSTATS?

b.

On Fri, Jul 18, 2008 at 09:02:36AM -0400, Brian J. Murrell wrote:
>On Fri, 2008-07-18 at 05:52 -0400, Robin Humble wrote:
>> Hi,
>>
>> I'm seeing coordinated OSS crashes with Lustre 1.6.5.1.
>>
>> our RHEL4 OSS have been stable for ~months with these kernels:
>>   kernel-lustre-smp-2.6.9-67.0.4.EL_lustre.1.6.4.3
>>   kernel-lustre-smp-2.6.9-55.0.9.EL_lustre.1.6.4.2
>>
>> but have crashed hard, twice, about 10hrs apart as soon as we started
>> using this kernel:
>>   kernel-lustre-smp-2.6.9-67.0.7.EL_lustre.1.6.5.1
>Can you try rebuilding the kernel, disabling SD_IOSTATS?

done. I rebuilt using the stock kernel's InfiniBand stack and
  # CONFIG_SD_IOSTATS is not set

 % cexec -p oss: uptime
 oss x17: 18:45:07 up 1 day, 30 min, 1 user, load average: 4.97, 7.00, 6.27
 oss x18: 18:45:07 up 1 day, 23 min, 1 user, load average: 4.18, 5.78, 5.71
 oss x19: 18:45:07 up 1 day, 23 min, 1 user, load average: 5.18, 5.66, 4.60

which is >> the 10hrs it was crashing at before.
good guess about the cause of the problem! :-)

maybe that rhel4 1.6.5.1 kernel rpm needs a respin then? seems like a
fairly critical issue... :-/

cheers,
robin
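
The uptime comparison in the message above can also be done mechanically.
What follows is only a minimal Python sketch, not something posted in the
thread: the function name uptime_hours and the constant names are made up
for illustration, the sample lines are copied from the cexec output above,
and the 10-hour threshold is the crash interval reported in the first
message. It assumes the usual uptime field layouts ("N days", "H:MM",
"N min").

```python
import re

# Crash interval reported in the thread ("about 10hrs apart").
CRASH_INTERVAL_HOURS = 10

# Lines copied from the `cexec -p oss: uptime` output in the thread.
SAMPLE_LINES = [
    "oss x17: 18:45:07 up 1 day, 30 min, 1 user, load average: 4.97, 7.00, 6.27",
    "oss x18: 18:45:07 up 1 day, 23 min, 1 user, load average: 4.18, 5.78, 5.71",
    "oss x19: 18:45:07 up 1 day, 23 min, 1 user, load average: 5.18, 5.66, 4.60",
]


def uptime_hours(line: str) -> float:
    """Return the uptime in hours parsed from one line of uptime output."""
    # Capture the text between "up" and the "N user(s)" field.
    match = re.search(r"\bup\s+(.*?),\s+\d+\s+users?", line)
    if match is None:
        raise ValueError(f"no uptime field found in: {line!r}")

    hours = 0.0
    for part in match.group(1).split(","):
        part = part.strip()
        if m := re.fullmatch(r"(\d+)\s+days?", part):
            hours += 24 * int(m.group(1))
        elif m := re.fullmatch(r"(\d+):(\d+)", part):
            hours += int(m.group(1)) + int(m.group(2)) / 60
        elif m := re.fullmatch(r"(\d+)\s+min", part):
            hours += int(m.group(1)) / 60
        else:
            raise ValueError(f"unrecognised uptime component: {part!r}")
    return hours


if __name__ == "__main__":
    for line in SAMPLE_LINES:
        hours = uptime_hours(line)
        status = "past" if hours > CRASH_INTERVAL_HOURS else "within"
        print(f"{line.split(':')[0]}: {hours:.1f}h up, {status} the crash interval")
```

Surviving longer than the old crash interval is suggestive rather than
conclusive, so a check like this is only a quick sanity test.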

I am trying to understand. What was the problem? How does SD_IOSTATS
affect the crash? How did you disable this?

Sorry for a newbie question....

TIA

On Sun, 2008-07-20 at 04:54 -0400, Robin Humble wrote:
>
> done. I rebuilt using the stock kernel's InfiniBand stack and
>   # CONFIG_SD_IOSTATS is not set
>
> % cexec -p oss: uptime
> oss x17: 18:45:07 up 1 day, 30 min, 1 user, load average: 4.97, 7.00, 6.27
> oss x18: 18:45:07 up 1 day, 23 min, 1 user, load average: 4.18, 5.78, 5.71
> oss x19: 18:45:07 up 1 day, 23 min, 1 user, load average: 5.18, 5.66, 4.60
>
> which is >> the 10hrs it was crashing at before.

Good.

> good guess about the cause of the problem! :-)

I cheated. It's an already open bug: 16404. There is even a patch in
that bug for the reporter to test. Please feel free to test it yourself
and report here (or even better, in the bug) on your results.

> maybe that rhel4 1.6.5.1 kernel rpm needs a respin then? seems like a
> fairly critical issue... :-/

You can follow the above bug to see how we progress with it.

b.

On Sun, Jul 20, 2008 at 08:40:19AM -0400, Mag Gam wrote:
>I am trying to understand. What was the problem? How does SD_IOSTATS
>affect the crash? How did you disable this?

the comments describe the bug:
  https://bugzilla.lustre.org/show_bug.cgi?id=16404#c22
which from a quick look seems like an SMP locking issue around the
statistics collection that presumably under some circumstances can
cause an overflow and a crash.

the way to disable it is to rebuild the patched-by-Lustre RHEL kernel
with the CONFIG_SD_IOSTATS option turned off.

>Sorry for a newbie question....

no probs.
let me know if you need a recipe for patching and rebuilding this
kernel. I should really write it all down before I forget anyway...
there are most likely descriptions for patching and building kernels on
the Lustre wiki too.

cheers,
robin
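
Before and after a rebuild it can be handy to confirm how
CONFIG_SD_IOSTATS is set in a given kernel config. The following is a
minimal Python sketch under stated assumptions, not a recipe from the
thread: the helper name option_state and the sample config text are
hypothetical, and the /boot/config-<release> path is only the location
where many distributions happen to install the config, so it may not
exist on a given system. The sketch relies on the standard kernel config
format, in which an enabled option appears as CONFIG_FOO=y (or =m) and a
disabled one as "# CONFIG_FOO is not set".

```python
import platform
import re
from typing import Optional

# Hypothetical config excerpt used only for illustration.
SAMPLE_CONFIG = """\
CONFIG_SCSI=m
CONFIG_BLK_DEV_SD=m
# CONFIG_SD_IOSTATS is not set
"""


def option_state(config_text: str, option: str) -> Optional[str]:
    """Return the option's value, "not set", or None if it is not mentioned."""
    assignment = re.compile(rf"{re.escape(option)}=(.*)")
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line == f"# {option} is not set":
            return "not set"
        match = assignment.fullmatch(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    print("sample:", option_state(SAMPLE_CONFIG, "CONFIG_SD_IOSTATS"))

    # Assumed location of the running kernel's config; adjust as needed.
    path = f"/boot/config-{platform.release()}"
    try:
        with open(path) as handle:
            print(path, "->", option_state(handle.read(), "CONFIG_SD_IOSTATS"))
    except OSError as err:
        print(f"could not read {path}: {err}")
```

A result of None means the option does not appear in the file at all,
which is what one would expect from a kernel tree that does not carry
the patch adding it.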

Robin,

Thank you very much for helping with this.

I want to try kernel 2.6.25 or even 2.6.26. But it's not a big deal, I
just patched my distro kernel and everything seems to work well.
I am hoping in the future Lustre will become a daemon or a module
instead of patching the actual kernel source code. This is causing too
many pains.

On Fri, Jul 25, 2008 at 8:13 AM, Robin Humble <robin.humble at anu.edu.au> wrote:
> On Fri, Jul 25, 2008 at 07:46:24AM -0400, Mag Gam wrote:
>>Can you please provide some good instructions on how to patch a more
>>recent kernel, or if you are using Redhat provide a more recent kernel
>>:-)
>
> I don't work for Sun/Lustre, so I can't provide you with a more recent
> (fixed) spin of their kernel, if that's what you mean.
>
> clients:
> I'm using patchless clients with 2.6.23 and 2.6.24 kernels. standard
> lustre only supports 2.6.22 patchless clients, or using their distro
> kernels.
>
> servers:
> I'm using rhel5 and rhel4 kernels provided by lustre whenever possible.
>
> if you can do that too, it'd be the easiest thing. can you use a rhel5
> (centos5), sles10, or similar kernel?
> then you don't have to patch anything...
>
> I've had to patch both servers and clients over the years - clients
> mostly for bugs, and servers mostly to include infiniband support which
> is in there as standard now.
>
> let me know if you can't use a provided-by-sun kernel, or what sort of
> 'recent' kernel you'd like and I'll try to put together a recipe for
> patching for you...
>
> cheers,
> robin
>

On Sat, 2008-07-26 at 12:56 -0400, Mag Gam wrote:
> Robin,
>
> Thank you very much for helping with this.
>
> I want to try kernel 2.6.25 or even 2.6.26. But it's not a big deal, I
> just patched my distro kernel and everything seems to work well.
> I am hoping in the future Lustre will become a daemon or a module
> instead of patching the actual kernel source code. This is causing too
> many pains.

Seriously, this whole thing is a lot, lot, less painful if you just run
one of our supported distros on your MDS and OSSes (i.e. you can simply
install RPMs to get up and going). Your MDS and OSS should be "sealed
server" type dedicated machines that do nothing else but serve metadata
and file objects (respectively) so getting a bleeding edge kernel on
them should not be a requirement.

b.

On Mon, Jul 28, 2008 at 09:49:12AM -0400, Brian J. Murrell wrote:
> On Sat, 2008-07-26 at 12:56 -0400, Mag Gam wrote:
> > Robin,
> >
> > Thank you very much for helping with this.
> >
> > I want to try kernel 2.6.25 or even 2.6.26. But it's not a big deal, I
> > just patched my distro kernel and everything seems to work well.
> > I am hoping in the future Lustre will become a daemon or a module
> > instead of patching the actual kernel source code. This is causing too
> > many pains.
>
> Seriously, this whole thing is a lot, lot, less painful if you just run
> one of our supported distros on your MDS and OSSes (i.e. you can simply
> install RPMs to get up and going). Your MDS and OSS should be "sealed
> server" type dedicated machines that do nothing else but serve metadata
> and file objects (respectively) so getting a bleeding edge kernel on
> them should not be a requirement.

Maybe a lot less painful for Sun support ;) .. But what if you need a
new RDMA NIC driver that actually *works* in 2.6.26, and is completely
busted in a vendor/distro kernel.

Is Debian a supported distro? And if not, why? It seems that making
Debian policy compliant Lustre server packages would be a good exercise
for code and packaging quality.

I haven't kept up on linux-kernel lately, but is there some fundamental
reason that Lustre *still* requires kernel patches that don't have a
path into the mainline kernel tree?

On Mon, 2008-07-28 at 09:17 -0500, Troy Benjegerdes wrote:
>
> Maybe a lot less painful for Sun support ;)

No. I specifically meant less painful on the implementer. It's not
really any more difficult on Sun for you to use whatever kernel you want
to use.

> But what if you need a new RDMA NIC
> driver that actually *works* in 2.6.26, and is completely busted in a
> vendor/distro kernel.

Well, that's really no different than any of the rest of the FOSS world.
If you want to use bleeding edge hardware, there are always pains
associated. There is support in the distro kernels for a good number of
working RDMA capable NICs.

> Is Debian a supported distro?

No.

> And if not, why?

Simply no customer demand. Like *everything* else in this world, we are
a limited resource and we have to allocate that resource where it is
most effective.

> It seems that making
> Debian policy compliant Lustre server packages would be a good exercise
> for code and packaging quality.

Somebody already does that.

http://packages.debian.org/source/lenny/lustre
http://packages.debian.org/source/sid/lustre

> I haven't kept up on linux-kernel lately, but is there some fundamental
> reason that Lustre *still* requires kernel patches that don't have a
> path into the mainline kernel tree?

Yes. Performance and scalability. The mainline kernel lacks a number of
facilities we need to achieve the performance and scalability that we
do. We have worked toward getting our changes into the kernel in the
past but have been continually rejected for one reason or another. I'm
not going to speculate why this has been but you can search lkml for the
gory details if you like.

b.