Erik Froese
2010-Apr-29 15:31 UTC
[Lustre-discuss] Kernel Panic on osc:osc_statfs_interpret
Hey everyone,

I have a machine that is constantly crashing with a kernel panic at osc:osc_statfs_interpret. The machine had almost no Lustre activity going on at the time of the crash. When it came back up I ran some ls and dd tests to verify that it was indeed working.

I don't have any other nodes exhibiting the same symptoms, so I'm leaning towards this being a hardware issue on the problem node.

Any insight into the osc:osc_statfs_interpret function and what it might touch that could cause a kernel panic?

Sorry for attaching a picture of it, but it's a Dell and the remote console software won't let me copy and paste.

Erik

[Attachment: Screen shot 2010-04-28 at 4.10.05 PM.png (image/png, 39572 bytes): http://lists.lustre.org/pipermail/lustre-discuss/attachments/20100429/c39cc661/attachment-0001.png]
Kit Westneat
2010-Apr-29 20:10 UTC
[Lustre-discuss] Kernel Panic on osc:osc_statfs_interpret
Hi Erik,

It looks like: https://bugzilla.lustre.org/show_bug.cgi?id=20482

It's a problem with how inactive OSTs are handled. It's fixed in 1.8.2, but the workaround seems to be to do --writeconf everywhere to remove those OSTs from the config.

- Kit
Erik Froese
2010-Apr-30 17:58 UTC
[Lustre-discuss] Kernel Panic on osc:osc_statfs_interpret
Thanks Kit,

We found that bug yesterday and it is indeed what bit us. The bug report says the fix landed in 1.8.1.1. Our servers were running 1.8.1.1 and the clients were on 1.8.1. We've tested a few clients with 1.8.1.1 and they work just fine, so I'm in the process of rolling out the upgraded RPMs.

Thanks for helping out!

Erik
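
For anyone auditing a cluster for the same client/server mismatch, here is a minimal sketch of the version check in Python. It assumes the fix version is 1.8.1.1, as the bug report is said to state above; the host names and the `clients` mapping are hypothetical, and how you collect each node's installed Lustre version is left out.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '1.8.1.1' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def needs_upgrade(installed: str, fixed_in: str = "1.8.1.1") -> bool:
    """Return True if 'installed' is older than 'fixed_in'.

    Shorter versions are padded with zeros, so '1.8.1' compares as 1.8.1.0.
    The default fix version is taken from the thread, not verified here.
    """
    a = parse_version(installed)
    b = parse_version(fixed_in)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def hosts_to_upgrade(versions: Dict[str, str]) -> List[str]:
    """List the hosts whose installed version predates the fix."""
    return sorted(host for host, ver in versions.items() if needs_upgrade(ver))


if __name__ == "__main__":
    # Hypothetical inventory: host name -> installed Lustre client version.
    clients = {"client01": "1.8.1", "client02": "1.8.1.1", "client03": "1.8.2"}
    print(hosts_to_upgrade(clients))  # prints ['client01']
```

Padding with zeros matters here: a plain string comparison would not reliably order 1.8.1 before 1.8.1.1 once multi-digit components appear.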