Hi all,

I have been playing with Lustre 1.6 beta 5 and I have gotten everything
working with a 2.6.18 vanilla kernel, but when I went to umount the client
I get this error:

CPU 1
Modules linked in: obdfilter ost osc mds fsfilt_ldiskfs mgs mgc ldiskfs lustre
 lov lquota mdc ksocklnd ptlrpc obdclass lnet lvfs libcfs ipv6 thermal
 dm_snapshot dm_mirror dm_mod psmouse serio_raw shpchp pci_hotplug ohci_hcd
 evdev pcspkr
Pid: 2608, comm: umount Tainted: G M 2.6.18lustre1.95 #1
RIP: 0010:[<ffffffff88264d52>]  [<ffffffff88264d52>]
 :lustre:ll_umount_begin+0x133/0x69f
RSP: 0018:ffff8101ef8dbc18  EFLAGS: 00010206
RAX: 0100000000000824 RBX: ffff8100fbf891c0 RCX: 0000000000000010
RDX: ffff8101ef91c1ea RSI: ffffffff88278740 RDI: ffff810003abede8
RBP: ffff8100fbf89080 R08: 00000000fffffffe R09: 0000000000000020
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8100fa295200 R14: 0000000000512f90 R15: 0000000000513020
FS:  00002b6c2d9321d0(0000) GS:ffff8101000e7740(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b6c2d1b1461 CR3: 00000001ef963000 CR4: 00000000000006e0
Process umount (pid: 2608, threadinfo ffff8101ef8da000, task ffff8101fc666870)
Stack:  0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff802bf67f>] sys_umount+0x133/0x290
 [<ffffffff802228ec>] sys_newstat+0x19/0x31
 [<ffffffff8025c7b2>] system_call+0x7e/0x83
Code: 80 88 b8 00 00 00 40 f6 05 44 cb e4 ff 01 49 8b 6d 40 74 3a
RIP  [<ffffffff88264d52>] :lustre:ll_umount_begin+0x133/0x69f
 RSP <ffff8101ef8dbc18>

Any ideas? I can't unmount the Lustre filesystem.

Thanks in advance for any advice.

Jon Scottorn
Systems Administrator
The Possibility Forge, Inc.
http://www.possibilityforge.com
435.635.0591 x.1004
Jon Scottorn <jscottorn@possibilityforge.com> writes:

> Hi all,
> I have been playing with Lustre 1.6 beta 5 and I have gotten everything
> working with a 2.6.18 vanilla kernel, but when I went to umount the
> client I get this error:
>
> [oops in :lustre:ll_umount_begin snipped]
>
> Any ideas?
> I can't unmount the Lustre filesystem.
> Thanks in advance for any advice.

You are number 4 with this problem now. I got it as well, but with
Lustre 1.4.7.1 and kernel 2.6.18, so I figure it is something to do with
2.6.18. :)

Anyone?

MfG
Goswin
This only happens on the system that houses the MGS and MDT, so it could
just be tied to the 2.6.18 kernel like you said. It does not give me this
error on a stand-alone client. That is:

I have my main system running 2.6.18 with Lustre 1.5.95, running the MGS,
MDT, and OST and mounted as a client as well.

I have a second system running 2.6.17.13 with Vservers 2.0.1 and the
Lustre 1.5.95 patchless client, mounted to the above as a client.

I can unmount the second system from Lustre with no problems.

BTW, if anyone is wondering, thanks for all the help in getting Vservers
working with Lustre. I am running the above setup with only the below
issue currently, and I am going to be testing it extensively over the
next few weeks.

Thanks,

Jon

On Thu, 2006-10-26 at 23:52 +0200, Goswin von Brederlow wrote:
> Jon Scottorn <jscottorn@possibilityforge.com> writes:
>
> > Hi all,
> > I have been playing with Lustre 1.6 beta 5 and I have gotten everything
> > working with a 2.6.18 vanilla kernel, but when I went to umount the
> > client I get this error:
> >
> > [oops in :lustre:ll_umount_begin snipped]
> >
> > Any ideas?
> > I can't unmount the Lustre filesystem.
> > Thanks in advance for any advice.
>
> You are number 4 with this problem now. I got it as well, but with
> Lustre 1.4.7.1 and kernel 2.6.18, so I figure it is something to do with
> 2.6.18. :)
>
> Anyone?
>
> MfG
> Goswin

Jon Scottorn
Systems Administrator
The Possibility Forge, Inc.
http://www.possibilityforge.com
435.635.0591 x.1004
For what it's worth, ll_umount_begin is only called for clients, so maybe
it is an issue with running a client on the same node as the MGS/MDT.

Also, ll_umount_begin is only called in the "umount -f" case, not the
non-forced case, so try the umount without the "-f".

Jon Scottorn wrote:
> This only happens on the system that houses the MGS and MDT, so it could
> just be tied to the 2.6.18 kernel like you said. It does not give me this
> error on a stand-alone client. That is:
>
> I have my main system running 2.6.18 with Lustre 1.5.95, running the MGS,
> MDT, and OST and mounted as a client as well.
>
> I have a second system running 2.6.17.13 with Vservers 2.0.1 and the
> Lustre 1.5.95 patchless client, mounted to the above as a client.
>
> I can unmount the second system from Lustre with no problems.
>
> [rest of quote and oops trace snipped]
Solofo.Ramangalahy@bull.net
2006-Oct-27 02:01 UTC
[Lustre-discuss] Lustre 1.6 umount errors
Goswin von Brederlow writes:

> You are number 4 with this problem now. I got it as well, but with
> Lustre 1.4.7.1 and kernel 2.6.18, so I figure it is something to do with
> 2.6.18. :)
>
> Anyone?

I've opened bugzilla ticket 11039, related to 2.6.18 support. The work is
not finished yet, but it may already have something useful in it.

Regards,
--
solofo
Nathaniel Rutman wrote:

> For what it's worth, ll_umount_begin is only called for clients, so maybe
> it is an issue with running a client on the same node as the MGS/MDT.
>
> Also, ll_umount_begin is only called in the "umount -f" case, not the
> non-forced case, so try the umount without the "-f".

Are you sure? I just checked this: a plain 'umount /test' oopsed in
ll_umount_begin even though I did not pass '-f'. This is on 2.6.18.

- Alastair
Alastair McKinstry wrote:

> Nathaniel Rutman wrote:
>
>> For what it's worth, ll_umount_begin is only called for clients, so
>> maybe it is an issue with running a client on the same node as the
>> MGS/MDT.
>>
>> Also, ll_umount_begin is only called in the "umount -f" case, not the
>> non-forced case, so try the umount without the "-f".
>
> Are you sure? I just checked this: a plain 'umount /test' oopsed in
> ll_umount_begin even though I did not pass '-f'. This is on 2.6.18.
>
> - Alastair

In 2.6.10, fs/namespace.c:do_umount() has:

    if ((flags & MNT_FORCE) && sb->s_op->umount_begin)
        sb->s_op->umount_begin(sb);

Maybe it's changed in 2.6.18.
Nathaniel Rutman <nathan@clusterfs.com> writes:

> Alastair McKinstry wrote:
>
>> Nathaniel Rutman wrote:
>>
>>> For what it's worth, ll_umount_begin is only called for clients, so
>>> maybe it is an issue with running a client on the same node as the
>>> MGS/MDT.
>>>
>>> Also, ll_umount_begin is only called in the "umount -f" case, not the
>>> non-forced case, so try the umount without the "-f".
>>
>> Are you sure? I just checked this: a plain 'umount /test' oopsed in
>> ll_umount_begin even though I did not pass '-f'. This is on 2.6.18.
>>
>> - Alastair
>
> In 2.6.10, fs/namespace.c:do_umount() has:
>
>     if ((flags & MNT_FORCE) && sb->s_op->umount_begin)
>         sb->s_op->umount_begin(sb);
>
> Maybe it's changed in 2.6.18.

It has. The bugzilla has a patch.

MfG
Goswin
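PS, for the archive: as far as I can tell, two things changed in 2.6.18
(this is paraphrased from memory, so check an actual 2.6.18 tree before
relying on it). The MNT_FORCE test was dropped from do_umount(), so the
hook is now reached even on a plain umount, and the prototype changed to
take the vfsmount plus the umount flags instead of the super_block:

    /* fs/namespace.c:do_umount() in 2.6.18, paraphrased, not verbatim:
     * the MNT_FORCE guard is gone and the flags go down to the fs. */
    if (sb->s_op->umount_begin)
            sb->s_op->umount_begin(mnt, flags);

That would explain both symptoms: a handler with the old signature gets
handed a struct vfsmount * where it expects a struct super_block * (hence
the oops inside ll_umount_begin), and it also gets called on non-forced
umounts, which matches Alastair's test. The hook now has to check the
flags itself, which seems to be what nfs_umount_begin() does in 2.6.18.
Roughly along these lines -- an illustrative sketch only, not the actual
patch attached to bug 11039:

    #include <linux/fs.h>
    #include <linux/mount.h>

    /* Illustrative only; see bugzilla 11039 for the real fix. */
    static void ll_umount_begin(struct vfsmount *vfsmnt, int flags)
    {
            struct super_block *sb = vfsmnt->mnt_sb;

            /* The VFS no longer filters on MNT_FORCE, so the filesystem
             * must do that check itself. */
            if (!(flags & MNT_FORCE))
                    return;

            /* ... then abort outstanding requests against sb, as the old
             * super_block-based version of this function did ... */
    }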