On Mar 28, 2006, at 4:58 AM, Andreas Dilger wrote:
> On Mar 27, 2006 18:46 -0500, Weikuan Yu wrote:
>> I had the same problem you reported here with lustre-1.4.6 patched
>> for linux-2.6.15.4. My guess is that there is a race condition
>> somewhere in the use of the logs directory. I used a workaround to
>> avoid this problem, shown in the following patch. Basically, it
>> avoids the LOGS directory and uses PENDING instead when it is
>> needed. It may or may not help with your problem. I am just throwing
>> some thoughts out here to get more opinions from people who have
>> experienced the same problem.
>
> The meaning of the LOGS and PENDING directories is completely
> different.

I understand that they mean very different things :-). That is why I
wanted to put out some probes here. But rather than staying blocked on
this MDS backing file system corruption (as indicated by Cliff for the
same problem), the workaround could temporarily allow me to proceed
further. Do you see any implications here, and under what scenarios?
BTW, is the problem considered valid, or is it just being deferred for
now?

Thanks,
Weikuan

> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.

--
Weikuan Yu, Computer Science, OSU
http://www.cse.ohio-state.edu/~yuw

Did you update your e2fsprogs? Updating e2fsprogs to 1.38 should fix
this problem.

Thanks

Weikuan Yu wrote:
>> I had the same problem you reported here with lustre-1.4.6 patched
>> for linux-2.6.15.4. My guess is that there is a race condition
>> somewhere in the use of the logs directory.
>>
>> On Mar 6, 2006, at 11:24 AM, Jacob Boswell wrote:
>>> Mar 5 05:00:49 xen1 kernel: LustreError: 27144:0:(mds_fs.c:466:mds_fs_setup()) cannot create LOGS directory: rc = -2

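For anyone wanting to check the suggestion above before upgrading, here is a minimal sketch that compares an e2fsprogs version banner against 1.38. The function names and the sample banner strings are hypothetical; the sketch assumes the banner contains a dotted version number, as the output of `e2fsck -V` typically does.

```python
import re

MIN_VERSION = (1, 38)  # the version suggested in the message above


def parse_e2fsprogs_version(banner):
    """Return the first dotted version number in the banner as a tuple of ints."""
    match = re.search(r"\b(\d+(?:\.\d+)+)", banner)
    if match is None:
        raise ValueError(f"no version number found in {banner!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_new_enough(banner, minimum=MIN_VERSION):
    """True if the version in the banner is at least the minimum version."""
    return parse_e2fsprogs_version(banner) >= minimum


# Hypothetical sample banners, not captured from a real system.
print(is_new_enough("e2fsck 1.35"))
print(is_new_enough("e2fsck 1.38"))
```

Versions are compared as integer tuples rather than as strings or floats, so a release such as 1.4 would not be mistaken for something newer than 1.38.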
On Mar 27, 2006 18:46 -0500, Weikuan Yu wrote:
> I had the same problem you reported here with lustre-1.4.6 patched
> for linux-2.6.15.4. My guess is that there is a race condition
> somewhere in the use of the logs directory. I used a workaround to
> avoid this problem, shown in the following patch. Basically, it
> avoids the LOGS directory and uses PENDING instead when it is
> needed. It may or may not help with your problem. I am just throwing
> some thoughts out here to get more opinions from people who have
> experienced the same problem.

The meaning of the LOGS and PENDING directories is completely
different.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

On Mar 27, 2006, at 10:11 PM, wddi_1976 wrote:
> Did you update your e2fsprogs? Updating e2fsprogs to 1.38 should
> fix this problem.

Thanks for the tip. Could you explain in a little more detail how this
helps fix the problem? I am using ldiskfs on a device file, /dev/hda.

Weikuan

--
Weikuan Yu, Computer Science, OSU
http://www.cse.ohio-state.edu/~yuw

On Mar 28, 2006 06:24 -0500, Weikuan Yu wrote:
> On Mar 28, 2006, at 4:58 AM, Andreas Dilger wrote:
>> The meaning of the LOGS and PENDING directories is completely
>> different.
>
> I understand that they mean very different things :-). That is why I
> wanted to put out some probes here. But rather than staying blocked
> on this MDS backing file system corruption (as indicated by Cliff for
> the same problem), the workaround could temporarily allow me to
> proceed further. Do you see any implications here, and under what
> scenarios? BTW, is the problem considered valid, or is it just being
> deferred for now?

I'm not sure what the initial problem is, but what will happen now is
that when your MDS crashes you will likely leak space on the OSTs. All
files in PENDING are deleted after recovery is complete, so the llog
files (which should be in LOGS) will not be available to clean up files
that were not completely destroyed on the OSTs.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

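The space leak described above can be illustrated with a toy model. This is only a sketch of the reasoning, not Lustre code: the dictionary keys, the function name, and the object IDs are all hypothetical, and the order of the two recovery steps is an assumption made to keep the example short.

```python
def recover_after_crash(mds_dirs, ost_objects):
    """Toy MDS recovery: return the OST objects that are still allocated."""
    # Per the message above, everything in PENDING is deleted after recovery.
    mds_dirs["PENDING"] = []
    # Cleanup can only use the llog records that are still available in LOGS.
    for object_id in mds_dirs["LOGS"]:
        ost_objects.discard(object_id)
    mds_dirs["LOGS"] = []
    return ost_objects


half_destroyed = {101, 102}  # hypothetical IDs of partially destroyed objects

# Normal layout: the llog records live in LOGS and survive recovery.
normal = {"LOGS": [101, 102], "PENDING": []}
print(sorted(recover_after_crash(normal, set(half_destroyed))))      # []

# Workaround layout: the same records were written under PENDING instead.
workaround = {"LOGS": [], "PENDING": [101, 102]}
print(sorted(recover_after_crash(workaround, set(half_destroyed))))  # [101, 102]
```

In the second case the records are gone before anything can act on them, so objects 101 and 102 stay allocated, which is the leaked OST space.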
Posting this again for discussion, now that I have registered as a
member.

Weikuan

Begin forwarded message:

> From: Weikuan Yu <wkyuwk@gmail.com>
> Date: March 27, 2006 1:45:06 PM EST
> To: "Jacob Boswell" <jacob@cordor.com>
> Cc: lustre-discuss@lists.clusterfs.com
> Subject: Re: [Lustre-discuss] Lustre + Xen
>
> Hi, Jacob,
>
> I had the same problem you reported here with lustre-1.4.6 patched
> for linux-2.6.15.4. My guess is that there is a race condition
> somewhere in the use of the logs directory. I used a workaround to
> avoid this problem, shown in the following patch. Basically, it
> avoids the LOGS directory and uses PENDING instead when it is
> needed. It may or may not help with your problem. I am just throwing
> some thoughts out here to get more opinions from people who have
> experienced the same problem.
>
> Weikuan
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mds_fs_setup.workaround
Type: application/octet-stream
Size: 2518 bytes
Desc: not available
Url : http://mail.clusterfs.com/pipermail/lustre-discuss/attachments/20060327/685d2abf/mds_fs_setup.obj
-------------- next part --------------
>
> On Mar 6, 2006, at 11:24 AM, Jacob Boswell wrote:
>
>> Hello all, I have compiled lustre myself against a 2.6.12.6-xen
>> kernel. The kernel patched and compiled without any errors, and the
>> tools and modules also compiled with no errors. However, after
>> running llmount.sh or the basic example in the Howto, I keep
>> getting the following errors.
>>
>> Mar 5 05:00:49 xen1 kernel: LustreError: 27144:0:(lvfs.h:130:ll_lookup_one_len()) bad inode returned 25002/3133433428
>> Mar 5 05:00:49 xen1 kernel: LustreError: 27144:0:(mds_fs.c:466:mds_fs_setup()) cannot create LOGS directory: rc = -2
>> Mar 5 05:00:49 xen1 kernel: LustreError: 27144:0:(handler.c:1834:mds_setup()) mds1: MDS filesystem method init failed: rc = -2
>> Mar 5 05:00:49 xen1 kernel: LustreError: 27146:0:(obd_config.c:323:class_cleanup()) Device 3 not setup
>>
>> These errors happen both using loopback devices and physical devices.
>>
>> Any help would be greatly appreciated.
>>
>> Jacob
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss@clusterfs.com
>> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>>
> --
> Weikuan Yu, Computer Science, OSU
> http://www.cse.ohio-state.edu/~yuw

--
Weikuan Yu, Computer Science, OSU
http://www.cse.ohio-state.edu/~yuw
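
A note for readers decoding the log above: assuming the `rc` values follow the usual Linux kernel convention of returning a negated errno, `rc = -2` corresponds to ENOENT. The helper below is a hypothetical sketch (the name `describe_rc` is not from Lustre) that looks the code up with Python's standard library.

```python
import errno
import os


def describe_rc(rc):
    """Map a negative kernel-style return code to its errno name and message."""
    code = -rc
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


name, message = describe_rc(-2)
print(name, "-", message)  # the name is ENOENT; the message text is platform-dependent
```

Read this way, the LOGS directory lookup or creation reported that an entry did not exist, which fits the "bad inode returned" line logged just before it.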