Hi,

on the mgs node e2fsck-1.39.cfs8 shows this:

e2fsck 1.39.cfs8 (7-Apr-2007)
lustre-MDT0000: recovering journal
Clearing orphaned inode 26502799 (uid=1012, gid=100, mode=0100644, size=0)
Clearing orphaned inode 26501309 (uid=1012, gid=100, mode=0100644, size=0)
Clearing orphaned inode 26500161 (uid=1012, gid=100, mode=0100644, size=0)
Clearing orphaned inode 26499062 (uid=1012, gid=100, mode=0100644, size=0)
Clearing orphaned inode 26497822 (uid=1012, gid=100, mode=0100644, size=0)
Clearing orphaned inode 26497454 (uid=1012, gid=100, mode=0100644, size=0)
Clearing orphaned inode 26470885 (uid=1012, gid=100, mode=0100644, size=0)
Clearing orphaned inode 26470857 (uid=1012, gid=100, mode=0100644, size=0)
Clearing orphaned inode 26470840 (uid=1012, gid=100, mode=0100644, size=0)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

This is after building a binutils package on the Lustre clients and then
checking the filesystem.

Any idea how critical it is?

Thanks,
Bernd

--
Bernd Schubert
Q-Leap Networks GmbH
On Jul 05, 2007 15:34 +0200, Bernd Schubert wrote:
> on the mgs node e2fsck-1.39.cfs8 shows this:
>
> e2fsck 1.39.cfs8 (7-Apr-2007)
> lustre-MDT0000: recovering journal
> Clearing orphaned inode 26502799 (uid=1012, gid=100, mode=0100644, size=0)
> Clearing orphaned inode 26501309 (uid=1012, gid=100, mode=0100644, size=0)
> Clearing orphaned inode 26500161 (uid=1012, gid=100, mode=0100644, size=0)
> Clearing orphaned inode 26499062 (uid=1012, gid=100, mode=0100644, size=0)
> Clearing orphaned inode 26497822 (uid=1012, gid=100, mode=0100644, size=0)
> Clearing orphaned inode 26497454 (uid=1012, gid=100, mode=0100644, size=0)
> Clearing orphaned inode 26470885 (uid=1012, gid=100, mode=0100644, size=0)
> Clearing orphaned inode 26470857 (uid=1012, gid=100, mode=0100644, size=0)
> Clearing orphaned inode 26470840 (uid=1012, gid=100, mode=0100644, size=0)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
>
> This is after building a binutils package on the Lustre clients and then
> checking the filesystem.
>
> Any idea how critical it is?

This is normal even for local ext3 filesystems, if they crash with open but
unlinked files. If the filesystem has been stopped normally (not forced)
then this shouldn't be happening.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
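For readers unfamiliar with how such orphans come about, the
open-but-unlinked situation Andreas refers to can be reproduced on any
scratch ext3 mount with a small stand-alone C program. This is a
hypothetical sketch, not part of the thread's build; the path
/mnt/ext3/orphan-demo is made up for illustration. The process opens a
file, unlinks it while keeping the descriptor, and the node then crashes
(or loses power) before the process exits, leaving an inode with link
count 0 on ext3's orphan list for the next journal recovery to clear.

/* orphan_demo.c - hypothetical illustration of an open-but-unlinked file.
 * If the machine crashes while this program sleeps, the inode is still
 * allocated but has link count 0, so journal recovery (or e2fsck) will
 * report it as an orphaned inode and clear it, much like the log above.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* assumed scratch mount point; adjust to a throw-away ext3 filesystem */
    int fd = open("/mnt/ext3/orphan-demo", O_CREAT | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* remove the directory entry; the inode stays allocated while fd is open */
    if (unlink("/mnt/ext3/orphan-demo") < 0) {
        perror("unlink");
        return EXIT_FAILURE;
    }

    printf("file is open but unlinked; crash the node now to leave an orphan\n");
    pause();        /* keep the descriptor open indefinitely */

    close(fd);      /* a clean exit releases the inode normally */
    return EXIT_SUCCESS;
}

If the node is power-cycled while the program sleeps, the next journal
recovery or e2fsck run should print a "Clearing orphaned inode" line for
that inode; if the program exits cleanly, no orphan is left behind.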
Bernd Schubert
2007-Jul-05 12:20 UTC
[Lustre-discuss] Re: Clearing orphaned inode 26502799
Andreas Dilger wrote:
>> This is after building a binutils package on the Lustre clients and then
>> checking the filesystem.
>>
>> Any idea how critical it is?
>
> This is normal even for local ext3 filesystems, if they crash with open
> but unlinked files. If the filesystem has been stopped normally (not
> forced) then this shouldn't be happening.

Sorry, I should have been more specific. This happens reproducibly on a
normally unmounted MDS node. Since e2fsck detects it on journal replay,
does this mean the buffers are probably not flushed to disk before the
umount succeeds?

Thanks a lot for your help,
Bernd
Andreas Dilger
2007-Jul-05 14:25 UTC
[Lustre-discuss] Re: Clearing orphaned inode 26502799
On Jul 05, 2007 20:19 +0200, Bernd Schubert wrote:
> Andreas Dilger wrote:
> >> This is after building a binutils package on the Lustre clients and
> >> then checking the filesystem.
> >>
> >> Any idea how critical it is?
> >
> > This is normal even for local ext3 filesystems, if they crash with open
> > but unlinked files. If the filesystem has been stopped normally (not
> > forced) then this shouldn't be happening.
>
> Sorry, I should have been more specific. This happens reproducibly on a
> normally unmounted MDS node.

Is this also after the clients are properly unmounted and/or evicted
(not using --force during MDS cleanup)? In that case it seems to show
that some files are leaking inode reference counts or similar.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Bernd Schubert
2007-Jul-05 15:07 UTC
[Lustre-discuss] Re: Re: Clearing orphaned inode 26502799
Andreas Dilger wrote:
> On Jul 05, 2007 20:19 +0200, Bernd Schubert wrote:
>> Andreas Dilger wrote:
>> >> This is after building a binutils package on the Lustre clients and
>> >> then checking the filesystem.
>> >>
>> >> Any idea how critical it is?
>> >
>> > This is normal even for local ext3 filesystems, if they crash with
>> > open but unlinked files. If the filesystem has been stopped normally
>> > (not forced) then this shouldn't be happening.
>>
>> Sorry, I should have been more specific. This happens reproducibly on a
>> normally unmounted MDS node.
>
> Is this also after the clients are properly unmounted and/or evicted
> (not using --force during MDS cleanup)? In that case it seems to show
> that some files are leaking inode reference counts or similar.

I still need to test whether it also happens when the clients unmount
first, but I did not give the --force option on the MDS umount.

Any idea how I can debug it?

Thanks,
Bernd
Bernd Schubert
2007-Jul-06 04:03 UTC
[Lustre-discuss] Re: Re: Clearing orphaned inode 26502799
I just got this in dmesg:

[67100.343686] Lustre: Found inode with zero generation or link -- this may
indicate disk corruption (inode: 26398122/151277060, link 0, count 1)
[67100.358660] Lustre: Found inode with zero generation or link -- this may
indicate disk corruption (inode: 26398122/151277060, link 0, count 1)
[67100.368242] Lustre: Found inode with zero generation or link -- this may
indicate disk corruption (inode: 26398122/151277060, link 0, count 1)
[67100.368248] Lustre: Skipped 6 previous similar messages
[67100.386517] Lustre: Found inode with zero generation or link -- this may
indicate disk corruption (inode: 26398114/151277054, link 0, count 1)

So I ran e2fsck again, and the inodes above were among the orphaned inodes
e2fsck found in the journal.

This might or might not be a bug in our additional patches for 2.6.20, but
it is hard to test, since I have no access to your CVS tree for lustre-1.6.1
with 2.6.18 support and older kernels won't run on our hardware.

Let me know if there's anything I can do to debug this.

Cheers,
Bernd

--
Bernd Schubert
Q-Leap Networks GmbH
Kalpak Shah
2007-Jul-06 04:27 UTC
[Lustre-discuss] Re: Re: Clearing orphaned inode 26502799
On Fri, 2007-07-06 at 12:03 +0200, Bernd Schubert wrote:
> I just got this in dmesg:
>
> [67100.343686] Lustre: Found inode with zero generation or link -- this may
> indicate disk corruption (inode: 26398122/151277060, link 0, count 1)
> [67100.358660] Lustre: Found inode with zero generation or link -- this may
> indicate disk corruption (inode: 26398122/151277060, link 0, count 1)
> [67100.368242] Lustre: Found inode with zero generation or link -- this may
> indicate disk corruption (inode: 26398122/151277060, link 0, count 1)
> [67100.368248] Lustre: Skipped 6 previous similar messages
> [67100.386517] Lustre: Found inode with zero generation or link -- this may
> indicate disk corruption (inode: 26398114/151277054, link 0, count 1)
>
> So I ran e2fsck again, and the inodes above were among the orphaned inodes
> e2fsck found in the journal.
>
> This might or might not be a bug in our additional patches for 2.6.20, but
> it is hard to test, since I have no access to your CVS tree for lustre-1.6.1
> with 2.6.18 support and older kernels won't run on our hardware.
>
> Let me know if there's anything I can do to debug this.

Hi Bernd,

This happens because it is actually possible to get a zero-generation inode
once every (random < 2^32) inodes.

Ldiskfs needs to have this patch to skip inodes with generation = 0:

--- linux-2.6.9-34.orig/fs/ext3/ialloc.c  2007-01-03 13:30:33.000000000 +0000
+++ linux-2.6.9-34/fs/ext3/ialloc.c  2007-01-03 13:42:04.000000000 +0000
@@ -721,6 +721,8 @@ got:
         insert_inode_hash(inode);
         spin_lock(&sbi->s_next_gen_lock);
         inode->i_generation = sbi->s_next_generation++;
+        if (unlikely(inode->i_generation == 0))
+                inode->i_generation = sbi->s_next_generation++;
         spin_unlock(&sbi->s_next_gen_lock);

         ei->i_state = EXT3_STATE_NEW;

There also needs to be a change in mds_fid2dentry(). These patches can be
found in bz10419.

Thanks,
Kalpak.

> Cheers,
> Bernd
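To make the wrap-around Kalpak describes more concrete, here is a small
user-space sketch. It is purely illustrative and not kernel code; the seed
value is chosen artificially so the wrap happens after a few allocations.
A post-incremented 32-bit counter eventually hands out the value 0, which
the MDS flags as suspicious (see the dmesg lines above), while the extra
check from the patch skips it.

/* generation_wrap.c - hypothetical user-space illustration (not kernel code)
 * of a 32-bit generation counter wrapping through zero, and of the check
 * that skips the reserved value 0.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static uint32_t next_generation = UINT32_MAX - 2;  /* artificial seed near the wrap */

/* unpatched behaviour: may return 0 when the counter wraps */
static uint32_t new_generation_unpatched(void)
{
    return next_generation++;
}

/* patched behaviour: never hand out the reserved value 0 */
static uint32_t new_generation_patched(void)
{
    uint32_t gen = next_generation++;
    if (gen == 0)
        gen = next_generation++;
    return gen;
}

int main(void)
{
    for (int i = 0; i < 5; i++)
        printf("unpatched: %" PRIu32 "\n", new_generation_unpatched());

    next_generation = UINT32_MAX - 2;  /* reset the counter for the second run */
    for (int i = 0; i < 5; i++)
        printf("patched:   %" PRIu32 "\n", new_generation_patched());

    return 0;
}

The unpatched sequence contains a 0 right after the wrap, while the patched
sequence skips it, mirroring the two added lines in the ialloc.c hunk above.
In the kernel the counter starts from a random seed, which presumably is why
the problem only shows up "once every (random < 2^32) inodes".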
Bernd Schubert
2007-Jul-06 13:25 UTC
[Lustre-discuss] Re: Re: Clearing orphaned inode 26502799
Hello Kalpak,

On Friday 06 July 2007 12:28:29 Kalpak Shah wrote:
> Hi Bernd,
>
> This happens because it is actually possible to get a zero-generation inode
> once every (random < 2^32) inodes.
>
> Ldiskfs needs to have this patch to skip inodes with generation = 0:
>
> --- linux-2.6.9-34.orig/fs/ext3/ialloc.c  2007-01-03 13:30:33.000000000 +0000
> +++ linux-2.6.9-34/fs/ext3/ialloc.c  2007-01-03 13:42:04.000000000 +0000
> @@ -721,6 +721,8 @@ got:
>          insert_inode_hash(inode);
>          spin_lock(&sbi->s_next_gen_lock);
>          inode->i_generation = sbi->s_next_generation++;
> +        if (unlikely(inode->i_generation == 0))
> +                inode->i_generation = sbi->s_next_generation++;
>          spin_unlock(&sbi->s_next_gen_lock);
>
>          ei->i_state = EXT3_STATE_NEW;
>
> There also needs to be a change in mds_fid2dentry(). These patches can be
> found in bz10419.

Thanks a lot, I will test it as soon as possible. Today I was busy all day
with an entirely different issue (not related to Lustre at all).

Thanks again,
Bernd

--
Bernd Schubert
Q-Leap Networks GmbH