Stephen,

OK, I can now reproduce this hang at will, purely by pulling the plug on my desktop while logged in and then rebooting. It's a GNOME desktop box with few partitions and ext3 on all of them, so I guess it's getting a pile of GNOME- or ssh-related sockets kept in /tmp, which is on root.

To recap: when the machine is suffering from this, it hangs at the point of mounting the root filesystem during boot. Running e2fsck from the latest 1.20WIP set fixes the problem.

I had hoped that booting with rootflags=debug would give me more information, but there is nothing useful in there, basically just

    JFS DEBUG.... exit status 0, recovered transactions 38212 to 38212

a similar summary message, and then

    EXT-fs: 03:01 orphan cleanup on read-only filesystem

[I'm retyping these errors so there may be minor mistakes]

Running e2fsck on it gives, after some other bits,

    Clearing orphaned inode 666608 (uid=0, gid=0, mode=0100755, size=456328)
    Clearing orphaned inode 53304 (uid=0, gid=0, mode=0100755, size=552935)
    Truncating orphaned inode 10456 (uid=100, gid=500, mode=0140755, size=0)

I can reproduce this at will (with some pain), so if you can give me a hint as to how you want this followed up, please come back to me. Also, if it requires a kernel rebuild (fairly easy), I can add a serial console and trap the output more effectively.

        Nigel.
Hi,

On Mon, Feb 12, 2001 at 05:34:07PM +0000, Nigel Metheringham wrote:
>
> OK, I can now reproduce this hang at will, purely by pulling the plug
> on my desktop when logged in and then rebooting - its a gnome desktop
> box with few partitions and ext3 on all of them, so I guess its getting
> a pile of gnome or ssh related sockets kept in /tmp which is on root
>
> Running e2fsck on it gives, after some other bits,
>   Clearing orphaned inode 666608 (uid=0, gid=0, mode=0100755, size=456328)
>   Clearing orphaned inode 53304 (uid=0, gid=0, mode=0100755, size=552935)
>   Truncating orphaned inode 10456 (uid=100, gid=500, mode=0140755, size=0)

Ahah. *THAT* is nasty. It's trying to truncate a socket. Needless to say, this is doomed to failure.

The orphan list is used for two things: to list orphans (inodes deleted but still in use), and to complete large truncates which were split over multiple transactions, in case the truncate got interrupted midway. We can tell the difference by looking at the link count on the inode: if there are no links to it left then it must be orphaned; otherwise it must have been a truncate which got interrupted.

So how on *earth* are we seeing a socket here with nlinks != 0? All of the places where we create orphans check that nlink==0 before adding to the orphan list.

OK, I'll go and poke at this for a bit, and I may end up giving you a booby-trapped debug build at some point to find out where this inode is getting onto the orphan list in the first place.

Thanks,
 Stephen
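As a rough illustration of the link-count rule described in the message above (this is not the actual ext3 or e2fsck code), the sketch below decodes the mode values from the e2fsck output and applies the rule. The names OrphanEntry, file_type and classify are hypothetical, and the link counts are assumptions, since the e2fsck lines quoted in this thread do not print them.

```python
import stat
from dataclasses import dataclass


@dataclass
class OrphanEntry:
    """Hypothetical stand-in for an inode found on the orphan list."""
    ino: int
    mode: int   # st_mode-style value; e2fsck prints it in octal
    nlink: int  # link count (assumed here, not shown in the e2fsck output)


def file_type(mode: int) -> str:
    """Name the file type encoded in the high bits of a mode value."""
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"


def classify(entry: OrphanEntry) -> str:
    """Apply the rule from the message: no links left means a true orphan,
    any remaining link means a truncate that was interrupted."""
    if entry.nlink == 0:
        return "orphan: deleted but still in use"
    return "interrupted truncate: finish the truncate"


# Inode numbers and modes are taken from the e2fsck output in this thread;
# the nlink values are assumptions chosen to match "Clearing" vs "Truncating".
entries = [
    OrphanEntry(ino=666608, mode=0o100755, nlink=0),
    OrphanEntry(ino=10456, mode=0o140755, nlink=1),
]
for e in entries:
    print(e.ino, file_type(e.mode), "->", classify(e))
```

Under these assumptions, mode 0140755 decodes as a socket, and a nonzero link count sends inode 10456 down the truncate branch, which is consistent with the "Truncating orphaned inode" line that Stephen flags.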
Hi,

On Mon, Feb 12, 2001 at 10:41:12PM +0000, Stephen C. Tweedie wrote:
>
> So how on *earth* are we seeing a socket here with nlinks != 0? All
> of the places where we create orphans check that the nlink==0 before
> adding to the orphan list.

Question: is this filesystem being exported via NFS?

--Stephen
It also happens on local disk file systems.

- Peter -

> -----Original Message-----
> From: ext3-users-admin@redhat.com
> [mailto:ext3-users-admin@redhat.com] On Behalf Of Stephen C. Tweedie
> Sent: Tuesday, February 13, 2001 6:42 AM
> To: Nigel Metheringham
> Cc: Stephen C. Tweedie; ext3-users@redhat.com
> Subject: Re: That darned orphaned socket hang
>
> Hi,
>
> On Mon, Feb 12, 2001 at 10:41:12PM +0000, Stephen C. Tweedie wrote:
> >
> > So how on *earth* are we seeing a socket here with nlinks != 0? All
> > of the places where we create orphans check that the nlink==0 before
> > adding to the orphan list.
>
> Question: is this filesystem being exported via NFS?
>
> --Stephen