Marc Olzheim
2005-Jul-21 09:45 UTC
background fsck, softupdates & inconsistent state on disk
Hi.

Having enough opportunities to do crash recovery with kern/83375 open and some of my services not yet moved back to FreeBSD 4, I noticed that it often crashes just after (or perhaps during) the mirroring of a directory tree. The mirroring involves creating a directory with 80 subdirectories in it.

Now when the machine panics on a 'screen' again, background fsck fails to properly check the filesystem and reports so in /var/log/messages. What I see on that partition is the main directory that should have contained the 80 subdirs, but now it has a link count of 0 and so doesn't even contain a . or .. entry, let alone the 80 directories that should have been there.

The only thing a manual fsck can do after that is unlink the unreferenced inodes and clean up the mess...

Shouldn't this be impossible without power loss? Or is it inherent to SMP that the machine can crash on a process on CPU #0 while CPU #1 is updating disk structures?

Anyway, as soon as the migration of production services suffering from kern/83375 back to 4.x is done, I should have a 5.x test machine ready to crash whenever people want, so I can get debug output out of it. If anyone could tell me how to get it and what they need, I'd be happy to provide it.

Marc
Marc Olzheim
2005-Aug-01 08:58 UTC
background fsck, softupdates & inconsistent state on disk
On Thu, Jul 21, 2005 at 11:45:33AM +0200, Marc Olzheim wrote:
> Having enough opportunities to do crash recovery with kern/83375 open
> and some of my services not yet moved back to FreeBSD 4, I noticed that
> often it crashes just after (or perhaps during) mirroring of a directory
> tree. The mirroring involves creating a directory with 80
> subdirectories in it.
>
> Now when the machine panics on a 'screen' again, background fsck fails
> to properly check the filesystem and reports so in /var/log/messages.
> What I see on that partition is the main directory that should have
> contained the 80 subdirs, but now it has a link count of 0 and so
> doesn't even contain a . or .. , let alone the 80 directories that
> should have been there.
>
> The only thing a manual fsck can do after that is unlink the
> unreferenced inodes and clear up the mess...

Ok, this time it's worse. Trying to start up in single-user mode gives:

WARNING: / was not properly dismounted
start_init: trying /sbin/init
WARNING: R/W mount of / denied. Filesystem is not clean - run fsck
WARNING: R/W mount of / denied. Filesystem is not clean - run fsck
WARNING: R/W mount of / denied. Filesystem is not clean - run fsck
...

And it won't snap out of it...

Luckily we start a 'remote control daemon' before fsck is started, so I managed to run 'fsck /' manually:

rcntlc> shell sh -i
Press CTRL-A to escape from this mode
sh: can't access tty; job control turned off
# fsck /
** /dev/da0s1a
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
2174 files, 29734 used, 97105 free (225 frags, 12110 blocks, 0.2% fragmentation)

***** FILE SYSTEM MARKED CLEAN *****
#

So I'm not sure what the problem was...

> Shouldn't this be impossible without power loss ? Or is it inherent to
> SMP that the machine can crash on a process on CPU #0 while CPU #1 is
> updating disk structures ?

Marc