Because of strange reasons my notebook sometimes crashes shortly after startup (that's not ext3's fault, maybe memory? When I wait several minutes it works without problems).

The problem is that after 3 crashes at startup, when my notebook finally worked, I got the messages:

Sep 23 23:29:17 blackbox kernel: EXT3-fs warning (device ide0(3,3)): ext3_clear_journal_err: detected journal error -5 from previous mount
Sep 23 23:29:17 blackbox kernel: EXT3-fs: ide0(3,3): orphan cleanup on readonly fs
Sep 23 23:29:17 blackbox kernel: ext3_orphan_cleanup: deleting unreferenced inode 97540
Sep 23 23:29:17 blackbox kernel: ext3_orphan_cleanup: deleting unreferenced inode 97538
Sep 23 23:29:17 blackbox kernel: EXT3-fs: ide0(3,3): 2 orphan inodes deleted

The problem is that I found a part of my dpkg package list in my motd and the first line of the motd in my resolv.conf :/ (I haven't found any other corrupted files yet).

If ext3 detects a journal error, why does it still use the journal (it did a fsck after the recovery)?

Why are files corrupted which I don't edit very often (motd, the dpkg list; I changed the resolv.conf before the crashes)?

I am using ext3-2.4-0.9.6-249.gz

cu
/gst

btw: I'm not subscribed to the list, pls cc replies to me. (gst@sysfrog.org)

On Sep 24, 2001 19:52 +0200, Guenther Starnberger wrote:
> the problem is that after 3 crashes at startup, when my notebook finally
> worked i got the msg:
>
> Sep 23 23:29:17 blackbox kernel: EXT3-fs warning (device ide0(3,3)):
> ext3_clear_journal_err: detected journal error -5 from previous mount
> Sep 23 23:29:17 blackbox kernel: EXT3-fs: ide0(3,3): orphan cleanup on
> readonly fs
> Sep 23 23:29:17 blackbox kernel: ext3_orphan_cleanup: deleting unreferenced
> inode 97540
> Sep 23 23:29:17 blackbox kernel: ext3_orphan_cleanup: deleting unreferenced
> inode 97538
> Sep 23 23:29:17 blackbox kernel: EXT3-fs: ide0(3,3): 2 orphan inodes deleted
>
> the problem is that if found a part of my dpkg package list in my motd and
> the first line of the motd in my resolv.conf :/ (i haven't found any other
> corrupted files yet)
>
> if ext3 detects a journal error - why does it still use the journal (it did a
> fsck after the recovering)?

Hmm, I don't know. I have never seen a journal error. In this case, -5 means EIO (input/output error). Stephen will know the most about the error handling (he wrote it).

> why are files corrupted which i don't edit very often (motd, the dpkg list, i
> changed the resolv.conf before the crashes).
>
> i am using ext3-2.4-0.9.6-249.gz

What kernel are you using, and which version of e2fsck? Also which journal mode do you use (data=ordered (default), data=writeback, data=journal)?

Cheers, Andreas
--
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert
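
As an aside on the "-5 means EIO" remark above: kernel code reports errors as negative errno values, so the number in the warning can be looked up mechanically. The following is only a small sketch; the function name `decode_journal_error` and the regular expression are made up for illustration and match only the exact wording of the warning quoted in this thread.

```python
import errno
import os
import re

# Matches only the wording quoted in this thread (an assumption; other
# kernel versions may phrase the warning differently).
JOURNAL_ERR = re.compile(r"detected journal error (-?\d+) from previous mount")


def decode_journal_error(log_line):
    """Return (code, errno name, description), or None if the line does not match."""
    match = JOURNAL_ERR.search(log_line)
    if match is None:
        return None
    code = int(match.group(1))
    # The kernel logs a negative errno, so drop the sign before the lookup.
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return code, name, os.strerror(number)


line = ("Sep 23 23:29:17 blackbox kernel: EXT3-fs warning (device ide0(3,3)): "
        "ext3_clear_journal_err: detected journal error -5 from previous mount")
print(decode_journal_error(line))
```

The description string comes from the local C library, so its exact wording depends on the platform the script runs on.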

Hi,

On Mon, Sep 24, 2001 at 07:52:08PM +0200, Guenther Starnberger wrote:
> the problem is that after 3 crashes at startup, when my notebook finally
> worked i got the msg:
>
> Sep 23 23:29:17 blackbox kernel: EXT3-fs warning (device ide0(3,3)):
> ext3_clear_journal_err: detected journal error -5 from previous mount

Nice, that's the first report I've had of that code working in the field. :-)

What has happened is that ext3 has, on a previous mount, detected a fatal IO error (EIO) during journal operations, and has taken the entire journal offline as a result, marking the journal with the error code. On the subsequent mount, e2fsck detected the error marked in the journal, and forced a full fsck of the filesystem. So far, that's all working as expected.

> the problem is that if found a part of my dpkg package list in my motd and
> the first line of the motd in my resolv.conf :/ (i haven't found any other
> corrupted files yet)

I have had precisely 3 other reports of weird data corruption on recent kernels, all of which were on laptops. It really sounds as if there's dodgy hardware involved here. Some of those prior cases seem to go away if you avoid the suspend-to-ram function, by the way.

> if ext3 detects a journal error - why does it still use the journal (it did a
> fsck after the recovering)?

The journal, up to the point at which the error occurred, should still be valid and may well contain information which is _much_ more uptodate than that in the rest of the filesystem. After the error is detected in the journal, we absolutely refuse to generate any new commit records so there is no way for the IO which was in progress at the time to be replayed, so we're not in danger of recovering the data being logged at the time of the error.

> why are files corrupted which i don't edit very often (motd, the dpkg list, i
> changed the resolv.conf before the crashes).

I've got fairly cast-iron evidence of at least one laptop disk drive writing data to the wrong blocks on disk under load. It could be that. If you're getting weird crashes then basically any part of memory might be getting corrupted, which can confuse any filesystem about where to write to disk. It's not usually particularly easy to track down a specific chain of events leading to the corruption when there is bad hardware involved.

Cheers,
 Stephen
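
To illustrate the point above about commit records: a journal replay applies only transactions that have a commit record, so whatever was being logged when the error hit is never replayed. This is a deliberately simplified toy model, not the real JBD code; the `Transaction` class, the `replay` function and the dictionary standing in for the disk are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    tid: int
    blocks: dict             # block number -> new contents (toy representation)
    committed: bool = False  # True once a commit record has been written


def replay(journal, disk):
    """Apply committed transactions in order and stop at the first uncommitted one."""
    replayed = []
    for txn in journal:
        if not txn.committed:
            # No commit record: this transaction and anything after it is ignored.
            break
        disk.update(txn.blocks)
        replayed.append(txn.tid)
    return replayed


disk = {10: "old-a", 11: "old-b"}
journal = [
    Transaction(1, {10: "new-a"}, committed=True),
    Transaction(2, {11: "new-b"}, committed=True),
    Transaction(3, {10: "half-written"}),  # in flight when the IO error occurred
]
print(replay(journal, disk), disk)
```

In this model, transaction 3 never reaches the disk because the journal stopped issuing commit records once the error was seen, which is the behaviour described in the reply above.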

On Mon, Sep 24, 2001 at 07:52:08PM +0200, Guenther Starnberger wrote:
> the problem is that after 3 crashes at startup, when my notebook finally
> worked i got the msg:
>
> Sep 23 23:29:17 blackbox kernel: EXT3-fs warning (device ide0(3,3)):
> ext3_clear_journal_err: detected journal error -5 from previous mount
> Sep 23 23:29:17 blackbox kernel: EXT3-fs: ide0(3,3): orphan cleanup on
> readonly fs
> Sep 23 23:29:17 blackbox kernel: ext3_orphan_cleanup: deleting unreferenced
> inode 97540
> Sep 23 23:29:17 blackbox kernel: ext3_orphan_cleanup: deleting unreferenced
> inode 97538
> Sep 23 23:29:17 blackbox kernel: EXT3-fs: ide0(3,3): 2 orphan inodes deleted
>
> if ext3 detects a journal error - why does it still use the journal
> (it did a fsck after the recovering)?

The kernel message is a little misleading (Stephen, we should fix that). What it means is that the kernel noticed a problem in the filesystem on a previous mount, but since it can't necessarily guarantee that a modification to the superblock will get flushed to disk, it writes an indication of a filesystem error in the journal superblock. When the filesystem is mounted or fsck'ed (assuming the use of a reasonably modern e2fsprogs), the error condition in the journal superblock is propagated to the EXT2_ERROR_FS bit in the ext3 superblock.

So the journal is still valid, and in fact you really do want to run the journal before running fsck to recover the filesystem.

> why are files corrupted which i don't edit very often (motd, the
> dpkg list, i changed the resolv.conf before the crashes).

Well, ext2 and ext3 don't modify or move files that haven't been changed, so the only explanation is some kind of hardware error (which might be also causing the filesystem corruption), or some process you didn't know about actually has modified the dpkg list and/or the motd file. For example, with many distributions the motd file is edited so that the first line contains the kernel version of the current booted kernel; so the boot scripts do modify motd.

 - Ted
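
The propagation described above (error noted in the journal superblock, then carried over to the EXT2_ERROR_FS bit at the next mount or fsck) can be pictured with a small model. This is a sketch under assumptions, not the kernel or e2fsprogs code: the dictionaries, their keys and the function names are invented for illustration, the flag values are as I recall them from the ext2 headers, and clearing the journal's error after copying it is my reading of the name `ext3_clear_journal_err`.

```python
# State flag values as recalled from the ext2 headers (treat as an assumption).
EXT2_VALID_FS = 0x0001  # filesystem was unmounted cleanly
EXT2_ERROR_FS = 0x0002  # errors were detected


def propagate_journal_error(journal_sb, fs_sb):
    """Copy an error recorded in the journal superblock into the fs state flags.

    Both arguments are plain dicts standing in for the on-disk superblocks
    (hypothetical layout). Returns the recorded errno, or 0 if there was none.
    """
    err = journal_sb.get("errno", 0)
    if err:
        fs_sb["state"] |= EXT2_ERROR_FS
        # Assumed: the journal's copy is cleared once the flag is set.
        journal_sb["errno"] = 0
    return err


def needs_full_fsck(fs_sb):
    """A checker would force a full pass while the error bit is set."""
    return bool(fs_sb["state"] & EXT2_ERROR_FS)


journal_sb = {"errno": -5}          # EIO noted during the previous mount
fs_sb = {"state": EXT2_VALID_FS}
print(propagate_journal_error(journal_sb, fs_sb), needs_full_fsck(fs_sb))
```

The model shows why the journal itself stays usable: the error is just a marker that gets moved into the filesystem superblock, where fsck looks for it.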

Hi,

On Mon, Sep 24, 2001 at 11:23:26PM -0400, Theodore Tso wrote:
> On Mon, Sep 24, 2001 at 07:52:08PM +0200, Guenther Starnberger wrote:
> >
> > Sep 23 23:29:17 blackbox kernel: EXT3-fs warning (device ide0(3,3)):
> > ext3_clear_journal_err: detected journal error -5 from previous mount

> The kernel message is a little misleading (Stephen, we should fix
> that).

Will do --- suggestions? "Error (EIO) recorded from previous mount, filesystem should be checked" or something similar?

--Stephen