While compiling two kernels and untarring a third, my root fs was
remounted r/w and I got the following in dmesg (kernel 2.4.19-pre9):

EXT3-fs error (device ide0(3,2)) in ext3_new_inode: error 28
Aborting journal on device ide0(3,2).
ext3_abort called
EXT3-fs abort (device ide0(3,2)): ext3_journal_start: Detected aborted journal.
Remounting filesystem read-only
Remounting filesystem read-only
EXT3-fs error (device ide0(3,2)) in start_transaction: Journal has aborted
EXT3-fs error (device ide0(3,2)) in start_transaction: Journal has aborted
EXT3-fs error (device ide0(3,2)) in ext3_create: IO failure
EXT3-fs error (device ide0(3,2)) in start_transaction: Journal has aborted
EXT3-fs error (device ide0(3,2)) in start_transaction: Journal has aborted
...

Rebooting didn't help: the journal aborted immediately. I also had
some trouble using rootfstype=ext2 because (detecting that the
filesystem was in a bad state when shutdown) it refused to mount the
root fs ext2! Anyway, I finally tricked it into remounting as ext2.
It then became clear that the disk was full. After removing a heap
of files (possible because I was using ext2; not possible with ext3
because of the instant remounting as read-only) and rebooting, all
was well.

The moral of the story seems to be: ext3 behaves in an inelegant way
when the disk is full. Is this inevitable for a journalling file
system? If not, I for one would be very happy if ext3 (which otherwise
I have been very happy with) behaved a little nicer in this case...

All the best,

Duncan.
Summary: not a problem with the disk being full. This is bad.

Ok, I did some more tests. Results are for 2.4.19-pre10.

I applied Andrew's patch and filled up my disk 100% by copying a huge
file to the partition. No journal abort occurred. I then did the same
thing without Andrew's patch. No journal abort occurred either! (I
observed no other problems either).

Now, the abort I reported occurred when I was simultaneously compiling
two kernels while untarring two others. So I became suspicious that
maybe the problem came from heavy system load rather than the disk
being full. I performed the following test:

(1) booted 2.4.19-pre10 with mem=50M (because I have a bad bit around
    105M).
(2) while printing disk output (df) every 5 seconds simultaneously did:
    (a) untarred three kernels (tar xj)
    (b) compiled two kernels

After about 5-10 minutes of this I got a journal abort while the
filesystem was only 70% full (same message as in my original email:
error 28 in ext3_new_inode).

Any thoughts? I've attached some system info to the email.

Duncan.

PS: I use devfs.

On Tuesday 04 June 2002 1:06 am, Andrew Morton wrote:
> Andreas Dilger wrote:
> > On Jun 03, 2002 23:04 +0200, Duncan Sands wrote:
> > > While compiling two kernels and untarring a third, my root fs was
> > > remounted r/w and I got the following in dmesg (kernel 2.4.19-pre9):
> >
> >             ^^^ r/o I presume...
> >
> > > EXT3-fs error (device ide0(3,2)) in ext3_new_inode: error 28
> > > Aborting journal on device ide0(3,2).
> > > ext3_abort called
> >
> > This is a known error, and I thought a fix was submitted by Andrew
> > and/or Stephen. It should not cause a filesystem error just because
> > the filesystem was full.
>
> Memory fails me... But no, we shouldn't be treating ENOSPC in that
> manner. How about this?
>
> --- linux-2.5.20/fs/ext3/ialloc.c Wed May 29 12:48:15 2002
> +++ 25/fs/ext3/ialloc.c Mon Jun 3 16:05:36 2002
> @@ -534,7 +534,8 @@ repeat:
>  fail:
>      unlock_super(sb);
>      iput(inode);
> -    ext3_std_error(sb, err);
> +    if (err != -ENOSPC)
> +        ext3_std_error(sb, err);
>      return ERR_PTR(err);
>  }
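
[Editorial illustration, not part of the original messages.] The "error 28" in the dmesg output is ENOSPC, and the quoted diff makes the fail path of ext3_new_inode skip ext3_std_error for that one code. The following is a minimal userspace sketch of that condition in Python, assuming kernel-style negative errno values. The name should_flag_fs_error is hypothetical and exists only for this example; nothing here is kernel code.

```python
import errno
import os


def should_flag_fs_error(err: int) -> bool:
    """Model of the condition in the quoted diff (hypothetical helper).

    Takes a kernel-style negative errno and returns True if the error
    would still be passed on to ext3_std_error, i.e. anything except
    -ENOSPC.
    """
    return err != -errno.ENOSPC


if __name__ == "__main__":
    # Compare the out-of-space case with a genuine I/O error.
    for err in (-errno.ENOSPC, -errno.EIO):
        name = errno.errorcode[-err]
        print(f"{name} ({-err}): {os.strerror(-err)}; "
              f"flag as fs error: {should_flag_fs_error(err)}")
```

With the patched condition, running out of space is simply returned to the caller as an error, while other failures such as EIO still go through the filesystem error path.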
Hi,

On Mon, Jun 03, 2002 at 10:59:42PM +0200, Duncan Sands wrote:
> While compiling two kernels and untarring a third, my root fs was remounted r/w
> and I got the following in dmesg (kernel 2.4.19-pre9):
>
> EXT3-fs error (device ide0(3,2)) in ext3_new_inode: error 28

Known problem, fixed in ext3 CVS and in -ac. If you run out of free
inodes (not free disk blocks), ext3 erroneously considers it to be an
IO error...

> Aborting journal on device ide0(3,2).
> ext3_abort called
> EXT3-fs abort (device ide0(3,2)): ext3_journal_start: Detected aborted journal.
> Remounting filesystem read-only
> Remounting filesystem read-only
> EXT3-fs error (device ide0(3,2)) in start_transaction: Journal has aborted
> EXT3-fs error (device ide0(3,2)) in start_transaction: Journal has aborted
> EXT3-fs error (device ide0(3,2)) in ext3_create: IO failure
> EXT3-fs error (device ide0(3,2)) in start_transaction: Journal has aborted
> EXT3-fs error (device ide0(3,2)) in start_transaction: Journal has aborted

...and if you set the filesystem's on-error behaviour to be
remount-readonly, this is what happens.

--Stephen
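
[Editorial illustration, not part of the original messages.] Stephen distinguishes free inodes from free disk blocks. Plain df reports block usage, so a filesystem can look far from full while its inode table is exhausted (df -i shows the inode side). The following is a rough Python sketch that reads both figures through os.statvfs. The helper names pct_used and fs_usage are made up for this example, and the sketch assumes a Unix-like system where os.statvfs is available.

```python
import os


def pct_used(total: int, free: int) -> float:
    """Percentage of a resource in use; 0.0 if no total is reported."""
    if total <= 0:
        return 0.0
    return 100.0 * (total - free) / total


def fs_usage(path: str = "/") -> dict:
    """Block and inode usage for the filesystem containing path."""
    st = os.statvfs(path)
    return {
        # f_blocks/f_bfree count data blocks, f_files/f_ffree count inodes.
        "blocks_used_pct": pct_used(st.f_blocks, st.f_bfree),
        "inodes_used_pct": pct_used(st.f_files, st.f_ffree),
        "inodes_free": st.f_ffree,
    }


if __name__ == "__main__":
    usage = fs_usage("/")
    print(f"blocks used: {usage['blocks_used_pct']:.1f}%")
    print(f"inodes used: {usage['inodes_used_pct']:.1f}% "
          f"({usage['inodes_free']} free)")
```

Calling fs_usage in a loop alongside a workload that creates many small files, such as untarring kernel trees, would show whether inodes run out before blocks do.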