thr3ads.net - Ext3 users - Assert in jbd-kernel.c [Oct 2001]

Steve R. Hastings

2001-Oct-09 22:15 UTC

Assert in jbd-kernel.c

Hello.  I have installed the ext3 file system on a test system, and 
sometimes I have a problem: I get an assert from within jbd-kernel.c, 
and whatever program was writing to the disk when this happens is unable 
to continue.

The system is a server I built, which I named "dax".  It is running 
Debian unstable, and I updated it to all the latest packages in Debian 
unstable as of today.  It is running a Linux 2.4.10 kernel.  On the 
Linux Weekly News page I saw someone offering 2.4.10 sources pre-patched 
for both ext3 and kernel preemption, so I built from those sources.

http://lwn.net/2001/0927/a/ext3-preempt.php3
http://lameter.com/kernel/linux-2.4.10-ext3-preempt.tar.gz


I enabled both ext3 and kernel preemption.  The server is running Linux 
software RAID, using RAID 1 (mirroring) on both ext3 filesystems.  I 
have seen the same problem and assert several times now.  Whenever I see 
this happen, I always reboot the system, since I am not sure how serious 
the problem is.



The assert text is as follows:

-- cut here -- cut here -- cut here -- cut here -- cut here --
Message from syslogd@dax at Tue Oct  9 12:07:47 2001 ...
dax kernel: Assertion failure in jbd_preclean_buffer_check() at jbd-kernel.c:80:
 "(((bh)->b_state & (1UL << BH_Dirty)) != 0)"
-- cut here -- cut here -- cut here -- cut here -- cut here --



Here is line 80 from jbd-kernel.c:

-- cut here -- cut here -- cut here -- cut here -- cut here --
                       J_ASSERT_JH(jh, buffer_dirty(bh));
-- cut here -- cut here -- cut here -- cut here -- cut here --
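
The expression in the assert text is what buffer_dirty(bh) expands to: a test of one bit in the buffer's b_state word. Below is a minimal user-space sketch of that test, not the kernel code itself. The struct is a stripped-down stand-in, the bit numbers are assumptions for illustration, and preclean_check_sketch() is a hypothetical helper that reports the failure instead of halting.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Assumed bit numbers, for illustration only; the real values come from
 * the kernel headers of the tree being built. */
enum { BH_Uptodate = 0, BH_Dirty = 1 };

/* Hypothetical, stripped-down stand-in for the kernel's struct buffer_head. */
struct buffer_head {
    unsigned long b_state;
};

/* Same expression as the one quoted in the assertion message. */
static int buffer_dirty(const struct buffer_head *bh)
{
    /* Guard the shift: the bit number must fit in an unsigned long. */
    assert((size_t)BH_Dirty < sizeof(unsigned long) * CHAR_BIT);
    return ((bh)->b_state & (1UL << BH_Dirty)) != 0;
}

/* Hypothetical checker: reports the failure and returns -1 instead of
 * stopping the caller the way the kernel assertion does. */
static int preclean_check_sketch(const struct buffer_head *bh)
{
    if (!buffer_dirty(bh)) {
        fprintf(stderr,
                "Assertion failure: buffer is not dirty (b_state=%#lx)\n",
                bh->b_state);
        return -1;
    }
    return 0;
}
```

With these assumed bit numbers, a buffer whose b_state has only the up-to-date bit set makes preclean_check_sketch() return -1, which is the condition the assertion reported on dax.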



Then on the reboot, the boot message included this:

-- cut here -- cut here -- cut here -- cut here -- cut here --
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
(recovery.c, 253): journal_recover: JBD: recovery, exit status 0, recovered transactions 20383 to 20655
(recovery.c, 255): journal_recover: JBD: Replayed 5021 and revoked 37/134 blocks
kjournald starting.  Commit interval 5 seconds
EXT3-fs: md(9,0): orphan cleanup on readonly fs
ext3_orphan_cleanup: truncating inode 100113 to 894 bytes
EXT3-fs: md(9,0): 1 truncate cleaned up
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 204k freed
Unable to find swap-space signature
Adding Swap: 249440k swap-space (priority -1)
EXT3 FS 2.4-0.9.9, 5 Sep 2001 on md(9,0), internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.9, 5 Sep 2001 on md(9,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
-- cut here -- cut here -- cut here -- cut here -- cut here --


The problem happened again today, as I was using aptitude(1) to update 
the packages on the system.  aptitude downloaded all the packages 
successfully, but as it began to unpack them, the error occurred. 
Unpacking many packages does cause a lot of disk activity, so perhaps 
the problem is related to heavy disk activity.  I rebooted, and again 
ran aptitude; it managed to unpack a few more packages when the error 
occurred again.  I rebooted, tried again, hit the problem again, 
rebooted, tried again, and finished the package unpacking and 
installation without further errors.  (It made progress on unpacking the 
packages each time, and the last time it had only a few packages left.)

I would like to help in finding and fixing the problem, if I can.  If 
there is some sort of extra logging or debugging that I can enable, I am 
very willing to do it.  I have no experience with kernel or file system 
debugging, but I am an experienced software engineer, and I can spare 
some time to work on this.

-- 
Steve R. Hastings		"Vita est"
steve@hastings.org		http://www.blarg.net/~steveha

Stephen C. Tweedie

2001-Oct-10 09:57 UTC

Re: Assert in jbd-kernel.c

Hi,

On Tue, Oct 09, 2001 at 03:15:45PM -0700, Steve R. Hastings wrote:
> Hello.  I have installed the ext3 file system on a test system, and 
> sometimes I have a problem: I get an assert from within jbd-kernel.c, 
> and whatever program was writing to the disk when this happens is unable 
> to continue.
> 
> The system is a server I built, which I named "dax".  It is running 
> Debian unstable, and I updated it to all the latest packages in Debian 
> unstable as of today.  It is running a Linux 2.4.10 kernel.  On the 
> Linux Weekly News page I saw someone offering 2.4.10 sources pre-patched 
> for both ext3 and kernel preemption, so I built from those sources.
> 
> http://lwn.net/2001/0927/a/ext3-preempt.php3
> http://lameter.com/kernel/linux-2.4.10-ext3-preempt.tar.gz
Could you please try to reproduce this with a couple of different
kernels?  First, if you could rebuild with CONFIG_BUFFER_DEBUG and
CONFIG_JBD_DEBUG set and mail me the log if the problem recurs, that
will give me a much better idea of what happened.  Second, you should
also try the current -ac kernel.

I've no idea whether this is a genuine ext3 bug, an interaction with
the huge vm/vfs changes in 2.4.10, or a preempt kernel interaction.
The tests above should help to narrow this down.

Thanks,
 Stephen
