Hi Stephen, hi ext3-users list!

We encountered a reproducible problem with ext3:
When issuing a FIBMAP ioctl for a block written right before while
the FS is under high load (RH build universe), the assertion
!journal->j_running_transaction fails at the bottom of journal_flush()
in fs/jbd/journal.c.
We encountered this problem with the arch=s390x (64 bit big endian)
bootloader zipl, I'll try to reproduce it with 2.4.latest on arch=i386.
I'll try to create a stack backtrace as well by inserting a BUG();.

Strace of problem:
ioctl(5, FIBMAP, 0x1ffffffe528) = 0
close(5) = 0
write(4, "\0\342\0\6\4\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
4096) = 4096
ioctl(4, FIBMAP <unfinished ...>
+++ killed by SIGSEGV +++

Syslog output of problem:
z02 kernel: Assertion failure in journal_flush() at journal.c:1198:
"!journal->j_running_transaction"

With kind regards

Carsten Otte
IBM Deutschland Entwicklung GmbH
Linux for eServer development - device driver team
Phone: +49/07031/16-4076
IBM internal phone: *120-4076
--
We are Linux. Resistance indicates that you're missing the point!

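For readers who want to try the sequence from the strace above, here is a minimal user-space sketch that writes one 4096-byte block and then asks for its physical block number with FIBMAP. The helper name write_then_fibmap() and its error handling are assumptions made for illustration; this is not zipl's code. FIBMAP normally needs root privileges, and a single call like this is unlikely to trigger the assertion without heavy concurrent load on the filesystem.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>    /* FIBMAP */
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one 4096-byte block to `path`, then map
 * logical block 0 of the file to a physical block with FIBMAP.
 * Returns 0 and stores the physical block in *physical on success,
 * or -1 with errno set on failure (*physical is left untouched).
 */
static int write_then_fibmap(const char *path, int *physical)
{
    char buf[4096];
    int block = 0;       /* in: logical block number, out: physical block */
    int fd, saved;
    ssize_t n;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    memset(buf, 0, sizeof buf);
    n = write(fd, buf, sizeof buf);
    if (n != (ssize_t)sizeof buf) {
        /* A short write does not set errno, so report EIO for it. */
        saved = (n < 0) ? errno : EIO;
        close(fd);
        errno = saved;
        return -1;
    }

    if (ioctl(fd, FIBMAP, &block) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    close(fd);
    *physical = block;
    return 0;
}
```

Calling write_then_fibmap("somefile", &blk) on an ext3 mount, as root, mirrors the write/FIBMAP pair in the trace; the file name is a placeholder.
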
On Jan 07, 2002 17:24 +0100, Carsten Otte wrote:
> We encountered a reproducible problem with ext3:
> When issuing a FIBMAP ioctl for a block written right before while
> the FS is under high load (RH build universe), the assertion
> !journal->j_running_transaction fails at the bottom of journal_flush()
> in fs/jbd/journal.c.
> We encountered this problem with the arch=s390x (64 bit big endian)
> bootloader zipl, I'll try to reproduce it with 2.4.latest on arch=i386.
> I'll try
> to create a stack backtrace as well by inserting a BUG();.
>
> Strace of problem:
> ioctl(5, FIBMAP, 0x1ffffffe528) = 0
> close(5) = 0
> write(4, "\0\342\0\6\4\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> 4096) = 4096
> ioctl(4, FIBMAP <unfinished ...>
> +++ killed by SIGSEGV +++
>
> Syslog output of problem:
> z02 kernel: Assertion failure in journal_flush() at journal.c:1198:
> "!journal->j_running_transaction"

Hmm, you should get an oops and stack trace with an assertion (see
J_ASSERT macro in include/linux/jbd.h). Maybe it is something with the
S/390 BUG macro that is different?

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

Hi,

On Mon, Jan 07, 2002 at 05:24:36PM +0100, Carsten Otte wrote:
>
> We encountered a reproducible problem with ext3:
> When issuing a FIBMAP ioctl for a block written right before while
> the FS is under high load (RH build universe), the assertion
> !journal->j_running_transaction fails at the bottom of journal_flush()
> in fs/jbd/journal.c.
> We encountered this problem with the arch=s390x (64 bit big endian)
> bootloader zipl, I'll try to reproduce it with 2.4.latest on arch=i386.
> I'll try
> to create a stack backtrace as well by inserting a BUG();.

Does the patch below fix it? There was one path through the
transaction startup code where we could drop the journal lock without
retesting the barrier which protects that FIBMAP special case.

Cheers,
 Stephen

Index: fs/jbd/transaction.c
==================================================================
RCS file: /cvsroot/gkernel/ext3/fs/jbd/transaction.c,v
retrieving revision 1.64.2.5
retrieving revision 1.64.2.6
diff -c -r1.64.2.5 -r1.64.2.6
*** fs/jbd/transaction.c	2001/11/18 03:46:45	1.64.2.5
--- fs/jbd/transaction.c	2002/01/07 19:33:24	1.64.2.6
***************
*** 97,102 ****
--- 97,104 ----
  
  	lock_journal(journal);
  
+ repeat_locked:
+ 
  	if (is_journal_aborted(journal) ||
  	    (journal->j_errno != 0 && !(journal->j_flags & JFS_ACK_ERR))) {
  		unlock_journal(journal);
***************
*** 110,116 ****
  		goto repeat;
  	}
  
- repeat_locked:
  	if (!journal->j_running_transaction)
  		get_transaction(journal, 0);
  	/* @@@ Error? */
--- 112,117 ----
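
The idea behind the patch can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions, not kernel code: struct toy_journal, toy_start_handle(), raise_barrier() and the retest_barrier switch are all hypothetical names invented here, the lock is modelled as a plain flag, and the "other task" is simulated by a hook that runs while the lock is dropped. With retest_barrier set, the function jumps back above the barrier check after retaking the lock, which is what moving the repeat_locked label achieves; with it clear, the function starts a transaction even though the barrier was raised in the meantime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the journal; NOT the kernel's journal_t. */
struct toy_journal {
    bool locked;              /* models lock_journal()/unlock_journal() */
    int  barrier_count;       /* > 0 while a flush holds the barrier */
    bool log_full;            /* forces one pass through the sleep path */
    bool running_transaction; /* models j_running_transaction != NULL */
};

enum start_result { STARTED, BLOCKED_BY_BARRIER };

/* Runs while the lock is dropped; simulates what another task may do. */
typedef void (*unlocked_hook)(struct toy_journal *);

/* Example hook: another task begins a flush and raises the barrier. */
static void raise_barrier(struct toy_journal *j)
{
    j->barrier_count++;
}

static enum start_result toy_start_handle(struct toy_journal *j,
                                          unlocked_hook hook,
                                          bool retest_barrier)
{
    assert(!j->locked);
    j->locked = true;

repeat_locked:
    /* Barrier check: no new transaction may start while it is raised. */
    if (j->barrier_count > 0) {
        j->locked = false;
        return BLOCKED_BY_BARRIER; /* real code would sleep and retry */
    }

    if (j->log_full) {
        /* Drop the lock while waiting; anything can change meanwhile. */
        j->locked = false;
        if (hook != NULL)
            hook(j);
        j->log_full = false;
        j->locked = true;

        if (retest_barrier)
            goto repeat_locked;    /* patched: look at the barrier again */
        /* unpatched: fall through without retesting the barrier */
    }

    j->running_transaction = true;
    j->locked = false;
    return STARTED;
}
```

Calling toy_start_handle(&j, raise_barrier, false) on a journal with log_full set returns STARTED while barrier_count is 1, which is the state the flush path asserts against; passing true returns BLOCKED_BY_BARRIER and leaves running_transaction unset.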