thr3ads.net - Ext3 users - kjournald panic in 2.4.20 RedHat 7.2 [Apr 2003]

Michael Harris

2003-Apr-18 04:30 UTC

kjournald panic in 2.4.20 RedHat 7.2

Hi, if this is a redundant post I apologize. I am running 2.4.20 on what has
been a very stable Athlon machine for months. I tried to move a 2 GB file from
an ext2 partition to an ext3 partition and kjournald crashed. Here are the last
remnants of my shell scrollback:

[*ROOT* mofo /mnt/sda1/mysql/fd 641 ] ll oldmail/
total 2363288
-rw-rw----    1 mysql    mysql    2147483647 Jan 23 18:04 maillog.MYD
-rw-rw----    1 mysql    mysql    270138368 Jan 23 18:06 maillog.MYI
-rw-rw----    1 mysql    mysql        8910 Mar 22  2002 maillog.frm
[*ROOT* mofo /mnt/sda1/mysql/fd 642 ] df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda1              8064272   4529888   3124732  60% /
/dev/hda3             29387900   1488316  26406744   6% /home
none                    127884         0    127884   0% /dev/shm
/dev/sda1             33032196  30162240   1191972  97% /mnt/sda1
/dev/sda3            151195204 138014604   5500328  97% /mnt/sda3
/dev/sda4            193010776  75750204 107456104  42% /mnt/sda4
[*ROOT* mofo /mnt/sda1/mysql/fd 643 ] mv oldmail/* /mnt/sda4/mgh/oldmysqllogs/
Segmentation fault
[*ROOT* mofo /mnt/sda1/mysql/fd 644 ]
Message from syslogd@mofo at Thu Apr 17 21:40:13 2003 ...
mofo kernel: Assertion failure in journal_stop() at transaction.c:1384:
"journal_current_handle() == handle"

[*ROOT* mofo /mnt/sda1/mysql/fd 644 ]
[*ROOT* mofo /mnt/sda1/mysql/fd 644 ] fg

Anything accessing /mnt/sda4 hung at this point (smbd among others) and I could
not cleanly shut down the machine. Finally a umount -km /mnt/sda3 (not sda4)
killed lots of procs, among them sshd, and it is game over until a guy gets
onsite to hit the reset button.

I can't access the machine at the moment, but this looks like a hot list so I am
posting what I can. It is an Athlon XP 2000+ with 256 MB DDR (not certain on
speed, definitely an Athlon XP) running straight 2.4.20 from the bz2 at
ftp.kernel.org, without module support, compiled for Athlon, with ext3 compiled
in statically. Again, this has been acting as a MySQL server for months without
a hitch. It is a RedHat 7.2 dist with all the updates as of about one month ago
installed, less the custom kernel. The file I was moving, as you can see, is a
2 GB file, i.e. right at the limit of ext2 capacity, and I am wondering if this
is the culprit.

Here is what was logged before I lost the machine:

Apr 17 21:40:13 mofo kernel: kernel BUG at transaction.c:1384!
Apr 17 21:40:13 mofo kernel: invalid operand: 0000
Apr 17 21:40:13 mofo kernel: CPU:    0
Apr 17 21:40:13 mofo kernel: EIP:    0010:[journal_stop+108/560]    Not tainted
Apr 17 21:40:13 mofo kernel: EIP:    0010:[<c0158eec>]    Not tainted
Apr 17 21:40:13 mofo kernel: EFLAGS: 00010282
Apr 17 21:40:13 mofo kernel: eax: 00000063   ebx: 00000001   ecx: 00000009  
edx: c831bf44
Apr 17 21:40:13 mofo kernel: esi: cdcc7a40   edi: c3739e80   ebp: ccd18ec0  
esp: c69e9a00
Apr 17 21:40:13 mofo kernel: ds: 0018   es: 0018   ss: 0018
Apr 17 21:40:13 mofo kernel: Process mv (pid: 8133, stackpage=c69e9000)
Apr 17 21:40:13 mofo kernel: Stack: c03250a0 c0320f67 c0320d18 00000568 c0327540
00000000 00000000 c3739e80
Apr 17 21:40:13 mofo kernel:        cda5e900 c3739e80 c0152617 c3739e80 00000000
c0158935 cbc83930 00000000
Apr 17 21:40:13 mofo kernel:        c313bc90 cdcc7a40 ca39fec0 ccd18ec0 cda5e900
cc283600 00000007 c013e3ce
Apr 17 21:40:13 mofo kernel: Call Trace:    [ext3_dirty_inode+199/256]
[journal_get_undo_access+245/288] [__mark_inode_dirty+46/144]
[ext3_new_block+112/1936] [journal_cancel_revoke+251/368]
Apr 17 21:40:13 mofo kernel: Call Trace:    [<c0152617>]
[<c0158935>] [<c013e3ce>] [<c014d370>] [<c015ca9b>]
Apr 17 21:40:13 mofo kernel:   [do_get_write_access+1183/1216]
[journal_dirty_metadata+398/432] [ext3_do_update_inode+759/896]
[ext3_do_update_inode+852/896] [ip_nat_fn+467/480] [ipt_hook+28/32]
Apr 17 21:40:13 mofo kernel:   [<c015861f>] [<c0158c8e>]
[<c0152117>] [<c0152174>] [<c02cfe53>] [<c02cfb2c>]
Apr 17 21:40:13 mofo kernel:   [journal_cancel_revoke+251/368]
[do_get_write_access+1183/1216] [tcp_packet+309/336]
[journal_get_write_access+55/80] [journal_cancel_revoke+251/368]
[do_get_write_access+1183/1216]
Apr 17 21:40:13 mofo kernel:   [<c015ca9b>] [<c015861f>]
[<c02cbf85>] [<c0158677>] [<c015ca9b>] [<c015861f>]
Apr 17 21:40:13 mofo kernel:   [ext3_alloc_block+25/32]
[ext3_alloc_branch+85/720] [getblk+40/96] [getblk+57/96] [bread+22/112]
[ext3_do_update_inode+759/896]
Apr 17 21:40:13 mofo kernel:   [<c014f649>] [<c014f965>]
[<c012e778>] [<c012e789>] [<c012e9c6>] [<c0152117>]
Apr 17 21:40:13 mofo kernel:   [ext3_do_update_inode+852/896]
[do_get_write_access+1183/1216] [ext3_get_branch+83/208]
[ext3_get_block_handle+437/688] [do_get_write_access+1183/1216]
[create_buffers+97/240]
Apr 17 21:40:13 mofo kernel:   [<c0152174>] [<c015861f>]
[<c014f7d3>] [<c0150035>] [<c015861f>] [<c012ebd1>]
Apr 17 21:40:13 mofo kernel:   [ext3_get_block+89/96]
[__block_prepare_write+230/768] [__jbd_kmalloc+39/160]
[block_prepare_write+29/64] [ext3_get_block+0/96] [ext3_prepare_write+124/288]
Apr 17 21:40:13 mofo kernel:   [<c0150189>] [<c012f126>]
[<c015e757>] [<c012f9ad>] [<c0150130>] [<c01505dc>]
Apr 17 21:40:13 mofo kernel:   [ext3_get_block+0/96]
[generic_file_write+1185/1760] [ext3_file_write+31/176] [sys_write+149/240]
[schedule+786/832] [system_call+51/56]
Apr 17 21:40:13 mofo kernel:   [<c0150130>] [<c0122b91>]
[<c014e13f>] [<c012ce25>] [<c0110222>] [<c0106d83>]
Apr 17 21:40:13 mofo kernel:
Apr 17 21:40:13 mofo kernel: Code: 0f 0b 68 05 18 0d 32 c0 83 c4 14 f6 47 18 04
ba 01 00 00 00

Looking at http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html, where I
found the link to this list, it says to use ext3-0.0.7a.tar.bz2, which looks
like a kernel patch, which I have not done. The kernel was compiled from the
2.4.20 dist with no ext3 patches. I did install e2fsprogs-1.32 but no kernel
patches. If this is the issue, please just tell me I am an idiot and I will be
gone. I am 99% sure this is not a hardware issue.

My first priority is getting the machine back on its feet along with that
partition, whose integrity I now question. Can I substitute ext2 for ext3 in
fstab and mount it as ext2, after running an ext2 fsck on it?

If you have a moment to spare, any insight on this late good Thursday would be
doing me a great favor, and maybe I have found a legitimate bug here. I should
have the machine online in 30 minutes if there is more info I can provide.

Thanks,
Mike

Michael Harris

2003-Apr-18 06:20 UTC

Re: kjournald panic in 2.4.20 RedHat 7.2

Hi, I have the machine back online. /dev/sda4 (the partition that crashed)
recovered in about 2 seconds with the only e2fsck output being "recovering
journal", so I am running with it.

Here are more details on the machine:

[*ROOT* mofo /home/mgh 23 ] cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 6
model name      : AMD Athlon(tm) XP 1800+
stepping        : 2
cpu MHz         : 1534.037
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips        : 3060.53
[*ROOT* mofo /home/mgh 24 ] uname -a
Linux mofo 2.4.20 #14 Wed Mar 19 16:48:34 CST 2003 i686 unknown
[*ROOT* mofo /home/mgh 25 ] df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda1              8064272   4530996   3123624  60% /
/dev/hda3             29387900   1485288  26409772   6% /home
none                    127884         0    127884   0% /dev/shm
/dev/sda3            151195204 138014604   5500328  97% /mnt/sda3
/dev/sda4            193010776  75844724 107361584  42% /mnt/sda4
/dev/sda1             33032196  27801288   3552924  89% /mnt/sda1
[*ROOT* mofo /home/mgh 26 ] mount
/dev/hda1 on / type ext3 (rw)
none on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda3 on /home type ext3 (rw)
none on /dev/shm type tmpfs (rw)
/dev/sda3 on /mnt/sda3 type ext2 (rw)
/dev/sda4 on /mnt/sda4 type ext3 (rw)
/dev/sda1 on /mnt/sda1 type ext2 (rw)
[*ROOT* mofo /home/mgh 27 ] cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  261910528 249167872 12742656        0 25636864 86761472
Swap: 1052827648 15437824 1037389824
MemTotal:       255772 kB
MemFree:         12444 kB
MemShared:           0 kB
Buffers:         25036 kB
Cached:          80408 kB
SwapCached:       4320 kB
Active:         133188 kB
Inactive:        91704 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       255772 kB
LowFree:         12444 kB
SwapTotal:     1028152 kB
SwapFree:      1013076 kB

[*ROOT* mofo /home/mgh 30 ] lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8367 [KT266]
00:01.0 PCI bridge: VIA Technologies, Inc. VT8367 [KT266 AGP]
00:09.0 Communication controller: Cyclades Corporation PC300 TE 2 (rev 01)
00:0b.0 SCSI storage controller: Adaptec AIC-7881U
00:0d.0 Ethernet controller: Bridgecom, Inc: Unknown device 0985 (rev 11)
00:0f.0 Ethernet controller: Bridgecom, Inc: Unknown device 0985 (rev 11)
00:10.0 VGA compatible controller: Silicon Integrated Systems [SiS] 82C204 (rev 21)
00:11.0 ISA bridge: VIA Technologies, Inc.: Unknown device 3147
00:11.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:11.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 23)
00:11.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 23)


sda is an external Belkin RAID on an Adaptec 2940:

Apr 17 23:56:47 mofo kernel: SCSI subsystem driver Revision: 1.00
Apr 17 23:56:47 mofo kernel: scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA
DRIVER, Rev 6.2.8
Apr 17 23:56:47 mofo kernel:         <Adaptec 2940 Ultra SCSI adapter>
Apr 17 23:56:47 mofo kernel:         aic7880: Ultra Wide Channel A, SCSI Id=7,
16/253 SCBs
Apr 17 23:56:47 mofo kernel:
Apr 17 23:56:47 mofo kernel:   Vendor: BellStor  Model:                   Rev:
Apr 17 23:56:47 mofo kernel:   Type:   Direct-Access                      ANSI
SCSI revision: 02
Apr 17 23:56:47 mofo kernel: (scsi0:A:3): 40.000MB/s transfers (20.000MHz,
offset 8, 16bit)
Apr 17 23:56:47 mofo kernel: scsi0:A:3:0: Tagged Queuing enabled.  Depth 253
Apr 17 23:56:47 mofo kernel: Attached scsi disk sda at scsi0, channel 0, id 3,
lun 0
Apr 17 23:56:47 mofo kernel: SCSI device sda: 1073723392 512-byte hdwr sectors
(549746 MB)
Apr 17 23:56:48 mofo kernel:  sda: sda1 sda2 sda3 sda4

[*ROOT* mofo /usr/src 201 ] fdisk /dev/sda

The number of cylinders for this disk is set to 66836.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/sda: 255 heads, 63 sectors, 66836 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1             1      4178  33559753+  83  Linux
/dev/sda2          4179     23301 153605497+  83  Linux
/dev/sda3         23302     42424 153605497+  83  Linux
/dev/sda4         42425     66836 196089390   83  Linux

Command (m for help): q
[*ROOT* mofo /usr/src 202 ] cat /etc/fstab
LABEL=/                 /                       ext3    defaults        1 1
none                    /dev/pts                devpts  gid=5,mode=620  0 0
LABEL=/home             /home                   ext3    defaults        1 2
none                    /proc                   proc    defaults        0 0
none                    /dev/shm                tmpfs   defaults        0 0
/dev/hda2               swap                    swap    defaults        0 0
/dev/fd0                /mnt/floppy             auto    noauto,owner,kudzu 0 0
/dev/sda1               /mnt/sda1               ext2    noauto 0 0
/dev/sda2               /mnt/sda2               ext2    noauto 0 0
/dev/sda3               /mnt/sda3               ext2    noauto 0 0
/dev/sda4               /mnt/sda4               ext3    noauto 0 0
/dev/cdrom              /mnt/cdrom              iso9660 noauto,owner,kudzu,ro 0 0

The kernel is a minimal 2.4.20 with the freeswan 1.99 patch applied. Not that it
could not be related, but I have been running freeswan since 1999 on 40+
machines in various kernels without any problem. I also applied the pc300-3.4.7
patch to support the Cyclades PC300 T1 card. Otherwise the kernel is as stripped
down as I could make it.

CPU option is "(Athlon/Duron/K7) Processor family"
modules support disabled, everything compiled in statically

There are no SCSI or other hardware errors surrounding the kjournald crash (or
ever). After kjournald crashed I could run df without it hanging, but an ls on
/mnt/sda4 hung, as did all other processes hitting it (remote NT machines using
Samba). killall -9 smbd never worked, and umount /mnt/sda4 reported busy. The
load average jumped to about 30 during all this. umount /mnt/sda1 worked, but
fsck showed it as uncleanly unmounted, though it didn't find any errors. I could
not umount /dev/sda3 due to it being busy, but finally did a umount -km
/mnt/sda3, which killed my shell, and I was unable to log in thereafter.

Without thinking too much about it, I deleted the file being moved to sda4
when it crashed. Only maillog.MYD had copied over and it showed a size of about
270 MB.

As far as the error itself, it looks like
/usr/src/linux/fs/jbd/transaction.c:1384 is:

   J_ASSERT (journal_current_handle() == handle)

in function journal_stop(), though anyone on this board is 90 steps ahead of me
as to what this asserts.
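
For readers following along, here is a toy Python model of what that assertion
checks. It is only a sketch under assumptions: in the 2.4 jbd code,
journal_current_handle() returns the journal_info pointer stored in the current
task, and journal_stop() expects that pointer to be the handle it was passed.
The Task class, demo() and the simplified journal_start()/journal_stop() below
are illustrative stand-ins, not the kernel functions (real jbd also supports
nested handles, which this ignores).

```python
class Task:
    """Stand-in for the per-process state; only the one field we care about."""

    def __init__(self):
        self.journal_info = None  # models current->journal_info


def journal_current_handle(task):
    # Models the kernel helper: just read the per-task field.
    return task.journal_info


def journal_start(task):
    handle = object()  # opaque token standing in for a handle_t
    task.journal_info = handle
    return handle


def journal_stop(task, handle):
    # Models: J_ASSERT(journal_current_handle() == handle)
    if journal_current_handle(task) is not handle:
        raise AssertionError("journal_current_handle() == handle")
    task.journal_info = None


def demo(corrupt):
    """Run one start/stop pair, optionally overwriting the field in between."""
    task = Task()
    handle = journal_start(task)
    if corrupt:
        task.journal_info = object()  # something else overwrote the field
    try:
        journal_stop(task, handle)
        return "ok"
    except AssertionError as exc:
        return "assertion failed: " + str(exc)
```

In this model demo(False) completes normally, while demo(True) trips the same
check that fired in the log above. The point is that the check can only fail if
the per-task field changed between the start and stop calls.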

Also, rereading the ext3 FAQ at
http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html
it looks like 2.4.16 and up should not require the patch. 

Any other information I can provide please ask and thanks for your help.

Mike

Stephen C. Tweedie

2003-Apr-18 09:33 UTC

Re: kjournald panic in 2.4.20 RedHat 7.2

Hi,

On Fri, 2003-04-18 at 05:30, Michael Harris wrote:

> Hi, If this is a redundant post I apologize. I am running 2.4.20 on what has
> been a very stable Athlon machine for months, tried to move a 2 GB file from
> an ext2 partition to an ext3 and kjournald crashed. Here are the last
> remnants of my shell scrollback:
> mofo kernel: Assertion failure in journal_stop() at transaction.c:1384:
> "journal_current_handle() == handle"

Odd.  That is one assert failure I have _never_ seen reported.  Handle
mismatches in journal_start have happened from time to time when there
has been illegal recursion in the VM, but not in journal_stop.

The most likely cause would seem to me to be a stack overflow --- the
per-process field which holds the journal handle is right at the end of
the task struct, so it's one of the first fields to be clobbered in the
event of a stack overflow.
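
The layout argument above can be illustrated with a toy Python model. This is a
sketch under stated assumptions, not kernel code: it assumes the 2.4 i386
arrangement where the task struct sits at the bottom of an 8 KB region and the
kernel stack grows down from the top of the same region. TASK_STRUCT_SIZE and
HANDLE are made-up values for illustration; the real struct size depends on the
kernel version and config.

```python
import struct

STACK_AREA = 8192        # assumed: task struct and kernel stack share 8 KB
TASK_STRUCT_SIZE = 1728  # hypothetical size, chosen only for the example
JOURNAL_INFO_OFF = TASK_STRUCT_SIZE - 4  # last 4-byte field of the struct
HANDLE = 0x12345678      # arbitrary stand-in for the stored handle pointer

# Stack bytes available before the task struct starts being overwritten.
ROOM = STACK_AREA - TASK_STRUCT_SIZE


def handle_survives(stack_bytes_used):
    """Return True if the stored handle is intact after the stack has grown
    down by stack_bytes_used bytes from the top of the region."""
    area = bytearray(STACK_AREA)
    struct.pack_into("<I", area, JOURNAL_INFO_OFF, HANDLE)
    lowest = max(STACK_AREA - stack_bytes_used, 0)
    for addr in range(lowest, STACK_AREA):
        area[addr] = 0xAA  # stack contents scribbling over the region
    (seen,) = struct.unpack_from("<I", area, JOURNAL_INFO_OFF)
    return seen == HANDLE
```

With these numbers, handle_survives(ROOM) is still True, but a single extra
byte of stack use, handle_survives(ROOM + 1), already damages the handle field
while every other field of the modeled struct is untouched.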

If that has happened, it's not due to ext3 --- the stack here isn't
close to being that sort of size --- but it's entirely possible that
there were IRQ routines operating during the function which overflowed
the stack.

In particular, we've seen that happen before with heavy network
activity, especially with multiple NICs, because the random sampling
that occurs for /dev/random during NIC activity was a heavy stack user.

There's a patch to address that in the very latest Marcelo kernel
trees.  It reduces the stack usage of the random sampling by several
hundred bytes.  The fix is in the 2.4.21-pre7 kernel.

> The file I was moving as you can see is a 2 GB file, ie. right at the limit of
> ext2 capacity, and I am wondering if this is the culprit.

No, ext2/3 can both operate beyond 2GB quite safely.
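
A back-of-the-envelope check, as a hedged sketch: the 2147483647-byte size in
the listing is exactly 2^31 - 1, the largest signed 32-bit offset. That kind of
ceiling typically comes from a program built without large-file support rather
than from the filesystem (this attribution is an assumption; the thread does
not say what wrote the file). The classic ext2/ext3 block map, with 12 direct
pointers plus single, double and triple indirect blocks of 4-byte pointers,
addresses far more than that. The block sizes used below are examples only,
since the thread does not state the one in use, and other limits (such as the
32-bit sector count kept in the inode) can cap the real maximum below this
figure.

```python
SIZE_IN_LISTING = 2147483647  # maillog.MYD size from the ll output


def block_map_limit_bytes(block_size):
    """Bytes addressable through direct + indirect block pointers alone."""
    ptrs = block_size // 4  # 4-byte block pointers per indirect block
    blocks = 12 + ptrs + ptrs ** 2 + ptrs ** 3
    return blocks * block_size


def summary():
    """Collect the figures for a few example block sizes."""
    return {bs: block_map_limit_bytes(bs) for bs in (1024, 2048, 4096)}
```

Even with 1 KB blocks the block map alone reaches roughly 16 GB, so a file of
2^31 - 1 bytes is nowhere near what the on-disk format can address.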

Cheers,
 Stephen
