hiya!
see the attached file for the resolved bug() call. my kernel spit out 185
messages like this:
Mar 9 17:15:13 srck@trottelkunde attempt to access beyond end of device
Mar 9 17:15:13 srck@trottelkunde 16:42: rw=0, want=0, limit=12289725
right before the bug().
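a quick sanity check on the device number in that message. this is only a sketch, assuming the kernel prints the device as hex major:minor (my assumption; decode_dev is a made-up helper name). if that holds, 16:42 is the same device as the ide1(22,66) in the panics below:

```python
def decode_dev(dev):
    """Split a 'major:minor' string and read both halves as hex (assumed format)."""
    major_hex, minor_hex = dev.split(":")
    return int(major_hex, 16), int(minor_hex, 16)


# device string taken from the "attempt to access beyond end of device" message
print(decode_dev("16:42"))
```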
this message didn't get parsed by ksymoops:
Mar 9 17:15:13 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:226: "handle->h_transaction->t_journal == journal"
i'm getting somewhat desperate by now; i get crashes like this on a monthly
basis, and the quota code always seems to be the cause...
here are the log lines of the last crashes (i can't resolve the call traces
because the old kernels are gone and i didn't resolve the traces when they
occurred)
---
Dec 9 19:55:30 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,66)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 94
Dec 9 19:55:30 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:227: "handle->h_transaction->t_journal == journal"
---
Jan 4 11:29:42 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,67)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 52
Jan 4 11:29:42 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:227: "handle->h_transaction->t_journal == journal"
---
Feb 19 00:54:43 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,66)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 94
Feb 19 00:54:43 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:225: "handle->h_transaction->t_journal == journal"
---
Mar 9 17:15:13 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,66)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 94
Mar 9 17:15:13 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:226: "handle->h_transaction->t_journal == journal"
---
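all four panics report the same block_group value. a small sketch to pull the numbers out of such a line and look at the bit pattern (parse_panic and PANIC_RE are made-up names, and the regex only assumes the message format shown above). it doesn't prove anything about the cause, it just shows that 131071 is 2**17 - 1, i.e. 17 one-bits, and far beyond groups_count:

```python
import re

# matches "block_group = N, groups_count = M" as seen in the panic lines above
PANIC_RE = re.compile(r"block_group\s*=\s*(\d+),\s*groups_count\s*=\s*(\d+)")


def parse_panic(text):
    """Return (block_group, groups_count) from a panic message, or None."""
    # collapse line wraps from the syslog output into single spaces first
    flat = " ".join(text.split())
    m = PANIC_RE.search(flat)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


msg = """Kernel panic: EXT3-fs panic (device
ide1(22,66)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 94"""

block_group, groups_count = parse_panic(msg)
print(block_group, groups_count, hex(block_group))
print(block_group == 2**17 - 1)
```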
the kernels were always the -ac version from the current tree, using
vfsv0 quota.
is it possible that the hardware is the cause of these crashes? a
badblocks scan of the whole device hasn't reported anything suspicious
(there aren't any messages in the kernel logs which would point to bad
blocks either).
both partitions on this drive (/www and /home) had crashes; the only other
usrquota partition in this system is on another drive, has barely any disk/fs
i/o (mounted as /tmp) and hasn't reported any problems yet.
btw. our semi-highend webservers, which are running vanilla 2.4.17
kernels with ext3 + vfsold quota, haven't had any crashes yet.
do you have any ideas on how to resolve this problem?
best regards,
michael