thr3ads.net - Ext3 users - another quota related ext3fs crash... [Mar 2002]

Michael Renner

2002-Mar-09 23:58 UTC

another quota related ext3fs crash...

hiya!

see the attached file for the resolved bug() call. my kernel spit out 185
messages like this:

Mar  9 17:15:13 srck@trottelkunde attempt to access beyond end of device
Mar  9 17:15:13 srck@trottelkunde 16:42: rw=0, want=0, limit=12289725

right before the bug().
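
The `16:42` in the message above can be read as a device number. A minimal sketch, assuming the kernel prints the device as hexadecimal `major:minor` (the thread does not say so; `decode_kdev` is a hypothetical helper name):

```python
def decode_kdev(name):
    """Split a 'MM:mm' hexadecimal device name into decimal (major, minor)."""
    major_hex, minor_hex = name.split(":")
    return int(major_hex, 16), int(minor_hex, 16)


# Assumption: "16:42" from the log line is hex major:minor.
print(decode_kdev("16:42"))  # (22, 66)
```

Under that assumption the message refers to the same device that the panic lines report as ide1(22,66).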

this message didn't get parsed by ksymoops:

Mar  9 17:15:13 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:226: "handle->h_transaction->t_journal == journal"

i'm getting somewhat desperate by now; i get crashes like this on a monthly
basis, and the quota code always seems to be the cause...

here are the lines from the last crashes (i can't resolve the call traces
because the old kernels are gone and i didn't resolve the traces when they
occurred)

---

Dec  9 19:55:30 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,66)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 94
Dec  9 19:55:30 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:227: "handle->h_transaction->t_journal == journal"

---

Jan  4 11:29:42 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,67)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 52
Jan  4 11:29:42 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:227: "handle->h_transaction->t_journal == journal"

---

Feb 19 00:54:43 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,66)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 94
Feb 19 00:54:43 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:225: "handle->h_transaction->t_journal == journal"

---

Mar  9 17:15:13 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,66)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 94
Mar  9 17:15:13 srck@trottelkunde Assertion failure in journal_start() at
transaction.c:226: "handle->h_transaction->t_journal == journal"

---
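
The four panic reports above can be tabulated with a short script. This is only a sketch: the regular expression is written against the wording of the lines quoted in this post, and `parse_panics` and `PANIC_RE` are hypothetical names, not part of any tool mentioned in the thread.

```python
import re

# Matches the "load_block_bitmap" panic text as quoted in this post.
PANIC_RE = re.compile(
    r"\(device (?P<dev>\w+\(\d+,\d+\))\): load_block_bitmap: "
    r".*?block_group = (?P<group>\d+), groups_count = (?P<count>\d+)"
)


def parse_panics(log_text):
    """Return (device, block_group, groups_count) for each panic found."""
    # Mail clients wrapped the log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [
        (m.group("dev"), int(m.group("group")), int(m.group("count")))
        for m in PANIC_RE.finditer(flat)
    ]


sample = """
Dec  9 19:55:30 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,66)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 94
Jan  4 11:29:42 srck@trottelkunde Kernel panic: EXT3-fs panic (device
ide1(22,67)): load_block_bitmap: block_group >= groups_count - block_group =
131071, groups_count = 52
"""

for dev, group, count in parse_panics(sample):
    print(dev, group, count)
```

Run over all four reports, it makes it easy to see that every panic names the same block_group value, 131071, on both partitions.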

the kernels were always the -ac version from the current tree, using
vfsv0 quota.

is it possible that the hardware is the cause of these crashes? a
badblocks scan of the whole device hasn't reported anything suspicious
(there aren't any messages in the kernel logs which would point to bad
blocks either).

both partitions on this drive (/www and /home) had crashes; the only other
usrquota partition in this system is on another drive, has barely any
disk/fs i/o (mounted as /tmp) and hasn't reported any problems yet.

btw. our semi-highend webservers which are running with vanilla 2.4.17
kernels and ext3 + vfsold quota haven't had any crashes yet.

have you any ideas on how to resolve this problem?

best regards,
michael

Andrew Morton

2002-Mar-10 01:33 UTC

Re: another quota related ext3fs crash...

Michael Renner wrote:
>
> hiya!
> 
> see the attached file for the resolved bug() call. my kernel spit out 185
> messages like this:
> 
> Mar  9 17:15:13 srck@trottelkunde attempt to access beyond end of device
> Mar  9 17:15:13 srck@trottelkunde 16:42: rw=0, want=0, limit=12289725
> 
> right before the bug().
> 
> this message didn't get parsed by ksymoops
> 
> Mar  9 17:15:13 srck@trottelkunde Assertion failure in journal_start() at
> transaction.c:226: "handle->h_transaction->t_journal == journal"

This one is a bit of a red herring.  The real error is this one:

> Dec  9 19:55:30 srck@trottelkunde Kernel panic: EXT3-fs panic (device
> ide1(22,66)): load_block_bitmap: block_group >= groups_count - block_group =
> 131071, groups_count = 94

Which called panic(), which called sys_sync(), which tried to sync
some other filesystem while inside this filesystem's transaction.
Really, panic() shouldn't be calling sys_sync().

So the question is, why is truncate trying to access such a wild
blockgroup number?  ext3_free_blocks() checks the block number so
possibly some of the in-memory superblock information has been
corrupted.
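
The size of that wild group number can be put in perspective with some arithmetic. The sketch below rests on assumptions that the thread does not state: a 4 KiB block size (hence 32768 blocks per group, one bitmap block's worth of bits), a first data block of 0, and the `limit=12289725` figure from the log being the device size in KiB. The constant and function names are hypothetical.

```python
BLOCK_SIZE = 4096                  # assumption: 4 KiB filesystem blocks
BLOCKS_PER_GROUP = 8 * BLOCK_SIZE  # one bitmap block holds 8 bits per byte
FIRST_DATA_BLOCK = 0               # assumption: usual value for 4 KiB blocks


def block_group(block):
    """Group index that a given block number falls into."""
    return (block - FIRST_DATA_BLOCK) // BLOCKS_PER_GROUP


def groups_count(total_blocks):
    """Number of groups needed to cover total_blocks (rounded up)."""
    data_blocks = total_blocks - FIRST_DATA_BLOCK
    return (data_blocks + BLOCKS_PER_GROUP - 1) // BLOCKS_PER_GROUP


# Assumption: "limit=12289725" is the device size in KiB.
device_kib = 12289725
total_blocks = device_kib * 1024 // BLOCK_SIZE

print(groups_count(total_blocks))      # 94
print(block_group(0xFFFFFFFF))         # 131071
print(hex(131071 * BLOCKS_PER_GROUP))  # 0xffff8000
```

Under these assumptions the device does work out to 94 groups, and a group number of 131071 would correspond to a block number in the top 32768 values of the 32-bit range (0xffff8000 through 0xffffffff). That would be consistent with a garbage or all-ones block number rather than a value that is merely slightly out of range, but it does not by itself say where such a value came from.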

Hard to say.  Are you running anything unusual on the machine? Anything
which could help us to understand why you get this happening, but
(to my knowledge) nobody else does?  Is it good quality hardware?
