thr3ads.net - Ext3 users - Bug: 2.2.20pre/ext3 0.0.7a crash apparently in sys

Nigel Metheringham

2001-Aug-06 10:43 UTC

Bug: 2.2.20pre/ext3 0.0.7a crash apparently in sys_close()

Apologies for a vague and wooly bug report, but I can't reproduce this
on my test systems - I *can* reproduce it on some production ones like a
flash.... but that seems to upset our operations guys :-(

I am getting a set of crashes on some boxes in the field, apparently
related to high network traffic (this only occurs on boxes with ethernet
connectivity back to the centre rather than the majority of boxes which
have E1 connectivity back), and when the boxes are under network load.
The boxes have several filesystems on a h/w RAID controller, all of
which are ext3 except /boot.   There should be *very* little disk
traffic on these boxes in normal use, including small amounts of syslog.

The kernel is a 2.2.20pre - the couple of messages here are from
2.2.20pre8 - which also has FreeSWAN 1.91 and RAID stuff patched in
(however on all of these boxes RAID is not in use since the h/w has
onboard h/w RAID).

A kernel which has 0.0.7a ext3 dies in this situation.
A kernel which has 0.0.6b ext3 works without fault.

The death message is:-
Unable to handle kernel NULL pointer dereference at virtual address 00000004
current->tss.cr3 = 0ddf6000, %cr3 = 0ddf6000
*pde = 00000000
Entering kdb due to panic @ 0xc9124fe3
eax = 0x00000006  ebx = 0x00000004  ecx = 0xcdf32000  edx = 0x00000000
esi = 0x00000000  edi = 0xfffffff7  esp = 0xcdf32000  eip = 0xc0124fe3
ebp = 0xbffffca0   ss = 0x00000000   cs = 0x00000010  eflags 0x00010246
ds = 0x00000018   es = 0x00000018  origeax = 0xffffffff  &regs 0xcdf33f80

[NB These boxes have Compaq Remote Insight consoles - so this message is
retyped from a jpg of the console output :-(    Nothing makes it into
syslog]

A backtrace and checks against System.map show this to be in sys_close().
A few other bombs have also been mostly in sys_close(), with a couple of
others within schedule().

	Nigel.
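
The System.map check described above amounts to finding the last symbol whose address is at or below the faulting eip. A minimal sketch of that lookup follows. The map excerpt and its addresses are invented for illustration and are not taken from the poster's kernel; only the eip value (0xc0124fe3) comes from the oops above. The helper names `load_system_map` and `resolve` are hypothetical.

```python
import bisect


def load_system_map(lines):
    """Parse System.map-style lines ('address type name') into sorted (addr, name) pairs."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        try:
            addr = int(parts[0], 16)
        except ValueError:
            continue  # first field is not a hex address
        symbols.append((addr, parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) for the last symbol at or below address, or None."""
    addrs = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addrs, address) - 1
    if index < 0:
        return None  # address lies below every known symbol
    addr, name = symbols[index]
    return name, address - addr


# Hypothetical excerpt: these addresses are made up for the example.
SAMPLE_MAP = """\
c0124f00 T sys_open
c0124fb0 T sys_close
c0125040 T sys_vhangup
""".splitlines()

symbols = load_system_map(SAMPLE_MAP)
name, offset = resolve(symbols, 0xC0124FE3)  # eip from the oops above
print(f"{name}+{offset:#x}")
```

With a real kernel, the lines would come from the System.map that matches the running build; a map from a different build gives misleading symbol names.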

Stephen C. Tweedie

2001-Aug-08 15:19 UTC

Re: Bug: 2.2.20pre/ext3 0.0.7a crash apparently in sys_close()

Hi,

On Mon, Aug 06, 2001 at 11:43:54AM +0100, Nigel Metheringham wrote:
> Apologies for a vague and wooly bug report, but I can't reproduce this
> on my test systems - I *can* reproduce it on some production ones like a
> flash.... but that seems to upset our operations guys :-(
> 
> I am getting a set of crashes on some boxes in the field, apparently
> related to high network traffic (this only occurs on boxes with ethernet
> connectivity back to the centre rather than the majority of boxes which
> have E1 connectivity back), and when the boxes are under network load.
> The boxes have several filesystems on a h/w RAID controller, all of
> which are ext3 except /boot.   There should be *very* little disk
> traffic on these boxes in normal use, including small amounts of syslog.
> 
> The kernel is a 2.2.20pre - the couple of messages here are from
> 2.2.20pre8 - which also has FreeSWAN 1.91 and RAID stuff patched in
> (however on all of these boxes RAID is not in use since the h/w has
> onboard h/w RAID).
> 
> A kernel which has 0.0.7a ext3 dies in this situation.
> A kernel which has 0.0.6b ext3 works without fault.

Is 7a the _only_ difference between a working and a non-working
kernel?

Other than that, I'd probably need a bit more info than what's here to
get much further with this.  One thing that's always worth trying is
to disable slab poisoning and see if things get better --- that's a
piece of pure debugging which ext3 enables for its own benefit, but
which often causes other buggy drivers to fall apart under load.  The
oops you posted doesn't have any obvious signs of slab debugging
problems, though.

Cheers,
 Stephen
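
To illustrate why slab poisoning can expose bugs in unrelated code, here is a toy model of the idea: freed objects are filled with a known byte pattern, and the allocator checks that the pattern is intact before handing the object out again. This is a sketch only, not the kernel's slab allocator. The class `PoisoningPool`, its methods, and the poison value 0x5A are assumptions chosen for the example.

```python
POISON = 0x5A  # illustrative pattern; an assumption, not read from any kernel source


class PoisoningPool:
    """Fixed-size object pool that poisons freed slots and verifies them on reuse."""

    def __init__(self, object_size, count):
        self.object_size = object_size
        # All slots start out free, so they start out poisoned.
        self.memory = bytearray([POISON]) * (object_size * count)
        self.free_slots = list(range(count))

    def _span(self, slot):
        start = slot * self.object_size
        return start, start + self.object_size

    def alloc(self):
        slot = self.free_slots.pop()
        start, end = self._span(slot)
        # A changed byte means someone wrote to the object after freeing it.
        if any(byte != POISON for byte in self.memory[start:end]):
            raise RuntimeError(f"slot {slot} was modified while free")
        self.memory[start:end] = bytes(self.object_size)  # hand out zeroed memory
        return slot

    def free(self, slot):
        start, end = self._span(slot)
        self.memory[start:end] = bytes([POISON]) * self.object_size
        self.free_slots.append(slot)


pool = PoisoningPool(object_size=8, count=2)
slot = pool.alloc()
pool.free(slot)

# Simulate a buggy driver that keeps writing through a stale pointer.
start, _ = pool._span(slot)
pool.memory[start + 4] = 0

try:
    pool.alloc()
    detected = False
except RuntimeError:
    detected = True
print("write-after-free detected:", detected)
```

Without the poison check, the stale write would silently corrupt whatever object was allocated next, which is why code that appears to work can start failing once poisoning is switched on.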
