thr3ads.net - freebsd stable - Help? 6.1-S: Fatal trap 12: page fault while in kernel mode [Jun 2006]

David Wolfskill

2006-Jun-15 23:22 UTC

Help? 6.1-S: Fatal trap 12: page fault while in kernel mode

I had one of these a couple of weeks ago or so; I had been distracted by
some more urgent matters that came up (the panic was on a machine under
test; the more urgent matters were little things like needing to deploy
a handful of resolvers on our network because existing ones were running
on systems that had provided evidence of being prone to imminent
failure).

Anyway: I updated the 2 boxen under test to 6.1-STABLE as of this
morning, and finally(!) had a chance to re-try the failing operation.

It went "kaboom!" again. :-{ (Well, there's something to be said for
consistency. :-})

The setup is thus:

* On machine "C", I run smtp-sink (one of the test programs from
Postfix).

* On machine "B" (the machine & software under test), I fire up the
software being tested, which acts as an SMTP relay, accepting mail and
relaying it to machine C (where it gets counted and discarded).

* On machine "A", I have installed the mail/postal port; I run "postal,"
directing it to send mail to the SMTP server on machine B (the machine
under test).

It seems to run OK (albeit slowly) for a couple of minutes; then the
serial console reports:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 06
fault virtual address = 0x0
fault code = supervisor read, page not present
instruction pointer = 0x20:0x0
stack pointer = 0x28:0xf09b3b98
frame pointer = 0x28:0xf09b3bcc
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 23782 (ecelerity)
[thread pid 23782 tid 100120 ]
Stopped at 0: *** error reading from address 0 ***
db> trace
Tracing pid 23782 tid 100120 td 0xcc445180
db>

Now, the software being tested apparently exercises threads quite a bit.

The hardware (for machine B) is a dual Xeon @ 3 GHz & 4 GB RAM.

The kernel config is pretty simple:

-------------%< snip! -------------------
include PAE
options SMP # Symmetric MultiProcessor Kernel

nodevice hptmv
nodevice bce

options MAXDSIZ="(2000UL*1024*1024)"
options KDB
options KDB_TRACE
options DDB

options IPFIREWALL
options IPFIREWALL_VERBOSE #enable logging to syslogd(8)
options IPFIREWALL_VERBOSE_LIMIT=0 #do not limit verbosity
options DUMMYNET
options IPDIVERT
-------------%< snip! -------------------

So: I have a pair of these machines, configured identically. Each
is connected to a terminal server for access to the serial console. I
have a private mirror of the FreeBSD CVS repository; I'm tracking RELENG_6
& HEAD on my laptop daily; I could try building CURRENT on one of these
boxen if it would help get the problem solved.

The software under test was built for FreeBSD 5.x; I have the
misc/compat5x port installed.

The vendor claims that they don't have this kind of problem with "Linux,"
and if I can't get it to run without letting the magic smoke leak out,
I'll probably end up trying to hack my way through installing some flavor
of Linux on one of the machines, which prospect I find remarkably
unappealing.

Maybe the DTrace stuff would help?

Could someone please work with me on this, so we can have a software
vendor recommending that their customers deploy their software on
FreeBSD, rather than recommending against it?

Thanks!

Peace,
david
--
David H. Wolfskill david@catwhisker.org
Doing business with spammers only encourages them. Please boycott spammers.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

Lord Reaper

2006-Jun-16 18:42 UTC

On Thu, 15 Jun 2006 16:22:40 -0700 David Wolfskill <david@catwhisker.org>
wrote:
>Fatal trap 12: page fault while in kernel mode
>cpuid = 0; apic id = 06
>fault virtual address   = 0x0
>fault code              = supervisor read, page not present
>instruction pointer     = 0x20:0x0
>stack pointer           = 0x28:0xf09b3b98
>frame pointer           = 0x28:0xf09b3bcc
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, def32 1, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 23782 (ecelerity)
>[thread pid 23782 tid 100120 ]
>Stopped at      0:      *** error reading from address 0 ***
>
I had similar problems when updating from 5.4 to 6.1 because of
nvidia-driver. After changing the card, the system worked like a charm.
Later on, recompiling nvidia-driver (I had forgotten to deinstall it)
made the machine crash and reboot itself, even though a non-nvidia
graphics adapter was installed. I remember hearing that compiler
optimizations might be the cause of the driver failing.

Hope this helps.

Regards,
Sampsa Suoninen

Gavin Atkinson

2006-Jun-23 15:29 UTC

On Thu, 2006-06-15 at 16:22 -0700, David Wolfskill wrote:
> I had one of these a couple of weeks ago or so; I had been distracted by
> some more urgent matters that came up (the panic was on a machine under
> test; the more urgent matters were little things like needing to deploy
> a handful of resolvers on our network because existing ones were running
> on systems that had provided evidence of being prone to imminent
> failure).
> 
> Anyway:  I updated the 2 boxen under test to 6.1-STABLE as of this
> morning, and finally(!) had a chance to re-try the failing operation.
> 
> It went "kaboom!" again.  :-{  (Well, there's something to be said for
> consistency.  :-})
> 
> It seems to run OK (albeit slowly) for a couple of minutes; then the
> serial console reports:
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 06
> fault virtual address   = 0x0
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0x0
> stack pointer           = 0x28:0xf09b3b98
> frame pointer           = 0x28:0xf09b3bcc
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 23782 (ecelerity)
> [thread pid 23782 tid 100120 ]
> Stopped at      0:      *** error reading from address 0 ***
> db> trace
> Tracing pid 23782 tid 100120 td 0xcc445180
> db> 
OK, seeing as nobody has offered any advice, I'll have a go.

Have you got a debug kernel?  If so, get a kernel dump.  Load it into
kgdb.  Chances are "bt" won't work as the instruction pointer is zero,
so instead you need to display the stack directly:

(kgdb) x/80xw 0xf09b3b98

Look for any addresses in the 0xc0xxxxxx range - these will probably be
pointers to kernel functions.  Drop out of kgdb, and try to find out
which functions these belong to:

addr2line 0xc0639bd6 -e kernel.debug
/usr/src/sys/kern/tty.c:1653

You can build up a backtrace and knowledge of the arguments given to
functions this way.
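
If the stack dump is long, picking out the candidate addresses by eye
gets tedious. The following is only a rough sketch of how that filtering
step might be scripted, assuming the "x/80xw" output has been saved as
text with four 32-bit words per line. The function names, the sample
dump, and the exact bounds used for "the 0xc0xxxxxx range" are
assumptions made for illustration; only 0xf09b3b98 and 0xc0639bd6 come
from the messages above. A hit is merely a value that looks like a
kernel text address, not necessarily a real return address.

```python
import re

# Assumed bounds for "the 0xc0xxxxxx range" mentioned above.
KERNEL_LO = 0xC0000000
KERNEL_HI = 0xC1000000


def candidate_return_addresses(dump_text, lo=KERNEL_LO, hi=KERNEL_HI):
    """Return (stack_address, value) pairs for words that fall in [lo, hi)."""
    hits = []
    for line in dump_text.splitlines():
        # Each dump line looks like "0xADDR:  0xWORD  0xWORD ..."
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        m = re.match(r"\s*(0x[0-9a-fA-F]+)", label)
        if not m:
            continue
        base = int(m.group(1), 16)
        for i, word in enumerate(re.findall(r"0x[0-9a-fA-F]+", rest)):
            value = int(word, 16)
            if lo <= value < hi:
                # Words are 4 bytes wide (the "w" in x/80xw).
                hits.append((base + 4 * i, value))
    return hits


def addr2line_commands(hits, kernel="kernel.debug"):
    """Build one addr2line command line per candidate address."""
    return [f"addr2line -e {kernel} {value:#x}" for _, value in hits]


# Made-up sample dump, for illustration only.
SAMPLE = """\
0xf09b3b98:\t0xc0639bd6\t0x00000000\t0xcc445180\t0xf09b3bcc
0xf09b3ba8:\t0x00000008\t0xc0512345\t0x00000000\t0x00000028
"""

if __name__ == "__main__":
    for cmd in addr2line_commands(candidate_return_addresses(SAMPLE)):
        print(cmd)
```

Each printed line can then be run by hand against kernel.debug, and the
stack address recorded with each hit shows the order in which the
frames were pushed.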

Gavin
