thr3ads.net - freebsd stable - Heads up: panics should be fixed! [Aug 2003]

Mike Silbersack

2003-Aug-30 12:23 UTC

Heads up: panics should be fixed!

As others have noted, Tor's patch appears to be a total solution to the
recent instability the PAE patch introduced.  So, if you're experiencing
panics with a recent kernel, or are in a position to stress a machine,
please cvsup and give it a test!

Thanks,

Mike "Silby" Silbersack

---------- Forwarded message ----------
Date: Sat, 30 Aug 2003 08:39:08 -0700 (PDT)
From: Tor Egge <tegge@FreeBSD.org>
To: src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
Subject: cvs commit: src/sys/i386/i386 genassym.c globals.s mp_machdep.c
    pmap.c src/sys/i386/include globaldata.h globals.h

tegge       2003/08/30 08:39:08 PDT

  FreeBSD src repository

  Modified files:        (Branch: RELENG_4)
    sys/i386/i386        mp_machdep.c genassym.c globals.s pmap.c
    sys/i386/include     globaldata.h globals.h
  Log:
  Avoid conflict between temporary page table mappings performed by
  interrupts and temporary page table mappings performed outside
  interrupt context without splvm() protection.  Interrupt time async
  completion callbacks for pageout operations triggered this conflict.

  Approved by:    re (murray)

  Revision    Changes    Path
  1.86.2.5    +2 -0      src/sys/i386/i386/genassym.c
  1.13.2.2    +5 -1      src/sys/i386/i386/globals.s
  1.115.2.18  +5 -2      src/sys/i386/i386/mp_machdep.c
  1.250.2.21  +71 -11    src/sys/i386/i386/pmap.c
  1.11.2.3    +5 -2      src/sys/i386/include/globaldata.h
  1.5.2.3     +4 -0      src/sys/i386/include/globals.h

T.Suzuki

2003-Sep-02 17:11 UTC

Heads up: panics should be fixed!

Our -stable machine has been rebooting every 24 hours since we upgraded
on Jul 18.

I ran cvsup again on Aug 31 at 03:00 JST (GMT+0900), but it still panics:

# gdb -k kernel.1 vmcore.1

IdlePTD at phsyical address 0x00367000
initial pcb at physical address 0x002c55c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x5ea26fef
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc01924b0
stack pointer           = 0x10:0xc8bafd74
frame pointer           = 0x10:0xc8bafd90
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 5674 (perl)
interrupt mask          trap number             = 12
panic: page fault

syncing disks... 18
done
Uptime: 23h55m53s

# dmesg -a 

FreeBSD 4.9-PRERELEASE #6: Mon Sep  1 08:09:40 JST 2003
    tss@stargate.tokai-ic.or.jp:/usr/src/sys/compile/STARGATE
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 768413581 Hz
CPU: Intel Celeron (768.41-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x686  Stepping = 6
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory  = 133083136 (129964K bytes)
avail memory = 126193664 (123236K bytes)
Preloaded elf kernel "kernel" at 0xc0348000.
Pentium Pro MTRR support enabled

-- 
///////////////////////////////////////////////////////////////////////
// T.Suzuki @ Tokai Internet Council
///////////////////////////////////////////////////////////////////////

John Kennedy

2003-Sep-03 08:41 UTC

Heads up: panics should be fixed!

On Sat, Aug 30, 2003 at 02:20:48PM -0500, Mike Silbersack wrote:
> As others have noted, Tor's patch appears to be a total solution to the
> recent instability the PAE patch introduced.  So, if you're experiencing
> panics with a recent kernel, or are in a position to stress a machine,
> please cvsup and give it a test!

  FYI, I'm *not* seeing a 24-hr crash.  Just running the GENERIC kernel.
I've been cvsupping fairly regularly, and the reloads listed below are
from recompiles, not crashes.

  Just a datapoint since I haven't made any effort to use PAE.

FreeBSD pandora.jk.homeunix.net 4.9-PRERELEASE FreeBSD 4.9-PRERELEASE #6: Tue Sep  2 21:13:12 PDT 2003     root@pandora.jk.homeunix.net:/usr/src/sys/compile/GENERIC  i386

Aug 27 06:45:42 pandora /kernel: FreeBSD 4.9-PRERELEASE #2: Mon Aug 25 17:34:08 PDT 2003
Aug 27 06:57:10 pandora /kernel: FreeBSD 4.9-PRERELEASE #3: Wed Aug 27 06:49:36 PDT 2003
Aug 30 09:12:34 pandora /kernel: FreeBSD 4.9-PRERELEASE #4: Sat Aug 30 09:09:03 PDT 2003
Sep  2 21:56:09 pandora /kernel: FreeBSD 4.9-PRERELEASE #6: Tue Sep  2 21:13:12 PDT 2003

T.Suzuki

2003-Sep-03 15:24 UTC

Heads up: panics should be fixed!

Thanks to Mike and Silby. Sorry for the lack of information in my earlier post.

I have the following options in my kernel configuration:
option DDB
option DDB_UNATTENDED
makeoptions DEBUG=-g

# gdb -k /usr/src/sys/compile/STARGATE/kernel.debug /var/crash/vmcore.1

(kgdb) bt
#0  dumpsys () at ../../kern/kern_shutdown.c:487
#1  0xc01562bf in boot (howto=256) at ../../kern/kern_shutdown.c:316
#2  0xc01566fd in panic (fmt=0xc029ccac "%s") at ../../kern/kern_shutdown.c:595
#3  0xc025f1e7 in trap_fatal (frame=0xc8bafd34, eva=1587703791) at ../../i386/i386/trap.c:974
#4  0xc025ee95 in trap_pfault (frame=0xc8bafd34, usermode=0, eva=1587703791) at ../../i386/i386/trap.c:867
#5  0xc025ea3b in trap (frame={tf_fs = -1060634608, tf_es = 16, tf_ds = -927334384, tf_edi = -1060944235,
      tf_esi = -1060963163, tf_ebp = -927269488, tf_isp = -927269536, tf_ebx = 1587703791,
      tf_edx = -1060963128, tf_ecx = -1060963131, tf_eax = 28, tf_trapno = 12, tf_err = 0,
      tf_eip = -1072094032, tf_cs = 8, tf_eflags = 66050, tf_esp = 41216, tf_ss = -1060944240})
    at ../../i386/i386/trap.c:466
#6  0xc01924b0 in ifa_ifwithnet (addr=0xc0c34690) at ../../net/if.c:612
#7  0xc019ed31 in in_pcbladdr (inp=0xc81d5b00, nam=0xc0c34690, plocal_sin=0xc8bafdc8)
    at ../../netinet/in_pcb.c:459
#8  0xc019ee16 in in_pcbconnect (inp=0xc81d5b00, nam=0xc0c34690, p=0xc8a7cea0) at ../../netinet/in_pcb.c:526
#9  0xc01b5373 in udp_output (inp=0xc81d5b00, m=0xc074d800, addr=0xc0c34690, control=0x0, p=0xc8a7cea0)
    at ../../netinet/udp_usrreq.c:708
#10 0xc01b5784 in udp_send (so=0xc8177a00, flags=0, m=0xc074d800, addr=0xc0c34690, control=0x0, p=0xc8a7cea0)
    at ../../netinet/udp_usrreq.c:920
#11 0xc01756b7 in sosend (so=0xc8177a00, addr=0xc0c34690, uio=0xc8bafecc, top=0xc074d800, control=0x0, flags=0,
    p=0xc8a7cea0) at ../../kern/uipc_socket.c:609
#12 0xc0178b37 in sendit (p=0xc8a7cea0, s=4, mp=0xc8baff0c, flags=0) at ../../kern/uipc_syscalls.c:590
#13 0xc0178c3a in sendto (p=0xc8a7cea0, uap=0xc8baff80) at ../../kern/uipc_syscalls.c:643
#14 0xc025f49d in syscall2 (frame={tf_fs = 135725103, tf_es = 135725103, tf_ds = -1078001617,
      tf_edi = 135172116, tf_esi = 135172112, tf_ebp = -1077937056, tf_isp = -927268908, tf_ebx = 672126736,
      tf_edx = 139254656, tf_ecx = 139260876, tf_eax = 133, tf_trapno = 0, tf_err = 2, tf_eip = 672614432,
      tf_cs = 31, tf_eflags = 659, tf_esp = -1077937148, tf_ss = 47}) at ../../i386/i386/trap.c:1175
#15 0xc0250785 in Xint0x80_syscall ()
#16 0x2807ef19 in ?? ()
#17 0x280e8c58 in ?? ()
#18 0x8048e79 in ?? ()
#19 0x8048d5a in ?? ()
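
To line this backtrace up with the earlier panic message: kgdb prints the trap
frame fields as signed 32-bit decimals, while the panic message prints hex.
A minimal Python sketch of the conversion, assuming 32-bit i386 values; the
helper name to_hex32 is made up for illustration and is not part of any tool:

```python
def to_hex32(value: int) -> str:
    """Render a signed or unsigned 32-bit integer as unsigned hex."""
    return f"0x{value & 0xFFFFFFFF:08x}"

# Values copied from frames #3 and #5 of the backtrace.
eva = 1587703791       # trap_fatal(..., eva=...)
tf_eip = -1072094032   # saved instruction pointer in the trap frame
tf_ebx = 1587703791    # saved %ebx in the trap frame

print(to_hex32(eva))     # compare with "fault virtual address" in the panic
print(to_hex32(tf_eip))  # compare with "instruction pointer" and frame #6
print(tf_ebx == eva)     # whether %ebx held the faulting address
```

The first two values come out as 0x5ea26fef and 0xc01924b0, matching the fault
virtual address and the instruction pointer (frame #6, ifa_ifwithnet) from the
panic.  tf_ebx equals eva, which suggests, but does not prove, that the bad
pointer was sitting in %ebx at the time of the fault.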

-- 
///////////////////////////////////////////////////////////////////////
// T.Suzuki @ Tokai Internet Council, Japan
///////////////////////////////////////////////////////////////////////

Michael W. Oliver

2003-Sep-04 08:46 UTC

Heads up: panics should be fixed!

+--- On Wednesday, September 03, 2003 03:09 ---
| Mike Tancsa proclaimed:
| >| in
| >| /etc/rc.conf
| >| add
| >| dumpdev="/dev/ad0s1b"           # Device name to crashdump to (or NO).
| >| dumpdir="/var/crash"    # Directory where crash dumps are to be stored
| >
| >Ok, I am guessing the 'dumpdev' line is the boot-time equivalent of
| > running the dumpon(8) command to set the sysctl kern.dumpdev.
|
| Correct. The above also assumes that's where you have your swap. If it's
| not, then adjust accordingly.
|

Well, it has been a little over 24 hours, and I got a panic, but no dump!
Here is the log from the panic as well as the message stating that a dump
couldn't be performed:

//--start--//
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xbed557c5
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc027c38d
stack pointer           = 0x10:0xdea01ecc
frame pointer           = 0x10:0xdea01ef4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 5 (syncer)
interrupt mask          = none
trap number             = 12
panic: page fault

syncing disks... 8
done
Uptime: 1d0h9m53s

dumping to dev #ar/0x20001, offset 1279168
dump failed, reason: device doesn't support a dump routine
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
//--end--//
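
Both panic logs posted in this thread so far show an uptime very close to one
day.  A small Python sketch that turns the kernel's "Uptime:" strings into
seconds makes the comparison explicit; the function name uptime_seconds is
hypothetical, and the two strings are copied from the logs in this thread:

```python
import re

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}

def uptime_seconds(text: str) -> int:
    """Convert an uptime string such as '1d0h9m53s' to seconds."""
    return sum(int(n) * UNIT_SECONDS[u]
               for n, u in re.findall(r"(\d+)([dhms])", text))

reports = {"T.Suzuki": "23h55m53s", "Michael W. Oliver": "1d0h9m53s"}
for who, text in reports.items():
    secs = uptime_seconds(text)
    # Offset from exactly 24 hours, in seconds.
    print(f"{who}: {secs} s ({secs - 86400:+d} s relative to 24 h)")
```

The two reports land about four minutes before and ten minutes after the
24-hour mark, which fits a daily trigger but is only two data points.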


I have a HighPoint IDE RAID controller in this box with a RAID 1
configuration (ar0) using two Seagate 120GB disks (ad4 and ad6).  I put
this in /etc/rc.conf before I rebooted last night:

//--start--//
$ head /etc/rc.conf
dumpdev="/dev/ar0s1b"     #swap device configured in /etc/fstab
dumpdir="/usr/var/crash"  #using a dir under /usr as /var isn't big enough
//--end--//


Here is the output from 'disklabel -r ar0' showing that my swap device is
indeed /dev/ar0s1b:

//--start--//
$ disklabel -r ar0
# /dev/ar0c:
type: ESDI
disk: ar0s1
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 14592
sectors/unit: 234436482
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # milliseconds
track-to-track seek: 0  # milliseconds
drivedata: 0

8 partitions:
#         size    offset    fstype   [fsize bsize bps/cpg]
  a:    262144         0    4.2BSD     2048 16384    94   # (Cyl.    0 - 16*)
  b:   2589856    262144      swap                        # (Cyl.   16*- 177*)
  c: 234436482         0    unused        0     0         # (Cyl.    0 - 14592*)
  e:    524288   2852000    4.2BSD     2048 16384    94   # (Cyl.  177*- 210*)
  f:    524288   3376288    4.2BSD     2048 16384    94   # (Cyl.  210*- 242*)
  g: 230535906   3900576    4.2BSD     2048 16384    89   # (Cyl.  242*- 14592*)
//--end--//
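
As a sanity check on the label itself, the partition sizes and offsets can be
verified arithmetically.  The Python sketch below copies the numbers from the
disklabel output and assumes only what the label states (512 bytes/sector); the
function name is_contiguous is made up for illustration.  The amount of RAM in
this box is not stated in the thread, so whether the swap partition is large
enough to hold a dump is left as a comparison for the reader:

```python
SECTOR_BYTES = 512  # from "bytes/sector: 512"
UNIT_SECTORS = 234436482  # from "sectors/unit"

# (name, size, offset) in sectors; 'c' is omitted because it spans the whole unit.
partitions = [
    ("a", 262144, 0),
    ("b", 2589856, 262144),
    ("e", 524288, 2852000),
    ("f", 524288, 3376288),
    ("g", 230535906, 3900576),
]

def is_contiguous(parts, total):
    """True if each partition starts where the previous one ends
    and the last one ends exactly at the end of the unit."""
    end = 0
    for _name, size, offset in parts:
        if offset != end:
            return False
        end = offset + size
    return end == total

swap_sectors = next(size for name, size, _ in partitions if name == "b")
swap_bytes = swap_sectors * SECTOR_BYTES

print(is_contiguous(partitions, UNIT_SECTORS))
print(swap_bytes, "bytes =", round(swap_bytes / 2**20, 1), "MiB of swap")
```

The label is internally consistent and the swap partition is roughly 1.2 GiB,
so the layout itself does not look like the reason the dump failed; the "device
doesn't support a dump routine" message points at the ar driver instead.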


So, what am I missing here in order to get a dump so that I can help debug
this problem?  Is the ar0 device not able to be a dump device?  Do I need to 
install a dedicated IDE drive on the MB's IDE controller just so that I can 
get a dump?

This is exacerbated by the fact that I will have to wait another 24 hours to 
try again.

--
Mike
perl -e 'print unpack("u","88V]N=&%C=\"!I;F9O(&EN(&AE861E<G,*");'
