Frode Nordahl
2005-Dec-04 22:45 UTC
panic: pmap_enter: invalid page directory pdir=0xc1e063, va=0xffc00000
Hello,

After almost 6 months of problem-free operation, one of my NFS servers has suddenly started to panic and reboot automatically, about every 24 hours.

This is a dual Xeon (with UP kernel for the time being) SE7501HG2 with 4 GB RAM and an Intel SRCU42X RAID controller (AMI MegaRAID).

I ran 5.4-RELEASE-p2 on it, but had no kernel.debug for that, so I installed 5.4-RELEASE-p8 on it to get some information from the crashes (see below). I have also installed a UP kernel to rule out any SMP-related problems.

I found no PRs referencing this exact problem, and no mailing list postings addressing it (well, I found one thread where someone provoked this by tweaking the wrong numbers).

My gut feeling tells me that this is a tuning problem, but I really don't know where to start, so any pointers would be great!

By habit I have this in my loader.conf:

vm.kmem_size_max=419430400

# cat info.7
Dump header from device /dev/amrd0s2b
  Architecture: i386
  Architecture Version: 16777216
  Dump Length: 4026466304B (3839 MB)
  Blocksize: 512
  Dumptime: Sun Dec 4 22:07:52 2005
  Hostname: xxx.yyy.no
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 5.4-RELEASE-p8 #0: Tue Oct 11 21:54:19 CEST 2005
    root@xxx.yyy.no:/usr/obj/usr/src/RELENG_5_4/src/sys/PT
  Panic String: pmap_enter: invalid page directory pdir=0xc1e063, va=0xffc00000
  Dump Parity: 2250800518
  Bounds: 7
  Dump Status: good

(kgdb) bt
#0  doadump () at pcpu.h:159
#1  0xc05fbd4e in boot (howto=260)
    at /usr/src/RELENG_5_4/src/sys/kern/kern_shutdown.c:410
#2  0xc05fc014 in panic (
    fmt=0xc0822933 "pmap_enter: invalid page directory pdir=%#jx, va=%#x\n")
    at /usr/src/RELENG_5_4/src/sys/kern/kern_shutdown.c:566
#3  0xc07a8868 in pmap_enter (pmap=0xc08f24a0, va=4290772992, m=0xc2a22f00,
    prot=7 '\a', wired=1)
    at /usr/src/RELENG_5_4/src/sys/i386/i386/pmap.c:1948
#4  0xc074d745 in vm_fault (map=0xc103b000, vaddr=4290772992,
    fault_type=7 '\a', fault_flags=1)
    at /usr/src/RELENG_5_4/src/sys/vm/vm_fault.c:893
#5  0xc074dd68 in vm_fault_wire (map=0xc103b000, start=4289822720,
    end=4291231744, user_wire=0, fictitious=0)
    at /usr/src/RELENG_5_4/src/sys/vm/vm_fault.c:1051
#6  0xc07524b6 in vm_map_wire (map=0xc103b000, start=4289822720,
    end=4291231744, flags=0)
    at /usr/src/RELENG_5_4/src/sys/vm/vm_map.c:1959
#7  0xc074fcf8 in kmem_alloc (map=0xc103b000, size=1409024)
    at /usr/src/RELENG_5_4/src/sys/vm/vm_kern.c:187
#8  0xc07ab246 in user_ldt_alloc (mdp=0x0, len=176128)
    at /usr/src/RELENG_5_4/src/sys/i386/i386/sys_machdep.c:310
#9  0xc07adda1 in cpu_fork (td1=0xcbd73480, p2=0xcbe70c5c, td2=0xcbe6e000,
    flags=20)
    at /usr/src/RELENG_5_4/src/sys/i386/i386/vm_machdep.c:252
#10 0xc074ef55 in vm_forkproc (td=0xcbd73480, p2=0xcbe70c5c, td2=0xcbe6e000,
    flags=20)
    at /usr/src/RELENG_5_4/src/sys/vm/vm_glue.c:473
#11 0xc05e6dec in fork1 (td=0xcbd73480, flags=20, pages=0, procp=0xf7935ce4)
    at /usr/src/RELENG_5_4/src/sys/kern/kern_fork.c:644
#12 0xc05e5d2c in fork (td=0xcbd73480, uap=0xf7935d14)
    at /usr/src/RELENG_5_4/src/sys/kern/kern_fork.c:97
#13 0xc07ac77b in syscall (frame=
      {tf_fs = 47, tf_es = 47, tf_ds = -1078001617, tf_edi = 13817,
       tf_esi = 2, tf_ebp = -1077944424, tf_isp = -141337228,
       tf_ebx = 134628900, tf_edx = 0, tf_ecx = 22, tf_eax = 2,
       tf_trapno = 12, tf_err = 2, tf_eip = 671926043, tf_cs = 31,
       tf_eflags = 582, tf_esp = -1077944452, tf_ss = 47})
    at /usr/src/RELENG_5_4/src/sys/i386/i386/trap.c:1009
#14 0xc079c4bf in Xint0x80_syscall ()
    at /usr/src/RELENG_5_4/src/sys/i386/i386/exception.s:201
#15 0x0000002f in ?? ()
#16 0x0000002f in ?? ()
#17 0xbfbf002f in ?? ()
#18 0x000035f9 in ?? ()
#19 0x00000002 in ?? ()
#20 0xbfbfdf98 in ?? ()
...

# cat PT
machine i386
cpu I686_CPU
ident PT

# To statically compile in device wiring instead of /boot/device.hints
#hints "GENERIC.hints"  # Default places to look for devices.
makeoptions DEBUG=-g  # Build kernel with gdb(1) debug symbols

options SCHED_4BSD  # 4BSD scheduler
options INET  # InterNETworking
options INET6  # IPv6 communications protocols
options FFS  # Berkeley Fast Filesystem
options SOFTUPDATES  # Enable FFS soft updates support
options UFS_ACL  # Support for access control lists
options UFS_DIRHASH  # Improve performance on big directories
options MD_ROOT  # MD is a potential root device
options NFSCLIENT  # Network Filesystem Client
options NFSSERVER  # Network Filesystem Server
options NFS_ROOT  # NFS usable as /, requires NFSCLIENT
options MSDOSFS  # MSDOS Filesystem
options CD9660  # ISO 9660 Filesystem
options PROCFS  # Process filesystem (requires PSEUDOFS)
options PSEUDOFS  # Pseudo-filesystem framework
options GEOM_GPT  # GUID Partition Tables.
options COMPAT_43  # Compatible with BSD 4.3 [KEEP THIS!]
options COMPAT_FREEBSD4  # Compatible with FreeBSD4
options SCSI_DELAY=15000  # Delay (in ms) before probing SCSI
options KTRACE  # ktrace(1) support
options SYSVSHM  # SYSV-style shared memory
options SYSVMSG  # SYSV-style message queues
options SYSVSEM  # SYSV-style semaphores
options _KPOSIX_PRIORITY_SCHEDULING  # POSIX P1003_1B real-time extensions
options KBD_INSTALL_CDEV  # install a CDEV entry in /dev
options AHC_REG_PRETTY_PRINT  # Print register bitfields in debug
                              # output. Adds ~128k to driver.
options AHD_REG_PRETTY_PRINT  # Print register bitfields in debug
                              # output. Adds ~215k to driver.
options ADAPTIVE_GIANT  # Giant mutex is adaptive.

options KDB  # Enable kernel debugger support.
options KDB_UNATTENDED
options BREAK_TO_DEBUGGER
options DDB  # Support DDB.
options GDB  # Support remote GDB.
options QUOTA  # enable disk quotas

device apic  # I/O APIC

# Bus support. Do not remove isa, even if you have no isa slots
device isa
device eisa
device pci

# Floppy drives
device fdc

# ATA and ATAPI devices
device ata
device atadisk  # ATA disk drives
device ataraid  # ATA RAID drives
device atapicd  # ATAPI CDROM drives
device atapifd  # ATAPI floppy drives
device atapist  # ATAPI tape drives
options ATA_STATIC_ID  # Static device numbering

# SCSI Controllers
device ahb  # EISA AHA1742 family
device ahc  # AHA2940 and onboard AIC7xxx devices
device ahd  # AHA39320/29320 and onboard AIC79xx devices
device amd  # AMD 53C974 (Tekram DC-390(T))
device isp  # Qlogic family
device mpt  # LSI-Logic MPT-Fusion
#device ncr  # NCR/Symbios Logic
device sym  # NCR/Symbios Logic (newer chipsets + those of `ncr')
device trm  # Tekram DC395U/UW/F DC315U adapters

device adv  # Advansys SCSI adapters
device adw  # Advansys wide SCSI adapters
device aha  # Adaptec 154x SCSI adapters
device aic  # Adaptec 15[012]x SCSI adapters, AIC-6[23]60.
device bt  # Buslogic/Mylex MultiMaster SCSI adapters

device ncv  # NCR 53C500
device nsp  # Workbit Ninja SCSI-3
device stg  # TMC 18C30/18C50

# SCSI peripherals
device scbus  # SCSI bus (required for SCSI)
device ch  # SCSI media changers
device da  # Direct Access (disks)
device sa  # Sequential Access (tape etc)
device cd  # CD
device pass  # Passthrough device (direct SCSI access)
device ses  # SCSI Environmental Services (and SAF-TE)

# RAID controllers interfaced to the SCSI subsystem
device amr  # AMI MegaRAID
device arcmsr  # Areca SATA II RAID
device asr  # DPT SmartRAID V, VI and Adaptec SCSI RAID
device ciss  # Compaq Smart RAID 5*
device dpt  # DPT Smartcache III, IV - See NOTES for options
device hptmv  # Highpoint RocketRAID 182x
device iir  # Intel Integrated RAID
device ips  # IBM (Adaptec) ServeRAID
device mly  # Mylex AcceleRAID/eXtremeRAID
device twa  # 3ware 9000 series PATA/SATA RAID

# RAID controllers
device aac  # Adaptec FSA RAID
device aacp  # SCSI passthrough for aac (requires CAM)
device ida  # Compaq Smart RAID
device mlx  # Mylex DAC960 family
device pst  # Promise Supertrak SX6000
device twe  # 3ware ATA RAID

# atkbdc0 controls both the keyboard and the PS/2 mouse
device atkbdc  # AT keyboard controller
device atkbd  # AT keyboard
device psm  # PS/2 mouse

device vga  # VGA video card driver

device splash  # Splash screen and screen saver support

# syscons is the default console driver, resembling an SCO console
device sc

# Enable this for the pcvt (VT220 compatible) console driver
#device vt
#options XSERVER  # support for X server on a vt console
#options FAT_CURSOR  # start with block cursor

device agp  # support several AGP chipsets

# Floating point support - do not disable.
device npx

# Power management support (see NOTES for more options)
#device apm
# Add suspend/resume support for the i8254.
device pmtimer

# PCCARD (PCMCIA) support
# PCMCIA and cardbus bridge support
device cbb  # cardbus (yenta) bridge
device pccard  # PC Card (16-bit) bus
device cardbus  # CardBus (32-bit) bus

# Serial (COM) ports
device sio  # 8250, 16[45]50 based serial ports

# Parallel port
device ppc
device ppbus  # Parallel port bus (required)
device lpt  # Printer
device plip  # TCP/IP over parallel
device ppi  # Parallel port interface device
#device vpo  # Requires scbus and da

# If you've got a "dumb" serial or parallel PCI card that is
# supported by the puc(4) glue driver, uncomment the following
# line to enable it (connects to the sio and/or ppc drivers):
#device puc

# PCI Ethernet NICs.
device de  # DEC/Intel DC21x4x (``Tulip'')
device em  # Intel PRO/1000 adapter Gigabit Ethernet Card
device ixgb  # Intel PRO/10GbE Ethernet Card
device txp  # 3Com 3cR990 (``Typhoon'')
device vx  # 3Com 3c590, 3c595 (``Vortex'')

# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device miibus  # MII bus support
device bfe  # Broadcom BCM440x 10/100 Ethernet
device bge  # Broadcom BCM570xx Gigabit Ethernet
device dc  # DEC/Intel 21143 and various workalikes
device fxp  # Intel EtherExpress PRO/100B (82557, 82558)
device lge  # Level 1 LXT1001 gigabit ethernet
device nge  # NatSemi DP83820 gigabit ethernet
device pcn  # AMD Am79C97x PCI 10/100 (precedence over 'lnc')
device re  # RealTek 8139C+/8169/8169S/8110S
device rl  # RealTek 8129/8139
device sf  # Adaptec AIC-6915 (``Starfire'')
device sis  # Silicon Integrated Systems SiS 900/SiS 7016
device sk  # SysKonnect SK-984x & SK-982x gigabit Ethernet
device ste  # Sundance ST201 (D-Link DFE-550TX)
device ti  # Alteon Networks Tigon I/II gigabit Ethernet
device tl  # Texas Instruments ThunderLAN
device tx  # SMC EtherPower II (83c170 ``EPIC'')
device vge  # VIA VT612x gigabit ethernet
device vr  # VIA Rhine, Rhine II
device wb  # Winbond W89C840F
device xl  # 3Com 3c90x (``Boomerang'', ``Cyclone'')

# ISA Ethernet NICs. pccard NICs included.
device cs  # Crystal Semiconductor CS89x0 NIC
# 'device ed' requires 'device miibus'
device ed  # NE[12]000, SMC Ultra, 3c503, DS8390 cards
device ex  # Intel EtherExpress Pro/10 and Pro/10+
device ep  # Etherlink III based cards
device fe  # Fujitsu MB8696x based cards
device ie  # EtherExpress 8/16, 3C507, StarLAN 10 etc.
device lnc  # NE2100, NE32-VL Lance Ethernet cards
device sn  # SMC's 9000 series of Ethernet chips
device xe  # Xircom pccard Ethernet

# ISA devices that use the old ISA shims
#device le

# Wireless NIC cards
device wlan  # 802.11 support
device an  # Aironet 4500/4800 802.11 wireless NICs.
device awi  # BayStack 660 and others
device wi  # WaveLAN/Intersil/Symbol 802.11 wireless NICs.
#device wl  # Older non 802.11 Wavelan wireless NIC.

# Pseudo devices.
device loop  # Network loopback
device mem  # Memory and kernel memory devices
device io  # I/O device
device random  # Entropy device
device ether  # Ethernet support
device sl  # Kernel SLIP
device ppp  # Kernel PPP
device tun  # Packet tunnel.
device pty  # Pseudo-ttys (telnet etc)
device md  # Memory "disks"
device gif  # IPv6 and IPv4 tunneling
device faith  # IPv6-to-IPv4 relaying (translation)

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
# Note that 'bpf' is required for DHCP.
device bpf  # Berkeley packet filter

# USB support
#device uhci  # UHCI PCI->USB interface
#device ohci  # OHCI PCI->USB interface
##device ehci  # EHCI PCI->USB interface (USB 2.0)
#device usb  # USB Bus (required)
##device udbp  # USB Double Bulk Pipe devices
#device ugen  # Generic
#device uhid  # "Human Interface Devices"
#device ukbd  # Keyboard
#device ulpt  # Printer
#device umass  # Disks/Mass storage - Requires scbus and da
#device ums  # Mouse
#device urio  # Diamond Rio 500 MP3 player
#device uscanner  # Scanners
## USB Ethernet, requires mii
#device aue  # ADMtek USB Ethernet
#device axe  # ASIX Electronics USB Ethernet
#device cdce  # Generic USB over Ethernet
#device cue  # CATC USB Ethernet
#device kue  # Kawasaki LSI USB Ethernet
#device rue  # RealTek RTL8150 USB Ethernet

# FireWire support
device firewire  # FireWire bus code
device sbp  # SCSI over FireWire (Requires scbus and da)
device fwe  # Ethernet over FireWire (non-standard!)

Frode Nordahl
frode@nordahl.net
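The decimal addresses that kgdb prints in the backtrace can be cross-checked against the hexadecimal panic string with a few lines of arithmetic. The sketch below only restates numbers copied from the report; the constant and function names are made up for illustration, and the 8-byte descriptor size is an assumption about i386 LDT entries, not something stated in the post.

```python
# Values copied from the panic string, backtrace and loader.conf in the report.
PANIC_VA = 0xFFC00000        # va= in the panic string
FAULT_VA = 4290772992        # va / vaddr in frames #3 and #4
WIRE_START = 4289822720      # start in frames #5 and #6
WIRE_END = 4291231744        # end in frames #5 and #6
KMEM_ALLOC_SIZE = 1409024    # size in frame #7 (kmem_alloc)
LDT_LEN = 176128             # len in frame #8 (user_ldt_alloc)
KMEM_SIZE_MAX = 419430400    # vm.kmem_size_max from loader.conf

DESCRIPTOR_BYTES = 8         # assumption: one i386 LDT descriptor is 8 bytes
PAGE_SIZE = 4096             # i386 page size


def summarize():
    """Return the derived values as a dict so they can be printed or checked."""
    return {
        "fault_va_hex": hex(FAULT_VA),
        "wire_range_hex": (hex(WIRE_START), hex(WIRE_END)),
        "wire_bytes": WIRE_END - WIRE_START,
        "wire_pages": (WIRE_END - WIRE_START) // PAGE_SIZE,
        "ldt_bytes": LDT_LEN * DESCRIPTOR_BYTES,
        "kmem_size_max_mib": KMEM_SIZE_MAX // (1024 * 1024),
        "fault_inside_wire_range": WIRE_START <= FAULT_VA < WIRE_END,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Read this way, the faulting address in frames #3 and #4 is the same 0xffc00000 as in the panic string, it lies inside the range being wired in frames #5 and #6, and that range is exactly the size requested from kmem_alloc in frame #7.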
Frode Nordahl
2005-Dec-14 23:42 UTC
panic: pmap_enter: invalid page directory pdir=0xc1e063, va=0xffc00000
On 4 Dec 2005, at 23:44, Frode Nordahl wrote:

> Hello,
>
> After almost 6 months of problem-free operation, one of my NFS
> servers has suddenly started to panic and reboot automatically,
> about every 24 hours.
>
> This is a dual Xeon (with UP kernel for the time being) SE7501HG2
> with 4 GB RAM and an Intel SRCU42X RAID controller (AMI MegaRAID).
>
> I ran 5.4-RELEASE-p2 on it, but had no kernel.debug for that, so I
> installed 5.4-RELEASE-p8 on it to get some information from the
> crashes (see below). I have also installed a UP kernel to rule out
> any SMP-related problems.
>
> I found no PRs referencing this exact problem, and no mailing list
> postings addressing it (well, I found one thread where someone
> provoked this by tweaking the wrong numbers).
>
> My gut feeling tells me that this is a tuning problem, but I really
> don't know where to start, so any pointers would be great!
>
> By habit I have this in my loader.conf: vm.kmem_size_max=419430400

[snip]

It turns out this was caused by an erratic backup script. It runs full and incremental backups on two FireWire disks, automatically mounting the second one when the first is full. For some reason the second got full as well, and the script ended up remounting the device over 340 times (!) trying to squeeze some more bits out of the unwilling device :-)

I will set up a test case on a crash box and either confirm that 6.0 does not have this problem or file a complete PR with test scripts etc.

Frode Nordahl
frode@nordahl.net
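The failure mode described above (both disks full, the script mounting again and again) can be avoided by trying each disk at most once and then giving up. The following is only a minimal sketch of that idea, not the actual backup script, which was not posted; `pick_backup_disk`, `AllDisksFull` and the `mount` / `has_free_space` callables are hypothetical names introduced for illustration.

```python
class AllDisksFull(RuntimeError):
    """Raised when every candidate disk has been tried and none has room."""


def pick_backup_disk(disks, mount, has_free_space):
    """Mount each disk at most once, in order; return the first with space.

    `mount` and `has_free_space` are caller-supplied callables (hypothetical
    here) that take a disk name. The loop is bounded by len(disks), so a
    full set of disks ends in an error instead of repeated remounts.
    """
    for disk in disks:
        mount(disk)
        if has_free_space(disk):
            return disk
    raise AllDisksFull(f"no free space on any of {len(disks)} disks")


if __name__ == "__main__":
    mounted = []
    try:
        # Simulate the situation from the post: both disks are full.
        pick_backup_disk(["fw0", "fw1"], mounted.append, lambda disk: False)
    except AllDisksFull as err:
        print(f"giving up after {len(mounted)} mounts: {err}")
```

With two full disks this performs two mounts and then stops, leaving it to the operator to free space or swap disks.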