Anyone? Bueller?
Just had another of these crashes, after moving the disks and new memory
to a new DL380G4 box. Starting to look very much like a kernel problem,
but I do not know the best way to approach such a debugging task.
Any advice is appreciated.
-Alan
Alan Sparks said:
> Been having this problem, and posted about it before. Thinking it was a
> memory issue, I've replaced all memory on the server. However, the
> problem has continued.
>
> Server is a Proliant DL380 (8GB RAM, 2 Xeon CPU), running CentOS 3.6, all
> patches up-to-date. Kernel is 2.4.21-40.ELsmp (problem seems to have
> first manifested on kernel 2.4.21-37.0.1.ELsmp). Disk is CCISS hardware
> RAID-5, straight partitioning (no LVM).
>
> This server runs an Oracle 10g instance, async I/O enabled to a NetApp
> filer where most data is stored.
>
> Traceback of latest failure follows (two crashes this morning). Anyone
> read these things well enough to tell me if there's any insight in this?
> There are no nVidia drivers loaded, only stock kernel modules.
>
> Thanks in advance for any insight.
> -Alan
>
>
> Apr 10 05:32:22 db01-01 kernel: page not mapped. erroring out.
> Apr 10 05:32:22 db01-01 kernel: Page has mapping still set. This is a serious situation. However if you
> Apr 10 05:32:22 db01-01 kernel: are using the NVidia binary only module please report this bug to
> Apr 10 05:32:22 db01-01 kernel: NVidia and not to the linux kernel mailinglist.
> Apr 10 05:32:22 db01-01 kernel: ------------[ cut here ]------------
> Apr 10 05:32:22 db01-01 kernel: kernel BUG at page_alloc.c:225!
> Apr 10 05:32:22 db01-01 kernel: invalid operand: 0000
> Apr 10 05:32:22 db01-01 kernel: sg nfs lockd sunrpc tg3 microcode keybdev mousedev hid input ehci-hcd usb-uhci usbcore ext3 jbd cciss sd_mod scsi_mod
> Apr 10 05:32:22 db01-01 kernel: CPU: 1
> Apr 10 05:32:22 db01-01 kernel: EIP: 0060:[<c0159560>] Not tainted
> Apr 10 05:32:22 db01-01 kernel: EFLAGS: 00010286
> Apr 10 05:32:22 db01-01 kernel:
> Apr 10 05:32:22 db01-01 kernel: EIP is at __free_pages_ok [kernel] 0x3e0 (2.4.21-40.ELsmp/i686)
> Apr 10 05:32:22 db01-01 kernel: eax: 00000033 ebx: c797dd38 ecx: 00000001 edx: c0387e98
> Apr 10 05:32:22 db01-01 kernel: esi: f4402880 edi: 00000000 ebp: 00000000 esp: cd7d5ec8
> Apr 10 05:32:22 db01-01 kernel: ds: 0068 es: 0068 ss: 0068
> Apr 10 05:32:22 db01-01 kernel: Process keventd (pid: 6, stackpage=cd7d5000)
> Apr 10 05:32:22 db01-01 kernel: Stack: c02c1ea8 00000363 c000a750 ff0ea000 c0440280 00000000 cdbac000 efc21f00
> Apr 10 05:32:22 db01-01 kernel: 00000000 00000001 00000001 00000086 dab95054 00000001 f4402880 00000000
> Apr 10 05:32:22 db01-01 kernel: 00000000 c014cf3e 00000001 00000000 00000000 cd7d4000 00000000 00000e00
> Apr 10 05:32:22 db01-01 kernel: Call Trace: [<c014cf3e>] __iodesc_free [kernel] 0xde (0xcd7d5f0c)
> Apr 10 05:32:22 db01-01 kernel: [<c0161e9c>] kmap_high [kernel] 0x5c (0xcd7d5f28)
> Apr 10 05:32:22 db01-01 kernel: [<c014d87b>] __iodesc_read_finish [kernel] 0x22b (0xcd7d5f38)
> Apr 10 05:32:22 db01-01 kernel: [<c01302ca>] __run_task_queue [kernel] 0x6a (0xcd7d5f74)
> Apr 10 05:32:22 db01-01 kernel: [<c013c9ad>] context_thread [kernel] 0x13d (0xcd7d5f8c)
> Apr 10 05:32:22 db01-01 kernel: [<c013c870>] context_thread [kernel] 0x0 (0xcd7d5fe0)
> Apr 10 05:32:22 db01-01 kernel: [<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xcd7d5ff0)
> Apr 10 05:32:22 db01-01 kernel:
> Apr 10 05:32:22 db01-01 kernel: Code: 0f 0b e1 00 33 17 2c c0 e9 6c fc ff ff 9c 5a fa f0 fe 0d 70
> Apr 10 05:32:22 db01-01 kernel:
> Apr 10 05:32:22 db01-01 kernel: Kernel panic: Fatal exception
>
> ==========
> Alan Sparks, UNIX/Linux Systems Administrator
> <asparks at doublesparks.net>
==========
Alan Sparks, UNIX/Linux Systems Administrator
<asparks at doublesparks.net>