Hi All,

I seem to be able to reproduce a null pointer dereference and paging
request errors in 1.2. Can anyone give me any pointers on tracking down
what is causing it?

This is with a 32Mb virtual domain, running debian woody, NFS root,
256Mb swap in a local VD, while running a process which builds openldap,
python2.2.3, and related packages. I'm not sure which package, if any
in particular, is causing this; could be just anything that causes a
similar workload. This particular set of messages appeared before the
virtual domain locked up during the openldap build...

Steve

DOM26: Unable to handle kernel paging request at virtual address 20000001
DOM26: printing eip:
DOM26: c0007743
DOM26: *pde=00000000(00000000)
DOM26: Oops: 0000
DOM26: CPU: 0
DOM26: EIP: 0819:[<c0007743>] Not tainted
DOM26: EFLAGS: 00010202
DOM26: eax: 00000001 ebx: 20000001 ecx: c0a79e6c edx: c0a79e6c
DOM26: esi: c0a78000 edi: c0114254 ebp: c1e5f580 esp: c0a79ce4
DOM26: ds: 0821 es: 0821 ss: 0821
DOM26: Process sh (pid: 10086, stackpage=c0a79000)<1>
DOM26: Stack: 20000001 c0a78000 c002c728 c0a78000 c0a79db0 c0114254 ffffffb0 c0a78000
DOM26: c003e789 c0a79e6c c014b5ac c003e314 c0a79e6c 00000000 c0114250 c0a79de8
DOM26: c1419640 80000000 00000000 00000000 00000000 00000000 00000000 00000000
DOM26: Call Trace: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>] [<c002cf4a>]
DOM26: [<c002cf61>] [<c0090033>] [<c00914bf>]
DOM26:
DOM26: <1>Unable to handle kernel paging request at virtual address 20000001
DOM26: printing eip:
DOM26: c000af0f
DOM26: *pde=00000000(00000000)
DOM26: Oops: 0002
DOM26: CPU: 0
DOM26: EIP: 0819:[<c000af0f>] Not tainted
DOM26: EFLAGS: 00010282
DOM26: eax: 20000001 ebx: c1ed5b20 ecx: c0a78264 edx: c0a78264
DOM26: esi: 00000000 edi: 20000001 ebp: 0000000b esp: c0a79bb4
DOM26: ds: 0821 es: 0821 ss: 0821
DOM26: Process sh (pid: 10086, stackpage=c0a79000)<1>
DOM26: Stack: c1ed5b20 00000000 c0a78000 0000000b 0000000b c000b55f 20000001 0000001f
DOM26: 00000000 c140d6c0 20000001 c0091a87 0000000b 00000000 c1ed5b3c c0096305
DOM26: c0129928 c0a79cb0 00000000 c0a78000 00000000 20000001 c1e5f580 00000000
DOM26: Call Trace: [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>] [<c0018a25>]
DOM26: [<c0018c46>] [<c0018feb>] [<c0018ed4>] [<c006e759>] [<c0091768>] [<c0007743>]
DOM26: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>] [<c002cf4a>] [<c002cf61>]
DOM26: [<c0090033>] [<c00914bf>]
DOM26:
DOM26: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000001
DOM26: printing eip:
DOM26: c000b623
DOM26: *pde=00000000(00000000)
DOM26: Oops: 0002
DOM26: CPU: 0
DOM26: EIP: 0819:[<c000b623>] Not tainted
DOM26: EFLAGS: 00010202
DOM26: eax: 00000000 ebx: 00000001 ecx: c0a78264 edx: c0a78264
DOM26: esi: 00000002 edi: c0a78000 ebp: 0000000b esp: c0a79aa0
DOM26: ds: 0821 es: 0821 ss: 0821
DOM26: Process sh (pid: 10086, stackpage=c0a79000)<1>
DOM26: Stack: 0000001f 00000002 20000001 20000001 c0091a87 0000000b 00000000 00000002
DOM26: c0096305 c0129928 c0a79b80 00000002 c0a78000 00000002 20000001 0000000b
DOM26: 63303039 c101fc58 c0a78000 00000002 c101fc58 ffffffff 00030001 c001e621
DOM26: Call Trace: [<c0091a87>] [<c0096305>] [<c001e621>] [<c001f6c0>] [<c0008996>]
DOM26: [<c00200d7>] [<c00204e1>] [<c001464d>] [<c0014c92>] [<c0091768>] [<c000af0f>]
DOM26: [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>] [<c0018a25>] [<c0018c46>]
DOM26: [<c0018feb>] [<c0018ed4>] [<c006e759>] [<c0091768>] [<c0007743>] [<c002c728>]
DOM26: [<c003e789>] [<c003e314>] [<c002ccc7>] [<c002cf4a>] [<c002cf61>] [<c0090033>]
DOM26: [<c00914bf>]
DOM26:

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org

-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn _______________________________________________ Xen-devel mailing list Xen-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xen-devel
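As a first step in reading the traces above, the "Oops: 0000" and "Oops: 0002" values can be decoded by hand. The following is only a minimal sketch, assuming the usual i386 page-fault error-code layout described in the comment in arch/i386/mm/fault.c (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name `decode_oops_code` is made up for this example.

```python
# Sketch: decode the page-fault error code printed as "Oops: NNNN" by an
# i386 Linux 2.4 kernel. Bit meanings are assumed to follow the comment in
# arch/i386/mm/fault.c; decode_oops_code is a hypothetical helper name.

def decode_oops_code(code_hex: str) -> dict:
    """Return a human-readable breakdown of the low three error-code bits."""
    code = int(code_hex, 16)
    return {
        "cause": "protection fault" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }

# The two codes that appear in the traces above.
for code in ("0000", "0002"):
    print(code, decode_oops_code(code))
```

Read this way, the first oops is a kernel-mode read of a non-present page and the next two are kernel-mode writes. The faulting address 20000001 also shows up in ebx (first oops) and in eax and edi (second oops), which may suggest a corrupted pointer being passed along rather than a random access.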
stevegt@TerraLuna.Org
2004-Feb-10 04:13 UTC
[Xen-devel] paging request failures under load (was: Re: Null pointer deference)
Okay, the problem still exists when I bump the memory up to 256Mb, and
never swap. I.E. I've found no workaround. Hasn't anyone else hit
anything like this?

Steve

DOM3: xen_console_init
DOM3: Linux version 2.4.24-xeno (stevegt@pathfinder) (gcc version 3.0.4) #16 Mon Feb 2 17:46:41 PST 2004
DOM3: On node 0 totalpages: 65536
DOM3: zone(0): 4096 pages.
DOM3: zone(1): 61440 pages.
DOM3: zone(2): 0 pages.
DOM3: Kernel command line: ip=64.71.149.20:10.27.2.50:64.71.149.1:255.255.255.0::eth0:off root=/dev/nfs nfsroot=/export//xen/fs/stevegt/tcx/root 4 DOMID=20
DOM3: Initializing CPU#0
DOM3: Xen reported: 398.780 MHz processor.
DOM3: Calibrating delay loop... 1592.52 BogoMIPS
DOM3: Memory: 257132k/262144k available (1078k kernel code, 5012k reserved, 308k data, 52k init, 0k highmem)
DOM3: Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
DOM3: Inode cache hash table entries: 16384 (order: 5, 131072 bytes)
DOM3: Mount cache hash table entries: 512 (order: 0, 4096 bytes)
DOM3: Buffer cache hash table entries: 16384 (order: 4, 65536 bytes)
DOM3: Page-cache hash table entries: 65536 (order: 6, 262144 bytes)
DOM3: CPU: L1 I cache: 16K, L1 D cache: 16K
DOM3: CPU: L2 cache: 512K
DOM3: CPU: Intel Pentium II (Deschutes) stepping 01
DOM3: POSIX conformance testing by UNIFIX
DOM3: Linux NET4.0 for Linux 2.4
DOM3: Based upon Swansea University Computer Society NET3.039
DOM3: Initializing RT netlink socket
DOM3: Starting kswapd
DOM3: Journalled Block Device driver loaded
DOM3: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
DOM3: Xeno console successfully installed
DOM3: Starting Xeno Balloon driver
DOM3: pty: 256 Unix98 ptys configured
DOM3: RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
DOM3: loop: loaded (max 8 devices)
DOM3: NET4: Linux TCP/IP 1.0 for NET4.0
DOM3: IP Protocols: ICMP, UDP, TCP
DOM3: IP: routing cache hash table of 2048 buckets, 16Kbytes
DOM3: TCP: Hash tables configured (established 16384 bind 16384)
DOM3: IP-Config: Complete:
DOM3: device=eth0, addr=64.71.149.20, mask=255.255.255.0, gw=64.71.149.1,
DOM3: host=64.71.149.20, domain=, nis-domain=(none),
DOM3: bootserver=10.27.2.50, rootserver=10.27.2.50, rootpath=
DOM3: ip_conntrack version 2.1 (2048 buckets, 16384 max) - 292 bytes per conntrack
DOM3: ip_tables: (C) 2000-2002 Netfilter core team
DOM3: NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
DOM3: Looking up port of RPC 100003/2 on 10.27.2.50
DOM3: Looking up port of RPC 100005/1 on 10.27.2.50
DOM3: VFS: Mounted root (nfs filesystem).
DOM3: Freeing unused kernel memory: 52k freed
DOM3: INIT: version 2.84 booting
DOM3: Activating swap.
DOM3: Adding Swap: 262136k swap-space (priority -1)
DOM3: Checking root file system...
DOM3: fsck 1.27 (8-Mar-2002)
DOM3: 10.27.2.50:/export/xen/fs/stevegt/tcx: NFS file system.
DOM3: System time was Tue Feb 10 02:14:36 UTC 2004.
DOM3: Setting the System Clock using the Hardware Clock as reference...
DOM3: modprobe: modprobe: Can't locate module char-major-10-135
DOM3: modprobe: modprobe: Can't locate module char-major-4
DOM3: hwclock is unable to get I/O port access: the iopl(3) call failed.
DOM3: modprobe: modprobe: Can't locate module char-major-10-135
DOM3: modprobe: modprobe: Can't locate module char-major-4
DOM3: System Clock set. System local time is now Tue Feb 10 02:14:36 UTC 2004.
DOM3: Calculating module dependencies... depmod: cannot read ELF header from /lib/modules/2.4.24-xeno/modules.dep
DOM3: depmod: cannot read ELF header from /lib/modules/2.4.24-xeno/modules.generic_string
DOM3: depmod: /lib/modules/2.4.24-xeno/modules.ieee1394map is not an ELF file
DOM3: depmod: /lib/modules/2.4.24-xeno/modules.isapnpmap is not an ELF file
DOM3: depmod: cannot read ELF header from /lib/modules/2.4.24-xeno/modules.parportmap
DOM3: depmod: /lib/modules/2.4.24-xeno/modules.pcimap is not an ELF file
DOM3: depmod: cannot read ELF header from /lib/modules/2.4.24-xeno/modules.pnpbiosmap
DOM3: depmod: /lib/modules/2.4.24-xeno/modules.usbmap is not an ELF file
DOM3: done.
DOM3: Loading modules:
DOM3: Checking all file systems...
DOM3: fsck 1.27 (8-Mar-2002)
DOM3: Setting kernel variables.
DOM3: Loading the saved-state of the serial devices...
DOM3: Mounting local filesystems...
DOM3: nothing was mounted
DOM3: Running 0dns-down to make sure resolv.conf is ok...done.
DOM3: Cleaning: /etc/network/ifstate.
DOM3: Setting up IP spoofing protection: rp_filter.
DOM3: Configuring network interfaces: done.
DOM3: Mounting remote filesystems...
DOM3:
DOM3: Setting the System Clock using the Hardware Clock as reference...
DOM3: System Clock set. Local time: Tue Feb 10 02:14:37 UTC 2004
DOM3:
DOM3: Cleaning: /tmp /var/lock /var/run.
DOM3: Initializing random number generator... done.
DOM3: Recovering nvi editor sessions... done.
DOM3: INIT: Entering runlevel: 4
DOM3: Starting system log daemon: syslogd.
DOM3: Starting kernel log daemon: klogd.
DOM3: Starting internet superserver: inetd.
DOM3: Starting PCMCIA services: module directory /lib/modules/2.4.24-xeno/pcmcia not found.
DOM3: Starting OpenBSD Secure Shell server: sshd.
DOM3: Starting deferred execution scheduler: atd.
DOM3: Starting periodic command scheduler: cron.
DOM3: INIT: no more processes left in this runlevel
DOM3: Unable to handle kernel paging request at virtual address 20000001
DOM3: printing eip:
DOM3: c0007743
DOM3: *pde=00000000(00000000)
DOM3: Oops: 0000
DOM3: CPU: 0
DOM3: EIP: 0819:[<c0007743>] Not tainted
DOM3: EFLAGS: 00010202
DOM3: eax: 00000001 ebx: 20000001 ecx: c3ebde6c edx: c3ebde6c
DOM3: esi: c3ebc000 edi: c0114254 ebp: c46c1060 esp: c3ebdce4
DOM3: ds: 0821 es: 0821 ss: 0821
DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
DOM3: Stack: 20000001 c3ebc000 c002c728 c3ebc000 c3ebddb0 c0114254 ffffffb0 c3ebc000
DOM3: c003e789 c3ebde6c c014b5ac c003e314 c3ebde6c 00000000 c0114250 00000000
DOM3: 00000000 00000000 01082003 8d588810 00000000 00000000 00000000 00000000
DOM3: Call Trace: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>] [<c002cf4a>]
DOM3: [<c002cf61>] [<c0090033>] [<c00914bf>]
DOM3:
DOM3: <1>Unable to handle kernel paging request at virtual address 20000001
DOM3: printing eip:
DOM3: c000af0f
DOM3: *pde=00000000(00000000)
DOM3: Oops: 0002
DOM3: CPU: 0
DOM3: EIP: 0819:[<c000af0f>] Not tainted
DOM3: EFLAGS: 00010282
DOM3: eax: 20000001 ebx: c485c0a0 ecx: c3ebc264 edx: c3ebc264
DOM3: esi: 00000000 edi: 20000001 ebp: 0000000b esp: c3ebdbb4
DOM3: ds: 0821 es: 0821 ss: 0821
DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
DOM3: Stack: c485c0a0 00000000 c3ebc000 0000000b 0000000b c000b55f 20000001 0000001f
DOM3: 00000000 cf4227e0 20000001 c0091a87 0000000b 00000000 c485c0bc c0096305
DOM3: c0129928 c3ebdcb0 00000000 c3ebc000 00000000 20000001 c46c1060 00000000
DOM3: Call Trace: [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>] [<c0018a25>]
DOM3: [<c0018c46>] [<c0018feb>] [<c0018ed4>] [<c006e759>] [<c0091768>] [<c0007743>]
DOM3: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>] [<c002cf4a>] [<c002cf61>]
DOM3: [<c0090033>] [<c00914bf>]
DOM3:
DOM3: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000001
DOM3: printing eip:
DOM3: c000b623
DOM3: *pde=00000000(00000000)
DOM3: Oops: 0002
DOM3: CPU: 0
DOM3: EIP: 0819:[<c000b623>] Not tainted
DOM3: EFLAGS: 00010202
DOM3: eax: 00000000 ebx: 00000001 ecx: c3ebc264 edx: c3ebc264
DOM3: esi: 00000002 edi: c3ebc000 ebp: 0000000b esp: c3ebdaa0
DOM3: ds: 0821 es: 0821 ss: 0821
DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
DOM3: Stack: 0000001f 00000002 20000001 20000001 c0091a87 0000000b 00000000 00000002
DOM3: c0096305 c0129928 c3ebdb80 00000002 c3ebc000 00000002 20000001 0000000b
DOM3: 63303039 ffffffff c3ebc000 00000002 38383130 38643538 00030001 64303030
DOM3: Call Trace: [<c0091a87>] [<c0096305>] [<c0008996>] [<c000f797>] [<c000f991>]
DOM3: [<c0091768>] [<c000af0f>] [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>]
DOM3: [<c0018a25>] [<c0018c46>] [<c0018feb>] [<c0018ed4>] [<c006e759>] [<c0091768>]
DOM3: [<c0007743>] [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>] [<c002cf4a>]
DOM3: [<c002cf61>] [<c0090033>] [<c00914bf>]
DOM3:

On Mon, Feb 09, 2004 at 05:25:00PM -0800, wrote:
> Hi All,
>
> I seem to be able to reproduce a null pointer dereference and paging
> request errors in 1.2. Can anyone give me any pointers on tracking down
> what is causing it?
>
> This is with a 32Mb virtual domain, running debian woody, NFS root,
> 256Mb swap in a local VD, while running a process which builds openldap,
> python2.2.3, and related packages. I'm not sure which package, if any
> in particular, is causing this; could be just anything that causes a
> similar workload. This particular set of messages appeared before the
> virtual domain locked up during the openldap build...
stevegt@TerraLuna.Org
2004-Feb-10 06:34 UTC
[Xen-devel] Re: paging request failures under load (was: Re: Null pointer deference)
Before anyone burns too much time on this, hang on -- I wasn't able to
duplicate the problem on another cluster node (both nodes were built
from the same SystemImager image). I'm looking for the reason why, and
will let you know as soon as I do.

Steve

On Mon, Feb 09, 2004 at 08:13:22PM -0800, wrote:
> Okay, the problem still exists when I bump the memory up to 256Mb, and
> never swap. I.E. I've found no workaround. Hasn't anyone else hit
> anything like this?
>
> Steve
> I seem to be able to reproduce a null pointer dereference and paging
> request errors in 1.2. Can anyone give me any pointers on tracking down
> what is causing it?

Ouch! We haven't seen one of these in a _very_ long time. Our systems
generally don't make heavy use of swap, so I suspect this could be the
problem.

You should be able to process the Oops message as per a standard Linux
kernel (see Documentation/oops-tracing.txt). As a quick check, look up
the EIP in System.map and see what function it blew up in. However, if
it is a paging fault, I'm not sure how useful the Oops message will be.

We'll try and recreate locally.

Thanks,
Ian
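The quick check suggested above (look up the EIP in System.map) amounts to finding the last symbol whose address is at or below the faulting address. The following is a minimal sketch of that lookup, assuming the usual three-column "address type name" System.map layout. The helper names and the sample symbols are invented for illustration and are not taken from the 2.4.24-xeno build; ksymoops, which Documentation/oops-tracing.txt describes, is the proper tool for a full decode.

```python
import bisect

def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (addr, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def lookup(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base

# HYPOTHETICAL System.map excerpt: these names and addresses are made up
# for the example and do not come from the kernel in this thread.
sample = """\
c0007000 T example_func_a
c0007700 T example_func_b
c0007800 T example_func_c
""".splitlines()

symbols = load_system_map(sample)
print(lookup(symbols, 0xc0007743))  # EIP of the first oops in this thread
```

Against a real build, `load_system_map(open("System.map"))` would replace the sample data, and the same lookup could be repeated for each address in the Call Trace lines.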
Looks like yesterday's paging problems were an artifact of something on
that particular dom0 disk image -- I haven't been able to reproduce it
on other nodes, and after I re-imaged the same node (with the same
image), the problem has gone away there too, so that rules out hardware.

However, something else did pop up -- while trying to break things with
"perl -e '$a="a"x100000000'", I got the following messages; do we care?
This is in mainstream linux mm/page_alloc.c, and it's hard for me to
tell from the code whether these are outright errors or whether they
were recoverable. Does anyone know?

DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)

This was still on last Monday's version of 1.2; I'll see if I can
reproduce it in today's 1.2; I'm deploying that later tonight.

Steve

On Mon, Feb 09, 2004 at 10:34:34PM -0800, wrote:
> Before anyone burns too much time on this, hang on -- I wasn't able to
> duplicate the problem on another cluster node (both nodes were built
> from the same SystemImager image). I'm looking for the reason why, and
> will let you know as soon as I do.
>
> Steve
> > DOM3: Kernel command line:
> > ip=64.71.149.20:10.27.2.50:64.71.149.1:255.255.255.0::eth0:off
> > root=/dev/nfs nfsroot=/export//xen/fs/stevegt/tcx/root 4 DOMID=20
> > DOM3: Initializing CPU#0
> > DOM3: Xen reported: 398.780 MHz processor.
> > DOM3: Calibrating delay loop... 1592.52 BogoMIPS
> > DOM3: Memory: 257132k/262144k available (1078k kernel code, 5012k
> > reserved, 308k data, 52k init, 0k highmem)
> > DOM3: Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
> > DOM3: Inode cache hash table entries: 16384 (order: 5, 131072 bytes)
> > DOM3: Mount cache hash table entries: 512 (order: 0, 4096 bytes)
> > DOM3: Buffer cache hash table entries: 16384 (order: 4, 65536 bytes)
> > DOM3: Page-cache hash table entries: 65536 (order: 6, 262144 bytes)
> > DOM3: CPU: L1 I cache: 16K, L1 D cache: 16K
> > DOM3: CPU: L2 cache: 512K
> > DOM3: CPU: Intel Pentium II (Deschutes) stepping 01
> > DOM3: POSIX conformance testing by UNIFIX
> > DOM3: Linux NET4.0 for Linux 2.4
> > DOM3: Based upon Swansea University Computer Society NET3.039
> > DOM3: Initializing RT netlink socket
> > DOM3: Starting kswapd
> > DOM3: Journalled Block Device driver loaded
> > DOM3: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
> > DOM3: Xeno console successfully installed
> > DOM3: Starting Xeno Balloon driver
> > DOM3: pty: 256 Unix98 ptys configured
> > DOM3: RAMDISK driver initialized: 16 RAM disks of 4096K size 1024
> > blocksize
> > DOM3: loop: loaded (max 8 devices)
> > DOM3: NET4: Linux TCP/IP 1.0 for NET4.0
> > DOM3: IP Protocols: ICMP, UDP, TCP
> > DOM3: IP: routing cache hash table of 2048 buckets, 16Kbytes
> > DOM3: TCP: Hash tables configured (established 16384 bind 16384)
> > DOM3: IP-Config: Complete:
> > DOM3: device=eth0, addr=64.71.149.20, mask=255.255.255.0,
> > gw=64.71.149.1,
> > DOM3: host=64.71.149.20, domain=, nis-domain=(none),
> > DOM3: bootserver=10.27.2.50, rootserver=10.27.2.50, rootpath=
> > DOM3: ip_conntrack version 2.1 (2048 buckets, 16384 max) - 292 bytes
> > per conntrack
> > DOM3: ip_tables: (C) 2000-2002 Netfilter core team
> > DOM3: NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
> > DOM3: Looking up port of RPC 100003/2 on 10.27.2.50
> > DOM3: Looking up port of RPC 100005/1 on 10.27.2.50
> > DOM3: VFS: Mounted root (nfs filesystem).
> > DOM3: Freeing unused kernel memory: 52k freed
> > DOM3: INIT: version 2.84 booting
> > DOM3: Activating swap.
> > DOM3: Adding Swap: 262136k swap-space (priority -1)
> > DOM3: Checking root file system...
> > DOM3: fsck 1.27 (8-Mar-2002)
> > DOM3: 10.27.2.50:/export/xen/fs/stevegt/tcx: NFS file system.
> > DOM3: System time was Tue Feb 10 02:14:36 UTC 2004.
> > DOM3: Setting the System Clock using the Hardware Clock as
> > reference...
> > DOM3: modprobe: modprobe: Can't locate module char-major-10-135
> > DOM3: modprobe: modprobe: Can't locate module char-major-4
> > DOM3: hwclock is unable to get I/O port access: the iopl(3) call
> > failed.
> > DOM3: modprobe: modprobe: Can't locate module char-major-10-135
> > DOM3: modprobe: modprobe: Can't locate module char-major-4
> > DOM3: System Clock set. System local time is now Tue Feb 10 02:14:36
> > UTC 2004.
> > DOM3: Calculating module dependencies... depmod: cannot read ELF
> > header from /lib/modules/2.4.24-xeno/modules.dep
> > DOM3: depmod: cannot read ELF header from
> > /lib/modules/2.4.24-xeno/modules.generic_string
> > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.ieee1394map is not an
> > ELF file
> > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.isapnpmap is not an ELF
> > file
> > DOM3: depmod: cannot read ELF header from
> > /lib/modules/2.4.24-xeno/modules.parportmap
> > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.pcimap is not an ELF
> > file
> > DOM3: depmod: cannot read ELF header from
> > /lib/modules/2.4.24-xeno/modules.pnpbiosmap
> > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.usbmap is not an ELF
> > file
> > DOM3: done.
> > DOM3: Loading modules:
> > DOM3: Checking all file systems...
> > DOM3: fsck 1.27 (8-Mar-2002)
> > DOM3: Setting kernel variables.
> > DOM3: Loading the saved-state of the serial devices...
> > DOM3: Mounting local filesystems...
> > DOM3: nothing was mounted
> > DOM3: Running 0dns-down to make sure resolv.conf is ok...done.
> > DOM3: Cleaning: /etc/network/ifstate.
> > DOM3: Setting up IP spoofing protection: rp_filter.
> > DOM3: Configuring network interfaces: done.
> > DOM3: Mounting remote filesystems...
> > DOM3:
> > DOM3: Setting the System Clock using the Hardware Clock as
> > reference...
> > DOM3: System Clock set. Local time: Tue Feb 10 02:14:37 UTC 2004
> > DOM3:
> > DOM3: Cleaning: /tmp /var/lock /var/run.
> > DOM3: Initializing random number generator... done.
> > DOM3: Recovering nvi editor sessions... done.
> > DOM3: INIT: Entering runlevel: 4
> > DOM3: Starting system log daemon: syslogd.
> > DOM3: Starting kernel log daemon: klogd.
> > DOM3: Starting internet superserver: inetd.
> > DOM3: Starting PCMCIA services: module directory
> > /lib/modules/2.4.24-xeno/pcmcia not found.
> > DOM3: Starting OpenBSD Secure Shell server: sshd.
> > DOM3: Starting deferred execution scheduler: atd.
> > DOM3: Starting periodic command scheduler: cron.
> > DOM3: INIT: no more processes left in this runlevel
> > DOM3: Unable to handle kernel paging request at virtual address
> > 20000001
> > DOM3: printing eip:
> > DOM3: c0007743
> > DOM3: *pde=00000000(00000000)
> > DOM3: Oops: 0000
> > DOM3: CPU: 0
> > DOM3: EIP: 0819:[<c0007743>] Not tainted
> > DOM3: EFLAGS: 00010202
> > DOM3: eax: 00000001 ebx: 20000001 ecx: c3ebde6c edx: c3ebde6c
> > DOM3: esi: c3ebc000 edi: c0114254 ebp: c46c1060 esp: c3ebdce4
> > DOM3: ds: 0821 es: 0821 ss: 0821
> > DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
> > DOM3: Stack: 20000001 c3ebc000 c002c728 c3ebc000 c3ebddb0 c0114254
> > ffffffb0 c3ebc000
> > DOM3: c003e789 c3ebde6c c014b5ac c003e314 c3ebde6c 00000000
> > c0114250 00000000
> > DOM3: 00000000 00000000 01082003 8d588810 00000000 00000000
> > 00000000 00000000
> > DOM3: Call Trace: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>]
> > [<c002cf4a>]
> > DOM3: [<c002cf61>] [<c0090033>] [<c00914bf>]
> > DOM3:
> > DOM3: <1>Unable to handle kernel paging request at virtual address
> > 20000001
> > DOM3: printing eip:
> > DOM3: c000af0f
> > DOM3: *pde=00000000(00000000)
> > DOM3: Oops: 0002
> > DOM3: CPU: 0
> > DOM3: EIP: 0819:[<c000af0f>] Not tainted
> > DOM3: EFLAGS: 00010282
> > DOM3: eax: 20000001 ebx: c485c0a0 ecx: c3ebc264 edx: c3ebc264
> > DOM3: esi: 00000000 edi: 20000001 ebp: 0000000b esp: c3ebdbb4
> > DOM3: ds: 0821 es: 0821 ss: 0821
> > DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
> > DOM3: Stack: c485c0a0 00000000 c3ebc000 0000000b 0000000b c000b55f
> > 20000001 0000001f
> > DOM3: 00000000 cf4227e0 20000001 c0091a87 0000000b 00000000
> > c485c0bc c0096305
> > DOM3: c0129928 c3ebdcb0 00000000 c3ebc000 00000000 20000001
> > c46c1060 00000000
> > DOM3: Call Trace: [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>]
> > [<c0018a25>]
> > DOM3: [<c0018c46>] [<c0018feb>] [<c0018ed4>] [<c006e759>]
> > [<c0091768>] [<c0007743>]
> > DOM3: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>]
> > [<c002cf4a>] [<c002cf61>]
> > DOM3: [<c0090033>] [<c00914bf>]
> > DOM3:
> > DOM3: <1>Unable to handle kernel NULL pointer dereference at virtual
> > address 00000001
> > DOM3: printing eip:
> > DOM3: c000b623
> > DOM3: *pde=00000000(00000000)
> > DOM3: Oops: 0002
> > DOM3: CPU: 0
> > DOM3: EIP: 0819:[<c000b623>] Not tainted
> > DOM3: EFLAGS: 00010202
> > DOM3: eax: 00000000 ebx: 00000001 ecx: c3ebc264 edx: c3ebc264
> > DOM3: esi: 00000002 edi: c3ebc000 ebp: 0000000b esp: c3ebdaa0
> > DOM3: ds: 0821 es: 0821 ss: 0821
> > DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
> > DOM3: Stack: 0000001f 00000002 20000001 20000001 c0091a87 0000000b
> > 00000000 00000002
> > DOM3: c0096305 c0129928 c3ebdb80 00000002 c3ebc000 00000002
> > 20000001 0000000b
> > DOM3: 63303039 ffffffff c3ebc000 00000002 38383130 38643538
> > 00030001 64303030
> > DOM3: Call Trace: [<c0091a87>] [<c0096305>] [<c0008996>] [<c000f797>]
> > [<c000f991>]
> > DOM3: [<c0091768>] [<c000af0f>] [<c000b55f>] [<c0091a87>]
> > [<c0096305>] [<c002eb19>]
> > DOM3: [<c0018a25>] [<c0018c46>] [<c0018feb>] [<c0018ed4>]
> > [<c006e759>] [<c0091768>]
> > DOM3: [<c0007743>] [<c002c728>] [<c003e789>] [<c003e314>]
> > [<c002ccc7>] [<c002cf4a>]
> > DOM3: [<c002cf61>] [<c0090033>] [<c00914bf>]
> > DOM3:
> >
> >
> >
> >
> > On Mon, Feb 09, 2004 at 05:25:00PM -0800, wrote:
> > > Hi All,
> > >
> > > I seem to be able to reproduce a null pointer dereference and paging
> > > request errors in 1.2. Can anyone give me any pointers on tracking down
> > > what is causing it?
> > >
> > > This is with a 32Mb virtual domain, running debian woody, NFS root,
> > > 256Mb swap in a local VD, while running a process which builds openldap,
> > > python2.2.3, and related packages. I'm not sure which package, if any
> > > in particular, is causing this; could be just anything that causes a
> > > similar workload. This particular set of messages appeared before the
> > > virtual domain locked up during the openldap build...
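As a reading aid for the `Oops: 0000` and `Oops: 0002` values in the traces above: on i386 that number is the page-fault error code, and its low three bits can be decoded as in the sketch below. The bit meanings are the standard x86 ones; the function name is just an illustrative choice, and the sketch assumes the value is printed in hex, as the stock 2.4 handler does.

```python
def decode_oops_code(text):
    """Decode the low three bits of an i386 page-fault error code."""
    code = int(text, 16)  # the Oops value is printed in hex
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 2 else "read",
        # bit 2: 0 = fault taken in kernel mode, 1 = user mode
        "mode": "user" if code & 4 else "kernel",
    }


for value in ("0000", "0002"):
    print(value, decode_oops_code(value))
```

Read this way, the first trace is a kernel-mode read of an unmapped page and the following two are kernel-mode writes to unmapped pages.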
That is odd. What that means is that something tried to allocate a page
without passing in GFP_WAIT as one of the flags, and there were no pages
that could be freed in any of the zones. Does this not happen when
running 2.4.24 directly on the machine with the same amount of memory?

-Kip

On Tue, 10 Feb 2004 stevegt@TerraLuna.Org wrote:

> Looks like yesterday's paging problems were an artifact of something on
> that particular dom0 disk image -- I haven't been able to reproduce it
> on other nodes, and after I re-imaged the same node (with the same
> image), the problem has gone away there too, so that rules out hardware.
>
> However, something else did pop up -- while trying to break things with
> "perl -e '$a="a"x100000000'", I got the following messages; do we care?
> This is in mainstream linux mm/page_alloc.c, and it's hard for me to
> tell from the code whether these are outright errors or whether they
> were recoverable. Does anyone know?
>
> DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
>
> This was still on last Monday's version of 1.2; I'll see if I can
> reproduce it in today's 1.2; I'm deploying that later tonight.
>
> Steve
>
>
> On Mon, Feb 09, 2004 at 10:34:34PM -0800, wrote:
> > Before anyone burns too much time on this, hang on -- I wasn't able to
> > duplicate the problem on another cluster node (both nodes were built
> > from the same SystemImager image). I'm looking for the reason why, and
> > will let you know as soon as I do.
> >
> > Steve
> >
> > On Mon, Feb 09, 2004 at 08:13:22PM -0800, wrote:
> > > Okay, the problem still exists when I bump the memory up to 256Mb, and
> > > never swap. I.E. I've found no workaround. Hasn't anyone else hit
> > > anything like this?
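To check which flags the `gfp=0xf0` in the allocation-failure messages actually carries, the mask can be decoded bit by bit. The sketch below is hedged: the bit values are the `__GFP_*` definitions as recalled from 2.4-era `include/linux/mm.h` and should be treated as assumptions to verify against the tree in use; the helper names are illustrative only.

```python
import re

# ASSUMPTION: flag values as recalled from 2.4-era include/linux/mm.h;
# verify against the kernel tree actually being run.
GFP_FLAGS_2_4 = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_HIGHIO",
    0x100: "__GFP_FS",
}


def gfp_from_log(line):
    """Extract the gfp mask from an '__alloc_pages: ... (gfp=0x../..)' line."""
    match = re.search(r"gfp=0x([0-9a-fA-F]+)/", line)
    return int(match.group(1), 16) if match else None


def decode_gfp(mask):
    """List the names of the flag bits set in mask, lowest bit first."""
    names = [name for bit, name in sorted(GFP_FLAGS_2_4.items()) if mask & bit]
    leftover = mask & ~sum(GFP_FLAGS_2_4)  # bits not in the table
    if leftover:
        names.append("unknown(0x%x)" % leftover)
    return names


line = "DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)"
print(decode_gfp(gfp_from_log(line)))
```

If those values are right, 0xf0 decodes to `__GFP_WAIT | __GFP_HIGH | __GFP_IO | __GFP_HIGHIO`, so the `__GFP_WAIT` bit appears to be set in these requests; that is worth confirming against the source before drawing conclusions about the caller.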
I don't know -- I've never seen Linux issue this particular message
before. This is a 32Mb virtual domain, with 256Mb VD swap, and an NFS
root filesystem. I don't have any real machines with only 32Mb RAM. ;-)

I've duplicated it in two different nodes now, using the 02 Feb 1.2
build. I seem to be able to get one every few minutes by running
"perl -e '$a="a"x100000000'" in a while loop on two guests at the same
time.

I was going to try to duplicate it in today's 1.2, but that's not
booting -- I'll start another thread for that.

Steve

On Tue, Feb 10, 2004 at 08:51:42PM -0800, Kip Macy wrote:
>
> That is odd. What that means is that something tried to allocate a page
> without passing in GFP_WAIT as one of the flags, and there were no pages
> that could be freed in any of the zones. Does this not happen when
> running 2.4.24 directly on the machine with the same amount of memory?
>
>
> -Kip
>
>
> On Tue, 10 Feb 2004 stevegt@TerraLuna.Org wrote:
>
> > Looks like yesterday's paging problems were an artifact of something on
> > that particular dom0 disk image -- I haven't been able to reproduce it
> > on other nodes, and after I re-imaged the same node (with the same
> > image), the problem has gone away there too, so that rules out hardware.
> >
> > However, something else did pop up -- while trying to break things with
> > "perl -e '$a="a"x100000000'", I got the following messages; do we care?
> > This is in mainstream linux mm/page_alloc.c, and it's hard for me to
> > tell from the code whether these are outright errors or whether they
> > were recoverable. Does anyone know?
> >
> > DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> > DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> > DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> > DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> >
> > This was still on last Monday's version of 1.2; I'll see if I can
> > reproduce it in today's 1.2; I'm deploying that later tonight.
> >
> > Steve
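For anyone who wants to try the same kind of load, here is a minimal sketch of the "one big allocation in a loop" test described above. It is a Python stand-in for the perl one-liner, not the command that was actually run; the names `stress_command` and `run_stress` are made up for illustration, and the default size simply mirrors the 100000000 in the original.

```python
import subprocess
import sys


def stress_command(nbytes=100_000_000):
    """Build argv for a child interpreter that creates one nbytes-long string and exits.

    Hypothetical helper: a Python equivalent of the perl one-liner in the thread.
    """
    return [sys.executable, "-c", f"a = 'a' * {nbytes}"]


def run_stress(iterations, nbytes=100_000_000):
    """Run the allocation `iterations` times in fresh processes; return the exit codes."""
    codes = []
    for _ in range(iterations):
        # A fresh process each time, so the memory is released between rounds,
        # as with re-running the one-liner from a shell while loop.
        codes.append(subprocess.run(stress_command(nbytes)).returncode)
    return codes


if __name__ == "__main__":
    # Small size here so the demo is harmless; raise nbytes to apply real pressure.
    print(run_stress(3, nbytes=1_000_000))
```

Running it on two guests at once, as in the report, would just mean starting the script in both domains at about the same time.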
This means that a number of GFP_ATOMIC allocations failed. This is no surprise in a low-memory situation where your root filesystem is mounted over NFS: Linux isn't able to launder and evict pages quickly enough to satisfy all non-blocking page-allocation requests.

I can practically guarantee exactly the same behaviour in native Linux under the same circumstances (we saw various NFS-root weirdnesses on native Linux under high load when stress-testing the MM code).

 -- Keir

> Looks like yesterday's paging problems were an artifact of something on
> that particular dom0 disk image -- I haven't been able to reproduce it
> on other nodes, and after I re-imaged the same node (with the same
> image), the problem has gone away there too, so that rules out hardware.
>
> However, something else did pop up -- while trying to break things with
> "perl -e '$a="a"x100000000'", I got the following messages; do we care?
> This is in mainstream linux mm/page_alloc.c, and it's hard for me to
> tell from the code whether these are outright errors or whether they
> were recoverable. Does anyone know?
>
> DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
>
> This was still on last Monday's version of 1.2; I'll see if I can
> reproduce it in today's 1.2; I'm deploying that later tonight.
>
> Steve
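Since the discussion turns on which flags were set in gfp=0xf0, a small decoder may help when reading these messages. This is only a sketch: the bit values are an assumption, taken from my recollection of include/linux/mm.h in the 2.4 series, and should be checked against the tree the guest kernel was actually built from. If they are right, 0xf0 would decode to the combination 2.4 calls GFP_NOFS, which includes __GFP_WAIT, rather than GFP_ATOMIC; that is worth confirming before reading too much into the message.

```python
# ASSUMPTION: bit values as recalled from the 2.4-series include/linux/mm.h.
# Verify against your own kernel source before relying on them.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_HIGHIO",
    0x100: "__GFP_FS",
}

# ASSUMPTION: common named combinations under the same bit values.
GFP_COMBOS = {
    0x20: "GFP_ATOMIC",
    0xF0: "GFP_NOFS",
    0x1F0: "GFP_KERNEL",
    0x1D2: "GFP_HIGHUSER",
}


def decode_gfp(mask):
    """Return the names of the individual flag bits set in mask, lowest bit first."""
    names = [name for bit, name in sorted(GFP_BITS.items()) if mask & bit]
    leftover = mask & ~sum(GFP_BITS)  # bits this table does not know about
    if leftover:
        names.append(hex(leftover))
    return names


def describe_gfp(mask):
    """Return 'COMBO (flag | flag ...)' if the mask is a known combination, else just the flags."""
    flags = " | ".join(decode_gfp(mask)) or "0"
    combo = GFP_COMBOS.get(mask)
    return f"{combo} ({flags})" if combo else flags


if __name__ == "__main__":
    print(describe_gfp(0xF0))
```

The point of the check is simply whether __GFP_WAIT (0x10 under these assumed values) is present in the mask, since that decides whether the allocator was allowed to sleep and reclaim.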
stevegt@TerraLuna.Org
2004-Feb-11 21:23 UTC
[Xen-devel] Re: paging request failures under load (was: Re: Null pointer deference)
They're baaaack! ;-}

While upgrading a virtual domain to debian testing so I could get a gcc > 3.0.4, it issued a couple of these oopses and hung, resulting in a broken upgrade. This is different hardware, same image; a machine I was totally unable to duplicate these on yesterday. Xen 02 Feb 1.2 build, gcc 3.0.4.

Also this morning, my first production customer sent me mail saying one of his guests was "slow". I looked, and, sure enough, he had had these same oopses -- paging failure followed by null pointer dereference. This was a third node, same image. Telltale symptom from his end was that 'top' hangs -- I've noticed this in all cases.

This plus Keir's message about the clock skew (which I'm also seeing on these guests) makes me suspect gcc 3.0.4. So here's what I'm doing:

- running under native Linux, upgrade an unmounted NFS root filesystem to debian testing in a chroot
- still in the chroot, build today's 1.2 xen/xenolinux with gcc 3.3.2
- deploy the resulting xen and xenolinux on one or more nodes
- install ksymoops/System.map on those nodes so that we can get meaningful oops output if it does happen again (per earlier mail from Ian and Bin)
- test, test, test

I'll let you know how it goes.

The reason I'm doing this in a chroot is that I'm thinking of setting up an automated Xen regression test environment under Xen, daily pulls, that sort of thing. This NFS root would be a build server for that environment. Is anyone already working on something like this?

Steve

On Mon, Feb 09, 2004 at 10:34:34PM -0800, wrote:
> Before anyone burns too much time on this, hang on -- I wasn't able to
> duplicate the problem on another cluster node (both nodes were built
> from the same SystemImager image). I'm looking for the reason why, and
> will let you know as soon as I do.
>
> Steve
>
> On Mon, Feb 09, 2004 at 08:13:22PM -0800, wrote:
> > Okay, the problem still exists when I bump the memory up to 256Mb, and
> > never swap. I.E. I've found no workaround.
Hasn''t anyone else hit > > anything like this? > > > > Steve > > > > > > DOM3: xen_console_init > > DOM3: Linux version 2.4.24-xeno (stevegt@pathfinder) (gcc version > > 3.0.4) #16 Mon Feb 2 17:46:41 PST 2004 > > DOM3: On node 0 totalpages: 65536 > > DOM3: zone(0): 4096 pages. > > DOM3: zone(1): 61440 pages. > > DOM3: zone(2): 0 pages. > > DOM3: Kernel command line: > > ip=64.71.149.20:10.27.2.50:64.71.149.1:255.255.255.0::eth0:off > > root=/dev/nfs nfsroot=/export//xen/fs/stevegt/tcx/root 4 DOMID=20 > > DOM3: Initializing CPU#0 > > DOM3: Xen reported: 398.780 MHz processor. > > DOM3: Calibrating delay loop... 1592.52 BogoMIPS > > DOM3: Memory: 257132k/262144k available (1078k kernel code, 5012k > > reserved, 308k data, 52k init, 0k highmem) > > DOM3: Dentry cache hash table entries: 32768 (order: 6, 262144 bytes) > > DOM3: Inode cache hash table entries: 16384 (order: 5, 131072 bytes) > > DOM3: Mount cache hash table entries: 512 (order: 0, 4096 bytes) > > DOM3: Buffer cache hash table entries: 16384 (order: 4, 65536 bytes) > > DOM3: Page-cache hash table entries: 65536 (order: 6, 262144 bytes) > > DOM3: CPU: L1 I cache: 16K, L1 D cache: 16K > > DOM3: CPU: L2 cache: 512K > > DOM3: CPU: Intel Pentium II (Deschutes) stepping 01 > > DOM3: POSIX conformance testing by UNIFIX > > DOM3: Linux NET4.0 for Linux 2.4 > > DOM3: Based upon Swansea University Computer Society NET3.039 > > DOM3: Initializing RT netlink socket > > DOM3: Starting kswapd > > DOM3: Journalled Block Device driver loaded > > DOM3: Installing knfsd (copyright (C) 1996 okir@monad.swb.de). 
> > DOM3: Xeno console successfully installed > > DOM3: Starting Xeno Balloon driver > > DOM3: pty: 256 Unix98 ptys configured > > DOM3: RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 > > blocksize > > DOM3: loop: loaded (max 8 devices) > > DOM3: NET4: Linux TCP/IP 1.0 for NET4.0 > > DOM3: IP Protocols: ICMP, UDP, TCP > > DOM3: IP: routing cache hash table of 2048 buckets, 16Kbytes > > DOM3: TCP: Hash tables configured (established 16384 bind 16384) > > DOM3: IP-Config: Complete: > > DOM3: device=eth0, addr=64.71.149.20, mask=255.255.255.0, > > gw=64.71.149.1, > > DOM3: host=64.71.149.20, domain=, nis-domain=(none), > > DOM3: bootserver=10.27.2.50, rootserver=10.27.2.50, rootpath> > DOM3: ip_conntrack version 2.1 (2048 buckets, 16384 max) - 292 bytes > > per conntrack > > DOM3: ip_tables: (C) 2000-2002 Netfilter core team > > DOM3: NET4: Unix domain sockets 1.0/SMP for Linux NET4.0. > > DOM3: Looking up port of RPC 100003/2 on 10.27.2.50 > > DOM3: Looking up port of RPC 100005/1 on 10.27.2.50 > > DOM3: VFS: Mounted root (nfs filesystem). > > DOM3: Freeing unused kernel memory: 52k freed > > DOM3: INIT: version 2.84 booting > > DOM3: Activating swap. > > DOM3: Adding Swap: 262136k swap-space (priority -1) > > DOM3: Checking root file system... > > DOM3: fsck 1.27 (8-Mar-2002) > > DOM3: 10.27.2.50:/export/xen/fs/stevegt/tcx: NFS file system. > > DOM3: System time was Tue Feb 10 02:14:36 UTC 2004. > > DOM3: Setting the System Clock using the Hardware Clock as > > reference... > > DOM3: modprobe: modprobe: Can''t locate module char-major-10-135 > > DOM3: modprobe: modprobe: Can''t locate module char-major-4 > > DOM3: hwclock is unable to get I/O port access: the iopl(3) call > > failed. > > DOM3: modprobe: modprobe: Can''t locate module char-major-10-135 > > DOM3: modprobe: modprobe: Can''t locate module char-major-4 > > DOM3: System Clock set. System local time is now Tue Feb 10 02:14:36 > > UTC 2004. > > DOM3: Calculating module dependencies... 
depmod: cannot read ELF > > header from /lib/modules/2.4.24-xeno/modules.dep > > DOM3: depmod: cannot read ELF header from > > /lib/modules/2.4.24-xeno/modules.generic_string > > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.ieee1394map is not an > > ELF file > > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.isapnpmap is not an ELF > > file > > DOM3: depmod: cannot read ELF header from > > /lib/modules/2.4.24-xeno/modules.parportmap > > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.pcimap is not an ELF > > file > > DOM3: depmod: cannot read ELF header from > > /lib/modules/2.4.24-xeno/modules.pnpbiosmap > > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.usbmap is not an ELF > > file > > DOM3: done. > > DOM3: Loading modules: > > DOM3: Checking all file systems... > > DOM3: fsck 1.27 (8-Mar-2002) > > DOM3: Setting kernel variables. > > DOM3: Loading the saved-state of the serial devices... > > DOM3: Mounting local filesystems... > > DOM3: nothing was mounted > > DOM3: Running 0dns-down to make sure resolv.conf is ok...done. > > DOM3: Cleaning: /etc/network/ifstate. > > DOM3: Setting up IP spoofing protection: rp_filter. > > DOM3: Configuring network interfaces: done. > > DOM3: Mounting remote filesystems... > > DOM3: > > DOM3: Setting the System Clock using the Hardware Clock as > > reference... > > DOM3: System Clock set. Local time: Tue Feb 10 02:14:37 UTC 2004 > > DOM3: > > DOM3: Cleaning: /tmp /var/lock /var/run. > > DOM3: Initializing random number generator... done. > > DOM3: Recovering nvi editor sessions... done. > > DOM3: INIT: Entering runlevel: 4 > > DOM3: Starting system log daemon: syslogd. > > DOM3: Starting kernel log daemon: klogd. > > DOM3: Starting internet superserver: inetd. > > DOM3: Starting PCMCIA services: module directory > > /lib/modules/2.4.24-xeno/pcmcia not found. > > DOM3: Starting OpenBSD Secure Shell server: sshd. > > DOM3: Starting deferred execution scheduler: atd. > > DOM3: Starting periodic command scheduler: cron. 
> > DOM3: INIT: no more processes left in this runlevel > > DOM3: Unable to handle kernel paging request at virtual address > > 20000001 > > DOM3: printing eip: > > DOM3: c0007743 > > DOM3: *pde=00000000(00000000) > > DOM3: Oops: 0000 > > DOM3: CPU: 0 > > DOM3: EIP: 0819:[<c0007743>] Not tainted > > DOM3: EFLAGS: 00010202 > > DOM3: eax: 00000001 ebx: 20000001 ecx: c3ebde6c edx: c3ebde6c > > DOM3: esi: c3ebc000 edi: c0114254 ebp: c46c1060 esp: c3ebdce4 > > DOM3: ds: 0821 es: 0821 ss: 0821 > > DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1> > > DOM3: Stack: 20000001 c3ebc000 c002c728 c3ebc000 c3ebddb0 c0114254 > > ffffffb0 c3ebc000 > > DOM3: c003e789 c3ebde6c c014b5ac c003e314 c3ebde6c 00000000 > > c0114250 00000000 > > DOM3: 00000000 00000000 01082003 8d588810 00000000 00000000 > > 00000000 00000000 > > DOM3: Call Trace: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>] > > [<c002cf4a>] > > DOM3: [<c002cf61>] [<c0090033>] [<c00914bf>] > > DOM3: > > DOM3: <1>Unable to handle kernel paging request at virtual address > > 20000001 > > DOM3: printing eip: > > DOM3: c000af0f > > DOM3: *pde=00000000(00000000) > > DOM3: Oops: 0002 > > DOM3: CPU: 0 > > DOM3: EIP: 0819:[<c000af0f>] Not tainted > > DOM3: EFLAGS: 00010282 > > DOM3: eax: 20000001 ebx: c485c0a0 ecx: c3ebc264 edx: c3ebc264 > > DOM3: esi: 00000000 edi: 20000001 ebp: 0000000b esp: c3ebdbb4 > > DOM3: ds: 0821 es: 0821 ss: 0821 > > DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1> > > DOM3: Stack: c485c0a0 00000000 c3ebc000 0000000b 0000000b c000b55f > > 20000001 0000001f > > DOM3: 00000000 cf4227e0 20000001 c0091a87 0000000b 00000000 > > c485c0bc c0096305 > > DOM3: c0129928 c3ebdcb0 00000000 c3ebc000 00000000 20000001 > > c46c1060 00000000 > > DOM3: Call Trace: [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>] > > [<c0018a25>] > > DOM3: [<c0018c46>] [<c0018feb>] [<c0018ed4>] [<c006e759>] > > [<c0091768>] [<c0007743>] > > DOM3: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>] > > [<c002cf4a>] 
[<c002cf61>]
> > DOM3: [<c0090033>] [<c00914bf>]
> > DOM3:
> > DOM3: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000001
> > DOM3: printing eip:
> > DOM3: c000b623
> > DOM3: *pde=00000000(00000000)
> > DOM3: Oops: 0002
> > DOM3: CPU: 0
> > DOM3: EIP: 0819:[<c000b623>] Not tainted
> > DOM3: EFLAGS: 00010202
> > DOM3: eax: 00000000 ebx: 00000001 ecx: c3ebc264 edx: c3ebc264
> > DOM3: esi: 00000002 edi: c3ebc000 ebp: 0000000b esp: c3ebdaa0
> > DOM3: ds: 0821 es: 0821 ss: 0821
> > DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
> > DOM3: Stack: 0000001f 00000002 20000001 20000001 c0091a87 0000000b 00000000 00000002
> > DOM3: c0096305 c0129928 c3ebdb80 00000002 c3ebc000 00000002 20000001 0000000b
> > DOM3: 63303039 ffffffff c3ebc000 00000002 38383130 38643538 00030001 64303030
> > DOM3: Call Trace: [<c0091a87>] [<c0096305>] [<c0008996>] [<c000f797>] [<c000f991>]
> > DOM3: [<c0091768>] [<c000af0f>] [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>]
> > DOM3: [<c0018a25>] [<c0018c46>] [<c0018feb>] [<c0018ed4>] [<c006e759>] [<c0091768>]
> > DOM3: [<c0007743>] [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>] [<c002cf4a>]
> > DOM3: [<c002cf61>] [<c0090033>] [<c00914bf>]
> > DOM3:

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org

-------------------------------------------------------
SF.Net is sponsored by: Speed Start Your Linux Apps Now.
Build and deploy apps & Web services for Linux with
a free DVD software kit from IBM. Click Now!
http://ads.osdn.com/?ad_id=1356&alloc_id=3438&op=click
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
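For anyone reading the traces in this thread: the four hex digits after "Oops:" are the i386 page-fault error code, so 0000 and 0002 already say something about each fault. The following is only a rough sketch for pulling those codes out of a console log and decoding them. It assumes the guest kernel reports the standard i386 bits unchanged (bit 0 = protection fault, bit 1 = write, bit 2 = user mode); the function names are made up for illustration.

```python
import re


def decode_oops_code(code_hex):
    """Decode the hex value printed after 'Oops:' on i386.

    Assumption: standard i386 page-fault error code bits apply to the guest.
    """
    code = int(code_hex, 16)
    cause = "protection fault" if code & 1 else "page not present"
    access = "write" if code & 2 else "read"
    mode = "user mode" if code & 4 else "kernel mode"
    return f"{access} in {mode}, {cause}"


def oops_codes(console_log):
    """Return every 'Oops: XXXX' code found in a console log, in order."""
    return re.findall(r"Oops:\s*([0-9a-fA-F]{4})", console_log)


if __name__ == "__main__":
    # Two lines copied from the DOM26 output earlier in the thread.
    sample = "DOM26: Oops: 0000\nDOM26: Oops: 0002\n"
    for code in oops_codes(sample):
        print(code, "->", decode_oops_code(code))
```

Read this way, the first DOM26 fault would be a kernel-mode read of an unmapped address and the later ones kernel-mode writes, which fits the "paging request" and "NULL pointer dereference" wording but says nothing yet about the cause.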
Keir Fraser
2004-Feb-11 22:35 UTC
Re: [Xen-devel] Re: paging request failures under load (was: Re: Null pointer deference)
> - install ksymoops/System.map on those nodes so that we can get
>   meaningful oops output if it does happen again (per earlier mail from
>   Ian and Bin)

The number-one priority for debugging is having access to the kernel
object file (it's the 'vmlinux' file at the root of the build
tree). Given that, and the precise version of Xen/Xenolinux that you
built, I can have a fair stab at unpicking what happened. If the crash
is in Xen itself then the Xen image file is what I need (the 'xen' file at
the root of the Xen build tree).

Symbolic backtraces are nice but definitely of secondary importance.

> The reason I'm doing this in a chroot is that I'm thinking of setting up
> an automated Xen regression test environment under Xen, daily pulls,
> that sort of thing. This NFS root would be a build server for that
> environment. Is anyone already working on something like this?

We have a regression test here in the lab, but:
 1. It uses some SPEC benchmarks, so it's not publicly distributable.
 2. It's based on an old Red Hat -- a more up-to-date filesystem would
    be good.
 3. We don't have enough spare machines to do a really large test.

Your setup sounds like it could be much better!

 -- Keir
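As a rough illustration of what a symbolic backtrace involves: each [<address>] in the oops is matched to the nearest symbol at or below it in the System.map from the same build. The sketch below is not ksymoops and is no substitute for having the matching vmlinux. It assumes the usual "address type name" layout of System.map, and the symbol names in the demo are hypothetical placeholders, not taken from any real kernel.

```python
import bisect
import re


def load_system_map(text):
    """Parse System.map text ('address type name' per line) into a sorted list."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return 'name+0xoffset' for the nearest symbol at or below addr, else None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return f"{name}+0x{addr - base:x}"


def trace_addresses(oops_text):
    """Extract the [<xxxxxxxx>] addresses from oops output, in order."""
    return [int(h, 16) for h in re.findall(r"\[<([0-9a-f]{8})>\]", oops_text)]


if __name__ == "__main__":
    # Hypothetical map entries, for illustration only.
    demo_map = "c0007700 T example_func_a\nc000af00 T example_func_b\n"
    syms = load_system_map(demo_map)
    oops = "EIP: 0819:[<c0007743>] Call Trace: [<c000af0f>]"
    for addr in trace_addresses(oops):
        print(f"{addr:08x} {resolve(syms, addr)}")
```

The lookup is only meaningful if the System.map comes from exactly the kernel that crashed; against a map from a different build it will still print names, just the wrong ones.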
Hi All, bad news.

After running several guests for most of the past week on the 12 Feb
build of 1.2, built with GCC 3.3.2, with 64Mb of RAM and NFS roots, I
finally got another paging oops. The entire console log, xen and
xenolinux binaries, and System.map, are at:

http://t7a.org/tmp/oops1-n2h54/

Let me know if there's anything else I can do.

Steve
Some thoughts: the only thing I can think of that I might be doing
differently from any of you is that I'm running the NFS root server on
another node, a standard Linux machine, with the traffic going through a
lightly loaded 100Mb switch. Nevertheless, I still get occasional NFS
timeouts, as you'll see in the console log. These haven't worried me,
but if you find that these paging oopses are happening in mmap code,
that might be worth noting.

Steve
Aargh! Please disregard. While processing ksymoops just now I realized
that the two guest machines that had these oopses were in fact the
*only* two that I hadn't yet upgraded to the 12 Feb/GCC 3.3.2 xenolinux.
These were running the new Xen, old xenolinux. All others have been
fine, with a week of runtime so far.

Steve
> Aargh! Please disregard. While processing ksymoops just now I realized
> that the two guest machines that had these oopses were in fact the
> *only* two that I hadn't yet upgraded to the 12 Feb/GCC 3.3.2 xenolinux.
> These were running the new Xen, old xenolinux. All others have been
> fine, with a week of runtime so far.

Phew. :-)

Of course, we're interested in any crashes that occur when using GCC
2.95.3 and 3.3.x. We'll also accept crash dumps from 3.2.2 -- I've
heard that this compiler has trouble with Linux in some cases, but
since it's the compiler that we use the most (it ships with RH9), we
have some degree of trust in it!

I've taken a lot of time in the last week to shake bugs out of
Xen/Xenolinux 1.2. Hopefully this will reduce the number of bug
reports, despite the recent upgrade to linux-2.4.25.

 -- Keir
While we're on the topic of compiler idiosyncrasies, I'm wondering if it
really makes sense to compile the tools with -O3. They're not
performance critical, so it seems like the risks might outweigh the
gains.

-Kip
I have a dual P3/750MHz box running Mandrake-9.1. I have downloaded
the Xen/Xenolinux 1.3 stuff and compiled it. It turns out that I can't
get the kernel configuration right -- I was able to compile xen-1.2 and
xenolinux-2.4.24, but when the machine is booted (using a grub floppy)
it starts the boot process and then the box keeps on resetting itself.
My guess: xen boots correctly but xenolinux goes haywire.

Can some kind soul give me a copy of the '.config' of a correctly booting
system?

Thanks in advance,
-ishwar
> I have a dual P3/750MHz box running Mandrake-9.1. I have downloaded
> the Xen/Xenolinux 1.3 stuff and compiled it. It turns out that I can't
> get the kernel configuration right -- I was able to compile xen-1.2 and
> xenolinux-2.4.24, but when the machine is booted (using a grub floppy)
> it starts the boot process and then the box keeps on resetting itself.
> My guess: xen boots correctly but xenolinux goes haywire.
>
> Can some kind soul give me a copy of the '.config' of a correctly booting
> system?

There's a default config (defconfig) in the xenolinux tree.
Just type "make mrproper oldconfig dep bzImage"
['mrproper' deletes the previous .config, 'oldconfig' copies the
default arch xeno config to .config.]

If you want to stop it rebooting when something goes wrong, use the
'noreboot' xen command line option. You should then be able to
examine the error message. Alternatively, connect a serial console
and enable serial output with the 'ser_baud' xen command line
option.

Ian
On Sat, 21 Feb 2004, Ian Pratt wrote:
> There's a default config (defconfig) in the xenolinux tree.

What is its location relative to xenolinux-2.4.24?

-ishwar
Just do:

    ARCH=xeno make oldconfig
> > There's a default config (defconfig) in the xenolinux tree.
> What is its location relative to xenolinux-2.4.24?

It will be generated by running 'make mrproper' followed by 'make oldconfig'.
Alternatively, copy it from arch/xeno/defconfig.

Ian