We have a number of Xen servers deployed around the country, and they have proved stable for the past few months. However, in the last few weeks most of the machines have started to exhibit the same symptoms: networking stops working, so it is no longer possible to access the DomUs (fortunately the Dom0 is OK). In some cases, when I log onto a DomU using xm console, I see a whole string of messages like this:

__alloc_pages: 0-order allocation failed (gfp=0xf0/0)

Uptime for the DomU is around 130-160 days in most cases.

I am using a fairly old version of Xen (around 2.0.1, I think): is there any obvious problem that has been fixed in the past months that would resolve this issue? (I would have to be fairly sure of concrete improvements before upgrading these machines, as if anything goes wrong I will have to hop in my car and drive for hours!)

Stephen

--
Dr. Stephen Childs,
Research Fellow, EGEE Project, phone: +353-1-6081797
Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie
Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
> __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
>
> Uptime for the DomU is around 130-160 days in most cases.
>
> I am using a fairly old version of Xen (around 2.0.1 I
> think): is there any obvious problem that has been fixed in
> the past months that would resolve this issue? (I would have
> to be fairly sure of concrete improvements before upgrading
> these machines as if anything goes wrong I will have to hop
> in my car and drive for hours!)

There have been a lot of fixes since 2.0.1, so you'd be well advised to upgrade (I'd even go to 2.0-testing).

It might be worth doing a cat /proc/slabinfo in one of your stuck domUs to see if there's anything obviously leaking.

You may get some value from just upgrading the domU and leaving the dom0 and Xen as-is, but I'd generally advise upgrading everything together.

Ian
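To make the /proc/slabinfo suggestion easier to act on, here is a minimal sketch (not something posted in the thread) that ranks slab caches by approximate size and compares two snapshots taken some time apart. It assumes only that the first four columns of each data line are the cache name, active objects, total objects and object size, which I believe holds for both the 1.x and 2.x slabinfo layouts. The function names parse_slabinfo, cache_bytes, top_caches and growth are made up for this example.

```python
def parse_slabinfo(text):
    """Return {cache_name: (active_objs, num_objs, objsize)} from slabinfo text.

    Assumption: the first four whitespace-separated columns are
    name, active objects, total objects, object size in bytes.
    """
    caches = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the version banner and the column-header comment.
        if not line or line.startswith("#") or line.startswith("slabinfo"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            active, total, size = (int(f) for f in fields[1:4])
        except ValueError:
            continue  # not a data line in the layout we assume
        caches[fields[0]] = (active, total, size)
    return caches


def cache_bytes(text):
    """Approximate bytes held by each cache: total objects * object size."""
    return {name: total * size
            for name, (_active, total, size) in parse_slabinfo(text).items()}


def top_caches(text, n=5):
    """The n largest caches as (name, approx_bytes), biggest first."""
    usage = sorted(cache_bytes(text).items(), key=lambda kv: kv[1], reverse=True)
    return usage[:n]


def growth(before_text, after_text):
    """Caches that grew between two snapshots as (name, delta_bytes), biggest first."""
    before = cache_bytes(before_text)
    after = cache_bytes(after_text)
    deltas = [(name, size - before.get(name, 0)) for name, size in after.items()]
    return sorted((d for d in deltas if d[1] > 0), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Reading /proc/slabinfo may require root on some kernels.
    with open("/proc/slabinfo") as f:
        for name, nbytes in top_caches(f.read()):
            print(f"{name:24s} {nbytes / 1024:10.0f} KiB")
```

Saving the file once a day on a domU and feeding two copies to growth() should show whether one cache keeps climbing over the 130-160 days before the allocation failures start. The byte figures ignore per-slab overhead, so treat them as a ranking aid rather than exact accounting.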
Alexander Wetzel
2005-Aug-01 19:50 UTC
Re: [Xen-users] Xen networking fails after long uptime
Hello,

For a few weeks I have also had serious problems with Xen and networking, but in my case they occur every two to three days!

All domU systems are basically unreachable, but connections to dom0 are fine. A tcpdump shows that ARP requests are received but not answered in a timely manner. xm console <host> works, and the host responds. An ssh connection to the domUs still works, but it hangs for minutes, then updates in a burst and hangs again. And yes, I'm using bridging.

Rebooting the domU didn't help, and restarting xend (/etc/init.d/xend stop; /etc/init.d/xend start) failed; xm list did not work afterwards.

I checked /var/log/xend.log (no recent entry; the last was a console disconnect) and /var/log/xend-debug.log (which was last updated on Friday and unfortunately has no timestamps).

# tail xend-debug.log
VirqClient.virqReceived> 4
vif-bridge down vif=vif2.0 domain=elrond mac=aa:00:00:00:00:11 bridge=xen-vlan10br
vif-bridge down vif=vif2.1 domain=elrond mac=aa:00:00:6c:c5:ad bridge=xen-vlan10br
VirqClient.virqReceived> 4
vif-bridge down vif=vif1.0 domain=deagol mac=aa:00:00:00:00:12 bridge=xen-vlan10br
vif-bridge down vif=vif1.1 domain=deagol mac=aa:00:00:00:00:13 bridge=xen-eth1br
VirqClient.virqReceived> 4
vif-bridge down vif=vif3.0 domain=gandalf mac=aa:00:00:00:00:14 bridge=xen-vlan10br
Unhandled error in Deferred:
Failure: twisted.internet.defer.TimeoutError: Callback timed out

I found nothing in other log files or with dmesg on dom0 and (one) domU.

For those who want the facts, here is what my analysis of the problem produced. If you have suggestions for what else can be checked the next time, I'll be happy to do so!
_____________________________________________________________________________________________

The Xen box is running Debian testing and has two NICs. One carries an 802.1q trunk connected to a Cisco switch, trunking two VLANs.
(It also makes use of device-mapper and LVM, but this seems to work normally.) The other NIC is used for Internet connectivity and is connected to a DSL modem.

# cat /proc/net/vlan/config
VLAN Dev name | VLAN ID
Name-Type: VLAN_NAME_TYPE_PLUS_VID_NO_PAD
vlan10 | 10 | eth0
vlan20 | 20 | eth0
vlan100 | 100 | eth0

# brctl show
xen-eth1br      8000.0030843df4ef   no   eth1
                                         vif1.1
xen-vlan100br   8000.000102f63d3d   no   vif1.0
                                         vlan100
xen-vlan10br    8000.000102f63d3d   no   vif2.0
                                         vif2.1
                                         vif3.0
                                         vlan10
xen-vlan20br    8000.000102f63d3d   no   vlan20

---------- during the error ------------------

# xm list
Name              Id  Mem(MB)  CPU  State  Time(s)  Console
Domain-0           0       43    0  r----   1993.3
Domain-5           5      113    0  --p--      0.0
deagol             4       15    0  -b---   1592.6     9604
elrond             2      171    0  -b---     68.0     9602
gandalf            5      112    0  -b---    575.1     9605

# xm vif-list Domain-5
(vif (idx 0) (vif 0) (mac aa:00:00:00:00:14) (bridge xen-vlan10br) (evtchn 28 4) (index 0))

# xm vif-list deagol
(vif (idx 0) (vif 0) (mac aa:00:00:00:00:12) (bridge xen-vlan100br) (evtchn 25 4) (index 0))
(vif (idx 1) (vif 1) (mac aa:00:00:00:00:13) (bridge xen-eth1br) (evtchn 26 5) (index 1))

# xm vif-list elrond
(vif (idx 0) (vif 0) (mac aa:00:00:00:00:11) (bridge xen-vlan10br) (evtchn 18 4) (index 0))
(vif (idx 1) (vif 1) (mac aa:00:00:4c:fb:bf) (evtchn 19 5) (index 1))

# xm dmesg
ERROR: cannot use unconfigured serial port COM1
No VGA adaptor detected!
 __  __            ____    ___   ____
 \ \/ /___ _ __   |___ \  / _ \ | ___|
  \  // _ \ '_ \    __) || | | ||___ \
  /  \  __/ | | |  / __/ | |_| | ___) |
 /_/\_\___|_| |_| |_____(_)___(_)____/

 http://www.cl.cam.ac.uk/netos/xen
 University of Cambridge Computer Laboratory

 Xen version 2.0.5 (root@mordor) (gcc version 3.3.5 (Debian 1:3.3.5-8)) Sun Mar 13 18:27:54 CET 2005
 Latest ChangeSet: information unavailable

(XEN) Physical RAM map:
(XEN) 0000000000000000 - 00000000000a0000 (usable)
(XEN) 00000000000f0000 - 0000000000100000 (reserved)
(XEN) 0000000000100000 - 000000001fffc000 (usable)
(XEN) 000000001fffc000 - 000000001ffff000 (ACPI data)
(XEN) 000000001ffff000 - 0000000020000000 (ACPI NVS)
(XEN) 00000000fec00000 - 00000000fec01000 (reserved)
(XEN) 00000000fee00000 - 00000000fee01000 (reserved)
(XEN) 00000000ffff0000 - 0000000100000000 (reserved)
(XEN) System RAM: 511MB (523888kB)
(XEN) Xen heap: 10MB (10780kB)
(XEN) CPU0: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
(XEN) CPU#0: Hyper-Threading is disabled
(XEN) CPU caps: 3febfbff 00000000 00000000 00000000
(XEN) ACPI: RSDP (v000 ASUS ) @ 0x000f75a0
(XEN) ACPI: RSDT (v001 ASUS <P4B> 0x42302e31 MSFT 0x31313031) @ 0x1fffc000
(XEN) ACPI: FADT (v001 ASUS <P4B> 0x42302e31 MSFT 0x31313031) @ 0x1fffc100
(XEN) ACPI: BOOT (v001 ASUS <P4B> 0x42302e31 MSFT 0x31313031) @ 0x1fffc040
(XEN) ACPI: MADT (v001 ASUS <P4B> 0x42302e31 MSFT 0x31313031) @ 0x1fffc080
(XEN) ACPI: DSDT (v001 ASUS <P4B> 0x00001000 MSFT 0x0100000b) @ 0x00000000
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
(XEN) Processor #0 Pentium 4(tm) APIC version 20
(XEN) ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
(XEN) Using scheduler: Borrowed Virtual Time (bvt)
(XEN) Initializing CPU#0
(XEN) Detected 1513.504 MHz processor.
(XEN) Found and enabled local APIC!
(XEN) CPU0: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
(XEN) CPU#0: Hyper-Threading is disabled
(XEN) CPU caps: 3febfbff 00000000 00000000 00000000
(XEN) CPU0 booted
(XEN) SMP motherboard not detected.
(XEN) enabled ExtINT on CPU#0
(XEN) ESR value before enabling vector: 00000000
(XEN) ESR value after enabling vector: 00000000
(XEN) Using local APIC timer interrupts.
(XEN) Calibrating APIC timer for CPU0...
(XEN) ..... CPU speed is 1513.5049 MHz.
(XEN) ..... Bus speed is 100.9002 MHz.
(XEN) ..... bus_scale = 0x00006754
(XEN) Time init:
(XEN) .... System Time: 10002545ns
(XEN) .... cpu_freq: 00000000:5A3640C0
(XEN) .... scale: 00000001:5249A1DD
(XEN) .... Wall Clock: 1122921855s 20000us
(XEN) PCI: PCI BIOS revision 2.10 entry at 0xf11f0, last bus=2
(XEN) PCI: Using configuration type 1
(XEN) PCI: Probing PCI hardware
(XEN) PCI: Probing PCI hardware (bus 00)
(XEN) PCI: Enabled i801 SMBus device
(XEN) Transparent bridge - PCI device 8086:244e
(XEN) PCI: Using IRQ router PIIX/ICH [8086/2440] at 00:1f.0
(XEN) mtrr: v2.0 (20020519)
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Xen-ELF header found: 'GUEST_OS=linux,GUEST_VER=2.6,XEN_VER=2.0,VIRT_BASE=0xC0000000,LOADER=generic,PT_MODE_WRITABLE'
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Kernel image: 00c00000->00f58c38
(XEN) Initrd image: 00f59000->010be000
(XEN) Dom0 alloc.: 01400000->04400000
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: c0100000->c0498308
(XEN) Init. ramdisk: c0499000->c05fe000
(XEN) Phys-Mach map: c05fe000->c060a000
(XEN) Page tables: c060a000->c060d000
(XEN) Start info: c060d000->c060e000
(XEN) Boot stack: c060e000->c060f000
(XEN) TOTAL: c0000000->c0800000
(XEN) ENTRY ADDRESS: c0100000
(XEN) Scrubbing DOM0 RAM: .done.
(XEN) Initrd len 0x165000, start at 0xc0499000
(XEN) Scrubbing Free RAM: ......done.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
(XEN) PCI: Found IRQ 10 for device 02:0c.0
(XEN) PCI: Found IRQ 12 for device 02:0a.0
(XEN) PCI: Assigned IRQ 5 for device 02:0e.0

------- xm list after a reboot -------------------

# xm list
Name              Id  Mem(MB)  CPU  State  Time(s)  Console
Domain-0           0       43    0  r----     42.9
deagol             1       15    0  -b---      8.5     9601
elrond             2      171    0  -b---      7.6     9602
gandalf            3      112    0  -b---     32.1     9603

It started after I updated to Xen 2.0.6, compiled from source with customized dom0 and domU kernels. Xen 2.0.5 was working fine for me, without any of these problems. So I first switched back to my old Xen 2.0.5 and the Linux 2.6.10-xen0 kernel, but kept the domU kernels at the newer 2.6.11 kernel from Xen 2.0.6. On Friday this setup crashed again with the same symptoms, so I also switched all domU kernels back to 2.6.10-xenU. And today it crashed again!

The only difference between the stable Xen 2.0.5 setup and the now unstable Xen 2.0.5 setup is some OS updates. Is it possible that some packages on the Dom0 affect the network behavior?

I tried to debug the problem today and got odd results. This is a tcpdump taken on the dom0: 10.16.1.4 is the dom0, and 10.16.1.10 is a domU running various services, e.g. DNS. (10.16.1.10 was reported as domain 5 by xm list, so I used vif5.0 for tcpdump.)

# tcpdump -ni vif5.0 host 10.16.1.4
20:38:20.675298 arp who-has 10.16.1.4 tell 10.16.1.10
20:38:20.675302 arp who-has 10.16.1.4 tell 10.16.1.10
20:38:20.675305 arp who-has 10.16.1.4 tell 10.16.1.10
20:38:20.675659 arp reply 10.16.1.4 is-at 00:01:02:f6:3d:3d
20:38:20.675674 arp reply 10.16.1.4 is-at 00:01:02:f6:3d:3d
20:38:20.675681 arp reply 10.16.1.4 is-at 00:01:02:f6:3d:3d
20:38:20.676909 arp who-has 10.16.1.4 tell 10.16.1.10
20:38:20.676911 arp who-has 10.16.1.4 tell 10.16.1.10
20:38:20.676913 arp who-has 10.16.1.4 tell 10.16.1.10
20:38:20.677221 arp reply 10.16.1.4 is-at 00:01:02:f6:3d:3d
20:38:20.677227 arp reply 10.16.1.4 is-at 00:01:02:f6:3d:3d
20:38:20.677231 arp reply 10.16.1.4 is-at 00:01:02:f6:3d:3d
20:38:20.679148 arp who-has 10.16.1.4 tell 10.16.1.10
20:38:20.679152 arp who-has 10.16.1.4 tell 10.16.1.10
20:38:20.679153 arp who-has 10.16.1.4 tell 10.16.1.10
20:38:20.679497 arp reply 10.16.1.4 is-at 00:01:02:f6:3d:3d
20:38:20.679502 arp reply 10.16.1.4 is-at 00:01:02:f6:3d:3d
20:38:20.679506 arp reply 10.16.1.4 is-at 00:01:02:f6:3d:3d
20:38:20.681899 IP 10.16.1.4.1338 > 10.16.1.10.53: 37786+ PTR? 10.1.
20:38:25.675885 arp who-has 10.16.1.10 tell 10.16.1.4
20:38:25.686025 IP 10.16.1.4.1338 > 10.16.1.10.53: 37786+ PTR? 10.1.
20:38:26.676086 arp who-has 10.16.1.10 tell 10.16.1.4
20:38:27.676306 arp who-has 10.16.1.10 tell 10.16.1.4
20:38:30.706919 arp who-has 10.16.1.10 tell 10.16.1.4
20:38:31.707188 arp who-has 10.16.1.10 tell 10.16.1.4
20:38:32.707338 arp who-has 10.16.1.10 tell 10.16.1.4
20:38:35.717957 arp who-has 10.16.1.10 tell 10.16.1.4
20:38:36.718186 arp who-has 10.16.1.10 tell 10.16.1.4
20:38:37.718388 arp who-has 10.16.1.10 tell 10.16.1.4
20:38:40.719681 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:38:40.719682 IP 10.16.1.10.53 > 10.16.1.4.1338: 37769* 1/1/1 PTR[
20:38:40.719683 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:38:40.719982 IP 10.16.1.4 > 10.16.1.10: ICMP 10.16.1.4 udp port 13
20:38:40.720440 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:38:40.720881 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:38:40.720884 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:38:40.725145 IP 10.16.1.4.1338 > 10.16.1.10.53: 37788+ PTR? 10.1.
20:38:40.726787 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:38:40.726788 IP 10.16.1.10.53 > 10.16.1.4.1338: 37770* 1/1/1 PTR[
20:38:40.726789 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:38:45.731429 IP 10.16.1.4.1338 > 10.16.1.10.53: 37788+ PTR? 10.1.
20:38:50.751605 IP 10.16.1.4.1338 > 10.16.1.10.53: 37789+ PTR? 2.16.
20:38:55.762171 IP 10.16.1.4.1338 > 10.16.1.10.53: 37789+ PTR? 2.16.
20:39:00.773518 IP 10.16.1.10.53 > 10.16.1.4.1338: 37770* 1/1/1 PTR[
20:39:00.773519 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:39:00.773520 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:39:00.773521 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:39:00.773798 IP 10.16.1.4 > 10.16.1.10: ICMP 10.16.1.4 udp port 13
20:39:00.774357 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:39:00.774360 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:39:00.777867 IP 10.16.1.4.1338 > 10.16.1.10.53: 37790+ PTR? 10.1.
20:39:00.779564 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG daemon.info
20:39:00.779567 IP 10.16.1.10.514 > 10.16.1.4.514: SYSLOG mail.error,
20:39:05.784210 IP 10.16.1.4.1338 > 10.16.1.10.53: 37790+ PTR? 10.1.
20:39:10.795953 IP 10.16.1.4.1338 > 10.16.1.10.53: 37791+ PTR? 10.1.
20:39:15.806276 IP 10.16.1.4.1338 > 10.16.1.10.53: 37791+ PTR? 10.1.
20:39:20.818109 IP 10.16.1.4.1338 > 10.16.1.10.53: 37792+ PTR? 10.1.
20:39:25.828337 IP 10.16.1.4.1338 > 10.16.1.10.53: 37792+ PTR? 10.1.
20:39:30.829302 arp who-has 10.16.1.10 tell 10.16.1.4
20:39:30.843575 IP 10.16.1.4.1338 > 10.16.1.10.53: 37793+ PTR? 10.1.
20:39:31.829514 arp who-has 10.16.1.10 tell 10.16.1.4
20:39:32.829712 arp who-has 10.16.1.10 tell 10.16.1.4
20:39:35.860338 arp who-has 10.16.1.10 tell 10.16.1.4
20:39:36.860559 arp who-has 10.16.1.10 tell 10.16.1.4

Here is the tcpdump running on eth0 of 10.16.1.10:

20:38:40.713136 IP 10.16.1.4 > 10.16.1.10: icmp 135: 10.16.1.4 udp port 1338 unreachable
20:38:40.713688 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:38:40.713698 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:38:40.718308 IP 10.16.1.10.514 > 10.16.1.4.514: UDP, length: 80
20:38:40.718319 IP 10.16.1.10.53 > 10.16.1.4.1338: 37770* 1/1/1 PTR[|domain]
20:38:40.718322 IP 10.16.1.10.514 > 10.16.1.4.514: UDP, length: 79
20:38:40.718337 IP 10.16.1.4.1338 > 10.16.1.10.53: 37788+ PTR? 10.1.16.10.in-addr.arpa. (41)
20:38:40.862428 IP 10.16.1.10.53 > 10.16.1.4.1338: 37770* 1/1/1 PTR[|domain]
20:38:40.862434 IP 10.16.1.10.514 > 10.16.1.4.514: UDP, length: 80
20:38:40.862437 IP 10.16.1.10.514 > 10.16.1.4.514: UDP, length: 79
20:38:40.862439 IP 10.16.1.10.514 > 10.16.1.4.514: UDP, length: 76
20:38:40.862441 IP 10.16.1.10.514 > 10.16.1.4.514: UDP, length: 81
20:38:45.723532 IP 10.16.1.4.1338 > 10.16.1.10.53: 37788+ PTR? 10.1.16.10.in-addr.arpa. (41)
<missed a part>
20:40:00.909075 IP 10.16.1.10.53 > 10.16.1.4.1338: 37772* 1/1/1 PTR[|domain]
20:40:00.914836 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:01.914819 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:02.914787 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:05.924838 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:06.924784 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:07.924846 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:10.925408 IP 10.16.1.10.53 > 10.16.1.4.1338: 37772* 1/1/1 PTR[|domain]
20:40:10.928817 IP 10.16.1.10.53 > 10.16.1.4.1338: 37773* 1/1/1 PTR[|domain]
20:40:10.934789 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:11.934841 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:12.934804 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:15.944830 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:16.944796 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:17.944867 arp who-has 10.16.1.10 tell 10.16.1.4
20:40:20.947643 IP 10.16.1.4 > 10.16.1.10: icmp 135: 10.16.1.4 udp port 1338 unreachable
20:40:20.948283 IP 10.16.1.10.53 > 10.16.1.4.1338: 37773* 1/1/1 PTR[|domain]
20:40:20.949142 IP 10.16.1.4 > 10.16.1.10: icmp 135: 10.16.1.4 udp port 1338 unreachable
20:40:20.949616 IP 10.16.1.10.53 > 10.16.1.4.1338: 37774* 1/1/1 PTR[|domain]
20:40:20.949633 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:20.949636 IP 10.16.1.10.53 > 10.16.1.4.1338: 37774* 1/1/1 PTR[|domain]
20:40:20.949641 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:20.949643 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:20.950740 IP 10.16.1.4.1338 > 10.16.1.10.53: 37798+ PTR? 10.1.16.10.in-addr.arpa. (41)
20:40:20.953255 IP 10.16.1.10.53 > 10.16.1.4.1338: 37775* 1/1/1 PTR[|domain]
20:40:20.953263 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:20.953269 IP 10.16.1.10.53 > 10.16.1.4.1338: 37775* 1/1/1 PTR[|domain]
20:40:20.953272 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:20.957795 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:20.957807 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:20.957811 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:22.874129 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:22.874146 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:22.874149 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:22.874151 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:22.874154 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:22.874156 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:22.874158 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:22.874165 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:25.964790 IP 10.16.1.4.1338 > 10.16.1.10.53: 37798+ PTR? 10.1.16.10.in-addr.arpa. (41)
20:40:30.987943 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:30.987951 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:30.987958 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:30.990887 IP 10.16.1.4.1338 > 10.16.1.10.53: 37799+ PTR? 10.1.16.10.in-addr.arpa. (41)
20:40:35.997336 IP 10.16.1.4.1338 > 10.16.1.10.53: 37799+ PTR? 10.1.16.10.in-addr.arpa. (41)
20:40:41.008222 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:41.008245 arp reply 10.16.1.10 is-at aa:00:00:00:00:14
20:40:41.008252 arp reply 10.16.1.10 is-at aa:00:00:00:00:14

On Friday, 29 July 2005 at 12:44, Ian Pratt wrote:
> > __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
> >
> > Uptime for the DomU is around 130-160 days in most cases.
> >
> > I am using a fairly old version of Xen (around 2.0.1 I
> > think): is there any obvious problem that has been fixed in
> > the past months that would resolve this issue? (I would have
> > to be fairly sure of concrete improvements before upgrading
> > these machines as if anything goes wrong I will have to hop
> > in my car and drive for hours!)
>
> There's been a lot of fixes since 2.0.1, so you'd be well advised to
> upgrade (I'd even go to 2.0-testing).
>
> It might be worth doing a cat /proc/slabinfo in one of your stuck domU's
> to see if there's anything obviously leaking.
>
> You may get some value just upgrading the domU and leaving the dom0 and
> Xen as-is, but I'd generally advise upgrading everything together.
>
> Ian
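As a rough way to put numbers on the ARP stalls visible in the tcpdump captures in this message, the sketch below measures how long each "arp who-has" waited for its first "arp reply". It is only an illustration, not a tool used in the thread: it assumes the exact text layout shown above (HH:MM:SS.micros, then "arp who-has X tell Y" or "arp reply X is-at MAC"), it does not handle captures that cross midnight, and the name arp_delays is made up for this example.

```python
import re
import sys
from datetime import datetime

# Patterns for the two ARP line shapes that appear in the captures above.
REQUEST = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) arp who-has (\S+) tell (\S+)")
REPLY = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) arp reply (\S+) is-at (\S+)")


def _seconds(stamp):
    """Convert an HH:MM:SS.ffffff timestamp to seconds since midnight."""
    t = datetime.strptime(stamp, "%H:%M:%S.%f")
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6


def arp_delays(lines):
    """Return [(ip, seconds)] from the first unanswered who-has to the first reply.

    Repeated requests for the same address count from the first one, and
    duplicate replies after the first are ignored.
    """
    pending = {}  # target IP -> time of the first request still unanswered
    delays = []
    for line in lines:
        line = line.strip()
        m = REQUEST.match(line)
        if m:
            pending.setdefault(m.group(2), _seconds(m.group(1)))
            continue
        m = REPLY.match(line)
        if m and m.group(2) in pending:
            started = pending.pop(m.group(2))
            delays.append((m.group(2), _seconds(m.group(1)) - started))
    return delays


if __name__ == "__main__":
    # Usage: tcpdump -ni vif5.0 host 10.16.1.4 | python this_script.py
    for ip, delay in arp_delays(sys.stdin):
        print(f"{ip:15s} answered after {delay:9.6f} s")
```

Fed the vif5.0 capture, this should make the asymmetry obvious: the dom0 answers requests for 10.16.1.4 within a fraction of a millisecond, while the request for 10.16.1.10 sent at 20:38:25 gets its reply only at 20:38:40.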