Steve Traugott
2006-Aug-02 20:54 UTC
[Xen-devel] blocking Xen 3.X production use: soft lockup bugs
Hi All,

I hate to say it, but it's starting to look like soft lockup bug(s) are turning into a serious roadblock for general production use of Xen 3.X, on a wide range of hardware. I've been using Xen since the 1.0 days, and I have to say that this is the most serious showstopper bug I've ever hit -- it usually manifests itself during the first significant network and/or disk I/O after starting a second or third domU on the same box, and is the only bug I've ever hit that has caused permanent damage -- it tends to corrupt guest filesystems. In my case it's stopped a deployment dead in its tracks, and our only options at this point are to go back to Xen 2.X or (horrors) to native Linux kernels.

The problem (or something that looks identical) is described in several tickets, status currently NEW or REOPENED, no clear resolution:

http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=543
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=690
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=697
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=705

In our own shop, we consistently hit soft lockups while running on both IBM x330's and older Netengines (similar to an IBM 4000R). We've found no workaround. We're on xen-3.0-testing, changeset 9732, kernel 2.6.16.13. On April 6th, Keir posted a note saying this was fixed as of a blkif_schedule() fix, which we already have because that was way back in changeset 9587:

http://lists.xensource.com/archives/html/xen-devel/2006-04/msg00121.html

The most recent devel list traffic I've found which covers this is July 7th:

http://lists.xensource.com/archives/html/xen-users/2006-07/msg00134.html

This message referred back to Keir's comment as describing a fix, but it doesn't look true; while Keir's 9587 checkin may have fixed a soft lockup problem, there appear to be more out there, or else there's been a regression.

Do we have any consensus that this bug is fixed at all in xen-3.0-testing, or even unstable? Is anyone who was hitting soft lockups in testing *not* hitting them any more on the same hardware? If so, what changeset are you on now?

If anyone needs any more information, just let me know. As usual, if anyone wants login and console server access to one of these boxes to chase this down, I'm more than happy to provide that.

Thanks,

Steve

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Ian Pratt
2006-Aug-02 22:25 UTC
RE: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
> The problem (or something that looks identical) is described in
> several tickets, status currently NEW or REOPENED, no clear
> resolution:
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=543
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=690
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=697
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=705

There's very little to go on here. Two of the bugs are actually the same guy. One of the others is x86_64; the other two are 32b. The only thing in common about the stack traces is that networking functions seem to feature.

Taking a wild guess, are you doing some kind of unusual networking setup involving iptables rules?

> Do we have any consensus that this bug is fixed at all in
> xen-3.0-testing, or even unstable? Is anyone who was hitting soft
> lockups in testing *not* hitting them any more on the same hardware?
> If so, what changeset are you on now?

Soft lockups could be due to a huge variety of causes. It's unlikely to be a hardware issue, and since the problems seem to be experienced by a very small number of users my guess would be that it's configuration dependent, most likely networking.

> If anyone needs any more information, just let me know. As usual, if
> anyone wants login and console server access to one of these boxes to
> chase this down, I'm more than happy to provide that.

Having a really detailed bug report would be the best way of proceeding. When this happens, does it just affect one guest? What's the stack trace? How many VCPUs has the guest got? Is the guest completely hosed or is it still pingable? What about guest console echo? What about 'xm sysrq'? Looking in dom0, are you still seeing packets go to/from the associated VIF? How many network interfaces has the guest got? What's the precise networking setup in dom0? Can you come up with a recipe for reproduction, ideally with a single guest?

Thanks,
Ian
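The questions above map fairly directly onto a handful of dom0 commands. What follows is a minimal sketch, assuming Python 3.7 or later is available in dom0, that gathers them in one pass. The domain name, VIF name and guest address are placeholders supplied by the caller, the SysRq key 't' is only an illustration, and the helper names diagnostic_commands and collect are hypothetical rather than part of any Xen tool.

```python
import shlex
import subprocess


def diagnostic_commands(domain, vif, guest_ip):
    """Return dom0 commands that help answer the bug-report questions.

    All three arguments are placeholders chosen by the caller; nothing
    here is specific to the machines discussed in this thread.
    """
    return [
        ["xm", "list"],                  # is the guest (or dom0) burning CPU?
        ["xm", "vcpu-list", domain],     # how many VCPUs, and where they run
        ["ping", "-c", "3", guest_ip],   # is the guest still pingable?
        ["xm", "sysrq", domain, "t"],    # send a SysRq key to the guest kernel
        ["ifconfig", vif],               # packet counters on the guest's VIF
        ["brctl", "show"],               # bridge layout in dom0
        ["iptables", "-L", "-n", "-v"],  # any filtering rules in dom0?
        ["xm", "dmesg"],                 # hypervisor message buffer
    ]


def collect(domain, vif, guest_ip, run=subprocess.run):
    """Run each command and return (command line, output) pairs."""
    results = []
    for argv in diagnostic_commands(domain, vif, guest_ip):
        try:
            proc = run(argv, capture_output=True, text=True, timeout=30)
            output = proc.stdout + proc.stderr
        except (OSError, subprocess.SubprocessError) as exc:
            # A missing tool or a timeout is recorded, not fatal.
            output = "failed: %s" % exc
        results.append((" ".join(shlex.quote(a) for a in argv), output))
    return results
```

Attaching the collected pairs to a bug report, together with the guest's console output, would cover most of what is asked for; the guest console echo still has to be checked by hand.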
Steve Traugott
2006-Aug-02 22:48 UTC
[Xen-devel] Re: blocking Xen 3.X production use: soft lockup bugs
Here are some examples of the sort of soft lockups I'm seeing -- I can't say right now if they've all been showing the same stack trace, but I'll keep an eye on that from now on. I know they haven't all been on the same CPU. Anything else anyone needs, just let me know -- and I'd like to reaffirm my earlier offer of access to one of these machines.

I'm also starting to think a XenSource wiki page "how to report/workaround soft lockups" might be in order; I suspect many of the bug reports (including my own) haven't been detailed enough to differentiate between the various things that can cause soft lockups.

This was on an IBM x330.

Steve

n4h34:~# xm create -c /etc/xen/auto/build2.t7a.org
Using config file "/etc/xen/auto/build2.t7a.org".
Started domain build2.t7a.org
Linux version 2.6.16.13-xen (root@n4h33) (gcc version 3.3.5 (Debian 1:3.3.5-12)) #2 SMP Sun Jun 11 14:25:16 PDT 2006
BIOS-provided physical RAM map:
Xen: 0000000000000000 - 0000000008000000 (usable)
0MB HIGHMEM available.
136MB LOWMEM available.
ACPI in unprivileged domain disabled
IRQ lockup detection disabled
Built 1 zonelists
Kernel command line: root=/dev/sda1 2
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 1024 (order: 10, 16384 bytes)
Xen reported: 1130.113 MHz processor.
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Software IO TLB disabled
vmalloc area: c9000000-fb7fe000, maxmem 33ffe000
Memory: 114612k/139264k available (3368k kernel code, 16308k reserved, 1033k data, 196k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 2261.96 BogoMIPS (lpj=11309833)
Security Framework v1.0.0 initialized
Capability LSM initialized
Mount-cache hash table entries: 512
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
Checking 'hlt' instruction... OK.
Brought up 1 CPUs
migration_cost=0
checking if image is initramfs... it is
Freeing initrd memory: 9535k freed
Grant table initialized
NET: Registered protocol family 16
Brought up 1 CPUs
PCI: setting up Xen PCI frontend stub
ACPI: Subsystem revision 20060127
ACPI: Interpreter disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
xen_mem: Initialising balloon driver.
SCSI subsystem initialized
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: System does not support PCI
PCI: System does not support PCI
IA-32 Microcode Update Driver: v1.14-xen <tigran@veritas.com>
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
JFS: nTxBlock = 1024, nTxLock = 8192
SGI XFS with ACLs, security attributes, realtime, large block numbers, no debug enabled
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
PNP: No PS/2 controller found. Probing ports directly.
i8042.c: No controller found.
RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
Xen virtual console successfully installed as tty1
Event-channel device installed.
blkif_init: reqs=64, pages=704, mmap_vstart=0xc7400000
netfront: Initialising virtual ethernet driver.
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 50MHz system bus speed for PIO modes; override with idebus=xx
Registering block device major 8
ide-floppy driver 0.99.newide
Fusion MPT base driver 3.03.07
Copyright (c) 1999-2005 LSI Logic Corporation
Fusion MPT SPI Host driver 3.03.07
Fusion MPT misc device (ioctl) driver 3.03.07
mptctl: Registered with Fusion MPT base driver
mptctl: /dev/mptctl @ (major,minor=10,220)
usbmon: debugfs is not available
usbcore: registered new driver libusual
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 8
NET: Registered protocol family 20
Using IPI No-Shortcut mode
Freeing unused kernel memory: 196k freed
Loading, please wait...
Begin: Loading essential drivers... ...
tg3: no version for "struct_module" found: kernel tainted.
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others
Intel(R) PRO/1000 Network Driver - version 6.3.9-k4
Copyright (c) 1999-2005 Intel Corporation.
Done.
Begin: Running /scripts/init-premount ...
FATAL: Error inserting fan (/lib/modules/2.6.16.13-xen/kernel/drivers/acpi/fan.ko): No such device
FATAL: Error inserting thermal (/lib/modules/2.6.16.13-xen/kernel/drivers/acpi/thermal.ko): No such device
Done.
Begin: Mounting root file system... ...
Begin: Running /scripts/local-top ...
Done.
Begin: Running /scripts/local-premount ...
Done.
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
Begin: Running /scripts/log-bottom ...
Done.
Done.
Begin: Running /scripts/init-bottom ...
Done.
mount: Mounting /sys on /root/sys failed: No such file or directory
INIT: version 2.85 booting
Activating swap.
Checking root file system...
fsck 1.39 (29-May-2006)
/dev/sda1: clean, 21526/917504 files, 245920/1835007 blocks
EXT3 FS on sda1, internal journal
System time was Wed Aug 2 22:17:34 UTC 2006.
Setting the System Clock using the Hardware Clock as reference...
System Clock set. System local time is now Wed Aug 2 22:17:37 UTC 2006.
Loading device-mapper support.
Checking all file systems...
fsck 1.39 (29-May-2006)
Setting kernel variables..
Mounting local filesystems...
Adding 524280k swap on /swap00. Priority:-1 extents:134 across:533176k
Cleaning /tmp /var/run /var/lock.
Running 0dns-down to make sure resolv.conf is ok...done.
Cleaning: /etc/network/ifstate.
Setting up IP spoofing protection: rp_filter.
Configuring network interfaces...done.
Loading the saved-state of the serial devices...
/dev/ttyS0: No such file or directory
/dev/ttyS0: No such file or directory
/dev/ttyS1: No such file or directory
/dev/ttyS1: No such file or directory
Not setting System Clock
Initializing random number generator...done.
Recovering nvi editor sessions... done.
INIT: Entering runlevel: 2
Starting isconf daemonRunning isconf updateisconf: info: build2.t7a.org is on guest-1 branch
isconf: info: may reboot...
isconf: info: checking for updates
isconf: info: fetching http://10.27.4.7:65028/t7a.org/block/fb2/fb2e8177e647be52a1c64e21fcb913455c71e731-8b3a10ecde5fc43984807e34550a2ebd-1?challenge=0.911958506882
isconf: info: fetching http://10.27.4.7:65028/t7a.org/block/fb2/fb2e8177e647be52a1c64e21fcb913455c71e731-8b3a10ecde5fc43984807e34550a2ebd-1?challenge=0.999292957677
isconf: info: fetching http://10.27.4.7:65028/t7a.org/block/fb2/fb2e8177e647be52a1c64e21fcb913455c71e731-8b3a10ecde5fc43984807e34550a2ebd-1?challenge=0.239902520967
BUG: soft lockup detected on CPU#0!

Pid: 2383, comm: isconf
EIP: 0073:[<080c9763>] CPU: 0
EIP is at 0x80c9763
ESP: 007b:bfcc962c EFLAGS: 00200282 Tainted: GF (2.6.16.13-xen #2)
EAX: 00000001 EBX: 0000003a ECX: bfcc9624 EDX: 00000000
ESI: 08137cb4 EDI: 00000001 EBP: bfcc9638 DS: 007b ES: 007b
CR0: 80050033 CR2: b7b97000 CR3: 0055e000 CR4: 00000640
isconf: info: fetching http://10.27.4.34:65028/t7a.org/block/ff1/ff1276f7811aeeade18d54a6c3578261ff36ecbb-4fb47b36cda57ae95af56372f03bb2ca-1?challenge=0.265409462016
isconf: info: updated /etc/ldap/ldap.conf
BUG: soft lockup detected on CPU#0!

Pid: 2383, comm: isconf
EIP: 0073:[<080af84d>] CPU: 0
EIP is at 0x80af84d
ESP: 007b:bfcc96d0 EFLAGS: 00200246 Tainted: GF (2.6.16.13-xen #2)
EAX: 00000001 EBX: 082031fe ECX: 082031fe EDX: b7af1f8c
ESI: 00000000 EDI: 082030ec EBP: bfcc9838 DS: 007b ES: 007b
CR0: 80050033 CR2: b7b97000 CR3: 0055e000 CR4: 00000640
isconf: info: fetching http://10.27.4.7:65028/t7a.org/block/c0e/c0e10bc50572deb89da6e9d96ac5971a39fddc65-fc3558eaffc90497248f97f9b0e3a924-1?challenge=0.130730726051
isconf: info: updated /etc/ca-certificates.conf
isconf: info: running ['update-ca-certificates']
Updating certificates in /etc/ssl/certs....done.
isconf: info: updated /etc/ldap/ldap.conf
BUG: soft lockup detected on CPU#0!

Pid: 1, comm: init
EIP: 0061:[<c0322fe1>] CPU: 0
EIP is at netif_poll+0x101/0x810
EFLAGS: 00000216 Tainted: GF (2.6.16.13-xen #2)
EAX: 00000037 EBX: c0945180 ECX: 0001134e EDX: c0945000
ESI: c0f48280 EDI: c0f499e8 EBP: c09451c0 DS: 007b ES: 007b
CR0: 8005003b CR2: b7d579e0 CR3: 0057e000 CR4: 00000640
[<c03d891a>] net_rx_action+0xea/0x230
[<c0124cb5>] __do_softirq+0xf5/0x120
[<c0124d75>] do_softirq+0x95/0xa0
[<c0106c0f>] do_IRQ+0x1f/0x30
[<c0312f58>] evtchn_do_upcall+0xa8/0xf0
[<c0105178>] hypervisor_callback+0x2c/0x34
[<c02c2081>] __copy_user_intel+0x31/0xb0
[<c02c2220>] __copy_to_user_ll+0x70/0x80
[<c02c22f2>] copy_to_user+0x42/0x60
[<c0171068>] cp_new_stat64+0xf8/0x110
[<c01710b7>] sys_stat64+0x37/0x40
[<c0104fb5>] syscall_call+0x7/0xb
isconf: warning: clierr: Connection reset by peer
Starting system log daemon: syslogd.
Starting kernel log daemon: klogd.
No configuration file was found for slapd at /etc/ldap/slapd.conf.
If you have moved the slapd configuration file please modify
/etc/default/slapd to reflect this. If you chose to not
configure slapd during installation then you need to do so
prior to attempting to start slapd.
An example slapd.conf is in /usr/share/slapd
Starting Heimdal KDC: heimdal-kdc.
Starting Heimdal password server: kpasswdd.
Starting internet superserver: inetd.
Starting PCMCIA services: module directory /lib/modules/2.6.16.13-xen/pcmcia not found.
Starting OpenBSD Secure Shell server: sshd.
Starting deferred execution scheduler: atd.
Starting periodic command scheduler: cron.

Debian GNU/Linux testing/unstable build2.t7a.org tty1

build2.t7a.org login:
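Whether the lockups all show the same trace can be checked mechanically from a saved console capture. The following is a rough sketch, assuming the report layout seen in the log above (a "BUG: soft lockup detected on CPU#N!" line followed by "Pid: ..., comm: ..." and "EIP is at ..." lines). The function names parse_lockups and summarize are made up for illustration, and other kernel versions may print the report differently.

```python
import re
from collections import Counter

LOCKUP = re.compile(r"BUG: soft lockup detected on CPU#(\d+)!")
COMM = re.compile(r"Pid:\s*(\d+),\s*comm:\s*(\S+)")
EIP_AT = re.compile(r"EIP is at (\S+)")


def parse_lockups(console_text):
    """Return one dict per soft lockup report found in a console capture."""
    reports = []
    current = None
    for line in console_text.splitlines():
        m = LOCKUP.search(line)
        if m:
            # A new report starts; later lines fill in its details.
            current = {"cpu": int(m.group(1)), "pid": None,
                       "comm": None, "eip": None}
            reports.append(current)
            continue
        if current is None:
            continue
        m = COMM.search(line)
        if m and current["comm"] is None:
            current["pid"], current["comm"] = int(m.group(1)), m.group(2)
            continue
        m = EIP_AT.search(line)
        if m and current["eip"] is None:
            # Keep only the symbol (or raw address) so that differing
            # offsets within one function do not split the groups.
            current["eip"] = m.group(1).split("+")[0]
    return reports


def summarize(reports):
    """Count reports by (cpu, comm, location)."""
    return Counter((r["cpu"], r["comm"], r["eip"]) for r in reports)
```

Run over the capture above, this would separate the two reports taken while the CPU was at a raw address in the isconf process from the one taken in netif_poll, which is the kind of distinction a detailed bug report needs.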
Ian Pratt
2006-Aug-02 23:36 UTC
RE: [Xen-devel] Re: blocking Xen 3.X production use: soft lockup bugs
> Here are some examples of the sort of soft lockups I'm seeing -- I
> can't say right now if they've all been showing the same stack trace,
> but I'll keep an eye on that from now on. I know they haven't all
> been on the same CPU. Anything else anyone needs, just let me know --
> and I'd like to reaffirm my earlier offer of access to one of these
> machines.

So, you're seeing this on a 1 CPU 32b non PAE guest.

Please can you answer some of my other questions in the previous email. How many other guests were running, what else was the system doing? Using 'xm list', is the guest burning CPU? What about dom0?

The soft lockup messages appear to be benign in that the domain seems to be continuing quite happily after printing them -- it's quite possible that the system was sufficiently busy that the domain VCPU just didn't get scheduled for a while, triggering the warning message. Are you sure they're actually related to the more serious problem you're experiencing?

Have you tried using -unstable and hence Xen's new scheduler? This is less likely to provoke soft lockup false alarms.

Ian
Steve Traugott
2006-Aug-03 00:27 UTC
Re: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
Hi Ian,

Thanks for your patience...

On Wed, Aug 02, 2006 at 11:25:45PM +0100, Ian Pratt wrote:
> > The problem (or something that looks identical) is described in
> > several tickets, status currently NEW or REOPENED, no clear
> > resolution:
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=543
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=690
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=697
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=705
>
> There's very little to go on here. Two of the bugs are actually the same
> guy. One of the others is x86_64 the other two are 32b.

That's what I was starting to realize -- a lot of folks (including me)
have been classing all soft lockups together, without digging deeper.

> The only thing in common about the stack traces is that networking
> functions seem to feature.

Might be the same in my case; see the stack trace in my message in this
thread a few minutes ago, copied below. The 'isconf' process you see
there does a lot of UDP and TCP traffic for file transfers, as well as
moderate disk I/O.

> Taking a wild guess, are you doing some kind of unusual networking setup
> involving iptables rules?

Nope. Right now I can't think of anything I'm doing that's not the
standard Xen bridging setup.

> > Do we have any consensus that this bug is fixed at all in
> > xen-3.0-testing, or even unstable? Is anyone who was hitting soft
> > lockups in testing *not* hitting them any more on the same hardware?
> > If so, what changeset are you on now?
>
> Soft lockups could be due to a huge variety of causes. It's unlikely to
> be a hardware issue, and since the problems seem to be experienced by a
> very small number of users my guess would be that it's configuration
> dependent, most likely networking.
>
> > If anyone needs any more information, just let me know. As usual, if
> > anyone wants login and console server access to one of these boxes to
> > chase this down, I'm more than happy to provide that.
>
> Having a really detailed bug report would really be the best way of
> proceeding.

This is why I was thinking about starting a "how to report soft lockups"
wiki page; I think we haven't been giving you enough. Is there already a
more generic Xen bug reporting howto somewhere, or should I have at it,
using your questions below as a start?

> When this happens, does it just effect one guest?

We typically see error messages on only one guest's console, but other
guests and dom0 tend to lock up for ~30 seconds as well.

> What's the stack trace?

See the dmesg below (this is the same one I just posted a few minutes
ago, in my previous message, copying here for reference).

> How many VCPUs has the guest got?

One. So far I've seen soft lockups with and without nosmp on the Xen
command line on our Netengines, but can't yet tell you if they were the
same stack trace. Haven't tried nosmp on the x330's yet, am about to.

> Is the guest completely hosed or is it still pingable?

Tends to be unpingable for ~30 seconds, usually recovers, but sometimes
corrupts filesystem to a state which is unrecoverable (first machine it
destroyed was our primary KDC, ouch...)

> What about guest console echo?

Works, but latency is on the order of several seconds or more for ~30
seconds.

> What about 'xm sysreq'?

Unable to answer this yet due to that high dom0 latency.

> Looking in dom0, are you still seeing packets go to/from the
> associated VIF?

Unable to answer this yet due to that high dom0 latency.

> How many network interfaces has the guest got?

Only eth0 and lo.

> What's the precise networking setup in dom0?

Standard Xen bridging config:

n4h34:~# ifconfig | grep encap
eth0      Link encap:Ethernet  HWaddr 00:02:55:C7:CA:D8
lo        Link encap:Local Loopback
peth0     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
vif0.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
vif1.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
vif5.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
xenbr0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF

n4h34:~# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

n4h34:~# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.27.4.0       *               255.255.255.0   U     0      0        0 eth0
default         10.27.4.254     0.0.0.0         UG    0      0        0 eth0

> Can you come up with a recipe for reproduction, ideally with a
> single guest?

It looks like it can be reliably produced by starting a second guest and
doing a mix of steady network and disk I/O -- isconf, for instance, runs
during rc and updates the local disk image by pulling new packages over
the network and installing them on the fly (http://trac.t7a.org/isconf/),
so it is usually the first to trigger the bug in our environment.
I haven''t seen the bug as often with only one guest. For example, I built an AFS server in domain 1 on this same x330, generating lots of disk and network I/O in the process, ran it for days with no problems, then tried to start a copy of the same base image up as domain 2 on the same box and got the dmesg you see here; only the hostname, IP, MAC etc. were different. I''ll see if I can come up with a simple python script or something which can trigger it. The only other "unusual" thing I can think of about this configuration is that it''s using DRBD on top of EVMS in dom0 for the guest volumes; this would also increase dom0 network traffic during any guest disk I/O. I hope to heck this doesn''t turn out to be a DRBD incompatibility; we''ve used DRBD with Xen since the early 2.X days, and it''s been solid. I''ll have to do some testing to see if I can eliminate DRBD as a factor. Steve n4h34:~# xm create -c /etc/xen/auto/build2.t7a.org Using config file "/etc/xen/auto/build2.t7a.org". Started domain build2.t7a.org Linux version 2.6.16.13-xen (root@n4h33) (gcc version 3.3.5 (Debian 1:3.3.5-12)) #2 SMP Sun Jun 11 14:25:16 PDT 2006 BIOS-provided physical RAM map: Xen: 0000000000000000 - 0000000008000000 (usable) 0MB HIGHMEM available. 136MB LOWMEM available. ACPI in unprivileged domain disabled IRQ lockup detection disabled Built 1 zonelists Kernel command line: root=/dev/sda1 2 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Initializing CPU#0 PID hash table entries: 1024 (order: 10, 16384 bytes) Xen reported: 1130.113 MHz processor. Dentry cache hash table entries: 32768 (order: 5, 131072 bytes) Inode-cache hash table entries: 16384 (order: 4, 65536 bytes) Software IO TLB disabled vmalloc area: c9000000-fb7fe000, maxmem 33ffe000 Memory: 114612k/139264k available (3368k kernel code, 16308k reserved, 1033k data, 196k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. 
Calibrating delay using timer specific routine.. 2261.96 BogoMIPS (lpj=11309833) Security Framework v1.0.0 initialized Capability LSM initialized Mount-cache hash table entries: 512 CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 512K Checking ''hlt'' instruction... OK. Brought up 1 CPUs migration_cost=0 checking if image is initramfs... it is Freeing initrd memory: 9535k freed Grant table initialized NET: Registered protocol family 16 Brought up 1 CPUs PCI: setting up Xen PCI frontend stub ACPI: Subsystem revision 20060127 ACPI: Interpreter disabled. Linux Plug and Play Support v0.97 (c) Adam Belay xen_mem: Initialising balloon driver. SCSI subsystem initialized usbcore: registered new driver usbfs usbcore: registered new driver hub PCI: System does not support PCI PCI: System does not support PCI IA-32 Microcode Update Driver: v1.14-xen <tigran@veritas.com> VFS: Disk quotas dquot_6.5.1 Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) JFS: nTxBlock = 1024, nTxLock = 8192 SGI XFS with ACLs, security attributes, realtime, large block numbers, no debug enabled Initializing Cryptographic API io scheduler noop registered io scheduler anticipatory registered (default) io scheduler deadline registered io scheduler cfq registered PNP: No PS/2 controller found. Probing ports directly. i8042.c: No controller found. RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize Xen virtual console successfully installed as tty1 Event-channel device installed. blkif_init: reqs=64, pages=704, mmap_vstart=0xc7400000 netfront: Initialising virtual ethernet driver. 
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 50MHz system bus speed for PIO modes; override with idebus=xx Registering block device major 8 ide-floppy driver 0.99.newide Fusion MPT base driver 3.03.07 Copyright (c) 1999-2005 LSI Logic Corporation Fusion MPT SPI Host driver 3.03.07 Fusion MPT misc device (ioctl) driver 3.03.07 mptctl: Registered with Fusion MPT base driver mptctl: /dev/mptctl @ (major,minor=10,220) usbmon: debugfs is not available usbcore: registered new driver libusual mice: PS/2 mouse device common for all mice md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27 md: bitmap version 4.39 NET: Registered protocol family 2 IP route cache hash table entries: 2048 (order: 1, 8192 bytes) TCP established hash table entries: 8192 (order: 4, 65536 bytes) TCP bind hash table entries: 8192 (order: 4, 65536 bytes) TCP: Hash tables configured (established 8192 bind 8192) TCP reno registered Initializing IPsec netlink socket NET: Registered protocol family 1 NET: Registered protocol family 17 NET: Registered protocol family 8 NET: Registered protocol family 20 Using IPI No-Shortcut mode Freeing unused kernel memory: 196k freed Loading, please wait... Begin: Loading essential drivers... ... tg3: no version for "struct_module" found: kernel tainted. eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others Intel(R) PRO/1000 Network Driver - version 6.3.9-k4 Copyright (c) 1999-2005 Intel Corporation. Done. Begin: Running /scripts/init-premount ... FATAL: Error inserting fan (/lib/modules/2.6.16.13-xen/kernel/drivers/acpi/fan.ko): No such device FATAL: Error inserting thermal (/lib/modules/2.6.16.13-xen/kernel/drivers/acpi/thermal.ko): No such device Done. Begin: Mounting root file system... ... Begin: Running /scripts/local-top ... Done. Begin: Running /scripts/local-premount ... Done. kjournald starting. 
Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. Begin: Running /scripts/log-bottom ... Done. Done. Begin: Running /scripts/init-bottom ... Done. mount: Mounting /sys on /root/sys failed: No such file or directory INIT: version 2.85 booting Activating swap. Checking root file system... fsck 1.39 (29-May-2006) /dev/sda1: clean, 21526/917504 files, 245920/1835007 blocks EXT3 FS on sda1, internal journal System time was Wed Aug 2 22:17:34 UTC 2006. Setting the System Clock using the Hardware Clock as reference... System Clock set. System local time is now Wed Aug 2 22:17:37 UTC 2006. Loading device-mapper support. Checking all file systems... fsck 1.39 (29-May-2006) Setting kernel variables.. Mounting local filesystems... Adding 524280k swap on /swap00. Priority:-1 extents:134 across:533176k Cleaning /tmp /var/run /var/lock. Running 0dns-down to make sure resolv.conf is ok...done. Cleaning: /etc/network/ifstate. Setting up IP spoofing protection: rp_filter. Configuring network interfaces...done. Loading the saved-state of the serial devices... /dev/ttyS0: No such file or directory /dev/ttyS0: No such file or directory /dev/ttyS1: No such file or directory /dev/ttyS1: No such file or directory Not setting System Clock Initializing random number generator...done. Recovering nvi editor sessions... done. INIT: Entering runlevel: 2 Starting isconf daemonRunning isconf updateisconf: info: build2.t7a.org is on guest-1 branch isconf: info: may reboot... 
isconf: info: checking for updates isconf: info: fetching http://10.27.4.7:65028/t7a.org/block/fb2/fb2e8177e647be52a1c64e21fcb913455c71e731-8b3a10ecde5fc43984807e34550a2ebd-1?challenge=0.911958506882 isconf: info: fetching http://10.27.4.7:65028/t7a.org/block/fb2/fb2e8177e647be52a1c64e21fcb913455c71e731-8b3a10ecde5fc43984807e34550a2ebd-1?challenge=0.999292957677 isconf: info: fetching http://10.27.4.7:65028/t7a.org/block/fb2/fb2e8177e647be52a1c64e21fcb913455c71e731-8b3a10ecde5fc43984807e34550a2ebd-1?challenge=0.239902520967 BUG: soft lockup detected on CPU#0! Pid: 2383, comm: isconf EIP: 0073:[<080c9763>] CPU: 0 EIP is at 0x80c9763 ESP: 007b:bfcc962c EFLAGS: 00200282 Tainted: GF (2.6.16.13-xen #2) EAX: 00000001 EBX: 0000003a ECX: bfcc9624 EDX: 00000000 ESI: 08137cb4 EDI: 00000001 EBP: bfcc9638 DS: 007b ES: 007b CR0: 80050033 CR2: b7b97000 CR3: 0055e000 CR4: 00000640 isconf: info: fetching http://10.27.4.34:65028/t7a.org/block/ff1/ff1276f7811aeeade18d54a6c3578261ff36ecbb-4fb47b36cda57ae95af56372f03bb2ca-1?challenge=0.265409462016 isconf: info: updated /etc/ldap/ldap.conf BUG: soft lockup detected on CPU#0! Pid: 2383, comm: isconf EIP: 0073:[<080af84d>] CPU: 0 EIP is at 0x80af84d ESP: 007b:bfcc96d0 EFLAGS: 00200246 Tainted: GF (2.6.16.13-xen #2) EAX: 00000001 EBX: 082031fe ECX: 082031fe EDX: b7af1f8c ESI: 00000000 EDI: 082030ec EBP: bfcc9838 DS: 007b ES: 007b CR0: 80050033 CR2: b7b97000 CR3: 0055e000 CR4: 00000640 isconf: info: fetching http://10.27.4.7:65028/t7a.org/block/c0e/c0e10bc50572deb89da6e9d96ac5971a39fddc65-fc3558eaffc90497248f97f9b0e3a924-1?challenge=0.130730726051 isconf: info: updated /etc/ca-certificates.conf isconf: info: running [''update-ca-certificates''] Updating certificates in /etc/ssl/certs....done. isconf: info: updated /etc/ldap/ldap.conf BUG: soft lockup detected on CPU#0! 
Pid: 1, comm: init EIP: 0061:[<c0322fe1>] CPU: 0 EIP is at netif_poll+0x101/0x810 EFLAGS: 00000216 Tainted: GF (2.6.16.13-xen #2) EAX: 00000037 EBX: c0945180 ECX: 0001134e EDX: c0945000 ESI: c0f48280 EDI: c0f499e8 EBP: c09451c0 DS: 007b ES: 007b CR0: 8005003b CR2: b7d579e0 CR3: 0057e000 CR4: 00000640 [<c03d891a>] net_rx_action+0xea/0x230 [<c0124cb5>] __do_softirq+0xf5/0x120 [<c0124d75>] do_softirq+0x95/0xa0 [<c0106c0f>] do_IRQ+0x1f/0x30 [<c0312f58>] evtchn_do_upcall+0xa8/0xf0 [<c0105178>] hypervisor_callback+0x2c/0x34 [<c02c2081>] __copy_user_intel+0x31/0xb0 [<c02c2220>] __copy_to_user_ll+0x70/0x80 [<c02c22f2>] copy_to_user+0x42/0x60 [<c0171068>] cp_new_stat64+0xf8/0x110 [<c01710b7>] sys_stat64+0x37/0x40 [<c0104fb5>] syscall_call+0x7/0xb isconf: warning: clierr: Connection reset by peer Starting system log daemon: syslogd. Starting kernel log daemon: klogd. No configuration file was found for slapd at /etc/ldap/slapd.conf. If you have moved the slapd configuration file please modify /etc/default/slapd to reflect this. If you chose to not configure slapd during installation then you need to do so prior to attempting to start slapd. An example slapd.conf is in /usr/share/slapd Starting Heimdal KDC: heimdal-kdc. Starting Heimdal password server: kpasswdd. Starting internet superserver: inetd. Starting PCMCIA services: module directory /lib/modules/2.6.16.13-xen/pcmcia not found. Starting OpenBSD Secure Shell server: sshd. Starting deferred execution scheduler: atd. Starting periodic command scheduler: cron. Debian GNU/Linux testing/unstable build2.t7a.org tty1 build2.t7a.org login: -- Stephen G. Traugott (KG6HDQ) UNIX/Linux Infrastructure Architect, TerraLuna LLC stevegt@TerraLuna.Org http://www.stevegt.com -- http://Infrastructures.Org _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
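A minimal sketch for tallying the soft lockup reports in a console log like the one above, grouped by CPU and process name. It only assumes the two-line report header seen in this log ("BUG: soft lockup detected on CPU#N!" followed by "Pid: ..., comm: ..."); the function name `summarize_lockups` and the line breaks in `SAMPLE` are illustrative assumptions, not part of any Xen or kernel tool.

```python
import re
from collections import Counter

# Header of a soft lockup report as it appears in the console log quoted
# in this thread. \s+ tolerates either a space or a newline between the
# "BUG:" line and the "Pid:" line.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup detected on CPU#(?P<cpu>\d+)!\s+"
    r"Pid: (?P<pid>\d+), comm: (?P<comm>\S+)"
)


def summarize_lockups(console_text):
    """Return a Counter mapping (cpu, comm) to the number of reports."""
    return Counter(
        (int(m.group("cpu")), m.group("comm"))
        for m in LOCKUP_RE.finditer(console_text)
    )


# Excerpt of the three report headers from the log above. The line breaks
# are an assumption; the archived copy is flattened.
SAMPLE = """\
BUG: soft lockup detected on CPU#0!
Pid: 2383, comm: isconf
EIP: 0073:[<080c9763>] CPU: 0
BUG: soft lockup detected on CPU#0!
Pid: 2383, comm: isconf
EIP: 0073:[<080af84d>] CPU: 0
BUG: soft lockup detected on CPU#0!
Pid: 1, comm: init
EIP: 0061:[<c0322fe1>] CPU: 0
"""

if __name__ == "__main__":
    for (cpu, comm), count in sorted(summarize_lockups(SAMPLE).items()):
        print(f"CPU#{cpu} {comm}: {count}")
```

Grouping by process name makes it easier to see whether the reports follow one workload (here, mostly isconf) or are spread across unrelated processes.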
Steve Traugott
2006-Aug-03 00:42 UTC
Re: [Xen-devel] Re: blocking Xen 3.X production use: soft lockup bugs
On Thu, Aug 03, 2006 at 12:36:35AM +0100, Ian Pratt wrote:
> Using 'xm list', is the guest burning CPU?

I was watching for that, haven't spotted any significant CPU usage yet;
seems to be hung rather than spinning.

> What about dom0?

That I haven't been watching for. ;-) Will do.

> The soft lockup messages appear to be benign in that the domain seems to
> be continuing quite happily after printing them -- its quite possible
> that the system was sufficiently busy that the domain VCPU just didn't
> get scheduled for a while, triggering the warning message. Are you sure
> they're actually related to the more serious problem you're
> experiencing?

I can't prove that the network-related soft lockups I'm seeing on the
x330's are the same soft lockups related to filesystem damage we saw
on the Netengines -- we stopped using Netengines for Xen 3 when we hit
that (they run Xen 2 fine). Now that I know what to look for, I'll go
back and re-create the Xen 3 environment on the Netengines so I can
reproduce the problem there.

> Have you tried using -unstable and hence xen's new scheduler? This is
> less likely to provoke soft lockup false alarms.

Haven't tried unstable yet, since this is for the production
infrastructure for my family's business; am in the process of
rebuilding with testing changeset 9762 though. (is that really tip?
hg log says Jun 29th for that changeset, even after a pull...)

Thanks again,

Steve

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
Ian Pratt
2006-Aug-03 00:59 UTC
RE: [Xen-devel] Re: blocking Xen 3.X production use: soft lockup bugs
> > The soft lockup messages appear to be benign in that the domain seems to
> > be continuing quite happily after printing them -- its quite possible
> > that the system was sufficiently busy that the domain VCPU just didn't
> > get scheduled for a while, triggering the warning message. Are you sure
> > they're actually related to the more serious problem you're
> > experiencing?
>
> I can't prove that the network-related soft lockups I'm seeing on the
> x330's are the same soft lockups related to filesystem damage we saw
> on the Netengines -- we stopped using Netengines for Xen 3 when we hit
> that (they run Xen 2 fine). Now that I know what to look for, I'll go
> back and re-create the Xen 3 environment on the Netengines so I can
> reproduce the problem there.

Do you get anything of interest in dom0's dmesg?

The fact that dom0 is unresponsive for some seconds is interesting. What
does your dom0 use as a root filesystem? Is your dom0 uniprocessor or
SMP?

When it's in the stalled state, if you have a serial console, switching
to xen's debug console (ctrl-a three times) and hitting 'd' and 'q' a
few times might be useful. You'll need to look up all the EIPs and map
them to symbols by hand. (This is easier if you're running the same
kernel in all domains.)

> > Have you tried using -unstable and hence xen's new scheduler? This is
> > less likely to provoke soft lockup false alarms.
>
> Haven't tried unstable yet, since this is for the production
> infrastructure for my family's business; am in the process of
> rebuilding with testing changeset 9762 though. (is that really tip?
> hg log says Jun 29th for that changeset, even after a pull...)

There have been no requests to back port patches since then.

If you can, it's really worth trying -unstable. Any changeset from over
last weekend should be just fine.

Ian
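The by-hand EIP lookup mentioned above can be sketched as a nearest-symbol search over a System.map-style symbol table. This is only an illustration: the helper names `load_symbols` and `resolve` are made up here, and the two entries in `SAMPLE_MAP` are hypothetical, with base addresses back-computed from the "netif_poll+0x101" and "net_rx_action+0xea" lines in the stack trace earlier in the thread rather than taken from a real System.map.

```python
import bisect


def load_symbols(system_map_text):
    """Parse 'address type name' lines into parallel sorted lists."""
    entries = []
    for line in system_map_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            entries.append((int(parts[0], 16), parts[2]))
    entries.sort()
    addrs = [addr for addr, _ in entries]
    names = [name for _, name in entries]
    return addrs, names


def resolve(eip, addrs, names):
    """Return 'symbol+0xoffset' for the closest symbol at or below eip."""
    i = bisect.bisect_right(addrs, eip) - 1
    if i < 0:
        # Below the first known symbol (e.g. a user-space EIP).
        return hex(eip)
    return f"{names[i]}+{eip - addrs[i]:#x}"


# Hypothetical excerpt: base addresses derived from the trace in this
# thread; the type letters are placeholders.
SAMPLE_MAP = """\
c0322ee0 t netif_poll
c03d8830 t net_rx_action
"""

if __name__ == "__main__":
    addrs, names = load_symbols(SAMPLE_MAP)
    for eip in (0xC0322FE1, 0xC03D891A, 0x080C9763):
        print(f"{eip:08x} -> {resolve(eip, addrs, names)}")
```

With a real symbol table the lookup is only meaningful when the map matches the kernel that produced the address, which is why running the same kernel in all domains simplifies things.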
Keir Fraser
2006-Aug-03 08:03 UTC
Re: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
On 2 Aug 2006, at 23:25, Ian Pratt wrote:
>> Do we have any consensus that this bug is fixed at all in
>> xen-3.0-testing, or even unstable? Is anyone who was hitting soft
>> lockups in testing *not* hitting them any more on the same hardware?
>> If so, what changeset are you on now?
>
> Soft lockups could be due to a huge variety of causes. It's unlikely to
> be a hardware issue, and since the problems seem to be experienced by a
> very small number of users my guess would be that it's configuration
> dependent, most likely networking.

Also older versions using sedf scheduler (which has now been patched to
avoid this) could end up with domain0 consuming all CPU and starving
other guests, leading to softlockup errors. We haven't seen any such
errors on our own test machines since this was fixed. Of course, that
doesn't mean there aren't problems with other test scenarios!

 -- Keir
Keir Fraser
2006-Aug-03 08:07 UTC
Re: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
On 3 Aug 2006, at 01:27, Steve Traugott wrote:
>> When this happens, does it just effect one guest?
>
> We typically see error messages on only one guest's console, but other
> guests and dom0 tend to lock up for ~30 seconds as well.

Do you have serial access to any of these systems? It'd be interesting
to get some debug output from Xen during one of these 'outages' and see
what it thinks the CPU is doing (e.g., the 'q' and 'd' debug keys would
be interesting).

 -- Keir
Steve Traugott
2006-Aug-04 20:21 UTC
Re: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
You nailed it, Keir.

On Thu, Aug 03, 2006 at 09:03:18AM +0100, Keir Fraser wrote:
> Also older versions using sedf scheduler (which has now been patched to
> avoid this) could end up with domain0 consuming all CPU and starving
> other guests, leading to softlockup errors. We haven't seen any such
> errors on our own test machines since this was fixed. Of course, that
> doesn't mean there aren't problems with other test scenarios!

That is exactly what was happening. I did more testing yesterday and
last night (-testing changeset 9732), and realized that I was only
seeing soft lockups on the second of two domU guests, and only when
running a heavy load in dom0. According to 'xm vcpu-list' the second
guest was on CPU 0, as was the workload in dom0... I added more
workload processes to consume both CPUs in dom0, and of course when I
did that, the first guest ground to a halt and started showing soft
lockups as well.

I was usually able to trigger the soft lockups in a few seconds simply
by running one or more of these in dom0:

    cat /dev/zero > /dev/null

Variants of 'nc -ub 255.255.255.255 10000 < /dev/zero' and
'nc -u -l -p 10000 > /dev/null' in dom0 or domU also made things
interesting, though I'm not sure that the network traffic is a factor.
(Kids, don't do this on a production net...)

So I built -unstable changeset 10868, and ran an even heavier workload
(the above, plus 'bonnie' in the guests) on dom0 and two guests
overnight, and they experienced no soft lockups; running -unstable,
changeset 10868, credit scheduler. This same workload would have
caused soft lockups within seconds in -testing changeset 9732 using
the sedf scheduler; I may not have been able to get it started at all.
Response time remained subsecond under -unstable; -testing would have
been on its knees.

Steve

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
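A rough, time-bounded stand-in for the dom0 CPU load described above, for anyone who wants a workload that stops by itself. This is a sketch under assumptions, not the test that was actually run: `burn` and `run_load` are hypothetical names, and a pure user-space spin loop is not identical to 'cat /dev/zero > /dev/null', which spends most of its time in read/write system calls.

```python
import multiprocessing
import time


def burn(seconds):
    """Spin on the CPU for roughly `seconds`; return the iteration count."""
    deadline = time.monotonic() + seconds
    spins = 0
    while time.monotonic() < deadline:
        spins += 1
    return spins


def run_load(workers, seconds):
    """Start `workers` spinning processes and wait for all of them."""
    procs = [
        multiprocessing.Process(target=burn, args=(seconds,))
        for _ in range(workers)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return [proc.exitcode for proc in procs]


if __name__ == "__main__":
    # Two workers, to occupy both CPUs as in the dom0 test described above.
    # The 60-second duration is an arbitrary choice.
    print(run_load(workers=2, seconds=60))
```

Bounding the run time means a box that becomes unresponsive under the load should recover without needing a console to kill the workload.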
Steve Traugott
2006-Aug-04 20:37 UTC
Re: [Xen-devel] Re: blocking Xen 3.X production use: soft lockup bugs
On Thu, Aug 03, 2006 at 01:59:20AM +0100, Ian Pratt wrote:
> > > Have you tried using -unstable and hence xen's new scheduler? This is
> > > less likely to provoke soft lockup false alarms.
> >
> > Haven't tried unstable yet, since this is for the production
> > infrastructure for my family's business; am in the process of
> > rebuilding with testing changeset 9762 though. (is that really tip?
> > hg log says Jun 29th for that changeset, even after a pull...)
>
> There have been no requests to back port patches since then.
>
> If you can, its really worth trying -unstable. Any changeset from over
> last weekend should be just fine.

Ian, on your advice I skipped my -testing 9762 build and went straight
to -unstable 10868. I can only say *wow*! Night and day difference
between 9732 and 10868. See my message as of a few minutes ago for the
details, but at this point I'm considering taking 10868 into
production.

I might want -unstable anyway; I'm starting to get to the point where I
can chase that blktap/AFS idea we were discussing with Andrew in late
2004 - early 2005.

Steve

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
Ian Pratt
2006-Aug-05 07:38 UTC
RE: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
> So I built -unstable changeset 10868, and ran an even heavier workload
> (the above, plus 'bonnie' in the guests) on dom0 and two guests
> overnight, and they experienced no soft lockups; running -unstable,
> changeset 10868, credit scheduler. This same workload would have
> caused soft lockups within seconds in -testing changeset 9732 using
> the sedf scheduler; I may not have been able to get it started at all.
> Response time remained subsecond under -unstable; -testing would have
> been on its knees.

That's good to hear. 3.0.3 is going to be a big leap forward in many
ways.

Ian
Keir Fraser
2006-Aug-05 08:50 UTC
Re: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
On 4/8/06 9:21 pm, "Steve Traugott" <stevegt@TerraLuna.Org> wrote:
> You nailed it, Keir.
>
> On Thu, Aug 03, 2006 at 09:03:18AM +0100, Keir Fraser wrote:
>> Also older versions using sedf scheduler (which has now been patched to
>> avoid this) could end up with domain0 consuming all CPU and starving
>> other guests, leading to softlockup errors. We haven't seen any such
>> errors on our own test machines since this was fixed. Of course, that
>> doesn't mean there aren't problems with other test scenarios!
>
> That is exactly what was happening. I did more testing yesterday and
> last night (-testing changeset 9732), and realized that I was only
> seeing soft lockups on the second of two domU guests, and only when
> running a heavy load in dom0. According to 'xm vcpu-list' the second
> guest was on CPU 0, as was the workload in dom0... I added more
> workload processes to consume both CPUs in dom0, and of course when I
> did that, the first guest ground to a halt and started showing soft
> lockups as well.

It's *always* worth trying the tip of 3.0-testing if you're seeing
problems with a strictly older version. In this case there are about 50
newer changesets, many of which are the result of aggressive testing by
Suse for SLES10. Well worth having.

 -- Keir
Harry Butterworth
2006-Aug-05 11:59 UTC
Re: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
Another data point: Yesterday I was working with an unstable changeset
from the morning (I think about halfway through the qemu patches) and
running HVM xm-test to try to debug the create-concurrent failures.
qemu_dm was taking 100% of one core and I got about 6 soft lockups in
dom0 and 2 dom0 hangs.

I'm not sure exactly why HVM testing is all over the floor for me, maybe
I picked a bad changeset or perhaps the recent ubuntu updates have
broken something.

It's possible that there are still some lurking soft lockup issues
anyway.

Harry.
Keir Fraser
2006-Aug-05 13:45 UTC
Re: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
On 5/8/06 12:59 pm, "Harry Butterworth"
<harry@hebutterworth.freeserve.co.uk> wrote:
> Another data point: Yesterday I was working with an unstable changeset
> from the morning (I think about halfway through the qemu patches) and
> running HVM xm-test to try to debug the create-concurrent failures.
> qemu_dm was taking 100% of one core and I got about 6 soft lockups in
> dom0 and 2 dom0 hangs.
>
> I'm not sure exactly why HVM testing is all over the floor for me, maybe
> I picked a bad changeset or perhaps the recent ubuntu updates have
> broken something.
>
> It's possible that there are still some lurking soft lockup issues
> anyway.

Well, I believe the issues are sorted out for paravirtualised guests at
least. Maybe there are lurkers for HVM guests -- if so, and they're of
the scale of hangs and softlockups, we'd really like detailed info so we
could try to repro. Some HVM instability is also doubtless due to the
current rate of churn in the unstable tree, which will be carrying on
for at least another week (while more pv driver and shadow mode patches
are applied).

 -- Keir
Harry Butterworth
2006-Aug-05 14:33 UTC
Re: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
On Sat, 2006-08-05 at 14:45 +0100, Keir Fraser wrote:
> On 5/8/06 12:59 pm, "Harry Butterworth"
> <harry@hebutterworth.freeserve.co.uk> wrote:
>
> > Another data point: Yesterday I was working with an unstable changeset
> > from the morning (I think about halfway through the qemu patches) and
> > running HVM xm-test to try to debug the create-concurrent failures.
> > qemu_dm was taking 100% of one core and I got about 6 soft lockups in
> > dom0 and 2 dom0 hangs.
> >
> > I'm not sure exactly why HVM testing is all over the floor for me, maybe
> > I picked a bad changeset or perhaps the recent ubuntu updates have
> > broken something.
> >
> > It's possible that there are still some lurking soft lockup issues
> > anyway.
>
> Well, I believe the issues are sorted out for paravirtualised guests at
> least. Maybe there are lurkers for HVM guests -- if so, and they're of the
> scale of hangs and softlockups, we'd really like detailed info so we could
> try to repro.

I'll post the changeset and any more details I can when I get back into
work on Monday but dd was segfaulting for me due to a locale issue after
the ubuntu update so I don't really have a lot of confidence that it's
even a xen problem yet.

Harry.
Harry Butterworth
2006-Aug-07 14:15 UTC
[Xen-devel] blocking Xen 3.X production use: soft lockup bugs
So when I wrote this...

> On Sat, 2006-08-05 at 14:45 +0100, Keir Fraser wrote:
> > On 5/8/06 12:59 pm, "Harry Butterworth"
> > <harry@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > Another data point: Yesterday I was working with an unstable changeset
> > > from the morning (I think about halfway through the qemu patches) and
> > > running HVM xm-test to try to debug the create-concurrent failures.
> > > qemu_dm was taking 100% of one core and I got about 6 soft lockups in
> > > dom0 and 2 dom0 hangs.
> > >
> > > I'm not sure exactly why HVM testing is all over the floor for me, maybe
> > > I picked a bad changeset or perhaps the recent ubuntu updates have
> > > broken something.
> > >
> > > It's possible that there are still some lurking soft lockup issues
> > > anyway.
> >
> > Well, I believe the issues are sorted out for paravirtualised guests at
> > least. Maybe there are lurkers for HVM guests -- if so, and they're of the
> > scale of hangs and softlockups, we'd really like detailed info so we could
> > try to repro.
>
> I'll post the changeset and any more details I can when I get back into
> work on Monday but dd was segfaulting for me due to a locale issue after
> the ubuntu update so I don't really have a lot of confidence that it's
> even a xen problem yet.
>
> Harry.

...the changeset was 10927, which was after 10921, where Christian changed the HVM cdrom configuration, and I was using this patch http://lists.xensource.com/archives/html/xen-devel/2006-07/msg01052.html which uses the old-style configuration, for which there is no backwards compatibility.

So that explains why the HVM testing was all over the floor. But it doesn't really explain the softlockups or the dom0 hangs. The bad config must have been provoking some bad behaviour from something. The HVM testing is working again for me now that I have updated the above patch.

I've moved on a few changesets and I'm not getting soft lockups any more either, so for the time being I'm going back to the create-concurrent failure that I was originally investigating.

Harry.
Steve Traugott
2006-Aug-12 01:46 UTC
stress testing was: Re: [Xen-devel] blocking Xen 3.X production use: soft lockup bugs
On Sat, Aug 05, 2006 at 09:50:29AM +0100, Keir Fraser wrote:

> On 4/8/06 9:21 pm, "Steve Traugott" <stevegt@TerraLuna.Org> wrote:
> > On Thu, Aug 03, 2006 at 09:03:18AM +0100, Keir Fraser wrote:
> >> Also older versions using sedf scheduler (which has now been patched to
> >> avoid this) could end up with domain0 consuming all CPU and starving
> >> other guests, leading to softlockup errors. We haven't seen any such
> >> errors on our own test machines since this was fixed. Of course, that
> >> doesn't mean there aren't problems with other test scenarios!
> >
> > That is exactly what was happening. I did more testing yesterday and
> > last night (-testing changeset 9732), and realized that I was only
> > seeing soft lockups on the second of two domU guests, and only when
> > running a heavy load in dom0. According to 'xm vcpu-list' the second
> > guest was on CPU 0, as was the workload in dom0... I added more
> > workload processes to consume both CPUs in dom0, and of course when I
> > did that, the first guest ground to a halt and started showing soft
> > lockups as well.
>
> It's *always* worth trying the tip of 3.0-testing if you're seeing problems
> with a strictly older version. In this case there are about 50 newer
> changesets, many of which are the result of aggressive testing by Suse for
> SLES10. Well worth having.

In between everything else I'm partway through a build of testing tip; I really need to automate that, including all my local modules, some stress tests, and so on -- ever since Xen 1.0 we've been using various combinations of Xen, AFS, DRBD, and soon AoE, in an environment which is apparently good at finding bugs and regressions in all of the above... :-}

Because we rely on these things so much, it would be in my best interests to write a "daily build" style stress test harness -- if I do this, I'd probably do a periodic post similar to Rick's and David's xm-test results, maybe some web stats and so on.

Does this sound like something that would be useful to other folks?

Steve
--
Stephen G. Traugott (KG6HDQ)
Managing Partner, TerraLuna LLC
stevegt@TerraLuna.Org -- http://www.t7a.org
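One small piece of the "daily build" stress harness idea above could be a scanner that counts soft lockup reports per CPU in a captured console log. What follows is only a minimal sketch in Python, not anything posted in this thread. It assumes the kernel reports lockups with a line of the form `BUG: soft lockup detected on CPU#N!` (believed to be the 2.6.16-era wording; adjust the pattern if your kernel differs). The function name `count_soft_lockups` and the sample log are hypothetical.

```python
import re
from collections import Counter

# Assumed message format (2.6.16-era kernels); change this if your kernel
# words the report differently.
SOFT_LOCKUP_RE = re.compile(r"BUG: soft lockup detected on CPU#(\d+)!")


def count_soft_lockups(log_text):
    """Return a Counter mapping CPU number -> number of soft lockup reports."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical console capture, for illustration only.
sample_log = """\
xenbr0: port 3(vif2.0) entering forwarding state
BUG: soft lockup detected on CPU#0!
drbd7: Resync done (total 137 sec; paused 0 sec; 3796 K/sec)
BUG: soft lockup detected on CPU#0!
BUG: soft lockup detected on CPU#1!
"""

if __name__ == "__main__":
    for cpu, n in sorted(count_soft_lockups(sample_log).items()):
        print(f"CPU#{cpu}: {n} soft lockup report(s)")
```

A harness along these lines could run the scanner over the serial console capture of dom0 and of each guest after every stress run, and flag any run with a non-zero count.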
Uh oh....

On Sat, Aug 05, 2006 at 08:38:03AM +0100, Ian Pratt wrote:

> > So I built -unstable changeset 10868, and ran an even heavier workload
> > (the above, plus 'bonnie' in the guests) on dom0 and two guests
> > overnight, and they experienced no soft lockups; running -unstable,
> > changeset 10868, credit scheduler. This same workload would have
> > caused soft lockups within seconds in -testing changeset 9732 using
> > the sedf scheduler; I may not have been able to get it started at all.
> > Response time remained subsecond under -unstable; -testing would have
> > been on its knees.
>
> That's good to hear. 3.0.3 is going to be a big leap forward in many
> ways.

Same environment as we've discussed earlier in this thread, now on unstable changeset 10868, Xen on top of DRBD. This one seems to be a reliable crash when under combined disk and network load; I'm still isolating it, but would welcome suggestions...

Steve

root (hd0,0)
Filesystem type is ext2fs, partition type 0x83
kernel /boot/xen-3.0.gz dom0_mem=65536 com1=9600,8n1
[Multiboot-elf, <0x100000:0x7a008:0x55ff8>, shtab=0x1d0078, entry=0x100000]
module /boot/vmlinuz-2.6.16.13-xen root=801 ro console=ttyS0 ramdisk_size=32768
[Multiboot-module @ 0x1d1000, 0x494ed0 bytes]
module /boot/initrd.img-2.6.16.13-xen
[Multiboot-module @ 0x666000, 0x9a6400 bytes]

 __  __            _____  ___                     _        _     _
 \ \/ /___ _ __   |___ / / _ \    _   _ _ __  ___| |_ __ _| |__ | | ___
  \  // _ \ '_ \    |_ \| | | |__| | | | '_ \/ __| __/ _` | '_ \| |/ _ \
  /  \  __/ | | |  ___) | |_| |__| |_| | | | \__ \ || (_| | |_) | |  __/
 /_/\_\___|_| |_| |____(_)___/    \__,_|_| |_|___/\__\__,_|_.__/|_|\___|

 http://www.cl.cam.ac.uk/netos/xen
 University of Cambridge Computer Laboratory

 Xen version 3.0-unstable (root@prd.terraluna.org) (gcc version 3.3.5 (Debian 1:3.3.5-13)) Fri Aug 4 10:54:50 PDT 2006
 Latest ChangeSet: Sat Jul 29 06:05:59 2006 +0100 10868:d2bf1a7cc131

(XEN) Command line: /boot/xen-3.0.gz dom0_mem=65536 com1=9600,8n1
(XEN) Physical RAM map:
(XEN) 0000000000000000 -
000000000009c000 (usable) (XEN) 000000000009c000 - 00000000000a0000 (reserved) (XEN) 00000000000e0000 - 0000000000100000 (reserved) (XEN) 0000000000100000 - 000000003ffec340 (usable) (XEN) 000000003ffec340 - 000000003fff0000 (ACPI data) (XEN) 000000003fff0000 - 0000000040000000 (reserved) (XEN) 00000000fec00000 - 0000000100000000 (reserved) (XEN) System RAM: 1023MB (1048096kB) (XEN) Xen heap: 10MB (10396kB) (XEN) Using scheduler: SMP Credit Scheduler (credit) (XEN) PAE disabled. (XEN) found SMP MP-table at 0009c1d0 (XEN) DMI 2.3 present. (XEN) Using APIC driver default (XEN) ACPI: RSDP (v000 IBM ) @ 0x000fdfd0 (XEN) ACPI: RSDT (v001 IBM SEREMRLD 0x00001000 IBM 0x45444f43) @ 0x3ffeff80 (XEN) ACPI: FADT (v001 IBM SEREMRLD 0x00001000 IBM 0x45444f43) @ 0x3ffeff00 (XEN) ACPI: MADT (v001 IBM SEREMRLD 0x00001000 IBM 0x45444f43) @ 0x3ffefe80 (XEN) ACPI: DSDT (v001 IBM SEREMRLD 0x00001000 MSFT 0x0100000b) @ 0x00000000 (XEN) ACPI: Local APIC address 0xfee00000 (XEN) ACPI: LAPIC (acpi_id[0x00] lapic_id[0x03] enabled) (XEN) Processor #3 6:11 APIC version 17 (XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) (XEN) Processor #0 6:11 APIC version 17 (XEN) ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0]) (XEN) IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-15 (XEN) ACPI: IOAPIC (id[0x0d] address[0xfec01000] gsi_base[16]) (XEN) IOAPIC[1]: apic_id 13, version 17, address 0xfec01000, GSI 16-31 (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 30 dfl dfl) (XEN) ACPI: IRQ3 used by override. (XEN) Enabling APIC mode: Flat. Using 2 I/O APICs (XEN) Using ACPI (MADT) for SMP configuration information (XEN) Initializing CPU#0 (XEN) Detected 1130.180 MHz processor. (XEN) CPU: L1 I cache: 16K, L1 D cache: 16K (XEN) CPU: L2 cache: 512K (XEN) Intel machine check architecture supported. (XEN) Intel machine check reporting enabled on CPU#0. 
(XEN) CPU0: Intel(R) Pentium(R) III CPU family 1133MHz stepping 01 (XEN) Booting processor 1/0 eip 90000 (XEN) Initializing CPU#1 (XEN) CPU: L1 I cache: 16K, L1 D cache: 16K (XEN) CPU: L2 cache: 512K (XEN) Intel machine check architecture supported. (XEN) Intel machine check reporting enabled on CPU#1. (XEN) CPU1: Intel(R) Pentium(R) III CPU family 1133MHz stepping 01 (XEN) Total of 2 processors activated. (XEN) ENABLING IO-APIC IRQs (XEN) -> Using new ACK method (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=0 apic2=-1 pin2=-1 (XEN) ..MP-BIOS bug: 8254 timer not connected to IO-APIC (XEN) ...trying to set up timer (IRQ0) through the 8259A ... failed. (XEN) ...trying to set up timer as Virtual Wire IRQ... works. (XEN) checking TSC synchronization across 2 CPUs: passed. (XEN) Platform timer is 1.193MHz PIT (XEN) Brought up 2 CPUs (XEN) Machine check exception polling timer started. (XEN) *** LOADING DOMAIN 0 *** (XEN) Domain 0 kernel supports features = { 0000001f }. (XEN) Domain 0 kernel requires features = { 00000000 }. (XEN) PHYSICAL MEMORY ARRANGEMENT: (XEN) Dom0 alloc.: 3c000000->3e000000 (8192 pages to be allocated) (XEN) VIRTUAL MEMORY ARRANGEMENT: (XEN) Loaded kernel: c0100000->c05d5d74 (XEN) Init. ramdisk: c05d6000->c0f7c400 (XEN) Phys-Mach map: c0f7d000->c0f8d000 (XEN) Start info: c0f8d000->c0f8e000 (XEN) Page tables: c0f8e000->c0f94000 (XEN) Boot stack: c0f94000->c0f95000 (XEN) TOTAL: c0000000->c1400000 (XEN) ENTRY ADDRESS: c0100000 (XEN) Dom0 has maximum 2 VCPUs (XEN) Initrd len 0x9a6400, start at 0xc05d6000 (XEN) Scrubbing Free RAM: ...........done. (XEN) Xen trace buffers: disabled (XEN) Xen is relinquishing VGA console. (XEN) *** Serial input -> DOM0 (type ''CTRL-a'' three times to switch input to Xen). Linux version 2.6.16.13-xen (root@n4h8) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #3 SMP Fri Aug 4 10:57:44 PDT 2006 BIOS-provided physical RAM map: Xen: 0000000000000000 - 0000000004800000 (usable) 0MB HIGHMEM available. 72MB LOWMEM available. DMI 2.3 present. 
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x03] enabled) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-15 ACPI: IOAPIC (id[0x0d] address[0xfec01000] gsi_base[16]) IOAPIC[1]: apic_id 13, version 17, address 0xfec01000, GSI 16-31 ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 30 dfl dfl) Enabling APIC mode: Flat. Using 2 I/O APICs Using ACPI (MADT) for SMP configuration information Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000) Built 1 zonelists Kernel command line: root=801 ro console=ttyS0 ramdisk_size=32768 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Initializing CPU#0 PID hash table entries: 512 (order: 9, 8192 bytes) Xen reported: 1130.113 MHz processor. Console: colour VGA+ 80x25 Dentry cache hash table entries: 16384 (order: 4, 65536 bytes) Inode-cache hash table entries: 8192 (order: 3, 32768 bytes) Software IO TLB enabled: Aperture: 2 megabytes Kernel range: 0x00000000c10f0000 - 0x00000000c12f0000 vmalloc area: c5000000-fb7fe000, maxmem 33ffe000 Memory: 47400k/73728k available (3389k kernel code, 18068k reserved, 1044k data, 208k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 2261.60 BogoMIPS (lpj=11308023) Security Framework v1.0.0 initialized Capability LSM initialized Mount-cache hash table entries: 512 CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 512K Checking ''hlt'' instruction... OK. ENABLING IO-APIC IRQs Enabling SMP... Brought up 2 CPUs Initializing CPU#1 migration_cost=1895 checking if image is initramfs... 
it is Freeing initrd memory: 9881k freed Grant table initialized NET: Registered protocol family 16 ACPI: bus type pci registered PCI: Using configuration type 1 ACPI: Subsystem revision 20060127 ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (0000:00) ACPI: PCI Interrupt Link [LPE1] (IRQs *10) ACPI: PCI Interrupt Link [LPE2] (IRQs *10) ACPI: PCI Interrupt Link [LPVI] (IRQs) *0, disabled. ACPI: PCI Interrupt Link [LPUS] (IRQs *7) ACPI: PCI Root Bridge [PCI1] (0000:01) ACPI: PCI Interrupt Link [LPSA] (IRQs *9) ACPI: PCI Interrupt Link [LP1A] (IRQs) *0, disabled. ACPI: PCI Interrupt Link [LP1B] (IRQs) *0, disabled. ACPI: PCI Interrupt Link [LP2A] (IRQs *9) ACPI: PCI Interrupt Link [LP2B] (IRQs) *0, disabled. Linux Plug and Play Support v0.97 (c) Adam Belay xen_mem: Initialising balloon driver. SCSI subsystem initialized usbcore: registered new driver usbfs usbcore: registered new driver hub PCI: Using ACPI for IRQ routing PCI: If a device doesn''t work, try "pci=routeirq". If it helps, post a report IA-32 Microcode Update Driver: v1.14-xen <tigran@veritas.com> VFS: Disk quotas dquot_6.5.1 Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) JFS: nTxBlock = 512, nTxLock = 4096 SGI XFS with ACLs, security attributes, realtime, large block numbers, no debug enabled Initializing Cryptographic API io scheduler noop registered io scheduler anticipatory registered (default) io scheduler deadline registered io scheduler cfq registered PNP: No PS/2 controller found. Probing ports directly. serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 RAMDISK driver initialized: 16 RAM disks of 32768K size 1024 blocksize Xen virtual console successfully installed as ttyS0 Event-channel device installed. 
blkif_init: reqs=64, pages=704, mmap_vstart=0xc0800000 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx SvrWks OSB4: IDE controller at PCI slot 0000:00:0f.1 SvrWks OSB4: chipset revision 0 SvrWks OSB4: not 100% native mode: will probe irqs later ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0x0708-0x070f, BIOS settings: hdc:DMA, hdd:DMA hda: LG CD-ROM CRN-8245B, ATAPI CD/DVD-ROM drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 hda: ATAPI 24X CD-ROM drive, 128kB Cache, (U)DMA Uniform CD-ROM driver Revision: 3.20 ide-floppy driver 0.99.newide ACPI: PCI Interrupt 0000:01:03.0[A] -> GSI 28 (level, low) -> IRQ 16 scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0 <Adaptec aic7892 Ultra160 SCSI adapter> aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs ACPI: PCI Interrupt 0000:01:05.0[A] -> GSI 20 (level, low) -> IRQ 17 scsi1 : IBM PCI ServeRAID 7.12.05 Build 761 <ServeRAID 4L> Vendor: IBM Model: SERVERAID Rev: 1.00 Type: Direct-Access ANSI SCSI revision: 02 Vendor: IBM Model: SERVERAID Rev: 1.00 Type: Direct-Access ANSI SCSI revision: 02 Vendor: IBM Model: SERVERAID Rev: 1.00 Type: Processor ANSI SCSI revision: 02 Vendor: IBM Model: FTlV1 S2 Rev: 0 Type: Processor ANSI SCSI revision: 02 Fusion MPT base driver 3.03.07 Copyright (c) 1999-2005 LSI Logic Corporation Fusion MPT SPI Host driver 3.03.07 Fusion MPT misc device (ioctl) driver 3.03.07 mptctl: Registered with Fusion MPT base driver mptctl: /dev/mptctl @ (major,minor=10,220) usbmon: debugfs is not available usbcore: registered new driver libusual mice: PS/2 mouse device common for all mice md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27 md: bitmap version 4.39 NET: Registered protocol family 2 input: AT Translated Set 2 keyboard as /class/input/input0 IP route cache hash table entries: 1024 (order: 0, 4096 bytes) TCP established hash table entries: 4096 (order: 3, 32768 bytes) TCP 
bind hash table entries: 4096 (order: 3, 32768 bytes) TCP: Hash tables configured (established 4096 bind 4096) TCP reno registered Initializing IPsec netlink socket NET: Registered protocol family 1 NET: Registered protocol family 17 NET: Registered protocol family 8 NET: Registered protocol family 20 Using IPI No-Shortcut mode Freeing unused kernel memory: 208k freed Loading, please wait... Begin: Loading essential drivers... ... logips2pp: Detected unknown logitech mouse model 0 eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 27 (level, low) -> IRQ 18 eth0: 0000:00:02.0, 00:02:55:C7:CA:D8, IRQ 18. Board assembly 754338-001, Physical connectors present: RJ45 Primary interface chip i82555 PHY #1. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0x04f4518b). ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 25 (level, low) -> IRQ 19 eth1: 0000:00:0a.0, 00:02:55:C7:CA:D9, IRQ 19. Board assembly 754338-001, Physical connectors present: RJ45 Primary interface chip i82555 PHY #1. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0x04f4518b). Intel(R) PRO/1000 Network Driver - version 6.3.9-k4 Copyright (c) 1999-2005 Intel Corporation. Done. Begin: Running /scripts/init-premount ... input: PS/2 Logitech Mouse as /class/input/input1 Done. Begin: Mounting root file system... ... Begin: Running /scripts/local-top ... device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com /scripts/local-top/lvm: 36: vgchange: not found Done. 
SCSI device sda: 71096320 512-byte hdwr sectors (36401 MB) sda: assuming Write Enabled sda: assuming drive cache: write through SCSI device sda: 71096320 512-byte hdwr sectors (36401 MB) sda: assuming Write Enabled sda: assuming drive cache: write through sda: sda1 sda2 < sda5 > sd 1:0:0:0: Attached scsi disk sda SCSI device sdb: 71096320 512-byte hdwr sectors (36401 MB) sdb: assuming Write Enabled sdb: assuming drive cache: write through SCSI device sdb: 71096320 512-byte hdwr sectors (36401 MB) sdb: assuming Write Enabled sdb: assuming drive cache: write through sdb: unknown partition table sd 1:0:1:0: Attached scsi disk sdb Begin: Running /scripts/local-premount ... Done. EXT3-fs: INFO: recovery required on readonly filesystem. EXT3-fs: write access will be enabled during recovery. kjournald starting. Commit interval 5 seconds EXT3-fs: recovery complete. EXT3-fs: mounted filesystem with ordered data mode. Begin: Running /scripts/log-bottom ... Done. Done. Begin: Running /scripts/init-bottom ... Done. INIT: version 2.86 booting Starting the hotplug events dispatcher: udevd. Synthesizing the initial hotplug events...done. Waiting for /dev to be fully populated...done. Loading /etc/console/boottime.kmap.gz Activating swap. Adding 1019880k swap on /dev/sda5. Priority:-1 extents:1 across:1019880k Checking root file system... fsck 1.37 (21-Mar-2005) /dev/sda1: clean, 58437/512000 files, 445341/1023996 blocks EXT3 FS on sda1, internal journal System time was Sat Aug 12 00:22:00 UTC 2006. Setting the System Clock using the Hardware Clock as reference... System Clock set. System local time is now Sat Aug 12 00:22:01 UTC 2006. Calculating module dependencies...done. Loading modules... auto FATAL: Module auto not found. 3c59x softdog Software Watchdog Timer: 0.07 initialized. 
soft_noboot=0 soft_margin=60 sec (nowayout= 0) tulip Linux Tulip driver version 1.1.13 (May 11, 2002) e100 e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI e100: Copyright(c) 1999-2005 Intel Corporation All modules loaded. Loading device-mapper support. Starting Enterprise Volume Management System: device-mapper: dm-linear: Device lookup failed device-mapper: error adding target to table device-mapper: dm-linear: Device lookup failed device-mapper: error adding target to table device-mapper: dm-linear: Device lookup failed device-mapper: error adding target to table device-mapper: dm-linear: Device lookup failed device-mapper: error adding target to table device-mapper: dm-linear: Device lookup failed device-mapper: error adding target to table device-mapper: dm-linear: Device lookup failed device-mapper: error adding target to table device-mapper: dm-linear: Device lookup failed device-mapper: error adding target to table device-mapper: dm-linear: Device lookup failed device-mapper: error adding target to table evms. Checking all file systems... fsck 1.37 (21-Mar-2005) Setting kernel variables ... ... done. Loading the saved-state of the serial devices... Cannot get serial info: Invalid argument Cannot get serial info: Invalid argument /dev/ttyS1: No such file or directory /dev/ttyS1: No such file or directory Mounting local filesystems... Cleaning /tmp /var/run /var/lock. Cleaning: /etc/network/ifstate. Setting up IP spoofing protection: rp_filter. Configuring network interfaces: done. Starting portmap daemon: portmap. Setting the System Clock using the Hardware Clock as reference... System Clock set. Local time: Fri Aug 11 17:22:07 PDT 2006 Recovering jove files ... Done. Running ntpdate to synchronize clock. Initializing random number generator...done. Recovering nvi editor sessions... done. Setting up X server socket directory /tmp/.X11-unix...done. Setting up ICE socket directory /tmp/.ICE-unix...done. 
INIT: Entering runlevel: 2 Starting system log daemon: syslogd. Starting kernel log daemon: klogd. Starting virtual private network daemon:. Starting portmap daemon: portmap. Starting NFS common utilities: statd lockd. Starting internet superserver: inetd. * The mptctl module is missing. Please have a look at the README.Debian.gz. . Exporting directories for NFS kernel daemon...Installing knfsd (copyright (C) 1996 okir@monad.swb.de). 10.27.3.21:/export/xen/fs/stevegt/crashme1/root: No such file or directory 10.27.3.18:/export/xen/fs/stevegt/xentest2/root: No such file or directory 10.27.3.17:/export/xen/fs/stevegt/xentest1/root: No such file or directory 10.27.3.8:/export/xen/fs/cclarke/umltest1/root: No such file or directory hotaru.chaosring.Org:/export/xen/fs/stevegt/crashme1/root: No such file or directory rn-2.refactored-networks.com:/export/xen/fs/stevegt/xentest2/root: No such file or directory xentest1.stevegt.TerraLuna.Org:/export/xen/fs/stevegt/xentest1/root: No such file or directory umltest1.cclarke.TerraLuna.Org:/export/xen/fs/cclarke/umltest1/root: No such file or directory 10.27.3.26:/export/xen/fs/stevegt/gforge/root: No such file or directory 10.27.3.23:/export/xen/fs/baseline/sarge/root: No such file or directory 10.27.3.22:/export/xen/fs/stevegt/build1/root: No such file or directory 10.27.3.16:/export/xen/fs/baseline/woody/root: No such file or directory bugs.t7a.Org:/export/xen/fs/stevegt/gforge/root: No such file or directory sarge.baseline.TerraLuna.Org:/export/xen/fs/baseline/sarge/root: No such file or directory build1.stevegt.TerraLuna.Org:/export/xen/fs/stevegt/build1/root: No such file or directory woody.baseline.TerraLuna.Org:/export/xen/fs/baseline/woody/root: No such file or directory 10.27.3.19:/export/xen/fs/cclarke/ebond/root: No such file or directory ebond.cclarke.TerraLuna.Org:/export/xen/fs/cclarke/ebond/root: No such file or directory 10.27.3.29:/export/xen/fs/dmasten/mss0/root: No such file or directory 
10.27.3.27:/export/xen/fs/cclarke/ccms/root: No such file or directory 10.27.3.25:/export/xen/fs/baseline/sid/root: No such file or directory 10.27.3.24:/export/xen/fs/stevegt/cvs1/root: No such file or directory sid.stevegt.TerraLuna.Org:/export/xen/fs/baseline/sid/root: No such file or directory cvs1.stevegt.TerraLuna.Org:/export/xen/fs/stevegt/cvs1/root: No such file or directory 10.27.3.28:/export/xen/fs/stevegt/t7a/root: No such file or directory 10.27.3.20:/export/xen/fs/stevegt/tcx/root: No such file or directory tcx.TerraLuna.Org:/export/xen/fs/stevegt/tcx/root: No such file or directory n2h54.prd.TerraLuna.Org:/export/xen: No such file or directory n2h53.prd.TerraLuna.Org:/export/xen: No such file or directory n2h51.prd.TerraLuna.Org:/export/xen: No such file or directory n2h41.prd.TerraLuna.Org:/export/xen: No such file or directory done. Starting NFS kernel daemon: nfsdNFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory NFSD: unable to find recovery directory /var/lib/nfs/v4recovery NFSD: starting 90-second grace period mountd. Starting OpenBSD Secure Shell server: sshdNET: Registered protocol family 10 lo: Disabled Privacy Extensions IPv6 over IPv4 tunneling driver . Starting NTP server: ntpd. AFS module /lib/modules/2.6.16.13-xen/fs/openafs.mp.o does not exist. Not starting AFS. Please consider building kernel modules using instructions in /usr/share/doc/openafs-client/README.modules Starting DRBD resources: drbd: initialised. Version: 0.7.20 (api:79/proto:74) drbd: SVN Revision: 2260 build by root@n4h8, 2006-08-04 11:00:44 drbd: registered as block device major 147 [ d0 drbd7: resync bitmap: bits=1048576 words=32768 drbd7: size = 4095 MB (4194303 KB) drbd7: 0 KB marked out-of-sync by on disk bit-map. drbd7: No usable activity log found. drbd7: drbdsetup [3825]: cstate Unconfigured --> StandAlone d1 drbd8: resync bitmap: bits=1048576 words=32768 drbd8: size = 4095 MB (4194303 KB) drbd8: 0 KB marked out-of-sync by on disk bit-map. 
drbd8: Found 4 transactions (46 active extents) in activity log. drbd8: Marked additional 156 MB as out-of-sync based on AL. drbd8: drbdsetup [3829]: cstate Unconfigured --> StandAlone d2 drbd9: resync bitmap: bits=2621440 words=81920 drbd9: size = 9 GB (10485759 KB) drbd9: 0 KB marked out-of-sync by on disk bit-map. drbd9: Found 4 transactions (192 active extents) in activity log. drbd9: drbdsetup [3833]: cstate Unconfigured --> StandAlone d3 drbd10: resync bitmap: bits=1835008 words=57344 drbd10: size = 7167 MB (7340031 KB) drbd10: 0 KB marked out-of-sync by on disk bit-map. drbd10: Found 4 transactions (192 active extents) in activity log. drbd10: Marked additional 508 MB as out-of-sync based on AL. drbd10: drbdsetup [3837]: cstate Unconfigured --> StandAlone d4 drbd11: resync bitmap: bits=786432 words=24576 drbd11: size = 3071 MB (3145727 KB) drbd11: 0 KB marked out-of-sync by on disk bit-map. drbd11: Found 4 transactions (192 active extents) in activity log. drbd11: drbdsetup [3841]: cstate Unconfigured --> StandAlone d5 drbd12: resync bitmap: bits=786432 words=24576 drbd12: size = 3071 MB (3145727 KB) drbd12: 0 KB marked out-of-sync by on disk bit-map. drbd12: Found 4 transactions (136 active extents) in activity log. drbd12: Marked additional 508 MB as out-of-sync based on AL. 
drbd12: drbdsetup [3845]: cstate Unconfigured --> StandAlone s0 s1 s2 s3 s4 s5 n0 drbd7: drbdsetup [3883]: cstate StandAlone --> Unconnected drbd7: drbd7_receiver [3884]: cstate Unconnected --> WFConnection n1 drbd8: drbdsetup [3891]: cstate StandAlone --> Unconnected drbd8: drbd8_receiver [3892]: cstate Unconnected --> WFConnection n2 drbd9: drbdsetup [3899]: cstate StandAlone --> Unconnected drbd9: drbd9_receiver [3900]: cstate Unconnected --> WFConnection n3 drbd10: drbdsetup [3907]: cstate StandAlone --> Unconnected drbd10: drbd10_receiver [3908]: cstate Unconnected --> WFConnection n4 drbd11: drbdsetup [3915]: cstate StandAlone --> Unconnected drbd11: drbd11_receiver [3916]: cstate Unconnected --> WFConnection n5 drbd12: drbdsetup [3923]: cstate StandAlone --> Unconnected drbd12: drbd12_receiver [3924]: cstate Unconnected --> WFConnection ]. .......... *************************************************************** DRBD''s startup script waits for the peer node(s) to appear. - In case this node was already a degraded cluster before the reboot the timeout is 120 seconds. [degr-wfc-timeout] - If the peer was available before the reboot the timeout will expire after 0 seconds. [wfc-timeout] (These values are for resource ''ds1.t7a.org''; 0 sec -> wait forever) To abort waiting enter ''yes'' [ 18]:drbd7: drbd7_receiver [3884]: cstate WFConnection --> WFReportParams drbd7: Handshake successful: DRBD Network Protocol version 74 drbd7: Connection established. drbd7: I am(S): 1:00000002:00000001:00000018:00000002:01 drbd7: Peer(S): 1:00000002:00000001:00000019:00000002:10 drbd7: drbd7_receiver [3884]: cstate WFReportParams --> WFBitMapT drbd7: Secondary/Unknown --> Secondary/Secondary drbd7: drbd7_receiver [3884]: cstate WFBitMapT --> SyncTarget drbd7: Resync started as SyncTarget (need to sync 520192 KB [130048 bits set]). 
drbd9: drbd9_receiver [3900]: cstate WFConnection --> WFReportParams drbd9: Handshake successful: DRBD Network Protocol version 74 drbd9: Connection established. drbd9: I am(S): 1:00000002:00000001:0000000d:00000001:01 drbd9: Peer(S): 1:00000002:00000001:0000000e:00000001:10 drbd9: drbd9_receiver [3900]: cstate WFReportParams --> WFBitMapT drbd9: Secondary/Unknown --> Secondary/Secondary drbd9: drbd9_receiver [3900]: cstate WFBitMapT --> SyncTarget drbd9: Resync started as SyncTarget (need to sync 290816 KB [72704 bits set]). drbd9: drbd9_receiver [3900]: cstate SyncTarget --> PausedSyncT drbd9: Syncer waits for sync group. drbd11: drbd11_receiver [3916]: cstate WFConnection --> WFReportParams drbd11: Handshake successful: DRBD Network Protocol version 74 drbd11: Connection established. drbd11: I am(S): 1:00000002:00000001:00000008:00000001:01 drbd11: Peer(S): 1:00000002:00000001:00000009:00000001:10 drbd11: drbd11_receiver [3916]: cstate WFReportParams --> WFBitMapT drbd11: Secondary/Unknown --> Secondary/Secondary drbd11: drbd11_receiver [3916]: cstate WFBitMapT --> SyncTarget drbd11: Resync started as SyncTarget (need to sync 274432 KB [68608 bits set]). drbd11: drbd11_receiver [3916]: cstate SyncTarget --> PausedSyncT drbd11: Syncer waits for sync group. drbd8: drbd8_receiver [3892]: cstate WFConnection --> WFReportParams drbd8: Handshake successful: DRBD Network Protocol version 74 drbd8: Connection established. drbd8: I am(S): 1:00000002:00000001:00000017:00000001:11 drbd8: Peer(S): 1:00000002:00000001:00000017:00000001:01 drbd8: drbd8_receiver [3892]: cstate WFReportParams --> WFBitMapS drbd8: Secondary/Unknown --> Secondary/Secondary drbd8: drbd8_receiver [3892]: cstate WFBitMapS --> SyncSource drbd8: Resync started as SyncSource (need to sync 159744 KB [39936 bits set]). drbd8: drbd8_receiver [3892]: cstate SyncSource --> PausedSyncS drbd8: Syncer waits for sync group. 
drbd12: drbd12_receiver [3924]: cstate WFConnection --> WFReportParams drbd12: Handshake successful: DRBD Network Protocol version 74 drbd12: Connection established. drbd12: I am(S): 1:00000002:00000001:00000001:00000001:11 drbd12: Peer(S): 1:00000002:00000001:00000001:00000001:01 drbd12: drbd12_receiver [3924]: cstate WFReportParams --> WFBitMapS drbd12: Secondary/Unknown --> Secondary/Secondary drbd12: drbd12_receiver [3924]: cstate WFBitMapS --> SyncSource drbd12: Resync started as SyncSource (need to sync 520192 KB [130048 bits set]). drbd12: drbd12_receiver [3924]: cstate SyncSource --> PausedSyncS drbd12: Syncer waits for sync group. drbd10: drbd10_receiver [3908]: cstate WFConnection --> WFReportParams drbd10: Handshake successful: DRBD Network Protocol version 74 drbd10: Connection established. drbd10: I am(S): 1:00000002:00000001:0000000b:00000001:11 drbd10: Peer(S): 1:00000002:00000001:0000000b:00000001:01 drbd10: drbd10_receiver [3908]: cstate WFReportParams --> WFBitMapS drbd10: Secondary/Unknown --> Secondary/Secondary drbd10: drbd10_receiver [3908]: cstate WFBitMapS --> SyncSource drbd10: Resync started as SyncSource (need to sync 520192 KB [130048 bits set]). drbd10: drbd10_receiver [3908]: cstate SyncSource --> PausedSyncS drbd10: Syncer waits for sync group. Heartbeat not configured: /etc/ha.d/ha.cf not found. Heartbeat failure [rc=1]. Failed. Starting deferred execution scheduler: atd. Starting periodic command scheduler: cron. Starting watchdog daemon: watchdog. openMosix: ERROR: Cannot find the /proc/hpc directory. openMosix: ERROR: Looks like this is not an openMosix enabled kernel. openMosix: ERROR: Configuration ABORTED. openMosix: HINT: need to recompile your kernel with openMosix support enabled? openMosix: HINT: need to update your boot manager (lilo, grub, etc)? openMosix: HINT: have you read the docs at http://openMosix.sf.net/ ? 
Bridge firewalling registered device vif0.0 entered promiscuous mode xenbr0: port 1(vif0.0) entering learning state xenbr0: topology change detected, propagating xenbr0: port 1(vif0.0) entering forwarding state device peth0 entered promiscuous mode xenbr0: port 2(peth0) entering learning state xenbr0: topology change detected, propagating xenbr0: port 2(peth0) entering forwarding state Starting auto Xen domains: afs1.t7a.orgdrbd9: Secondary/Secondary --> Secondary/Primary device vif1.0 entered promiscuous mode ADDRCONF(NETDEV_UP): vif1.0: link is not ready ip_tables: (C) 2000-2006 Netfilter Core Team Error: Device 2049 (vbd) could not be connected. Backend device not found. xenbr0: port 3(vif1.0) entering disabled state ! build2.t7a.orgdevice vif1.0 left promiscuous mode xenbr0: port 3(vif1.0) entering disabled state device vif2.0 entered promiscuous mode ADDRCONF(NETDEV_UP): vif2.0: link is not ready drbd10: Secondary/Secondary --> Primary/Secondary ADDRCONF(NETDEV_CHANGE): vif2.0: link becomes ready xenbr0: port 3(vif2.0) entering learning state xenbr0: topology change detected, propagating xenbr0: port 3(vif2.0) entering forwarding state ds1.t7a.orgdrbd7: Secondary/Secondary --> Secondary/Primary device vif3.0 entered promiscuous mode ADDRCONF(NETDEV_UP): vif3.0: link is not ready Error: Device 2049 (vbd) could not be connected. Backend device not found. ! 
kdc1.t7a.orgxenbr0: port 4(vif3.0) entering disabled state device vif3.0 left promiscuous mode xenbr0: port 4(vif3.0) entering disabled state drbd8: Secondary/Secondary --> Primary/Secondary device vif4.0 entered promiscuous mode ADDRCONF(NETDEV_UP): vif4.0: link is not ready ADDRCONF(NETDEV_CHANGE): vif4.0: link becomes ready xenbr0: port 4(vif4.0) entering learning state xenbr0: topology change detected, propagating xenbr0: port 4(vif4.0) entering forwarding state w1.cdint.comdrbd11: Secondary/Secondary --> Secondary/Primary device vif5.0 entered promiscuous mode ADDRCONF(NETDEV_UP): vif5.0: link is not ready Error: Device 2049 (vbd) could not be connected. Backend device not found. xenbr0: port 5(vif5.0) entering disabled state ! webkdc.t7a.orgdevice vif5.0 left promiscuous mode xenbr0: port 5(vif5.0) entering disabled state device vif6.0 entered promiscuous mode ADDRCONF(NETDEV_UP): vif6.0: link is not ready drbd12: Secondary/Secondary --> Primary/Secondary ADDRCONF(NETDEV_CHANGE): vif6.0: link becomes ready xenbr0: port 5(vif6.0) entering learning state xenbr0: topology change detected, propagating xenbr0: port 5(vif6.0) entering forwarding state [done] _sudZUZ#Z#XZo=_ DDDD EEEEEE BBBB IIIIII AAAA NN NN _jmZZ2!!~---~!!X##wa DD DD EE BB BB II AA AA NNN NN .<wdP~~ -!YZL, DD DD EEEEE BBBBB II AAAAAA NNNN NN .mX2'' _%aaa__ XZ[. DD DD EE BB BB II AA AA NN NNNN oZ[ _jdXY!~?S#wa ]Xb; DDDD EEEEEE BBBBB IIIIII AA AA NN NN _#e'' .]X2( ~Xw| )XXc .2Z` ]X[. xY| ]oZ( Linux Version 2.6.16.13-xen .2#; )3k; _s!~ jXf` Compiled #3 SMP Fri Aug 4 10:57:44 PDT 2006 1Z> -]Xb/ ~ __#2( Two 1.13GHz Intel Intel(R) Pentium(R) III CPU family 1133MHz Processors -Zo; +!4ZwaaaauZZXY'' 72M RAM *#[, ~-?!!!!!!-~ 4523.20 Bogomips Total XUb;. n4h34 )YXL,, +3#bc, -)SSL,, ~~~~~ Updating the Linuxlogo... done. Debian GNU/Linux 3.0 n4h34 ttyS0 n4h34 login: drbd7: Resync done (total 137 sec; paused 0 sec; 3796 K/sec) drbd7: drbd7_worker [3826]: cstate SyncTarget --> Connected drbd8: Syncer continues. 
drbd8: drbd7_worker [3826]: cstate PausedSyncS --> SyncSource
drbd9: Syncer continues.
drbd9: drbd7_worker [3826]: cstate PausedSyncT --> SyncTarget
drbd10: Syncer continues.
drbd10: drbd7_worker [3826]: cstate PausedSyncS --> SyncSource
drbd11: Syncer continues.
drbd11: drbd7_worker [3826]: cstate PausedSyncT --> SyncTarget
drbd12: Syncer continues.
drbd12: drbd7_worker [3826]: cstate PausedSyncS --> SyncSource
drbd8: Resync done (total 211 sec; paused 137 sec; 2156 K/sec)
drbd8: drbd8_worker [3830]: cstate SyncSource --> Connected
drbd11: Resync done (total 239 sec; paused 137 sec; 2688 K/sec)
drbd11: drbd11_worker [3842]: cstate SyncTarget --> Connected
drbd9: Resync done (total 243 sec; paused 137 sec; 2740 K/sec)
drbd9: drbd9_worker [3834]: cstate SyncTarget --> Connected
Debian GNU/Linux 3.0 n4h34 ttyS0
n4h34 login:
Debian GNU/Linux 3.0 n4h34 ttyS0
n4h34 login:
Debian GNU/Linux 3.0 n4h34 ttyS0
n4h34 login:
Debian GNU/Linux 3.0 n4h34 ttyS0
n4h34 login:
Debian GNU/Linux 3.0 n4h34 ttyS0
n4h34 login:
Debian GNU/Linux 3.0 n4h34 ttyS0
n4h34 login:
drbd12: Resync done (total 306 sec; paused 136 sec; 3056 K/sec)
drbd12: drbd12_worker [3846]: cstate SyncSource --> Connected
Debian GNU/Linux 3.0 n4h34 ttyS0
n4h34 login:
Debian GNU/Linux 3.0 n4h34 ttyS0
n4h34 login:
drbd10: Resync done (total 310 sec; paused 136 sec; 2988 K/sec)
drbd10: drbd10_worker [3838]: cstate SyncSource --> Connected
Debian GNU/Linux 3.0 n4h34 ttyS0
n4h34 login:
Unable to handle kernel paging request at virtual address c0976590
printing eip:
c03d6ed7
*pde = ma 3cf91067 pa 00f91067
*pte = ma 00000000 pa fffff000
Oops: 0000 [#1]
SMP
Modules linked in: iptable_filter ip_tables x_tables bridge drbd ipv6 nfsd lockd sunrpc e100 tulip softdog 3c59x evdev sd_mod dm_mod thermal processor fan e1000 eepro100 mii tg3
CPU: 0
EIP: 0061:[<c03d6ed7>] Not tainted VLI
EFLAGS: 00010206 (2.6.16.13-xen #3)
EIP is at skb_copy_bits+0x127/0x280
eax: c0976000 ebx: 000005a8 ecx: 0000016a edx: c3711720
esi: c0976590 edi: c37110e0 ebp: 000005a8 esp: c00818c8
ds: 007b es: 007b ss: 0069
Process drbd12_receiver (pid: 3924, threadinfo=c0080000 task=c3efda90)
Stack: <0>c1012ec0 00000002 c03d65ef c1d51a00 000005ea 00000042 00000000 00000000
       00000020 c3edc800 c376fd64 c03d6b0f c376fd64 00000042 c37110e0 000005a8
       c1a70000 c54690c0 00000000 c1d51ac0 c3edc800 c376fd64 c03dc478 c376fd64
Call Trace:
 [<c03d65ef>] pskb_expand_head+0xdf/0x140
 [<c03d6b0f>] __pskb_pull_tail+0x7f/0x320
 [<c54690c0>] br_nf_dev_queue_xmit+0x0/0x50 [bridge]
 [<c03dc478>] dev_queue_xmit+0x328/0x370
 [<c5462f7e>] br_dev_queue_push_xmit+0xbe/0x140 [bridge]
 [<c5469212>] br_nf_post_routing+0x102/0x1c0 [bridge]
 [<c54690c0>] br_nf_dev_queue_xmit+0x0/0x50 [bridge]
 [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge]
 [<c03f43f8>] nf_iterate+0x78/0x90
 [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge]
 [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge]
 [<c03f447e>] nf_hook_slow+0x6e/0x110
 [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge]
 [<c5463061>] br_forward_finish+0x61/0x70 [bridge]
 [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge]
 [<c5468995>] br_nf_forward_finish+0x75/0x130 [bridge]
 [<c5463000>] br_forward_finish+0x0/0x70 [bridge]
 [<c5468b38>] br_nf_forward_ip+0xe8/0x190 [bridge]
 [<c5468920>] br_nf_forward_finish+0x0/0x130 [bridge]
 [<c5463000>] br_forward_finish+0x0/0x70 [bridge]
 [<c03f43f8>] nf_iterate+0x78/0x90
 [<c5463000>] br_forward_finish+0x0/0x70 [bridge]
 [<c5463000>] br_forward_finish+0x0/0x70 [bridge]
 [<c03f447e>] nf_hook_slow+0x6e/0x110
 [<c5463000>] br_forward_finish+0x0/0x70 [bridge]
 [<c5463167>] __br_forward+0x77/0x80 [bridge]
 [<c5463000>] br_forward_finish+0x0/0x70 [bridge]
 [<c5463fbf>] br_handle_frame_finish+0xdf/0x160 [bridge]
 [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge]
 [<c5467d89>] br_nf_pre_routing_finish+0xf9/0x370 [bridge]
 [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge]
 [<c0322e3a>] loopback_start_xmit+0xba/0x110
 [<c0400d70>] ip_finish_output+0x0/0x220
 [<c03dc07e>] dev_hard_start_xmit+0x5e/0x130
 [<c03dc3b5>] dev_queue_xmit+0x265/0x370
 [<c03f43f8>] nf_iterate+0x78/0x90
 [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge]
 [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge]
 [<c03f447e>] nf_hook_slow+0x6e/0x110
 [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge]
 [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge]
 [<c54685fc>] br_nf_pre_routing+0x26c/0x520 [bridge]
 [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge]
 [<c03f43f8>] nf_iterate+0x78/0x90
 [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge]
 [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge]
 [<c03f447e>] nf_hook_slow+0x6e/0x110
 [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge]
 [<c546422d>] br_handle_frame+0x1ed/0x230 [bridge]
 [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge]
 [<c03dcb21>] netif_receive_skb+0x1a1/0x330
 [<c03dcd87>] process_backlog+0xd7/0x190
 [<c03dcf2a>] net_rx_action+0xea/0x230
 [<c0125915>] __do_softirq+0xf5/0x120
 [<c01259d5>] do_softirq+0x95/0xa0
 [<c0125a42>] local_bh_enable+0x62/0xa0
 [<c0406bd1>] tcp_prequeue_process+0x71/0x80
 [<c04070e9>] tcp_recvmsg+0x349/0x750
 [<c50eebfc>] dm_request+0xbc/0x100 [dm_mod]
 [<c03d5085>] sock_common_recvmsg+0x55/0x70
 [<c03d11cf>] sock_recvmsg+0xef/0x110
 [<c03143fa>] force_evtchn_callback+0xa/0x10
 [<c0147163>] mempool_alloc+0x33/0xe0
 [<c01367d0>] autoremove_wake_function+0x0/0x60
 [<c50eebfc>] dm_request+0xbc/0x100 [dm_mod]
 [<c02b2ee0>] generic_make_request+0xf0/0x160
 [<c529c680>] drbd_recv+0x90/0x190 [drbd]
 [<c529cdec>] drbd_recv_header+0x2c/0xf0 [drbd]
 [<c529e580>] receive_DataRequest+0x0/0x7d0 [drbd]
 [<c52a045c>] drbdd+0x1c/0x150 [drbd]
 [<c52a105a>] drbdd_init+0x7a/0x1a0 [drbd]
 [<c52a7136>] drbd_thread_setup+0x86/0xf0 [drbd]
 [<c52a70b0>] drbd_thread_setup+0x0/0xf0 [drbd]
 [<c0102f75>] kernel_thread_helper+0x5/0x10
Code: 8b 4c 24 30 8b 7c 24 34 8b 91 a0 00 00 00 8b 4c 24 18 0f b7 74 ca 18 8b 4c 24 14 8d 34 06 01 fe 29 ce 8b 7c 24 38 89 d9 c1 e9 02 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 89
04 24 ba 02 00 00 00 89 54
<0>Kernel panic - not syncing: Fatal exception in interrupt
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

-- 
Stephen G. Traugott (KG6HDQ)
Managing Partner, TerraLuna LLC
stevegt@TerraLuna.Org -- http://www.t7a.org
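
A note on reading the oops above, offered as a sketch rather than a diagnosis: the bytes just before the `<f3> a5` marker in the Code: line (`89 d9 c1 e9 02`) decode as `mov %ebx,%ecx; shr $2,%ecx`, and `f3 a5` is `rep movsl`, so the fault sits inside a dword copy loop in skb_copy_bits. The Python below only re-checks that arithmetic against the register values printed in the trace. The dictionary and variable names are illustrative, and the 4096-byte page size is the usual i386 assumption.

```python
# Register values copied from the oops in this message.
regs = {
    "eax": 0xC0976000,
    "ebx": 0x000005A8,
    "ecx": 0x0000016A,
    "esi": 0xC0976590,
    "edi": 0xC37110E0,
}

# "Unable to handle kernel paging request at virtual address c0976590"
fault_addr = 0xC0976590

PAGE_SIZE = 4096  # assumption: standard i386 page size

# Code bytes "89 d9 c1 e9 02 <f3> a5" are:
#   mov %ebx,%ecx ; shr $2,%ecx ; rep movsl
# so ecx starts out as the byte count in ebx divided by 4.
copy_bytes = regs["ebx"]
copy_dwords = copy_bytes >> 2

# rep movsl reads from esi and counts ecx down as it goes.
# If ecx still holds the full count and the fault address equals esi,
# the very first read of the copy is the one that faulted.
no_progress_yet = copy_dwords == regs["ecx"]
fault_is_source = fault_addr == regs["esi"]

# Where does the source range sit relative to its page?
offset_in_page = fault_addr % PAGE_SIZE
fits_in_one_page = offset_in_page + copy_bytes <= PAGE_SIZE

print(f"copy length: {copy_bytes} bytes ({copy_dwords} dwords)")
print(f"ecx still holds the full count: {no_progress_yet}")
print(f"fault address is the copy source (esi): {fault_is_source}")
print(f"source offset in page: {offset_in_page:#x}, fits in one page: {fits_in_one_page}")
```

If the checks hold, the copy faulted on its first read from esi, and the whole source range lies in a single page whose pte the trace prints as `ma 00000000`. That is consistent with the source page simply not being mapped at the time of the copy, but it says nothing about why.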
Steve Traugott
2006-Aug-12 04:34 UTC
[Xen-devel] Re: Unable to handle kernel paging request
Okay, for some reason I'm now able to crash dom0 pretty reliably in this configuration (domU root filesystem in DRBD) simply by running this in domU:

    nc -u -b 255.255.255.255 54321 < /dev/zero

Anyone else want to give that a try on 10868 or later, and see if they can duplicate it, with or without DRBD?

Steve

On Fri, Aug 11, 2006 at 06:50:50PM -0700, Steve Traugott wrote:
> Uh oh....
>
> On Sat, Aug 05, 2006 at 08:38:03AM +0100, Ian Pratt wrote:
> > > So I built -unstable changeset 10868, and ran an even heavier workload
> > > (the above, plus 'bonnie' in the guests) on dom0 and two guests
> > > overnight, and they experienced no soft lockups; running -unstable,
> > > changeset 10868, credit scheduler. This same workload would have
> > > caused soft lockups within seconds in -testing changeset 9732 using
> > > the sedf scheduler; I may not have been able to get it started at all.
> > > Response time remained subsecond under -unstable; -testing would have
> > > been on its knees.
> >
> > That's good to hear. 3.0.3 is going to be a big leap forward in many
> > ways.
>
> Same environment as we've discussed earlier in this thread, now on
> unstable changeset 10868, Xen on top of DRBD, this one seems to be a
> reliable crash when under combined disk and network load, still
> isolating but would welcome suggestions...
> > Steve > > > root (hd0,0) > Filesystem type is ext2fs, partition type 0x83 > kernel /boot/xen-3.0.gz dom0_mem=65536 com1=9600,8n1 > [Multiboot-elf, <0x100000:0x7a008:0x55ff8>, shtab=0x1d0078, entry=0x100000] > module /boot/vmlinuz-2.6.16.13-xen root=801 ro console=ttyS0 ramdisk_size=32768 > [Multiboot-module @ 0x1d1000, 0x494ed0 bytes] > module /boot/initrd.img-2.6.16.13-xen > [Multiboot-module @ 0x666000, 0x9a6400 bytes] > __ __ _____ ___ _ _ _ > \ \/ /___ _ __ |___ / / _ \ _ _ _ __ ___| |_ __ _| |__ | | ___ > \ // _ \ ''_ \ |_ \| | | |__| | | | ''_ \/ __| __/ _` | ''_ \| |/ _ \ > / \ __/ | | | ___) | |_| |__| |_| | | | \__ \ || (_| | |_) | | __/ > /_/\_\___|_| |_| |____(_)___/ \__,_|_| |_|___/\__\__,_|_.__/|_|\___| > > http://www.cl.cam.ac.uk/netos/xen > University of Cambridge Computer Laboratory > > Xen version 3.0-unstable (root@prd.terraluna.org) (gcc version 3.3.5 (Debian 1:3.3.5-13)) Fri Aug 4 10:54:50 PDT 2006 > Latest ChangeSet: Sat Jul 29 06:05:59 2006 +0100 10868:d2bf1a7cc131 > > (XEN) Command line: /boot/xen-3.0.gz dom0_mem=65536 com1=9600,8n1 > (XEN) Physical RAM map: > (XEN) 0000000000000000 - 000000000009c000 (usable) > (XEN) 000000000009c000 - 00000000000a0000 (reserved) > (XEN) 00000000000e0000 - 0000000000100000 (reserved) > (XEN) 0000000000100000 - 000000003ffec340 (usable) > (XEN) 000000003ffec340 - 000000003fff0000 (ACPI data) > (XEN) 000000003fff0000 - 0000000040000000 (reserved) > (XEN) 00000000fec00000 - 0000000100000000 (reserved) > (XEN) System RAM: 1023MB (1048096kB) > (XEN) Xen heap: 10MB (10396kB) > (XEN) Using scheduler: SMP Credit Scheduler (credit) > (XEN) PAE disabled. > (XEN) found SMP MP-table at 0009c1d0 > (XEN) DMI 2.3 present. 
> (XEN) Using APIC driver default > (XEN) ACPI: RSDP (v000 IBM ) @ 0x000fdfd0 > (XEN) ACPI: RSDT (v001 IBM SEREMRLD 0x00001000 IBM 0x45444f43) @ 0x3ffeff80 > (XEN) ACPI: FADT (v001 IBM SEREMRLD 0x00001000 IBM 0x45444f43) @ 0x3ffeff00 > (XEN) ACPI: MADT (v001 IBM SEREMRLD 0x00001000 IBM 0x45444f43) @ 0x3ffefe80 > (XEN) ACPI: DSDT (v001 IBM SEREMRLD 0x00001000 MSFT 0x0100000b) @ 0x00000000 > (XEN) ACPI: Local APIC address 0xfee00000 > (XEN) ACPI: LAPIC (acpi_id[0x00] lapic_id[0x03] enabled) > (XEN) Processor #3 6:11 APIC version 17 > (XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) > (XEN) Processor #0 6:11 APIC version 17 > (XEN) ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0]) > (XEN) IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-15 > (XEN) ACPI: IOAPIC (id[0x0d] address[0xfec01000] gsi_base[16]) > (XEN) IOAPIC[1]: apic_id 13, version 17, address 0xfec01000, GSI 16-31 > (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 30 dfl dfl) > (XEN) ACPI: IRQ3 used by override. > (XEN) Enabling APIC mode: Flat. Using 2 I/O APICs > (XEN) Using ACPI (MADT) for SMP configuration information > (XEN) Initializing CPU#0 > (XEN) Detected 1130.180 MHz processor. > (XEN) CPU: L1 I cache: 16K, L1 D cache: 16K > (XEN) CPU: L2 cache: 512K > (XEN) Intel machine check architecture supported. > (XEN) Intel machine check reporting enabled on CPU#0. > (XEN) CPU0: Intel(R) Pentium(R) III CPU family 1133MHz stepping 01 > (XEN) Booting processor 1/0 eip 90000 > (XEN) Initializing CPU#1 > (XEN) CPU: L1 I cache: 16K, L1 D cache: 16K > (XEN) CPU: L2 cache: 512K > (XEN) Intel machine check architecture supported. > (XEN) Intel machine check reporting enabled on CPU#1. > (XEN) CPU1: Intel(R) Pentium(R) III CPU family 1133MHz stepping 01 > (XEN) Total of 2 processors activated. 
> (XEN) ENABLING IO-APIC IRQs > (XEN) -> Using new ACK method > (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=0 apic2=-1 pin2=-1 > (XEN) ..MP-BIOS bug: 8254 timer not connected to IO-APIC > (XEN) ...trying to set up timer (IRQ0) through the 8259A ... failed. > (XEN) ...trying to set up timer as Virtual Wire IRQ... works. > (XEN) checking TSC synchronization across 2 CPUs: passed. > (XEN) Platform timer is 1.193MHz PIT > (XEN) Brought up 2 CPUs > (XEN) Machine check exception polling timer started. > (XEN) *** LOADING DOMAIN 0 *** > (XEN) Domain 0 kernel supports features = { 0000001f }. > (XEN) Domain 0 kernel requires features = { 00000000 }. > (XEN) PHYSICAL MEMORY ARRANGEMENT: > (XEN) Dom0 alloc.: 3c000000->3e000000 (8192 pages to be allocated) > (XEN) VIRTUAL MEMORY ARRANGEMENT: > (XEN) Loaded kernel: c0100000->c05d5d74 > (XEN) Init. ramdisk: c05d6000->c0f7c400 > (XEN) Phys-Mach map: c0f7d000->c0f8d000 > (XEN) Start info: c0f8d000->c0f8e000 > (XEN) Page tables: c0f8e000->c0f94000 > (XEN) Boot stack: c0f94000->c0f95000 > (XEN) TOTAL: c0000000->c1400000 > (XEN) ENTRY ADDRESS: c0100000 > (XEN) Dom0 has maximum 2 VCPUs > (XEN) Initrd len 0x9a6400, start at 0xc05d6000 > (XEN) Scrubbing Free RAM: ...........done. > (XEN) Xen trace buffers: disabled > (XEN) Xen is relinquishing VGA console. > (XEN) *** Serial input -> DOM0 (type ''CTRL-a'' three times to switch input to Xen). > Linux version 2.6.16.13-xen (root@n4h8) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #3 SMP Fri Aug 4 10:57:44 PDT 2006 > BIOS-provided physical RAM map: > Xen: 0000000000000000 - 0000000004800000 (usable) > 0MB HIGHMEM available. > 72MB LOWMEM available. > DMI 2.3 present. 
> ACPI: LAPIC (acpi_id[0x00] lapic_id[0x03] enabled) > ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) > ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0]) > IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-15 > ACPI: IOAPIC (id[0x0d] address[0xfec01000] gsi_base[16]) > IOAPIC[1]: apic_id 13, version 17, address 0xfec01000, GSI 16-31 > ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 30 dfl dfl) > Enabling APIC mode: Flat. Using 2 I/O APICs > Using ACPI (MADT) for SMP configuration information > Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000) > Built 1 zonelists > Kernel command line: root=801 ro console=ttyS0 ramdisk_size=32768 > Enabling fast FPU save and restore... done. > Enabling unmasked SIMD FPU exception support... done. > Initializing CPU#0 > PID hash table entries: 512 (order: 9, 8192 bytes) > Xen reported: 1130.113 MHz processor. > Console: colour VGA+ 80x25 > Dentry cache hash table entries: 16384 (order: 4, 65536 bytes) > Inode-cache hash table entries: 8192 (order: 3, 32768 bytes) > Software IO TLB enabled: > Aperture: 2 megabytes > Kernel range: 0x00000000c10f0000 - 0x00000000c12f0000 > vmalloc area: c5000000-fb7fe000, maxmem 33ffe000 > Memory: 47400k/73728k available (3389k kernel code, 18068k reserved, 1044k data, 208k init, 0k highmem) > Checking if this processor honours the WP bit even in supervisor mode... Ok. > Calibrating delay using timer specific routine.. 2261.60 BogoMIPS (lpj=11308023) > Security Framework v1.0.0 initialized > Capability LSM initialized > Mount-cache hash table entries: 512 > CPU: L1 I cache: 16K, L1 D cache: 16K > CPU: L2 cache: 512K > Checking ''hlt'' instruction... OK. > ENABLING IO-APIC IRQs > Enabling SMP... > Brought up 2 CPUs > Initializing CPU#1 > migration_cost=1895 > checking if image is initramfs... 
it is > Freeing initrd memory: 9881k freed > Grant table initialized > NET: Registered protocol family 16 > ACPI: bus type pci registered > PCI: Using configuration type 1 > ACPI: Subsystem revision 20060127 > ACPI: Interpreter enabled > ACPI: Using IOAPIC for interrupt routing > ACPI: PCI Root Bridge [PCI0] (0000:00) > ACPI: PCI Interrupt Link [LPE1] (IRQs *10) > ACPI: PCI Interrupt Link [LPE2] (IRQs *10) > ACPI: PCI Interrupt Link [LPVI] (IRQs) *0, disabled. > ACPI: PCI Interrupt Link [LPUS] (IRQs *7) > ACPI: PCI Root Bridge [PCI1] (0000:01) > ACPI: PCI Interrupt Link [LPSA] (IRQs *9) > ACPI: PCI Interrupt Link [LP1A] (IRQs) *0, disabled. > ACPI: PCI Interrupt Link [LP1B] (IRQs) *0, disabled. > ACPI: PCI Interrupt Link [LP2A] (IRQs *9) > ACPI: PCI Interrupt Link [LP2B] (IRQs) *0, disabled. > Linux Plug and Play Support v0.97 (c) Adam Belay > xen_mem: Initialising balloon driver. > SCSI subsystem initialized > usbcore: registered new driver usbfs > usbcore: registered new driver hub > PCI: Using ACPI for IRQ routing > PCI: If a device doesn''t work, try "pci=routeirq". If it helps, post a report > IA-32 Microcode Update Driver: v1.14-xen <tigran@veritas.com> > VFS: Disk quotas dquot_6.5.1 > Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) > JFS: nTxBlock = 512, nTxLock = 4096 > SGI XFS with ACLs, security attributes, realtime, large block numbers, no debug enabled > Initializing Cryptographic API > io scheduler noop registered > io scheduler anticipatory registered (default) > io scheduler deadline registered > io scheduler cfq registered > PNP: No PS/2 controller found. Probing ports directly. > serio: i8042 AUX port at 0x60,0x64 irq 12 > serio: i8042 KBD port at 0x60,0x64 irq 1 > RAMDISK driver initialized: 16 RAM disks of 32768K size 1024 blocksize > Xen virtual console successfully installed as ttyS0 > Event-channel device installed. 
> blkif_init: reqs=64, pages=704, mmap_vstart=0xc0800000 > Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 > ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx > SvrWks OSB4: IDE controller at PCI slot 0000:00:0f.1 > SvrWks OSB4: chipset revision 0 > SvrWks OSB4: not 100% native mode: will probe irqs later > ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA > ide1: BM-DMA at 0x0708-0x070f, BIOS settings: hdc:DMA, hdd:DMA > hda: LG CD-ROM CRN-8245B, ATAPI CD/DVD-ROM drive > ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 > hda: ATAPI 24X CD-ROM drive, 128kB Cache, (U)DMA > Uniform CD-ROM driver Revision: 3.20 > ide-floppy driver 0.99.newide > ACPI: PCI Interrupt 0000:01:03.0[A] -> GSI 28 (level, low) -> IRQ 16 > scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0 > <Adaptec aic7892 Ultra160 SCSI adapter> > aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs > > ACPI: PCI Interrupt 0000:01:05.0[A] -> GSI 20 (level, low) -> IRQ 17 > scsi1 : IBM PCI ServeRAID 7.12.05 Build 761 <ServeRAID 4L> > Vendor: IBM Model: SERVERAID Rev: 1.00 > Type: Direct-Access ANSI SCSI revision: 02 > Vendor: IBM Model: SERVERAID Rev: 1.00 > Type: Direct-Access ANSI SCSI revision: 02 > Vendor: IBM Model: SERVERAID Rev: 1.00 > Type: Processor ANSI SCSI revision: 02 > Vendor: IBM Model: FTlV1 S2 Rev: 0 > Type: Processor ANSI SCSI revision: 02 > Fusion MPT base driver 3.03.07 > Copyright (c) 1999-2005 LSI Logic Corporation > Fusion MPT SPI Host driver 3.03.07 > Fusion MPT misc device (ioctl) driver 3.03.07 > mptctl: Registered with Fusion MPT base driver > mptctl: /dev/mptctl @ (major,minor=10,220) > usbmon: debugfs is not available > usbcore: registered new driver libusual > mice: PS/2 mouse device common for all mice > md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27 > md: bitmap version 4.39 > NET: Registered protocol family 2 > input: AT Translated Set 2 keyboard as /class/input/input0 > IP route cache hash table entries: 1024 (order: 0, 
4096 bytes) > TCP established hash table entries: 4096 (order: 3, 32768 bytes) > TCP bind hash table entries: 4096 (order: 3, 32768 bytes) > TCP: Hash tables configured (established 4096 bind 4096) > TCP reno registered > Initializing IPsec netlink socket > NET: Registered protocol family 1 > NET: Registered protocol family 17 > NET: Registered protocol family 8 > NET: Registered protocol family 20 > Using IPI No-Shortcut mode > Freeing unused kernel memory: 208k freed > Loading, please wait... > Begin: Loading essential drivers... ... > logips2pp: Detected unknown logitech mouse model 0 > eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html > eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others > ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 27 (level, low) -> IRQ 18 > eth0: 0000:00:02.0, 00:02:55:C7:CA:D8, IRQ 18. > Board assembly 754338-001, Physical connectors present: RJ45 > Primary interface chip i82555 PHY #1. > General self-test: passed. > Serial sub-system self-test: passed. > Internal registers self-test: passed. > ROM checksum self-test: passed (0x04f4518b). > ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 25 (level, low) -> IRQ 19 > eth1: 0000:00:0a.0, 00:02:55:C7:CA:D9, IRQ 19. > Board assembly 754338-001, Physical connectors present: RJ45 > Primary interface chip i82555 PHY #1. > General self-test: passed. > Serial sub-system self-test: passed. > Internal registers self-test: passed. > ROM checksum self-test: passed (0x04f4518b). > Intel(R) PRO/1000 Network Driver - version 6.3.9-k4 > Copyright (c) 1999-2005 Intel Corporation. > Done. > Begin: Running /scripts/init-premount ... > input: PS/2 Logitech Mouse as /class/input/input1 > Done. > Begin: Mounting root file system... ... > Begin: Running /scripts/local-top ... > device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com > /scripts/local-top/lvm: 36: vgchange: not found > Done. 
> SCSI device sda: 71096320 512-byte hdwr sectors (36401 MB) > sda: assuming Write Enabled > sda: assuming drive cache: write through > SCSI device sda: 71096320 512-byte hdwr sectors (36401 MB) > sda: assuming Write Enabled > sda: assuming drive cache: write through > sda: sda1 sda2 < sda5 > > sd 1:0:0:0: Attached scsi disk sda > SCSI device sdb: 71096320 512-byte hdwr sectors (36401 MB) > sdb: assuming Write Enabled > sdb: assuming drive cache: write through > SCSI device sdb: 71096320 512-byte hdwr sectors (36401 MB) > sdb: assuming Write Enabled > sdb: assuming drive cache: write through > sdb: unknown partition table > sd 1:0:1:0: Attached scsi disk sdb > Begin: Running /scripts/local-premount ... > Done. > EXT3-fs: INFO: recovery required on readonly filesystem. > EXT3-fs: write access will be enabled during recovery. > kjournald starting. Commit interval 5 seconds > EXT3-fs: recovery complete. > EXT3-fs: mounted filesystem with ordered data mode. > Begin: Running /scripts/log-bottom ... > Done. > Done. > Begin: Running /scripts/init-bottom ... > Done. > INIT: version 2.86 booting > Starting the hotplug events dispatcher: udevd. > Synthesizing the initial hotplug events...done. > Waiting for /dev to be fully populated...done. > Loading /etc/console/boottime.kmap.gz > Activating swap. > Adding 1019880k swap on /dev/sda5. Priority:-1 extents:1 across:1019880k > Checking root file system... > fsck 1.37 (21-Mar-2005) > /dev/sda1: clean, 58437/512000 files, 445341/1023996 blocks > EXT3 FS on sda1, internal journal > System time was Sat Aug 12 00:22:00 UTC 2006. > Setting the System Clock using the Hardware Clock as reference... > System Clock set. System local time is now Sat Aug 12 00:22:01 UTC 2006. > Calculating module dependencies...done. > Loading modules... > auto > FATAL: Module auto not found. > 3c59x > softdog > Software Watchdog Timer: 0.07 initialized. 
soft_noboot=0 soft_margin=60 sec (nowayout= 0) > tulip > Linux Tulip driver version 1.1.13 (May 11, 2002) > e100 > e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI > e100: Copyright(c) 1999-2005 Intel Corporation > All modules loaded. > Loading device-mapper support. > Starting Enterprise Volume Management System: device-mapper: dm-linear: Device lookup failed > device-mapper: error adding target to table > device-mapper: dm-linear: Device lookup failed > device-mapper: error adding target to table > device-mapper: dm-linear: Device lookup failed > device-mapper: error adding target to table > device-mapper: dm-linear: Device lookup failed > device-mapper: error adding target to table > device-mapper: dm-linear: Device lookup failed > device-mapper: error adding target to table > device-mapper: dm-linear: Device lookup failed > device-mapper: error adding target to table > device-mapper: dm-linear: Device lookup failed > device-mapper: error adding target to table > device-mapper: dm-linear: Device lookup failed > device-mapper: error adding target to table > evms. > Checking all file systems... > fsck 1.37 (21-Mar-2005) > Setting kernel variables ... > ... done. > Loading the saved-state of the serial devices... > Cannot get serial info: Invalid argument > Cannot get serial info: Invalid argument > /dev/ttyS1: No such file or directory > /dev/ttyS1: No such file or directory > Mounting local filesystems... > Cleaning /tmp /var/run /var/lock. > Cleaning: /etc/network/ifstate. > Setting up IP spoofing protection: rp_filter. > Configuring network interfaces: done. > Starting portmap daemon: portmap. > > Setting the System Clock using the Hardware Clock as reference... > System Clock set. Local time: Fri Aug 11 17:22:07 PDT 2006 > > Recovering jove files ... Done. > Running ntpdate to synchronize clock. > Initializing random number generator...done. > Recovering nvi editor sessions... done. > Setting up X server socket directory /tmp/.X11-unix...done. 
> Setting up ICE socket directory /tmp/.ICE-unix...done. > INIT: Entering runlevel: 2 > Starting system log daemon: syslogd. > Starting kernel log daemon: klogd. > Starting virtual private network daemon:. > Starting portmap daemon: portmap. > Starting NFS common utilities: statd lockd. > Starting internet superserver: inetd. > * The mptctl module is missing. Please have a look at the README.Debian.gz. > . > Exporting directories for NFS kernel daemon...Installing knfsd (copyright (C) 1996 okir@monad.swb.de). > 10.27.3.21:/export/xen/fs/stevegt/crashme1/root: No such file or directory > 10.27.3.18:/export/xen/fs/stevegt/xentest2/root: No such file or directory > 10.27.3.17:/export/xen/fs/stevegt/xentest1/root: No such file or directory > 10.27.3.8:/export/xen/fs/cclarke/umltest1/root: No such file or directory > hotaru.chaosring.Org:/export/xen/fs/stevegt/crashme1/root: No such file or directory > rn-2.refactored-networks.com:/export/xen/fs/stevegt/xentest2/root: No such file or directory > xentest1.stevegt.TerraLuna.Org:/export/xen/fs/stevegt/xentest1/root: No such file or directory > umltest1.cclarke.TerraLuna.Org:/export/xen/fs/cclarke/umltest1/root: No such file or directory > 10.27.3.26:/export/xen/fs/stevegt/gforge/root: No such file or directory > 10.27.3.23:/export/xen/fs/baseline/sarge/root: No such file or directory > 10.27.3.22:/export/xen/fs/stevegt/build1/root: No such file or directory > 10.27.3.16:/export/xen/fs/baseline/woody/root: No such file or directory > bugs.t7a.Org:/export/xen/fs/stevegt/gforge/root: No such file or directory > sarge.baseline.TerraLuna.Org:/export/xen/fs/baseline/sarge/root: No such file or directory > build1.stevegt.TerraLuna.Org:/export/xen/fs/stevegt/build1/root: No such file or directory > woody.baseline.TerraLuna.Org:/export/xen/fs/baseline/woody/root: No such file or directory > 10.27.3.19:/export/xen/fs/cclarke/ebond/root: No such file or directory > ebond.cclarke.TerraLuna.Org:/export/xen/fs/cclarke/ebond/root: No 
such file or directory > 10.27.3.29:/export/xen/fs/dmasten/mss0/root: No such file or directory > 10.27.3.27:/export/xen/fs/cclarke/ccms/root: No such file or directory > 10.27.3.25:/export/xen/fs/baseline/sid/root: No such file or directory > 10.27.3.24:/export/xen/fs/stevegt/cvs1/root: No such file or directory > sid.stevegt.TerraLuna.Org:/export/xen/fs/baseline/sid/root: No such file or directory > cvs1.stevegt.TerraLuna.Org:/export/xen/fs/stevegt/cvs1/root: No such file or directory > 10.27.3.28:/export/xen/fs/stevegt/t7a/root: No such file or directory > 10.27.3.20:/export/xen/fs/stevegt/tcx/root: No such file or directory > tcx.TerraLuna.Org:/export/xen/fs/stevegt/tcx/root: No such file or directory > n2h54.prd.TerraLuna.Org:/export/xen: No such file or directory > n2h53.prd.TerraLuna.Org:/export/xen: No such file or directory > n2h51.prd.TerraLuna.Org:/export/xen: No such file or directory > n2h41.prd.TerraLuna.Org:/export/xen: No such file or directory > done. > Starting NFS kernel daemon: nfsdNFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory > NFSD: unable to find recovery directory /var/lib/nfs/v4recovery > NFSD: starting 90-second grace period > mountd. > Starting OpenBSD Secure Shell server: sshdNET: Registered protocol family 10 > lo: Disabled Privacy Extensions > IPv6 over IPv4 tunneling driver > . > Starting NTP server: ntpd. > AFS module /lib/modules/2.6.16.13-xen/fs/openafs.mp.o does not exist. Not starting AFS. > Please consider building kernel modules using instructions in > /usr/share/doc/openafs-client/README.modules > Starting DRBD resources: drbd: initialised. Version: 0.7.20 (api:79/proto:74) > drbd: SVN Revision: 2260 build by root@n4h8, 2006-08-04 11:00:44 > drbd: registered as block device major 147 > [ d0 drbd7: resync bitmap: bits=1048576 words=32768 > drbd7: size = 4095 MB (4194303 KB) > drbd7: 0 KB marked out-of-sync by on disk bit-map. > drbd7: No usable activity log found. 
> drbd7: drbdsetup [3825]: cstate Unconfigured --> StandAlone > d1 drbd8: resync bitmap: bits=1048576 words=32768 > drbd8: size = 4095 MB (4194303 KB) > drbd8: 0 KB marked out-of-sync by on disk bit-map. > drbd8: Found 4 transactions (46 active extents) in activity log. > drbd8: Marked additional 156 MB as out-of-sync based on AL. > drbd8: drbdsetup [3829]: cstate Unconfigured --> StandAlone > d2 drbd9: resync bitmap: bits=2621440 words=81920 > drbd9: size = 9 GB (10485759 KB) > drbd9: 0 KB marked out-of-sync by on disk bit-map. > drbd9: Found 4 transactions (192 active extents) in activity log. > drbd9: drbdsetup [3833]: cstate Unconfigured --> StandAlone > d3 drbd10: resync bitmap: bits=1835008 words=57344 > drbd10: size = 7167 MB (7340031 KB) > drbd10: 0 KB marked out-of-sync by on disk bit-map. > drbd10: Found 4 transactions (192 active extents) in activity log. > drbd10: Marked additional 508 MB as out-of-sync based on AL. > drbd10: drbdsetup [3837]: cstate Unconfigured --> StandAlone > d4 drbd11: resync bitmap: bits=786432 words=24576 > drbd11: size = 3071 MB (3145727 KB) > drbd11: 0 KB marked out-of-sync by on disk bit-map. > drbd11: Found 4 transactions (192 active extents) in activity log. > drbd11: drbdsetup [3841]: cstate Unconfigured --> StandAlone > d5 drbd12: resync bitmap: bits=786432 words=24576 > drbd12: size = 3071 MB (3145727 KB) > drbd12: 0 KB marked out-of-sync by on disk bit-map. > drbd12: Found 4 transactions (136 active extents) in activity log. > drbd12: Marked additional 508 MB as out-of-sync based on AL. 
> drbd12: drbdsetup [3845]: cstate Unconfigured --> StandAlone > s0 s1 s2 s3 s4 s5 n0 drbd7: drbdsetup [3883]: cstate StandAlone --> Unconnected > drbd7: drbd7_receiver [3884]: cstate Unconnected --> WFConnection > n1 drbd8: drbdsetup [3891]: cstate StandAlone --> Unconnected > drbd8: drbd8_receiver [3892]: cstate Unconnected --> WFConnection > n2 drbd9: drbdsetup [3899]: cstate StandAlone --> Unconnected > drbd9: drbd9_receiver [3900]: cstate Unconnected --> WFConnection > n3 drbd10: drbdsetup [3907]: cstate StandAlone --> Unconnected > drbd10: drbd10_receiver [3908]: cstate Unconnected --> WFConnection > n4 drbd11: drbdsetup [3915]: cstate StandAlone --> Unconnected > drbd11: drbd11_receiver [3916]: cstate Unconnected --> WFConnection > n5 drbd12: drbdsetup [3923]: cstate StandAlone --> Unconnected > drbd12: drbd12_receiver [3924]: cstate Unconnected --> WFConnection > ]. > .......... > *************************************************************** > DRBD''s startup script waits for the peer node(s) to appear. > - In case this node was already a degraded cluster before the > reboot the timeout is 120 seconds. [degr-wfc-timeout] > - If the peer was available before the reboot the timeout will > expire after 0 seconds. [wfc-timeout] > (These values are for resource ''ds1.t7a.org''; 0 sec -> wait forever) > To abort waiting enter ''yes'' [ 18]:drbd7: drbd7_receiver [3884]: cstate WFConnection --> WFReportParams > drbd7: Handshake successful: DRBD Network Protocol version 74 > drbd7: Connection established. > drbd7: I am(S): 1:00000002:00000001:00000018:00000002:01 > drbd7: Peer(S): 1:00000002:00000001:00000019:00000002:10 > drbd7: drbd7_receiver [3884]: cstate WFReportParams --> WFBitMapT > drbd7: Secondary/Unknown --> Secondary/Secondary > drbd7: drbd7_receiver [3884]: cstate WFBitMapT --> SyncTarget > drbd7: Resync started as SyncTarget (need to sync 520192 KB [130048 bits set]). 
> drbd9: drbd9_receiver [3900]: cstate WFConnection --> WFReportParams > drbd9: Handshake successful: DRBD Network Protocol version 74 > drbd9: Connection established. > drbd9: I am(S): 1:00000002:00000001:0000000d:00000001:01 > drbd9: Peer(S): 1:00000002:00000001:0000000e:00000001:10 > drbd9: drbd9_receiver [3900]: cstate WFReportParams --> WFBitMapT > drbd9: Secondary/Unknown --> Secondary/Secondary > drbd9: drbd9_receiver [3900]: cstate WFBitMapT --> SyncTarget > drbd9: Resync started as SyncTarget (need to sync 290816 KB [72704 bits set]). > drbd9: drbd9_receiver [3900]: cstate SyncTarget --> PausedSyncT > drbd9: Syncer waits for sync group. > drbd11: drbd11_receiver [3916]: cstate WFConnection --> WFReportParams > drbd11: Handshake successful: DRBD Network Protocol version 74 > drbd11: Connection established. > drbd11: I am(S): 1:00000002:00000001:00000008:00000001:01 > drbd11: Peer(S): 1:00000002:00000001:00000009:00000001:10 > drbd11: drbd11_receiver [3916]: cstate WFReportParams --> WFBitMapT > drbd11: Secondary/Unknown --> Secondary/Secondary > drbd11: drbd11_receiver [3916]: cstate WFBitMapT --> SyncTarget > drbd11: Resync started as SyncTarget (need to sync 274432 KB [68608 bits set]). > drbd11: drbd11_receiver [3916]: cstate SyncTarget --> PausedSyncT > drbd11: Syncer waits for sync group. > drbd8: drbd8_receiver [3892]: cstate WFConnection --> WFReportParams > drbd8: Handshake successful: DRBD Network Protocol version 74 > drbd8: Connection established. > drbd8: I am(S): 1:00000002:00000001:00000017:00000001:11 > drbd8: Peer(S): 1:00000002:00000001:00000017:00000001:01 > drbd8: drbd8_receiver [3892]: cstate WFReportParams --> WFBitMapS > drbd8: Secondary/Unknown --> Secondary/Secondary > drbd8: drbd8_receiver [3892]: cstate WFBitMapS --> SyncSource > drbd8: Resync started as SyncSource (need to sync 159744 KB [39936 bits set]). > drbd8: drbd8_receiver [3892]: cstate SyncSource --> PausedSyncS > drbd8: Syncer waits for sync group. 
> drbd12: drbd12_receiver [3924]: cstate WFConnection --> WFReportParams > drbd12: Handshake successful: DRBD Network Protocol version 74 > drbd12: Connection established. > drbd12: I am(S): 1:00000002:00000001:00000001:00000001:11 > drbd12: Peer(S): 1:00000002:00000001:00000001:00000001:01 > drbd12: drbd12_receiver [3924]: cstate WFReportParams --> WFBitMapS > drbd12: Secondary/Unknown --> Secondary/Secondary > drbd12: drbd12_receiver [3924]: cstate WFBitMapS --> SyncSource > drbd12: Resync started as SyncSource (need to sync 520192 KB [130048 bits set]). > drbd12: drbd12_receiver [3924]: cstate SyncSource --> PausedSyncS > drbd12: Syncer waits for sync group. > drbd10: drbd10_receiver [3908]: cstate WFConnection --> WFReportParams > drbd10: Handshake successful: DRBD Network Protocol version 74 > drbd10: Connection established. > drbd10: I am(S): 1:00000002:00000001:0000000b:00000001:11 > drbd10: Peer(S): 1:00000002:00000001:0000000b:00000001:01 > drbd10: drbd10_receiver [3908]: cstate WFReportParams --> WFBitMapS > > drbd10: Secondary/Unknown --> Secondary/Secondary > drbd10: drbd10_receiver [3908]: cstate WFBitMapS --> SyncSource > drbd10: Resync started as SyncSource (need to sync 520192 KB [130048 bits set]). > drbd10: drbd10_receiver [3908]: cstate SyncSource --> PausedSyncS > drbd10: Syncer waits for sync group. > Heartbeat not configured: /etc/ha.d/ha.cf not found. > Heartbeat failure [rc=1]. Failed. > Starting deferred execution scheduler: atd. > Starting periodic command scheduler: cron. > Starting watchdog daemon: watchdog. > openMosix: ERROR: Cannot find the /proc/hpc directory. > openMosix: ERROR: Looks like this is not an openMosix enabled kernel. > openMosix: ERROR: Configuration ABORTED. > openMosix: HINT: need to recompile your kernel with openMosix support enabled? > openMosix: HINT: need to update your boot manager (lilo, grub, etc)? > openMosix: HINT: have you read the docs at http://openMosix.sf.net/ ? 
> Bridge firewalling registered > device vif0.0 entered promiscuous mode > xenbr0: port 1(vif0.0) entering learning state > xenbr0: topology change detected, propagating > xenbr0: port 1(vif0.0) entering forwarding state > device peth0 entered promiscuous mode > xenbr0: port 2(peth0) entering learning state > xenbr0: topology change detected, propagating > xenbr0: port 2(peth0) entering forwarding state > Starting auto Xen domains: afs1.t7a.orgdrbd9: Secondary/Secondary --> Secondary/Primary > device vif1.0 entered promiscuous mode > ADDRCONF(NETDEV_UP): vif1.0: link is not ready > ip_tables: (C) 2000-2006 Netfilter Core Team > Error: Device 2049 (vbd) could not be connected. Backend device not found. > xenbr0: port 3(vif1.0) entering disabled state > ! build2.t7a.orgdevice vif1.0 left promiscuous mode > xenbr0: port 3(vif1.0) entering disabled state > device vif2.0 entered promiscuous mode > ADDRCONF(NETDEV_UP): vif2.0: link is not ready > drbd10: Secondary/Secondary --> Primary/Secondary > ADDRCONF(NETDEV_CHANGE): vif2.0: link becomes ready > xenbr0: port 3(vif2.0) entering learning state > xenbr0: topology change detected, propagating > xenbr0: port 3(vif2.0) entering forwarding state > ds1.t7a.orgdrbd7: Secondary/Secondary --> Secondary/Primary > device vif3.0 entered promiscuous mode > ADDRCONF(NETDEV_UP): vif3.0: link is not ready > Error: Device 2049 (vbd) could not be connected. Backend device not found. > ! 
kdc1.t7a.orgxenbr0: port 4(vif3.0) entering disabled state > > device vif3.0 left promiscuous mode > xenbr0: port 4(vif3.0) entering disabled state > > > > > > > drbd8: Secondary/Secondary --> Primary/Secondary > device vif4.0 entered promiscuous mode > ADDRCONF(NETDEV_UP): vif4.0: link is not ready > ADDRCONF(NETDEV_CHANGE): vif4.0: link becomes ready > xenbr0: port 4(vif4.0) entering learning state > xenbr0: topology change detected, propagating > xenbr0: port 4(vif4.0) entering forwarding state > w1.cdint.comdrbd11: Secondary/Secondary --> Secondary/Primary > device vif5.0 entered promiscuous mode > ADDRCONF(NETDEV_UP): vif5.0: link is not ready > Error: Device 2049 (vbd) could not be connected. Backend device not found. > xenbr0: port 5(vif5.0) entering disabled state > ! webkdc.t7a.orgdevice vif5.0 left promiscuous mode > xenbr0: port 5(vif5.0) entering disabled state > device vif6.0 entered promiscuous mode > ADDRCONF(NETDEV_UP): vif6.0: link is not ready > drbd12: Secondary/Secondary --> Primary/Secondary > ADDRCONF(NETDEV_CHANGE): vif6.0: link becomes ready > xenbr0: port 5(vif6.0) entering learning state > xenbr0: topology change detected, propagating > xenbr0: port 5(vif6.0) entering forwarding state > [done] > > > _sudZUZ#Z#XZo=_ DDDD EEEEEE BBBB IIIIII AAAA NN NN > _jmZZ2!!~---~!!X##wa DD DD EE BB BB II AA AA NNN NN > .<wdP~~ -!YZL, DD DD EEEEE BBBBB II AAAAAA NNNN NN > .mX2'' _%aaa__ XZ[. DD DD EE BB BB II AA AA NN NNNN > oZ[ _jdXY!~?S#wa ]Xb; DDDD EEEEEE BBBBB IIIIII AA AA NN NN > _#e'' .]X2( ~Xw| )XXc > .2Z` ]X[. xY| ]oZ( Linux Version 2.6.16.13-xen > .2#; )3k; _s!~ jXf` Compiled #3 SMP Fri Aug 4 10:57:44 PDT 2006 > 1Z> -]Xb/ ~ __#2( Two 1.13GHz Intel Intel(R) Pentium(R) III CPU family 1133MHz Processors > -Zo; +!4ZwaaaauZZXY'' 72M RAM > *#[, ~-?!!!!!!-~ 4523.20 Bogomips Total > XUb;. n4h34 > )YXL,, > +3#bc, > -)SSL,, > ~~~~~ > Updating the Linuxlogo... done. 
> Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: drbd7: Resync done (total 137 sec; paused 0 sec; 3796 K/sec) > drbd7: drbd7_worker [3826]: cstate SyncTarget --> Connected > drbd8: Syncer continues. > drbd8: drbd7_worker [3826]: cstate PausedSyncS --> SyncSource > drbd9: Syncer continues. > drbd9: drbd7_worker [3826]: cstate PausedSyncT --> SyncTarget > drbd10: Syncer continues. > drbd10: drbd7_worker [3826]: cstate PausedSyncS --> SyncSource > drbd11: Syncer continues. > drbd11: drbd7_worker [3826]: cstate PausedSyncT --> SyncTarget > drbd12: Syncer continues. > drbd12: drbd7_worker [3826]: cstate PausedSyncS --> SyncSource > drbd8: Resync done (total 211 sec; paused 137 sec; 2156 K/sec) > drbd8: drbd8_worker [3830]: cstate SyncSource --> Connected > drbd11: Resync done (total 239 sec; paused 137 sec; 2688 K/sec) > drbd11: drbd11_worker [3842]: cstate SyncTarget --> Connected > drbd9: Resync done (total 243 sec; paused 137 sec; 2740 K/sec) > drbd9: drbd9_worker [3834]: cstate SyncTarget --> Connected > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: > Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: > Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: > Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: > Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: > Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: drbd12: Resync done (total 306 sec; paused 136 sec; 3056 K/sec) > drbd12: drbd12_worker [3846]: cstate SyncSource --> Connected > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: > Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: drbd10: Resync done (total 310 sec; paused 136 sec; 2988 K/sec) > drbd10: drbd10_worker [3838]: cstate SyncSource --> Connected > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > n4h34 login: Unable to handle kernel paging request at virtual address c0976590 > printing eip: > c03d6ed7 > *pde = ma 3cf91067 pa 00f91067 > *pte = ma 00000000 pa fffff000 > Oops: 0000 [#1] > SMP > Modules linked in: iptable_filter ip_tables x_tables bridge 
drbd ipv6 nfsd lockd sunrpc e100 tulip softdog 3c59x evdev sd_mod dm_mod thermal processor fan e1000 eepro100 mii tg3 > CPU: 0 > EIP: 0061:[<c03d6ed7>] Not tainted VLI > EFLAGS: 00010206 (2.6.16.13-xen #3) > EIP is at skb_copy_bits+0x127/0x280 > eax: c0976000 ebx: 000005a8 ecx: 0000016a edx: c3711720 > esi: c0976590 edi: c37110e0 ebp: 000005a8 esp: c00818c8 > ds: 007b es: 007b ss: 0069 > Process drbd12_receiver (pid: 3924, threadinfo=c0080000 task=c3efda90) > Stack: <0>c1012ec0 00000002 c03d65ef c1d51a00 000005ea 00000042 00000000 00000000 > 00000020 c3edc800 c376fd64 c03d6b0f c376fd64 00000042 c37110e0 000005a8 > c1a70000 c54690c0 00000000 c1d51ac0 c3edc800 c376fd64 c03dc478 c376fd64 > Call Trace: > [<c03d65ef>] pskb_expand_head+0xdf/0x140 > [<c03d6b0f>] __pskb_pull_tail+0x7f/0x320 > [<c54690c0>] br_nf_dev_queue_xmit+0x0/0x50 [bridge] > [<c03dc478>] dev_queue_xmit+0x328/0x370 > [<c5462f7e>] br_dev_queue_push_xmit+0xbe/0x140 [bridge] > [<c5469212>] br_nf_post_routing+0x102/0x1c0 [bridge] > [<c54690c0>] br_nf_dev_queue_xmit+0x0/0x50 [bridge] > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > [<c03f43f8>] nf_iterate+0x78/0x90 > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > [<c03f447e>] nf_hook_slow+0x6e/0x110 > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > [<c5463061>] br_forward_finish+0x61/0x70 [bridge] > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > [<c5468995>] br_nf_forward_finish+0x75/0x130 [bridge] > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > [<c5468b38>] br_nf_forward_ip+0xe8/0x190 [bridge] > [<c5468920>] br_nf_forward_finish+0x0/0x130 [bridge] > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > [<c03f43f8>] nf_iterate+0x78/0x90 > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > [<c03f447e>] nf_hook_slow+0x6e/0x110 > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > [<c5463167>] __br_forward+0x77/0x80 
[bridge] > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > [<c5463fbf>] br_handle_frame_finish+0xdf/0x160 [bridge] > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > [<c5467d89>] br_nf_pre_routing_finish+0xf9/0x370 [bridge] > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > [<c0322e3a>] loopback_start_xmit+0xba/0x110 > [<c0400d70>] ip_finish_output+0x0/0x220 > [<c03dc07e>] dev_hard_start_xmit+0x5e/0x130 > [<c03dc3b5>] dev_queue_xmit+0x265/0x370 > [<c03f43f8>] nf_iterate+0x78/0x90 > [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge] > [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge] > [<c03f447e>] nf_hook_slow+0x6e/0x110 > [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge] > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > [<c54685fc>] br_nf_pre_routing+0x26c/0x520 [bridge] > [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge] > [<c03f43f8>] nf_iterate+0x78/0x90 > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > [<c03f447e>] nf_hook_slow+0x6e/0x110 > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > [<c546422d>] br_handle_frame+0x1ed/0x230 [bridge] > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > [<c03dcb21>] netif_receive_skb+0x1a1/0x330 > [<c03dcd87>] process_backlog+0xd7/0x190 > [<c03dcf2a>] net_rx_action+0xea/0x230 > [<c0125915>] __do_softirq+0xf5/0x120 > [<c01259d5>] do_softirq+0x95/0xa0 > [<c0125a42>] local_bh_enable+0x62/0xa0 > [<c0406bd1>] tcp_prequeue_process+0x71/0x80 > [<c04070e9>] tcp_recvmsg+0x349/0x750 > [<c50eebfc>] dm_request+0xbc/0x100 [dm_mod] > [<c03d5085>] sock_common_recvmsg+0x55/0x70 > [<c03d11cf>] sock_recvmsg+0xef/0x110 > [<c03143fa>] force_evtchn_callback+0xa/0x10 > [<c0147163>] mempool_alloc+0x33/0xe0 > [<c01367d0>] autoremove_wake_function+0x0/0x60 > [<c50eebfc>] dm_request+0xbc/0x100 [dm_mod] > [<c02b2ee0>] generic_make_request+0xf0/0x160 > [<c529c680>] drbd_recv+0x90/0x190 [drbd] > [<c529cdec>] 
drbd_recv_header+0x2c/0xf0 [drbd] > [<c529e580>] receive_DataRequest+0x0/0x7d0 [drbd] > [<c52a045c>] drbdd+0x1c/0x150 [drbd] > [<c52a105a>] drbdd_init+0x7a/0x1a0 [drbd] > [<c52a7136>] drbd_thread_setup+0x86/0xf0 [drbd] > [<c52a70b0>] drbd_thread_setup+0x0/0xf0 [drbd] > [<c0102f75>] kernel_thread_helper+0x5/0x10 > Code: 8b 4c 24 30 8b 7c 24 34 8b 91 a0 00 00 00 8b 4c 24 18 0f b7 74 ca 18 8b 4c 24 14 8d 34 06 01 fe 29 ce 8b 7c 24 38 89 d9 c1 e9 02 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 89 04 24 ba 02 00 00 00 89 54 > <0>Kernel panic - not syncing: Fatal exception in interrupt > (XEN) Domain 0 crashed: rebooting machine in 5 seconds. > > -- > Stephen G. Traugott (KG6HDQ) > Managing Partner, TerraLuna LLC > stevegt@TerraLuna.Org -- http://www.t7a.org
--
Stephen G. Traugott (KG6HDQ)
Managing Partner, TerraLuna LLC
stevegt@TerraLuna.Org -- http://www.t7a.org
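For anyone comparing their own traces against the oops quoted above, here is a minimal Python sketch that cross-checks the register dump. All values are copied from the posted oops; the function name `summarize` and the 4 KiB page size are assumptions made for illustration. The code bytes just before the marked `<f3> a5` (`89 d9 c1 e9 02`) decode as `mov %ebx,%ecx; shr $2,%ecx`, and `f3 a5` is `rep movsl`, so the sketch checks whether ecx is the byte count in ebx divided by four and whether the faulting address is the copy source in esi. This is only a reading of the posted numbers, not a diagnosis of the underlying bug.

```python
# Values copied from the oops register dump quoted above (hex, as printed).
FAULT_ADDR = 0xC0976590  # "Unable to handle kernel paging request at virtual address c0976590"
EAX = 0xC0976000
EBX = 0x000005A8
ECX = 0x0000016A
ESI = 0xC0976590

PAGE_SIZE = 4096  # assumption: standard 4 KiB pages on 32-bit x86


def summarize():
    """Cross-check the registers against the 'rep movsl' at the faulting EIP."""
    page_base = FAULT_ADDR & ~(PAGE_SIZE - 1)
    return {
        # ebx holds the byte count that was shifted right by 2 into ecx
        "copy_len_bytes": EBX,
        "dwords_remaining": ECX,
        # if ecx * 4 equals ebx, no dword had been copied yet when the fault hit
        "dwords_match_len": ECX * 4 == EBX,
        # rep movsl reads from esi, so a read fault should be at esi
        "fault_is_source_ptr": ESI == FAULT_ADDR,
        "page_base": page_base,
        "page_offset": FAULT_ADDR & (PAGE_SIZE - 1),
        "page_base_is_eax": page_base == EAX,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        shown = hex(value) if isinstance(value, int) and not isinstance(value, bool) else value
        print(f"{key}: {shown}")
```

Read this way, the dump is consistent with a 1448-byte copy (0x5a8) faulting on its very first source read, at offset 0x590 into the page whose base is in eax. Why that source page was not mapped in dom0 is not something these numbers can answer.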
Steve Traugott
2006-Aug-12 05:50 UTC
[Xen-devel] Re: Unable to handle kernel paging request
Another data point... Dom0 seems to only want to crash when more than 3 or 4 domU's are running (each with their own DRBD root, with DRBD running in dom0), and the below 'nc' command is run in the last domU... Still looking.

Steve

On Fri, Aug 11, 2006 at 09:34:12PM -0700, Steve Traugott wrote:
> Okay, for some reason I'm now able to crash dom0 pretty reliably in
> this configuration (domU root filesystem in DRBD) simply by running
> this in domU:
>
> nc -u -b 255.255.255.255 54321 < /dev/zero
>
> Anyone else want to give that a try on 10868 or later, and see if they
> can duplicate it, with or without DRBD?
>
> Steve
>
> On Fri, Aug 11, 2006 at 06:50:50PM -0700, Steve Traugott wrote:
> > Uh oh....
> >
> > On Sat, Aug 05, 2006 at 08:38:03AM +0100, Ian Pratt wrote:
> > > > So I built -unstable changeset 10868, and ran an even heavier workload
> > > > (the above, plus 'bonnie' in the guests) on dom0 and two guests
> > > > overnight, and they experienced no soft lockups; running -unstable,
> > > > changeset 10868, credit scheduler. This same workload would have
> > > > caused soft lockups within seconds in -testing changeset 9732 using
> > > > the sedf scheduler; I may not have been able to get it started at all.
> > > > Response time remained subsecond under -unstable; -testing would have
> > > > been on its knees.
> > >
> > > That's good to hear. 3.0.3 is going to be a big leap forward in many
> > > ways.
> >
> > Same environment as we've discussed earlier in this thread, now on
> > unstable changeset 10868, Xen on top of DRBD, this one seems to be a
> > reliable crash when under combined disk and network load, still
> > isolating but would welcome suggestions...
> > > > Steve > > > > > > root (hd0,0) > > Filesystem type is ext2fs, partition type 0x83 > > kernel /boot/xen-3.0.gz dom0_mem=65536 com1=9600,8n1 > > [Multiboot-elf, <0x100000:0x7a008:0x55ff8>, shtab=0x1d0078, entry=0x100000] > > module /boot/vmlinuz-2.6.16.13-xen root=801 ro console=ttyS0 ramdisk_size=32768 > > [Multiboot-module @ 0x1d1000, 0x494ed0 bytes] > > module /boot/initrd.img-2.6.16.13-xen > > [Multiboot-module @ 0x666000, 0x9a6400 bytes] > > __ __ _____ ___ _ _ _ > > \ \/ /___ _ __ |___ / / _ \ _ _ _ __ ___| |_ __ _| |__ | | ___ > > \ // _ \ ''_ \ |_ \| | | |__| | | | ''_ \/ __| __/ _` | ''_ \| |/ _ \ > > / \ __/ | | | ___) | |_| |__| |_| | | | \__ \ || (_| | |_) | | __/ > > /_/\_\___|_| |_| |____(_)___/ \__,_|_| |_|___/\__\__,_|_.__/|_|\___| > > > > http://www.cl.cam.ac.uk/netos/xen > > University of Cambridge Computer Laboratory > > > > Xen version 3.0-unstable (root@prd.terraluna.org) (gcc version 3.3.5 (Debian 1:3.3.5-13)) Fri Aug 4 10:54:50 PDT 2006 > > Latest ChangeSet: Sat Jul 29 06:05:59 2006 +0100 10868:d2bf1a7cc131 > > > > (XEN) Command line: /boot/xen-3.0.gz dom0_mem=65536 com1=9600,8n1 > > (XEN) Physical RAM map: > > (XEN) 0000000000000000 - 000000000009c000 (usable) > > (XEN) 000000000009c000 - 00000000000a0000 (reserved) > > (XEN) 00000000000e0000 - 0000000000100000 (reserved) > > (XEN) 0000000000100000 - 000000003ffec340 (usable) > > (XEN) 000000003ffec340 - 000000003fff0000 (ACPI data) > > (XEN) 000000003fff0000 - 0000000040000000 (reserved) > > (XEN) 00000000fec00000 - 0000000100000000 (reserved) > > (XEN) System RAM: 1023MB (1048096kB) > > (XEN) Xen heap: 10MB (10396kB) > > (XEN) Using scheduler: SMP Credit Scheduler (credit) > > (XEN) PAE disabled. > > (XEN) found SMP MP-table at 0009c1d0 > > (XEN) DMI 2.3 present. 
> > (XEN) Using APIC driver default > > (XEN) ACPI: RSDP (v000 IBM ) @ 0x000fdfd0 > > (XEN) ACPI: RSDT (v001 IBM SEREMRLD 0x00001000 IBM 0x45444f43) @ 0x3ffeff80 > > (XEN) ACPI: FADT (v001 IBM SEREMRLD 0x00001000 IBM 0x45444f43) @ 0x3ffeff00 > > (XEN) ACPI: MADT (v001 IBM SEREMRLD 0x00001000 IBM 0x45444f43) @ 0x3ffefe80 > > (XEN) ACPI: DSDT (v001 IBM SEREMRLD 0x00001000 MSFT 0x0100000b) @ 0x00000000 > > (XEN) ACPI: Local APIC address 0xfee00000 > > (XEN) ACPI: LAPIC (acpi_id[0x00] lapic_id[0x03] enabled) > > (XEN) Processor #3 6:11 APIC version 17 > > (XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) > > (XEN) Processor #0 6:11 APIC version 17 > > (XEN) ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0]) > > (XEN) IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-15 > > (XEN) ACPI: IOAPIC (id[0x0d] address[0xfec01000] gsi_base[16]) > > (XEN) IOAPIC[1]: apic_id 13, version 17, address 0xfec01000, GSI 16-31 > > (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 30 dfl dfl) > > (XEN) ACPI: IRQ3 used by override. > > (XEN) Enabling APIC mode: Flat. Using 2 I/O APICs > > (XEN) Using ACPI (MADT) for SMP configuration information > > (XEN) Initializing CPU#0 > > (XEN) Detected 1130.180 MHz processor. > > (XEN) CPU: L1 I cache: 16K, L1 D cache: 16K > > (XEN) CPU: L2 cache: 512K > > (XEN) Intel machine check architecture supported. > > (XEN) Intel machine check reporting enabled on CPU#0. > > (XEN) CPU0: Intel(R) Pentium(R) III CPU family 1133MHz stepping 01 > > (XEN) Booting processor 1/0 eip 90000 > > (XEN) Initializing CPU#1 > > (XEN) CPU: L1 I cache: 16K, L1 D cache: 16K > > (XEN) CPU: L2 cache: 512K > > (XEN) Intel machine check architecture supported. > > (XEN) Intel machine check reporting enabled on CPU#1. > > (XEN) CPU1: Intel(R) Pentium(R) III CPU family 1133MHz stepping 01 > > (XEN) Total of 2 processors activated. 
> > (XEN) ENABLING IO-APIC IRQs > > (XEN) -> Using new ACK method > > (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=0 apic2=-1 pin2=-1 > > (XEN) ..MP-BIOS bug: 8254 timer not connected to IO-APIC > > (XEN) ...trying to set up timer (IRQ0) through the 8259A ... failed. > > (XEN) ...trying to set up timer as Virtual Wire IRQ... works. > > (XEN) checking TSC synchronization across 2 CPUs: passed. > > (XEN) Platform timer is 1.193MHz PIT > > (XEN) Brought up 2 CPUs > > (XEN) Machine check exception polling timer started. > > (XEN) *** LOADING DOMAIN 0 *** > > (XEN) Domain 0 kernel supports features = { 0000001f }. > > (XEN) Domain 0 kernel requires features = { 00000000 }. > > (XEN) PHYSICAL MEMORY ARRANGEMENT: > > (XEN) Dom0 alloc.: 3c000000->3e000000 (8192 pages to be allocated) > > (XEN) VIRTUAL MEMORY ARRANGEMENT: > > (XEN) Loaded kernel: c0100000->c05d5d74 > > (XEN) Init. ramdisk: c05d6000->c0f7c400 > > (XEN) Phys-Mach map: c0f7d000->c0f8d000 > > (XEN) Start info: c0f8d000->c0f8e000 > > (XEN) Page tables: c0f8e000->c0f94000 > > (XEN) Boot stack: c0f94000->c0f95000 > > (XEN) TOTAL: c0000000->c1400000 > > (XEN) ENTRY ADDRESS: c0100000 > > (XEN) Dom0 has maximum 2 VCPUs > > (XEN) Initrd len 0x9a6400, start at 0xc05d6000 > > (XEN) Scrubbing Free RAM: ...........done. > > (XEN) Xen trace buffers: disabled > > (XEN) Xen is relinquishing VGA console. > > (XEN) *** Serial input -> DOM0 (type ''CTRL-a'' three times to switch input to Xen). > > Linux version 2.6.16.13-xen (root@n4h8) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #3 SMP Fri Aug 4 10:57:44 PDT 2006 > > BIOS-provided physical RAM map: > > Xen: 0000000000000000 - 0000000004800000 (usable) > > 0MB HIGHMEM available. > > 72MB LOWMEM available. > > DMI 2.3 present. 
> > ACPI: LAPIC (acpi_id[0x00] lapic_id[0x03] enabled) > > ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) > > ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0]) > > IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-15 > > ACPI: IOAPIC (id[0x0d] address[0xfec01000] gsi_base[16]) > > IOAPIC[1]: apic_id 13, version 17, address 0xfec01000, GSI 16-31 > > ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 30 dfl dfl) > > Enabling APIC mode: Flat. Using 2 I/O APICs > > Using ACPI (MADT) for SMP configuration information > > Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000) > > Built 1 zonelists > > Kernel command line: root=801 ro console=ttyS0 ramdisk_size=32768 > > Enabling fast FPU save and restore... done. > > Enabling unmasked SIMD FPU exception support... done. > > Initializing CPU#0 > > PID hash table entries: 512 (order: 9, 8192 bytes) > > Xen reported: 1130.113 MHz processor. > > Console: colour VGA+ 80x25 > > Dentry cache hash table entries: 16384 (order: 4, 65536 bytes) > > Inode-cache hash table entries: 8192 (order: 3, 32768 bytes) > > Software IO TLB enabled: > > Aperture: 2 megabytes > > Kernel range: 0x00000000c10f0000 - 0x00000000c12f0000 > > vmalloc area: c5000000-fb7fe000, maxmem 33ffe000 > > Memory: 47400k/73728k available (3389k kernel code, 18068k reserved, 1044k data, 208k init, 0k highmem) > > Checking if this processor honours the WP bit even in supervisor mode... Ok. > > Calibrating delay using timer specific routine.. 2261.60 BogoMIPS (lpj=11308023) > > Security Framework v1.0.0 initialized > > Capability LSM initialized > > Mount-cache hash table entries: 512 > > CPU: L1 I cache: 16K, L1 D cache: 16K > > CPU: L2 cache: 512K > > Checking ''hlt'' instruction... OK. > > ENABLING IO-APIC IRQs > > Enabling SMP... > > Brought up 2 CPUs > > Initializing CPU#1 > > migration_cost=1895 > > checking if image is initramfs... 
it is > > Freeing initrd memory: 9881k freed > > Grant table initialized > > NET: Registered protocol family 16 > > ACPI: bus type pci registered > > PCI: Using configuration type 1 > > ACPI: Subsystem revision 20060127 > > ACPI: Interpreter enabled > > ACPI: Using IOAPIC for interrupt routing > > ACPI: PCI Root Bridge [PCI0] (0000:00) > > ACPI: PCI Interrupt Link [LPE1] (IRQs *10) > > ACPI: PCI Interrupt Link [LPE2] (IRQs *10) > > ACPI: PCI Interrupt Link [LPVI] (IRQs) *0, disabled. > > ACPI: PCI Interrupt Link [LPUS] (IRQs *7) > > ACPI: PCI Root Bridge [PCI1] (0000:01) > > ACPI: PCI Interrupt Link [LPSA] (IRQs *9) > > ACPI: PCI Interrupt Link [LP1A] (IRQs) *0, disabled. > > ACPI: PCI Interrupt Link [LP1B] (IRQs) *0, disabled. > > ACPI: PCI Interrupt Link [LP2A] (IRQs *9) > > ACPI: PCI Interrupt Link [LP2B] (IRQs) *0, disabled. > > Linux Plug and Play Support v0.97 (c) Adam Belay > > xen_mem: Initialising balloon driver. > > SCSI subsystem initialized > > usbcore: registered new driver usbfs > > usbcore: registered new driver hub > > PCI: Using ACPI for IRQ routing > > PCI: If a device doesn''t work, try "pci=routeirq". If it helps, post a report > > IA-32 Microcode Update Driver: v1.14-xen <tigran@veritas.com> > > VFS: Disk quotas dquot_6.5.1 > > Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) > > JFS: nTxBlock = 512, nTxLock = 4096 > > SGI XFS with ACLs, security attributes, realtime, large block numbers, no debug enabled > > Initializing Cryptographic API > > io scheduler noop registered > > io scheduler anticipatory registered (default) > > io scheduler deadline registered > > io scheduler cfq registered > > PNP: No PS/2 controller found. Probing ports directly. > > serio: i8042 AUX port at 0x60,0x64 irq 12 > > serio: i8042 KBD port at 0x60,0x64 irq 1 > > RAMDISK driver initialized: 16 RAM disks of 32768K size 1024 blocksize > > Xen virtual console successfully installed as ttyS0 > > Event-channel device installed. 
> > blkif_init: reqs=64, pages=704, mmap_vstart=0xc0800000 > > Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 > > ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx > > SvrWks OSB4: IDE controller at PCI slot 0000:00:0f.1 > > SvrWks OSB4: chipset revision 0 > > SvrWks OSB4: not 100% native mode: will probe irqs later > > ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA > > ide1: BM-DMA at 0x0708-0x070f, BIOS settings: hdc:DMA, hdd:DMA > > hda: LG CD-ROM CRN-8245B, ATAPI CD/DVD-ROM drive > > ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 > > hda: ATAPI 24X CD-ROM drive, 128kB Cache, (U)DMA > > Uniform CD-ROM driver Revision: 3.20 > > ide-floppy driver 0.99.newide > > ACPI: PCI Interrupt 0000:01:03.0[A] -> GSI 28 (level, low) -> IRQ 16 > > scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0 > > <Adaptec aic7892 Ultra160 SCSI adapter> > > aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs > > > > ACPI: PCI Interrupt 0000:01:05.0[A] -> GSI 20 (level, low) -> IRQ 17 > > scsi1 : IBM PCI ServeRAID 7.12.05 Build 761 <ServeRAID 4L> > > Vendor: IBM Model: SERVERAID Rev: 1.00 > > Type: Direct-Access ANSI SCSI revision: 02 > > Vendor: IBM Model: SERVERAID Rev: 1.00 > > Type: Direct-Access ANSI SCSI revision: 02 > > Vendor: IBM Model: SERVERAID Rev: 1.00 > > Type: Processor ANSI SCSI revision: 02 > > Vendor: IBM Model: FTlV1 S2 Rev: 0 > > Type: Processor ANSI SCSI revision: 02 > > Fusion MPT base driver 3.03.07 > > Copyright (c) 1999-2005 LSI Logic Corporation > > Fusion MPT SPI Host driver 3.03.07 > > Fusion MPT misc device (ioctl) driver 3.03.07 > > mptctl: Registered with Fusion MPT base driver > > mptctl: /dev/mptctl @ (major,minor=10,220) > > usbmon: debugfs is not available > > usbcore: registered new driver libusual > > mice: PS/2 mouse device common for all mice > > md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27 > > md: bitmap version 4.39 > > NET: Registered protocol family 2 > > input: AT Translated Set 2 
keyboard as /class/input/input0 > > IP route cache hash table entries: 1024 (order: 0, 4096 bytes) > > TCP established hash table entries: 4096 (order: 3, 32768 bytes) > > TCP bind hash table entries: 4096 (order: 3, 32768 bytes) > > TCP: Hash tables configured (established 4096 bind 4096) > > TCP reno registered > > Initializing IPsec netlink socket > > NET: Registered protocol family 1 > > NET: Registered protocol family 17 > > NET: Registered protocol family 8 > > NET: Registered protocol family 20 > > Using IPI No-Shortcut mode > > Freeing unused kernel memory: 208k freed > > Loading, please wait... > > Begin: Loading essential drivers... ... > > logips2pp: Detected unknown logitech mouse model 0 > > eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html > > eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others > > ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 27 (level, low) -> IRQ 18 > > eth0: 0000:00:02.0, 00:02:55:C7:CA:D8, IRQ 18. > > Board assembly 754338-001, Physical connectors present: RJ45 > > Primary interface chip i82555 PHY #1. > > General self-test: passed. > > Serial sub-system self-test: passed. > > Internal registers self-test: passed. > > ROM checksum self-test: passed (0x04f4518b). > > ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 25 (level, low) -> IRQ 19 > > eth1: 0000:00:0a.0, 00:02:55:C7:CA:D9, IRQ 19. > > Board assembly 754338-001, Physical connectors present: RJ45 > > Primary interface chip i82555 PHY #1. > > General self-test: passed. > > Serial sub-system self-test: passed. > > Internal registers self-test: passed. > > ROM checksum self-test: passed (0x04f4518b). > > Intel(R) PRO/1000 Network Driver - version 6.3.9-k4 > > Copyright (c) 1999-2005 Intel Corporation. > > Done. > > Begin: Running /scripts/init-premount ... > > input: PS/2 Logitech Mouse as /class/input/input1 > > Done. > > Begin: Mounting root file system... ... 
> > Begin: Running /scripts/local-top ... > > device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com > > /scripts/local-top/lvm: 36: vgchange: not found > > Done. > > SCSI device sda: 71096320 512-byte hdwr sectors (36401 MB) > > sda: assuming Write Enabled > > sda: assuming drive cache: write through > > SCSI device sda: 71096320 512-byte hdwr sectors (36401 MB) > > sda: assuming Write Enabled > > sda: assuming drive cache: write through > > sda: sda1 sda2 < sda5 > > > sd 1:0:0:0: Attached scsi disk sda > > SCSI device sdb: 71096320 512-byte hdwr sectors (36401 MB) > > sdb: assuming Write Enabled > > sdb: assuming drive cache: write through > > SCSI device sdb: 71096320 512-byte hdwr sectors (36401 MB) > > sdb: assuming Write Enabled > > sdb: assuming drive cache: write through > > sdb: unknown partition table > > sd 1:0:1:0: Attached scsi disk sdb > > Begin: Running /scripts/local-premount ... > > Done. > > EXT3-fs: INFO: recovery required on readonly filesystem. > > EXT3-fs: write access will be enabled during recovery. > > kjournald starting. Commit interval 5 seconds > > EXT3-fs: recovery complete. > > EXT3-fs: mounted filesystem with ordered data mode. > > Begin: Running /scripts/log-bottom ... > > Done. > > Done. > > Begin: Running /scripts/init-bottom ... > > Done. > > INIT: version 2.86 booting > > Starting the hotplug events dispatcher: udevd. > > Synthesizing the initial hotplug events...done. > > Waiting for /dev to be fully populated...done. > > Loading /etc/console/boottime.kmap.gz > > Activating swap. > > Adding 1019880k swap on /dev/sda5. Priority:-1 extents:1 across:1019880k > > Checking root file system... > > fsck 1.37 (21-Mar-2005) > > /dev/sda1: clean, 58437/512000 files, 445341/1023996 blocks > > EXT3 FS on sda1, internal journal > > System time was Sat Aug 12 00:22:00 UTC 2006. > > Setting the System Clock using the Hardware Clock as reference... > > System Clock set. System local time is now Sat Aug 12 00:22:01 UTC 2006. 
> > Calculating module dependencies...done. > > Loading modules... > > auto > > FATAL: Module auto not found. > > 3c59x > > softdog > > Software Watchdog Timer: 0.07 initialized. soft_noboot=0 soft_margin=60 sec (nowayout= 0) > > tulip > > Linux Tulip driver version 1.1.13 (May 11, 2002) > > e100 > > e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI > > e100: Copyright(c) 1999-2005 Intel Corporation > > All modules loaded. > > Loading device-mapper support. > > Starting Enterprise Volume Management System: device-mapper: dm-linear: Device lookup failed > > device-mapper: error adding target to table > > device-mapper: dm-linear: Device lookup failed > > device-mapper: error adding target to table > > device-mapper: dm-linear: Device lookup failed > > device-mapper: error adding target to table > > device-mapper: dm-linear: Device lookup failed > > device-mapper: error adding target to table > > device-mapper: dm-linear: Device lookup failed > > device-mapper: error adding target to table > > device-mapper: dm-linear: Device lookup failed > > device-mapper: error adding target to table > > device-mapper: dm-linear: Device lookup failed > > device-mapper: error adding target to table > > device-mapper: dm-linear: Device lookup failed > > device-mapper: error adding target to table > > evms. > > Checking all file systems... > > fsck 1.37 (21-Mar-2005) > > Setting kernel variables ... > > ... done. > > Loading the saved-state of the serial devices... > > Cannot get serial info: Invalid argument > > Cannot get serial info: Invalid argument > > /dev/ttyS1: No such file or directory > > /dev/ttyS1: No such file or directory > > Mounting local filesystems... > > Cleaning /tmp /var/run /var/lock. > > Cleaning: /etc/network/ifstate. > > Setting up IP spoofing protection: rp_filter. > > Configuring network interfaces: done. > > Starting portmap daemon: portmap. > > > > Setting the System Clock using the Hardware Clock as reference... > > System Clock set. 
Local time: Fri Aug 11 17:22:07 PDT 2006 > > > > Recovering jove files ... Done. > > Running ntpdate to synchronize clock. > > Initializing random number generator...done. > > Recovering nvi editor sessions... done. > > Setting up X server socket directory /tmp/.X11-unix...done. > > Setting up ICE socket directory /tmp/.ICE-unix...done. > > INIT: Entering runlevel: 2 > > Starting system log daemon: syslogd. > > Starting kernel log daemon: klogd. > > Starting virtual private network daemon:. > > Starting portmap daemon: portmap. > > Starting NFS common utilities: statd lockd. > > Starting internet superserver: inetd. > > * The mptctl module is missing. Please have a look at the README.Debian.gz. > > . > > Exporting directories for NFS kernel daemon...Installing knfsd (copyright (C) 1996 okir@monad.swb.de). > > 10.27.3.21:/export/xen/fs/stevegt/crashme1/root: No such file or directory > > 10.27.3.18:/export/xen/fs/stevegt/xentest2/root: No such file or directory > > 10.27.3.17:/export/xen/fs/stevegt/xentest1/root: No such file or directory > > 10.27.3.8:/export/xen/fs/cclarke/umltest1/root: No such file or directory > > hotaru.chaosring.Org:/export/xen/fs/stevegt/crashme1/root: No such file or directory > > rn-2.refactored-networks.com:/export/xen/fs/stevegt/xentest2/root: No such file or directory > > xentest1.stevegt.TerraLuna.Org:/export/xen/fs/stevegt/xentest1/root: No such file or directory > > umltest1.cclarke.TerraLuna.Org:/export/xen/fs/cclarke/umltest1/root: No such file or directory > > 10.27.3.26:/export/xen/fs/stevegt/gforge/root: No such file or directory > > 10.27.3.23:/export/xen/fs/baseline/sarge/root: No such file or directory > > 10.27.3.22:/export/xen/fs/stevegt/build1/root: No such file or directory > > 10.27.3.16:/export/xen/fs/baseline/woody/root: No such file or directory > > bugs.t7a.Org:/export/xen/fs/stevegt/gforge/root: No such file or directory > > sarge.baseline.TerraLuna.Org:/export/xen/fs/baseline/sarge/root: No such file or directory > 
> build1.stevegt.TerraLuna.Org:/export/xen/fs/stevegt/build1/root: No such file or directory > > woody.baseline.TerraLuna.Org:/export/xen/fs/baseline/woody/root: No such file or directory > > 10.27.3.19:/export/xen/fs/cclarke/ebond/root: No such file or directory > > ebond.cclarke.TerraLuna.Org:/export/xen/fs/cclarke/ebond/root: No such file or directory > > 10.27.3.29:/export/xen/fs/dmasten/mss0/root: No such file or directory > > 10.27.3.27:/export/xen/fs/cclarke/ccms/root: No such file or directory > > 10.27.3.25:/export/xen/fs/baseline/sid/root: No such file or directory > > 10.27.3.24:/export/xen/fs/stevegt/cvs1/root: No such file or directory > > sid.stevegt.TerraLuna.Org:/export/xen/fs/baseline/sid/root: No such file or directory > > cvs1.stevegt.TerraLuna.Org:/export/xen/fs/stevegt/cvs1/root: No such file or directory > > 10.27.3.28:/export/xen/fs/stevegt/t7a/root: No such file or directory > > 10.27.3.20:/export/xen/fs/stevegt/tcx/root: No such file or directory > > tcx.TerraLuna.Org:/export/xen/fs/stevegt/tcx/root: No such file or directory > > n2h54.prd.TerraLuna.Org:/export/xen: No such file or directory > > n2h53.prd.TerraLuna.Org:/export/xen: No such file or directory > > n2h51.prd.TerraLuna.Org:/export/xen: No such file or directory > > n2h41.prd.TerraLuna.Org:/export/xen: No such file or directory > > done. > > Starting NFS kernel daemon: nfsdNFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory > > NFSD: unable to find recovery directory /var/lib/nfs/v4recovery > > NFSD: starting 90-second grace period > > mountd. > > Starting OpenBSD Secure Shell server: sshdNET: Registered protocol family 10 > > lo: Disabled Privacy Extensions > > IPv6 over IPv4 tunneling driver > > . > > Starting NTP server: ntpd. > > AFS module /lib/modules/2.6.16.13-xen/fs/openafs.mp.o does not exist. Not starting AFS. 
> > Please consider building kernel modules using instructions in > > /usr/share/doc/openafs-client/README.modules > > Starting DRBD resources: drbd: initialised. Version: 0.7.20 (api:79/proto:74) > > drbd: SVN Revision: 2260 build by root@n4h8, 2006-08-04 11:00:44 > > drbd: registered as block device major 147 > > [ d0 drbd7: resync bitmap: bits=1048576 words=32768 > > drbd7: size = 4095 MB (4194303 KB) > > drbd7: 0 KB marked out-of-sync by on disk bit-map. > > drbd7: No usable activity log found. > > drbd7: drbdsetup [3825]: cstate Unconfigured --> StandAlone > > d1 drbd8: resync bitmap: bits=1048576 words=32768 > > drbd8: size = 4095 MB (4194303 KB) > > drbd8: 0 KB marked out-of-sync by on disk bit-map. > > drbd8: Found 4 transactions (46 active extents) in activity log. > > drbd8: Marked additional 156 MB as out-of-sync based on AL. > > drbd8: drbdsetup [3829]: cstate Unconfigured --> StandAlone > > d2 drbd9: resync bitmap: bits=2621440 words=81920 > > drbd9: size = 9 GB (10485759 KB) > > drbd9: 0 KB marked out-of-sync by on disk bit-map. > > drbd9: Found 4 transactions (192 active extents) in activity log. > > drbd9: drbdsetup [3833]: cstate Unconfigured --> StandAlone > > d3 drbd10: resync bitmap: bits=1835008 words=57344 > > drbd10: size = 7167 MB (7340031 KB) > > drbd10: 0 KB marked out-of-sync by on disk bit-map. > > drbd10: Found 4 transactions (192 active extents) in activity log. > > drbd10: Marked additional 508 MB as out-of-sync based on AL. > > drbd10: drbdsetup [3837]: cstate Unconfigured --> StandAlone > > d4 drbd11: resync bitmap: bits=786432 words=24576 > > drbd11: size = 3071 MB (3145727 KB) > > drbd11: 0 KB marked out-of-sync by on disk bit-map. > > drbd11: Found 4 transactions (192 active extents) in activity log. > > drbd11: drbdsetup [3841]: cstate Unconfigured --> StandAlone > > d5 drbd12: resync bitmap: bits=786432 words=24576 > > drbd12: size = 3071 MB (3145727 KB) > > drbd12: 0 KB marked out-of-sync by on disk bit-map. 
> > drbd12: Found 4 transactions (136 active extents) in activity log. > > drbd12: Marked additional 508 MB as out-of-sync based on AL. > > drbd12: drbdsetup [3845]: cstate Unconfigured --> StandAlone > > s0 s1 s2 s3 s4 s5 n0 drbd7: drbdsetup [3883]: cstate StandAlone --> Unconnected > > drbd7: drbd7_receiver [3884]: cstate Unconnected --> WFConnection > > n1 drbd8: drbdsetup [3891]: cstate StandAlone --> Unconnected > > drbd8: drbd8_receiver [3892]: cstate Unconnected --> WFConnection > > n2 drbd9: drbdsetup [3899]: cstate StandAlone --> Unconnected > > drbd9: drbd9_receiver [3900]: cstate Unconnected --> WFConnection > > n3 drbd10: drbdsetup [3907]: cstate StandAlone --> Unconnected > > drbd10: drbd10_receiver [3908]: cstate Unconnected --> WFConnection > > n4 drbd11: drbdsetup [3915]: cstate StandAlone --> Unconnected > > drbd11: drbd11_receiver [3916]: cstate Unconnected --> WFConnection > > n5 drbd12: drbdsetup [3923]: cstate StandAlone --> Unconnected > > drbd12: drbd12_receiver [3924]: cstate Unconnected --> WFConnection > > ]. > > .......... > > *************************************************************** > > DRBD's startup script waits for the peer node(s) to appear. > > - In case this node was already a degraded cluster before the > > reboot the timeout is 120 seconds. [degr-wfc-timeout] > > - If the peer was available before the reboot the timeout will > > expire after 0 seconds. [wfc-timeout] > > (These values are for resource 'ds1.t7a.org'; 0 sec -> wait forever) > > To abort waiting enter 'yes' [ 18]:drbd7: drbd7_receiver [3884]: cstate WFConnection --> WFReportParams > > drbd7: Handshake successful: DRBD Network Protocol version 74 > > drbd7: Connection established. 
> > drbd7: I am(S): 1:00000002:00000001:00000018:00000002:01 > > drbd7: Peer(S): 1:00000002:00000001:00000019:00000002:10 > > drbd7: drbd7_receiver [3884]: cstate WFReportParams --> WFBitMapT > > drbd7: Secondary/Unknown --> Secondary/Secondary > > drbd7: drbd7_receiver [3884]: cstate WFBitMapT --> SyncTarget > > drbd7: Resync started as SyncTarget (need to sync 520192 KB [130048 bits set]). > > drbd9: drbd9_receiver [3900]: cstate WFConnection --> WFReportParams > > drbd9: Handshake successful: DRBD Network Protocol version 74 > > drbd9: Connection established. > > drbd9: I am(S): 1:00000002:00000001:0000000d:00000001:01 > > drbd9: Peer(S): 1:00000002:00000001:0000000e:00000001:10 > > drbd9: drbd9_receiver [3900]: cstate WFReportParams --> WFBitMapT > > drbd9: Secondary/Unknown --> Secondary/Secondary > > drbd9: drbd9_receiver [3900]: cstate WFBitMapT --> SyncTarget > > drbd9: Resync started as SyncTarget (need to sync 290816 KB [72704 bits set]). > > drbd9: drbd9_receiver [3900]: cstate SyncTarget --> PausedSyncT > > drbd9: Syncer waits for sync group. > > drbd11: drbd11_receiver [3916]: cstate WFConnection --> WFReportParams > > drbd11: Handshake successful: DRBD Network Protocol version 74 > > drbd11: Connection established. > > drbd11: I am(S): 1:00000002:00000001:00000008:00000001:01 > > drbd11: Peer(S): 1:00000002:00000001:00000009:00000001:10 > > drbd11: drbd11_receiver [3916]: cstate WFReportParams --> WFBitMapT > > drbd11: Secondary/Unknown --> Secondary/Secondary > > drbd11: drbd11_receiver [3916]: cstate WFBitMapT --> SyncTarget > > drbd11: Resync started as SyncTarget (need to sync 274432 KB [68608 bits set]). > > drbd11: drbd11_receiver [3916]: cstate SyncTarget --> PausedSyncT > > drbd11: Syncer waits for sync group. > > drbd8: drbd8_receiver [3892]: cstate WFConnection --> WFReportParams > > drbd8: Handshake successful: DRBD Network Protocol version 74 > > drbd8: Connection established. 
> > drbd8: I am(S): 1:00000002:00000001:00000017:00000001:11 > > drbd8: Peer(S): 1:00000002:00000001:00000017:00000001:01 > > drbd8: drbd8_receiver [3892]: cstate WFReportParams --> WFBitMapS > > drbd8: Secondary/Unknown --> Secondary/Secondary > > drbd8: drbd8_receiver [3892]: cstate WFBitMapS --> SyncSource > > drbd8: Resync started as SyncSource (need to sync 159744 KB [39936 bits set]). > > drbd8: drbd8_receiver [3892]: cstate SyncSource --> PausedSyncS > > drbd8: Syncer waits for sync group. > > drbd12: drbd12_receiver [3924]: cstate WFConnection --> WFReportParams > > drbd12: Handshake successful: DRBD Network Protocol version 74 > > drbd12: Connection established. > > drbd12: I am(S): 1:00000002:00000001:00000001:00000001:11 > > drbd12: Peer(S): 1:00000002:00000001:00000001:00000001:01 > > drbd12: drbd12_receiver [3924]: cstate WFReportParams --> WFBitMapS > > drbd12: Secondary/Unknown --> Secondary/Secondary > > drbd12: drbd12_receiver [3924]: cstate WFBitMapS --> SyncSource > > drbd12: Resync started as SyncSource (need to sync 520192 KB [130048 bits set]). > > drbd12: drbd12_receiver [3924]: cstate SyncSource --> PausedSyncS > > drbd12: Syncer waits for sync group. > > drbd10: drbd10_receiver [3908]: cstate WFConnection --> WFReportParams > > drbd10: Handshake successful: DRBD Network Protocol version 74 > > drbd10: Connection established. > > drbd10: I am(S): 1:00000002:00000001:0000000b:00000001:11 > > drbd10: Peer(S): 1:00000002:00000001:0000000b:00000001:01 > > drbd10: drbd10_receiver [3908]: cstate WFReportParams --> WFBitMapS > > > > drbd10: Secondary/Unknown --> Secondary/Secondary > > drbd10: drbd10_receiver [3908]: cstate WFBitMapS --> SyncSource > > drbd10: Resync started as SyncSource (need to sync 520192 KB [130048 bits set]). > > drbd10: drbd10_receiver [3908]: cstate SyncSource --> PausedSyncS > > drbd10: Syncer waits for sync group. > > Heartbeat not configured: /etc/ha.d/ha.cf not found. > > Heartbeat failure [rc=1]. Failed. 
> > Starting deferred execution scheduler: atd. > > Starting periodic command scheduler: cron. > > Starting watchdog daemon: watchdog. > > openMosix: ERROR: Cannot find the /proc/hpc directory. > > openMosix: ERROR: Looks like this is not an openMosix enabled kernel. > > openMosix: ERROR: Configuration ABORTED. > > openMosix: HINT: need to recompile your kernel with openMosix support enabled? > > openMosix: HINT: need to update your boot manager (lilo, grub, etc)? > > openMosix: HINT: have you read the docs at http://openMosix.sf.net/ ? > > Bridge firewalling registered > > device vif0.0 entered promiscuous mode > > xenbr0: port 1(vif0.0) entering learning state > > xenbr0: topology change detected, propagating > > xenbr0: port 1(vif0.0) entering forwarding state > > device peth0 entered promiscuous mode > > xenbr0: port 2(peth0) entering learning state > > xenbr0: topology change detected, propagating > > xenbr0: port 2(peth0) entering forwarding state > > Starting auto Xen domains: afs1.t7a.orgdrbd9: Secondary/Secondary --> Secondary/Primary > > device vif1.0 entered promiscuous mode > > ADDRCONF(NETDEV_UP): vif1.0: link is not ready > > ip_tables: (C) 2000-2006 Netfilter Core Team > > Error: Device 2049 (vbd) could not be connected. Backend device not found. > > xenbr0: port 3(vif1.0) entering disabled state > > ! 
build2.t7a.orgdevice vif1.0 left promiscuous mode > > xenbr0: port 3(vif1.0) entering disabled state > > device vif2.0 entered promiscuous mode > > ADDRCONF(NETDEV_UP): vif2.0: link is not ready > > drbd10: Secondary/Secondary --> Primary/Secondary > > ADDRCONF(NETDEV_CHANGE): vif2.0: link becomes ready > > xenbr0: port 3(vif2.0) entering learning state > > xenbr0: topology change detected, propagating > > xenbr0: port 3(vif2.0) entering forwarding state > > ds1.t7a.orgdrbd7: Secondary/Secondary --> Secondary/Primary > > device vif3.0 entered promiscuous mode > > ADDRCONF(NETDEV_UP): vif3.0: link is not ready > > Error: Device 2049 (vbd) could not be connected. Backend device not found. > > ! kdc1.t7a.orgxenbr0: port 4(vif3.0) entering disabled state > > > > device vif3.0 left promiscuous mode > > xenbr0: port 4(vif3.0) entering disabled state > > > > > > > > > > > > > > drbd8: Secondary/Secondary --> Primary/Secondary > > device vif4.0 entered promiscuous mode > > ADDRCONF(NETDEV_UP): vif4.0: link is not ready > > ADDRCONF(NETDEV_CHANGE): vif4.0: link becomes ready > > xenbr0: port 4(vif4.0) entering learning state > > xenbr0: topology change detected, propagating > > xenbr0: port 4(vif4.0) entering forwarding state > > w1.cdint.comdrbd11: Secondary/Secondary --> Secondary/Primary > > device vif5.0 entered promiscuous mode > > ADDRCONF(NETDEV_UP): vif5.0: link is not ready > > Error: Device 2049 (vbd) could not be connected. Backend device not found. > > xenbr0: port 5(vif5.0) entering disabled state > > ! 
webkdc.t7a.orgdevice vif5.0 left promiscuous mode > > xenbr0: port 5(vif5.0) entering disabled state > > device vif6.0 entered promiscuous mode > > ADDRCONF(NETDEV_UP): vif6.0: link is not ready > > drbd12: Secondary/Secondary --> Primary/Secondary > > ADDRCONF(NETDEV_CHANGE): vif6.0: link becomes ready > > xenbr0: port 5(vif6.0) entering learning state > > xenbr0: topology change detected, propagating > > xenbr0: port 5(vif6.0) entering forwarding state > > [done] > > > > > > _sudZUZ#Z#XZo=_ DDDD EEEEEE BBBB IIIIII AAAA NN NN > > _jmZZ2!!~---~!!X##wa DD DD EE BB BB II AA AA NNN NN > > .<wdP~~ -!YZL, DD DD EEEEE BBBBB II AAAAAA NNNN NN > > .mX2'' _%aaa__ XZ[. DD DD EE BB BB II AA AA NN NNNN > > oZ[ _jdXY!~?S#wa ]Xb; DDDD EEEEEE BBBBB IIIIII AA AA NN NN > > _#e'' .]X2( ~Xw| )XXc > > .2Z` ]X[. xY| ]oZ( Linux Version 2.6.16.13-xen > > .2#; )3k; _s!~ jXf` Compiled #3 SMP Fri Aug 4 10:57:44 PDT 2006 > > 1Z> -]Xb/ ~ __#2( Two 1.13GHz Intel Intel(R) Pentium(R) III CPU family 1133MHz Processors > > -Zo; +!4ZwaaaauZZXY'' 72M RAM > > *#[, ~-?!!!!!!-~ 4523.20 Bogomips Total > > XUb;. n4h34 > > )YXL,, > > +3#bc, > > -)SSL,, > > ~~~~~ > > Updating the Linuxlogo... done. > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: drbd7: Resync done (total 137 sec; paused 0 sec; 3796 K/sec) > > drbd7: drbd7_worker [3826]: cstate SyncTarget --> Connected > > drbd8: Syncer continues. > > drbd8: drbd7_worker [3826]: cstate PausedSyncS --> SyncSource > > drbd9: Syncer continues. > > drbd9: drbd7_worker [3826]: cstate PausedSyncT --> SyncTarget > > drbd10: Syncer continues. > > drbd10: drbd7_worker [3826]: cstate PausedSyncS --> SyncSource > > drbd11: Syncer continues. > > drbd11: drbd7_worker [3826]: cstate PausedSyncT --> SyncTarget > > drbd12: Syncer continues. 
> > drbd12: drbd7_worker [3826]: cstate PausedSyncS --> SyncSource > > drbd8: Resync done (total 211 sec; paused 137 sec; 2156 K/sec) > > drbd8: drbd8_worker [3830]: cstate SyncSource --> Connected > > drbd11: Resync done (total 239 sec; paused 137 sec; 2688 K/sec) > > drbd11: drbd11_worker [3842]: cstate SyncTarget --> Connected > > drbd9: Resync done (total 243 sec; paused 137 sec; 2740 K/sec) > > drbd9: drbd9_worker [3834]: cstate SyncTarget --> Connected > > > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: drbd12: Resync done (total 306 sec; paused 136 sec; 3056 K/sec) > > drbd12: drbd12_worker [3846]: cstate SyncSource --> Connected > > > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: drbd10: Resync done (total 310 sec; paused 136 sec; 2988 K/sec) > > drbd10: drbd10_worker [3838]: cstate SyncSource --> Connected > > > > Debian GNU/Linux 3.0 n4h34 ttyS0 > > > > n4h34 login: Unable to handle kernel paging request at virtual address c0976590 > > printing eip: > > c03d6ed7 > > *pde = ma 3cf91067 pa 00f91067 > > *pte = ma 00000000 pa fffff000 > > Oops: 0000 [#1] > > SMP > > Modules linked in: iptable_filter ip_tables x_tables bridge drbd ipv6 nfsd lockd sunrpc e100 tulip softdog 3c59x evdev sd_mod dm_mod thermal processor fan e1000 eepro100 mii tg3 > > CPU: 0 > > EIP: 0061:[<c03d6ed7>] Not tainted VLI > > EFLAGS: 00010206 (2.6.16.13-xen #3) > > EIP is at skb_copy_bits+0x127/0x280 > > eax: c0976000 ebx: 000005a8 ecx: 0000016a edx: c3711720 > > esi: c0976590 edi: c37110e0 ebp: 000005a8 esp: c00818c8 > > ds: 007b es: 007b ss: 0069 > > Process drbd12_receiver (pid: 3924, threadinfo=c0080000 task=c3efda90) > > 
Stack: <0>c1012ec0 00000002 c03d65ef c1d51a00 000005ea 00000042 00000000 00000000 > > 00000020 c3edc800 c376fd64 c03d6b0f c376fd64 00000042 c37110e0 000005a8 > > c1a70000 c54690c0 00000000 c1d51ac0 c3edc800 c376fd64 c03dc478 c376fd64 > > Call Trace: > > [<c03d65ef>] pskb_expand_head+0xdf/0x140 > > [<c03d6b0f>] __pskb_pull_tail+0x7f/0x320 > > [<c54690c0>] br_nf_dev_queue_xmit+0x0/0x50 [bridge] > > [<c03dc478>] dev_queue_xmit+0x328/0x370 > > [<c5462f7e>] br_dev_queue_push_xmit+0xbe/0x140 [bridge] > > [<c5469212>] br_nf_post_routing+0x102/0x1c0 [bridge] > > [<c54690c0>] br_nf_dev_queue_xmit+0x0/0x50 [bridge] > > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > > [<c03f43f8>] nf_iterate+0x78/0x90 > > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > > [<c03f447e>] nf_hook_slow+0x6e/0x110 > > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > > [<c5463061>] br_forward_finish+0x61/0x70 [bridge] > > [<c5462ec0>] br_dev_queue_push_xmit+0x0/0x140 [bridge] > > [<c5468995>] br_nf_forward_finish+0x75/0x130 [bridge] > > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > > [<c5468b38>] br_nf_forward_ip+0xe8/0x190 [bridge] > > [<c5468920>] br_nf_forward_finish+0x0/0x130 [bridge] > > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > > [<c03f43f8>] nf_iterate+0x78/0x90 > > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > > [<c03f447e>] nf_hook_slow+0x6e/0x110 > > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > > [<c5463167>] __br_forward+0x77/0x80 [bridge] > > [<c5463000>] br_forward_finish+0x0/0x70 [bridge] > > [<c5463fbf>] br_handle_frame_finish+0xdf/0x160 [bridge] > > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > > [<c5467d89>] br_nf_pre_routing_finish+0xf9/0x370 [bridge] > > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > > [<c0322e3a>] loopback_start_xmit+0xba/0x110 > > [<c0400d70>] ip_finish_output+0x0/0x220 > > 
[<c03dc07e>] dev_hard_start_xmit+0x5e/0x130 > > [<c03dc3b5>] dev_queue_xmit+0x265/0x370 > > [<c03f43f8>] nf_iterate+0x78/0x90 > > [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge] > > [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge] > > [<c03f447e>] nf_hook_slow+0x6e/0x110 > > [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge] > > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > > [<c54685fc>] br_nf_pre_routing+0x26c/0x520 [bridge] > > [<c5467c90>] br_nf_pre_routing_finish+0x0/0x370 [bridge] > > [<c03f43f8>] nf_iterate+0x78/0x90 > > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > > [<c03f447e>] nf_hook_slow+0x6e/0x110 > > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > > [<c546422d>] br_handle_frame+0x1ed/0x230 [bridge] > > [<c5463ee0>] br_handle_frame_finish+0x0/0x160 [bridge] > > [<c03dcb21>] netif_receive_skb+0x1a1/0x330 > > [<c03dcd87>] process_backlog+0xd7/0x190 > > [<c03dcf2a>] net_rx_action+0xea/0x230 > > [<c0125915>] __do_softirq+0xf5/0x120 > > [<c01259d5>] do_softirq+0x95/0xa0 > > [<c0125a42>] local_bh_enable+0x62/0xa0 > > [<c0406bd1>] tcp_prequeue_process+0x71/0x80 > > [<c04070e9>] tcp_recvmsg+0x349/0x750 > > [<c50eebfc>] dm_request+0xbc/0x100 [dm_mod] > > [<c03d5085>] sock_common_recvmsg+0x55/0x70 > > [<c03d11cf>] sock_recvmsg+0xef/0x110 > > [<c03143fa>] force_evtchn_callback+0xa/0x10 > > [<c0147163>] mempool_alloc+0x33/0xe0 > > [<c01367d0>] autoremove_wake_function+0x0/0x60 > > [<c50eebfc>] dm_request+0xbc/0x100 [dm_mod] > > [<c02b2ee0>] generic_make_request+0xf0/0x160 > > [<c529c680>] drbd_recv+0x90/0x190 [drbd] > > [<c529cdec>] drbd_recv_header+0x2c/0xf0 [drbd] > > [<c529e580>] receive_DataRequest+0x0/0x7d0 [drbd] > > [<c52a045c>] drbdd+0x1c/0x150 [drbd] > > [<c52a105a>] drbdd_init+0x7a/0x1a0 [drbd] > > [<c52a7136>] drbd_thread_setup+0x86/0xf0 [drbd] > > [<c52a70b0>] drbd_thread_setup+0x0/0xf0 [drbd] > > [<c0102f75>] kernel_thread_helper+0x5/0x10 
> > Code: 8b 4c 24 30 8b 7c 24 34 8b 91 a0 00 00 00 8b 4c 24 18 0f b7 74 ca 18 8b 4c 24 14 8d 34 06 01 fe 29 ce 8b 7c 24 38 89 d9 c1 e9 02 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 89 04 24 ba 02 00 00 00 89 54 > > <0>Kernel panic - not syncing: Fatal exception in interrupt > > (XEN) Domain 0 crashed: rebooting machine in 5 seconds. -- Stephen G. Traugott (KG6HDQ) Managing Partner, TerraLuna LLC stevegt@TerraLuna.Org -- http://www.t7a.org _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 12/8/06 6:50 am, "Steve Traugott" <stevegt@TerraLuna.Org> wrote:
> Another data point... Dom0 seems to only want to crash when more than
> 3 or 4 domU's are running (each with their own DRBD root, with DRBD
> running in dom0), and the below 'nc' command is run in the last
> domU...
>
> Still looking.

What does the crash look like? There was no oops message from domain0 in the kernel logs that you posted.

-- Keir