Steven Smith
2006-Jul-18 12:51 UTC
[Xen-devel] Paravirtualised drivers for fully virtualised domains
(The list appears to have eaten my previous attempt to send this. Apologies if you receive multiple copies.)

The attached patches allow you to use paravirtualised network and block interfaces from fully virtualised domains, based on Intel's patches from a few months ago. These are significantly faster than the equivalent ioemu devices, sometimes by more than an order of magnitude.

These drivers are explicitly not considered by XenSource to be an alternative to improving the performance of the ioemu devices. Rather, work on both will continue in parallel.

To build, apply the three patches to a clean checkout of xen-unstable and then build Xen, dom0, and the tools in the usual way. To build the drivers themselves, you first need to build a native kernel for the guest, and then go

    cd xen-unstable.hg/unmodified-drivers/linux-2.6
    ./mkbuildtree
    make -C /usr/src/linux-2.6.16 M=$PWD modules

where /usr/src/linux-2.6.16 is the path to the area where you built the guest kernel. This should be a native kernel, and not a xenolinux one. You should end up with four modules. xen-evtchn.ko should be loaded first, followed by xenbus.ko, and then whichever of xen-vnif.ko and xen-vbd.ko you need. None of the modules need any arguments.

The xm configuration syntax is exactly the same as it would be for paravirtualised devices in a paravirtualised domain. For a network interface, you take your line

    vif = [ 'type=ioemu,mac=00:16:3E:C1:CA:78' ]

(or whatever) and replace it with

    vif = [ 'type=ioemu,mac=00:16:3E:C1:CA:78', 'bridge=xenbr0' ]

where bridge=xenbr0 should be some suitable netif configuration string, as it would be in the PV-on-PV case. Disk is likewise fairly simple:

    disk = [ 'file:/path/to/image,ioemu:hda,w' ]

becomes

    disk = [ 'file:/path/to/image,ioemu:hda,w',
             'file:/path/to/some/other/image,hde,w' ]

There is a slight complication in that the paravirtualised block device can't share an IDE controller with an ioemu device, so if you have an ioemu hda, the paravirtualised device must be hde or later. This is to avoid confusing the Linux IDE driver.

Note that having a PV device doesn't imply having a corresponding ioemu device, and vice versa. Configuring a single backing store to appear as both an IDE device and a paravirtualised block device is likely to cause problems; don't do it.

The patches consist of a number of big parts:

-- A version of netback and netfront which can copy packets into domains rather than doing page flipping. It's much easier to make this work well with qemu, since the P2M table doesn't need to change, and it can be faster for some workloads. The copying interface has been confirmed to work in paravirtualised domains, but is currently disabled there.

-- Reworking the device model and hypervisor support so that iorequest completion notifications no longer go to the HVM guest's event channel mask. This avoids a whole slew of really quite nasty race conditions.

-- Adding a new device to the qemu PCI bus which is used for bootstrapping the devices and getting an IRQ.

-- Support for hypercalls from HVM domains.

-- Various shims and fixes to the frontends so that they work without the rest of the xenolinux infrastructure.

The patches still have a few rough edges, and they're not as easy to understand as I'd like, but I think they should be mostly comprehensible and reasonably stable. The plan is to add them to xen-unstable over the next few weeks, probably before 3.0.3, so any testing which anyone can do would be helpful.
The Xen and tools changes are also available as a series of smaller patches at http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/hvm_xen . The composition of these gives hvm_xen_unstable.diff.

Steven.
Ben Thomas
2006-Jul-18 13:45 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Steven,

This is very interesting. Thanks for posting it. It appears to closely parallel work that we've been doing here and have mentioned at times on this list. I believe that Steve Ofsthun posted some suggested patches in this area a little while back. I'm looking forward to seeing how your patches resolve the issues and feedback that was given to Steve.

I'm also interested to see how you resolve the 32/64-bit issues. I know that there's some sensitivity here, as one of the pieces of feedback about our recent XI shadow posting was the current lack of 32-bit support. A 64-bit hypervisor on an hvm-capable machine should be capable of concurrent support of both 32- and 64-bit guest domains.

Thanks for posting this. We're looking forward to seeing your final submissions.

Thanks !
-b
Steve Ofsthun
2006-Jul-18 16:00 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Steven Smith wrote:
> The attached patches allow you to use paravirtualised network and
> block interfaces from fully virtualised domains, based on Intel's
> patches from a few months ago. These are significantly faster than
> the equivalent ioemu devices, sometimes by more than an order of
> magnitude.

Excellent work Steven! I've been working on a similar set of patches and your effort seems quite comprehensive. I do have a few questions:

Can you comment on the testing matrix you used? In particular, does this patch address both 32-bit and 64-bit hypervisors? Can 32-bit guests make 64-bit hypercalls?

Have you built the guest environment on anything other than a 2.6.16 version of Linux? We ran into extra work supporting older linux versions.

You did some work to make xenbus a loadable module in the guest domains. Can this be used to make xenbus loadable in Domain 0?

> These drivers are explicitly not considered by XenSource to be an
> alternative to improving the performance of the ioemu devices.
> Rather, work on both will continue in parallel.

I agree. Both activities are worth developing.

> There is a slight complication in that the paravirtualised block
> device can't share an IDE controller with an ioemu device, so if you
> have an ioemu hda, the paravirtualised device must be hde or later.
> This is to avoid confusing the Linux IDE driver.
>
> Note that having a PV device doesn't imply having a corresponding
> ioemu device, and vice versa. Configuring a single backing store to
> appear as both an IDE device and a paravirtualised block device is
> likely to cause problems; don't do it.

Several problems exist here:

Domain 0 buffer cache coherency issues can cause catastrophic file system corruption. This is due to the backend accessing the backing device directly, and QEMU accessing the device through buffered reads and writes. We are working on a patch to convert QEMU to use O_DIRECT whenever possible. This solves the cache coherency issue.

Actually presenting two copies of the same device to linux can cause its own problems. Mounting using LABEL= will complain about duplicate labels. However, using the device names directly seems to work. With this approach it is possible to decide in the guest whether to mount a device as an emulated disk or a PV disk.

> The patches consist of a number of big parts:
>
> -- A version of netback and netfront which can copy packets into
>    domains rather than doing page flipping. It's much easier to make
>    this work well with qemu, since the P2M table doesn't need to
>    change, and it can be faster for some workloads.

Recent patches to change QEMU to dynamically map memory may make this easier. We still avoid it to prevent large guest pages from being broken up (under the XI shadow code).

> The copying interface has been confirmed to work in paravirtualised
> domains, but is currently disabled there.
>
> -- Reworking the device model and hypervisor support so that iorequest
>    completion notifications no longer go to the HVM guest's event
>    channel mask. This avoids a whole slew of really quite nasty race
>    conditions

This is great news. We were filtering iorequest bits out during guest event notification delivery. Your method is much cleaner.

> -- Adding a new device to the qemu PCI bus which is used for
>    bootstrapping the devices and getting an IRQ.

Have you thought about supporting more than one IRQ? We are experimenting with an IRQ per device class (BUS, NIC, VBD).

> -- Support for hypercalls from HVM domains
>
> -- Various shims and fixes to the frontends so that they work without
>    the rest of the xenolinux infrastructure.
>
> The patches still have a few rough edges, and they're not as easy to
> understand as I'd like, but I think they should be mostly
> comprehensible and reasonably stable. The plan is to add them to
> xen-unstable over the next few weeks, probably before 3.0.3, so any
> testing which anyone can do would be helpful.

This is a very good start!

Steve
--
Steve Ofsthun - Virtual Iron Software, Inc.
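For readers unfamiliar with the technique Steve mentions above: O_DIRECT tells the kernel to bypass the page cache for a file descriptor, which is what removes the dom0-side caching that conflicts with the backend's direct access. The sketch below is not the QEMU patch being discussed; the file path and sizes are invented for illustration, and it only shows the general pattern (O_DIRECT I/O needs suitably aligned buffers, offsets, and lengths).

    /* Minimal illustration of opening a backing file with O_DIRECT so
     * reads bypass the dom0 page cache.  Not QEMU code; the path and
     * sizes are made up for the example. */
    #define _GNU_SOURCE           /* for O_DIRECT on Linux */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/path/to/disk.img", O_RDWR | O_DIRECT);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* O_DIRECT requires the buffer (and normally the offset and
         * length) to be aligned to the device sector size; 512 bytes
         * is a typical choice. */
        void *buf;
        if (posix_memalign(&buf, 512, 4096) != 0) {
            fprintf(stderr, "posix_memalign failed\n");
            close(fd);
            return 1;
        }

        ssize_t n = pread(fd, buf, 4096, 0);   /* read the first 4 KiB */
        printf("read %zd bytes directly, bypassing the page cache\n", n);

        free(buf);
        close(fd);
        return 0;
    }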
Mark Williamson
2006-Jul-18 16:23 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
> > These drivers are explicitly not considered by XenSource to be an
> > alternative to improving the performance of the ioemu devices.
> > Rather, work on both will continue in parallel.
>
> I agree. Both activities are worth developing.

There's lots of stuff still to be done to make the ioemu devices work better; even if some users wish to use PV drivers directly, some will still want the simplicity of working "out of the box".

> Actually presenting two copies of the same device to linux can cause
> its own problems. Mounting using LABEL= will complain about duplicate
> labels. However, using the device names directly seems to work. With
> this approach it is possible to decide in the guest whether to mount
> a device as an emulated disk or a PV disk.

We should *really* have interlocks in dom0 to prevent a guest from accessing both simultaneously :-) Initially, we could just allow the user to configure one model or the other, not both (using a check in Xend, as we do for checking mounted partitions, etc). To support what you propose we'd probably have to add a little control plane stuff, but I think it'd be worth it to avoid too many people damaging stuff!

To mangle a quote I once saw online: duplicate device access can be used to hunt both foot and game, but only one will feed your family.

Cheers,
Mark
Steven Smith
2006-Jul-18 20:34 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
> > The attached patches allow you to use paravirtualised network and
> > block interfaces from fully virtualised domains, based on Intel's
> > patches from a few months ago. These are significantly faster than
> > the equivalent ioemu devices, sometimes by more than an order of
> > magnitude.
> I've been working on a similar set of patches and your effort seems
> quite comprehensive.

Yeah, we (XenSource and Virtual Iron) really need to do a better job of coordinating who's working on what. :)

> I do have a few questions:
>
> Can you comment on the testing matrix you used? In particular, does
> this patch address both 32-bit and 64-bit hypervisors? Can 32-bit
> guests make 64-bit hypercalls?

This set of patches only deals with the 32 bit case. Further, the PAE case depends on Tim Deegan's new shadow mode posted last week. Sorry, I should have said that in the initial post.

> Have you built the guest environment on anything other than a 2.6.16
> version of Linux? We ran into extra work supporting older linux versions.

#ifdef soup will get you back to about 2.6.12-ish without too many problems. These patches don't include that, since it would complicate merging.

> You did some work to make xenbus a loadable module in the guest domains.
> Can this be used to make xenbus loadable in Domain 0?

I can't see any immediate reason why not, but it's not clear to me why that would be useful.

> > There is a slight complication in that the paravirtualised block
> > device can't share an IDE controller with an ioemu device, so if you
> > have an ioemu hda, the paravirtualised device must be hde or later.
> > This is to avoid confusing the Linux IDE driver.
> >
> > Note that having a PV device doesn't imply having a corresponding
> > ioemu device, and vice versa. Configuring a single backing store to
> > appear as both an IDE device and a paravirtualised block device is
> > likely to cause problems; don't do it.
> Domain 0 buffer cache coherency issues can cause catastrophic file
> system corruption. This is due to the backend accessing the backing
> device directly, and QEMU accessing the device through buffered
> reads and writes. We are working on a patch to convert QEMU to use
> O_DIRECT whenever possible. This solves the cache coherency issue.

I wasn't aware of these issues. I was much more worried about domU trying to cache the devices twice, and those caches getting out of sync. It's pretty much the usual problem of configuring a device into two domains and then having them trip over each other. Do you have a plan for dealing with this?

> Actually presenting two copies of the same device to linux can cause
> its own problems. Mounting using LABEL= will complain about duplicate
> labels. However, using the device names directly seems to work. With
> this approach it is possible to decide in the guest whether to mount
> a device as an emulated disk or a PV disk.

My plan here was to just not support VMs which mix paravirtualised and ioemulated devices, requiring the user to load the PV drivers from an initrd. Of course, you have to load the initrd somehow, but the bootloader should only be reading the disk, which makes the coherency issues much easier. As a last resort, rombios could learn about the PV devices, but I'd rather avoid that if possible.

Your way would be preferable, though, if it works.

> > The patches consist of a number of big parts:
> >
> > -- A version of netback and netfront which can copy packets into
> >    domains rather than doing page flipping. It's much easier to make
> >    this work well with qemu, since the P2M table doesn't need to
> >    change, and it can be faster for some workloads.
> Recent patches to change QEMU to dynamically map memory may make this
> easier.

Yes, agreed. It should be possible to add this in later in a backwards-compatible fashion.

> > -- Reworking the device model and hypervisor support so that iorequest
> >    completion notifications no longer go to the HVM guest's event
> >    channel mask. This avoids a whole slew of really quite nasty race
> >    conditions
> This is great news. We were filtering iorequest bits out during guest
> event notification delivery. Your method is much cleaner.

Thank you.

> > -- Adding a new device to the qemu PCI bus which is used for
> >    bootstrapping the devices and getting an IRQ.
> Have you thought about supporting more than one IRQ? We are experimenting
> with an IRQ per device class (BUS, NIC, VBD).

I considered it, but it wasn't obvious that there would be much benefit. You can potentially scan a smaller part of the pending event channel mask, but that's fairly quick already.

Steven.
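For context on Steven's remark about scanning the pending event channel mask: with a single shared IRQ, the guest-side driver works out which virtual device needs attention by walking a bitmap of pending event channels and handling each set bit. The sketch below shows only the generic bit-scanning involved; it is not Xen's actual evtchn code, and the bitmap size, contents, and handler are invented for the example.

    /* Generic sketch of scanning a "pending" bitmap for set bits --
     * illustrative only, not Xen's event channel implementation. */
    #include <stdint.h>
    #include <stdio.h>

    static void handle_port(unsigned port)
    {
        printf("event pending on port %u\n", port);
    }

    static void scan_pending(const uint32_t *pending, unsigned nwords,
                             void (*handle)(unsigned port))
    {
        for (unsigned w = 0; w < nwords; w++) {
            uint32_t bits = pending[w];
            for (unsigned b = 0; bits != 0; b++, bits >>= 1) {
                if (bits & 1u)
                    handle(w * 32 + b);   /* port = word index * 32 + bit */
            }
        }
    }

    int main(void)
    {
        /* Two 32-bit words of "pending" state: ports 1, 5 and 40 set. */
        uint32_t pending[2] = { (1u << 1) | (1u << 5), (1u << 8) };
        scan_pending(pending, 2, handle_port);
        return 0;
    }

Even for a fairly large bitmap this loop is cheap, which is Steven's point: splitting devices across several IRQs mainly buys interrupt concurrency (as Steve Ofsthun notes later), not a faster scan.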
Steve Ofsthun
2006-Jul-18 23:24 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Steven Smith wrote:
>> Have you built the guest environment on anything other than a 2.6.16
>> version of Linux? We ran into extra work supporting older linux versions.
>
> #ifdef soup will get you back to about 2.6.12-ish without too many
> problems. These patches don't include that, since it would complicate
> merging.

I was thinking about SLES9 (2.6.5), RHEL4 (2.6.9), RHEL3 (2.4.21).

>> You did some work to make xenbus a loadable module in the guest domains.
>> Can this be used to make xenbus loadable in Domain 0?
>
> I can't see any immediate reason why not, but it's not clear to me why
> that would be useful.

It just makes it easier to insert alternate bus implementations.

>> Domain 0 buffer cache coherency issues can cause catastrophic file
>> system corruption. This is due to the backend accessing the backing
>> device directly, and QEMU accessing the device through buffered
>> reads and writes. We are working on a patch to convert QEMU to use
>> O_DIRECT whenever possible. This solves the cache coherency issue.
>
> I wasn't aware of these issues. I was much more worried about domU
> trying to cache the devices twice, and those caches getting out of
> sync. It's pretty much the usual problem of configuring a device into
> two domains and then having them trip over each other. Do you have a
> plan for dealing with this?

We eliminate any buffer cache use in domain 0 for backing store objects. This prevents double caching and reduces domain 0's memory footprint. We don't restrict multiple domain access to the same "raw" backing object. Real hardware allows this (at least for SCSI/FC). This may be necessary for shared storage clustering.

>> Actually presenting two copies of the same device to linux can cause
>> its own problems. Mounting using LABEL= will complain about duplicate
>> labels. However, using the device names directly seems to work. With
>> this approach it is possible to decide in the guest whether to mount
>> a device as an emulated disk or a PV disk.
>
> My plan here was to just not support VMs which mix paravirtualised and
> ioemulated devices, requiring the user to load the PV drivers from an
> initrd. Of course, you have to load the initrd somehow, but the
> bootloader should only be reading the disk, which makes the coherency
> issues much easier. As a last resort, rombios could learn about the
> PV devices, but I'd rather avoid that if possible.
>
> Your way would be preferable, though, if it works.

We currently only allow this for the boot device (mainly to avoid the rombios work you mention). In addition, we make the qemu device only visible to the rombios (and not the guest O/S) by controlling the IDE probe logic in qemu.

>>> -- Adding a new device to the qemu PCI bus which is used for
>>>    bootstrapping the devices and getting an IRQ.
>>
>> Have you thought about supporting more than one IRQ? We are experimenting
>> with an IRQ per device class (BUS, NIC, VBD).
>
> I considered it, but it wasn't obvious that there would be much
> benefit. You can potentially scan a smaller part of the pending event
> channel mask, but that's fairly quick already.

The main benefit we see is for legacy Linux variants that limit 1 CPU per IRQ. Allowing additional IRQs increases the possible interrupt processing concurrency. In addition, one interrupt class can't starve another (on SMP guests).

Steve
--
Steve Ofsthun - Virtual Iron Software, Inc.
Ian Pratt
2006-Jul-19 04:14 UTC
RE: [Xen-devel] Paravirtualised drivers for fully virtualised domains
> >> Have you built the guest environment on anything other than a 2.6.16
> >> version of Linux? We ran into extra work supporting older linux
> >> versions.
> >
> > #ifdef soup will get you back to about 2.6.12-ish without too many
> > problems. These patches don't include that, since it would complicate
> > merging.
>
> I was thinking about SLES9 (2.6.5), RHEL4 (2.6.9), RHEL3 (2.4.21).

Steven's patches should be easy to back port given that we already have real PV drivers for all these kernels. Source for strictly unofficial (non vendor supported) xen-ports of these kernels is available at http://xenbits.xensource.com/kernels : 2.6.5 sles9sp2; 2.6.9 rhel4u1; 2.4.21 rhel3u5.

Ian
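As an aside for anyone attempting such a backport: the "#ifdef soup" Steven mentions is normally keyed off the kernel version macros, roughly as sketched below. This is only an illustration of the pattern, not code from the PV-on-HVM patches or from the xenbits kernel ports; the module name is invented and the version-specific wrappers are deliberately left empty.

    /* Sketch of the usual kernel-version compatibility pattern
     * ("#ifdef soup").  Illustration only. */
    #include <linux/version.h>
    #include <linux/kernel.h>
    #include <linux/init.h>
    #include <linux/module.h>

    #if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 12)
    /* Older kernels: interfaces the 2.6.16-era frontends expect may be
     * missing or spelled differently, so small wrapper definitions
     * would go here (the wrappers are driver-specific and omitted). */
    #endif

    static int __init compat_demo_init(void)
    {
        printk(KERN_INFO "built against kernel headers %06x\n",
               LINUX_VERSION_CODE);
        return 0;
    }

    static void __exit compat_demo_exit(void)
    {
    }

    module_init(compat_demo_init);
    module_exit(compat_demo_exit);
    MODULE_LICENSE("GPL");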
Gerd Hoffmann
2006-Jul-19 06:50 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Steve Ofsthun wrote:
> Steven Smith wrote:
>
>>> Have you built the guest environment on anything other than a 2.6.16
>>> version of Linux? We ran into extra work supporting older linux
>>> versions.
>>
>> #ifdef soup will get you back to about 2.6.12-ish without too many
>> problems. These patches don't include that, since it would complicate
>> merging.
>
> I was thinking about SLES9 (2.6.5), RHEL4 (2.6.9), RHEL3 (2.4.21).

SLES9 SP3 kernels available here:
http://forge.novell.com/modules/xfcontent/downloads.php/xenpreview/SUSE%20Linux%20Enterprise%20Server/9%20SP3/

I have a sles9 guest up and running on a sles10 host machine.

cheers,
  Gerd

--
Gerd Hoffmann <kraxel@suse.de>
http://www.suse.de/~kraxel/julika-dora.jpeg
Steven Smith
2006-Jul-26 15:34 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
I've just put an updated version of these patches up at http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/rev2 . There's also an equivalent single big patch at http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/rev2.combined . Thank you to everyone who gave feedback on the previous version.

The main changes since last time are:

-- Support for SMP guests
-- Support for 64 bit guests on a 64 bit hypervisor
-- Partial support for 32 bit guests on a 64 bit hypervisor: the network interface works, but the block device doesn't.

The block device can be made to work by #define'ing ALIEN_INTERFACES in blkif.h, but drivers compiled in that way won't work with 32 on 32. The problem here is that blkif_request_t contains extra padding in 64 bit builds, and so is a different size, and so the block ring layout is different.

Other structures with similar problems are handled either by run time tests in the drivers (shared_info_t) or translation wrappers in the hypervisor (xen_feature_info_t, xen_add_to_physmap_t), but trying to do this for the block rings would require far more painful and extensive surgery. I'm inclined to stick with multiply compiling the frontend drivers in the short term, although it'll obviously need doing in a slightly less grotty way.

Steven.
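To illustrate the padding problem Steven describes: a struct containing a 64-bit member is laid out differently by 32-bit and 64-bit compilers, because uint64_t is 4-byte aligned on i386 but 8-byte aligned on x86-64, so a shared ring of such requests does not line up between a 32-bit frontend and a 64-bit backend. The sketch below uses simplified, made-up fields rather than the real blkif_request_t definition.

    /* Hedged illustration of the 32-vs-64-bit layout problem; the
     * struct is simplified and does not reproduce the real
     * blkif_request_t. */
    #include <stdint.h>
    #include <stdio.h>

    struct fake_blkif_request {
        uint8_t  operation;     /* offset 0 on both builds              */
        uint8_t  nr_segments;   /* offset 1 on both builds              */
        uint16_t handle;        /* offset 2 on both builds              */
        /* uint64_t is 4-byte aligned on i386 but 8-byte aligned on
         * x86-64, so different padding is inserted here.              */
        uint64_t id;            /* offset 4 (i386) vs offset 8 (x86-64) */
    };

    int main(void)
    {
        /* Typically prints 12 when built with -m32 and 16 with -m64,
         * which is why a ring of these has a different layout in the
         * two cases. */
        printf("sizeof(struct fake_blkif_request) = %zu\n",
               sizeof(struct fake_blkif_request));
        return 0;
    }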
Nakajima, Jun
2006-Jul-26 22:35 UTC
RE: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Steven Smith wrote:
> I've just put an updated version of these patches up at
> http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/rev2 . There's also an
> equivalent single big patch at
> http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/rev2.combined . Thank you to
> everyone who gave feedback on the previous version.
>
> The main changes since last time are:
>
> -- Support for SMP guests
> -- Support for 64 bit guests on a 64 bit hypervisor
> -- Partial support for 32 bit guests on a 64 bit hypervisor: the
>    network interface works, but the block device doesn't.
>
> The block device can be made to work by #define'ing ALIEN_INTERFACES
> in blkif.h, but drivers compiled in that way won't work with 32 on 32.
> The problem here is that blkif_request_t contains extra padding in 64
> bit builds, and so is a different size, and so the block ring layout
> is different.

When do you expect this to be in the unstable tree? Or which issues must be resolved before that?

Jun
---
Intel Open Source Technology Center
He, Qing
2006-Aug-02 08:01 UTC
RE: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Hi Steven,

I found some issues regarding this patch. When I'm trying to start windows as a VMX guest (with no drivers, of course) under this patch, the guests fail. I ran with three images: windows 2000, XP and 2003.

For 2000 and XP, the QEMU windows do not show, and there are two lines in the serial output:

    (XEN) Create event channels for vcpu 0.
    (XEN) Send on unbound Xen event channel?

For the 2003 guest, QEMU can start, but before the windows start screen shows, it crashes and restarts, complaining about unreasonable mmio opcodes. The serial output is:

    (XEN) (GUEST: 1) unsupported PCI BIOS function 0x0E
    (XEN) (GUEST: 1) int13_harddisk: function 15, unmapped device for ELDL=82
    (XEN) 0, This opcode isn't handled yet!
    (XEN) handle_mmio: failed to decode instruction
    (XEN) mmio opcode: va 0xf821f600, gpa 0xa9600, len 2: 00 00
    (XEN) domain_crash_sync called from platform.c:880
    (XEN) Domain 1 (vcpu#0) crashed on cpu#2:
    (XEN) ----[ Xen-3.0-unstable  Not tainted ]----
    (XEN) CPU:   2
    (XEN) EIP:   0008:[<8081d986>]
    (XEN) EFLAGS: 00010202 CONTEXT: hvm
    (XEN) eax: 00008008  ebx: 000003ce  ecx: 000003ce  edx: f821f600
    (XEN) esi: 8081d9fa  edi: f886ecd0  ebp: f886ecfc  esp: f886ecbc
    (XEN) cr0: 8001003b  cr3: 8f500000
    (XEN) ds: 0023  es: 0023  fs: 0030  gs: 0000  ss: 0010  cs: 0008
    (XEN) Create event channels for vcpu 0.
    (XEN) Send on unbound Xen event channel?
    (XEN) (GUEST: 2) HVM Loader
    (XEN) (GUEST: 2) Loading ROMBIOS ...
    (XEN) (GUEST: 2) Loading Cirrus VGABIOS ...
    (XEN) (GUEST: 2) Loading VMXAssist ...
    (XEN) (GUEST: 2) VMX go ...
    (XEN) (GUEST: 2) VMXAssist (Aug  2 2006)
    (XEN) (GUEST: 2) Memory size 512 MB
    (XEN) (GUEST: 2) E820 map:
    (XEN) (GUEST: 2) 0000000000000000 - 000000000009F800 (RAM)
    (XEN) (GUEST: 2) 000000000009F800 - 00000000000A0000 (Reserved)
    (XEN) (GUEST: 2) 00000000000A0000 - 00000000000C0000 (Type 16)
    (XEN) (GUEST: 2) 00000000000F0000 - 0000000000100000 (Reserved)
    (XEN) (GUEST: 2) 0000000000100000 - 000000001FFFE000 (RAM)
    (XEN) (GUEST: 2) 000000001FFFE000 - 000000001FFFF000 (Type 18)
    (XEN) (GUEST: 2) 000000001FFFF000 - 0000000020000000 (Type 17)
    (XEN) (GUEST: 2) 0000000020000000 - 0000000020003000 (ACPI NVS)
    (XEN) (GUEST: 2) 0000000020003000 - 000000002000D000 (ACPI Data)
    (XEN) (GUEST: 2) 00000000FEC00000 - 0000000100000000 (Type 16)
    (XEN) (GUEST: 2)
    (XEN) (GUEST: 2) Start BIOS ...
    (XEN) (GUEST: 2) Starting emulated 16-bit real-mode: ip=F000:FFF0
    (XEN) (GUEST: 2) rombios.c,v 1.138 2005/05/07 15:55:26 vruppert Exp $
    (XEN) (GUEST: 2) Remapping master: ICW2 0x8 -> 0x20
    (XEN) (GUEST: 2) Remapping slave: ICW2 0x70 -> 0x28
    (XEN) (GUEST: 2) VGABios $Id: vgabios.c,v 1.61 2005/05/24 16:50:50 vruppert Exp $
    (XEN) (GUEST: 2) HVMAssist BIOS, 1 cpu, $Revision: 1.138 $ $Date: 2005/05/07 15:55:26 $
    (XEN) (GUEST: 2)
    (XEN) (GUEST: 2) ata0-0: PCHS=16383/16/63 translation=lba LCHS=1024/255/63
    (XEN) (GUEST: 2) ata0 master: QEMU HARDDISK ATA-7 Hard-Disk (12289 MBytes)
    (XEN) (GUEST: 2) ata0-1: PCHS=3047/16/63 translation=lba LCHS=761/64/63
    (XEN) (GUEST: 2) ata0 slave: QEMU HARDDISK ATA-7 Hard-Disk (1500 MBytes)
    (XEN) (GUEST: 2) ata1 master: QEMU CD-ROM ATAPI-4 CD-Rom/DVD-Rom
    (XEN) (GUEST: 2) ata1 slave: Unknown device
    (XEN) (GUEST: 2)
    (XEN) (GUEST: 2) Booting from CD-Rom...
    (XEN) (GUEST: 2) unsupported PCI BIOS function 0x0E
    (XEN) (GUEST: 2) int13_harddisk: function 15, unmapped device for ELDL=82
    (XEN) 0, This opcode isn't handled yet!
    (XEN) handle_mmio: failed to decode instruction
    (XEN) mmio opcode: va 0xf821f600, gpa 0xa9600, len 2: 00 00
    (XEN) domain_crash_sync called from platform.c:880
    (XEN) Domain 2 (vcpu#0) crashed on cpu#2:
    (XEN) ----[ Xen-3.0-unstable  Not tainted ]----
    (XEN) CPU:   2
    (XEN) EIP:   0008:[<8081d986>]
    (XEN) EFLAGS: 00010202 CONTEXT: hvm
    (XEN) eax: 00008008  ebx: 000003ce  ecx: 000003ce  edx: f821f600
    (XEN) esi: 8081d9fa  edi: f886ecd0  ebp: f886ecfc  esp: f886ecbc
    (XEN) cr0: 8001003b  cr3: 2ded8000
    (XEN) ds: 0023  es: 0023  fs: 0030  gs: 0000  ss: 0010  cs: 0008

Meanwhile, I don't experience any problems for Linux guests. Do you have any ideas why this happens?

Best regards,
Qing He
Zhao, Yunfeng
2006-Aug-02 08:23 UTC
RE: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Qing,

Your problem should be a problem of the credit scheduler. If you use sedf or bvt, you would not meet the problem.

Thanks,
Yunfeng
Steven Hand
2006-Aug-02 08:56 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
>> I found some issues regarding this patch.
>> When I'm trying to start windows as a VMX guest (with no drivers, of course)
>> under this patch, the guests fail. I ran with three images: windows 2000,
>> XP and 2003.
>>
>> For 2000 and XP, the QEMU windows do not show, and there are two lines in the
>> serial output:
>>   (XEN) Create event channels for vcpu 0.
>>   (XEN) Send on unbound Xen event channel?

As you note yourself, this problem is very unlikely to be anything to do with the new PV drivers or associated infrastructure. I think your problem is more likely to be a race condition in the startup of the qemu-dm helper process. If you check the latest /var/log/qemu-*.log and see a message something like "xc_get_pfnlist returned -1", this means that qemu-dm tried to interrogate the domain before it had been created. (I've seen this myself before, but only on non-debug builds of Xen.)

We should fix this properly, but for now you can just retry.

cheers,

S.
Steven Smith
2006-Aug-02 09:30 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
> When I'm trying to start windows as a VMX guest (with no drivers, of
> course) under this patch, the guests fail. I ran with three images:
> windows 2000, XP and 2003.
>
> For 2000 and XP, QEMU windows do not show, there are two lines in the serial output:
>   (XEN) Create event channels for vcpu 0.
>   (XEN) Send on unbound Xen event channel?

Is there anything interesting in /var/log/qemu-dm.* ? Only the most recent log file is relevant (which isn't necessarily the one with the highest number, unfortunately).

Also, it looks like this is crashing too soon for it to be related to what guest you're running. Are all of the disk images the same type (file vs. block device) and size?

> For 2003 guest, QEMU can start, but before the windows start screen
> shows, it crashes and restarts, complaining about unreasonable mmio
> opcodes. The serial output is:
>
>   (XEN) (GUEST: 1) unsupported PCI BIOS function 0x0E
>   (XEN) (GUEST: 1) int13_harddisk: function 15, unmapped device for ELDL=82
>   (XEN) 0, This opcode isn't handled yet!
>   (XEN) handle_mmio: failed to decode instruction
>   (XEN) mmio opcode: va 0xf821f600, gpa 0xa9600, len 2: 00 00
>   (XEN) domain_crash_sync called from platform.c:880

This looks like a problem with hvm_copy. Is this a PAE hypervisor?

> Meanwhile, I don't experience any problems for Linux guests. Do you
> have any ideas why this happens?

Some kind of race would be my first guess.

Steven.
Steven Smith
2006-Aug-02 09:37 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
> >> For 2000 and XP, the QEMU windows do not show, and there are two lines in the
> >> serial output:
> >>   (XEN) Create event channels for vcpu 0.
> >>   (XEN) Send on unbound Xen event channel?
> As you note yourself, this problem is very unlikely to be anything to do
> with the new PV drivers or associated infrastructure.

You'd think that, but the PV patch changes the way we send requests to the device model, and could have produced this behaviour. I thought I'd got all of the relevant bugs fixed, but apparently not.

Steven.
He, Qing
2006-Aug-02 09:49 UTC
RE: [Xen-devel] Paravirtualised drivers for fully virtualised domains
>> When I'm trying to start windows as a VMX guest (with no drivers, of
>> course) under this patch, the guests fail. I ran with three images:
>> windows 2000, XP and 2003.
>>
>> For 2000 and XP, QEMU windows do not show, there are two lines in the serial output:
>>   (XEN) Create event channels for vcpu 0.
>>   (XEN) Send on unbound Xen event channel?
> Is there anything interesting in /var/log/qemu-dm.* ? Only the most
> recent log file is relevant (which isn't necessarily the one with the
> highest number, unfortunately).
>
> Also, it looks like this is crashing too soon for it to be related to
> what guest you're running. Are all of the disk images the same type
> (file vs. block device) and size?

Sorry, these two cases were some kind of configuration error: some qemu parameters changed after the qemu update, and I can get them to run using an earlier changeset. So when they could not boot, I didn't think of the possibility of configuration errors. After changing the configuration, they can boot now (I haven't tested whether they meet the same problem as below).

>> For 2003 guest, QEMU can start, but before the windows start screen
>> shows, it crashes and restarts, complaining about unreasonable mmio
>> opcodes. The serial output is:
>>
>>   (XEN) (GUEST: 1) unsupported PCI BIOS function 0x0E
>>   (XEN) (GUEST: 1) int13_harddisk: function 15, unmapped device for ELDL=82
>>   (XEN) 0, This opcode isn't handled yet!
>>   (XEN) handle_mmio: failed to decode instruction
>>   (XEN) mmio opcode: va 0xf821f600, gpa 0xa9600, len 2: 00 00
>>   (XEN) domain_crash_sync called from platform.c:880
> This looks like a problem with hvm_copy. Is this a PAE hypervisor?

Your patch is based on Cset 10735. Before applying the patch, I can start and run the image with no problems; but after the patch, this problem can be reproduced every time. It's not a PAE hypervisor, and the qemu log doesn't show much information:

    domid: 1
    qemu: the number of cpus is 1
    shared page at pfn:1ffff, mfn: 3e35f
    char device redirected to /dev/pts/2

>> Meanwhile, I don't experience any problems for Linux guests. Do you
>> have any ideas why this happens?
> Some kind of race would be my first guess.

Best regards,
Qing
He, Qing
2006-Aug-02 10:35 UTC
RE: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Thanks Steven, with this patch, the problem's gone.

Qing

>> >> (XEN) 0, This opcode isn't handled yet!
>> >> (XEN) handle_mmio: failed to decode instruction
>> >> (XEN) mmio opcode: va 0xf821f600, gpa 0xa9600, len 2: 00 00
>> >> (XEN) domain_crash_sync called from platform.c:880
>> > This looks like a problem with hvm_copy. Is this a PAE hypervisor?
>> >
>> Your patch is based on Cset 10735. Before applying the patch, I can
>> start and run the image with no problems; but after the patch, this
>> problem can be reproduced every time.
> Sorry, I wasn't trying to shift blame here: the patch I posted
> includes some changes to hvm_copy in the non-PAE case, and I suspect
> that it's those which are causing these problems. Does the attached
> patch help?
>
> (Apply it over the top of the ones I posted previously)
>
> Steven
Himanshu Raj
2006-Aug-03 06:59 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Hi Folks,

Could any of you suggest a good recent linux kernel to work with? The latest 2.6.16.13 native version doesn't boot as an hvm guest (no errors, it just doesn't show anything on its boot console; xm list reports it running, though). If you have any hints on this problem, let me know.

Thanks,
Himanshu

--
-------------------------------------------------------------------------
Himanshu Raj
PhD Student, GaTech (www.cc.gatech.edu/~rhim)
I prefer to receive attachments in an open, non-proprietary format.
-------------------------------------------------------------------------
Steven Smith
2006-Aug-03 09:35 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
> Could any of you suggest a good recent linux kernel to work with?
> The latest 2.6.16.13 native version doesn't boot as an hvm guest (no
> errors, it just doesn't show anything on its boot console; xm list
> reports it running, though). If you have any hints on this problem,
> let me know.

2.6.16.13 should work. Is the failure mode the same both with and without my patches?

Steven.
Himanshu Raj
2006-Aug-04 06:13 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Yes, it is the case (i.e. an hvm guest without your patch gets stuck the same way). This happens with the latest version. I have only tried the RHEL4 or Debian Sarge guest kernels before (both <= 2.6.9) and they work fine.

The failure is also quite intriguing - the guest just sort of goes into an infinite loop, with no further output on the vnc console (or serial), and nothing pops up in the usual logs (qemu-dm.log, xend.log, xend-debug.log ...). Kind of stumped by it. Kindly let me know if anything comes to mind.

Thanks,
Himanshu

--
-------------------------------------------------------------------------
Himanshu Raj
PhD Student, GaTech (www.cc.gatech.edu/~rhim)
I prefer to receive attachments in an open, non-proprietary format.
-------------------------------------------------------------------------
Steven Smith
2006-Aug-08 09:42 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
I just put a new version of the PV-on-HVM patches up at http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/rev8 . These are against 10968:51c227428166 and are otherwise largely unchanged from the previous versions.

Steven.
Steve Dobbelstein
2006-Aug-09 18:05 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains
Steven Smith <sos22-xen@srcf.ucam.org> wrote on 08/08/2006 04:42:15 AM:

> I just put a new version of the PV-on-HVM patches up at
> http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/rev8 . These are against
> 10968:51c227428166 and are otherwise largely unchanged from the
> previous versions.
>
> Steven.

I have been running some informal performance tests on the rev8 patches. Thought I'd share my findings thus far.

I am finding that disk performance (sequential/random read/write) with the PV xen-vbd driver in an HVM domain is pretty much equal to that of a PV domain. Cool. Not surprising, but cool nonetheless.

At the moment I'm having trouble running a network test (netperf) of the PV xen-vnif driver within our testing framework. I'll post those findings when I get some reliable numbers. Testing on the rev2 version of the patches showed pretty much equal network performance between running on a PV driver in an HVM domain and a PV domain.

I am noticing two odd behaviors with the rev8 patches, though.

1. When I try to create a PV domain, the domain hangs on bootup displaying repeated messages to the console:

    netfront: Bad rx response id 1.
    netfront: Bad rx response id 0.
    netfront: Bad rx response id 1.
    netfront: Bad rx response id 0.
    ...

I had to reboot from an unpatched changeset 10968 build to get the performance numbers for a PV domain. (Hence, I am not comparing numbers from the exact same code base, which is one reason why the tests are "informal".) I haven't dug into the cause of this problem yet.

2. When I destroy the HVM domain it stays in the zombie state.

    dib:~ # xm list
    Name             ID Mem(MiB) VCPUs State  Time(s)
    Domain-0          0      768     1 r-----  2328.4
    Zombie-hvm1       1      768     1 -----d  1502.6

I'm not sure how to debug this one. Any pointers would be helpful.

Steve D.
Steven Smith
2006-Aug-10 11:08 UTC
[Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
I just put a new version of the PV-on-HVM patches up at http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/rev9 . These are against 10968:51c227428166, as before. Hopefully, the problems some people have been having with network access from paravirtualised domains and domains becoming zombies are now fixed.

Thanks to everyone who submitted bug reports on these.

Steven.
Steve Dobbelstein
2006-Aug-10 21:48 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
Steven Smith <sos22-xen@srcf.ucam.org> wrote on 08/10/2006 06:08:38 AM:> I just put a new version of the PV-on-HVM patches up at > http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/rev9 . These are against > 10968:51c227428166, as before. Hopefully, the problems some people > have been having with network access from paravirtualised domains and > domains becoming zombies are now fixed. > > Thanks to everyone who submitted bug reports on these.Hi, Steve. Thought I''d share my findings so far with rev9. The good news is that I don''t get zombies anymore. The bad news is that I''m still getting very poor network performance running netperf, worse than a fully virtualized domain. I thought it was something wrong with my test setup when I was testing rev8, but the test setup looks good and the results are repeatable. Here is what I have found so far in trying to chase down the cause of the slowdown. The qemu-dm process is running 99.9% of the CPU on dom0. I ran xenoprofile to see what functions are chewing up the most time. Here are the first several lines of output from the xenoprofile report: 1316786 17.1956 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up system_call 1243487 16.2385 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up do_select 492967 6.4376 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up do_gettimeofday 467692 6.1075 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up sys_select 376844 4.9211 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up fget 330483 4.3157 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up sys_clock_gettime 291153 3.8021 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up ktime_get_ts 291098 3.8014 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up memset 249732 3.2612 xen-unstable-syms xen-unstable-syms write_cr3 195102 2.5478 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up fget_light 190663 2.4898 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up __kmalloc 183748 2.3995 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up tty_poll 152136 1.9867 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up copy_user_generic 129317 1.6887 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up tun_chr_poll 115066 1.5026 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up getnstimeofday 94228 1.2305 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up wait_for_completion_interruptible 85598 1.1178 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up copy_from_user 83495 1.0903 qemu-dm qemu-dm qemu_run_timers 82606 1.0787 xen-unstable-syms xen-unstable-syms syscall_enter 82507 1.0774 xen-unstable-syms xen-unstable-syms FLT2 76960 1.0050 qemu-dm qemu-dm main_loop_wait 71759 0.9371 xen-unstable-syms xen-unstable-syms toggle_guest_mode 47744 0.6235 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up sys_read 44890 0.5862 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up pipe_poll 40506 0.5290 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up pty_chars_in_buffer 40210 0.5251 librt-2.4.so librt-2.4.so clock_gettime 37866 0.4945 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up normal_poll 35160 0.4591 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up tty_paranoia_check 34715 0.4533 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up poll_initwait 34225 0.4469 xen-unstable-syms xen-unstable-syms test_guest_events 32643 0.4263 xen-unstable-syms xen-unstable-syms restore_all_guest 31101 0.4061 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up posix_ktime_get_ts 29352 0.3833 qemu-dm qemu-dm DMA_run 27741 0.3623 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up 
vfs_read 27443 0.3584 papps1-syms papps1-syms (no symbols) 26663 0.3482 qemu-dm qemu-dm main_loop 26283 0.3432 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up kfree 24446 0.3192 xen-unstable-syms xen-unstable-syms __copy_from_user_ll 23117 0.3019 xen-unstable-syms xen-unstable-syms do_iret 22559 0.2946 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up fput 20354 0.2658 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up __wake_up_common 19516 0.2549 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up tty_ldisc_deref 19290 0.2519 xen-unstable-syms xen-unstable-syms test_all_events 18499 0.2416 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up rw_verify_area 17759 0.2319 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up __wake_up 13282 0.1734 xen-unstable-syms xen-unstable-syms create_bounce_frame 11968 0.1563 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up hypercall_page 11211 0.1464 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up set_normalized_timespec 11127 0.1453 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up sysret_check 10494 0.1370 xen-unstable-syms xen-unstable-syms FLT131 10467 0.1367 pxen1-syms pxen1-syms vmx_asm_vmexit_handler 9478 0.1238 libpthread-2.4.so libpthread-2.4.so __pthread_disable_asynccancel 9260 0.1209 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up copy_to_user 9222 0.1204 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up sync_buffer 8616 0.1125 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up tty_ldisc_ref_wait 7985 0.1043 oprofiled oprofiled odb_insert 6862 0.0896 libpthread-2.4.so libpthread-2.4.so __read_nocancel 6806 0.0889 xen-unstable-syms xen-unstable-syms FLT3 6676 0.0872 qemu-dm qemu-dm cpu_get_clock 6576 0.0859 pxen1-syms pxen1-syms vmx_load_cpu_guest_regs 6450 0.0842 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up evtchn_poll 6349 0.0829 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up pty_write_room 5978 0.0781 qemu-dm qemu-dm qemu_get_clock 5906 0.0771 pxen1-syms pxen1-syms resync_all 5745 0.0750 oprofiled oprofiled opd_process_samples 5738 0.0749 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up tty_ldisc_try 4943 0.0645 oprofiled oprofiled sfile_find 4803 0.0627 xen-unstable-syms xen-unstable-syms pit_read_counter 4194 0.0548 xen-unstable-syms xen-unstable-syms copy_from_user 4007 0.0523 pxen1-syms pxen1-syms vmx_store_cpu_guest_regs 3838 0.0501 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up n_tty_chars_in_buffer 3507 0.0458 xen-unstable-syms xen-unstable-syms FLT6 3501 0.0457 xen-unstable-syms xen-unstable-syms FLT4 3474 0.0454 oprofiled oprofiled pop_buffer_value 3436 0.0449 xen-unstable-syms xen-unstable-syms FLT11 3283 0.0429 vmlinux-2.6.16.13-xen0-up vmlinux-2.6.16.13-xen0-up poll_freewait 3260 0.0426 pxen1-syms pxen1-syms vmx_vmexit_handler xen-unstable-syms is the Xen hypervisor running on behalf of dom0. pxen1-syms is the Xen hypervisor running on behalf of the HVM domain. vmlinux-2.6.16.13-xen0-up is the kernel running in dom0. It appears that a lot of time is spent running timers and getting the current time. Not being familiar with the code, I am now crawling through it to see how timers are handled and how the xen-vnif PV driver uses them. I''m also looking for potential differences between rev2 and rev8 since the network performance of rev2 was pretty equal to that of a PV domain. Knowing the code, you may have a solution before I find the problem. Steve D. P.S. This just in from a test running while I typed the above. I noticed that qemu will start a "gui_timer" when VNC is not used. 
I normally run without graphics (nographic=1 in the domain config file). I changed the config file to use VNC. The qemu-dm CPU utilization in dom0 dropped to below 10%. The network performance improved from 0.19 Mb/s to 9.75 Mb/s (still less than the 23.07 Mb/s for a fully virtualized domain). It appears there is some interaction between using the xen-vnif driver and the qemu timer code. I'm still exploring.
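A profile like the one above can be collected with a xenoprof-patched OProfile. The exact opcontrol flags depend on which xenoprof patches you have, so treat this only as a sketch; the --xen and --active-domains options and the syms paths in particular are assumptions about the local setup:

    opcontrol --reset
    opcontrol --start-daemon --xen=/boot/xen-syms \
              --vmlinux=/boot/vmlinux-syms-2.6.16.13-xen0 --active-domains=1
    opcontrol --start
    # ... run the netperf/netpipe test ...
    opcontrol --shutdown
    opreport -l          # per-symbol breakdown, like the report quoted above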
Steven Smith
2006-Aug-11 10:17 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
> Here is what I have found so far in trying to chase down the cause of the
> slowdown.
> The qemu-dm process is running 99.9% of the CPU on dom0.
That seems very wrong. When I try this, the device model is almost completely idle. Could you see what strace says, please, or if there are any strange messages in the /var/log/qemu-dm. file?

> It appears that a lot of time is spent running timers and getting the
> current time. Not being familiar with the code, I am now crawling through
> it to see how timers are handled and how the xen-vnif PV driver uses them.
Timer handling isn't really changed by any of these patches. Patch 02.ioemu_xen_evtchns.diff is in vaguely the same area, but I can't see how it could cause the problems you're seeing, assuming your hypervisor and libxc are up to date.

What changeset of xen-unstable did you apply the patches to?

> P.S. This just in from a test running while I typed the above. I noticed
> that qemu will start a "gui_timer" when VNC is not used. I normally run
> without graphics (nographic=1 in the domain config file). I changed the
> config file to use VNC. The qemu-dm CPU utilization in dom0 dropped to
> below 10%. The network performance improved from 0.19 Mb/s to 9.75 Mb/s
> (still less than the 23.07 Mb/s for a fully virtualized domain).
When I try this, I see about 1600Mb/s between dom0 and a paravirtualised domU, about 30Mb/s between dom0 and an ioemu domU, and about 1200Mb/s between dom0 and an HVM domU running these drivers, all collected using netpipe-tcp. That is a regression, but much smaller than you're seeing.

There are a couple of obvious things to check:

1) Do the statistics reported by ifconfig show any errors?
2) How often is the event channel interrupt firing according to
   /proc/interrupts? I see about 50k-150k/second.
3) Is there any packet loss when you ping a domain? Start your test
   and run a ping in parallel.

The other thing is that these drivers seem to be very sensitive to kernel debugging options in the domU. If you've got anything enabled in the kernel hacking menu it might be worth trying again with that switched off.

> It appears there is some interaction between using the xen-vnif
> driver and the qemu timer code. I'm still exploring.
I'd be happier if I could reproduce this problem here. Are you running SMP? PAE? 64 bit? What kernel are you running in the domU?

Steven.
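For anyone else following along, the three checks (plus the strace Steven asks for) translate into something like this, run while the benchmark is going; the interface names and domain IP are placeholders:

    # 1) per-interface error/drop counters, in the domU and on the dom0 vif
    ifconfig eth0
    ifconfig vif1.0

    # 2) event channel interrupt rate: snapshot /proc/interrupts twice, a second apart,
    #    and compare the counts on the line used by the PV drivers
    cat /proc/interrupts; sleep 1; cat /proc/interrupts

    # 3) packet loss, with the test running in parallel
    ping -c 100 <domU-ip>

    # and the device model's syscall profile
    strace -c -p $(pgrep qemu-dm)      # stop with ^C to get the summary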
Harry Butterworth
2006-Aug-11 10:31 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
On Fri, 2006-08-11 at 11:17 +0100, Steven Smith wrote:
> > Here is what I have found so far in trying to chase down the cause of the
> > slowdown.
> > The qemu-dm process is running 99.9% of the CPU on dom0.
> That seems very wrong. When I try this, the device model is almost
> completely idle. Could you see what strace says, please, or if there
> are any strange messages in the /var/log/qemu-dm. file?

I haven't tried the patches being discussed in this thread but I'm seeing similar problems with qemu-dm anyway... I've been looking into bugzilla 725 and I'm also seeing 100% cpu usage by qemu-dm. xm-test uses the nographic flag and I find that if this is not set then the cpu usage drops to normal levels and the test passes.

> > It appears that a lot of time is spent running timers and getting the
> > current time.

Yes, this is what I was seeing with the nographic flag set.

> > Not being familiar with the code, I am now crawling through
> > it to see how timers are handled and how the xen-vnif PV driver uses them.
> Timer handling isn't really changed by any of these patches. Patch
> 02.ioemu_xen_evtchns.diff is in vaguely the same area, but I can't see
> how it could cause the problems you're seeing, assuming your
> hypervisor and libxc are up to date.
>
> What changeset of xen-unstable did you apply the patches to?

I've been seeing the problem on recent unstable changesets without the patches. Changesets 10992, 10949 for example.

> > P.S. This just in from a test running while I typed the above. I noticed
> > that qemu will start a "gui_timer" when VNC is not used. I normally run
> > without graphics (nographic=1 in the domain config file). I changed the
> > config file to use VNC. The qemu-dm CPU utilization in dom0 dropped to
> > below 10%.

Yep, that's what I see without the patches.

> > The network performance improved from 0.19 Mb/s to 9.75 Mb/s
> > (still less than the 23.07 Mb/s for a fully virtualized domain).
> When I try this, I see about 1600Mb/s between dom0 and a
> paravirtualised domU, about 30Mb/s between dom0 and an ioemu domU, and
> about 1200Mb/s between dom0 and an HVM domU running these drivers, all
> collected using netpipe-tcp. That is a regression, but much smaller
> than you're seeing.
>
> There are a couple of obvious things to check:
>
> 1) Do the statistics reported by ifconfig show any errors?
> 2) How often is the event channel interrupt firing according to
>    /proc/interrupts? I see about 50k-150k/second.
> 3) Is there any packet loss when you ping a domain? Start your test
>    and run a ping in parallel.
>
> The other thing is that these drivers seem to be very sensitive to
> kernel debugging options in the domU. If you've got anything enabled
> in the kernel hacking menu it might be worth trying again with that
> switched off.
>
> > It appears there is some interaction between using the xen-vnif
> > driver and the qemu timer code. I'm still exploring.
> I'd be happier if I could reproduce this problem here. Are you
> running SMP? PAE? 64 bit? What kernel are you running in the domU?
>
> Steven.
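The nographic observation above boils down to which display options the HVM config file uses, roughly as follows (option names as in the stock xmexample.hvm; your config may differ):

    # this is the setting that coincides with qemu-dm spinning at 100% for some people:
    nographic=1

    # workaround while the gui_timer behaviour is being chased down: use VNC instead
    #nographic=1
    vnc=1
    sdl=0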
Steve Dobbelstein
2006-Aug-11 17:04 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
Steven Smith <sos22@hermes.cam.ac.uk> wrote on 08/11/2006 05:17:04 AM:
> > Here is what I have found so far in trying to chase down the cause of the
> > slowdown.
> > The qemu-dm process is running 99.9% of the CPU on dom0.
> That seems very wrong. When I try this, the device model is almost
> completely idle. Could you see what strace says, please, or if there
> are any strange messages in the /var/log/qemu-dm. file?

Looks like I jumped the gun in relating the 99.9% CPU usage for qemu-dm and the network. I start up the HVM domain and, without running any tests, qemu-dm is chewing up 99.9% of the CPU in dom0. So it appears that the 100% CPU qemu usage is a problem by itself. Looks like the same problem Harry Butterworth is seeing.

> > It appears that a lot of time is spent running timers and getting the
> > current time. Not being familiar with the code, I am now crawling through
> > it to see how timers are handled and how the xen-vnif PV driver uses them.
> Timer handling isn't really changed by any of these patches. Patch
> 02.ioemu_xen_evtchns.diff is in vaguely the same area, but I can't see
> how it could cause the problems you're seeing, assuming your
> hypervisor and libxc are up to date.
>
> What changeset of xen-unstable did you apply the patches to?

10968

> > P.S. This just in from a test running while I typed the above. I noticed
> > that qemu will start a "gui_timer" when VNC is not used. I normally run
> > without graphics (nographic=1 in the domain config file). I changed the
> > config file to use VNC. The qemu-dm CPU utilization in dom0 dropped to
> > below 10%. The network performance improved from 0.19 Mb/s to 9.75 Mb/s
> > (still less than the 23.07 Mb/s for a fully virtualized domain).
> When I try this, I see about 1600Mb/s between dom0 and a
> paravirtualised domU, about 30Mb/s between dom0 and an ioemu domU, and
> about 1200Mb/s between dom0 and an HVM domU running these drivers, all
> collected using netpipe-tcp. That is a regression, but much smaller
> than you're seeing.
>
> There are a couple of obvious things to check:
>
> 1) Do the statistics reported by ifconfig show any errors?

No errors.

> 2) How often is the event channel interrupt firing according to
>    /proc/interrupts? I see about 50k-150k/second.

I'm seeing ~500/s when netpipe-tcp reports decent throughput at smaller buffer sizes and then ~50/s when the throughput drops at larger buffer sizes.

> 3) Is there any packet loss when you ping a domain? Start your test
>    and run a ping in parallel.

No packet loss.

> The other thing is that these drivers seem to be very sensitive to
> kernel debugging options in the domU. If you've got anything enabled
> in the kernel hacking menu it might be worth trying again with that
> switched off.

Kernel debugging is on. I also have Oprofile enabled. I'll build a kernel without those and see if it helps.

> > It appears there is some interaction between using the xen-vnif
> > driver and the qemu timer code. I'm still exploring.
> I'd be happier if I could reproduce this problem here. Are you
> running SMP? PAE? 64 bit? What kernel are you running in the domU?

UP kernels in both the domU and dom0 (although the scheduler likes to move the 1 vcpu in dom0 around to different physical CPUs). 64-bit kernels on both.

Steve D.
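For reference, the "kernel hacking" options in question are the CONFIG_DEBUG_* family. If the domU kernel was built with /proc/config.gz support (as the config posted later in this thread was), a quick way to see what is currently enabled before rebuilding with it switched off under make menuconfig -> Kernel hacking:

    zcat /proc/config.gz | \
      grep -E 'CONFIG_(DEBUG_KERNEL|DEBUG_SLAB|DEBUG_SPINLOCK|DEBUG_INFO|FRAME_POINTER|OPROFILE)='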
Steven Smith
2006-Aug-12 08:32 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
> > > Here is what I have found so far in trying to chase down the cause of
> > > the slowdown.
> > > The qemu-dm process is running 99.9% of the CPU on dom0.
> > That seems very wrong. When I try this, the device model is almost
> > completely idle. Could you see what strace says, please, or if there
> > are any strange messages in the /var/log/qemu-dm. file?
> Looks like I jumped the gun in relating the 99.9% CPU usage for qemu-dm and
> the network. I start up the HVM domain and without running any tests
> qemu-dm is chewing up 99.9% of the CPU in dom0. So it appears that the
> 100% CPU qemu usage is a problem by itself. Looks like the same problem
> Harry Butterworth is seeing.
qemu-dm misbehaving could certainly lead to the netif going very slowly.

> > 2) How often is the event channel interrupt firing according to
> >    /proc/interrupts? I see about 50k-150k/second.
> I'm seeing ~500/s when netpipe-tcp reports decent throughput at smaller
> buffer sizes and then ~50/s when the throughput drops at larger buffer
> sizes.
How large do they have to be to cause problems?

> > The other thing is that these drivers seem to be very sensitive to
> > kernel debugging options in the domU. If you've got anything enabled
> > in the kernel hacking menu it might be worth trying again with that
> > switched off.
> Kernel debugging is on. I also have Oprofile enabled. I'll build a kernel
> without those and see if it helps.
Worth a shot. It shouldn't cause the problems with qemu, though.

> > > It appears there is some interaction between using the xen-vnif
> > > driver and the qemu timer code. I'm still exploring.
> > I'd be happier if I could reproduce this problem here. Are you
> > running SMP? PAE? 64 bit? What kernel are you running in the domU?
> UP kernels in both the domU and dom0 (although the scheduler likes to move
> the 1 vcpu in dom0 around to different physical CPUs). 64-bit kernels on
> both.
I've mostly been testing with 32 bit PAE. I'll have a go with a 64 bit system on Monday.

Thanks,

Steven.
Steven Smith
2006-Aug-14 09:12 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
> > > Here is what I have found so far in trying to chase down the cause of the
> > > slowdown.
> > > The qemu-dm process is running 99.9% of the CPU on dom0.
> > That seems very wrong. When I try this, the device model is almost
> > completely idle. Could you see what strace says, please, or if there
> > are any strange messages in the /var/log/qemu-dm. file?
> I've been looking into bugzilla 725 and I'm also seeing 100% cpu usage
> by qemu-dm. xm-test uses the nographic flag and I find that if this is
> not set then the cpu usage drops to normal levels and the test passes.
Does the attached patch help?

Steven.
Steve Dobbelstein
2006-Aug-14 21:22 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
Steven Smith <sos22@hermes.cam.ac.uk> wrote on 08/12/2006 03:32:23 AM:
> > > > Here is what I have found so far in trying to chase down the cause of
> > > > the slowdown.
> > > > The qemu-dm process is running 99.9% of the CPU on dom0.
> > > That seems very wrong. When I try this, the device model is almost
> > > completely idle. Could you see what strace says, please, or if there
> > > are any strange messages in the /var/log/qemu-dm. file?
> > Looks like I jumped the gun in relating the 99.9% CPU usage for qemu-dm and
> > the network. I start up the HVM domain and without running any tests
> > qemu-dm is chewing up 99.9% of the CPU in dom0. So it appears that the
> > 100% CPU qemu usage is a problem by itself. Looks like the same problem
> > Harry Butterworth is seeing.
> qemu-dm misbehaving could certainly lead to the netif going very
> slowly.

Agreed. I applied the patch you sent to Harry. It appears to fix the 99.9% CPU usage problem.

> > > 2) How often is the event channel interrupt firing according to
> > >    /proc/interrupts? I see about 50k-150k/second.
> > I'm seeing ~500/s when netpipe-tcp reports decent throughput at smaller
> > buffer sizes and then ~50/s when the throughput drops at larger buffer
> > sizes.
> How large do they have to be to cause problems?

I'm noticing a drop off in throughput at a buffer size of 3069. Here is a snip from the output from netpipe-tcp.

 43: 1021 bytes 104 times --> 20.27 Mbps in 384.28 usec
 44: 1024 bytes 129 times --> 20.14 Mbps in 387.86 usec
 45: 1027 bytes 129 times --> 20.17 Mbps in 388.46 usec
 46: 1533 bytes 129 times --> 22.94 Mbps in 509.95 usec
 47: 1536 bytes 130 times --> 23.00 Mbps in 509.48 usec
 48: 1539 bytes 130 times --> 23.12 Mbps in 507.92 usec
 49: 2045 bytes 66 times --> 30.02 Mbps in 519.66 usec
 50: 2048 bytes 96 times --> 30.50 Mbps in 512.35 usec
 51: 2051 bytes 97 times --> 30.61 Mbps in 511.24 usec
 52: 3069 bytes 98 times --> 0.61 Mbps in 38672.52 usec
 53: 3072 bytes 3 times --> 0.48 Mbps in 48633.50 usec
 54: 3075 bytes 3 times --> 0.48 Mbps in 48542.50 usec
 55: 4093 bytes 3 times --> 0.64 Mbps in 48516.35 usec
 56: 4096 bytes 3 times --> 0.65 Mbps in 48449.48 usec
 57: 4099 bytes 3 times --> 0.64 Mbps in 48575.84 usec

The throughput remains low for the remainder of the buffer sizes, which go up to 49155, before the benchmark exits due to the requests taking more than a second.

> > > The other thing is that these drivers seem to be very sensitive to
> > > kernel debugging options in the domU. If you've got anything enabled
> > > in the kernel hacking menu it might be worth trying again with that
> > > switched off.
> > Kernel debugging is on. I also have Oprofile enabled. I'll build a kernel
> > without those and see if it helps.
> Worth a shot. It shouldn't cause the problems with qemu, though.

I built a kernel without kernel debugging and without instrumentation. The results were very similar.
 43: 1021 bytes 104 times --> 20.27 Mbps in 384.28 usec
 44: 1024 bytes 129 times --> 20.30 Mbps in 384.91 usec
 45: 1027 bytes 130 times --> 20.19 Mbps in 388.02 usec
 46: 1533 bytes 129 times --> 22.97 Mbps in 509.25 usec
 47: 1536 bytes 130 times --> 23.02 Mbps in 509.12 usec
 48: 1539 bytes 131 times --> 23.04 Mbps in 509.65 usec
 49: 2045 bytes 65 times --> 30.41 Mbps in 513.07 usec
 50: 2048 bytes 97 times --> 30.49 Mbps in 512.49 usec
 51: 2051 bytes 97 times --> 30.45 Mbps in 513.85 usec
 52: 3069 bytes 97 times --> 0.75 Mbps in 31141.34 usec
 53: 3072 bytes 3 times --> 0.48 Mbps in 48596.50 usec
 54: 3075 bytes 3 times --> 0.48 Mbps in 48876.17 usec
 55: 4093 bytes 3 times --> 0.64 Mbps in 48489.33 usec
 56: 4096 bytes 3 times --> 0.64 Mbps in 48606.63 usec
 57: 4099 bytes 3 times --> 0.64 Mbps in 48568.33 usec

Again, the throughput remains low for the remainder of the buffer sizes, which go up to 49155.

The above tests were run against netpipe-tcp running on another machine. When I run against netpipe-tcp running in dom0 I get better throughput but also some strange behavior. Again, a snip from the output.

 43: 1021 bytes 606 times --> 140.14 Mbps in 55.58 usec
 44: 1024 bytes 898 times --> 141.16 Mbps in 55.35 usec
 45: 1027 bytes 905 times --> 138.93 Mbps in 56.40 usec
 46: 1533 bytes 890 times --> 133.74 Mbps in 87.45 usec
 47: 1536 bytes 762 times --> 132.82 Mbps in 88.23 usec
 48: 1539 bytes 756 times --> 132.01 Mbps in 88.95 usec
 49: 2045 bytes 376 times --> 172.36 Mbps in 90.52 usec
 50: 2048 bytes 552 times --> 177.41 Mbps in 88.07 usec
 51: 2051 bytes 568 times --> 176.12 Mbps in 88.85 usec
 52: 3069 bytes 564 times --> 0.44 Mbps in 53173.74 usec
 53: 3072 bytes 3 times --> 0.44 Mbps in 53249.32 usec
 54: 3075 bytes 3 times --> 0.50 Mbps in 46639.64 usec
 55: 4093 bytes 3 times --> 321.94 Mbps in 97.00 usec
 56: 4096 bytes 515 times --> 287.05 Mbps in 108.87 usec
 57: 4099 bytes 459 times --> 2.69 Mbps in 11615.94 usec
 58: 6141 bytes 4 times --> 0.63 Mbps in 74535.64 usec
 59: 6144 bytes 3 times --> 0.35 Mbps in 133242.01 usec
 60: 6147 bytes 3 times --> 0.35 Mbps in 133311.47 usec
 61: 8189 bytes 3 times --> 0.62 Mbps in 100391.51 usec
 62: 8192 bytes 3 times --> 1.05 Mbps in 59535.66 usec
 63: 8195 bytes 3 times --> 0.63 Mbps in 99598.69 usec
 64: 12285 bytes 3 times --> 0.47 Mbps in 199974.34 usec
 65: 12288 bytes 3 times --> 4.70 Mbps in 19933.34 usec
 66: 12291 bytes 3 times --> 4.70 Mbps in 19933.30 usec
 67: 16381 bytes 3 times --> 0.71 Mbps in 176984.35 usec
 68: 16384 bytes 3 times --> 0.93 Mbps in 134929.50 usec
 69: 16387 bytes 3 times --> 0.93 Mbps in 134930.33 usec

The throughput drops at a buffer size of 3069 as in the prior runs, but it recovers at 4093 and 4096, and then drops off again for the remainder of the test.

I don't know offhand why the throughput drops off. I'll look into it. Any tips would be helpful.

For comparison, an FV domU running netpipe-tcp to another machine will ramp up to about 20 Mbps at a buffer size of around 128 KB and then taper off to 17 Mbps. A PV domU will ramp up to around 750 Mbps at a buffer size of about 2 MB and maintain that throughput to the 8 MB buffer where the test stopped. On dom0, netpipe-tcp running to another machine ramps up to around 850 Mbps at buffer sizes from 3 MB to 8 MB, where the test stopped.

Steve D.
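Numbers like the ones above typically come from an NPtcp pair run along these lines (not necessarily the exact invocation used here; NPtcp is the binary shipped in the netpipe-tcp package, and only the -h flag is assumed, other options vary between NetPIPE versions):

    # receiver: on the other machine, or in dom0 for the local case
    NPtcp

    # transmitter: in the HVM domU, over the xen-vnif interface
    NPtcp -h <receiver-ip>

    # the interesting region in the runs above is around the 3069-byte buffers,
    # where throughput collapses from ~30 Mbps to well under 1 Mbps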
Steven Smith
2006-Aug-15 07:27 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
> > > Looks like I jumped the gun in relating the 99.9% CPU usage for qemu-dm > and > > > the network. I start up the HVM domain and without running any tests > > > qemu-dm is chewing up 99.9% of the CPU in dom0. So it appears that the > > > 100% CPU qemu usage is a problem by itself. Looks like the same > problem > > > Harry Butterworth is seeing. > > qemu-dm misbehaving could certainly lead to the netif going very > > slowly. > Agreed. I applied the patch to sent to Harry. I appears to fix the 99.9% > CPU usage problem.Great, thanks.> > > > 2) How often is the event channel interrupt firing according to > > > > /proc/interrupts? I see about 50k-150k/second. > > > I''m seeing ~ 500/s when netpipe-tcp reports decent throughput at > smaller > > > buffer sizes and then ~50/s when the throughput drops at larger buffer > > > sizes. > > How large do they have to be to cause problems? > I''m noticing a drop off in throughput at a buffer size of 3069. Here is a > snip from the output from netpipe-tcp.What are the MTUs on the interfaces, according to ifconfig, in dom0 and domU?> I don''t know offhand why the throughput drops off. I''ll look into it. Any > tips would be helpful.tcpdump in the domU and dom0 might be enlightening, just to see if any packets are getting dropped or truncated. The connections probably slow enough when it''s misbehaving for it to keep up. Are you running through the bridge? It''s unlikely to be that, but it would be good to eliminate it as a variable by doing some domU<->dom0 tests without it involved. What version of Linux are you running in the domU? Does it have any patches applied? Steven. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Steve Dobbelstein
2006-Aug-15 22:05 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
Steven Smith <sos22-xen@srcf.ucam.org> wrote on 08/15/2006 02:27:50 AM:
> > > > > 2) How often is the event channel interrupt firing according to
> > > > >    /proc/interrupts? I see about 50k-150k/second.
> > > > I'm seeing ~500/s when netpipe-tcp reports decent throughput at smaller
> > > > buffer sizes and then ~50/s when the throughput drops at larger buffer
> > > > sizes.
> > > How large do they have to be to cause problems?
> > I'm noticing a drop off in throughput at a buffer size of 3069. Here is a
> > snip from the output from netpipe-tcp.
> What are the MTUs on the interfaces, according to ifconfig, in dom0
> and domU?

MTUs on all the interfaces are 1500.

> > I don't know offhand why the throughput drops off. I'll look into it. Any
> > tips would be helpful.
> tcpdump in the domU and dom0 might be enlightening, just to see if any
> packets are getting dropped or truncated. The connection's probably
> slow enough when it's misbehaving for it to keep up.

tcpdump on both dom0 and domU shows no packets dropped and none truncated. I noticed lines such as:

   16:28:18.596654 IP dib.ltc.austin.ibm.com > hvm1.ltc.austin.ibm.com: ICMP dib.ltc.austin.ibm.com unreachable - need to frag (mtu 1500), length 556

in the tcpdump output during the slowdown. (dib.ltc.austin.ibm.com is dom0.) Knowing very little about the TCP protocol, I'm not sure if that indicates a problem.

> Are you running through the bridge? It's unlikely to be that, but it
> would be good to eliminate it as a variable by doing some domU<->dom0
> tests without it involved.

I am running through the bridge, the default Xen setup. I doubt the bridge is the problem since I also use the bridge for a PV domU and an FV domU and those don't see a slowdown.

> What version of Linux are you running in the domU? Does it have any
> patches applied?

SLES 10 beta 10. (Yes, SLES 10 has released. We haven't updated our automated testing framework yet.) I'm running a 2.6.16.13 kernel.org kernel, the current base kernel for xen-unstable. No patches applied. Here is the kernel config from /proc/config.gz in the HVM domU.

(See attached file: hvm_kernel_config)

Thanks for your attention.

Steve D.
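Those "need to frag" ICMPs with a 1500-byte MTU suggest over-sized segments leaving the guest, which would fit the GSO handling that the rev11 patches below change. One way to poke at it from the domU, assuming ethtool is new enough to report and toggle offloads on the PV interface:

    # look at the offload settings on the xen-vnif interface
    ethtool -k eth0

    # as an experiment, turn TCP segmentation offload off and re-run the test
    ethtool -K eth0 tso off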
sos22-xen@srcf.ucam.org
2006-Aug-16 13:33 UTC
[Xen-devel] Paravirtualised drivers for fully virtualised domains, rev11
There's a new version of this patch up at http://www.cl.cam.ac.uk/~sos22/pv-on-hvm/rev11 . The main change here is that I now do a slightly more thorough job of disabling GSO when compiled against kernels which don't support it.

These patches should apply against changeset 11139:ff124973a28a.

Steven.
Steven Smith
2006-Aug-16 13:36 UTC
Re: [Xen-devel] Paravirtualised drivers for fully virtualised domains, rev9
> The good news is that I don't get zombies anymore. The bad news is that
> I'm still getting very poor network performance running netperf, worse than
> a fully virtualized domain. I thought it was something wrong with my test
> setup when I was testing rev8, but the test setup looks good and the
> results are repeatable.
This should be fixed in rev11.

Thanks,

Steven.