Guy Zana
2007-May-31 23:04 UTC
[Xen-devel] [RFC][PATCH 0/6] HVM PCI Passthrough (non-IOMMU)
The following patches apply cleanly to C/S 15011 (unstable). They provide HVM PCI pass-through for non-IOMMU based machines. In order to support DMA, a single HVM, called NativeDom, has its P2M table populated in a 1:1 fashion, where each gpfn == mfn.

* Tested on HVMs running 32-bit Windows and Linux operating systems.
* Dom0/Xen is compiled in x86_32 mode.
* Only level-triggered interrupts are supported in this version; this is sufficient for most PCI devices.
* Edge-triggered interrupts should be easy to support by just asserting them when they are raised. This is not implemented in this patch.

To test the patches, you'll have to change some hard-coded parts in the code:

* The bus, device and function of the device that you are going to allocate for NativeDom (pass-through.h).
* The pt_init() function (pass-through.c), currently programmed to do pass-through for 3 USB devices on the DQ965GF desktop; change it as you wish.
* You'll have to hide the device that you are giving to the HVM so dom0 doesn't use it. Use the pciback kernel parameter to do so.

The files are organized as follows:

1) conf.patch - Some changes to the general configuration files.
2) misc.patch - Some global changes & fixes.
3) 1to1.patch - Provides the new memory layout and the allocator that supports it.
4) int.patch - Interrupt binding / injection.
5) ioemu.patch - Changes to the device model: provides PCI configuration updates, BAR emulation and PIO/MMIO access functions (these should eventually move to the hypervisor, but we suggest keeping them in the device model for debugging purposes).
6) libpci.patch - A library to access the PCI config space, probe the bus for devices, etc. It is basically a copy & paste from the libpci app, with some additions.

Please see our presentation:
http://www.xensource.com/files/xensummit_4/Neocleus_HVM_PCI_Pass-through_Zana.pdf

Thanks,
Guy.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
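For readers unfamiliar with the bus/device/function triple that has to be hard-coded in pass-through.h, the following is a minimal Python sketch of how such a triple is usually written and packed. It is not taken from the patches: the names `Bdf`, `parse_bdf`, `devfn` and `config_address` are hypothetical, the input format assumed is the `BB:DD.F` form printed by lspci (without a domain prefix), and `config_address` shows the standard PCI configuration mechanism #1 encoding rather than whatever access method libpci.patch actually uses.

```python
from typing import NamedTuple


class Bdf(NamedTuple):
    """A PCI bus/device/function triple (hypothetical helper type)."""
    bus: int   # 8 bits
    dev: int   # 5 bits
    func: int  # 3 bits


def parse_bdf(text: str) -> Bdf:
    """Parse 'BB:DD.F' (hex, lspci style, no domain prefix)."""
    bus_s, rest = text.split(":")
    dev_s, func_s = rest.split(".")
    bdf = Bdf(int(bus_s, 16), int(dev_s, 16), int(func_s, 16))
    if not (0 <= bdf.bus <= 0xFF and 0 <= bdf.dev <= 0x1F and 0 <= bdf.func <= 0x7):
        raise ValueError(f"BDF out of range: {text}")
    return bdf


def devfn(bdf: Bdf) -> int:
    """Pack device and function into one byte: device in bits 7:3, function in bits 2:0."""
    return (bdf.dev << 3) | bdf.func


def config_address(bdf: Bdf, reg: int) -> int:
    """Value written to I/O port 0xCF8 under PCI configuration mechanism #1.

    Bit 31 is the enable bit; the register offset is dword-aligned.
    """
    return (0x80000000 | (bdf.bus << 16) | (bdf.dev << 11)
            | (bdf.func << 8) | (reg & 0xFC))


if __name__ == "__main__":
    usb = parse_bdf("00:1d.2")  # example address only, not from the thread
    print(usb, hex(devfn(usb)), hex(config_address(usb, 0x04)))
```

The same triple is what has to be handed to pciback so that dom0 leaves the device alone.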
John Byrne
2007-Jun-08 02:53 UTC
Re: [Xen-devel] [RFC][PATCH 0/6] HVM PCI Passthrough (non-IOMMU)
Guy,

I tried your patches with a bnx2 NIC on SLES10 and they didn't work.

The first reason was that you mask off the capabilities bit in the PCI status. If I got rid of this, I could at least get the NIC to configure, but it didn't work and the dropped packets looked to be random garbage, so I don't think it was talking to the device properly. (But I understand almost nothing about PCI device configuration, so I don't know what to look for.)

I haven't noticed the merge tree springing into existence on xenbits, so is there any progress on making this into a real feature? It sounds like most of the work needs to be done between you and Intel, but I could certainly help with testing.

One thing I am interested in is, with the 1:1 mapping, could we disable the VT page-fault handling? I've found that the page-fault overhead for VT is horrible and would probably affect fork-exec benchmarks significantly.

Thanks,

John Byrne
Guy Zana
2007-Jun-08 18:23 UTC
RE: [Xen-devel] [RFC][PATCH 0/6] HVM PCI Passthrough (non-IOMMU)
Hi John,

Thanks for testing out our patches!
My comments below.

> -----Original Message-----
> From: John Byrne [mailto:john.l.byrne@hp.com]
> Sent: Friday, June 08, 2007 5:53 AM
> To: Guy Zana
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] [RFC][PATCH 0/6] HVM PCI Passthrough
> (non-IOMMU)
>
> Guy,
>
> I tried your patches with a bnx2 NIC on SLES10 and they didn't work.
>
> The first reason was that you mask off the capabilities bit
> in the PCI status. If I got rid of this, I could at least get
> the NIC to configure, but it didn't work and the dropped
> packets looked to be random garbage, so I don't think it was
> talking to the device properly. (But I understand almost
> nothing about PCI device configuration, so I don't know what
> to look for.)

The released patches are considered "developmental"; there is still work to be done (not too much though :) ) to make them usable for everyone.

Are you sure you mapped the right IRQ? Please post the qemu-dm log file / xm dmesg.

The capabilities bit is masked off so that we won't need to handle MSIs and power management (ACPI) related stuff yet; that could be quite a pain when trying to do pass-through for integrated devices.

Another thing: does this NIC have an expansion ROM?

> I haven't noticed the merge tree springing into existence
> on xenbits, so is there any progress on making this into a
> real feature? It sounds like most of the work needs to be
> done between you and Intel, but I could certainly help with testing.

That would be great!

I think that both patches (ours and Intel's) need some more work before we can start merging. Neocleus already merged some parts from the Intel patches (mmio & pio handling). We are also aiming for 64-bit (x86) support in the next release.

> One thing I am interested in is, with the 1:1 mapping, could
> we disable the VT page-fault handling? I've found that the
> page-fault overhead for VT is horrible and would probably
> affect fork-exec benchmarks significantly.

Cool idea! Our CTO thought about it as well :)

It's kind of hard not to use the VT page-fault handler at all: there are some issues with memory protection (security), and with memory remapping that we would want to do in the future (in order to support BIOS & expansion ROM duplication). I agree that you can make it faster, though! It may require some drastic changes in the hypervisor.

Thanks,
Guy.
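To make the "capabilities bit" discussion concrete, here is a minimal Python sketch of a device model hiding the Capabilities List bit from the guest. The register layout is standard PCI (Status register at offset 0x06, Capabilities List flag at bit 4); everything else is an assumption for illustration: `read_config` and `hide_caps` are hypothetical names, the config space is modelled as a plain byte buffer, and the real ioemu.patch code may be structured quite differently.

```python
PCI_STATUS = 0x06            # offset of the 16-bit Status register
PCI_STATUS_CAP_LIST = 0x10   # bit 4: device implements a capability list


def read_config(real_cfg: bytes, offset: int, size: int,
                hide_caps: bool = True) -> int:
    """Return the value the guest sees for a config-space read.

    real_cfg  -- bytes read from the physical device (hypothetical snapshot)
    hide_caps -- when True, the Capabilities List bit always reads as 0,
                 so the guest never walks the capability chain and never
                 finds the MSI or power-management capabilities.
    """
    if size not in (1, 2, 4) or offset % size or offset + size > len(real_cfg):
        raise ValueError("unsupported config access")
    raw = bytearray(real_cfg[offset:offset + size])
    if hide_caps:
        for i in range(size):
            # The flag lives in the low byte of Status, whatever the
            # width of the access that happens to cover it.
            if offset + i == PCI_STATUS:
                raw[i] &= ~PCI_STATUS_CAP_LIST & 0xFF
    return int.from_bytes(raw, "little")


if __name__ == "__main__":
    cfg = bytearray(256)
    cfg[0x04] = 0x07                     # Command register
    cfg[0x06], cfg[0x07] = 0x10, 0x02    # Status = 0x0210 on the real device
    print(hex(read_config(cfg, 0x06, 2)))                   # guest view
    print(hex(read_config(cfg, 0x06, 2, hide_caps=False)))  # real value
```

A driver that requires a capability to be present may refuse to configure the device when the bit is hidden, which is one possible reading of the bnx2 behaviour reported in this thread.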
John Byrne
2007-Jun-09 02:25 UTC
Re: [Xen-devel] [RFC][PATCH 0/6] HVM PCI Passthrough (non-IOMMU)
Guy,

Things are working at least somewhat, now. Answers/comments below.

Guy Zana wrote:
> Hi John,
>
> Thanks for testing out our patches!
> My comments below.
>
>> -----Original Message-----
>> From: John Byrne [mailto:john.l.byrne@hp.com]
>> Sent: Friday, June 08, 2007 5:53 AM
>> To: Guy Zana
>> Cc: xen-devel@lists.xensource.com
>> Subject: Re: [Xen-devel] [RFC][PATCH 0/6] HVM PCI Passthrough
>> (non-IOMMU)
>>
>> Guy,
>>
>> I tried your patches with a bnx2 NIC on SLES10 and they didn't work.
>>
>> The first reason was that you mask off the capabilities bit
>> in the PCI status. If I got rid of this, I could at least get
>> the NIC to configure, but it didn't work and the dropped
>> packets looked to be random garbage, so I don't think it was
>> talking to the device properly. (But I understand almost
>> nothing about PCI device configuration, so I don't know what
>> to look for.)
>
> The released patches are considered "developmental"; there is still
> work to be done (not too much though :) ) to make them usable for
> everyone. Are you sure you mapped the right IRQ? Please post the
> qemu-dm log file / xm dmesg. The capabilities bit is masked off so
> that we won't need to handle MSIs and power management (ACPI) related
> stuff yet; that could be quite a pain when trying to do pass-through
> for integrated devices.

I'd missed the line in your patch zero e-mail about pass-through.c. Once I'd fixed that, and with your hint about MSI interrupts, I passed the disable_msi option to the bnx2 driver and things worked, at least for a while. I could get an ssh connection going through the interface, but the machine locked up. My 32-bit machine doesn't have a lot of memory, so things are sluggish and it is hard to tell lock-ups from thrashing. I will reinstall one of my 64-bit machines that has more memory as 32-bit and try it there.

> Another thing: does this NIC have an expansion ROM?

Not according to lspci.

>> I haven't noticed the merge tree springing into existence
>> on xenbits, so is there any progress on making this into a
>> real feature? It sounds like most of the work needs to be
>> done between you and Intel, but I could certainly help with testing.
>
> That would be great!

Just let me know what you need tested and I'll see what I can do.

> I think that both patches (ours and Intel's) need some more work
> before we can start merging. Neocleus already merged some parts from
> the Intel patches (mmio & pio handling). We are also aiming for
> 64-bit (x86) support in the next release.

64-bit would be nice, as that is what I usually run.

>> One thing I am interested in is, with the 1:1 mapping, could
>> we disable the VT page-fault handling? I've found that the
>> page-fault overhead for VT is horrible and would probably
>> affect fork-exec benchmarks significantly.
>
> Cool idea! Our CTO thought about it as well :)
> It's kind of hard not to use the VT page-fault handler at all: there
> are some issues with memory protection (security), and with memory
> remapping that we would want to do in the future (in order to support
> BIOS & expansion ROM duplication). I agree that you can make it faster,
> though! It may require some drastic changes in the hypervisor.

Without an IOMMU, you forfeit memory protection anyway, so I am willing to handwave security for the moment. For VT, at the moment, it looks like I might be able to just hack something to set the VMCS to disable page faults after the domain is running. Setting CR3 will still generate a fault, but all you need to do is set the real CR3, as far as I can tell. It may not really work out, but I'm going to try.

Thanks,

John
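As background to "set the VMCS to disable page faults", the sketch below models, in Python, the rule Intel VT-x uses to decide whether a guest page fault causes a VM exit. The rule itself (bit 14 of the exception bitmap combined with the page-fault error-code mask and match fields) comes from the Intel architecture manuals; the function name `page_fault_causes_vmexit` is hypothetical, and nothing here reflects the actual Xen code or the hack John describes. It only illustrates which VMCS fields such a hack would need to touch.

```python
PF_VECTOR = 14  # exception vector of #PF


def page_fault_causes_vmexit(exception_bitmap: int, pfec: int,
                             pfec_mask: int, pfec_match: int) -> bool:
    """Model of the VT-x page-fault filtering rule.

    exception_bitmap -- VMCS exception bitmap (bit n set: vector n exits)
    pfec             -- error code delivered with this page fault
    pfec_mask/match  -- VMCS page-fault error-code mask and match fields

    If (pfec & mask) == match, bit 14 of the bitmap is honoured as-is;
    otherwise its meaning is inverted.
    """
    bit14 = bool(exception_bitmap & (1 << PF_VECTOR))
    if (pfec & pfec_mask) == pfec_match:
        return bit14
    return not bit14


if __name__ == "__main__":
    # Shadow-paging style setup: every guest #PF exits to the hypervisor.
    print(page_fault_causes_vmexit(1 << PF_VECTOR, 0x2, 0, 0))
    # Bit 14 clear with mask = match = 0: guest handles all #PF itself.
    print(page_fault_causes_vmexit(0, 0x2, 0, 0))
```

Note that clearing bit 14 alone is only enough if mask and match are left in a state where the comparison always succeeds (for example both zero); with a match value that can never be met, a clear bit 14 means every page fault exits. CR3 loads are controlled by a separate execution control and are not covered by this sketch.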
Guy Zana
2007-Jun-09 05:55 UTC
RE: [Xen-devel] [RFC][PATCH 0/6] HVM PCI Passthrough (non-IOMMU)
> -----Original Message-----
> From: John Byrne [mailto:john.l.byrne@hp.com]
> Sent: Saturday, June 09, 2007 5:26 AM
> To: Guy Zana
> Cc: xen-devel@lists.xensource.com; Kay, Allen M; Tian, Kevin
> Subject: Re: [Xen-devel] [RFC][PATCH 0/6] HVM PCI Passthrough
> (non-IOMMU)
>
> I'd missed the line in your patch zero e-mail about
> pass-through.c. Once I'd fixed that, and with your hint about
> MSI interrupts, I passed the disable_msi option to the bnx2
> driver and things worked, at least for a while. I could get an
> ssh connection going through the interface, but the machine
> locked up. My 32-bit machine doesn't have a lot of memory, so
> things are sluggish and it is hard to tell lock-ups from
> thrashing. I will reinstall one of my 64-bit machines that
> has more memory as 32-bit and try it there.

That thrashing could be caused by the 1:1 mapping. The only downside of using the 1:1 mapping right now is that the region 0-12MB is remapped to 16-28MB, so all guest DMA operations with buffers allocated from the 0-12MB region will fail. That region is remapped because Xen (code & data) lives there. This is the main reason for moving to x86_64: there, Xen is relocated to higher memory and the 0-12MB region can be mapped in a 1:1 fashion.

If you can force your Linux guest to allocate DMA buffers from above the 12MB address, you can avoid that problem. You can also log the DMA descriptors and see whether the problem is caused by that.

> Just let me know what you need tested and I'll see what I can do.

Thanks, I appreciate it.

> > Cool idea! Our CTO thought about it as well :) It's kind of
> > hard not to use the VT page-fault handler at all: there are some
> > issues with memory protection (security), and with memory
> > remapping that we would want to do in the future (in order to
> > support BIOS & expansion ROM duplication). I agree that you can
> > make it faster, though! It may require some drastic changes in
> > the hypervisor.
>
> Without an IOMMU, you forfeit memory protection anyway, so I

That's not completely true! We have found ways to restrict memory access to a specific region on specific chipsets.

> am willing to handwave security for the moment. For VT, at
> the moment, it looks like I might be able to just hack
> something to set the VMCS to disable page faults after the
> domain is running. Setting CR3 will still generate a fault,
> but all you need to do is set the real CR3, as far as I can
> tell. It may not really work out, but I'm going to try.

It will be interesting to see the result of that. Keep in mind that you'll have to provide memory remapping (you don't want to run over the real BIOS or expansion ROMs!) by updating the page tables; I'm not sure how you would track updates without the page-fault exit.

Thanks,
Guy.

> Thanks,
>
> John
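The DMA problem with the 0-12MB region can be sketched as follows. Without an IOMMU the device uses the address the guest driver programmed as a machine address, so DMA only lands in the right place for pages where gpfn == mfn. This Python sketch assumes 4KB pages and a simple linear shift of the low region (0-12MB to 16-28MB), which is how the description above reads; the names `p2m` and `dma_buffer_ok` are hypothetical, and the thread does not say how guest addresses in 16-28MB are handled, so the sketch does not model them.

```python
MB = 1 << 20
PAGE_SHIFT = 12                                   # assumed 4KB pages

XEN_LOW_END_PFN = (12 * MB) >> PAGE_SHIFT         # guest 0-12MB is not 1:1
REMAP_OFFSET_PFN = (16 * MB) >> PAGE_SHIFT        # it is backed by 16-28MB


def p2m(gpfn: int) -> int:
    """Guest frame -> machine frame under the layout described above.

    Only the remapped low region is modelled; every other frame is
    treated as identity mapped (an assumption, see lead-in).
    """
    if gpfn < 0:
        raise ValueError("negative frame number")
    if gpfn < XEN_LOW_END_PFN:
        return gpfn + REMAP_OFFSET_PFN
    return gpfn


def dma_buffer_ok(gpa: int, length: int) -> bool:
    """True if a device can safely DMA to [gpa, gpa + length).

    The device sees gpa as a machine address, so every page of the
    buffer must satisfy p2m(gpfn) == gpfn.
    """
    if gpa < 0 or length <= 0:
        raise ValueError("invalid buffer")
    first = gpa >> PAGE_SHIFT
    last = (gpa + length - 1) >> PAGE_SHIFT
    return all(p2m(pfn) == pfn for pfn in range(first, last + 1))


if __name__ == "__main__":
    print(dma_buffer_ok(8 * MB, 4096))    # buffer inside the remapped region
    print(dma_buffer_ok(32 * MB, 4096))   # buffer above 12MB
```

Logging the addresses found in a driver's DMA descriptors and running them through a check like `dma_buffer_ok` is one way to confirm whether a failure comes from the remapped region.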