Muli Ben-Yehuda
2006-Jan-20 01:55 UTC
[Xen-devel] VP problematic for backend drivers on IA64?
Hi Dan,

I understand that during the IA64 session at the summit there was some discussion on VP being problematic for the current backend drivers (or the other way around), and IOMMUs were suggested as a possible solution. Could you please elaborate on what's the problem?

Thanks,
Muli
--
Muli Ben-Yehuda
http://www.mulix.org | http://mulix.livejournal.com/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Magenheimer, Dan (HP Labs Fort Collins)
2006-Jan-20 17:08 UTC
[Xen-devel] RE: VP problematic for backend drivers on IA64?
Hi Muli --

I'm cc'ing the xen-ia64-devel list as many of the Xen/ia64 team don't keep up with xen-devel...

Side note for anyone new to following this thread: the terms P2M, P==M, and VP are as defined in
http://lists.xensource.com/archives/html/xen-devel/2006-01/msg00184.html

The backend drivers have a lot of code that assumes P2M. Blkback has been "ported" to handle P==M but netback never was. Neither has been "ported" to VP yet, so there is some work to do. It may turn out to be easy (e.g. #define'ing a few macros to be no-ops). However, there are likely to be some subtle changes too, as there were for P==M.

But the real problem is not really in the backend drivers; it is in the lower layers of the driver stack that the backend drivers sit on top of. VP means that the machine addresses are hidden from the domain. But domain0 (and future driver domains) still need to program DMA-capable devices, both for any domain0 I/O and for I/O on behalf of domU's (via blkfront/blkback). Thus, domain0 cannot really be fully VP.

I think what we discussed at the summit was a modified form of VP which is somewhere between VP and P2M. All RAM addressing is VP, but all device addressing needs to be P2M. It was observed that since an IOMMU intercepts all device addressing (and only device addressing), by ensuring that domain0 (and any driver domain) only has device addressing via a "software IOMMU", the problem should be solved.

That just about exhausts my expertise in this area, so others can feel free to jump in (and please correct my mistakes).

Dan
Muli Ben-Yehuda
2006-Jan-20 22:17 UTC
[Xen-devel] Re: VP problematic for backend drivers on IA64?
On Fri, Jan 20, 2006 at 09:08:21AM -0800, Magenheimer, Dan (HP Labs Fort Collins) wrote:
> Hi Muli --
>
> I'm cc'ing the xen-ia64-devel list as many of the
> Xen/ia64 team don't keep up with xen-devel...

Actually, you didn't :-)

> The backend drivers have a lot of code that assumes P2M.
> Blkback has been "ported" to handle P==M but netback
> never was. Neither has been "ported" to VP yet, so there
> is some work to do. It may turn out to be easy (e.g.
> #define'ing a few macros to be no-ops). However, there are
> likely to be some subtle changes too, as there were for P==M.

Where can I find the diff that makes blkback work with P==M? Is this integrated into xen-unstable or is it in the IA64 tree?

> But the real problem is not really in the backend drivers;
> it is in the lower layers of the driver stack that the
> backend drivers sit on top of. VP means that the machine
> addresses are hidden from the domain. But domain0 (and
> future driver domains) still need to program DMA-capable
> devices, both for any domain0 I/O and for I/O on behalf
> of domU's (via blkfront/blkback). Thus, domain0 cannot
> really be fully VP.

Linux provides the DMA-API abstraction, so that drivers do not need to be aware of the details of translating from a guest-physical address to a bus address (aka machine address). Theoretically, a DMA-API implementation is the only part of the dom0 Linux kernel that would need to know whether to read the P2M table (P2M), do nothing (P==M), call into Xen to get the translation (VP without IOMMU), or call into Xen to establish an IOMMU mapping (VP with IOMMU).

> I think what we discussed at the summit was a modified form
> of VP which is somewhere between VP and P2M. All RAM
> addressing is VP, but all device addressing needs to be
> P2M. It was observed that since an IOMMU intercepts all
> device addressing (and only device addressing), by ensuring
> that domain0 (and any driver domain) only has device
> addressing via a "software IOMMU", the problem should be
> solved.

Unless the machine has a real HW IOMMU, the device must see bus addresses, which means the driver must pass it bus addresses. The "virtual IOMMU" therefore becomes a DMA-API implementation which calls into Xen for P->Bus translation.

> That just about exhausts my expertise in this area, so
> others can feel free to jump in (and please correct my
> mistakes).

I think it makes sense. Does IA64 already implement VP dom0? Are there any plans for x86(-64) VP dom0?

Cheers,
Muli
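As a rough illustration of the four cases Muli lists above, the dispatch inside a DMA-mapping layer might be modelled as follows. This is a toy Python model, not kernel code: `Mode`, `FakeXen`, `map_for_dma`, and the 4 KiB page size are all assumptions made up for the sketch, and `FakeXen` does not correspond to any real Xen hypercall interface.

```python
from enum import Enum

PAGE_SHIFT = 12                      # assumed 4 KiB pages, for illustration only
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class Mode(Enum):
    P2M = "p2m"                  # guest reads its own phys-to-machine table
    P_EQ_M = "p==m"              # physical and machine addresses are identical
    VP_NO_IOMMU = "vp"           # ask the hypervisor for each translation
    VP_IOMMU = "vp+iommu"        # ask the hypervisor to install an IOMMU mapping


class FakeXen:
    """Hypothetical stand-in for the hypervisor side of the interface."""

    def __init__(self, p2m):
        self._p2m = dict(p2m)
        self.iommu = {}              # I/O page number -> machine frame number
        self._next_io_pfn = 0

    def translate(self, pfn):
        return self._p2m[pfn]

    def iommu_map(self, pfn):
        io_pfn = self._next_io_pfn
        self._next_io_pfn += 1
        self.iommu[io_pfn] = self._p2m[pfn]
        return io_pfn


def map_for_dma(mode, phys_addr, xen=None, p2m=None):
    """Return the address a device should be given for a one-page buffer."""
    pfn, offset = phys_addr >> PAGE_SHIFT, phys_addr & PAGE_MASK
    if mode is Mode.P_EQ_M:
        frame = pfn                      # nothing to translate
    elif mode is Mode.P2M:
        frame = p2m[pfn]                 # local table lookup
    elif mode is Mode.VP_NO_IOMMU:
        frame = xen.translate(pfn)       # hypervisor returns the machine frame
    else:
        frame = xen.iommu_map(pfn)       # device sees an I/O-space frame instead
    return (frame << PAGE_SHIFT) | offset
```

The point of the model is only that the driver calls one function in every case; which branch runs is a property of the platform, not of the driver.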
Magenheimer, Dan (HP Labs Fort Collins)
2006-Jan-22 22:45 UTC
[Xen-devel] RE: VP problematic for backend drivers on IA64?
> -----Original Message-----
> From: Muli Ben-Yehuda [mailto:mulix@mulix.org]
> Sent: Friday, January 20, 2006 3:17 PM
> To: Magenheimer, Dan (HP Labs Fort Collins)
> Cc: xen-devel; okrieg@us.ibm.com; Jon Mason
> Subject: Re: VP problematic for backend drivers on IA64?
>
> On Fri, Jan 20, 2006 at 09:08:21AM -0800, Magenheimer, Dan
> (HP Labs Fort Collins) wrote:
> > Hi Muli --
> >
> > I'm cc'ing the xen-ia64-devel list as many of the
> > Xen/ia64 team don't keep up with xen-devel...
>
> Actually, you didn't :-)

Oops! For anyone on xen-ia64-devel wanting to catch up on this thread:
http://lists.xensource.com/archives/html/xen-devel/2006-01/msg00492.html
http://lists.xensource.com/archives/html/xen-devel/2006-01/msg00507.html

> > The backend drivers have a lot of code that assumes P2M.
> > Blkback has been "ported" to handle P==M but netback
> > never was. Neither has been "ported" to VP yet, so there
> > is some work to do. It may turn out to be easy (e.g.
> > #define'ing a few macros to be no-ops). However, there are
> > likely to be some subtle changes too, as there were for P==M.
>
> Where can I find the diff that makes blkback work with P==M? Is this
> integrated into xen-unstable or is it in the IA64 tree?

It is all checked in to xen-unstable. (The xen-ia64-unstable tree is sync'ed roughly weekly with xen-unstable.)

> > But the real problem is not really in the backend drivers;
> > it is in the lower layers of the driver stack that the
> > backend drivers sit on top of. VP means that the machine
> > addresses are hidden from the domain. But domain0 (and
> > future driver domains) still need to program DMA-capable
> > devices, both for any domain0 I/O and for I/O on behalf
> > of domU's (via blkfront/blkback). Thus, domain0 cannot
> > really be fully VP.
>
> Linux provides the DMA-API abstraction, so that drivers do not need to
> be aware of the details of translating from a guest-physical address to
> a bus address (aka machine address). Theoretically, a DMA-API
> implementation is the only part of the dom0 Linux kernel that would
> need to know whether to read the P2M table (P2M), do nothing (P==M),
> call into Xen to get the translation (VP without IOMMU), or call into
> Xen to establish an IOMMU mapping (VP with IOMMU).

Yes, unless there are legacy drivers/devices that circumvent the DMA interface. I don't know if this is the case on some/many/all Linux/ia64 configurations... perhaps someone with more familiarity with a broad range of Linux/ia64 configurations can comment? I would be concerned with, for example, IDE, GART, VGA, console...?

> > I think what we discussed at the summit was a modified form
> > of VP which is somewhere between VP and P2M. All RAM
> > addressing is VP, but all device addressing needs to be
> > P2M. It was observed that since an IOMMU intercepts all
> > device addressing (and only device addressing), by ensuring
> > that domain0 (and any driver domain) only has device
> > addressing via a "software IOMMU", the problem should be
> > solved.
>
> Unless the machine has a real HW IOMMU, the device must see bus
> addresses, which means the driver must pass it bus addresses. The
> "virtual IOMMU" therefore becomes a DMA-API implementation which calls
> into Xen for P->Bus translation.

OK.

> > That just about exhausts my expertise in this area, so
> > others can feel free to jump in (and please correct my
> > mistakes).
>
> I think it makes sense. Does IA64 already implement VP dom0? Are there
> any plans for x86(-64) VP dom0?

No, Xen/ia64 domain0 has always been P==M. However, some hypervisor code that supports VP dom0, written before Xen/ia64 booted on hardware (back when it only ran on a simulator), still sits under an ifdef and may be resurrected.

Thanks!
Dan
Alex Williamson
2006-Jan-23 15:52 UTC
Re: [Xen-devel] RE: VP problematic for backend drivers on IA64?
On Sun, 2006-01-22 at 14:45 -0800, Magenheimer, Dan (HP Labs Fort Collins) wrote:
> Yes, unless there are legacy drivers/devices that circumvent the
> DMA interface. I don't know if this is the case on some/many/all
> Linux/ia64 configurations... perhaps someone with more familiarity
> with a broad range of Linux/ia64 configurations can comment? I
> would be concerned with, for example, IDE, GART, VGA, console...?

VGA and serial consoles don't typically do DMA-like operations AFAIK. They may live in legacy address spaces, but I think their programming model is entirely reads and writes. All IDE chips can operate in standard PCI mode these days, so they should be covered by the DMA-API. GARTs are similar to IOMMUs; they'll need to be modified to understand the translation.

Alex
--
Alex Williamson
HP Linux & Open Source Lab
Ian Pratt
2006-Jan-24 23:42 UTC
RE: [Xen-devel] VP problematic for backend drivers on IA64?
> I understand that during the IA64 session at the summit there
> was some discussion on VP being problematic for the current
> backend drivers (or the other way around), and IOMMUs were
> suggested as a possible solution. Could you please elaborate
> on what's the problem?

It's simply that the actual DMA operations need to use machine addresses. Ideally, you'd use an IOMMU to translate/partition, but in the absence of an IOMMU, simply enabling a privileged domain to read its p2m table and translate the pfn to an mfn is sufficient.

Ian
Magenheimer, Dan (HP Labs Fort Collins)
2006-Jan-25 00:02 UTC
RE: [Xen-devel] VP problematic for backend drivers on IA64?
> translate the pfn to an mfn is sufficient

Actually, after thinking about this, it's a bit more complicated because of the possibility that a DMA may address more than one page. If so, a simple DMA may need to be translated into a scatter-gather (or a scatter-gather into a more complex scatter-gather).

Not impossible, obviously, because Xen/x86 handles this -- by changing Linux, correct?

Do hardware IOMMUs in general handle this complication? E.g. is there a cleanly defined interface that can be applied to a VP domain "Xen IOMMU"?

Dan
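To make the multi-page complication concrete, here is a small Python sketch of how a buffer that is contiguous in guest-physical space can break into several machine-contiguous pieces. The function name `build_sg_list`, the dict-based `p2m` table, and the 4 KiB page size are assumptions for illustration; this is not how Xen or Linux actually represents scatter-gather lists.

```python
PAGE_SIZE = 4096  # assumed page size for the example


def build_sg_list(phys_addr, length, p2m):
    """Split [phys_addr, phys_addr + length) into machine-contiguous segments.

    p2m maps guest page frame numbers to machine frame numbers.
    Returns a list of (machine_address, byte_count) tuples.
    """
    segments = []
    addr, remaining = phys_addr, length
    while remaining > 0:
        pfn, offset = divmod(addr, PAGE_SIZE)
        chunk = min(PAGE_SIZE - offset, remaining)   # never cross a page here
        maddr = p2m[pfn] * PAGE_SIZE + offset
        if segments and segments[-1][0] + segments[-1][1] == maddr:
            # Adjacent in machine memory too: grow the previous segment.
            start, count = segments[-1]
            segments[-1] = (start, count + chunk)
        else:
            segments.append((maddr, chunk))
        addr += chunk
        remaining -= chunk
    return segments
```

With `p2m = {0: 10, 1: 11, 2: 5}`, a transfer covering guest pages 0 to 2 yields two segments: pages 0 and 1 happen to be adjacent in machine memory, while page 2 lives elsewhere.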
Muli Ben-Yehuda
2006-Jan-25 00:16 UTC
Re: [Xen-devel] VP problematic for backend drivers on IA64?
On Tue, Jan 24, 2006 at 04:02:36PM -0800, Magenheimer, Dan (HP Labs Fort Collins) wrote:
> Actually, after thinking about this, it's a bit
> more complicated because of the possibility that a DMA may
> address more than one page. If so, a simple DMA may need to be
> translated into a scatter-gather (or a scatter-gather into
> a more complex scatter-gather).
>
> Not impossible, obviously, because Xen/x86 handles this -- by
> changing Linux, correct?

Correct. Specifically, it's handled by Xen's version of swiotlb.c.

> Do hardware IOMMUs in general handle this complication?

Yes. You can use a HW IOMMU to map a scatter-gather list of machine pages into a contiguous range in the IO space. It can also do the reverse, but that's less interesting.

> E.g. is there a cleanly defined interface that can be applied
> to a VP domain "Xen IOMMU"?

I'm not sure what you're asking here?

Cheers,
Muli
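The IOMMU behaviour Muli describes (scattered machine pages appearing as one contiguous I/O range) can be modelled in a few lines. `ToyIommu`, its starting I/O page number, and the page size are invented for this sketch and do not describe any particular hardware.

```python
PAGE_SIZE = 4096  # assumed page size for the example


class ToyIommu:
    """Toy model: a table from I/O page numbers to machine frame numbers."""

    def __init__(self, io_base_pfn=0x100):
        self.table = {}
        self._next = io_base_pfn

    def map_pages(self, mfns):
        """Map machine frames (in order) to consecutive I/O pages.

        Returns the I/O address of the start of the contiguous range.
        """
        base = self._next
        for i, mfn in enumerate(mfns):
            self.table[base + i] = mfn
        self._next += len(mfns)
        return base * PAGE_SIZE

    def translate(self, io_addr):
        """What the hardware would do on each device access."""
        io_pfn, offset = divmod(io_addr, PAGE_SIZE)
        return self.table[io_pfn] * PAGE_SIZE + offset
```

The device can then be programmed with a single base address and length, even though the underlying machine frames (for example 10, 3, and 42) are nowhere near each other.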
Ian Pratt
2006-Jan-25 01:35 UTC
RE: [Xen-devel] VP problematic for backend drivers on IA64?
> > Actually, after thinking about this, it's a bit more complicated
> > because of the possibility that a DMA may address more than one page.
> > If so, a simple DMA may need to be translated into a scatter-gather
> > (or a scatter-gather into a more complex scatter-gather).
> >
> > Not impossible, obviously, because Xen/x86 handles this -- by changing
> > Linux, correct?
>
> Correct. Specifically, it's handled by Xen's version of swiotlb.c.

swiotlb just handles the rare cases where things straddle a page boundary. Other modifications we've made try to avoid this happening (preventing block request merging across boundaries that aren't both machine and phys contiguous, skb slab cache).

> > Do hardware IOMMUs in general handle this complication?
>
> Yes. You can use a HW IOMMU to map a scatter-gather list of
> machine pages into a contiguous range in the IO space. It can
> also do the reverse, but that's less interesting.
>
> > E.g. is there a cleanly defined interface that can be applied to a VP
> > domain "Xen IOMMU"?

The existing Linux one will suffice.

Ian
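The merge restriction Ian mentions can be pictured as a predicate that the block layer would consult before joining two adjacent buffers. The following is a hedged Python sketch of that idea only; `can_merge`, the dict-based `p2m`, and the page size are assumptions, and the real check in the Xen Linux tree is not reproduced here.

```python
PAGE_SIZE = 4096  # assumed page size for the example


def can_merge(end_of_first, start_of_second, p2m):
    """Decide whether two buffers may be merged into one DMA segment.

    end_of_first is the guest-physical address one past the first buffer;
    start_of_second is the guest-physical start of the second buffer.
    """
    if end_of_first != start_of_second:
        return False          # not even contiguous in guest-physical space
    if start_of_second % PAGE_SIZE != 0:
        return True           # join point is inside one page: always safe
    # Join point is a page boundary: the two pages must also be
    # neighbours in machine memory.
    last_pfn = (end_of_first - 1) // PAGE_SIZE
    next_pfn = start_of_second // PAGE_SIZE
    return p2m[next_pfn] == p2m[last_pfn] + 1
```

Refusing the merge up front keeps each segment machine-contiguous, so a bounce through swiotlb is needed only for the remaining cases where a single buffer itself straddles a page boundary.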
Gerd Hoffmann
2006-Jan-25 10:29 UTC
Re: [Xen-devel] VP problematic for backend drivers on IA64?
Muli Ben-Yehuda wrote:
> Hi Dan,
>
> I understand that during the IA64 session at the summit there was some
> discussion on VP being problematic for the current backend drivers (or
> the other way around), and IOMMUs were suggested as a possible
> solution. Could you please elaborate on what's the problem?

I'll try to give a short overview. VP and IOMMU support are separate problems, although there are some relations between the two ...

Current Linux block device drivers (and others) use a "struct page", an offset, and a length to address a piece of memory, usually as the source or target for DMA. Linux has an API (see Documentation/DMA-mapping.txt) to translate a "struct page" to a DMA address for a specific device. This was originally implemented to support IOMMUs. It can also be used to hide the phys=>machine address translation from device drivers, so the current Linux drivers run unmodified on VP.

The problem for the backend driver is that it submits I/O requests on behalf of *other* domains. Right now the backend driver maps the foreign pages into its own address space just to have a valid "struct page" it can pass down to the block driver which talks to the real hardware, although usually there is no need to do that to perform the actual I/O.

One suggestion from the summit was to allocate some "struct page" for foreign pages and tag them somehow (new page flag + grant table handle in page->private maybe). The xenified kernel's DMA mapping implementation can then check the flag and do the "right thing". I'm not fully aware of what other consequences this has for the Linux memory management; asking on lkml how to deal with that (and maybe getting other/better suggestions) is probably not a bad idea. At least one place which must also be touched for that is kmap()+friends.

cheers,
Gerd
--
Gerd 'just married' Hoffmann <kraxel@suse.de>
I'm the hacker formerly known as Gerd Knorr.
http://www.suse.de/~kraxel/just-married.jpeg
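The tagging idea Gerd relays from the summit could look roughly like this in a toy Python model. Everything here is hypothetical: the `Page` class, its `foreign` flag, the use of `private` for a grant handle, and the `grant_to_mfn` lookup are stand-ins chosen for the sketch, not existing Linux or Xen structures.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size for the example


@dataclass
class Page:
    """Toy stand-in for struct page."""
    pfn: int
    foreign: bool = False   # hypothetical "foreign page" flag
    private: int = 0        # would carry a grant handle when foreign is set


def dma_address(page, offset, p2m, grant_to_mfn):
    """DMA-mapping step that does the 'right thing' for tagged pages."""
    if page.foreign:
        # Foreign page: resolve via the grant, never via the local p2m.
        mfn = grant_to_mfn[page.private]
    else:
        mfn = p2m[page.pfn]
    return mfn * PAGE_SIZE + offset
```

In this model the backend never needs to map the foreign page into its own address space; the flag alone routes the lookup. The open questions raised in the thread (kmap(), other memory-management code that inspects the page) are exactly the parts this sketch leaves out.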
Muli Ben-Yehuda
2006-Jan-25 14:37 UTC
Re: [Xen-devel] VP problematic for backend drivers on IA64?
On Wed, Jan 25, 2006 at 11:29:37AM +0100, Gerd Hoffmann wrote:
> One suggestion from the summit was to allocate some "struct page" for
> foreign pages and tag them somehow (new page flag + grant table handle
> in page->private maybe). The xenified kernel's DMA mapping
> implementation can then check the flag and do the "right thing". I'm
> not fully aware of what other consequences this has for the Linux memory
> management;

I think it's a pretty ugly hack; struct page has a very specific meaning in Linux. Minimizing changes in Linux by subverting this meaning does not strike me as the right thing to do. At the moment we simply use the struct page to pass information between the Xen-aware backends and the non-Xen-aware Linux drivers that do the actual DMA. This requires, however, that the struct page point to an actual physical page, which ideally wouldn't be required at all - Linux is never going to look at the page (kmap it). Having said that, changing the DMA-API to take something other than a virtual address (that then gets translated to physical, then to machine) is not a trivial undertaking.

> asking on lkml how to deal with that (and maybe getting
> other/better suggestions) is probably not a bad idea. At least one
> place which must also be touched for that is kmap()+friends.

Sure, I'd be interested in the response.

Cheers,
Muli
Gerd Hoffmann
2006-Jan-25 16:24 UTC
Re: [Xen-devel] VP problematic for backend drivers on IA64?
Hi,

> I think it's a pretty ugly hack; struct page has a very specific
> meaning in Linux. Minimizing changes in Linux by subverting this
> meaning does not strike me as the right thing to do.

Well, not exactly nice indeed, but any other solution involves touching all block drivers ...

> at all - Linux is never going to look at the page (kmap it). Having

Depends. If it's actually DMA'ing directly it doesn't, which should be true in 99% of all cases. But there are some corner cases: if the block layer needs bounce buffers it will attempt to kmap() the page to copy the data. The same is true for drivers which don't DMA (floppy.c for example).

cheers,
Gerd
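Gerd's two corner cases can be summarised as a single predicate: does the CPU ever have to touch the page contents? The Python sketch below is an assumption-laden simplification (the function name, the single `dma_mask` comparison, and the parameters are invented here); real bounce decisions in the block layer depend on more state than this.

```python
def cpu_must_touch_page(driver_does_dma, bus_addr, length, dma_mask):
    """Return True if the kernel would need to kmap() the page.

    Case 1: the driver does programmed I/O, so the CPU copies the data.
    Case 2: the buffer lies beyond what the device can address, so the
            data must be copied through a bounce buffer.
    """
    if not driver_does_dma:
        return True
    return bus_addr + length - 1 > dma_mask
```

Only when this returns False for every request could a backend hand down a page that has no valid local mapping.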
Muli Ben-Yehuda
2006-Jan-25 16:46 UTC
Re: [Xen-devel] VP problematic for backend drivers on IA64?
On Wed, Jan 25, 2006 at 05:24:56PM +0100, Gerd Hoffmann wrote:
> > I think it's a pretty ugly hack; struct page has a very specific
> > meaning in Linux. Minimizing changes in Linux by subverting this
> > meaning does not strike me as the right thing to do.
>
> Well, not exactly nice indeed, but any other solution involves touching
> all block drivers ...

Right. I've been thinking about this sporadically since the summit and don't have a good answer yet.

> Depends. If it's actually DMA'ing directly it doesn't, which should be
> true in 99% of all cases. But there are some corner cases: if the
> block layer needs bounce buffers it will attempt to kmap() the page to
> copy the data. The same is true for drivers which don't DMA (floppy.c
> for example).

That's a good point. Since Xen provides its own "IOMMU" (swiotlb at the moment), I think we should set PCI_DMA_BUS_IS_PHYS so that the block layer never does bounce buffers on its own - unless doing it there is more efficient?

As for drivers that don't use DMA - we only care about para-virtualized drivers. I doubt we'll see any PV drivers where the real drivers don't do DMA.

Cheers,
Muli
Keir Fraser
2006-Jan-25 16:56 UTC
Re: [Xen-devel] VP problematic for backend drivers on IA64?
On 25 Jan 2006, at 16:46, Muli Ben-Yehuda wrote:
> That's a good point. Since Xen provides its own "IOMMU" (swiotlb at
> the moment), I think we should set PCI_DMA_BUS_IS_PHYS so that the
> block layer never does bounce buffers on its own - unless doing it
> there is more efficient?

We set it to zero if using swiotlb. I think setting it to zero disables driver-specific bounce buffer code?

> As for drivers that don't use DMA - we only care about
> para-virtualized drivers. I doubt we'll see any PV drivers where the
> real drivers don't do DMA.

That's a big assumption that may only be 99.9% correct. :-)

-- Keir
Muli Ben-Yehuda
2006-Jan-25 17:54 UTC
Re: [Xen-devel] VP problematic for backend drivers on IA64?
On Wed, Jan 25, 2006 at 04:56:43PM +0000, Keir Fraser wrote:
> On 25 Jan 2006, at 16:46, Muli Ben-Yehuda wrote:
>
> > That's a good point. Since Xen provides its own "IOMMU" (swiotlb at
> > the moment), I think we should set PCI_DMA_BUS_IS_PHYS so that the
> > block layer never does bounce buffers on its own - unless doing it
> > there is more efficient?
>
> We set it to zero if using swiotlb. I think setting it to zero disables
> driver-specific bounce buffer code?

Yes it does. I thought we weren't doing it for some reason, sorry about the false alarm.

Cheers,
Muli