At OLS we had a couple of "Xen Mini Summit" sessions. Although there weren't any formal minutes, here's a rough summary of what we discussed/concluded.

Status as of 24 July 2005
=========================

Summary: We have a couple of annoying bugs, but things are coming together nicely.

x86_32p (PAE 16GB) and x86_64 ports are close to being feature complete with x86_32. We should be able to ship 3.0.0 with the full feature set for these ports. [IA64 and Power are not part of the 3.0.0 release, but have actually made good progress anyhow].

We are going to need to support guest kernels that conform to the Xen 3.0 API for quite a few years to come, hence it's important that we ensure the API is extensible and as easy to make backward compatible as possible. This has led to us wanting to work hard to push through a couple of API changes that will make life easier in future:

 * New Time API. This is required for systems with unsynchronized TSCs like the IBM Summit systems, and is also good for variable speed CPUs (laptops)

 * Replacing control messages with XenBus/XenStore. Doing backwards support of the old control message API would be a *huge* pain, so we're really keen to get this incorporated before 3.0.0.

The new time API has been checked-in by Keir, and seems to be working fine for most people, but there is at least one bug report of time running fast. Please test.

Rolling-out XenBus/XenStore is a big project, but is making good progress. A xen-tools@lists.xensource.com mailing list has been created to co-ordinate this work, and there's now a focussed effort to complete it ASAP.

The first phase of testing the 3.0 release can begin before the switch to XenStore is complete -- there are plenty of platform related issues that can be debugged completely independently. The plan is to do weekly 3.0-testing releases until we feel that we're on top of the bugs being found by the development community and would benefit from rolling a 3.0.0 release to get wider exposure.

A very nice regression test infrastructure has been developed that should be ready to go live in the next week. We also have a 'TestCD' which can be used for automated testing. The aim is to get it run on as wide a variety of machines as possible. The results from both of these test tools will appear in a results matrix on the web. The regression test infrastructure is able to run sophisticated tests requiring co-ordination between multiple virtual machines (e.g. SpecWEB etc). The framework is easy to extend and add other tests (such as the ones developed by Paul) and it should be possible to develop it into a comprehensive suite. It'll also be possible for others to run the nightly tests on 'interesting machines' (e.g. wide SMPs) and have the results automatically added to the web matrix.

The plan is to roll the first 3.0-testing release as soon as there are no 'show stopper' bugs in the unstable tree. Unfortunately, the current domU networking bug that a number of people have reported probably falls into this category. Hopefully we can get a testing release out early next week. More help to fix bugs (or isolate the changeset that introduces them) would be *greatly* appreciated.

Although not strictly part of the 3.0 release, one of the most important things we need to do is to get the arch-xen patch prepared into a form that can be submitted upstream to Andrew/Linus. A couple of great volunteers stepped forward, and we need to make this an absolute priority and help them as much as we can.

Looking at the various sections of the Xen code base, the following paragraphs summarize the main issues:

Tools
=====

We need to complete the XenBus/XenStore switchover. Block devices are basically done, but there's still work to do with net, console, balloon, hotplug, shutdown etc.

There are a bunch of small outstanding tools issues we need to address:
 * sanitize all the xm commands to give them consistent naming and parameters
 * test error paths
 * split console from xend and replace control messages with XenBus (1st part complete)
 * fix output of 'xm info'
 * (authentication of relocation connection)
 * (option to use xm-xend connection over SSL/TCP rather than just unix domain socket)

Xen/Linux arch independent
==========================

The new time API needs more testing.

SEDF scheduler needs more testing.

Save/restore code needs to save state for multiple VCPUs rather than just VCPU0.

Check-in AQ's patch to allow the number of CPUs and min memory used by dom0 to be set from xend-config.sxp.

bug: xend start script loses networking on some machines. More testing required for workaround suggested in bugzilla.

x86_32 Xen/Linux
================

needs: modifying kmap to use update_va_mapping; this is an important optimization for domains with more than 890MB (CONFIG_HIGHMEM4G).

bug: nasty networking issue reported in domU SLES9 guests.

x86_32p (PAE for >4GB) Xen/Linux
================================

[ compile with XEN_TARGET_X86_PAE=y ]

Nightly snapshot x86_32p install tarballs are now available from the downloads page.

Seems stable running dom0 and domUs, though not widely tested. Needs particular testing on systems with >4GB RAM. Machines with dumb SATA controllers with >4GB may be a particular problem.

needs:
 * 3-level shadow mode pagetable support to allow live relocation
 * save/restore support for 3-level pagetables
 * testing of NX/XD

x86_64 Xen/Linux
================

Seems stable running dom0 and domUs. Passes LTP plus other tests. Needs particular testing on systems with >4GB RAM, particularly those with dumb SATA controllers.

needs:
 * SMP guest support (patch in progress)
 * writable pagetable support for fast fork/exit (patch in progress)
 * save/restore support for 4-level pagetables
 * more testing of NX/XD

Roadmap
=======

Once the testing tree forks, the unstable tree will remain closed until we have a stable 3.0 release shipping. The main development items slated for 3.1 are:
 * Pacifica / VT-x abstraction layer, plus improved IO emulation
 * Finish phase 3 of the tools project (split xend into lots of small tools co-ordinated via XenStore)
 * Performance tuning and optimization -- less reliance on manual configuration
 * Support for Infiniband/Smart NICs (direct guest IO access)

Ian

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

On 7/28/05, Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> wrote:
> Tools
> =====
>
> We need to complete the XenBus/XenStore switchover. Block devices are
> basically done

Have it checked in yet?

> There are a bunch of small outstanding tools issues we need to address:
>  * sanitize all the xm commands to give them consistent naming and parameters
>  * test error paths
>  * split console from xend and replace control messages with XenBus (1st part complete)
>  * fix output of 'xm info'

oops, i will re-submit the patch to show xen version. but what is wrong with "xm info" at the moment?

regards,
aq

> > We need to complete the XenBus/XenStore switchover. Block devices are
> > basically done
>
> Have it checked in yet?

The code to switch block devices over isn't checked in yet, the rest of it is. It's not quite ready for checking in yet, but I think it got posted to xen-tools.

> > There are a bunch of small outstanding tools issues we need to address:
> >  * sanitize all the xm commands to give them consistent naming and parameters
> >  * test error paths
> >  * split console from xend and replace control messages with XenBus (1st part complete)
> >  * fix output of 'xm info'
>
> oops, i will re-submit the patch to show xen version. but what is wrong with "xm info" at the moment?

Last time I looked, we weren't exporting all the information available e.g. num logical cpus, num nodes, sockets per node. Also we should export the Xen architecture as {x86_32, x86_32p, x86_64} rather than reporting the dom0 architecture.

Also, please can you repost your dom0-mem-min and dom0-num-cpus patch to the list for general review.

Thanks,
Ian

On Thu, Jul 28, 2005 at 10:15:26PM +0100, Ian Pratt wrote:
> > Ian,
> > Could you post the slides you presented at the mini-summit?
>
> They're not really freestanding, but here they are anyway:
>
> http://www.cl.cam.ac.uk/~iap10/temp/xen-summit-2005-07.ppt

The slides show >4g in green, and i'm pretty sure that's not quite the case until we get bounce buffers or iommu working.

sRp

--
Scott Parish
Signed-off-by: srparish@us.ibm.com

On Thu, Jul 28, 2005 at 10:35:06PM +0100, Ian Pratt wrote:
> > > http://www.cl.cam.ac.uk/~iap10/temp/xen-summit-2005-07.ppt
> >
> > The slides show >4g in green, and i'm pretty sure that's not quite the
> > case until we get bounce buffers or iommu working.
>
> Keir checked some bounce buffer support in a while back (dma_map_single). It's largely untested, though.

I must have missed this. Unless there is something i'm needing to enable, the problem i've been working on with dma being attempted to high addresses still stands.

> * Intel would get the standard s/w iommu working (just a case of ensuring we have a machine contiguous aperture below 4GB -- the Linux code currently uses the boot mem allocator rather than using alloc_coherent, so it will need a little tweaking)

Yeah, i haven't had much luck so far.

> * AMD would test gart support. Should just work...
>
> Since the vast majority of server platforms have hardware that is >4GB DMA capable, it might not actually be such a big deal in practice. The biggest pain is probably dumb SATA controllers. We definitely need more test coverage.

Right, i'm using a sata drive, it must be dumb.

sRp

--
Scott Parish
Signed-off-by: srparish@us.ibm.com

Ian,

Could you post the slides you presented at the mini-summit?

Bruce Walker
Hewlett-Packard

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Ian Pratt
Sent: Wednesday, July 27, 2005 4:34 PM
To: xen-devel
Subject: [Xen-devel] Xen 3.0 Status update

> Ian,
> Could you post the slides you presented at the mini-summit?

They're not really freestanding, but here they are anyway:

http://www.cl.cam.ac.uk/~iap10/temp/xen-summit-2005-07.ppt

Ian

> > http://www.cl.cam.ac.uk/~iap10/temp/xen-summit-2005-07.ppt
>
> The slides show >4g in green, and i'm pretty sure that's not quite the
> case until we get bounce buffers or iommu working.

Keir checked some bounce buffer support in a while back (dma_map_single). It's largely untested, though.

However, at the summit we agreed that:

 * Intel would get the standard s/w iommu working (just a case of ensuring we have a machine contiguous aperture below 4GB -- the Linux code currently uses the boot mem allocator rather than using alloc_coherent, so it will need a little tweaking)

 * AMD would test gart support. Should just work...

Since the vast majority of server platforms have hardware that is >4GB DMA capable, it might not actually be such a big deal in practice. The biggest pain is probably dumb SATA controllers. We definitely need more test coverage.

Ian

On Thu, Jul 28, 2005 at 11:39:46PM +0100, Ian Pratt wrote:
> > > * Intel would get the standard s/w iommu working (just a case of ensuring we have a machine contiguous aperture below 4GB -- the Linux code currently uses the boot mem allocator rather than using alloc_coherent, so it will need a little tweaking)
> >
> > Yeah, i haven't had much luck so far.
>
> I don't really understand why the aperture is allocated so early. I suspect it's not actually used until the normal bootmem allocator is up.
>
> The slightly more fundamental problem is that we need a <4GB allocation zone in Xen, but since allocating the aperture is only currently an issue for dom0 it won't actually be a problem in practice. (something we need to address before driver domains come back)

I have a patch that introduces zones into xen, and a hypercall to request dmaable memory, which i've made xen_contig_memory() use. Unfortunately, there still seems to be some places where kmallocs are done for dma buffers. (i tried putting all linux memory into ZONE_NORMAL and caught a couple of these places)

If the zones patch would be helpful i could clean it up and post it.

sRp

--
Scott Parish
Signed-off-by: srparish@us.ibm.com

On Thu, Jul 28, 2005 at 11:57:27PM +0100, Mark Williamson wrote:
> > I have a patch that introduces zones into xen, and a hypercall to request dmaable memory, which i've made xen_contig_memory() use. Unfortunately, there still seems to be some places where kmallocs are done for dma buffers. (i tried putting all linux memory into ZONE_NORMAL and caught a couple of these places)
>
> The Linux USB stack uses kmalloc-ed memory as DMA buffers as standard practice. This should still be dealt with correctly by bounce buffer code, though.

I'll have to look at that code. The place i caught was

    drivers/scsi/sd.c:1471:
    buffer = kmalloc(512, GFP_KERNEL | __GFP_DMA);

I've tried fixing the page allocator to xen_contig_memory() pages that are requested __GFP_DMA, but now get a null pointer dereference i haven't shaken out yet. (i'm not suggesting that xen_contig_memory() is the appropriate long term solution, but for prototyping it should work)

sRp

--
Scott Parish
Signed-off-by: srparish@us.ibm.com

On Fri, Jul 29, 2005 at 12:05:10AM +0100, Ian Pratt wrote:
> > > The slightly more fundamental problem is that we need a <4GB allocation zone in Xen, but since allocating the aperture is only currently an issue for dom0 it won't actually be a problem in practice. (something we need to address before driver domains come back)
> >
> > I have a patch that introduces zones into xen, and a hypercall to request dmaable memory, which i've made xen_contig_memory() use.
>
> The hypercall should probably pass in the 'order' of the address limit required for the allocation. There are a few stupid devices that require memory below 2GB etc (e.g. aacraid)

This is with the MEMOP_decrease_reservation hypercall, which is already using up all of its allotted arguments. It's been a while, but it didn't look like it was going to be real easy to raise the limit of 6 arguments on x86_32.

> > Unfortunately, there still seems to be some places where kmallocs are done for dma buffers. (i tried putting all linux memory into ZONE_NORMAL and caught a couple of these places)
>
> Can you give examples? What size are the allocations? Do you know what the official position is i.e. is using kmalloc with ZONE_DMA deprecated?

I have no idea about official positions of the linux kernel.

sRp

--
Scott Parish
Signed-off-by: srparish@us.ibm.com

> > * Intel would get the standard s/w iommu working (just a case of ensuring we have a machine contiguous aperture below 4GB -- the Linux code currently uses the boot mem allocator rather than using alloc_coherent, so it will need a little tweaking)
>
> Yeah, i haven't had much luck so far.

I don't really understand why the aperture is allocated so early. I suspect it's not actually used until the normal bootmem allocator is up.

The slightly more fundamental problem is that we need a <4GB allocation zone in Xen, but since allocating the aperture is only currently an issue for dom0 it won't actually be a problem in practice. (something we need to address before driver domains come back)

Ian

> I have a patch that introduces zones into xen, and a hypercall to request dmaable memory, which i've made xen_contig_memory() use. Unfortunately, there still seems to be some places where kmallocs are done for dma buffers. (i tried putting all linux memory into ZONE_NORMAL and caught a couple of these places)

The Linux USB stack uses kmalloc-ed memory as DMA buffers as standard practice. This should still be dealt with correctly by bounce buffer code, though.

Cheers,
Mark

> If the zones patch would be helpful i could clean it up and post it.
>
> sRp

> > The slightly more fundamental problem is that we need a <4GB allocation zone in Xen, but since allocating the aperture is only currently an issue for dom0 it won't actually be a problem in practice. (something we need to address before driver domains come back)
>
> I have a patch that introduces zones into xen, and a hypercall to request dmaable memory, which i've made xen_contig_memory() use.

The hypercall should probably pass in the 'order' of the address limit required for the allocation. There are a few stupid devices that require memory below 2GB etc (e.g. aacraid)

> Unfortunately, there still seems to be some places where kmallocs are done for dma buffers. (i tried putting all linux memory into ZONE_NORMAL and caught a couple of these places)

Can you give examples? What size are the allocations? Do you know what the official position is i.e. is using kmalloc with ZONE_DMA deprecated?

Thanks,
Ian

> > > I have a patch that introduces zones into xen, and a hypercall to request dmaable memory, which i've made xen_contig_memory() use.
> >
> > The hypercall should probably pass in the 'order' of the address limit required for the allocation. There are a few stupid devices that require memory below 2GB etc (e.g. aacraid)
>
> This is with the MEMOP_decrease_reservation hypercall, which is already using up all of its allotted arguments. It's been a while, but it didn't look like it was going to be real easy to raise the limit of 6 arguments on x86_32.

extent_order only needs to be a byte parameter, so it would be reasonable to have the next byte of the word be the addr_limit_order. (We might want a separate alignment order in future too).

> > > Unfortunately, there still seems to be some places where kmallocs are done for dma buffers. (i tried putting all linux memory into ZONE_NORMAL and caught a couple of these places)
> >
> > Can you give examples? What size are the allocations? Do you know what the official position is i.e. is using kmalloc with ZONE_DMA deprecated?
>
> I have no idea about official positions of the linux kernel.

I guess it's probably allowed for sub page allocations.

Hopefully the s/w iommu can take care of these at map time.

Ian

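The byte-packing idea in the message above (extent_order in one byte, addr_limit_order in the next byte of the same word) could be sketched roughly as follows. This is only an illustration in Python, not the actual hypercall interface: the function names, and the choice of bits 0-7 for extent_order and bits 8-15 for addr_limit_order, are assumptions made for the example.

```python
# Illustrative sketch only: the function names and the exact bit layout are
# assumptions for this example, not the real Xen hypercall ABI.

def pack_orders(extent_order, addr_limit_order):
    """Pack two byte-sized orders into one argument word.

    Low byte:  extent_order (allocation size is 2**extent_order pages).
    Next byte: addr_limit_order (memory must lie below 2**addr_limit_order bytes).
    """
    for name, value in (("extent_order", extent_order),
                        ("addr_limit_order", addr_limit_order)):
        if not 0 <= value <= 0xFF:
            raise ValueError("%s must fit in one byte, got %r" % (name, value))
    return extent_order | (addr_limit_order << 8)


def unpack_orders(word):
    """Recover (extent_order, addr_limit_order) from a packed word."""
    return word & 0xFF, (word >> 8) & 0xFF


def addr_limit_bytes(addr_limit_order):
    """Exclusive upper bound, in bytes, implied by an address-limit order."""
    return 1 << addr_limit_order
```

Under this assumed layout, a device that needs memory below 2GB (the aacraid case mentioned in the thread) would pass an addr_limit_order of 31, and a 32-bit DMA device would pass 32. Six arguments remain six arguments, which is the point of the suggestion.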
On 7/28/05, Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> wrote:
> > > We need to complete the XenBus/XenStore switchover. Block devices are
> > > basically done
> >
> > Have it checked in yet?
>
> The code to switch block devices over isn't checked in yet, the rest of it is. It's not quite ready for checking in yet, but I think it got posted to xen-tools.

unfortunately i haven't seen anything posted in that mailing list. the only message archived of that list so far is a "welcome" mail.

> > > There are a bunch of small outstanding tools issues we need to address:
> > >  * sanitize all the xm commands to give them consistent naming and parameters
> > >  * test error paths
> > >  * split console from xend and replace control messages with XenBus (1st part complete)
> > >  * fix output of 'xm info'
> >
> > oops, i will re-submit the patch to show xen version. but what is wrong with "xm info" at the moment?
>
> Last time I looked, we weren't exporting all the information available e.g. num logical cpus, num nodes, sockets per node. Also we should export the Xen architecture as {x86_32, x86_32p, x86_64} rather than reporting the dom0 architecture.

ok, i will submit the patch for this problem

> Also, please can you repost your dom0-mem-min and dom0-num-cpus patch to the list for general review.

ok.

regards,
aq

Attached patch adds a DMA zone to xen, also modifies xen_contig_memory() to ask for DMA pages.

sRp

On Fri, Jul 29, 2005 at 12:31:42AM +0100, Ian Pratt wrote:
> > > > I have a patch that introduces zones into xen, and a hypercall to request dmaable memory, which i've made xen_contig_memory() use.
> > >
> > > The hypercall should probably pass in the 'order' of the address limit required for the allocation. There are a few stupid devices that require memory below 2GB etc (e.g. aacraid)
> >
> > This is with the MEMOP_decrease_reservation hypercall, which is already using up all of its allotted arguments. It's been a while, but it didn't look like it was going to be real easy to raise the limit of 6 arguments on x86_32.
>
> extent_order only needs to be a byte parameter, so it would be reasonable to have the next byte of the word be the addr_limit_order. (We might want a separate alignment order in future too).
>
> > > > Unfortunately, there still seems to be some places where kmallocs are done for dma buffers. (i tried putting all linux memory into ZONE_NORMAL and caught a couple of these places)
> > >
> > > Can you give examples? What size are the allocations? Do you know what the official position is i.e. is using kmalloc with ZONE_DMA deprecated?
> >
> > I have no idea about official positions of the linux kernel.
>
> I guess it's probably allowed for sub page allocations.
>
> Hopefully the s/w iommu can take care of these at map time.
>
> Ian

--
Scott Parish
Signed-off-by: srparish@us.ibm.com


"Ian Pratt" <m+Ian.Pratt@cl.cam.ac.uk> writes:
> > Unfortunately, there still seem to be some places where
> > kmallocs are done for dma buffers. (i tried putting all linux
> > memory into ZONE_NORMAL and caught a couple of these places)
>
> Can you give examples? What size are the allocations? Do you know what
> the official position is i.e. is using kmalloc with ZONE_DMA deprecated?

If you do this:

    fd = open("/dev/video0");   // open bttv grabber card
    ioctl(fd, ...);             // configure tvnorm, size, ...
    read(fd, somelargebuf);     // capture a single frame

bttv will try to send the video frame directly to the buffer passed.
Lock pages, kick DMA, wait until finished, unlock pages, done. And
bttv has no control at all over how these pages are allocated.

DMA memory really can be almost anything. There is no way around
having a swiotlb-like bounce buffer mechanism hooked into the dma
mapping API as fallback.

At the moment the linux kernel provides no way to hint that you want
to use the specific piece of memory you are requesting for 32-bit PCI
DMA. ZONE_DMA is historical stuff, 16MB only for ISA DMA IIRC, not
really useful. Maybe Andy finally finds some time to polish & submit
the ZONE_DMA32 patch.

bttv tries to allocate buffers from ZONE_NORMAL (i.e. avoid highmem)
in case it has control over the allocations, which is far from being
perfect. Works reliably only on 32 bit, doesn't work on 64-bit
without iommu and >4GB for example ...

  Gerd

--
panic("it works"); /* avoid being flooded with debug messages */
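
A rough sketch of the bounce-buffer fallback Gerd describes: at map time, use the caller's buffer directly if the device can reach it, otherwise copy through a pool the device is known to reach. This is a toy user-space model, not the kernel's swiotlb or DMA mapping API; the struct, the function names, plain-integer "bus addresses" and the use of 0 as a failure value are all assumptions for illustration. It covers only the to-device direction; for a capture like the bttv read() above, the copy would instead run from the pool back to the caller's buffer at unmap time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pool of memory that every device of interest can reach. */
struct bounce_pool {
    uint64_t bus_base;        /* bus address of mem[0]; must be nonzero */
    size_t used;              /* bytes already handed out */
    unsigned char mem[4096];  /* backing storage for bounced data */
};

/* Nonzero if [addr, addr + len) lies entirely under the device's DMA mask. */
static int dma_reachable(uint64_t addr, size_t len, uint64_t dma_mask)
{
    assert(len > 0);
    return addr <= dma_mask && (len - 1) <= dma_mask - addr;
}

/* Map a buffer for a to-device transfer. Returns the bus address the
 * device should use, or 0 if the bounce pool is exhausted. */
static uint64_t map_for_device(struct bounce_pool *pool, const void *buf,
                               uint64_t buf_bus, size_t len,
                               uint64_t dma_mask)
{
    if (dma_reachable(buf_bus, len, dma_mask))
        return buf_bus;                        /* fast path: DMA in place */

    if (len > sizeof(pool->mem) - pool->used)
        return 0;                              /* no bounce space left */

    /* Slow path: copy the data somewhere the device can address. */
    memcpy(pool->mem + pool->used, buf, len);
    uint64_t bus = pool->bus_base + pool->used;
    pool->used += len;
    return bus;
}
```

The point of hooking this in at map time is that the driver never needs to know where its pages came from: a buffer above the device's limit is bounced transparently, at the cost of one extra copy.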

On 29 Jul 2005, at 10:06, Gerd Knorr wrote:
> DMA memory really can be almost anything. There is no way around
> having a swiotlb-like bounce buffer mechanism hooked into the dma
> mapping API as fallback.
>
> At the moment the linux kernel provides no way to hint that you want
> to use the specific piece of memory you are requesting for 32-bit PCI
> DMA. ZONE_DMA is historical stuff, 16MB only for ISA DMA IIRC, not
> really useful. Maybe Andy finally finds some time to polish & submit
> the ZONE_DMA32 patch.

Any driver that really wants ZONE_DMA memory on xenlinux (i.e. 24-bit
safe memory) is screwed. The low 16MB of memory is always allocated
exclusively to Xen itself. That only affects old ISA hardware though,
afaik.

 -- Keir

> bttv tries to allocate buffers from ZONE_NORMAL (i.e. avoid
> highmem) in case it has control over the allocations, which
> is far from being perfect. Works reliably only on 32 bit,
> doesn't work on 64-bit without iommu and >4GB for example ...

Yep, for pages for actual data-path transfers there's nothing better we
can do than use the iommu (or a s/w implementation thereof).

However, for the long-lived control regions used by some devices (e.g.
for descriptor rings etc) it probably makes sense to try and allocate
them in memory they can access directly. Some of these have daft
restrictions, such as the aacraid device only being able to use memory
below 2GB for its control program.

Most of these regions are bigger than a page so have to be created with
dma_alloc_coherent, hence we have an opportunity to allocate them in a
region they can directly access.

Ian

> However, for the long-lived control regions used by some devices (e.g.
> for descriptor rings etc) it probably makes sense to try and allocate
> them in memory they can access directly. Some of these have daft
> restrictions, such as the aacraid device only being able to use memory
> below 2GB for its control program.

> Most of these regions are bigger than a page so have to be created with
> dma_alloc_coherent, hence we have an opportunity to allocate them in a
> region they can directly access.

Yep, for the control structures it should be easy to sort out, as
dma_alloc_coherent() must be used for them
(Documentation/DMA-mapping.txt is pretty clear here) and so we have an
opportunity to hook in and make sure it's DMA-able memory.

Linux device drivers can pass an address mask telling the kernel which
range the device in question can handle (pci_set_dma_mask); it's
probably a good idea to pass that down to xen. That should catch
corner cases like the 2G limit on some raid controller mentioned here
recently without extra work I think.

  Gerd

--
panic("it works"); /* avoid being flooded with debug messages */
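
One way the mask Gerd mentions could be handed down to Xen is as an address order, i.e. the number of address bits the device can drive. The helper below is a sketch under that assumption; its name is hypothetical, and it accepts only masks of the form 2^n - 1, which is the usual shape of a DMA mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a contiguous low-bit DMA mask into the
 * number of address bits it covers, e.g. 0x7fffffff (below 2GB) -> 31. */
static unsigned int dma_mask_to_addr_order(uint64_t dma_mask)
{
    unsigned int order = 0;

    assert(dma_mask != 0);
    /* Only masks of the form 2^n - 1 are handled in this sketch. */
    assert((dma_mask & (dma_mask + 1)) == 0);

    /* Count the set bits by shifting the mask down to zero. */
    while (dma_mask != 0) {
        order++;
        dma_mask >>= 1;
    }
    return order;
}
```

With this, the 2GB-limited raid controller case becomes order 31, a plain 32-bit PCI device order 32, and 24-bit ISA-style DMA order 24, so a single small integer is enough to describe the limit to the allocator.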