Yuji Shimada
2008-Nov-18 09:40 UTC
[Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.
Memory related to a guest domain should be allocated from the NUMA node on which the guest runs, because access latency within the same NUMA node is lower than access to a different node.

This patch fixes memory allocation for the Address Translation Structure of VT-d.

VT-d uses two types of structures for DMA address translation: the Device Assignment Structure and the Address Translation Structure.

There is only one Device Assignment Structure on a system, so its memory allocation does not need to change; it keeps the default policy. The Address Translation Structure, on the other hand, exists per guest domain, so its memory should be allocated from the NUMA node on which the guest domain runs.

This patch is useful for a system which has many IOMMUs.

Thanks,
--
Yuji Shimada

Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp>

diff -r 5fd51e1e9c79 xen/drivers/passthrough/vtd/intremap.c
--- a/xen/drivers/passthrough/vtd/intremap.c	Wed Nov 05 10:57:21 2008 +0000
+++ b/xen/drivers/passthrough/vtd/intremap.c	Tue Nov 18 17:37:31 2008 +0900
@@ -473,7 +473,7 @@
     ir_ctrl = iommu_ir_ctrl(iommu);
     if ( ir_ctrl->iremap_maddr == 0 )
     {
-        ir_ctrl->iremap_maddr = alloc_pgtable_maddr();
+        ir_ctrl->iremap_maddr = alloc_pgtable_maddr(NULL);
         if ( ir_ctrl->iremap_maddr == 0 )
         {
             dprintk(XENLOG_WARNING VTDPREFIX,
diff -r 5fd51e1e9c79 xen/drivers/passthrough/vtd/iommu.c
--- a/xen/drivers/passthrough/vtd/iommu.c	Wed Nov 05 10:57:21 2008 +0000
+++ b/xen/drivers/passthrough/vtd/iommu.c	Tue Nov 18 17:37:31 2008 +0900
@@ -148,7 +148,7 @@
     root = &root_entries[bus];
     if ( !root_present(*root) )
     {
-        maddr = alloc_pgtable_maddr();
+        maddr = alloc_pgtable_maddr(NULL);
         if ( maddr == 0 )
         {
             unmap_vtd_domain_page(root_entries);
@@ -205,7 +205,7 @@
     addr &= (((u64)1) << addr_width) - 1;
     spin_lock_irqsave(&hd->mapping_lock, flags);
     if ( hd->pgd_maddr == 0 )
-        if ( !alloc || ((hd->pgd_maddr = alloc_pgtable_maddr()) == 0) )
+        if ( !alloc || ((hd->pgd_maddr = alloc_pgtable_maddr(domain)) == 0) )
             goto out;

     parent = (struct dma_pte *)map_vtd_domain_page(hd->pgd_maddr);
@@ -218,7 +218,7 @@
         {
             if ( !alloc )
                 break;
-            maddr = alloc_pgtable_maddr();
+            maddr = alloc_pgtable_maddr(domain);
             if ( !maddr )
                 break;
             dma_set_pte_addr(*pte, maddr);
@@ -605,7 +605,7 @@
     spin_lock_irqsave(&iommu->register_lock, flags);

     if ( iommu->root_maddr == 0 )
-        iommu->root_maddr = alloc_pgtable_maddr();
+        iommu->root_maddr = alloc_pgtable_maddr(NULL);
     if ( iommu->root_maddr == 0 )
     {
         spin_unlock_irqrestore(&iommu->register_lock, flags);
diff -r 5fd51e1e9c79 xen/drivers/passthrough/vtd/qinval.c
--- a/xen/drivers/passthrough/vtd/qinval.c	Wed Nov 05 10:57:21 2008 +0000
+++ b/xen/drivers/passthrough/vtd/qinval.c	Tue Nov 18 17:37:31 2008 +0900
@@ -426,7 +426,7 @@

     if ( qi_ctrl->qinval_maddr == 0 )
     {
-        qi_ctrl->qinval_maddr = alloc_pgtable_maddr();
+        qi_ctrl->qinval_maddr = alloc_pgtable_maddr(NULL);
         if ( qi_ctrl->qinval_maddr == 0 )
         {
             dprintk(XENLOG_WARNING VTDPREFIX,
diff -r 5fd51e1e9c79 xen/drivers/passthrough/vtd/vtd.h
--- a/xen/drivers/passthrough/vtd/vtd.h	Wed Nov 05 10:57:21 2008 +0000
+++ b/xen/drivers/passthrough/vtd/vtd.h	Tue Nov 18 17:37:31 2008 +0900
@@ -101,7 +101,7 @@
 void cacheline_flush(char *);
 void flush_all_cache(void);
 void *map_to_nocache_virt(int nr_iommus, u64 maddr);
-u64 alloc_pgtable_maddr(void);
+u64 alloc_pgtable_maddr(struct domain *d);
 void free_pgtable_maddr(u64 maddr);
 void *map_vtd_domain_page(u64 maddr);
 void unmap_vtd_domain_page(void *va);
diff -r 5fd51e1e9c79 xen/drivers/passthrough/vtd/x86/vtd.c
--- a/xen/drivers/passthrough/vtd/x86/vtd.c	Wed Nov 05 10:57:21 2008 +0000
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c	Tue Nov 18 17:37:31 2008 +0900
@@ -22,6 +22,7 @@
 #include <xen/domain_page.h>
 #include <asm/paging.h>
 #include <xen/iommu.h>
+#include <xen/numa.h>
 #include "../iommu.h"
 #include "../dmar.h"
 #include "../vtd.h"
@@ -37,13 +38,21 @@
 }

 /* Allocate page table, return its machine address */
-u64 alloc_pgtable_maddr(void)
+u64 alloc_pgtable_maddr(struct domain *d)
 {
     struct page_info *pg;
     u64 *vaddr;
     unsigned long mfn;

-    pg = alloc_domheap_page(NULL, 0);
+    if (d == NULL)
+    {
+        pg = alloc_domheap_page(NULL, 0);
+    }
+    else
+    {
+        pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)));
+    }
+
     if ( !pg )
         return 0;
     mfn = page_to_mfn(pg);
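For reference, the allocation helper after this patch looks roughly as follows. This is a condensed sketch, not the exact hunk: the zero-and-flush tail of the function is paraphrased from the surrounding tree rather than copied, and the NULL/non-NULL branches are folded into one call.

    /* Sketch of the patched helper: allocate one page for a VT-d table,
     * from the NUMA node of 'd' when a domain is given, otherwise from
     * anywhere (default policy). */
    u64 alloc_pgtable_maddr(struct domain *d)
    {
        struct page_info *pg;
        u64 *vaddr;
        unsigned long mfn;

        /* MEMF_node() turns a node id into an allocation flag;
         * domain_to_node() maps a domain to the node of its processors. */
        pg = alloc_domheap_page(NULL,
                                d ? MEMF_node(domain_to_node(d)) : 0);
        if ( !pg )
            return 0;

        mfn = page_to_mfn(pg);
        vaddr = map_domain_page(mfn);      /* zero the new table page */
        memset(vaddr, 0, PAGE_SIZE);
        iommu_flush_cache_page(vaddr);     /* keep the IOMMU's view coherent */
        unmap_domain_page(vaddr);

        return (u64)mfn << PAGE_SHIFT;
    }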
Espen Skoglund
2008-Nov-18 12:00 UTC
Re: [Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.
Given an FSB-based system the IOMMUs sit in the north-bridge. How does this work with QPI? Where in the system do the different IOMMUs sit? Wouldn't it make more sense to allocate memory from one of the nodes where the IOMMU is attached? Having the memory allocated from the node of the guest only helps when the guest needs to update its page tables. I'd rather optimize for page table walks in the IOMMU.

	eSk

[Yuji Shimada]
> Memory related to a guest domain should be allocated from the NUMA
> node on which the guest runs, because access latency within the same
> NUMA node is lower than access to a different node.
>
> This patch fixes memory allocation for the Address Translation
> Structure of VT-d.
>
> VT-d uses two types of structures for DMA address translation: the
> Device Assignment Structure and the Address Translation Structure.
>
> There is only one Device Assignment Structure on a system, so its
> memory allocation does not need to change; it keeps the default
> policy. The Address Translation Structure, on the other hand, exists
> per guest domain, so its memory should be allocated from the NUMA
> node on which the guest domain runs.
>
> This patch is useful for a system which has many IOMMUs.
>
> Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp>
>
> [...]
Yuji Shimada
2008-Nov-19 08:26 UTC
Re: [Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.
Hi Espen,

Your suggestion of allocating memory from one of the nodes to which the IOMMU is attached would improve performance further, but it needs more memory, because the structures would then be required per IOMMU.

My patch keeps the current implementation: one Device Assignment Structure and one Address Translation Structure per guest.

Xen's user will assign a device to a guest that is close to it, so the node of the guest and the node connected to the IOMMU will be the same. As a result, memory performance is improved with my patch.

Thanks,
--
Yuji Shimada

On Tue, 18 Nov 2008 12:00:37 +0000
Espen Skoglund <espen.skoglund@netronome.com> wrote:

> Given an FSB-based system the IOMMUs sit in the north-bridge. How
> does this work with QPI? Where in the system do the different IOMMUs
> sit? Wouldn't it make more sense to allocate memory from one of the
> nodes where the IOMMU is attached? Having the memory allocated from
> the node of the guest only helps when the guest needs to update its
> page tables. I'd rather optimize for page table walks in the IOMMU.
>
> eSk
Kay, Allen M
2008-Nov-19 18:57 UTC
RE: [Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.
> Xen's user will assign a device to a guest that is close to it, so
> the node of the guest and the node connected to the IOMMU will be
> the same. As a result, memory performance is improved with my patch.

Are you assuming the user will pin the guest to a physical CPU? How does the user figure out which devices are closer to which physical CPU in the platform in a QPI system without using proximity domain info?

Allen

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com
> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Yuji Shimada
> Sent: Wednesday, November 19, 2008 12:26 AM
> To: Espen Skoglund
> Cc: xen-devel@lists.xensource.com; 'Keir Fraser'
> Subject: Re: [Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.
>
> [...]
Espen Skoglund
2008-Nov-20 20:00 UTC
RE: [Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.
You only require more memory if you duplicate the structures per IOMMU. While this is indeed possible (and may even be the desired solution) it is not what I suggested.

And you're making the assumption here that the guest is assigned to the node of the IOMMU. As Allen points out, how does a user make this decision? And in many cases I would expect that you would not want to assign many guests to the same node anyway. By at least keeping the IOMMU page tables local to the node you'll get lower latencies for the page table walker.

	eSk

>> Hi Espen,
>>
>> Your suggestion of allocating memory from one of the nodes to which
>> the IOMMU is attached would improve performance further, but it
>> needs more memory, because the structures would then be required
>> per IOMMU.
>>
>> My patch keeps the current implementation: one Device Assignment
>> Structure and one Address Translation Structure per guest.
>>
>> Xen's user will assign a device to a guest that is close to it, so
>> the node of the guest and the node connected to the IOMMU will be
>> the same. As a result, memory performance is improved with my patch.
>>
>> [...]
Yuji Shimada
2008-Nov-26 08:32 UTC
Re: [Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.
Hi, Espen & Kay,

> Are you assuming the user will pin the guest to a physical CPU?

Yes.
On Xen, memory is automatically allocated from the same NUMA node as the physical CPU, and it is not moved after the initial allocation. If we don't pin a guest to physical CPUs, the guest can run on any physical CPU. So, when the user uses a NUMA machine, it is better to pin the guest to physical CPUs, because the latency becomes lower.

> How does the user figure out which devices are closer to which
> physical CPU in the platform in a QPI system without using proximity
> domain info?

The user can read the machine specification and assign to a guest an I/O device that is on the same NUMA node as the guest's CPUs.

> By at least keeping the IOMMU page tables local to the node you'll
> get lower latencies for the page table walker.

We can get lower latency by using the proximity domain info of DMAR, but it needs more modifications than my patch. I will not work on this. If anyone develops this function, I will not be against it.

Thanks,
--
Yuji Shimada

On Thu, 20 Nov 2008 20:00:08 +0000
Espen Skoglund <espen.skoglund@netronome.com> wrote:

> You only require more memory if you duplicate the structures per
> IOMMU. While this is indeed possible (and may even be the desired
> solution) it is not what I suggested.
>
> [...]

On Wed, 19 Nov 2008 10:57:10 -0800
"Kay, Allen M" <allen.m.kay@intel.com> wrote:

> Are you assuming the user will pin the guest to a physical CPU? How
> does the user figure out which devices are closer to which physical
> CPU in the platform in a QPI system without using proximity domain
> info?
>
> Allen
>
> [...]
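The pinning discussed above was done with the standard Xen tools of the period. A minimal example, with made-up domain name and CPU numbers that stand in for the CPUs of one node:

    # Pin VCPU 0 of a running guest to physical CPUs 0-3:
    xm vcpu-pin guest-domain 0 0-3

    # Or restrict the guest at creation time via its config file:
    cpus = "0-3"

Which physical CPUs belong to which node has to be taken from the machine's documentation, as discussed above.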
Kay, Allen M
2008-Nov-26 11:48 UTC
RE: [Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.
> We can get lower latency by using the proximity domain info of DMAR,
> but it needs more modifications than my patch. I will not work on
> this. If anyone develops this function, I will not be against it.

Proximity domain info support is on our to-do list. We are planning to implement it in Q1 of next year. It is a little more change than your patch, but I don't think it is that much more: we just need to parse the proximity domain info in ACPI and then use the node id when allocating memory for the root, context and page tables.

Allen

> -----Original Message-----
> From: Yuji Shimada [mailto:shimada-yxb@necst.nec.co.jp]
> Sent: Wednesday, November 26, 2008 12:32 AM
> To: Espen Skoglund; Kay, Allen M
> Cc: 'Keir Fraser'; xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.
>
> [...]
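The per-IOMMU, node-local allocation described above could look roughly like the sketch below. The `node` field on `struct iommu` and the helper name are assumptions for illustration only, not existing Xen code at the time of this thread; the node id is assumed to have been filled in from the ACPI proximity domain information for each remapping unit.

    /* Hypothetical sketch: allocate a VT-d table on the node of the
     * IOMMU itself.  'iommu->node' is an assumed field, populated from
     * the ACPI DMAR proximity domain info during IOMMU enumeration. */
    static u64 alloc_pgtable_maddr_on_iommu_node(struct iommu *iommu)
    {
        struct page_info *pg;
        unsigned int memflags = 0;

        if ( iommu != NULL && iommu->node != NUMA_NO_NODE )
            memflags = MEMF_node(iommu->node);

        pg = alloc_domheap_page(NULL, memflags);
        if ( !pg )
            return 0;

        /* ...zero and cache-flush the page, as alloc_pgtable_maddr()
         * does today, before handing it back... */
        return (u64)page_to_mfn(pg) << PAGE_SHIFT;
    }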