Hi Keir,

Currently, based on Xen's scheduler, if users don't set VCPU affinity, a VCPU can run on any physical CPU in the machine. On a NUMA machine this hurts performance, because memory accesses carry extra latency whenever the CPU and the memory are on different nodes. So I think there may be a need for a mechanism that makes Xen run better on NUMA machines even when users don't set VCPU affinity. I have thought of these policies:

1: Make no changes, and only supply per-node free memory information to help the guest set a proper VCPU affinity. This is already realized in my last patch.

2: When max-vcpus is set during domain build, choose a node based on the current policy for placing VCPUs on CPUs, which mainly considers CPU balance, then set that node's cpumask as the affinity of all VCPUs, binding the domain to that node. The disadvantage of this method is that if the user configures a VCPU affinity after max-vcpus is set, the affinity will be set again. This is done in the first patch attached.

3: We can do this in the control panel (CP). If the user doesn't set a VCPU affinity, we can choose one for the guest domain. This needs a new policy to choose which node the guest will run on in a NUMA machine. I think it is reasonable to consider memory usage first; I do this in the second patch. That patch depends on my last patch, which reports the free memory size per node.

Which method do you prefer? Comments are welcome. Thanks.
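(For illustration only: a minimal sketch of the memory-first idea in option 3, with made-up names rather than the real xend code. Given per-node free memory and the guest's requirement, collect the nodes that can hold the whole guest and prefer the one with the most headroom.)

# Sketch only; node_free_mem and needed_mem are illustrative names.
def pick_node_by_memory(node_free_mem, needed_mem):
    # Nodes with enough free memory to hold the whole guest.
    candidates = [n for n in range(len(node_free_mem))
                  if node_free_mem[n] >= needed_mem]
    if not candidates:
        # No single node fits; fall back to the node with most free memory.
        return node_free_mem.index(max(node_free_mem))
    # Among fitting nodes, take the one with the most headroom.
    return max(candidates, key=lambda n: node_free_mem[n])

# Example: nodes with 2048/4096/1024 MB free, guest needs 1536 MB -> node 1.
print(pick_node_by_memory([2048, 4096, 1024], 1536))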
Option 3, please. A better allocation policy might look for a node with enough memory that has the least load (perhaps measured as crudely as the node with the fewest VCPUs bound to it).

-- Keir
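(A crude sketch of that metric, with illustrative names, untested: for each candidate node, count how many existing VCPUs are already allowed to run on it, and pick the candidate with the smallest count.)

# Sketch only: candidates are node indices with enough free memory,
# node_to_cpu maps node -> list of CPUs, vcpu_cpumaps holds one
# CPU-affinity list per existing VCPU in the system.
def pick_least_loaded(candidates, node_to_cpu, vcpu_cpumaps):
    load = dict((n, 0) for n in candidates)
    for cpumap in vcpu_cpumaps:
        for n in candidates:
            # A VCPU loads every node whose CPUs appear in its affinity.
            if any(cpu in cpumap for cpu in node_to_cpu[n]):
                load[n] += 1
    return min(candidates, key=lambda n: load[n])

# Example: two VCPUs bound to node 0, one to node 1 -> node 1 wins.
print(pick_least_loaded([0, 1], [[0, 1], [2, 3]], [[0, 1], [0], [2, 3]]))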
Duan, Ronghui
2008-Feb-27 08:56 UTC
RE: [Xen-devel] [PATCH] Bind guest with NUMA node.
Is this the one that you want?

Thanks

Set VCPU affinity to get better performance on NUMA machines.

Signed-off-by: Duan Ronghui <ronghui.duan@intel.com>

diff -r 9a890c817922 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py	Wed Feb 27 18:53:08 2008 +0800
+++ b/tools/python/xen/xend/XendDomainInfo.py	Thu Feb 28 01:12:23 2008 +0800
@@ -1961,6 +1961,39 @@ class XendDomainInfo:
         if self.info['cpus'] is not None and len(self.info['cpus']) > 0:
             for v in range(0, self.info['VCPUs_max']):
                 xc.vcpu_setaffinity(self.domid, v, self.info['cpus'])
+        else:
+            info = xc.physinfo()
+            if info['nr_nodes'] > 1:
+                node_memory_list = info['node_to_memory']
+                needmem = self.image.getRequiredAvailableMemory(self.info['memory_dynamic_max']) / 1024
+                candidate_node_list = []
+                for i in range(0, info['nr_nodes']):
+                    if node_memory_list[i] >= needmem:
+                        candidate_node_list.append(i)
+                if candidate_node_list is None or len(candidate_node_list) == 1:
+                    index = node_memory_list.index( max(node_memory_list) )
+                    cpumask = info['node_to_cpu'][index]
+                else:
+                    nodeload = [0]
+                    nodeload = nodeload * info['nr_nodes']
+                    from xen.xend import XendDomain
+                    doms = XendDomain.instance().list('all')
+                    for dom in doms:
+                        cpuinfo = dom.getVCPUInfo()
+                        for vcpu in sxp.children(cpuinfo, 'vcpu'):
+                            def vinfo(n, t):
+                                return t(sxp.child_value(vcpu, n))
+                            cpumap = vinfo('cpumap', list)
+                            for i in candidate_node_list:
+                                node_cpumask = info['node_to_cpu'][i]
+                                for j in node_cpumask:
+                                    if j in cpumap:
+                                        nodeload[i] += 1
+                                        break
+                    index = nodeload.index( min(nodeload) )
+                    cpumask = info['node_to_cpu'][index]
+                for v in range(0, self.info['VCPUs_max']):
+                    xc.vcpu_setaffinity(self.domid, v, cpumask)

         # Use architecture- and image-specific calculations to determine
         # the various headrooms necessary, given the raw configured
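(As a quick standalone trace of the nodeload accounting above, with toy data rather than real domains: a VCPU bumps the load of every node whose CPUs appear in its cpumap, at most once per node.)

# Toy data: 2 nodes with 2 CPUs each, three existing VCPUs.
node_to_cpu = [[0, 1], [2, 3]]
vcpu_cpumaps = [[0, 1], [0, 1], [2, 3]]

nodeload = [0] * len(node_to_cpu)
for cpumap in vcpu_cpumaps:
    for i in range(len(node_to_cpu)):
        for j in node_to_cpu[i]:
            if j in cpumap:
                nodeload[i] += 1
                break  # count each VCPU at most once per node
print(nodeload)  # [2, 1] -> node 1 is the least loaded and gets the guest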
Looks fine to me.

K.