Hi Keir, Ian,

  Attached is a patch which implements the "xm cpuinfo" and "xm cacheinfo" commands. Output of these commands on a 4-way Paxville system (2 cores per socket, 2 threads per core) and a 2-way Clovertown (quad-core) system is shown below.

  It would be easy to extend this functionality to other architectures such as IA64 or PowerPC by reusing most of the code. Other architectures would only need to implement the XEN_SYSCTL_cpuinfo: and XEN_SYSCTL_cacheinfo: switch cases in their arch_do_sysctl() function in the hypervisor to get this functionality.

  The changes are spread across 3 areas, viz. the hypervisor, libxc and the Python code, as seen in the diffstat below.

  Please apply and/or provide comments on the patch.

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>

[nitin@lvmm64 multicore]$ diffstat multicore_info_patch18.diff
 tools/libxc/xc_misc.c                        |   37 ++++
 tools/libxc/xenctrl.h                        |   11 +
 tools/python/xen/lowlevel/xc/xc.c            |  139 +++++++++++++++++-
 tools/python/xen/util/bugtool.py             |    2
 tools/python/xen/xend/XendNode.py            |    8 +
 tools/python/xen/xend/server/XMLRPCServer.py |    3
 tools/python/xen/xm/main.py                  |  110 ++++++++++++++
 xen/arch/x86/cpu/intel_cacheinfo.c           |  208 +++++++++++++++++++++++++++
 xen/arch/x86/sysctl.c                        |  111 ++++++++++++++
 xen/include/public/sysctl.h                  |   55 +++++++
 10 files changed, 681 insertions(+), 3 deletions(-)

[root@lharwich1-fc4-64 ~]# xm cpuinfo
cpu  core  package  node
id   id    id       id
0    0     2        ff
1    1     3        ff
2    0     0        ff
3    1     1        ff
4    1     2        ff
5    0     3        ff
6    1     0        ff
7    0     1        ff
8    0     2        ff
9    1     3        ff
a    0     0        ff
b    1     1        ff
c    1     2        ff
d    0     3        ff
e    1     0        ff
f    0     1        ff

[root@lharwich1-fc4-64 ~]# xm cpuinfo -x
cpu  core  package  node  thread            core
id   id    id       id    siblings_map      siblings_map
0    0     2        ff    0000000000000101  0000000000001111
1    1     3        ff    0000000000000202  0000000000002222
2    0     0        ff    0000000000000404  0000000000004444
3    1     1        ff    0000000000000808  0000000000008888
4    1     2        ff    0000000000001010  0000000000001111
5    0     3        ff    0000000000002020  0000000000002222
6    1     0        ff    0000000000004040  0000000000004444
7    0     1        ff    0000000000008080  0000000000008888
8    0     2        ff    0000000000000101  0000000000001111
9    1     3        ff    0000000000000202  0000000000002222
a    0     0        ff    0000000000000404  0000000000004444
b    1     1        ff    0000000000000808  0000000000008888
c    1     2        ff    0000000000001010  0000000000001111
d    0     3        ff    0000000000002020  0000000000002222
e    1     0        ff    0000000000004040  0000000000004444
f    0     1        ff    0000000000008080  0000000000008888

[root@lharwich1-fc4-64 ~]# xm cacheinfo
cpu  cache  cache  cache  shared
id   level  type   size   cpus_map
0    1      DATA   16KB   00000101
0    2      UNI    1MB    00000101
0    3      UNI    16MB   00001111
1    1      DATA   16KB   00000202
1    2      UNI    1MB    00000202
1    3      UNI    16MB   00002222
2    1      DATA   16KB   00000404
2    2      UNI    1MB    00000404
2    3      UNI    16MB   00004444
3    1      DATA   16KB   00000808
3    2      UNI    1MB    00000808
3    3      UNI    16MB   00008888
4    1      DATA   16KB   00001010
4    2      UNI    1MB    00001010
4    3      UNI    16MB   00001111
5    1      DATA   16KB   00002020
5    2      UNI    1MB    00002020
5    3      UNI    16MB   00002222
6    1      DATA   16KB   00004040
6    2      UNI    1MB    00004040
6    3      UNI    16MB   00004444
7    1      DATA   16KB   00008080
7    2      UNI    1MB    00008080
7    3      UNI    16MB   00008888
8    1      DATA   16KB   00000101
8    2      UNI    1MB    00000101
8    3      UNI    16MB   00001111
9    1      DATA   16KB   00000202
9    2      UNI    1MB    00000202
9    3      UNI    16MB   00002222
a    1      DATA   16KB   00000404
a    2      UNI    1MB    00000404
a    3      UNI    16MB   00004444
b    1      DATA   16KB   00000808
b    2      UNI    1MB    00000808
b    3      UNI    16MB   00008888
c    1      DATA   16KB   00001010
c    2      UNI    1MB    00001010
c    3      UNI    16MB   00001111
d    1      DATA   16KB   00002020
d    2      UNI    1MB    00002020
d    3      UNI    16MB   00002222
e    1      DATA   16KB   00004040
e    2      UNI    1MB    00004040
e    3      UNI    16MB   00004444
f    1      DATA   16KB   00008080
f    2      UNI    1MB    00008080
f    3      UNI    16MB   00008888

[root@lharwich1-fc4-64 ~]# xm cacheinfo -x
cpu  cache  cache  cache  shared    no_of  ways_of        physical        coherency
id   level  type   size   cpus_map  sets   associativity  line_partition  line_size
0    1      DATA   16KB   00000101  1f     7              0               3f
0    2      UNI    1MB    00000101  3ff    7              1               3f
0    3      UNI    16MB   00001111  3fff   f              0               3f
1    1      DATA   16KB   00000202  1f     7              0               3f
1    2      UNI    1MB    00000202  3ff    7              1               3f
1    3      UNI    16MB   00002222  3fff   f              0               3f
2    1      DATA   16KB   00000404  1f     7              0               3f
2    2      UNI    1MB    00000404  3ff    7              1               3f
2    3      UNI    16MB   00004444  3fff   f              0               3f
3    1      DATA   16KB   00000808  1f     7              0               3f
3    2      UNI    1MB    00000808  3ff    7              1               3f
3    3      UNI    16MB   00008888  3fff   f              0               3f
4    1      DATA   16KB   00001010  1f     7              0               3f
4    2      UNI    1MB    00001010  3ff    7              1               3f
4    3      UNI    16MB   00001111  3fff   f              0               3f
5    1      DATA   16KB   00002020  1f     7              0               3f
5    2      UNI    1MB    00002020  3ff    7              1               3f
5    3      UNI    16MB   00002222  3fff   f              0               3f
6    1      DATA   16KB   00004040  1f     7              0               3f
6    2      UNI    1MB    00004040  3ff    7              1               3f
6    3      UNI    16MB   00004444  3fff   f              0               3f
7    1      DATA   16KB   00008080  1f     7              0               3f
7    2      UNI    1MB    00008080  3ff    7              1               3f
7    3      UNI    16MB   00008888  3fff   f              0               3f
8    1      DATA   16KB   00000101  1f     7              0               3f
8    2      UNI    1MB    00000101  3ff    7              1               3f
8    3      UNI    16MB   00001111  3fff   f              0               3f
9    1      DATA   16KB   00000202  1f     7              0               3f
9    2      UNI    1MB    00000202  3ff    7              1               3f
9    3      UNI    16MB   00002222  3fff   f              0               3f
a    1      DATA   16KB   00000404  1f     7              0               3f
a    2      UNI    1MB    00000404  3ff    7              1               3f
a    3      UNI    16MB   00004444  3fff   f              0               3f
b    1      DATA   16KB   00000808  1f     7              0               3f
b    2      UNI    1MB    00000808  3ff    7              1               3f
b    3      UNI    16MB   00008888  3fff   f              0               3f
c    1      DATA   16KB   00001010  1f     7              0               3f
c    2      UNI    1MB    00001010  3ff    7              1               3f
c    3      UNI    16MB   00001111  3fff   f              0               3f
d    1      DATA   16KB   00002020  1f     7              0               3f
d    2      UNI    1MB    00002020  3ff    7              1               3f
d    3      UNI    16MB   00002222  3fff   f              0               3f
e    1      DATA   16KB   00004040  1f     7              0               3f
e    2      UNI    1MB    00004040  3ff    7              1               3f
e    3      UNI    16MB   00004444  3fff   f              0               3f
f    1      DATA   16KB   00008080  1f     7              0               3f
f    2      UNI    1MB    00008080  3ff    7              1               3f
f    3      UNI    16MB   00008888  3fff   f              0               3f

[root@lclovertown1 ~]# xm cpuinfo
cpu  core  package  node
id   id    id       id
0    0     0        ff
1    0     1        ff
2    1     0        ff
3    1     1        ff
4    2     0        ff
5    2     1        ff
6    3     0        ff
7    3     1        ff

[root@lclovertown1 ~]# xm cpuinfo -x
cpu  core  package  node  thread            core
id   id    id       id    siblings_map      siblings_map
0    0     0        ff    0000000000000001  0000000000000055
1    0     1        ff    0000000000000002  00000000000000aa
2    1     0        ff    0000000000000004  0000000000000055
3    1     1        ff    0000000000000008  00000000000000aa
4    2     0        ff    0000000000000010  0000000000000055
5    2     1        ff    0000000000000020  00000000000000aa
6    3     0        ff    0000000000000040  0000000000000055
7    3     1        ff    0000000000000080  00000000000000aa

[root@lclovertown1 ~]# xm cacheinfo
cpu  cache  cache  cache  shared
id   level  type   size   cpus_map
0    1      DATA   32KB   00000001
0    1      INST   32KB   00000001
0    2      UNI    4MB    00000005
1    1      DATA   32KB   00000002
1    1      INST   32KB   00000002
1    2      UNI    4MB    0000000a
2    1      DATA   32KB   00000004
2    1      INST   32KB   00000004
2    2      UNI    4MB    00000005
3    1      DATA   32KB   00000008
3    1      INST   32KB   00000008
3    2      UNI    4MB    0000000a
4    1      DATA   32KB   00000010
4    1      INST   32KB   00000010
4    2      UNI    4MB    00000050
5    1      DATA   32KB   00000020
5    1      INST   32KB   00000020
5    2      UNI    4MB    000000a0
6    1      DATA   32KB   00000040
6    1      INST   32KB   00000040
6    2      UNI    4MB    00000050
7    1      DATA   32KB   00000080
7    1      INST   32KB   00000080
7    2      UNI    4MB    000000a0

[root@lclovertown1 ~]# xm cacheinfo -x
cpu  cache  cache  cache  shared    no_of  ways_of        physical        coherency
id   level  type   size   cpus_map  sets   associativity  line_partition  line_size
0    1      DATA   32KB   00000001  3f     7              0               3f
0    1      INST   32KB   00000001  3f     7              0               3f
0    2      UNI    4MB    00000005  fff    f              0               3f
1    1      DATA   32KB   00000002  3f     7              0               3f
1    1      INST   32KB   00000002  3f     7              0               3f
1    2      UNI    4MB    0000000a  fff    f              0               3f
2    1      DATA   32KB   00000004  3f     7              0               3f
2    1      INST   32KB   00000004  3f     7              0               3f
2    2      UNI    4MB    00000005  fff    f              0               3f
3    1      DATA   32KB   00000008  3f     7              0               3f
3    1      INST   32KB   00000008  3f     7              0               3f
3    2      UNI    4MB    0000000a  fff    f              0               3f
4    1      DATA   32KB   00000010  3f     7              0               3f
4    1      INST   32KB   00000010  3f     7              0               3f
4    2      UNI    4MB    00000050  fff    f              0               3f
5    1      DATA   32KB   00000020  3f     7              0               3f
5    1      INST   32KB   00000020  3f     7              0               3f
5    2      UNI    4MB    000000a0  fff    f              0               3f
6    1      DATA   32KB   00000040  3f     7              0               3f
6    1      INST   32KB   00000040  3f     7              0               3f
6    2      UNI    4MB    00000050  fff    f              0               3f
7    1      DATA   32KB   00000080  3f     7              0               3f
7    1      INST   32KB   00000080  3f     7              0               3f
7    2      UNI    4MB    000000a0  fff    f              0               3f

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation.
-------------------------------------------------------------------------
The mind is like a parachute; it works much better when it's open.
Some comments:

1. Use of cpumaps in the sysctl interface assumes no more than 64 CPUs. We got rid of that assumption everywhere else. You don't really need the cpumaps anyway: tools can deduce them from other information (e.g., matching up core/package ids across cpus).

2. The cacheinfo call is heinously Intel-specific, especially the 'type' field, which corresponds to a bitfield from an Intel-specific CPUID leaf. What is returned if the cacheinfo is requested on an older Intel box, or on an AMD, Via, Transmeta, etc. CPU?

3. What are these calls for? Beyond dumping a lot of info in a new xm call, who wants this detailed info? Particularly on the cacheinfo side I could imagine some apps wanting to know what resources they have to play with, but these sysctls are available only to dom0. And those apps will probably just go straight at CPUID anyway, and assume the cache hierarchy is symmetric (a reasonably safe assumption really).

 -- Keir
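[As an illustration of Keir's point 1, here is a minimal sketch of how a tool could rebuild the sibling maps purely from the per-CPU core/package ids that "xm cpuinfo" already prints. The cpu_topo_t layout and the hard-coded sample rows are invented for the example, not taken from the patch; note it also carries the same 64-CPU limit Keir is objecting to.]

    #include <stdio.h>
    #include <stdint.h>

    /* Hypothetical per-CPU record, as a tool might parse it from "xm cpuinfo". */
    typedef struct {
        unsigned int cpu_id, core_id, package_id;
    } cpu_topo_t;

    int main(void)
    {
        /* Sample data only: the first four logical CPUs of the Clovertown box. */
        cpu_topo_t cpu[] = {
            { 0, 0, 0 }, { 1, 0, 1 }, { 2, 1, 0 }, { 3, 1, 1 },
        };
        unsigned int n = sizeof(cpu) / sizeof(cpu[0]);

        for (unsigned int i = 0; i < n; i++) {
            /* 64-bit maps, so this sketch assumes cpu_id < 64. */
            uint64_t thread_siblings = 0, core_siblings = 0;

            for (unsigned int j = 0; j < n; j++) {
                if (cpu[j].package_id == cpu[i].package_id) {
                    core_siblings |= 1ULL << cpu[j].cpu_id;       /* same package */
                    if (cpu[j].core_id == cpu[i].core_id)
                        thread_siblings |= 1ULL << cpu[j].cpu_id; /* same core too */
                }
            }
            printf("cpu %u: thread_siblings=%016llx core_siblings=%016llx\n",
                   cpu[i].cpu_id,
                   (unsigned long long)thread_siblings,
                   (unsigned long long)core_siblings);
        }
        return 0;
    }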
Kamble, Nitin A
2006-Dec-11 23:25 UTC
RE: [Xen-devel] [PATCH] Export Multicore information
Hi Keir,
  Thanks for the response. My comments below.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation.
-------------------------------------------------------------------------
The mind is like a parachute; it works much better when it's open.

________________________________
From: Keir Fraser [mailto:keir@xensource.com]

> 1. Use of cpumaps in the sysctl interface assumes no more than 64 CPUs. We
> got rid of that assumption everywhere else. You don't really need the
> cpumaps anyway: tools can deduce them from other information (e.g.,
> matching up core/package ids across cpus).

Makes sense to me. It was kind of extra information. I was trying to provide as much information as a user can get when running on a native Linux kernel.

> 2. The cacheinfo call is heinously Intel-specific, especially the 'type'
> field, which corresponds to a bitfield from an Intel-specific CPUID leaf.
> What is returned if the cacheinfo is requested on an older Intel box, or
> on an AMD, Via, Transmeta, etc. CPU?

Yes, I agree that the cacheinfo code in the patch I sent is very Intel-specific. The same code will not run on other x86 CPUs, and on them the xm command will not provide any information at all, because the hypervisor checks for an Intel processor before gathering this information. The purpose of providing this information is to enable the administrator/end user to make a better decision when assigning physical CPUs to different VMs, and the real use of this information is with multi-core or hyper-threaded processors. I am assuming that other people will extend this code to support other x86 and non-x86 multi-core processors.

> 3. What are these calls for? Beyond dumping a lot of info in a new xm call,
> who wants this detailed info? Particularly on the cacheinfo side I could
> imagine some apps wanting to know what resources they have to play with,
> but these sysctls are available only to dom0. And those apps will probably
> just go straight at CPUID anyway, and assume the cache hierarchy is
> symmetric (a reasonably safe assumption really).

This information is for the administrator/end user of the system to make better decisions about partitioning the processors in the system across multiple domains. Dom0 may not be running on all the CPUs in the system, so this gives a view from the hypervisor of all the CPUs that the hypervisor sees. Also, the cache info may not always be symmetric in the system.

Let me know if you have any further concerns.
On Mon, Dec 11, 2006 at 03:25:49PM -0800, Kamble, Nitin A wrote:
> This information is for the administrator/end user of the system to make
> better decisions about partitioning the processors in the system across
> multiple domains. Dom0 may not be running on all the CPUs in the system,
> so this gives a view from the hypervisor of all the CPUs that the
> hypervisor sees. Also, the cache info may not always be symmetric in the
> system.

Whilst it might be useful to provide hypercalls for such, I do not think it makes sense for 'xm' to have this information. It should be provided as part of whatever OS tools already exist to output this information. Adding xm cacheinfo seems like a hack.

regards
john
Kamble, Nitin A
2006-Dec-12 00:26 UTC
RE: [Xen-devel] [PATCH] Export Multicore information
Hi John,
  Thanks for your comments. The situation is that dom0 may not see all the processors in the system; only the hypervisor is capable of seeing them all. I would let tools/apps on dom0 see information about the CPUs assigned to dom0, not the whole set available in the system. So it is necessary to have a hypervisor-specific interface, and in my opinion xm is the best choice for that.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation.
-------------------------------------------------------------------------
The mind is like a parachute; it works much better when it's open.
On Mon, Dec 11, 2006 at 04:26:53PM -0800, Kamble, Nitin A wrote:
> Thanks for your comments. The situation is that dom0 may not see all the
> processors in the system; only the hypervisor is capable of seeing them
> all. I would let tools/apps on dom0 see information about the CPUs
> assigned to dom0, not the whole set available in the system. So it is
> necessary to have a hypervisor-specific interface, and in my opinion xm
> is the best choice for that.

This didn't answer my objection at all... again, whilst it may make sense to add hypercalls for such information, can you explain why you're adding stuff to xm rather than just letting the relevant tools output the information? Be it /proc, or whatever?

regards
john
On 12/12/06 12:26 am, "Kamble, Nitin A" <nitin.a.kamble@intel.com> wrote:

> Thanks for your comments. The situation is that dom0 may not see all the
> processors in the system; only the hypervisor is capable of seeing them
> all. I would let tools/apps on dom0 see information about the CPUs
> assigned to dom0, not the whole set available in the system. So it is
> necessary to have a hypervisor-specific interface, and in my opinion xm
> is the best choice for that.

A strong argument against 'xm cacheinfo' and the required extra hypercall is that x86 cache hierarchies are always symmetric, to the best of my knowledge. So it does not matter which CPU dom0 happens to interrogate for cache info -- the information can be extrapolated to all other CPUs in the system. So the admin can run one of the many little tools that can be downloaded and that dump out interesting CPUID information.

The physical CPU topology hypercall (i.e., the smaller half of your patch) is more interesting to me.

 -- Keir
Kamble, Nitin A
2006-Dec-12 19:27 UTC
RE: [Xen-devel] [PATCH] Export Multicore information
Hi Keir,
  The most interesting field of the cacheinfo is the "shared cpus map". Take the example of Clovertown: it has 4 cores per socket. Basically there are 2 dual-core dies in the package, so there are 2 pairs of cores in the package which share the L2 cache within the pair, not across the package. With this information the end user/admin can make a better selection when she decides to split the cores from a single package across multiple domains.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation.
-------------------------------------------------------------------------
The mind is like a parachute; it works much better when it's open.
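[To make the pairing concrete, a small sketch that turns the "shared cpus_map" column into the L2-sharing groups Nitin describes. The l2_map values are copied from the Clovertown "xm cacheinfo" output earlier in the thread; the program itself is only illustrative and reads nothing from xm.]

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* L2 shared_cpus_map per CPU, taken from the Clovertown output above. */
        uint64_t l2_map[8] = {
            0x05, 0x0a, 0x05, 0x0a, 0x50, 0xa0, 0x50, 0xa0
        };

        /* Print each distinct map once, listing the CPUs that share that L2. */
        for (int i = 0; i < 8; i++) {
            int seen = 0;
            for (int j = 0; j < i; j++)
                if (l2_map[j] == l2_map[i])
                    seen = 1;           /* this cache was already printed */
            if (seen)
                continue;

            printf("L2 %#06llx shared by cpus:", (unsigned long long)l2_map[i]);
            for (int c = 0; c < 8; c++)
                if (l2_map[i] & (1ULL << c))
                    printf(" %d", c);
            printf("\n");
        }
        return 0;
    }

Run against the data above, this groups the cores into the pairs {0,2}, {1,3}, {4,6}, {5,7} -- i.e. the L2 is shared within a pair, not across the whole package.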
Emmanuel Ackaouy
2006-Dec-12 20:01 UTC
Re: [Xen-devel] [PATCH] Export Multicore information
Nitin,

In terms of topology, aren't the sibling and core maps enough information? I guess we could plan ahead for nodes in the interface too. Do we really need to export cpu, core, and package ids though? Also, should we export info for anything other than online cpus?

I've only looked at the code to support XEN_SYSCTL_cpuinfo in sysctl.c. I don't get the "count" check here. Looks like we ignore the size of the array passed by the user: we read it but ignore it. Am I missing something?

Cheers,
Emmanuel.
Kamble, Nitin A
2006-Dec-12 23:39 UTC
RE: [Xen-devel] [PATCH] Export Multicore information
Hi Emmanuel,
  My comments below.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation.
-------------------------------------------------------------------------
The mind is like a parachute; it works much better when it's open.

> -----Original Message-----
> From: Emmanuel Ackaouy [mailto:ack@xensource.com]
> Sent: Tuesday, December 12, 2006 12:02 PM
>
> In terms of topology, aren't the sibling and core maps enough information?
> I guess we could plan ahead for nodes in the interface too. Do we really
> need to export cpu, core, and package ids though? Also, should we export
> info for anything other than online cpus?

Yes, the thread_siblings_map & core_siblings_map duplicate the information provided by thread_id, core_id & package_id, and we can get rid of one of these sets. That is the reason I had them as the extra information behind the -x command line switch. In my opinion keeping thread_id, core_id & package_id is better, because then we do not have to worry about the bitmap length in the future; Keir also suggested that in his first comment. I can easily take the sibling maps out of the implementation.

We do not have the ability to offline a processor from the hypervisor, do we? If we get cpu hotplug capability in the hypervisor, then this patch will need a bit of rework. Until then, using either cpu_online_map or cpu_possible_map is not much different, and we can stick to cpu_online_map if it makes the code look cleaner.

> I've only looked at the code to support XEN_SYSCTL_cpuinfo in sysctl.c.
> I don't get the "count" check here. Looks like we ignore the size of the
> array passed by the user: we read it but ignore it. Am I missing something?

I should have added some documentation comments in the code there explaining the usage of count. It makes more sense if you look at the code which calls this interface in xc.c. There are 2 conventions for calling this sysctl:

1. With count = 0 and cpuinfo_list = NULL. This fills the count field with the number of entries that a call using the 2nd convention can fill:

    if ((pi->count == 0) && guest_handle_is_null(pi->cpuinfo_list)) {
        pi->count = num_possible_cpus();
        if ( copy_to_guest(u_sysctl, sysctl, 1) ) {
            printk("sysctl XEN_SYSCTL_cpuinfo: copy to guest (sysctl) failed \n");
            ret = -EFAULT;
        }
        break; /* this inside the switch statement is similar to return */
    }

2. Here user space allocates the space for an array of count entries and passes it to the sysctl. In the user-level code from tools/python/xen/lowlevel/xc/xc.c, xc_cpuinfo() is the call down to the sysctl:

    /* 1st get the count of items in the list */
    info.count = 0;
    set_xen_guest_handle(info.cpuinfo_list, NULL);
    if ( (ret = xc_cpuinfo(self->xc_handle, &info)) != 0 ) {
        errno = ret;
        return PyErr_SetFromErrno(xc_error);
    }
    if ( info.count <= 0 ) {
        return PyErr_SetFromErrno(xc_error);
    }

    /* now allocate space for all items and get details of all */
    list = PyList_New(info.count);
    if (list == NULL) {
        return PyErr_NoMemory();
    }
    cpuinfo_list = PyMem_New(xc_cpuinfo_list_t, info.count);
    if (cpuinfo_list == NULL) {
        return PyErr_NoMemory();
    }
    set_xen_guest_handle(info.cpuinfo_list, cpuinfo_list);
    if ( xc_cpuinfo(self->xc_handle, &info) != 0 ) {
        PyMem_Free(cpuinfo_list);
        return PyErr_SetFromErrno(xc_error);
    }

> Cheers,
> Emmanuel.
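[For reference, the same two-call convention written as a plain C libxc consumer rather than a Python binding. This is only a sketch: it assumes the xc_cpuinfo() call, the xc_cpuinfo_list_t type and the count/cpuinfo_list fields added by this patch, and the xc_cpuinfo_t name for the info struct is a guess, so it will not build against an unpatched libxc.]

    /* Sketch of a libxc caller using the patch's two-call convention:
     * first call with count == 0 and a NULL list to learn the array size,
     * then call again with a caller-allocated array. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <xenctrl.h>

    int dump_cpuinfo(int xc_handle)
    {
        xc_cpuinfo_t info;              /* hypothetical name for the info struct */
        xc_cpuinfo_list_t *list;
        int i;

        /* First call: how many entries are there? */
        info.count = 0;
        set_xen_guest_handle(info.cpuinfo_list, NULL);
        if ( xc_cpuinfo(xc_handle, &info) != 0 || info.count <= 0 )
            return -1;

        list = malloc(info.count * sizeof(*list));
        if ( list == NULL )
            return -1;

        /* Second call: fill the caller-allocated array. */
        set_xen_guest_handle(info.cpuinfo_list, list);
        if ( xc_cpuinfo(xc_handle, &info) != 0 ) {
            free(list);
            return -1;
        }

        for ( i = 0; i < (int)info.count; i++ )
            printf("cpu %u: core %u package %u node %u\n",
                   (unsigned)list[i].cpu_id, (unsigned)list[i].core_id,
                   (unsigned)list[i].package_id, (unsigned)list[i].node_id);

        free(list);
        return 0;
    }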
On 12/12/06 7:27 pm, "Kamble, Nitin A" <nitin.a.kamble@intel.com> wrote:

> The most interesting field of the cacheinfo is the "shared cpus map".
> Take the example of Clovertown: it has 4 cores per socket. Basically
> there are 2 dual-core dies in the package, so there are 2 pairs of cores
> in the package which share the L2 cache within the pair, not across the
> package. With this information the end user/admin can make a better
> selection when she decides to split the cores from a single package
> across multiple domains.

Hm.... OK, so we can have cache hierarchies where a single cache is split among *some* of the threads in a core, or *some* of the cores in a package? So you end up needing the physical APIC id to be able to apply the CPUID information and find out *which* siblings are sharing?

I'm still tempted just to provide that APIC id info to the guest. Was my earlier presumption that the cache hierarchy in practice will always be symmetric correct? Because that could simplify things.

 -- Keir
On 13/12/06 08:44, "Keir Fraser" <keir@xensource.com> wrote:

> Hm.... OK, so we can have cache hierarchies where a single cache is split
> among *some* of the threads in a core, or *some* of the cores in a package?
> So you end up needing the physical APIC id to be able to apply the CPUID
> information and find out *which* siblings are sharing?
>
> I'm still tempted just to provide that APIC id info to the guest. Was my
> earlier presumption that the cache hierarchy in practice will always be
> symmetric correct? Because that could simplify things.

Actually, my main problem with these new info interfaces is that there is no reason to make them privileged except that they need to run on the correct physical CPU. Hence we've ended up with interfaces for MTRR access, microcode updates, MSR accesses, and this will start to add CPUID access also.

We already started to discuss general ways we could execute arbitrary guest code on the appropriate physical CPU. For this topology and cache info, why not make a user app which sets process-VCPU and VCPU-CPU affinities appropriately to run its CPUID payload in the right places? sched_setaffinity() (the Linux syscall) and xc_vcpu_setaffinity() should be all you need. Additionally, the affinity-setting code ought to be generically useful in other scenarios too.

 -- Keir
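[One possible shape of the non-Xen half of the user app Keir sketches, shown only as an illustration: pin the calling process to each CPU in turn with sched_setaffinity() and read Intel's deterministic cache parameters leaf (CPUID leaf 4) there. On its own this only walks the (V)CPUs the calling domain runs on; covering every physical CPU would additionally need the xc_vcpu_setaffinity() step Keir mentions, which is omitted here.]

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <unistd.h>

    /* Execute CPUID for the given leaf/subleaf on whichever CPU we are
     * currently pinned to. */
    static void cpuid(uint32_t leaf, uint32_t subleaf,
                      uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx)
    {
        asm volatile ( "cpuid"
                       : "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx)
                       : "0" (leaf), "2" (subleaf) );
    }

    int main(void)
    {
        long ncpus = sysconf(_SC_NPROCESSORS_ONLN);

        for ( long cpu = 0; cpu < ncpus; cpu++ )
        {
            cpu_set_t set;
            uint32_t eax, ebx, ecx, edx;

            CPU_ZERO(&set);
            CPU_SET(cpu, &set);
            if ( sched_setaffinity(0, sizeof(set), &set) != 0 )
                continue;               /* cannot run on this (V)CPU */

            /* Walk CPUID leaf 4 sub-leaves until the cache type (EAX[4:0]) is 0.
             * On CPUs without leaf 4 this prints nothing, which is exactly the
             * non-Intel case discussed above. */
            for ( uint32_t idx = 0; ; idx++ )
            {
                cpuid(4, idx, &eax, &ebx, &ecx, &edx);
                if ( (eax & 0x1f) == 0 )
                    break;
                printf("cpu %ld: L%u type %u, %u sets, %u ways\n",
                       cpu, (eax >> 5) & 0x7, eax & 0x1f,
                       ecx + 1, ((ebx >> 22) & 0x3ff) + 1);
            }
        }
        return 0;
    }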
Emmanuel Ackaouy
2006-Dec-13 12:01 UTC
Re: [Xen-devel] [PATCH] Export Multicore information
On Tue, Dec 12, 2006 at 03:39:22PM -0800, Kamble, Nitin A wrote:
> I should have added some documentation comments in the code there
> explaining the usage of count. It makes more sense if you look at the
> code which calls this interface in xc.c. There are 2 conventions for
> calling this sysctl:
>
> 1. With count = 0 and cpuinfo_list = NULL.
>
> 2. Here user space allocates the space for an array of count entries and
> passes it to the sysctl.

(2) is the case I'm talking about.

From your patch in sysctl.c:

    +        count = min((int)pi->count, num_possible_cpus());
                          ^^^^^^^^^^ You grab "count" correctly.
    +
    +        if ( guest_handle_is_null(pi->cpuinfo_list) ) {
    +            printk("sysctl XEN_SYSCTL_cpuinfo: guest handle is null \n");
    +            ret = -EFAULT;
    +            break;
    +        }
    +
    +        j = 0;
    +        for_each_cpu(i) {
    +            extern int cpu_2_node[];
    +
    +            cl.cpu_id = i;
    +            cl.core_id = cpu_core_id[i];
    +            cl.package_id = phys_proc_id[i];
    +            cl.node_id = cpu_2_node[i];
    +            cl.thread_siblings_map = cpus_addr(cpu_sibling_map[i])[0];
    +            cl.core_siblings_map = cpus_addr(cpu_core_map[i])[0];
    +
    +            if ( copy_to_guest_offset(pi->cpuinfo_list, j, &cl, 1) ) {
    +                printk("sysctl XEN_SYSCTL_cpuinfo: copy to guest (cpuinfo_list) failed \n");
    +                ret -EFAULT;
    +                break;
    +            }
    +            j++;
                  ^^^^^^^^^^^ We never check that 'j' goes >= 'count'.
    +        }

As a matter of fact, it looks to me like 'count' is a dead variable as soon as it's set?
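[A sketch of the fix Emmanuel is asking for, written against the field names in the quoted hunk; it is illustrative only, not the actual follow-up patch. The sibling-map fields are left out here since Nitin has already offered to drop them, and the `ret -EFAULT` in the quoted hunk is written out in full on the assumption that it is missing an `=`.]

    /* Illustrative only: bound the copy loop by the caller-supplied count. */
    count = min((int)pi->count, num_possible_cpus());
    j = 0;
    for_each_cpu(i) {
        if ( j >= count )           /* respect the size of the guest's array */
            break;
        cl.cpu_id     = i;
        cl.core_id    = cpu_core_id[i];
        cl.package_id = phys_proc_id[i];
        cl.node_id    = cpu_2_node[i];
        if ( copy_to_guest_offset(pi->cpuinfo_list, j, &cl, 1) ) {
            ret = -EFAULT;
            break;
        }
        j++;
    }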
Kamble, Nitin A
2006-Dec-14 00:32 UTC
RE: [Xen-devel] [PATCH] Export Multicore information
Hi Emmanuel,
  Good catch. I will incorporate your suggestion in the next patch.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation.
-------------------------------------------------------------------------
The mind is like a parachute; it works much better when it's open.
Kamble, Nitin A
2006-Dec-14 01:20 UTC
RE: [Xen-devel] [PATCH] Export Multicore information
Hi Keir,
  I think you have interesting ideas. Let me discuss them with Jun & Asit.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation.
-------------------------------------------------------------------------
The mind is like a parachute; it works much better when it's open.
Kamble, Nitin A
2006-Dec-15 23:18 UTC
RE: [Xen-devel] [PATCH] Export Multicore information
Hi Keir,
  I had some discussion with Jun/Asit, and I understand your comments better now. :) I think your points are:

1. Access for non-privileged users: in a fashion similar to /proc/cpuinfo or /sys/cpuinfo. I think this is very much doable; I can add a new sysfs hierarchy for the dom0 kernel, like /sys/xen/cpu/cpu0/topology/core_id. Do you see any issue with this approach?

2. Temporarily altering the vcpu-pcpu binding: to get information for all cpus from guest land. One issue with this approach is that if the admin has given fewer than the total number of cpus to dom0, and has hard-bound the dom0 vcpus to pcpus, then this will not work. Instead, as you mentioned, we can make an interface just like the microcode one in the hypervisor for this. Sorry, I am not up to date on "We already started to discuss general ways we could execute arbitrary guest code on the appropriate physical CPU." Can you provide more details?

Also, exporting the APIC ID to the vcpu will be needed for guests to figure out their own cpu topology.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation.
-------------------------------------------------------------------------
The mind is like a parachute; it works much better when it's open.
On 15/12/06 11:18 pm, "Kamble, Nitin A" <nitin.a.kamble@intel.com> wrote:

> 2. Temporarily altering the vcpu-pcpu binding: to get information for all
> cpus from guest land. One issue with this approach is that if the admin
> has given fewer than the total number of cpus to dom0, and has hard-bound
> the dom0 vcpus to pcpus, then this will not work.

There is no such hard binding. You can always re-bind a dom0 vcpu to some other pcpu if you really want to. If we do add hard binding for dom0, it will be because we want a 1:1 relationship between vcpus and pcpus -- and if that is added we'll provide a way for the application to detect it and switch to a different mode of operation (where it switches affinity among vcpus, rather than sitting on one vcpu and switching among pcpus).

We don't want any more hypercall interfaces, and we probably don't want any more sysfs interfaces. This can all be coded in userland.

 -- Keir