dongxiao.xu@intel.com
2013-Nov-29 05:48 UTC
[PATCH v3 0/7] enable Cache QoS Monitoring (CQM) feature
From: Dongxiao Xu <dongxiao.xu@intel.com>

Changes from v2:
 - Address comments from Andrew Cooper, including:
   * Merge the tools stack changes into one patch.
   * Reduce the number of IPIs to one per socket.
   * Change the structures used for CQM data exchange between tools and Xen.
   * Miscellaneous format/variable/function name changes.
 - Address comments from Konrad Rzeszutek Wilk, including:
   * Simplify the error printing logic.
   * Add an xsm check for the newly added hypercalls.

Changes from v1:
 - Address comments from Andrew Cooper, including:
   * Change function names, e.g., alloc_cqm_rmid(), system_supports_cqm(), etc.
   * Change the order of some structure elements to save packing cost.
   * Correct some functions' return values.
   * Some programming style changes.
   * ...

Future generations of Intel Xeon processors may offer a monitoring capability
in each logical processor to measure specific quality-of-service metrics, for
example Cache QoS Monitoring, which reports L3 cache occupancy. For detailed
information, please refer to Intel SDM chapter 17.14.

Cache QoS Monitoring provides a layer of abstraction between applications and
logical processors through the use of Resource Monitoring IDs (RMIDs). In the
Xen design, each guest in the system can be assigned an RMID independently,
while RMID=0 is reserved for domains that do not enable the CQM service. When
any of a domain's vcpus is scheduled on a logical processor, the domain's RMID
is activated by programming the value into one specific MSR; when the vcpu is
scheduled out, RMID=0 is programmed into that MSR. The Cache QoS hardware
tracks cache utilization of memory accesses according to the RMIDs and reports
the monitored data via a counter register. With this solution, we can find out
how much L3 cache is used by a certain guest.

To attach the CQM service to a certain guest, two approaches are provided:
1) Create the guest with "pqos_cqm=1" set in its configuration file.
2) Use "xl pqos-attach cqm domid" for a running guest.

To detach the CQM service from a guest, users can:
1) Use "xl pqos-detach cqm domid" for a running guest.
2) Destroy the guest, which also detaches the CQM service.

To get the L3 cache usage, users can use the command:
$ xl pqos-list cqm (domid)

The data below is just an example showing how the CQM related data is exposed
to the end user.

[root@localhost]# xl pqos-list cqm
RMID count    56        RMID available    53
Name               ID  SocketID  L3C_Usage  SocketID  L3C_Usage
Domain-0            0         0   20127744         1   25231360
ExampleHVMDomain    1         0    3211264         1   10551296

Dongxiao Xu (7):
  x86: detect and initialize Cache QoS Monitoring feature
  x86: handle CQM resource when creating/destroying guests
  x86: dynamically attach/detach CQM service for a guest
  x86: collect CQM information from all sockets
  x86: enable CQM monitoring for each domain RMID
  xsm: add platform QoS related xsm policies
  tools: enable Cache QoS Monitoring feature for libxl/libxc

 tools/flask/policy/policy/modules/xen/xen.if |    2 +-
 tools/flask/policy/policy/modules/xen/xen.te |    5 +-
 tools/libxc/xc_domain.c                      |   48 ++++++
 tools/libxc/xenctrl.h                        |   12 ++
 tools/libxl/Makefile                         |    3 +-
 tools/libxl/libxl.h                          |    5 +
 tools/libxl/libxl_create.c                   |    3 +
 tools/libxl/libxl_pqos.c                     |  108 +++++++++++++
 tools/libxl/libxl_types.idl                  |    1 +
 tools/libxl/xl.h                             |    3 +
 tools/libxl/xl_cmdimpl.c                     |  138 ++++++++++++++++
 tools/libxl/xl_cmdtable.c                    |   15 ++
 xen/arch/x86/Makefile                        |    1 +
 xen/arch/x86/cpu/intel.c                     |    6 +
 xen/arch/x86/domain.c                        |   13 ++
 xen/arch/x86/domctl.c                        |   40 +++++
 xen/arch/x86/pqos.c                          |  219 ++++++++++++++++++++++++++
 xen/arch/x86/setup.c                         |    3 +
 xen/arch/x86/sysctl.c                        |   89 +++++++++++
 xen/common/domctl.c                          |    5 +-
 xen/include/asm-x86/cpufeature.h             |    1 +
 xen/include/asm-x86/domain.h                 |    2 +
 xen/include/asm-x86/msr-index.h              |    5 +
 xen/include/asm-x86/pqos.h                   |   46 ++++++
 xen/include/public/domctl.h                  |   26 +++
 xen/include/public/sysctl.h                  |   11 ++
 xen/include/xen/sched.h                      |    3 +
 xen/xsm/flask/hooks.c                        |    7 +
 xen/xsm/flask/policy/access_vectors          |   17 +-
 29 files changed, 830 insertions(+), 7 deletions(-)
 create mode 100644 tools/libxl/libxl_pqos.c
 create mode 100644 xen/arch/x86/pqos.c
 create mode 100644 xen/include/asm-x86/pqos.h

--
1.7.9.5
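
As a small illustration of how the per-socket figures in the sample listing
might be consumed, the sketch below sums one domain's L3C_Usage over all
sockets. The names cqm_sample, example_samples and total_l3_usage() are
hypothetical and not part of this series; the numbers are copied from the
sample listing, and their unit is assumed to be bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one (domain, socket) cell of the listing. */
struct cqm_sample {
    unsigned int domid;
    unsigned int socket;
    uint64_t l3c_usage;   /* unit assumed to be bytes */
};

/* Values copied from the sample "xl pqos-list cqm" output. */
static const struct cqm_sample example_samples[] = {
    { 0, 0, 20127744 },   /* Domain-0, socket 0 */
    { 0, 1, 25231360 },   /* Domain-0, socket 1 */
    { 1, 0,  3211264 },   /* ExampleHVMDomain, socket 0 */
    { 1, 1, 10551296 },   /* ExampleHVMDomain, socket 1 */
};

/* Sum the occupancy reported for one domain over all sockets. */
uint64_t total_l3_usage(const struct cqm_sample *s, size_t n,
                        unsigned int domid)
{
    uint64_t total = 0;
    size_t i;

    assert(s != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (s[i].domid == domid)
            total += s[i].l3c_usage;

    return total;
}
```

With the sample data, Domain-0 totals 45359104 and ExampleHVMDomain totals
13762560.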
dongxiao.xu@intel.com
2013-Nov-29 05:48 UTC
[PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring feature
From: Dongxiao Xu <dongxiao.xu@intel.com>

Detect platform QoS feature status and enumerate the resource types,
one of which is to monitor the L3 cache occupancy.

Also introduce a Xen grub command line parameter to control the
QoS feature status globally.

Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 xen/arch/x86/Makefile            |    1 +
 xen/arch/x86/cpu/intel.c         |    6 +++
 xen/arch/x86/pqos.c              |   95 ++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/setup.c             |    3 ++
 xen/include/asm-x86/cpufeature.h |    1 +
 xen/include/asm-x86/pqos.h       |   32 +++++++++++++
 6 files changed, 138 insertions(+)
 create mode 100644 xen/arch/x86/pqos.c
 create mode 100644 xen/include/asm-x86/pqos.h

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index d502bdf..54962e0 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -58,6 +58,7 @@ obj-y += crash.o
 obj-y += tboot.o
 obj-y += hpet.o
 obj-y += xstate.o
+obj-y += pqos.o
 
 obj-$(crash_debug) += gdbstub.o
 
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index 27fe762..f0d83ea 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -230,6 +230,12 @@ static void __devinit init_intel(struct cpuinfo_x86 *c)
          ( c->cpuid_level >= 0x00000006 ) &&
          ( cpuid_eax(0x00000006) & (1u<<2) ) )
         set_bit(X86_FEATURE_ARAT, c->x86_capability);
+
+    /* Check platform QoS monitoring capability */
+    if ((c->cpuid_level >= 0x00000007) &&
+        (cpuid_ebx(0x00000007) & (1u<<12)))
+        set_bit(X86_FEATURE_QOSM, c->x86_capability);
+
 }
 
 static struct cpu_dev intel_cpu_dev __cpuinitdata = {
diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
new file mode 100644
index 0000000..e2172c4
--- /dev/null
+++ b/xen/arch/x86/pqos.c
@@ -0,0 +1,95 @@
+/*
+ * pqos.c: Platform QoS related service for guest.
+ *
+ * Copyright (c) 2013, Intel Corporation
+ * Author: Jiongxi Li <jiongxi.li@intel.com>
+ * Author: Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+#include <asm/processor.h>
+#include <xen/init.h>
+#include <asm/pqos.h>
+
+static bool_t __initdata pqos_enabled = 1;
+boolean_param("pqos", pqos_enabled);
+
+static unsigned int cqm_rmid_count = 256;
+integer_param("cqm_rmid_count", cqm_rmid_count);
+
+unsigned int cqm_upscaling_factor = 0;
+bool_t cqm_enabled = 0;
+domid_t *cqm_rmid_array = NULL;
+
+static void __init init_cqm(void)
+{
+    unsigned int rmid;
+    unsigned int eax, edx;
+    unsigned int max_cqm_rmid;
+
+    cpuid_count(0xf, 1, &eax, &cqm_upscaling_factor, &max_cqm_rmid, &edx);
+    if ( !(edx & QOS_MONITOR_EVTID_L3) )
+        return;
+
+    cqm_rmid_count = min(cqm_rmid_count, max_cqm_rmid + 1);
+
+    cqm_rmid_array = xzalloc_array(domid_t, cqm_rmid_count);
+    if ( !cqm_rmid_array )
+    {
+        cqm_rmid_count = 0;
+        return;
+    }
+
+    /* Reserve RMID 0 for all domains not being monitored */
+    cqm_rmid_array[0] = DOMID_XEN;
+
+    for ( rmid = 1; rmid < cqm_rmid_count; rmid++ )
+        cqm_rmid_array[rmid] = DOMID_INVALID;
+
+    cqm_enabled = 1;
+
+    printk(XENLOG_INFO "Cache QoS Monitoring Enabled.\n");
+}
+
+static void __init init_qos_monitor(void)
+{
+    unsigned int qm_features;
+    unsigned int eax, ebx, ecx;
+
+    if ( !(boot_cpu_has(X86_FEATURE_QOSM)) )
+        return;
+
+    cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features);
+
+    if ( qm_features & QOS_MONITOR_TYPE_L3 )
+        init_cqm();
+}
+
+void __init init_platform_qos(void)
+{
+    if ( !pqos_enabled )
+        return;
+
+    init_qos_monitor();
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 5bf4ee0..95418e4 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -48,6 +48,7 @@
 #include <asm/setup.h>
 #include <xen/cpu.h>
 #include <asm/nmi.h>
+#include <asm/pqos.h>
 
 /* opt_nosmp: If true, secondary processors are ignored. */
 static bool_t __initdata opt_nosmp;
@@ -1402,6 +1403,8 @@ void __init __start_xen(unsigned long mbi_p)
 
     domain_unpause_by_systemcontroller(dom0);
 
+    init_platform_qos();
+
     reset_stack_and_jump(init_done);
 }
 
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 1cfaf94..ca59668 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -147,6 +147,7 @@
 #define X86_FEATURE_ERMS        (7*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID     (7*32+10) /* Invalidate Process Context ID */
 #define X86_FEATURE_RTM         (7*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_QOSM        (7*32+12) /* Platform QoS monitoring capability */
 #define X86_FEATURE_NO_FPU_SEL  (7*32+13) /* FPU CS/DS stored as zero */
 #define X86_FEATURE_SMAP        (7*32+20) /* Supervisor Mode Access Prevention */
 
diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
new file mode 100644
index 0000000..e8caca2
--- /dev/null
+++ b/xen/include/asm-x86/pqos.h
@@ -0,0 +1,32 @@
+/*
+ * pqos.h: Platform QoS related service for guest.
+ *
+ * Copyright (c) 2013, Intel Corporation
+ * Author: Jiongxi Li <jiongxi.li@intel.com>
+ * Author: Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+#ifndef ASM_PQOS_H
+#define ASM_PQOS_H
+
+/* QoS Resource Type Enumeration */
+#define QOS_MONITOR_TYPE_L3 0x2
+
+/* QoS Monitoring Event ID */
+#define QOS_MONITOR_EVTID_L3 0x1
+
+void init_platform_qos(void);
+
+#endif
--
1.7.9.5
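
The detection sequence in this patch can be sketched outside the hypervisor as
a pure function over raw CPUID register values. This is only an illustration
under stated assumptions: struct cqm_cpuid and cqm_usable_rmids() are
hypothetical names, and the caller is assumed to have read the registers
already (leaf 7 EBX, leaf 0xF subleaf 0 EDX, leaf 0xF subleaf 1 ECX/EDX). The
bit positions and masks are the ones used in the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions and masks taken from the patch. */
#define QOSM_FEATURE_BIT      12u   /* CPUID.(EAX=7,ECX=0):EBX */
#define QOS_MONITOR_TYPE_L3   0x2u  /* CPUID.(EAX=0xF,ECX=0):EDX */
#define QOS_MONITOR_EVTID_L3  0x1u  /* CPUID.(EAX=0xF,ECX=1):EDX */

/* Hypothetical container for the raw register values. */
struct cqm_cpuid {
    uint32_t leaf7_ebx;
    uint32_t leaf_f0_edx;
    uint32_t leaf_f1_ecx;   /* highest RMID usable for L3 monitoring */
    uint32_t leaf_f1_edx;
};

/*
 * Return the number of RMIDs that would be tracked (including the
 * reserved RMID 0), or 0 if L3 occupancy monitoring is not available.
 * 'requested' plays the role of the cqm_rmid_count boot parameter.
 */
unsigned int cqm_usable_rmids(const struct cqm_cpuid *id,
                              unsigned int requested)
{
    uint64_t hw_count;

    assert(id != NULL);

    if (!(id->leaf7_ebx & (1u << QOSM_FEATURE_BIT)))
        return 0;
    if (!(id->leaf_f0_edx & QOS_MONITOR_TYPE_L3))
        return 0;
    if (!(id->leaf_f1_edx & QOS_MONITOR_EVTID_L3))
        return 0;

    /* Same clamp as min(cqm_rmid_count, max_cqm_rmid + 1) in init_cqm(). */
    hw_count = (uint64_t)id->leaf_f1_ecx + 1;
    return requested < hw_count ? requested : (unsigned int)hw_count;
}
```

For example, a maximum RMID of 55 with the default request of 256 gives 56
tracked RMIDs.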
dongxiao.xu@intel.com
2013-Nov-29 05:48 UTC
[PATCH v3 2/7] x86: handle CQM resource when creating/destroying guests
From: Dongxiao Xu <dongxiao.xu@intel.com>

Allocate an RMID for a guest when it is created. This per-guest
RMID will be used to monitor Cache QoS related data. The RMID will
be relinquished when the guest is destroyed.

Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 xen/arch/x86/domain.c        |    8 ++++++
 xen/arch/x86/pqos.c          |   55 ++++++++++++++++++++++++++++++++++++++++++
 xen/common/domctl.c          |    5 +++-
 xen/include/asm-x86/domain.h |    2 ++
 xen/include/asm-x86/pqos.h   |    5 ++++
 xen/include/public/domctl.h  |    3 +++
 xen/include/xen/sched.h      |    3 +++
 7 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a3868f9..41e1fc6 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -60,6 +60,7 @@
 #include <xen/numa.h>
 #include <xen/iommu.h>
 #include <compat/vcpu.h>
+#include <asm/pqos.h>
 
 DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
 DEFINE_PER_CPU(unsigned long, cr4);
@@ -579,6 +580,11 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
     tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0);
     spin_lock_init(&d->arch.vtsc_lock);
 
+    /* Allocate CQM RMID for guest */
+    d->arch.pqos_cqm_rmid = 0;
+    if ( system_supports_cqm() && (domcr_flags & DOMCRF_pqos_cqm) )
+        alloc_cqm_rmid(d);
+
     return 0;
 
  fail:
@@ -612,6 +618,8 @@ void arch_domain_destroy(struct domain *d)
 
     free_xenheap_page(d->shared_info);
     cleanup_domain_irq_mapping(d);
+
+    free_cqm_rmid(d);
 }
 
 unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long guest_cr4)
diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
index e2172c4..1148f3b 100644
--- a/xen/arch/x86/pqos.c
+++ b/xen/arch/x86/pqos.c
@@ -20,6 +20,7 @@
  */
 #include <asm/processor.h>
 #include <xen/init.h>
+#include <xen/spinlock.h>
 #include <asm/pqos.h>
 
 static bool_t __initdata pqos_enabled = 1;
@@ -31,6 +32,7 @@ integer_param("cqm_rmid_count", cqm_rmid_count);
 unsigned int cqm_upscaling_factor = 0;
 bool_t cqm_enabled = 0;
 domid_t *cqm_rmid_array = NULL;
+static DEFINE_SPINLOCK(cqm_lock);
 
 static void __init init_cqm(void)
 {
@@ -84,6 +86,59 @@ void __init init_platform_qos(void)
     init_qos_monitor();
 }
 
+bool_t system_supports_cqm(void)
+{
+    return cqm_enabled;
+}
+
+int alloc_cqm_rmid(struct domain *d)
+{
+    int rc = 0;
+    unsigned int rmid;
+    unsigned long flags;
+
+    ASSERT(system_supports_cqm());
+
+    spin_lock_irqsave(&cqm_lock, flags);
+    /* RMID=0 is reserved, enumerate from 1 */
+    for ( rmid = 1; rmid < cqm_rmid_count; rmid++ )
+    {
+        if ( cqm_rmid_array[rmid] != DOMID_INVALID)
+            continue;
+
+        cqm_rmid_array[rmid] = d->domain_id;
+        break;
+    }
+    spin_unlock_irqrestore(&cqm_lock, flags);
+
+    /* No CQM RMID available, assign RMID=0 by default */
+    if ( rmid == cqm_rmid_count )
+    {
+        rmid = 0;
+        rc = -1;
+    }
+
+    d->arch.pqos_cqm_rmid = rmid;
+
+    return rc;
+}
+
+void free_cqm_rmid(struct domain *d)
+{
+    unsigned int rmid = d->arch.pqos_cqm_rmid;
+    unsigned long flags;
+
+    /* We do not free system reserved "RMID=0" */
+    if ( rmid == 0 )
+        return;
+
+    spin_lock_irqsave(&cqm_lock, flags);
+    cqm_rmid_array[rmid] = DOMID_INVALID;
+    spin_unlock_irqrestore(&cqm_lock, flags);
+
+    d->arch.pqos_cqm_rmid = 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 904d27b..1c2e320 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -425,7 +425,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
                | XEN_DOMCTL_CDF_pvh_guest
                | XEN_DOMCTL_CDF_hap
                | XEN_DOMCTL_CDF_s3_integrity
-               | XEN_DOMCTL_CDF_oos_off)) )
+               | XEN_DOMCTL_CDF_oos_off
+               | XEN_DOMCTL_CDF_pqos_cqm)) )
             break;
 
         dom = op->domain;
@@ -467,6 +468,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             domcr_flags |= DOMCRF_s3_integrity;
         if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_oos_off )
             domcr_flags |= DOMCRF_oos_off;
+        if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_pqos_cqm )
+            domcr_flags |= DOMCRF_pqos_cqm;
 
         d = domain_create(dom, domcr_flags, op->u.createdomain.ssidref);
         if ( IS_ERR(d) )
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 9d39061..9487251 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -313,6 +313,8 @@ struct arch_domain
     spinlock_t e820_lock;
     struct e820entry *e820;
     unsigned int nr_e820;
+
+    unsigned int pqos_cqm_rmid;       /* CQM RMID assigned to the domain */
 } __cacheline_aligned;
 
 #define has_arch_pdevs(d)    (!list_empty(&(d)->arch.pdev_list))
diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
index e8caca2..c54905b 100644
--- a/xen/include/asm-x86/pqos.h
+++ b/xen/include/asm-x86/pqos.h
@@ -20,6 +20,7 @@
  */
 #ifndef ASM_PQOS_H
 #define ASM_PQOS_H
+#include <xen/sched.h>
 
 /* QoS Resource Type Enumeration */
 #define QOS_MONITOR_TYPE_L3 0x2
@@ -29,4 +30,8 @@
 
 void init_platform_qos(void);
 
+bool_t system_supports_cqm(void);
+int alloc_cqm_rmid(struct domain *d);
+void free_cqm_rmid(struct domain *d);
+
 #endif
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 01a3652..47a850a 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -62,6 +62,9 @@ struct xen_domctl_createdomain {
  /* Is this a PVH guest (as opposed to an HVM or PV guest)? */
 #define _XEN_DOMCTL_CDF_pvh_guest     4
 #define XEN_DOMCTL_CDF_pvh_guest      (1U<<_XEN_DOMCTL_CDF_pvh_guest)
+ /* Enable pqos-cqm? */
+#define _XEN_DOMCTL_CDF_pqos_cqm      5
+#define XEN_DOMCTL_CDF_pqos_cqm       (1U<<_XEN_DOMCTL_CDF_pqos_cqm)
     uint32_t flags;
 };
 typedef struct xen_domctl_createdomain xen_domctl_createdomain_t;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index cbdf377..3a42656 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -507,6 +507,9 @@ struct domain *domain_create(
  /* DOMCRF_pvh: Create PV domain in HVM container. */
 #define _DOMCRF_pvh             5
 #define DOMCRF_pvh              (1U<<_DOMCRF_pvh)
+ /* DOMCRF_pqos_cqm: Create a domain with CQM support */
+#define _DOMCRF_pqos_cqm        6
+#define DOMCRF_pqos_cqm         (1U<<_DOMCRF_pqos_cqm)
 
 /*
  * rcu_lock_domain_by_id() is more efficient than get_domain_by_id().
--
1.7.9.5
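
The allocation policy in alloc_cqm_rmid()/free_cqm_rmid() (first free slot
starting from 1, falling back to the reserved RMID 0 when the table is full)
can be modelled in user space as follows. This is a sketch, not the
hypervisor code: struct rmid_table and the rmid_* helpers are hypothetical,
locking is omitted, and the two sentinel values merely stand in for
DOMID_INVALID and DOMID_XEN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for DOMID_INVALID / DOMID_XEN; the values are assumptions. */
#define RMID_FREE      0xFFFFu
#define RMID_RESERVED  0xFFFEu

/* Hypothetical model of cqm_rmid_array plus cqm_rmid_count. */
struct rmid_table {
    uint16_t *owner;        /* owner[rmid] = domain id or a sentinel */
    unsigned int count;
};

void rmid_table_init(struct rmid_table *t, uint16_t *storage,
                     unsigned int count)
{
    unsigned int rmid;

    assert(t != NULL && storage != NULL && count >= 1);

    t->owner = storage;
    t->count = count;
    t->owner[0] = RMID_RESERVED;     /* RMID 0: unmonitored domains */
    for (rmid = 1; rmid < count; rmid++)
        t->owner[rmid] = RMID_FREE;
}

/* Return the RMID assigned to domid, or 0 if none is available. */
unsigned int rmid_alloc(struct rmid_table *t, uint16_t domid)
{
    unsigned int rmid;

    for (rmid = 1; rmid < t->count; rmid++) {
        if (t->owner[rmid] != RMID_FREE)
            continue;
        t->owner[rmid] = domid;
        return rmid;
    }
    return 0;
}

/* Release an RMID; the reserved RMID 0 is never freed. */
void rmid_free(struct rmid_table *t, unsigned int rmid)
{
    if (rmid == 0 || rmid >= t->count)
        return;
    t->owner[rmid] = RMID_FREE;
}
```

A freed RMID becomes the first candidate for the next allocation, which is why
the lowest free index is reused.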
dongxiao.xu@intel.com
2013-Nov-29 05:48 UTC
[PATCH v3 3/7] x86: dynamically attach/detach CQM service for a guest
From: Dongxiao Xu <dongxiao.xu@intel.com>

Add hypervisor side support for dynamically attaching and detaching
the CQM service for a certain guest.

Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 xen/arch/x86/domctl.c       |   40 ++++++++++++++++++++++++++++++++++++++++
 xen/include/public/domctl.h |   14 ++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index f7e4586..7007990 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -35,6 +35,7 @@
 #include <asm/mem_sharing.h>
 #include <asm/xstate.h>
 #include <asm/debugger.h>
+#include <asm/pqos.h>
 
 static int gdbsx_guest_mem_io(
     domid_t domid, struct xen_domctl_gdbsx_memio *iop)
@@ -1223,6 +1224,45 @@ long arch_do_domctl(
     }
     break;
 
+    case XEN_DOMCTL_attach_pqos:
+    {
+        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
+        {
+            if ( !system_supports_cqm() )
+                ret = -ENODEV;
+            else if ( d->arch.pqos_cqm_rmid > 0 )
+                ret = -EEXIST;
+            else
+            {
+                ret = alloc_cqm_rmid(d);
+                if ( ret < 0 )
+                    ret = -EUSERS;
+            }
+        }
+        else
+            ret = -EINVAL;
+    }
+    break;
+
+    case XEN_DOMCTL_detach_pqos:
+    {
+        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
+        {
+            if ( !system_supports_cqm() )
+                ret = -ENODEV;
+            else if ( d->arch.pqos_cqm_rmid > 0 )
+            {
+                free_cqm_rmid(d);
+                ret = 0;
+            }
+            else
+                ret = -ENOENT;
+        }
+        else
+            ret = -EINVAL;
+    }
+    break;
+
     default:
         ret = iommu_do_domctl(domctl, d, u_domctl);
         break;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 47a850a..800b2f4 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -872,6 +872,17 @@ struct xen_domctl_set_max_evtchn {
 typedef struct xen_domctl_set_max_evtchn xen_domctl_set_max_evtchn_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_max_evtchn_t);
 
+/* XEN_DOMCTL_attach_pqos */
+/* XEN_DOMCTL_detach_pqos */
+struct xen_domctl_qos_type {
+    /* Attach or detach flag for cqm */
+#define _XEN_DOMCTL_pqos_cqm      0
+#define XEN_DOMCTL_pqos_cqm       (1U<<_XEN_DOMCTL_pqos_cqm)
+    uint32_t flags;
+};
+typedef struct xen_domctl_qos_type xen_domctl_qos_type_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_qos_type_t);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -941,6 +952,8 @@ struct xen_domctl {
 #define XEN_DOMCTL_setnodeaffinity               68
 #define XEN_DOMCTL_getnodeaffinity               69
 #define XEN_DOMCTL_set_max_evtchn                70
+#define XEN_DOMCTL_attach_pqos                   71
+#define XEN_DOMCTL_detach_pqos                   72
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1001,6 +1014,7 @@ struct xen_domctl {
         struct xen_domctl_set_broken_page_p2m set_broken_page_p2m;
         struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
         struct xen_domctl_gdbsx_domstatus   gdbsx_domstatus;
+        struct xen_domctl_qos_type          qos_type;
         uint8_t                             pad[128];
     } u;
 };
--
1.7.9.5
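
The error codes returned by the two new domctls form a small decision table,
which the following stand-alone model tries to capture. It is a sketch under
assumptions: struct cqm_pool, struct cqm_dom, cqm_attach() and cqm_detach()
are hypothetical names, the RMID pool is reduced to a 32-bit bitmap, and the
EUSERS fallback value is only there for platforms whose errno.h lacks it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUSERS
#define EUSERS 87            /* fallback value; an assumption */
#endif

#define PQOS_FLAG_CQM 0x1u   /* mirrors XEN_DOMCTL_pqos_cqm (1U << 0) */

/* Hypothetical model of the global CQM state. */
struct cqm_pool {
    bool supported;
    unsigned int rmid_count;  /* at most 32 in this model */
    uint32_t used;            /* bit n set => RMID n in use; bit 0 reserved */
};

/* Hypothetical model of d->arch.pqos_cqm_rmid. */
struct cqm_dom {
    unsigned int rmid;        /* 0 means "not monitored" */
};

static unsigned int pool_alloc(struct cqm_pool *p)
{
    unsigned int rmid;

    for (rmid = 1; rmid < p->rmid_count && rmid < 32; rmid++) {
        if (!(p->used & (1u << rmid))) {
            p->used |= 1u << rmid;
            return rmid;
        }
    }
    return 0;
}

int cqm_attach(struct cqm_pool *p, struct cqm_dom *d, uint32_t flags)
{
    assert(p != NULL && d != NULL);

    if (!(flags & PQOS_FLAG_CQM))
        return -EINVAL;       /* unknown QoS type requested */
    if (!p->supported)
        return -ENODEV;       /* no CQM on this system */
    if (d->rmid > 0)
        return -EEXIST;       /* already attached */

    d->rmid = pool_alloc(p);
    return d->rmid ? 0 : -EUSERS;   /* pool exhausted */
}

int cqm_detach(struct cqm_pool *p, struct cqm_dom *d, uint32_t flags)
{
    assert(p != NULL && d != NULL);

    if (!(flags & PQOS_FLAG_CQM))
        return -EINVAL;
    if (!p->supported)
        return -ENODEV;
    if (d->rmid == 0)
        return -ENOENT;       /* nothing attached */

    p->used &= ~(1u << d->rmid);
    d->rmid = 0;
    return 0;
}
```

Note that the flag check comes first in both paths, so an unsupported QoS type
yields -EINVAL even on hardware without CQM.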
dongxiao.xu@intel.com
2013-Nov-29 05:48 UTC
[PATCH v3 4/7] x86: collect CQM information from all sockets
From: Dongxiao Xu <dongxiao.xu@intel.com>

Collect CQM information (L3 cache occupancy) from all sockets.
Upper-layer applications can parse the data structure to get a
guest's L3 cache occupancy on particular sockets.

Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 xen/arch/x86/pqos.c             |   59 ++++++++++++++++++++++++++
 xen/arch/x86/sysctl.c           |   89 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/msr-index.h |    4 ++
 xen/include/asm-x86/pqos.h      |    8 ++++
 xen/include/public/domctl.h     |    9 ++++
 xen/include/public/sysctl.h     |   11 +++++
 6 files changed, 180 insertions(+)

diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
index 1148f3b..615c5ea 100644
--- a/xen/arch/x86/pqos.c
+++ b/xen/arch/x86/pqos.c
@@ -19,6 +19,7 @@
  * Place - Suite 330, Boston, MA 02111-1307 USA.
  */
 #include <asm/processor.h>
+#include <asm/msr.h>
 #include <xen/init.h>
 #include <xen/spinlock.h>
 #include <asm/pqos.h>
@@ -91,6 +92,26 @@ bool_t system_supports_cqm(void)
     return cqm_enabled;
 }
 
+unsigned int get_cqm_count(void)
+{
+    return cqm_rmid_count;
+}
+
+unsigned int get_cqm_avail(void)
+{
+    unsigned int rmid, cqm_avail = 0;
+    unsigned long flags;
+
+    spin_lock_irqsave(&cqm_lock, flags);
+    /* RMID=0 is reserved, enumerate from 1 */
+    for ( rmid = 1; rmid < cqm_rmid_count; rmid++ )
+        if ( cqm_rmid_array[rmid] == DOMID_INVALID )
+            cqm_avail++;
+    spin_unlock_irqrestore(&cqm_lock, flags);
+
+    return cqm_avail;
+}
+
 int alloc_cqm_rmid(struct domain *d)
 {
     int rc = 0;
@@ -139,6 +160,44 @@ void free_cqm_rmid(struct domain *d)
     d->arch.pqos_cqm_rmid = 0;
 }
 
+static void read_cqm_data(void *arg)
+{
+    uint64_t cqm_data;
+    unsigned int rmid;
+    int socket = cpu_to_socket(smp_processor_id());
+    struct xen_socket_cqmdata *data = arg;
+    unsigned long flags, i;
+
+    if ( socket < 0 )
+        return;
+
+    spin_lock_irqsave(&cqm_lock, flags);
+    /* RMID=0 is reserved, enumerate from 1 */
+    for ( rmid = 1; rmid < cqm_rmid_count; rmid++ )
+    {
+        if ( cqm_rmid_array[rmid] == DOMID_INVALID )
+            continue;
+
+        wrmsr(MSR_IA32_QOSEVTSEL, QOS_MONITOR_EVTID_L3, rmid);
+        rdmsrl(MSR_IA32_QMC, cqm_data);
+
+        i = socket * cqm_rmid_count + rmid;
+        data[i].valid = !(cqm_data & IA32_QM_CTR_ERROR_MASK);
+        if ( data[i].valid )
+        {
+            data[i].l3c_occupancy = cqm_data * cqm_upscaling_factor;
+            data[i].socket = socket;
+            data[i].domid = cqm_rmid_array[rmid];
+        }
+    }
+    spin_unlock_irqrestore(&cqm_lock, flags);
+}
+
+void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data)
+{
+    on_selected_cpus(cpu_cqmdata_map, read_cqm_data, data, 1);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 15d4b91..f916fe6 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -28,6 +28,7 @@
 #include <xen/nodemask.h>
 #include <xen/cpu.h>
 #include <xsm/xsm.h>
+#include <asm/pqos.h>
 
 #define get_xen_guest_handle(val, hnd)  do { val = (hnd).p; } while (0)
 
@@ -66,6 +67,47 @@ void arch_do_physinfo(xen_sysctl_physinfo_t *pi)
         pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm_directio;
 }
 
+/* Select one random CPU for each socket */
+static void select_socket_cpu(cpumask_t *cpu_bitmap)
+{
+    int i;
+    unsigned int cpu;
+    cpumask_t *socket_cpuset;
+    int max_socket = 0;
+    unsigned int num_cpus = num_online_cpus();
+    DECLARE_BITMAP(sockets, num_cpus);
+
+    cpumask_clear(cpu_bitmap);
+
+    for_each_online_cpu(cpu)
+    {
+        i = cpu_to_socket(cpu);
+        if ( i < 0 || test_and_set_bit(i, sockets) )
+            continue;
+        max_socket = max(max_socket, i);
+    }
+
+    socket_cpuset = xzalloc_array(cpumask_t, max_socket + 1);
+    if ( !socket_cpuset )
+        return;
+
+    for_each_online_cpu(cpu)
+    {
+        i = cpu_to_socket(cpu);
+        if ( i < 0 )
+            continue;
+        cpumask_set_cpu(cpu, &socket_cpuset[i]);
+    }
+
+    for ( i = 0; i <= max_socket; i++ )
+    {
+        cpu = cpumask_any(&socket_cpuset[i]);
+        cpumask_set_cpu(cpu, cpu_bitmap);
+    }
+
+    xfree(socket_cpuset);
+}
+
 long arch_do_sysctl(
     struct xen_sysctl *sysctl, XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
 {
@@ -101,6 +143,53 @@ long arch_do_sysctl(
     }
     break;
 
+    case XEN_SYSCTL_getcqminfo:
+    {
+        struct xen_socket_cqmdata *info;
+        uint32_t num_sockets;
+        uint32_t num_rmid;
+        cpumask_t cpu_cqmdata_map;
+
+        if ( !system_supports_cqm() )
+        {
+            ret = -ENODEV;
+            break;
+        }
+
+        select_socket_cpu(&cpu_cqmdata_map);
+
+        num_sockets = min((unsigned int)cpumask_weight(&cpu_cqmdata_map),
+                          sysctl->u.getcqminfo.num_sockets);
+        num_rmid = get_cqm_count();
+        info = xzalloc_array(struct xen_socket_cqmdata,
+                             num_rmid * num_sockets);
+        if ( !info )
+        {
+            ret = -ENOMEM;
+            break;
+        }
+
+        get_cqm_info(&cpu_cqmdata_map, info);
+
+        if ( copy_to_guest_offset(sysctl->u.getcqminfo.buffer,
+                                  0, info, num_rmid * num_sockets) )
+        {
+            ret = -EFAULT;
+            xfree(info);
+            break;
+        }
+
+        sysctl->u.getcqminfo.num_rmid = num_rmid;
+        sysctl->u.getcqminfo.num_rmid_avail = get_cqm_avail();
+        sysctl->u.getcqminfo.num_sockets = num_sockets;
+
+        if ( copy_to_guest(u_sysctl, sysctl, 1) )
+            ret = -EFAULT;
+
+        xfree(info);
+    }
+    break;
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index e597a28..46ef165 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -488,4 +488,8 @@
 /* Geode defined MSRs */
 #define MSR_GEODE_BUSCONT_CONF0         0x00001900
 
+/* Platform QoS register */
+#define MSR_IA32_QOSEVTSEL              0x00000c8d
+#define MSR_IA32_QMC                    0x00000c8e
+
 #endif /* __ASM_MSR_INDEX_H */
diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
index c54905b..2ab9277 100644
--- a/xen/include/asm-x86/pqos.h
+++ b/xen/include/asm-x86/pqos.h
@@ -21,6 +21,8 @@
 #ifndef ASM_PQOS_H
 #define ASM_PQOS_H
 #include <xen/sched.h>
+#include <xen/cpumask.h>
+#include <public/domctl.h>
 
 /* QoS Resource Type Enumeration */
 #define QOS_MONITOR_TYPE_L3 0x2
@@ -28,10 +30,16 @@
 /* QoS Monitoring Event ID */
 #define QOS_MONITOR_EVTID_L3 0x1
 
+/* IA32_QM_CTR */
+#define IA32_QM_CTR_ERROR_MASK         (0x3ul << 62)
+
 void init_platform_qos(void);
 
 bool_t system_supports_cqm(void);
 int alloc_cqm_rmid(struct domain *d);
 void free_cqm_rmid(struct domain *d);
+unsigned int get_cqm_count(void);
+unsigned int get_cqm_avail(void);
+void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data);
 
 #endif
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 800b2f4..53c740e 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -883,6 +883,15 @@ struct xen_domctl_qos_type {
 typedef struct xen_domctl_qos_type xen_domctl_qos_type_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_qos_type_t);
 
+struct xen_socket_cqmdata {
+    uint64_t l3c_occupancy;
+    uint32_t socket;
+    domid_t  domid;
+    uint8_t  valid;
+};
+typedef struct xen_socket_cqmdata xen_socket_cqmdata_t;
+DEFINE_XEN_GUEST_HANDLE(xen_socket_cqmdata_t);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 8437d31..85eee16 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -632,6 +632,15 @@ struct xen_sysctl_coverage_op {
 typedef struct xen_sysctl_coverage_op xen_sysctl_coverage_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
 
+/* XEN_SYSCTL_getcqminfo */
+struct xen_sysctl_getcqminfo {
+    XEN_GUEST_HANDLE_64(xen_socket_cqmdata_t) buffer; /* OUT */
+    uint32_t num_sockets;    /* IN/OUT */
+    uint32_t num_rmid;       /* OUT */
+    uint32_t num_rmid_avail; /* OUT */
+};
+typedef struct xen_sysctl_getcqminfo xen_sysctl_getcqminfo_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_getcqminfo_t);
 
 struct xen_sysctl {
     uint32_t cmd;
@@ -654,6 +663,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_cpupool_op                    18
 #define XEN_SYSCTL_scheduler_op                  19
 #define XEN_SYSCTL_coverage_op                   20
+#define XEN_SYSCTL_getcqminfo                    21
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
     union {
         struct xen_sysctl_readconsole       readconsole;
@@ -675,6 +685,7 @@ struct xen_sysctl {
         struct xen_sysctl_cpupool_op        cpupool_op;
         struct xen_sysctl_scheduler_op      scheduler_op;
         struct xen_sysctl_coverage_op       coverage_op;
+        struct xen_sysctl_getcqminfo        getcqminfo;
         uint8_t                             pad[128];
     } u;
 };
--
1.7.9.5
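
Two details of read_cqm_data() are easy to miss: the result buffer is a flat
array indexed by socket * rmid_count + rmid, and a raw counter value is only
converted to an occupancy when neither of its top two bits is set. The
user-space sketch below illustrates both. struct cqm_entry mirrors the field
names of xen_socket_cqmdata, while cqm_index(), cqm_store() and the numeric
inputs in the usage note are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same mask as IA32_QM_CTR_ERROR_MASK in the patch (bits 63 and 62). */
#define QM_CTR_ERROR_MASK (0x3ULL << 62)

/* Field names follow struct xen_socket_cqmdata. */
struct cqm_entry {
    uint64_t l3c_occupancy;
    uint32_t socket;
    uint16_t domid;
    uint8_t  valid;
};

/* Position of (socket, rmid) in the flat result buffer. */
size_t cqm_index(unsigned int socket, unsigned int rmid,
                 unsigned int rmid_count)
{
    assert(rmid < rmid_count);
    return (size_t)socket * rmid_count + rmid;
}

/*
 * Record one raw counter reading. The entry is filled in only when the
 * error bits are clear; otherwise it is just marked invalid.
 */
void cqm_store(struct cqm_entry *buf, unsigned int rmid_count,
               unsigned int socket, unsigned int rmid, uint16_t domid,
               uint64_t raw, unsigned int upscaling_factor)
{
    struct cqm_entry *e = &buf[cqm_index(socket, rmid, rmid_count)];

    e->valid = !(raw & QM_CTR_ERROR_MASK);
    if (e->valid) {
        e->l3c_occupancy = raw * upscaling_factor;
        e->socket = socket;
        e->domid = domid;
    }
}
```

For instance, with four RMIDs per socket, the entry for socket 1 and RMID 2
lands at index 6, and a raw reading of 392 with an upscaling factor of 8192
is stored as 3211264.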
dongxiao.xu@intel.com
2013-Nov-29 05:48 UTC
[PATCH v3 5/7] x86: enable CQM monitoring for each domain RMID
From: Dongxiao Xu <dongxiao.xu@intel.com>

If the CQM service is attached to a domain, its related RMID will be set
to hardware for monitoring when the domain's vcpu is scheduled in. When
the domain's vcpu is scheduled out, RMID 0 (system reserved) will be set
for monitoring.

Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 xen/arch/x86/domain.c           |    5 +++++
 xen/arch/x86/pqos.c             |   10 ++++++++++
 xen/include/asm-x86/msr-index.h |    1 +
 xen/include/asm-x86/pqos.h      |    1 +
 4 files changed, 17 insertions(+)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 41e1fc6..628f7eb 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1371,6 +1371,8 @@ static void __context_switch(void)
     {
         memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
         vcpu_save_fpu(p);
+        if ( system_supports_cqm() )
+            cqm_assoc_rmid(0);
         p->arch.ctxt_switch_from(p);
     }
 
@@ -1395,6 +1397,9 @@ static void __context_switch(void)
         }
         vcpu_restore_fpu_eager(n);
         n->arch.ctxt_switch_to(n);
+
+        if ( system_supports_cqm() && n->domain->arch.pqos_cqm_rmid > 0 )
+            cqm_assoc_rmid(n->domain->arch.pqos_cqm_rmid);
     }
 
     gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
index 615c5ea..1faa650 100644
--- a/xen/arch/x86/pqos.c
+++ b/xen/arch/x86/pqos.c
@@ -29,6 +29,7 @@ boolean_param("pqos", pqos_enabled);
 
 static unsigned int cqm_rmid_count = 256;
 integer_param("cqm_rmid_count", cqm_rmid_count);
+static uint64_t rmid_mask;
 
 unsigned int cqm_upscaling_factor = 0;
 bool_t cqm_enabled = 0;
@@ -75,6 +76,8 @@ static void __init init_qos_monitor(void)
 
     cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features);
 
+    rmid_mask = ~(~0ull << get_count_order(ebx));
+
     if ( qm_features & QOS_MONITOR_TYPE_L3 )
         init_cqm();
 }
@@ -198,6 +201,13 @@ void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data)
     on_selected_cpus(cpu_cqmdata_map, read_cqm_data, data, 1);
 }
 
+void cqm_assoc_rmid(unsigned int rmid)
+{
+    uint64_t val;
+    rdmsrl(MSR_IA32_PQR_ASSOC, val);
+    wrmsrl(MSR_IA32_PQR_ASSOC, (val & ~(rmid_mask)) | (rmid & rmid_mask));
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 46ef165..45f4918 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -491,5 +491,6 @@
 /* Platform QoS register */
 #define MSR_IA32_QOSEVTSEL              0x00000c8d
 #define MSR_IA32_QMC                    0x00000c8e
+#define MSR_IA32_PQR_ASSOC              0x00000c8f
 
 #endif /* __ASM_MSR_INDEX_H */
diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
index 2ab9277..c75643a 100644
--- a/xen/include/asm-x86/pqos.h
+++ b/xen/include/asm-x86/pqos.h
@@ -41,5 +41,6 @@ void free_cqm_rmid(struct domain *d);
 unsigned int get_cqm_count(void);
 unsigned int get_cqm_avail(void);
 void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data);
+void cqm_assoc_rmid(unsigned int rmid);
 
 #endif
--
1.7.9.5
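
The read-modify-write in cqm_assoc_rmid() only touches the low RMID field of
the association register and leaves the remaining bits alone. A plain-C model
of the mask construction and the update is given below. The helper names
count_order(), make_rmid_mask() and pqr_assoc_update() are hypothetical, and
count_order() is only assumed to match get_count_order() for non-zero input
(smallest order such that 1 << order is at least the argument).

```c
#include <assert.h>
#include <stdint.h>

/* Smallest order such that (1 << order) >= count; stand-in helper. */
static unsigned int count_order(uint32_t count)
{
    unsigned int order = 0;

    while ((1ULL << order) < count)
        order++;
    return order;
}

/* Mirror of: rmid_mask = ~(~0ull << get_count_order(ebx)) */
uint64_t make_rmid_mask(uint32_t max_rmid_field)
{
    return ~(~0ULL << count_order(max_rmid_field));
}

/* New register value: keep bits outside the mask, replace the RMID. */
uint64_t pqr_assoc_update(uint64_t old_val, uint64_t rmid_mask,
                          unsigned int rmid)
{
    assert((rmid & ~rmid_mask) == 0);   /* RMID must fit in the field */
    return (old_val & ~rmid_mask) | (rmid & rmid_mask);
}
```

With a CPUID-reported value of 255 the mask is 0xff, so switching from RMID 5
to RMID 3 changes 0x0000000100000005 into 0x0000000100000003 and leaves the
upper bits unchanged; writing RMID 0 on schedule-out clears only the low
field.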
dongxiao.xu@intel.com
2013-Nov-29 05:48 UTC
[PATCH v3 6/7] xsm: add platform QoS related xsm policies
From: Dongxiao Xu <dongxiao.xu@intel.com>

Add xsm policies for attach/detach pqos services and get CQM info
hypercalls.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
 tools/flask/policy/policy/modules/xen/xen.if |    2 +-
 tools/flask/policy/policy/modules/xen/xen.te |    5 ++++-
 xen/xsm/flask/hooks.c                        |    7 +++++++
 xen/xsm/flask/policy/access_vectors          |   17 ++++++++++++++---
 4 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index dedc035..1f683af 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -49,7 +49,7 @@ define(`create_domain_common', `
                         getdomaininfo hypercall setvcpucontext setextvcpucontext
                         getscheduler getvcpuinfo getvcpuextstate getaddrsize
                         getaffinity setaffinity };
-        allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim set_max_evtchn };
+        allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim set_max_evtchn pqos_op };
         allow $1 $2:security check_context;
         allow $1 $2:shadow enable;
         allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op };
diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index bb59fe8..115fcfe 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -64,6 +64,9 @@ allow dom0_t xen_t:xen {
         getidle debug getcpuinfo heap pm_op mca_op lockprof cpupool_op tmem_op
         tmem_control getscheduler setscheduler
 };
+allow dom0_t xen_t:xen2 {
+        pqos_op
+};
 allow dom0_t xen_t:mmu memorymap;
 
 # Allow dom0 to use these domctls on itself. For domctls acting on other
@@ -76,7 +79,7 @@ allow dom0_t dom0_t:domain {
         getpodtarget setpodtarget set_misc_info set_virq_handler
 };
 allow dom0_t dom0_t:domain2 {
-        set_cpuid gettsc settsc setscheduler set_max_evtchn
+        set_cpuid gettsc settsc setscheduler set_max_evtchn pqos_op
 };
 allow dom0_t dom0_t:resource { add remove };
 
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index b1e2593..884922b 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -730,6 +730,10 @@ static int flask_domctl(struct domain *d, int cmd)
     case XEN_DOMCTL_set_max_evtchn:
         return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__SET_MAX_EVTCHN);
 
+    case XEN_DOMCTL_attach_pqos:
+    case XEN_DOMCTL_detach_pqos:
+        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PQOS_OP);
+
     default:
         printk("flask_domctl: Unknown op %d\n", cmd);
         return -EPERM;
@@ -785,6 +789,9 @@ static int flask_sysctl(int cmd)
     case XEN_SYSCTL_numainfo:
         return domain_has_xen(current->domain, XEN__PHYSINFO);
 
+    case XEN_SYSCTL_getcqminfo:
+        return domain_has_xen(current->domain, XEN2__PQOS_OP);
+
     default:
         printk("flask_sysctl: Unknown op %d\n", cmd);
         return -EPERM;
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 1fbe241..91af8b2 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -3,9 +3,9 @@
 #
 # class class_name { permission_name ... }
 
-# Class xen consists of dom0-only operations dealing with the hypervisor itself.
-# Unless otherwise specified, the source is the domain executing the hypercall,
-# and the target is the xen initial sid (type xen_t).
+# Class xen and xen2 consists of dom0-only operations dealing with the
+# hypervisor itself. Unless otherwise specified, the source is the domain
+# executing the hypercall, and the target is the xen initial sid (type xen_t).
 class xen
 {
 # XENPF_settime
@@ -75,6 +75,14 @@ class xen
     setscheduler
 }
 
+# This is a continuation of class xen, since only 32 permissions can be
+# defined per class
+class xen2
+{
+# XEN_SYSCTL_getcqminfo
+    pqos_op
+}
+
 # Classes domain and domain2 consist of operations that a domain performs on
 # another domain or on itself. Unless otherwise specified, the source is the
 # domain executing the hypercall, and the target is the domain being operated on
@@ -196,6 +204,9 @@ class domain2
     setclaim
 # XEN_DOMCTL_set_max_evtchn
     set_max_evtchn
+# XEN_DOMCTL_attach_pqos
+# XEN_DOMCTL_detach_pqos
+    pqos_op
 }
 
 # Similar to class domain, but primarily contains domctls related to HVM domains
--
1.7.9.5
dongxiao.xu@intel.com
2013-Nov-29 05:48 UTC
[PATCH v3 7/7] tools: enable Cache QoS Monitoring feature for libxl/libxc
From: Dongxiao Xu <dongxiao.xu@intel.com>

Introduce a new config parameter "pqos_cqm". If it is set to 1, the guest will be created with the CQM feature enabled.

Introduce two new xl commands to attach/detach the CQM service for a guest:
$ xl pqos-attach cqm domid
$ xl pqos-detach cqm domid

Introduce one new xl command to retrieve guest CQM information:
$ xl pqos-list cqm (domid)

Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
tools/libxc/xc_domain.c | 48 +++++++++++++++
tools/libxc/xenctrl.h | 12 ++++
tools/libxl/Makefile | 3 +-
tools/libxl/libxl.h | 5 ++
tools/libxl/libxl_create.c | 3 +
tools/libxl/libxl_pqos.c | 108 +++++++++++++++++++++++++++++++++
tools/libxl/libxl_types.idl | 1 +
tools/libxl/xl.h | 3 +
tools/libxl/xl_cmdimpl.c | 138 +++++++++++++++++++++++++++++++++++++++++++
tools/libxl/xl_cmdtable.c | 15 +++++
10 files changed, 335 insertions(+), 1 deletion(-)
create mode 100644 tools/libxl/libxl_pqos.c

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c index 1ccafc5..6b71559 100644 --- a/tools/libxc/xc_domain.c +++ b/tools/libxc/xc_domain.c @@ -1776,6 +1776,54 @@ int xc_domain_set_max_evtchn(xc_interface *xch, uint32_t domid, return do_domctl(xch, &domctl); } +int xc_domain_pqos_attach(xc_interface *xch, uint32_t domid, uint32_t flags) +{ + DECLARE_DOMCTL; + domctl.cmd = XEN_DOMCTL_attach_pqos; + domctl.domain = (domid_t)domid; + domctl.u.qos_type.flags = flags; + return do_domctl(xch, &domctl); +} + +int xc_domain_pqos_detach(xc_interface *xch, uint32_t domid, uint32_t flags) +{ + DECLARE_DOMCTL; + domctl.cmd = XEN_DOMCTL_detach_pqos; + domctl.domain = (domid_t)domid; + domctl.u.qos_type.flags = flags; + return do_domctl(xch, &domctl); +} + +int xc_domain_getcqminfolist(xc_interface *xch, sysctl_cqminfo_t *info) +{ + int ret = 0; + xen_socket_cqmdata_t *data = info->cqmdata; + DECLARE_SYSCTL; + + DECLARE_HYPERCALL_BOUNCE(data, + info->num_rmid * info->num_sockets * sizeof(*data), +
XC_HYPERCALL_BUFFER_BOUNCE_OUT); + + if ( xc_hypercall_bounce_pre(xch, data) ) + return -1; + + sysctl.cmd = XEN_SYSCTL_getcqminfo; + set_xen_guest_handle(sysctl.u.getcqminfo.buffer, data); + + if ( xc_sysctl(xch, &sysctl) < 0 ) + ret = -1; + else + { + info->num_sockets = sysctl.u.getcqminfo.num_sockets; + info->num_rmid = sysctl.u.getcqminfo.num_rmid; + info->num_rmid_avail = sysctl.u.getcqminfo.num_rmid_avail; + } + + xc_hypercall_bounce_post(xch, data); + + return ret; +} + /* * Local variables: * mode: C diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h index 4ac6b8a..894e186 100644 --- a/tools/libxc/xenctrl.h +++ b/tools/libxc/xenctrl.h @@ -2395,4 +2395,16 @@ int xc_kexec_load(xc_interface *xch, uint8_t type, uint16_t arch, */ int xc_kexec_unload(xc_interface *xch, int type); +struct xc_sysctl_getcqminfo +{ + uint32_t num_rmid; + uint32_t num_rmid_avail; + uint32_t num_sockets; + xen_socket_cqmdata_t *cqmdata; +}; +typedef struct xc_sysctl_getcqminfo sysctl_cqminfo_t; + +int xc_domain_pqos_attach(xc_interface *xch, uint32_t domid, uint32_t flags); +int xc_domain_pqos_detach(xc_interface *xch, uint32_t domid, uint32_t flags); +int xc_domain_getcqminfolist(xc_interface *xch, sysctl_cqminfo_t *info); #endif /* XENCTRL_H */ diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index cf214bb..35f0b97 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -74,7 +74,8 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \ libxl_internal.o libxl_utils.o libxl_uuid.o \ libxl_json.o libxl_aoutils.o libxl_numa.o \ libxl_save_callout.o _libxl_save_msgs_callout.o \ - libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y) + libxl_qmp.o libxl_event.o libxl_fork.o libxl_pqos.o \ + $(LIBXL_OBJS-y) LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o $(LIBXL_OBJS): CFLAGS += $(CFLAGS_LIBXL) -include $(XEN_ROOT)/tools/config.h diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index c7dceda..fdca92d 100644 --- 
a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -285,6 +285,7 @@ #include <libxl_uuid.h> #include <_libxl_list.h> +#include <xenctrl.h> /* API compatibility. */ #ifdef LIBXL_API_VERSION @@ -1051,6 +1052,10 @@ int libxl_flask_getenforce(libxl_ctx *ctx); int libxl_flask_setenforce(libxl_ctx *ctx, int mode); int libxl_flask_loadpolicy(libxl_ctx *ctx, void *policy, uint32_t size); +int libxl_pqos_attach(libxl_ctx *ctx, uint32_t domid, const char * qos_type); +int libxl_pqos_detach(libxl_ctx *ctx, uint32_t domid, const char * qos_type); +int libxl_get_cqm_info(libxl_ctx *ctx, sysctl_cqminfo_t *info); + /* misc */ /* Each of these sets or clears the flag according to whether the diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 5e9cdcc..454c69d 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -40,6 +40,8 @@ int libxl__domain_create_info_setdefault(libxl__gc *gc, libxl_defbool_setdefault(&c_info->run_hotplug_scripts, true); + libxl_defbool_setdefault(&c_info->pqos_cqm, false); + return 0; } @@ -454,6 +456,7 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_create_info *info, } flags |= XEN_DOMCTL_CDF_hap; } + flags |= libxl_defbool_val(info->pqos_cqm) ? XEN_DOMCTL_CDF_pqos_cqm : 0; *domid = -1; /* Ultimately, handle is an array of 16 uint8_t, same as uuid */ diff --git a/tools/libxl/libxl_pqos.c b/tools/libxl/libxl_pqos.c new file mode 100644 index 0000000..bf7593a --- /dev/null +++ b/tools/libxl/libxl_pqos.c @@ -0,0 +1,108 @@ +/* + * Copyright (C) 2013 Intel Corporation + * Author Jiongxi Li <jiongxi.li@intel.com> + * Author Dongxiao Xu <dongxiao.xu@intel.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as published + * by the Free Software Foundation; version 2.1 only. with the special + * exception on linking described in file LICENSE. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + */ + +#include "libxl_osdeps.h" /* must come before any other headers */ +#include "libxl_internal.h" + +static const char * const msg[] = { + [EINVAL] = "invalid QoS resource type! Supported types: \"cqm\"", + [ENODEV] = "CQM is not supported in this system.", + [EEXIST] = "CQM is already attached to this domain.", + [ENOENT] = "CQM is not attached to this domain.", + [EUSERS] = "there is no free CQM RMID available.", + [ESRCH] = "is this Domain ID valid?", +}; + +int libxl_pqos_attach(libxl_ctx *ctx, uint32_t domid, const char * qos_type) +{ + int rc; + uint32_t flags = 0; + + if (!strncmp(qos_type, "cqm", 3)) + flags |= XEN_DOMCTL_pqos_cqm; + else { + rc = -EINVAL; + LIBXL__LOG(ctx, XTL_ERROR, "%s", msg[EINVAL]); + return rc; + } + + rc = xc_domain_pqos_attach(ctx->xch, domid, flags); + if (rc < 0) { + switch(errno) { + case EINVAL: + case ENODEV: + case EEXIST: + case EUSERS: + case ESRCH: + LIBXL__LOG(ctx, XTL_ERROR, "%s", msg[errno]); + break; + default: + LIBXL__LOG(ctx, XTL_ERROR, "errno: %d", errno); + } + } + + return rc; +} + +int libxl_pqos_detach(libxl_ctx *ctx, uint32_t domid, const char * qos_type) +{ + int rc; + uint32_t flags = 0; + + if (!strncmp(qos_type, "cqm", 3)) + flags |= XEN_DOMCTL_pqos_cqm; + else { + rc = -EINVAL; + LIBXL__LOG(ctx, XTL_ERROR, "%s", msg[EINVAL]); + return rc; + } + + rc = xc_domain_pqos_detach(ctx->xch, domid, flags); + if (rc < 0) { + switch(errno) { + case EINVAL: + case ENODEV: + case ENOENT: + case ESRCH: + LIBXL__LOG(ctx, XTL_ERROR, "%s", msg[errno]); + break; + default: + LIBXL__LOG(ctx, XTL_ERROR, "errno: %d", errno); + } + } + + return rc; +} + +int libxl_get_cqm_info(libxl_ctx *ctx, + sysctl_cqminfo_t *info) +{ + int ret; + + ret = 
xc_domain_getcqminfolist(ctx->xch, info); + if (ret < 0) + return -EINVAL; + + return ret; +} + +/* + * Local variables: + * mode: C + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index de5bac3..22688d8 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -275,6 +275,7 @@ libxl_domain_create_info = Struct("domain_create_info",[ ("poolid", uint32), ("run_hotplug_scripts",libxl_defbool), ("pvh", libxl_defbool), + ("pqos_cqm", libxl_defbool), ], dir=DIR_IN) libxl_domain_restore_params = Struct("domain_restore_params", [ diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h index e005c39..994d3be 100644 --- a/tools/libxl/xl.h +++ b/tools/libxl/xl.h @@ -105,6 +105,9 @@ int main_getenforce(int argc, char **argv); int main_setenforce(int argc, char **argv); int main_loadpolicy(int argc, char **argv); int main_remus(int argc, char **argv); +int main_pqosattach(int argc, char **argv); +int main_pqosdetach(int argc, char **argv); +int main_pqoslist(int argc, char **argv); void help(const char *command); diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c index 8690ec7..cc2a095 100644 --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -670,6 +670,8 @@ static void parse_config_data(const char *config_source, exit(1); } + xlu_cfg_get_defbool(config, "pqos_cqm", &c_info->pqos_cqm, 0); + libxl_domain_build_info_init_type(b_info, c_info->type); if (blkdev_start) b_info->blkdev_start = strdup(blkdev_start); @@ -7193,6 +7195,142 @@ int main_remus(int argc, char **argv) return -ERROR_FAIL; } +int main_pqosattach(int argc, char **argv) +{ + uint32_t domid; + int opt, rc; + const char *qos_type = NULL; + + SWITCH_FOREACH_OPT(opt, "", NULL, "pqos-attach", 2) { + /* No options */ + } + + qos_type = argv[optind]; + domid = find_domain(argv[optind + 1]); + + rc = libxl_pqos_attach(ctx, domid, qos_type); + + return rc; +} + +int main_pqosdetach(int argc, char 
**argv) +{ + uint32_t domid; + int opt, rc; + const char *qos_type = NULL; + + SWITCH_FOREACH_OPT(opt, "", NULL, "pqos-detach", 2) { + /* No options */ + } + + qos_type = argv[optind]; + domid = find_domain(argv[optind + 1]); + + rc = libxl_pqos_detach(ctx, domid, qos_type); + + return rc; +} + +static void print_cqm_info(const sysctl_cqminfo_t *info, uint32_t first_domain, + unsigned int num_domains) +{ + unsigned long i, j; + xen_socket_cqmdata_t *cqmdata; + char *domname; + int found = 0; + + if (info->num_rmid == 0) + printf("System doesn''t supoort CQM.\n"); + else if (info->num_rmid - info->num_rmid_avail == 1) + printf("No RMID is assigned to domains.\n"); + else { + printf("RMID count %5d\tRMID available %5d\n", + info->num_rmid, info->num_rmid_avail); + printf("Name ID"); + for (i = 0; i < info->num_sockets; i++) + printf("\tSocketID\tL3C_Usage"); + for (i = first_domain; i < (first_domain + num_domains); i++) { + found = 0; + for (j = 0; j < (info->num_rmid * info->num_sockets); j++) { + cqmdata = info->cqmdata + j; + if (!cqmdata->valid || cqmdata->domid != i) + continue; + if (!found) { + domname = libxl_domid_to_name(ctx, cqmdata->domid); + printf("\n%-40s %5d", domname, cqmdata->domid); + free(domname); + found = 1; + } + printf("%10u %16lu ", cqmdata->socket, cqmdata->l3c_occupancy); + } + } + printf("\n"); + } +} + +int main_pqoslist(int argc, char **argv) +{ + int opt; + const char *qos_type = NULL; + uint32_t first_domain; + unsigned int num_domains; + int rc = 0; + sysctl_cqminfo_t info; + + SWITCH_FOREACH_OPT(opt, "", NULL, "pqos-list", 1) { + /* No options */ + } + + qos_type = argv[optind]; + + if (!strncmp(qos_type, "cqm", 3)) { + if (optind + 1 >= argc) { + first_domain = 0; + num_domains = 1024; + } else if (optind + 1 == argc - 1) { + first_domain = find_domain(argv[optind + 1]); + num_domains = 1; + if (!libxl_domid_to_name(ctx, first_domain)) + { + fprintf(stderr, "Invalid domain id: %d.\n", first_domain); + return 1; + } + } else { + 
help("pqos-list"); + return 2; + } + + info.num_rmid= 256; + info.num_sockets = 128; + info.cqmdata = calloc(info.num_rmid * info.num_sockets, + sizeof(xen_socket_cqmdata_t)); + if (!info.cqmdata) { + fprintf(stderr, "Allocating domain cqminfo failed.\n"); + return ERROR_FAIL; + } + + rc = libxl_get_cqm_info(ctx, &info); + + if (rc < 0) { + fprintf(stderr, "Failed to get domain CQM info, " + "check whether CQM feature is supported.\n"); + if (info.cqmdata) + free(info.cqmdata); + return 1; + } + print_cqm_info(&info, first_domain, num_domains); + + if (info.cqmdata) + free(info.cqmdata); + } else { + fprintf(stderr, "QoS resource type supported is: cqm.\n"); + help("pqos-list"); + return 2; + } + + return 0; +} + /* * Local variables: * mode: C diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c index 326a660..6ced416 100644 --- a/tools/libxl/xl_cmdtable.c +++ b/tools/libxl/xl_cmdtable.c @@ -488,6 +488,21 @@ struct cmd_spec cmd_table[] = { " of the domain." }, + { "pqos-attach", + &main_pqosattach, 0, 1, + "Allocate and map qos resource", + "<Resource> <Domain>", + }, + { "pqos-detach", + &main_pqosdetach, 0, 1, + "Reliquish qos resource", + "<Resource> <Domain>", + }, + { "pqos-list", + &main_pqoslist, 0, 0, + "List qos information about all/some domains", + "<Resource> [Domain]", + }, }; int cmdtable_len = sizeof(cmd_table)/sizeof(struct cmd_spec); -- 1.7.9.5
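libxl_pqos_attach() and libxl_pqos_detach() in this patch index a sparse message table with errno and match the resource name with strncmp(qos_type, "cqm", 3), which also accepts longer strings that merely start with "cqm". One alternative shape is sketched below, as an illustration only: a bounded lookup that returns NULL for unknown codes, and an exact-match parser. The names pqos_errmsg(), parse_qos_type() and QOS_TYPE_CQM are hypothetical and are not part of libxl.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define QOS_TYPE_CQM (1u << 0)   /* hypothetical flag value */

/* Explicit (code, text) pairs avoid a sparse array sized by the largest errno. */
static const struct {
    int code;
    const char *text;
} pqos_errors[] = {
    { EINVAL, "invalid QoS resource type! Supported types: \"cqm\"" },
    { ENODEV, "CQM is not supported in this system." },
    { EEXIST, "CQM is already attached to this domain." },
    { ENOENT, "CQM is not attached to this domain." },
    { EUSERS, "there is no free CQM RMID available." },
    { ESRCH,  "is this Domain ID valid?" },
};

/* Returns the message for err, or NULL so the caller can print the raw code. */
const char *pqos_errmsg(int err)
{
    size_t i;

    for ( i = 0; i < sizeof(pqos_errors) / sizeof(pqos_errors[0]); i++ )
        if ( pqos_errors[i].code == err )
            return pqos_errors[i].text;
    return NULL;
}

/* Exact match: "cqm" is accepted, "cqmx" is not. Returns 0 if unknown. */
unsigned int parse_qos_type(const char *name)
{
    assert(name != NULL);
    return strcmp(name, "cqm") == 0 ? QOS_TYPE_CQM : 0;
}
```

A caller would log pqos_errmsg(errno) when it is non-NULL and fall back to printing the numeric errno otherwise, which is the same split the switch statements in the patch make.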
Andrew Cooper
2013-Nov-29 13:54 UTC
Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring feature
On 29/11/13 05:48, dongxiao.xu@intel.com wrote:> From: Dongxiao Xu <dongxiao.xu@intel.com> > > Detect platform QoS feature status and enumerate the resource types, > one of which is to monitor the L3 cache occupancy. > > Also introduce a Xen grub command line parameter to control the > QoS feature status globally. > > Signed-off-by: Jiongxi Li <jiongxi.li@intel.com> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>This is starting to look good. Sorry to keep nitpicking at it.> --- > xen/arch/x86/Makefile | 1 + > xen/arch/x86/cpu/intel.c | 6 +++ > xen/arch/x86/pqos.c | 95 ++++++++++++++++++++++++++++++++++++++ > xen/arch/x86/setup.c | 3 ++ > xen/include/asm-x86/cpufeature.h | 1 + > xen/include/asm-x86/pqos.h | 32 +++++++++++++ > 6 files changed, 138 insertions(+) > create mode 100644 xen/arch/x86/pqos.c > create mode 100644 xen/include/asm-x86/pqos.h > > diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile > index d502bdf..54962e0 100644 > --- a/xen/arch/x86/Makefile > +++ b/xen/arch/x86/Makefile > @@ -58,6 +58,7 @@ obj-y += crash.o > obj-y += tboot.o > obj-y += hpet.o > obj-y += xstate.o > +obj-y += pqos.o > > obj-$(crash_debug) += gdbstub.o > > diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c > index 27fe762..f0d83ea 100644 > --- a/xen/arch/x86/cpu/intel.c > +++ b/xen/arch/x86/cpu/intel.c > @@ -230,6 +230,12 @@ static void __devinit init_intel(struct cpuinfo_x86 *c) > ( c->cpuid_level >= 0x00000006 ) && > ( cpuid_eax(0x00000006) & (1u<<2) ) ) > set_bit(X86_FEATURE_ARAT, c->x86_capability); > + > + /* Check platform QoS monitoring capability */ > + if ((c->cpuid_level >= 0x00000007) && > + (cpuid_ebx(0x00000007) & (1u<<12))) > + set_bit(X86_FEATURE_QOSM, c->x86_capability); > + > } > > static struct cpu_dev intel_cpu_dev __cpuinitdata = { > diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c > new file mode 100644 > index 0000000..e2172c4 > --- /dev/null > +++ b/xen/arch/x86/pqos.c > @@ -0,0 +1,95 @@ > +/* > + * pqos.c: Platform QoS 
related service for guest. > + * > + * Copyright (c) 2013, Intel Corporation > + * Author: Jiongxi Li <jiongxi.li@intel.com> > + * Author: Dongxiao Xu <dongxiao.xu@intel.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License along with > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple > + * Place - Suite 330, Boston, MA 02111-1307 USA. > + */ > +#include <asm/processor.h> > +#include <xen/init.h> > +#include <asm/pqos.h> > + > +static bool_t __initdata pqos_enabled = 1; > +boolean_param("pqos", pqos_enabled); > + > +static unsigned int cqm_rmid_count = 256;

This can probably be __read_mostly. Also, I would suggest "opt_max_rmids" as a name, which makes it obvious when used elsewhere that it is a command line parameter limiting the possible RMIDs.

> +integer_param("cqm_rmid_count", cqm_rmid_count);

Any patch adding new command line parameters should also patch docs/misc/xen-command-line.markdown, to help keep the documentation up to date.

However, I feel that all options relating to platform QoS should be available under a "pqos" custom_param (similar to iommu=), so we don't end up with loads of new command line options with common prefixes as new features are added. For this, I would suggest semantics like:

pqos=[<boolean>],[max_rmids=<number>]

Particularly, when a second pqos option is introduced, it might be sensible to have individual booleans for each option, as well as a global enable/disable.
I would be interested in general views from others as far as this is concerned. ~Andrew> + > +unsigned int cqm_upscaling_factor = 0; > +bool_t cqm_enabled = 0; > +domid_t *cqm_rmid_array = NULL; > + > +static void __init init_cqm(void) > +{ > + unsigned int rmid; > + unsigned int eax, edx; > + unsigned int max_cqm_rmid; > + > + cpuid_count(0xf, 1, &eax, &cqm_upscaling_factor, &max_cqm_rmid, &edx); > + if ( !(edx & QOS_MONITOR_EVTID_L3) ) > + return; > + > + cqm_rmid_count = min(cqm_rmid_count, max_cqm_rmid + 1); > + > + cqm_rmid_array = xzalloc_array(domid_t, cqm_rmid_count);You fully populate the contents of the array lower. No need to waste time zeroing it first.> + if ( !cqm_rmid_array ) > + { > + cqm_rmid_count = 0; > + return; > + } > + > + /* Reserve RMID 0 for all domains not being monitored */ > + cqm_rmid_array[0] = DOMID_XEN; > + > + for ( rmid = 1; rmid < cqm_rmid_count; rmid++ ) > + cqm_rmid_array[rmid] = DOMID_INVALID; > + > + cqm_enabled = 1; > + > + printk(XENLOG_INFO "Cache QoS Monitoring Enabled.\n"); > +} > + > +static void __init init_qos_monitor(void) > +{ > + unsigned int qm_features; > + unsigned int eax, ebx, ecx; > + > + if ( !(boot_cpu_has(X86_FEATURE_QOSM)) ) > + return; > + > + cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features); > + > + if ( qm_features & QOS_MONITOR_TYPE_L3 ) > + init_cqm(); > +} > + > +void __init init_platform_qos(void) > +{ > + if ( !pqos_enabled ) > + return; > + > + init_qos_monitor(); > +} > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * tab-width: 4 > + * indent-tabs-mode: nil > + * End: > + */ > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c > index 5bf4ee0..95418e4 100644 > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -48,6 +48,7 @@ > #include <asm/setup.h> > #include <xen/cpu.h> > #include <asm/nmi.h> > +#include <asm/pqos.h> > > /* opt_nosmp: If true, secondary processors are ignored. 
*/ > static bool_t __initdata opt_nosmp; > @@ -1402,6 +1403,8 @@ void __init __start_xen(unsigned long mbi_p) > > domain_unpause_by_systemcontroller(dom0); > > + init_platform_qos(); > + > reset_stack_and_jump(init_done); > } > > diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h > index 1cfaf94..ca59668 100644 > --- a/xen/include/asm-x86/cpufeature.h > +++ b/xen/include/asm-x86/cpufeature.h > @@ -147,6 +147,7 @@ > #define X86_FEATURE_ERMS (7*32+ 9) /* Enhanced REP MOVSB/STOSB */ > #define X86_FEATURE_INVPCID (7*32+10) /* Invalidate Process Context ID */ > #define X86_FEATURE_RTM (7*32+11) /* Restricted Transactional Memory */ > +#define X86_FEATURE_QOSM (7*32+12) /* Platform QoS monitoring capability */ > #define X86_FEATURE_NO_FPU_SEL (7*32+13) /* FPU CS/DS stored as zero */ > #define X86_FEATURE_SMAP (7*32+20) /* Supervisor Mode Access Prevention */ > > diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h > new file mode 100644 > index 0000000..e8caca2 > --- /dev/null > +++ b/xen/include/asm-x86/pqos.h > @@ -0,0 +1,32 @@ > +/* > + * pqos.h: Platform QoS related service for guest. > + * > + * Copyright (c) 2013, Intel Corporation > + * Author: Jiongxi Li <jiongxi.li@intel.com> > + * Author: Dongxiao Xu <dongxiao.xu@intel.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License along with > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple > + * Place - Suite 330, Boston, MA 02111-1307 USA. 
> + */ > +#ifndef ASM_PQOS_H > +#define ASM_PQOS_H > + > +/* QoS Resource Type Enumeration */ > +#define QOS_MONITOR_TYPE_L3 0x2 > + > +/* QoS Monitoring Event ID */ > +#define QOS_MONITOR_EVTID_L3 0x1 > + > +void init_platform_qos(void); > + > +#endif
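The combined option syntax suggested in this review, pqos=[<boolean>],[max_rmids=<number>], could be parsed along the following lines. This is a stand-alone sketch for illustration: struct pqos_opts, parse_pqos_param() and the accepted boolean spellings are assumptions, and a real implementation would presumably be a custom_param() handler built on Xen's own parsing helpers.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct pqos_opts {
    bool enabled;            /* global on/off switch */
    unsigned int max_rmids;  /* upper bound on the RMIDs Xen will use */
};

/* Returns 1/0 for a recognised boolean token of length len, -1 otherwise. */
static int parse_bool_token(const char *s, size_t len)
{
    static const char *const yes[] = { "1", "yes", "on", "true", "enable" };
    static const char *const no[]  = { "0", "no", "off", "false", "disable" };
    size_t i;

    for ( i = 0; i < sizeof(yes) / sizeof(yes[0]); i++ )
    {
        if ( strlen(yes[i]) == len && !strncmp(s, yes[i], len) )
            return 1;
        if ( strlen(no[i]) == len && !strncmp(s, no[i], len) )
            return 0;
    }
    return -1;
}

/* Parses the text after "pqos=". Returns 0 on success, -1 on a bad token. */
int parse_pqos_param(const char *s, struct pqos_opts *opts)
{
    static const char key[] = "max_rmids=";
    const size_t klen = sizeof(key) - 1;

    assert(s != NULL && opts != NULL);

    while ( *s != '\0' )
    {
        size_t len = strcspn(s, ",");   /* length of the current token */
        int b = parse_bool_token(s, len);

        if ( b >= 0 )
            opts->enabled = b;
        else if ( len > klen && !strncmp(s, key, klen) &&
                  isdigit((unsigned char)s[klen]) )
        {
            char *end;
            unsigned long val = strtoul(s + klen, &end, 0);

            if ( end != s + len )       /* trailing junk inside the token */
                return -1;
            opts->max_rmids = (unsigned int)val;
        }
        else if ( len != 0 )            /* empty tokens are skipped */
            return -1;

        s += len;
        if ( *s == ',' )
            s++;
    }
    return 0;
}
```

Starting from defaults of { true, 256 }, the string "0,max_rmids=64" disables the feature and lowers the limit to 64, while "max_rmids=abc" is rejected. Per-feature booleans, as mentioned at the end of the comment, would be further keys handled the same way as max_rmids=.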
Andrew Cooper
2013-Nov-29 13:59 UTC
Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring feature
On 29/11/13 05:48, dongxiao.xu@intel.com wrote:> From: Dongxiao Xu <dongxiao.xu@intel.com> > > Detect platform QoS feature status and enumerate the resource types, > one of which is to monitor the L3 cache occupancy. > > Also introduce a Xen grub command line parameter to control the > QoS feature status globally. > > Signed-off-by: Jiongxi Li <jiongxi.li@intel.com> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > --- > xen/arch/x86/Makefile | 1 + > xen/arch/x86/cpu/intel.c | 6 +++ > xen/arch/x86/pqos.c | 95 ++++++++++++++++++++++++++++++++++++++ > xen/arch/x86/setup.c | 3 ++ > xen/include/asm-x86/cpufeature.h | 1 + > xen/include/asm-x86/pqos.h | 32 +++++++++++++ > 6 files changed, 138 insertions(+) > create mode 100644 xen/arch/x86/pqos.c > create mode 100644 xen/include/asm-x86/pqos.h > > diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile > index d502bdf..54962e0 100644 > --- a/xen/arch/x86/Makefile > +++ b/xen/arch/x86/Makefile > @@ -58,6 +58,7 @@ obj-y += crash.o > obj-y += tboot.o > obj-y += hpet.o > obj-y += xstate.o > +obj-y += pqos.o > > obj-$(crash_debug) += gdbstub.o > > diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c > index 27fe762..f0d83ea 100644 > --- a/xen/arch/x86/cpu/intel.c > +++ b/xen/arch/x86/cpu/intel.c > @@ -230,6 +230,12 @@ static void __devinit init_intel(struct cpuinfo_x86 *c) > ( c->cpuid_level >= 0x00000006 ) && > ( cpuid_eax(0x00000006) & (1u<<2) ) ) > set_bit(X86_FEATURE_ARAT, c->x86_capability); > + > + /* Check platform QoS monitoring capability */ > + if ((c->cpuid_level >= 0x00000007) && > + (cpuid_ebx(0x00000007) & (1u<<12))) > + set_bit(X86_FEATURE_QOSM, c->x86_capability); > + > } > > static struct cpu_dev intel_cpu_dev __cpuinitdata = { > diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c > new file mode 100644 > index 0000000..e2172c4 > --- /dev/null > +++ b/xen/arch/x86/pqos.c > @@ -0,0 +1,95 @@ > +/* > + * pqos.c: Platform QoS related service for guest. 
> + * > + * Copyright (c) 2013, Intel Corporation > + * Author: Jiongxi Li <jiongxi.li@intel.com> > + * Author: Dongxiao Xu <dongxiao.xu@intel.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License along with > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple > + * Place - Suite 330, Boston, MA 02111-1307 USA. > + */ > +#include <asm/processor.h> > +#include <xen/init.h> > +#include <asm/pqos.h> > + > +static bool_t __initdata pqos_enabled = 1; > +boolean_param("pqos", pqos_enabled); > + > +static unsigned int cqm_rmid_count = 256; > +integer_param("cqm_rmid_count", cqm_rmid_count); > + > +unsigned int cqm_upscaling_factor = 0; > +bool_t cqm_enabled = 0; > +domid_t *cqm_rmid_array = NULL;I feel this would be more meaningful being called "rmid_to_dom" or something similar. (but do admit that this is entirely personal taste). 
~Andrew> + > +static void __init init_cqm(void) > +{ > + unsigned int rmid; > + unsigned int eax, edx; > + unsigned int max_cqm_rmid; > + > + cpuid_count(0xf, 1, &eax, &cqm_upscaling_factor, &max_cqm_rmid, &edx); > + if ( !(edx & QOS_MONITOR_EVTID_L3) ) > + return; > + > + cqm_rmid_count = min(cqm_rmid_count, max_cqm_rmid + 1); > + > + cqm_rmid_array = xzalloc_array(domid_t, cqm_rmid_count); > + if ( !cqm_rmid_array ) > + { > + cqm_rmid_count = 0; > + return; > + } > + > + /* Reserve RMID 0 for all domains not being monitored */ > + cqm_rmid_array[0] = DOMID_XEN; > + > + for ( rmid = 1; rmid < cqm_rmid_count; rmid++ ) > + cqm_rmid_array[rmid] = DOMID_INVALID; > + > + cqm_enabled = 1; > + > + printk(XENLOG_INFO "Cache QoS Monitoring Enabled.\n"); > +} > + > +static void __init init_qos_monitor(void) > +{ > + unsigned int qm_features; > + unsigned int eax, ebx, ecx; > + > + if ( !(boot_cpu_has(X86_FEATURE_QOSM)) ) > + return; > + > + cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features); > + > + if ( qm_features & QOS_MONITOR_TYPE_L3 ) > + init_cqm(); > +} > + > +void __init init_platform_qos(void) > +{ > + if ( !pqos_enabled ) > + return; > + > + init_qos_monitor(); > +} > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * tab-width: 4 > + * indent-tabs-mode: nil > + * End: > + */ > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c > index 5bf4ee0..95418e4 100644 > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -48,6 +48,7 @@ > #include <asm/setup.h> > #include <xen/cpu.h> > #include <asm/nmi.h> > +#include <asm/pqos.h> > > /* opt_nosmp: If true, secondary processors are ignored. 
*/ > static bool_t __initdata opt_nosmp; > @@ -1402,6 +1403,8 @@ void __init __start_xen(unsigned long mbi_p) > > domain_unpause_by_systemcontroller(dom0); > > + init_platform_qos(); > + > reset_stack_and_jump(init_done); > } > > diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h > index 1cfaf94..ca59668 100644 > --- a/xen/include/asm-x86/cpufeature.h > +++ b/xen/include/asm-x86/cpufeature.h > @@ -147,6 +147,7 @@ > #define X86_FEATURE_ERMS (7*32+ 9) /* Enhanced REP MOVSB/STOSB */ > #define X86_FEATURE_INVPCID (7*32+10) /* Invalidate Process Context ID */ > #define X86_FEATURE_RTM (7*32+11) /* Restricted Transactional Memory */ > +#define X86_FEATURE_QOSM (7*32+12) /* Platform QoS monitoring capability */ > #define X86_FEATURE_NO_FPU_SEL (7*32+13) /* FPU CS/DS stored as zero */ > #define X86_FEATURE_SMAP (7*32+20) /* Supervisor Mode Access Prevention */ > > diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h > new file mode 100644 > index 0000000..e8caca2 > --- /dev/null > +++ b/xen/include/asm-x86/pqos.h > @@ -0,0 +1,32 @@ > +/* > + * pqos.h: Platform QoS related service for guest. > + * > + * Copyright (c) 2013, Intel Corporation > + * Author: Jiongxi Li <jiongxi.li@intel.com> > + * Author: Dongxiao Xu <dongxiao.xu@intel.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License along with > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple > + * Place - Suite 330, Boston, MA 02111-1307 USA. 
> + */ > +#ifndef ASM_PQOS_H > +#define ASM_PQOS_H > + > +/* QoS Resource Type Enumeration */ > +#define QOS_MONITOR_TYPE_L3 0x2 > + > +/* QoS Monitoring Event ID */ > +#define QOS_MONITOR_EVTID_L3 0x1 > + > +void init_platform_qos(void); > + > +#endif
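Taken together, the two review comments on the RMID table (the "rmid_to_dom" name, and not zeroing an array that is fully populated right afterwards) point towards an initialisation roughly like the sketch below. It is ordinary user-space C: malloc() stands in for the hypervisor allocator, the DOMID_* constants are placeholder values for this sketch, and rmid_table_init() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint16_t domid_t;

/* Placeholder values for this sketch only. */
#define DOMID_XEN      ((domid_t)0x7FF2)
#define DOMID_INVALID  ((domid_t)0x7FF4)

/*
 * Builds the RMID -> domain table. Every slot is written below, so a plain
 * allocation is enough and no zeroing pass is needed. Returns NULL on failure.
 */
domid_t *rmid_table_init(unsigned int nr_rmids)
{
    domid_t *rmid_to_dom;
    unsigned int rmid;

    assert(nr_rmids > 0);

    rmid_to_dom = malloc(nr_rmids * sizeof(*rmid_to_dom));
    if ( rmid_to_dom == NULL )
        return NULL;

    /* RMID 0 is reserved for all domains not being monitored. */
    rmid_to_dom[0] = DOMID_XEN;

    /* All other RMIDs start out unassigned. */
    for ( rmid = 1; rmid < nr_rmids; rmid++ )
        rmid_to_dom[rmid] = DOMID_INVALID;

    return rmid_to_dom;
}
```

The caller owns the returned table and would release it with free() in this user-space form.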
Andrew Cooper
2013-Nov-29 14:18 UTC
Re: [PATCH v3 2/7] x86: handle CQM resource when creating/destroying guests
On 29/11/13 05:48, dongxiao.xu@intel.com wrote:
> From: Dongxiao Xu <dongxiao.xu@intel.com> > > Allocate an RMID for a guest when it is created. This per-guest > RMID will be used to monitor Cache QoS related data. The RMID will > be relinquished when guest is destroyed. > > Signed-off-by: Jiongxi Li <jiongxi.li@intel.com> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > --- > xen/arch/x86/domain.c | 8 ++++++ > xen/arch/x86/pqos.c | 55 ++++++++++++++++++++++++++++++++++++++++++ > xen/common/domctl.c | 5 +++- > xen/include/asm-x86/domain.h | 2 ++ > xen/include/asm-x86/pqos.h | 5 ++++ > xen/include/public/domctl.h | 3 +++ > xen/include/xen/sched.h | 3 +++ > 7 files changed, 80 insertions(+), 1 deletion(-) > > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index a3868f9..41e1fc6 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -60,6 +60,7 @@ > #include <xen/numa.h> > #include <xen/iommu.h> > #include <compat/vcpu.h> > +#include <asm/pqos.h> > > DEFINE_PER_CPU(struct vcpu *, curr_vcpu); > DEFINE_PER_CPU(unsigned long, cr4); > @@ -579,6 +580,11 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) > tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0); > spin_lock_init(&d->arch.vtsc_lock); > > + /* Allocate CQM RMID for guest */ > + d->arch.pqos_cqm_rmid = 0; > + if ( system_supports_cqm() && (domcr_flags & DOMCRF_pqos_cqm) ) > + alloc_cqm_rmid(d); > +

Should we fail domain creation if an RMID cannot be allocated? I really can't decide which would be better.

In other words, if the toolstack issues a build with DOMCRF_pqos_cqm, and the build hypercall returns success, should the toolstack reasonably expect everything to be set up? I would think so.
If others agree wrt this expectation, then domain creation needs to fail also if "(domcr_flags & DOMCRF_pqos_cqm) && !system_supports_cqm()"> return 0; > > fail: > @@ -612,6 +618,8 @@ void arch_domain_destroy(struct domain *d) > > free_xenheap_page(d->shared_info); > cleanup_domain_irq_mapping(d); > + > + free_cqm_rmid(d); > } > > unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long guest_cr4) > diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c > index e2172c4..1148f3b 100644 > --- a/xen/arch/x86/pqos.c > +++ b/xen/arch/x86/pqos.c > @@ -20,6 +20,7 @@ > */ > #include <asm/processor.h> > #include <xen/init.h> > +#include <xen/spinlock.h> > #include <asm/pqos.h> > > static bool_t __initdata pqos_enabled = 1; > @@ -31,6 +32,7 @@ integer_param("cqm_rmid_count", cqm_rmid_count); > unsigned int cqm_upscaling_factor = 0; > bool_t cqm_enabled = 0; > domid_t *cqm_rmid_array = NULL; > +static DEFINE_SPINLOCK(cqm_lock); > > static void __init init_cqm(void) > { > @@ -84,6 +86,59 @@ void __init init_platform_qos(void) > init_qos_monitor(); > } > > +bool_t system_supports_cqm(void) > +{ > + return cqm_enabled; > +} > + > +int alloc_cqm_rmid(struct domain *d) > +{ > + int rc = 0; > + unsigned int rmid; > + unsigned long flags; > + > + ASSERT(system_supports_cqm()); > + > + spin_lock_irqsave(&cqm_lock, flags); > + /* RMID=0 is reserved, enumerate from 1 */ > + for ( rmid = 1; rmid < cqm_rmid_count; rmid++ ) > + { > + if ( cqm_rmid_array[rmid] != DOMID_INVALID) > + continue; > + > + cqm_rmid_array[rmid] = d->domain_id; > + break; > + } > + spin_unlock_irqrestore(&cqm_lock, flags); > + > + /* No CQM RMID available, assign RMID=0 by default */ > + if ( rmid == cqm_rmid_count ) > + { > + rmid = 0; > + rc = -1; > + } > + > + d->arch.pqos_cqm_rmid = rmid; > + > + return rc; > +} > + > +void free_cqm_rmid(struct domain *d) > +{ > + unsigned int rmid = d->arch.pqos_cqm_rmid; > + unsigned long flags; > + > + /* We do not free system reserved "RMID=0" */ > + if ( rmid 
== 0 ) > + return; > + > + spin_lock_irqsave(&cqm_lock, flags); > + cqm_rmid_array[rmid] = DOMID_INVALID; > + spin_unlock_irqrestore(&cqm_lock, flags);Does this need the spinlock? given rmid in the range 1 to max_rmids, there can''t be any competition over who owns the entry, and setting it back to DOMID_INVALID wont race with the allocation loop.> + > + d->arch.pqos_cqm_rmid = 0; > +} > + > /* > * Local variables: > * mode: C > diff --git a/xen/common/domctl.c b/xen/common/domctl.c > index 904d27b..1c2e320 100644 > --- a/xen/common/domctl.c > +++ b/xen/common/domctl.c > @@ -425,7 +425,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > | XEN_DOMCTL_CDF_pvh_guest > | XEN_DOMCTL_CDF_hap > | XEN_DOMCTL_CDF_s3_integrity > - | XEN_DOMCTL_CDF_oos_off)) ) > + | XEN_DOMCTL_CDF_oos_off > + | XEN_DOMCTL_CDF_pqos_cqm)) )If you move the final bracket onto a new line now, future additions will be a single line addition rather than one removal and two additions. I am not sure what the prerogative is with that in terms of the Xen coding style. 
~Andrew> break; > > dom = op->domain; > @@ -467,6 +468,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > domcr_flags |= DOMCRF_s3_integrity; > if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_oos_off ) > domcr_flags |= DOMCRF_oos_off; > + if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_pqos_cqm ) > + domcr_flags |= DOMCRF_pqos_cqm; > > d = domain_create(dom, domcr_flags, op->u.createdomain.ssidref); > if ( IS_ERR(d) ) > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h > index 9d39061..9487251 100644 > --- a/xen/include/asm-x86/domain.h > +++ b/xen/include/asm-x86/domain.h > @@ -313,6 +313,8 @@ struct arch_domain > spinlock_t e820_lock; > struct e820entry *e820; > unsigned int nr_e820; > + > + unsigned int pqos_cqm_rmid; /* CQM RMID assigned to the domain */ > } __cacheline_aligned; > > #define has_arch_pdevs(d) (!list_empty(&(d)->arch.pdev_list)) > diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h > index e8caca2..c54905b 100644 > --- a/xen/include/asm-x86/pqos.h > +++ b/xen/include/asm-x86/pqos.h > @@ -20,6 +20,7 @@ > */ > #ifndef ASM_PQOS_H > #define ASM_PQOS_H > +#include <xen/sched.h> > > /* QoS Resource Type Enumeration */ > #define QOS_MONITOR_TYPE_L3 0x2 > @@ -29,4 +30,8 @@ > > void init_platform_qos(void); > > +bool_t system_supports_cqm(void); > +int alloc_cqm_rmid(struct domain *d); > +void free_cqm_rmid(struct domain *d); > + > #endif > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h > index 01a3652..47a850a 100644 > --- a/xen/include/public/domctl.h > +++ b/xen/include/public/domctl.h > @@ -62,6 +62,9 @@ struct xen_domctl_createdomain { > /* Is this a PVH guest (as opposed to an HVM or PV guest)? */ > #define _XEN_DOMCTL_CDF_pvh_guest 4 > #define XEN_DOMCTL_CDF_pvh_guest (1U<<_XEN_DOMCTL_CDF_pvh_guest) > + /* Enable pqos-cqm? 
*/ > +#define _XEN_DOMCTL_CDF_pqos_cqm 5 > +#define XEN_DOMCTL_CDF_pqos_cqm (1U<<_XEN_DOMCTL_CDF_pqos_cqm) > uint32_t flags; > }; > typedef struct xen_domctl_createdomain xen_domctl_createdomain_t; > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > index cbdf377..3a42656 100644 > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -507,6 +507,9 @@ struct domain *domain_create( > /* DOMCRF_pvh: Create PV domain in HVM container. */ > #define _DOMCRF_pvh 5 > #define DOMCRF_pvh (1U<<_DOMCRF_pvh) > + /* DOMCRF_pqos_cqm: Create a domain with CQM support */ > +#define _DOMCRF_pqos_cqm 6 > +#define DOMCRF_pqos_cqm (1U<<_DOMCRF_pqos_cqm) > > /* > * rcu_lock_domain_by_id() is more efficient than get_domain_by_id().
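The first review question above (should creation fail when CQM was requested but no RMID can be provided?) can be illustrated with a small stand-alone sketch. This is not code from the series: rmid_alloc(), domain_create_sketch(), RMID_COUNT, OWNER_NONE and the choice of -ENOSPC are hypothetical names and values picked for illustration, locking is left out, and plain C types stand in for the Xen ones.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define RMID_COUNT 4u        /* assumed pool size; RMID 0 is reserved */
#define OWNER_NONE 0xFFFFu   /* stand-in for Xen's DOMID_INVALID */

static uint16_t rmid_owner[RMID_COUNT];
static bool cqm_supported = true;   /* stand-in for system_supports_cqm() */

/* Mark every RMID as unowned. */
static void rmid_pool_init(void)
{
    for (unsigned int i = 0; i < RMID_COUNT; i++)
        rmid_owner[i] = OWNER_NONE;
}

/* Returns the allocated RMID (>= 1) or a negative errno value. */
static int rmid_alloc(uint16_t domid)
{
    assert(cqm_supported);   /* mirrors ASSERT(system_supports_cqm()) */

    /* RMID 0 is reserved, so the search starts at 1. */
    for (unsigned int rmid = 1; rmid < RMID_COUNT; rmid++) {
        if (rmid_owner[rmid] == OWNER_NONE) {
            rmid_owner[rmid] = domid;
            return (int)rmid;
        }
    }
    return -ENOSPC;          /* pool exhausted; error code is an assumption */
}

/* Creation path: a requested but unavailable RMID fails the whole call. */
static int domain_create_sketch(uint16_t domid, bool want_cqm,
                                unsigned int *rmid_out)
{
    int rc;

    *rmid_out = 0;
    if (!want_cqm)
        return 0;
    if (!cqm_supported)
        return -ENODEV;

    rc = rmid_alloc(domid);
    if (rc < 0)
        return rc;

    *rmid_out = (unsigned int)rc;
    return 0;
}
```

With this shape a caller that asked for CQM either gets a usable RMID or a distinct error, rather than a successful return with RMID 0 silently assigned.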
Andrew Cooper
2013-Nov-29 14:22 UTC
Re: [PATCH v3 3/7] x86: dynamically attach/detach CQM service for a guest
On 29/11/13 05:48, dongxiao.xu@intel.com wrote:
> From: Dongxiao Xu <dongxiao.xu@intel.com>
>
> Add hypervisor side support for dynamically attach and detach CQM
> services for a certain guest.
>
> Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> ---
>  xen/arch/x86/domctl.c       |   40 ++++++++++++++++++++++++++++++++++++++++
>  xen/include/public/domctl.h |   14 ++++++++++++++
>  2 files changed, 54 insertions(+)
>
> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index f7e4586..7007990 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -35,6 +35,7 @@
>  #include <asm/mem_sharing.h>
>  #include <asm/xstate.h>
>  #include <asm/debugger.h>
> +#include <asm/pqos.h>
>
>  static int gdbsx_guest_mem_io(
>      domid_t domid, struct xen_domctl_gdbsx_memio *iop)
> @@ -1223,6 +1224,45 @@ long arch_do_domctl(
>      }
>      break;
>
> +    case XEN_DOMCTL_attach_pqos:
> +    {
> +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
> +        {
> +            if ( !system_supports_cqm() )
> +                ret = -ENODEV;
> +            else if ( d->arch.pqos_cqm_rmid > 0 )
> +                ret = -EEXIST;
> +            else
> +            {
> +                ret = alloc_cqm_rmid(d);
> +                if ( ret < 0 )
> +                    ret = -EUSERS;
> +            }
> +        }
> +        else
> +            ret = -EINVAL;
> +    }
> +    break;
> +
> +    case XEN_DOMCTL_detach_pqos:
> +    {
> +        if ( domctl->u.qos_type.flags & XEN_DOMCTL_pqos_cqm )
> +        {
> +            if ( !system_supports_cqm() )
> +                ret = -ENODEV;
> +            else if ( d->arch.pqos_cqm_rmid > 0 )
> +            {
> +                free_cqm_rmid(d);
> +                ret = 0;
> +            }
> +            else
> +                ret = -ENOENT;
> +        }
> +        else
> +            ret = -EINVAL;
> +    }
> +    break;
> +
>      default:
>          ret = iommu_do_domctl(domctl, d, u_domctl);
>          break;
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 47a850a..800b2f4 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -872,6 +872,17 @@ struct xen_domctl_set_max_evtchn {
>  typedef struct xen_domctl_set_max_evtchn xen_domctl_set_max_evtchn_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_max_evtchn_t);
>
> +/* XEN_DOMCTL_attach_pqos */
> +/* XEN_DOMCTL_detach_pqos */
> +struct xen_domctl_qos_type {
> +    /* Attach or detach flag for cqm */
> +#define _XEN_DOMCTL_pqos_cqm 0
> +#define XEN_DOMCTL_pqos_cqm (1U<<_XEN_DOMCTL_pqos_cqm)
> +    uint32_t flags;

How many different QoS do you think might come to be? Might it be worth
making this a uint64_t before the ABI is set?

Either way, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> +};
> +typedef struct xen_domctl_qos_type xen_domctl_qos_type_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_qos_type_t);
> +
>  struct xen_domctl {
>      uint32_t cmd;
>  #define XEN_DOMCTL_createdomain 1
> @@ -941,6 +952,8 @@ struct xen_domctl {
>  #define XEN_DOMCTL_setnodeaffinity 68
>  #define XEN_DOMCTL_getnodeaffinity 69
>  #define XEN_DOMCTL_set_max_evtchn 70
> +#define XEN_DOMCTL_attach_pqos 71
> +#define XEN_DOMCTL_detach_pqos 72
>  #define XEN_DOMCTL_gdbsx_guestmemio 1000
>  #define XEN_DOMCTL_gdbsx_pausevcpu 1001
>  #define XEN_DOMCTL_gdbsx_unpausevcpu 1002
> @@ -1001,6 +1014,7 @@ struct xen_domctl {
>          struct xen_domctl_set_broken_page_p2m set_broken_page_p2m;
>          struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
>          struct xen_domctl_gdbsx_domstatus gdbsx_domstatus;
> +        struct xen_domctl_qos_type qos_type;
>          uint8_t pad[128];
>      } u;
>  };
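The error codes chosen by the attach/detach handlers in this patch can be read as a small state table. The following is a minimal stand-alone sketch of that table, assuming a 64-bit flags word as floated in the review. Everything here is hypothetical scaffolding: struct dom_sketch, pqos_attach(), pqos_detach(), RMID_LIMIT and the deliberately trivial allocator (it never reuses an RMID) are not part of the series, and rejecting unknown flag bits is an added assumption rather than something the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define PQOS_FLAG_CQM    (UINT64_C(1) << 0)  /* mirrors XEN_DOMCTL_pqos_cqm */
#define PQOS_FLAGS_KNOWN PQOS_FLAG_CQM       /* all bits this sketch understands */
#define RMID_LIMIT       3u                  /* assumed: RMIDs 1 and 2 are usable */

struct dom_sketch {
    unsigned int rmid;                       /* 0 means "no CQM attached" */
};

static int cqm_supported = 1;                /* stand-in for system_supports_cqm() */
static unsigned int next_free_rmid = 1;      /* trivial allocator, never reuses */

static int pqos_attach(struct dom_sketch *d, uint64_t flags)
{
    assert(d != NULL);

    /* Assumption: unknown bits are rejected so they stay free for later use. */
    if (flags == 0 || (flags & ~PQOS_FLAGS_KNOWN))
        return -EINVAL;
    if (!cqm_supported)
        return -ENODEV;
    if (d->rmid > 0)
        return -EEXIST;                      /* already attached */
    if (next_free_rmid >= RMID_LIMIT)
        return -EUSERS;                      /* no RMID left */

    d->rmid = next_free_rmid++;
    return 0;
}

static int pqos_detach(struct dom_sketch *d, uint64_t flags)
{
    assert(d != NULL);

    if (flags == 0 || (flags & ~PQOS_FLAGS_KNOWN))
        return -EINVAL;
    if (!cqm_supported)
        return -ENODEV;
    if (d->rmid == 0)
        return -ENOENT;                      /* nothing to detach */

    d->rmid = 0;
    return 0;
}
```

Widening the flags field costs nothing in this sketch; whether it is worth doing in the public ABI is the open question raised in the review.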
Andrew Cooper
2013-Nov-29 14:53 UTC
Re: [PATCH v3 4/7] x86: collect CQM information from all sockets
On 29/11/13 05:48, dongxiao.xu@intel.com wrote:
> From: Dongxiao Xu <dongxiao.xu@intel.com>
>
> Collect CQM information (L3 cache occupancy) from all sockets.
> Upper layer application can parse the data structure to get the
> information of guest's L3 cache occupancy on certain sockets.
>
> Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> ---
>  xen/arch/x86/pqos.c             |   59 ++++++++++++++++++++++++++
>  xen/arch/x86/sysctl.c           |   89 +++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/msr-index.h |    4 ++
>  xen/include/asm-x86/pqos.h      |    8 ++++
>  xen/include/public/domctl.h     |    9 ++++
>  xen/include/public/sysctl.h     |   11 +++++
>  6 files changed, 180 insertions(+)
>
> diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
> index 1148f3b..615c5ea 100644
> --- a/xen/arch/x86/pqos.c
> +++ b/xen/arch/x86/pqos.c
> @@ -19,6 +19,7 @@
>   * Place - Suite 330, Boston, MA 02111-1307 USA.
>   */
>  #include <asm/processor.h>
> +#include <asm/msr.h>
>  #include <xen/init.h>
>  #include <xen/spinlock.h>
>  #include <asm/pqos.h>
> @@ -91,6 +92,26 @@ bool_t system_supports_cqm(void)
>      return cqm_enabled;
>  }
>
> +unsigned int get_cqm_count(void)
> +{
> +    return cqm_rmid_count;
> +}
> +
> +unsigned int get_cqm_avail(void)
> +{
> +    unsigned int rmid, cqm_avail = 0;
> +    unsigned long flags;
> +
> +    spin_lock_irqsave(&cqm_lock, flags);
> +    /* RMID=0 is reserved, enumerate from 1 */
> +    for ( rmid = 1; rmid < cqm_rmid_count; rmid++ )
> +        if ( cqm_rmid_array[rmid] == DOMID_INVALID )
> +            cqm_avail++;
> +    spin_unlock_irqrestore(&cqm_lock, flags);
> +
> +    return cqm_avail;

This cqm_avail is stale as soon as you release the lock.

> +}
> +
>  int alloc_cqm_rmid(struct domain *d)
>  {
>      int rc = 0;
> @@ -139,6 +160,44 @@ void free_cqm_rmid(struct domain *d)
>      d->arch.pqos_cqm_rmid = 0;
>  }
>
> +static void read_cqm_data(void *arg)
> +{
> +    uint64_t cqm_data;
> +    unsigned int rmid;
> +    int socket = cpu_to_socket(smp_processor_id());
> +    struct xen_socket_cqmdata *data = arg;
> +    unsigned long flags, i;
> +
> +    if ( socket < 0 )
> +        return;
> +
> +    spin_lock_irqsave(&cqm_lock, flags);
> +    /* RMID=0 is reserved, enumerate from 1 */
> +    for ( rmid = 1; rmid < cqm_rmid_count; rmid++ )
> +    {
> +        if ( cqm_rmid_array[rmid] == DOMID_INVALID )
> +            continue;
> +
> +        wrmsr(MSR_IA32_QOSEVTSEL, QOS_MONITOR_EVTID_L3, rmid);
> +        rdmsrl(MSR_IA32_QMC, cqm_data);
> +
> +        i = socket * cqm_rmid_count + rmid;
> +        data[i].valid = !(cqm_data & IA32_QM_CTR_ERROR_MASK);
> +        if ( data[i].valid )
> +        {
> +            data[i].l3c_occupancy = cqm_data * cqm_upscaling_factor;
> +            data[i].socket = socket;
> +            data[i].domid = cqm_rmid_array[rmid];
> +        }
> +    }
> +    spin_unlock_irqrestore(&cqm_lock, flags);
> +}
> +
> +void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data)
> +{
> +    on_selected_cpus(cpu_cqmdata_map, read_cqm_data, data, 1);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> index 15d4b91..f916fe6 100644
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -28,6 +28,7 @@
>  #include <xen/nodemask.h>
>  #include <xen/cpu.h>
>  #include <xsm/xsm.h>
> +#include <asm/pqos.h>
>
>  #define get_xen_guest_handle(val, hnd) do { val = (hnd).p; } while (0)
>
> @@ -66,6 +67,47 @@ void arch_do_physinfo(xen_sysctl_physinfo_t *pi)
>      pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm_directio;
>  }
>
> +/* Select one random CPU for each socket */

I know this is not specifically a fault of your code, but these masks of
cpus on specific sockets is really information which should be set up on
boot and tweaked on cpu_up/down. It should certainly not be recalculated
from scratch every time this hypercall is made.

(And that would prevent needing to make an xalloc/xfree on the hypercall
path)

~Andrew

> +static void select_socket_cpu(cpumask_t *cpu_bitmap)
> +{
> +    int i;
> +    unsigned int cpu;
> +    cpumask_t *socket_cpuset;
> +    int max_socket = 0;
> +    unsigned int num_cpus = num_online_cpus();
> +    DECLARE_BITMAP(sockets, num_cpus);
> +
> +    cpumask_clear(cpu_bitmap);
> +
> +    for_each_online_cpu(cpu)
> +    {
> +        i = cpu_to_socket(cpu);
> +        if ( i < 0 || test_and_set_bit(i, sockets) )
> +            continue;
> +        max_socket = max(max_socket, i);
> +    }
> +
> +    socket_cpuset = xzalloc_array(cpumask_t, max_socket + 1);
> +    if ( !socket_cpuset )
> +        return;
> +
> +    for_each_online_cpu(cpu)
> +    {
> +        i = cpu_to_socket(cpu);
> +        if ( i < 0 )
> +            continue;
> +        cpumask_set_cpu(cpu, &socket_cpuset[i]);
> +    }
> +
> +    for ( i = 0; i <= max_socket; i++ )
> +    {
> +        cpu = cpumask_any(&socket_cpuset[i]);
> +        cpumask_set_cpu(cpu, cpu_bitmap);
> +    }
> +
> +    xfree(socket_cpuset);
> +}
> +
>  long arch_do_sysctl(
>      struct xen_sysctl *sysctl, XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>  {
> @@ -101,6 +143,53 @@ long arch_do_sysctl(
>      }
>      break;
>
> +    case XEN_SYSCTL_getcqminfo:
> +    {
> +        struct xen_socket_cqmdata *info;
> +        uint32_t num_sockets;
> +        uint32_t num_rmid;
> +        cpumask_t cpu_cqmdata_map;
> +
> +        if ( !system_supports_cqm() )
> +        {
> +            ret = -ENODEV;
> +            break;
> +        }
> +
> +        select_socket_cpu(&cpu_cqmdata_map);
> +
> +        num_sockets = min((unsigned int)cpumask_weight(&cpu_cqmdata_map),
> +                          sysctl->u.getcqminfo.num_sockets);
> +        num_rmid = get_cqm_count();
> +        info = xzalloc_array(struct xen_socket_cqmdata,
> +                             num_rmid * num_sockets);
> +        if ( !info )
> +        {
> +            ret = -ENOMEM;
> +            break;
> +        }
> +
> +        get_cqm_info(&cpu_cqmdata_map, info);
> +
> +        if ( copy_to_guest_offset(sysctl->u.getcqminfo.buffer,
> +                                  0, info, num_rmid * num_sockets) )
> +        {
> +            ret = -EFAULT;
> +            xfree(info);
> +            break;
> +        }
> +
> +        sysctl->u.getcqminfo.num_rmid = num_rmid;
> +        sysctl->u.getcqminfo.num_rmid_avail = get_cqm_avail();
> +        sysctl->u.getcqminfo.num_sockets = num_sockets;
> +
> +        if ( copy_to_guest(u_sysctl, sysctl, 1) )
> +            ret = -EFAULT;
> +
> +        xfree(info);
> +    }
> +    break;
> +
>      default:
>          ret = -ENOSYS;
>          break;
> diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
> index e597a28..46ef165 100644
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -488,4 +488,8 @@
>  /* Geode defined MSRs */
>  #define MSR_GEODE_BUSCONT_CONF0 0x00001900
>
> +/* Platform QoS register */
> +#define MSR_IA32_QOSEVTSEL 0x00000c8d
> +#define MSR_IA32_QMC 0x00000c8e
> +
>  #endif /* __ASM_MSR_INDEX_H */
> diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
> index c54905b..2ab9277 100644
> --- a/xen/include/asm-x86/pqos.h
> +++ b/xen/include/asm-x86/pqos.h
> @@ -21,6 +21,8 @@
>  #ifndef ASM_PQOS_H
>  #define ASM_PQOS_H
>  #include <xen/sched.h>
> +#include <xen/cpumask.h>
> +#include <public/domctl.h>
>
>  /* QoS Resource Type Enumeration */
>  #define QOS_MONITOR_TYPE_L3 0x2
> @@ -28,10 +30,16 @@
>  /* QoS Monitoring Event ID */
>  #define QOS_MONITOR_EVTID_L3 0x1
>
> +/* IA32_QM_CTR */
> +#define IA32_QM_CTR_ERROR_MASK (0x3ul << 62)
> +
>  void init_platform_qos(void);
>
>  bool_t system_supports_cqm(void);
>  int alloc_cqm_rmid(struct domain *d);
>  void free_cqm_rmid(struct domain *d);
> +unsigned int get_cqm_count(void);
> +unsigned int get_cqm_avail(void);
> +void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data);
>
>  #endif
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 800b2f4..53c740e 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -883,6 +883,15 @@ struct xen_domctl_qos_type {
>  typedef struct xen_domctl_qos_type xen_domctl_qos_type_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_qos_type_t);
>
> +struct xen_socket_cqmdata {
> +    uint64_t l3c_occupancy;
> +    uint32_t socket;
> +    domid_t domid;
> +    uint8_t valid;
> +};
> +typedef struct xen_socket_cqmdata xen_socket_cqmdata_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_socket_cqmdata_t);
> +
>  struct xen_domctl {
>      uint32_t cmd;
>  #define XEN_DOMCTL_createdomain 1
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 8437d31..85eee16 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -632,6 +632,15 @@ struct xen_sysctl_coverage_op {
>  typedef struct xen_sysctl_coverage_op xen_sysctl_coverage_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
>
> +/* XEN_SYSCTL_getcqminfo */
> +struct xen_sysctl_getcqminfo {
> +    XEN_GUEST_HANDLE_64(xen_socket_cqmdata_t) buffer; /* OUT */
> +    uint32_t num_sockets; /* IN/OUT */
> +    uint32_t num_rmid; /* OUT */
> +    uint32_t num_rmid_avail; /* OUT */
> +};
> +typedef struct xen_sysctl_getcqminfo xen_sysctl_getcqminfo_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_getcqminfo_t);
>
>  struct xen_sysctl {
>      uint32_t cmd;
> @@ -654,6 +663,7 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_cpupool_op 18
>  #define XEN_SYSCTL_scheduler_op 19
>  #define XEN_SYSCTL_coverage_op 20
> +#define XEN_SYSCTL_getcqminfo 21
>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>      union {
>          struct xen_sysctl_readconsole readconsole;
> @@ -675,6 +685,7 @@ struct xen_sysctl {
>          struct xen_sysctl_cpupool_op cpupool_op;
>          struct xen_sysctl_scheduler_op scheduler_op;
>          struct xen_sysctl_coverage_op coverage_op;
> +        struct xen_sysctl_getcqminfo getcqminfo;
>          uint8_t pad[128];
>      } u;
>  };
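The review point about select_socket_cpu() is that per-socket CPU sets could be kept up to date on CPU up/down instead of being rebuilt, with an allocation, on every hypercall. A minimal stand-alone sketch of that idea follows. It is only an illustration: a uint64_t stands in for cpumask_t, and socket_cpu_up(), socket_cpu_down(), one_cpu_per_socket(), MAX_CPUS and MAX_SOCKETS are hypothetical names and limits, not Xen interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS    64u   /* assumed limit so one uint64_t can act as a cpumask */
#define MAX_SOCKETS 8u    /* assumed limit for illustration */

/* Online CPUs per socket, maintained incrementally. */
static uint64_t socket_cpus[MAX_SOCKETS];

/* Would be called from the CPU online path. */
static void socket_cpu_up(unsigned int cpu, unsigned int socket)
{
    assert(cpu < MAX_CPUS && socket < MAX_SOCKETS);
    socket_cpus[socket] |= UINT64_C(1) << cpu;
}

/* Would be called from the CPU offline path. */
static void socket_cpu_down(unsigned int cpu, unsigned int socket)
{
    assert(cpu < MAX_CPUS && socket < MAX_SOCKETS);
    socket_cpus[socket] &= ~(UINT64_C(1) << cpu);
}

/*
 * Build a mask holding one CPU (the lowest-numbered online one) for each
 * populated socket. No allocation is needed on this path.
 */
static uint64_t one_cpu_per_socket(void)
{
    uint64_t picked = 0;

    for (unsigned int s = 0; s < MAX_SOCKETS; s++) {
        uint64_t m = socket_cpus[s];

        picked |= m & (~m + 1);   /* isolates the lowest set bit; 0 stays 0 */
    }
    return picked;
}
```

The hypercall path then only reads state that the hotplug path already maintains, which is what removes the xalloc/xfree pair mentioned in the review.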
Andrew Cooper
2013-Nov-29 14:56 UTC
Re: [PATCH v3 5/7] x86: enable CQM monitoring for each domain RMID
On 29/11/13 05:48, dongxiao.xu@intel.com wrote:
> From: Dongxiao Xu <dongxiao.xu@intel.com>
>
> If the CQM service is attached to a domain, its related RMID will be set
> to hardware for monitoring when the domain's vcpu is scheduled in. When
> the domain's vcpu is scheduled out, RMID 0 (system reserved) will be set
> for monitoring.
>
> Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> ---
>  xen/arch/x86/domain.c           |    5 +++++
>  xen/arch/x86/pqos.c             |   10 ++++++++++
>  xen/include/asm-x86/msr-index.h |    1 +
>  xen/include/asm-x86/pqos.h      |    1 +
>  4 files changed, 17 insertions(+)
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 41e1fc6..628f7eb 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1371,6 +1371,8 @@ static void __context_switch(void)
>      {
>          memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
>          vcpu_save_fpu(p);
> +        if ( system_supports_cqm() )
> +            cqm_assoc_rmid(0);
>          p->arch.ctxt_switch_from(p);
>      }
>
> @@ -1395,6 +1397,9 @@ static void __context_switch(void)
>          }
>          vcpu_restore_fpu_eager(n);
>          n->arch.ctxt_switch_to(n);
> +
> +        if ( system_supports_cqm() && n->domain->arch.pqos_cqm_rmid > 0 )
> +            cqm_assoc_rmid(n->domain->arch.pqos_cqm_rmid);
>      }
>
>      gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
> diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
> index 615c5ea..1faa650 100644
> --- a/xen/arch/x86/pqos.c
> +++ b/xen/arch/x86/pqos.c
> @@ -29,6 +29,7 @@ boolean_param("pqos", pqos_enabled);
>
>  static unsigned int cqm_rmid_count = 256;
>  integer_param("cqm_rmid_count", cqm_rmid_count);
> +static uint64_t rmid_mask;
>
>  unsigned int cqm_upscaling_factor = 0;
>  bool_t cqm_enabled = 0;
> @@ -75,6 +76,8 @@ static void __init init_qos_monitor(void)
>
>      cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features);
>
> +    rmid_mask = ~(~0ull << get_count_order(ebx));
> +
>      if ( qm_features & QOS_MONITOR_TYPE_L3 )
>          init_cqm();
>  }
> @@ -198,6 +201,13 @@ void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data)
>      on_selected_cpus(cpu_cqmdata_map, read_cqm_data, data, 1);
>  }
>
> +void cqm_assoc_rmid(unsigned int rmid)
> +{
> +    uint64_t val;

Xen style requires a newline here.

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> +    rdmsrl(MSR_IA32_PQR_ASSOC, val);
> +    wrmsrl(MSR_IA32_PQR_ASSOC, (val & ~(rmid_mask)) | (rmid & rmid_mask));
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
> index 46ef165..45f4918 100644
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -491,5 +491,6 @@
>  /* Platform QoS register */
>  #define MSR_IA32_QOSEVTSEL 0x00000c8d
>  #define MSR_IA32_QMC 0x00000c8e
> +#define MSR_IA32_PQR_ASSOC 0x00000c8f
>
>  #endif /* __ASM_MSR_INDEX_H */
> diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
> index 2ab9277..c75643a 100644
> --- a/xen/include/asm-x86/pqos.h
> +++ b/xen/include/asm-x86/pqos.h
> @@ -41,5 +41,6 @@ void free_cqm_rmid(struct domain *d);
>  unsigned int get_cqm_count(void);
>  unsigned int get_cqm_avail(void);
>  void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data);
> +void cqm_assoc_rmid(unsigned int rmid);
>
>  #endif
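The read-modify-write in cqm_assoc_rmid() and the rmid_mask computation can be checked in isolation with ordinary integers. The sketch below does that under stated assumptions: count_order() is a hypothetical stand-in that rounds up to the next power-of-two exponent (which is how get_count_order() is assumed to behave here), and a plain uint64_t value stands in for the contents of the PQR_ASSOC register; no MSR is touched.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest order with (1 << order) >= count; stand-in for get_count_order(). */
static unsigned int count_order(uint32_t count)
{
    unsigned int order = 0;

    assert(count > 0);
    while ((UINT64_C(1) << order) < count)
        order++;
    return order;
}

/* Same expression as the patch: low count_order(max_rmid) bits set. */
static uint64_t make_rmid_mask(uint32_t max_rmid)
{
    return ~(~UINT64_C(0) << count_order(max_rmid));
}

/* Replace only the RMID field of the old register value. */
static uint64_t pqr_assoc_update(uint64_t old, unsigned int rmid,
                                 uint64_t rmid_mask)
{
    return (old & ~rmid_mask) | (rmid & rmid_mask);
}
```

For example, a reported maximum RMID of 255 yields a mask of 0xFF, so switching to RMID 0 on schedule-out clears the low byte and leaves the upper bits of the register value unchanged.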
Andrew Cooper
2013-Nov-29 15:01 UTC
Re: [PATCH v3 2/7] x86: handle CQM resource when creating/destroying guests
On 29/11/13 14:18, Andrew Cooper wrote:
> On 29/11/13 05:48, dongxiao.xu@intel.com wrote:
>> From: Dongxiao Xu <dongxiao.xu@intel.com>
>>
>> Allocate an RMID for a guest when it is created. This per-guest
>> RMID will be used to monitor Cache QoS related data. The RMID will
>> be relinquished when guest is destroyed.
>>
>> Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
>> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
>> ---
>>  xen/arch/x86/domain.c        |    8 ++++++
>>  xen/arch/x86/pqos.c          |   55 ++++++++++++++++++++++++++++++++++++++++++
>>  xen/common/domctl.c          |    5 +++-
>>  xen/include/asm-x86/domain.h |    2 ++
>>  xen/include/asm-x86/pqos.h   |    5 ++++
>>  xen/include/public/domctl.h  |    3 +++
>>  xen/include/xen/sched.h      |    3 +++
>>  7 files changed, 80 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>> index a3868f9..41e1fc6 100644
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -60,6 +60,7 @@
>>  #include <xen/numa.h>
>>  #include <xen/iommu.h>
>>  #include <compat/vcpu.h>
>> +#include <asm/pqos.h>
>>
>>  DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>>  DEFINE_PER_CPU(unsigned long, cr4);
>> @@ -579,6 +580,11 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
>>      tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0);
>>      spin_lock_init(&d->arch.vtsc_lock);
>>
>> +    /* Allocate CQM RMID for guest */
>> +    d->arch.pqos_cqm_rmid = 0;
>> +    if ( system_supports_cqm() && (domcr_flags & DOMCRF_pqos_cqm) )
>> +        alloc_cqm_rmid(d);
>> +
> Should we fail domain creation if an rmid cannot be allocated? I really
> cant decide which would be better.
>
> In other words, if the toolstack issues a build with DOMCRF_pqos_cqm,
> and the build hypercall returns success, should the toolstack reasonably
> expect everything to be set up? I would think so.
>
> If others agree wrt this expectation, then domain creation needs to fail
> also if "(domcr_flags & DOMCRF_pqos_cqm) && !system_supports_cqm()"

On further consideration, I am not so sure that DOMCRF_pqos_cqm should
exist.

If a toolstack wants to make a domain with pqos from the start, it can
create the domain paused and issue the domctl from the next patch.

This means that there is only one canonical way to turn on pqos, which
does give substantially more useful error information in failure cases.

~Andrew

>
>>      return 0;
>>
>>   fail:
>> @@ -612,6 +618,8 @@ void arch_domain_destroy(struct domain *d)
>>
>>      free_xenheap_page(d->shared_info);
>>      cleanup_domain_irq_mapping(d);
>> +
>> +    free_cqm_rmid(d);
>>  }
>>
>>  unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long guest_cr4)
>> diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
>> index e2172c4..1148f3b 100644
>> --- a/xen/arch/x86/pqos.c
>> +++ b/xen/arch/x86/pqos.c
>> @@ -20,6 +20,7 @@
>>   */
>>  #include <asm/processor.h>
>>  #include <xen/init.h>
>> +#include <xen/spinlock.h>
>>  #include <asm/pqos.h>
>>
>>  static bool_t __initdata pqos_enabled = 1;
>> @@ -31,6 +32,7 @@ integer_param("cqm_rmid_count", cqm_rmid_count);
>>  unsigned int cqm_upscaling_factor = 0;
>>  bool_t cqm_enabled = 0;
>>  domid_t *cqm_rmid_array = NULL;
>> +static DEFINE_SPINLOCK(cqm_lock);
>>
>>  static void __init init_cqm(void)
>>  {
>> @@ -84,6 +86,59 @@ void __init init_platform_qos(void)
>>      init_qos_monitor();
>>  }
>>
>> +bool_t system_supports_cqm(void)
>> +{
>> +    return cqm_enabled;
>> +}
>> +
>> +int alloc_cqm_rmid(struct domain *d)
>> +{
>> +    int rc = 0;
>> +    unsigned int rmid;
>> +    unsigned long flags;
>> +
>> +    ASSERT(system_supports_cqm());
>> +
>> +    spin_lock_irqsave(&cqm_lock, flags);
>> +    /* RMID=0 is reserved, enumerate from 1 */
>> +    for ( rmid = 1; rmid < cqm_rmid_count; rmid++ )
>> +    {
>> +        if ( cqm_rmid_array[rmid] != DOMID_INVALID)
>> +            continue;
>> +
>> +        cqm_rmid_array[rmid] = d->domain_id;
>> +        break;
>> +    }
>> +    spin_unlock_irqrestore(&cqm_lock, flags);
>> +
>> +    /* No CQM RMID available, assign RMID=0 by default */
>> +    if ( rmid == cqm_rmid_count )
>> +    {
>> +        rmid = 0;
>> +        rc = -1;
>> +    }
>> +
>> +    d->arch.pqos_cqm_rmid = rmid;
>> +
>> +    return rc;
>> +}
>> +
>> +void free_cqm_rmid(struct domain *d)
>> +{
>> +    unsigned int rmid = d->arch.pqos_cqm_rmid;
>> +    unsigned long flags;
>> +
>> +    /* We do not free system reserved "RMID=0" */
>> +    if ( rmid == 0 )
>> +        return;
>> +
>> +    spin_lock_irqsave(&cqm_lock, flags);
>> +    cqm_rmid_array[rmid] = DOMID_INVALID;
>> +    spin_unlock_irqrestore(&cqm_lock, flags);
> Does this need the spinlock? given rmid in the range 1 to max_rmids,
> there can't be any competition over who owns the entry, and setting it
> back to DOMID_INVALID wont race with the allocation loop.
>
>> +
>> +    d->arch.pqos_cqm_rmid = 0;
>> +}
>> +
>>  /*
>>   * Local variables:
>>   * mode: C
>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
>> index 904d27b..1c2e320 100644
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -425,7 +425,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>                 | XEN_DOMCTL_CDF_pvh_guest
>>                 | XEN_DOMCTL_CDF_hap
>>                 | XEN_DOMCTL_CDF_s3_integrity
>> -               | XEN_DOMCTL_CDF_oos_off)) )
>> +               | XEN_DOMCTL_CDF_oos_off
>> +               | XEN_DOMCTL_CDF_pqos_cqm)) )
> If you move the final bracket onto a new line now, future additions will
> be a single line addition rather than one removal and two additions.
>
> I am not sure what the prerogative is with that in terms of the Xen
> coding style.
>
> ~Andrew
>
>>              break;
>>
>>          dom = op->domain;
>> @@ -467,6 +468,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>              domcr_flags |= DOMCRF_s3_integrity;
>>          if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_oos_off )
>>              domcr_flags |= DOMCRF_oos_off;
>> +        if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_pqos_cqm )
>> +            domcr_flags |= DOMCRF_pqos_cqm;
>>
>>          d = domain_create(dom, domcr_flags, op->u.createdomain.ssidref);
>>          if ( IS_ERR(d) )
>> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
>> index 9d39061..9487251 100644
>> --- a/xen/include/asm-x86/domain.h
>> +++ b/xen/include/asm-x86/domain.h
>> @@ -313,6 +313,8 @@ struct arch_domain
>>      spinlock_t e820_lock;
>>      struct e820entry *e820;
>>      unsigned int nr_e820;
>> +
>> +    unsigned int pqos_cqm_rmid; /* CQM RMID assigned to the domain */
>>  } __cacheline_aligned;
>>
>>  #define has_arch_pdevs(d) (!list_empty(&(d)->arch.pdev_list))
>> diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
>> index e8caca2..c54905b 100644
>> --- a/xen/include/asm-x86/pqos.h
>> +++ b/xen/include/asm-x86/pqos.h
>> @@ -20,6 +20,7 @@
>>   */
>>  #ifndef ASM_PQOS_H
>>  #define ASM_PQOS_H
>> +#include <xen/sched.h>
>>
>>  /* QoS Resource Type Enumeration */
>>  #define QOS_MONITOR_TYPE_L3 0x2
>> @@ -29,4 +30,8 @@
>>
>>  void init_platform_qos(void);
>>
>> +bool_t system_supports_cqm(void);
>> +int alloc_cqm_rmid(struct domain *d);
>> +void free_cqm_rmid(struct domain *d);
>> +
>>  #endif
>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>> index 01a3652..47a850a 100644
>> --- a/xen/include/public/domctl.h
>> +++ b/xen/include/public/domctl.h
>> @@ -62,6 +62,9 @@ struct xen_domctl_createdomain {
>>   /* Is this a PVH guest (as opposed to an HVM or PV guest)? */
>>  #define _XEN_DOMCTL_CDF_pvh_guest 4
>>  #define XEN_DOMCTL_CDF_pvh_guest (1U<<_XEN_DOMCTL_CDF_pvh_guest)
>> + /* Enable pqos-cqm? */
>> +#define _XEN_DOMCTL_CDF_pqos_cqm 5
>> +#define XEN_DOMCTL_CDF_pqos_cqm (1U<<_XEN_DOMCTL_CDF_pqos_cqm)
>>      uint32_t flags;
>>  };
>>  typedef struct xen_domctl_createdomain xen_domctl_createdomain_t;
>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>> index cbdf377..3a42656 100644
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -507,6 +507,9 @@ struct domain *domain_create(
>>   /* DOMCRF_pvh: Create PV domain in HVM container. */
>>  #define _DOMCRF_pvh 5
>>  #define DOMCRF_pvh (1U<<_DOMCRF_pvh)
>> + /* DOMCRF_pqos_cqm: Create a domain with CQM support */
>> +#define _DOMCRF_pqos_cqm 6
>> +#define DOMCRF_pqos_cqm (1U<<_DOMCRF_pqos_cqm)
>>
>>  /*
>>   * rcu_lock_domain_by_id() is more efficient than get_domain_by_id().
Jan Beulich
2013-Nov-29 15:05 UTC
Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring feature
>>> On 29.11.13 at 14:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> However, I feel that all options relating to platform QoS should be
> available under a "pqos" custom_param (similar to iommu=), so we don't
> end up with loads of new command line options with common prefixes as
> new features are added.
>
> For this, I would suggest semantics like:
>
> pqos=[<boolean>],[max_rmids=<number>]
>
> Particularly, when a second pqos option is introduced, it might be
> sensible to have individual booleans for each option, as well as a
> global enable/disable.
>
> I would be interested in general views from others as far as this is
> concerned.

+1

Jan
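To make the proposed "pqos=[<boolean>],[max_rmids=<number>]" syntax concrete, here is a minimal stand-alone parser sketch in plain C. It does not use Xen's custom_param machinery or its boolean parsing helpers; struct pqos_opts, parse_pqos(), token_is() and the set of accepted boolean spellings are assumptions made for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct pqos_opts {
    bool enabled;            /* global enable/disable */
    unsigned int max_rmids;  /* upper bound on RMIDs to use */
};

/* True if the len-byte token at s equals the literal lit. */
static bool token_is(const char *s, size_t len, const char *lit)
{
    return strlen(lit) == len && strncmp(s, lit, len) == 0;
}

/*
 * Parse the value part of "pqos=...", a comma-separated list of tokens.
 * Returns 0 on success, -1 on the first token that is not understood.
 */
static int parse_pqos(const char *s, struct pqos_opts *o)
{
    assert(s != NULL && o != NULL);

    while (*s) {
        const char *comma = strchr(s, ',');
        size_t len = comma ? (size_t)(comma - s) : strlen(s);

        if (len == 0) {
            /* Empty token (e.g. ",max_rmids=8"): nothing to do. */
        } else if (len > 10 && strncmp(s, "max_rmids=", 10) == 0) {
            char *stop;
            unsigned long v;

            if (!isdigit((unsigned char)s[10]))
                return -1;
            v = strtoul(s + 10, &stop, 0);
            if (stop != s + len || v == 0 || v > UINT_MAX)
                return -1;
            o->max_rmids = (unsigned int)v;
        } else if (token_is(s, len, "1") || token_is(s, len, "on") ||
                   token_is(s, len, "yes") || token_is(s, len, "true")) {
            o->enabled = true;
        } else if (token_is(s, len, "0") || token_is(s, len, "off") ||
                   token_is(s, len, "no") || token_is(s, len, "false")) {
            o->enabled = false;
        } else {
            return -1;
        }

        s += len;
        if (*s == ',')
            s++;
    }
    return 0;
}
```

Per-feature booleans, as suggested for when a second pqos option appears, would be further branches in the same token loop.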
Jan Beulich
2013-Nov-29 15:07 UTC
Re: [PATCH v3 2/7] x86: handle CQM resource when creating/destroying guests
>>> On 29.11.13 at 15:18, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 29/11/13 05:48, dongxiao.xu@intel.com wrote:
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -425,7 +425,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>                 | XEN_DOMCTL_CDF_pvh_guest
>>                 | XEN_DOMCTL_CDF_hap
>>                 | XEN_DOMCTL_CDF_s3_integrity
>> -               | XEN_DOMCTL_CDF_oos_off)) )
>> +               | XEN_DOMCTL_CDF_oos_off
>> +               | XEN_DOMCTL_CDF_pqos_cqm)) )
>
> If you move the final bracket onto a new line now, future additions will
> be a single line addition rather than one removal and two additions.

I'd rather not - a lone closing parenthesis on a line looks pretty odd,
sort of calling for being cleaned up.

Jan
Ian Campbell
2013-Nov-29 15:29 UTC
Re: [PATCH v3 0/7] enable Cache QoS Monitoring (CQM) feature
On Fri, 2013-11-29 at 13:48 +0800, dongxiao.xu@intel.com wrote:
> From: Dongxiao Xu <dongxiao.xu@intel.com>
>
> Changes from v2:
>  - Address comments from Andrew Cooper, including:
>    * Merging tools stack changes into one patch.
>    * Reduce the IPI number to one per socket.
>    * Change structures for CQM data exchange between tools and Xen.
>    * Misc of format/variable/function name changes.
>  - Address comments from Konrad Rzeszutek Wilk, including:
>    * Simplify the error printing logic.
>    * Add xsm check for the new added hypercalls.
>
> Changes from v1:
>  - Address comments from Andrew Cooper, including:
>    * Change function names, e.g., alloc_cqm_rmid(), system_supports_cqm(), etc.
>    * Change some structure element order to save packing cost.
>    * Correct some function's return value.
>    * Some programming styles change.
>    * ...
>
> Future generations of Intel Xeon processor may offer monitoring capability in
> each logical processor to measure specific quality-of-service metric,
> for example, the Cache QoS Monitoring to get L3 cache occupancy.
> Detailed information please refer to Intel SDM chapter 17.14.

Is this being proposed for 4.4? I think it is rather late to be adding
such stuff.

>  tools/flask/policy/policy/modules/xen/xen.if |    2 +-
>  tools/flask/policy/policy/modules/xen/xen.te |    5 +-
>  tools/libxc/xc_domain.c                      |   48 ++++++
>  tools/libxc/xenctrl.h                        |   12 ++
>  tools/libxl/Makefile                         |    3 +-
>  tools/libxl/libxl.h                          |    5 +
>  tools/libxl/libxl_create.c                   |    3 +
>  tools/libxl/libxl_pqos.c                     |  108 +++++++++++++
>  tools/libxl/libxl_types.idl                  |    1 +
>  tools/libxl/xl.h                             |    3 +
>  tools/libxl/xl_cmdimpl.c                     |  138 ++++++++++++++++
>  tools/libxl/xl_cmdtable.c                    |   15 ++
>  xen/arch/x86/Makefile                        |    1 +
>  xen/arch/x86/cpu/intel.c                     |    6 +
>  xen/arch/x86/domain.c                        |   13 ++
>  xen/arch/x86/domctl.c                        |   40 +++++
>  xen/arch/x86/pqos.c                          |  219 ++++++++++++++++++++++++++
>  xen/arch/x86/setup.c                         |    3 +
>  xen/arch/x86/sysctl.c                        |   89 +++++++++++
>  xen/common/domctl.c                          |    5 +-
>  xen/include/asm-x86/cpufeature.h             |    1 +
>  xen/include/asm-x86/domain.h                 |    2 +
>  xen/include/asm-x86/msr-index.h              |    5 +
>  xen/include/asm-x86/pqos.h                   |   46 ++++++
>  xen/include/public/domctl.h                  |   26 +++
>  xen/include/public/sysctl.h                  |   11 ++
>  xen/include/xen/sched.h                      |    3 +
>  xen/xsm/flask/hooks.c                        |    7 +
>  xen/xsm/flask/policy/access_vectors          |   17 +-
>  29 files changed, 830 insertions(+), 7 deletions(-)
>  create mode 100644 tools/libxl/libxl_pqos.c
>  create mode 100644 xen/arch/x86/pqos.c
>  create mode 100644 xen/include/asm-x86/pqos.h
Jan Beulich
2013-Nov-29 15:36 UTC
Re: [PATCH v3 0/7] enable Cache QoS Monitoring (CQM) feature
>>> On 29.11.13 at 16:29, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Fri, 2013-11-29 at 13:48 +0800, dongxiao.xu@intel.com wrote:
>> From: Dongxiao Xu <dongxiao.xu@intel.com>
>>
>> Changes from v2:
>>  - Address comments from Andrew Cooper, including:
>>    * Merging tools stack changes into one patch.
>>    * Reduce the IPI number to one per socket.
>>    * Change structures for CQM data exchange between tools and Xen.
>>    * Misc of format/variable/function name changes.
>>  - Address comments from Konrad Rzeszutek Wilk, including:
>>    * Simplify the error printing logic.
>>    * Add xsm check for the new added hypercalls.
>>
>> Changes from v1:
>>  - Address comments from Andrew Cooper, including:
>>    * Change function names, e.g., alloc_cqm_rmid(), system_supports_cqm(), etc.
>>    * Change some structure element order to save packing cost.
>>    * Correct some function's return value.
>>    * Some programming styles change.
>>    * ...
>>
>> Future generations of Intel Xeon processor may offer monitoring capability in
>> each logical processor to measure specific quality-of-service metric,
>> for example, the Cache QoS Monitoring to get L3 cache occupancy.
>> Detailed information please refer to Intel SDM chapter 17.14.
>
> Is this being proposed for 4.4? I think it is rather late to be adding
> such stuff.

I think we already settled on not taking it, and Intel indicated
that they're fine with that decision.

Jan
Ian Campbell
2013-Nov-29 15:41 UTC
Re: [PATCH v3 0/7] enable Cache QoS Monitoring (CQM) feature
On Fri, 2013-11-29 at 15:36 +0000, Jan Beulich wrote:
> >>> On 29.11.13 at 16:29, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Fri, 2013-11-29 at 13:48 +0800, dongxiao.xu@intel.com wrote:
> >> From: Dongxiao Xu <dongxiao.xu@intel.com>
> >>
> >> Changes from v2:
> >>  - Address comments from Andrew Cooper, including:
> >>    * Merging tools stack changes into one patch.
> >>    * Reduce the IPI number to one per socket.
> >>    * Change structures for CQM data exchange between tools and Xen.
> >>    * Misc of format/variable/function name changes.
> >>  - Address comments from Konrad Rzeszutek Wilk, including:
> >>    * Simplify the error printing logic.
> >>    * Add xsm check for the new added hypercalls.
> >>
> >> Changes from v1:
> >>  - Address comments from Andrew Cooper, including:
> >>    * Change function names, e.g., alloc_cqm_rmid(), system_supports_cqm(), etc.
> >>    * Change some structure element order to save packing cost.
> >>    * Correct some function's return value.
> >>    * Some programming styles change.
> >>    * ...
> >>
> >> Future generations of Intel Xeon processor may offer monitoring capability in
> >> each logical processor to measure specific quality-of-service metric,
> >> for example, the Cache QoS Monitoring to get L3 cache occupancy.
> >> Detailed information please refer to Intel SDM chapter 17.14.
> >
> > Is this being proposed for 4.4? I think it is rather late to be adding
> > such stuff.
>
> I think we already settled on not taking it, and Intel indicated
> that they're fine with that decision.

OK, in which case the tools side of this is unlikely to make it to the
top of my todo list until we are closer to the 4.5 development phase
opening up.

Ian.
Daniel De Graaf
2013-Nov-29 15:50 UTC
Re: [PATCH v3 6/7] xsm: add platform QoS related xsm policies
On 11/29/2013 12:48 AM, dongxiao.xu@intel.com wrote:
> From: Dongxiao Xu <dongxiao.xu@intel.com>
>
> Add xsm policies for attach/detach pqos services and get CQM info
> hypercalls.
>
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> ---
>  tools/flask/policy/policy/modules/xen/xen.if |    2 +-
>  tools/flask/policy/policy/modules/xen/xen.te |    5 ++++-
>  xen/xsm/flask/hooks.c                        |    7 +++++++
>  xen/xsm/flask/policy/access_vectors          |   17 ++++++++++++++---
>  4 files changed, 26 insertions(+), 5 deletions(-)
>
[...]
> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index b1e2593..884922b 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -730,6 +730,10 @@ static int flask_domctl(struct domain *d, int cmd)
>      case XEN_DOMCTL_set_max_evtchn:
>          return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__SET_MAX_EVTCHN);
>
> +    case XEN_DOMCTL_attach_pqos:
> +    case XEN_DOMCTL_detach_pqos:
> +        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PQOS_OP);
> +
>      default:
>          printk("flask_domctl: Unknown op %d\n", cmd);
>          return -EPERM;
> @@ -785,6 +789,9 @@ static int flask_sysctl(int cmd)
>      case XEN_SYSCTL_numainfo:
>          return domain_has_xen(current->domain, XEN__PHYSINFO);
>
> +    case XEN_SYSCTL_getcqminfo:
> +        return domain_has_xen(current->domain, XEN2__PQOS_OP);

The domain_has_xen helper function assumes SECCLASS_XEN, but this call
needs to pass SECCLASS_XEN2. The easy fix is to change this call to

    avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2, XEN2__PQOS_OP, NULL)

Otherwise, a class parameter would need to be added to domain_has_xen.

With this changed,
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>

--
Daniel De Graaf
National Security Agency
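
[Editorial sketch] The point about the security class can be illustrated with a toy model. Nothing below is the real Flask/AVC code: the enum, the bit values, struct policy, has_perm() and has_xen() are hypothetical stand-ins, and the only assumption taken from the review is that permission bits are defined per class, so the same bit value can mean different things in SECCLASS_XEN and SECCLASS_XEN2.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the Flask identifiers; values are illustrative. */
enum sec_class { CLASS_XEN = 1, CLASS_XEN2 = 2 };

#define XEN__PHYSINFO  (1u << 3)   /* a permission bit defined in CLASS_XEN  */
#define XEN2__PQOS_OP  (1u << 3)   /* a permission bit defined in CLASS_XEN2 */

/* Toy policy: the permission bits the caller holds, indexed by class. */
struct policy {
    uint32_t allowed[3];
};

/* Class-aware check: returns 0 if every requested bit is granted, else -1. */
static int has_perm(const struct policy *p, enum sec_class cls, uint32_t perm)
{
    return ((p->allowed[cls] & perm) == perm) ? 0 : -1;
}

/* Helper hard-wired to CLASS_XEN, in the spirit of the helper discussed above. */
static int has_xen(const struct policy *p, uint32_t perm)
{
    return has_perm(p, CLASS_XEN, perm);
}
```

In this model, a caller that holds only XEN__PHYSINFO passes has_xen(p, XEN2__PQOS_OP), because the bit is looked up in the wrong class, while has_perm(p, CLASS_XEN2, XEN2__PQOS_OP) correctly refuses. Either naming the class explicitly at the call site or adding a class parameter to the helper avoids the mix-up.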
Xu, Dongxiao
2013-Nov-30 00:41 UTC
Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring feature
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, November 29, 2013 11:05 PM
> To: Andrew Cooper; Xu, Dongxiao
> Cc: Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> Subject: Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring
> feature
>
> >>> On 29.11.13 at 14:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > However, I feel that all options relating to platform QoS should be
> > available under a "pqos" custom_param, (similar to iommu=), so we don't
> > end up with loads of new command line options with common prefixes as
> > new features are added.
> >
> > For this, I would suggest semantic like:
> >
> > pqos=[<boolean>],[max_rmids=<number>]
> >
> > Particularly, when a second pqos option is introduced, it might be
> > sensible to have individual booleans for each option, as well as a
> > global enable/disable.
> >
> >
> > I would be interested in general views from others as far as this is
> > concerned.
>
> +1

Okay, will modify it.

Thanks,
Dongxiao

>
> Jan
Xu, Dongxiao
2013-Nov-30 00:42 UTC
Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring feature
> -----Original Message----- > From: Andrew Cooper [mailto:andrew.cooper3@citrix.com] > Sent: Friday, November 29, 2013 9:59 PM > To: Xu, Dongxiao > Cc: xen-devel@lists.xen.org; keir@xen.org; JBeulich@suse.com; > Ian.Jackson@eu.citrix.com; Ian.Campbell@citrix.com; > stefano.stabellini@eu.citrix.com; konrad.wilk@oracle.com; > dgdegra@tycho.nsa.gov > Subject: Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring > feature > > On 29/11/13 05:48, dongxiao.xu@intel.com wrote: > > From: Dongxiao Xu <dongxiao.xu@intel.com> > > > > Detect platform QoS feature status and enumerate the resource types, > > one of which is to monitor the L3 cache occupancy. > > > > Also introduce a Xen grub command line parameter to control the > > QoS feature status globally. > > > > Signed-off-by: Jiongxi Li <jiongxi.li@intel.com> > > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > > --- > > xen/arch/x86/Makefile | 1 + > > xen/arch/x86/cpu/intel.c | 6 +++ > > xen/arch/x86/pqos.c | 95 > ++++++++++++++++++++++++++++++++++++++ > > xen/arch/x86/setup.c | 3 ++ > > xen/include/asm-x86/cpufeature.h | 1 + > > xen/include/asm-x86/pqos.h | 32 +++++++++++++ > > 6 files changed, 138 insertions(+) > > create mode 100644 xen/arch/x86/pqos.c > > create mode 100644 xen/include/asm-x86/pqos.h > > > > diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile > > index d502bdf..54962e0 100644 > > --- a/xen/arch/x86/Makefile > > +++ b/xen/arch/x86/Makefile > > @@ -58,6 +58,7 @@ obj-y += crash.o > > obj-y += tboot.o > > obj-y += hpet.o > > obj-y += xstate.o > > +obj-y += pqos.o > > > > obj-$(crash_debug) += gdbstub.o > > > > diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c > > index 27fe762..f0d83ea 100644 > > --- a/xen/arch/x86/cpu/intel.c > > +++ b/xen/arch/x86/cpu/intel.c > > @@ -230,6 +230,12 @@ static void __devinit init_intel(struct cpuinfo_x86 *c) > > ( c->cpuid_level >= 0x00000006 ) && > > ( cpuid_eax(0x00000006) & (1u<<2) ) ) > > set_bit(X86_FEATURE_ARAT, 
c->x86_capability); > > + > > + /* Check platform QoS monitoring capability */ > > + if ((c->cpuid_level >= 0x00000007) && > > + (cpuid_ebx(0x00000007) & (1u<<12))) > > + set_bit(X86_FEATURE_QOSM, c->x86_capability); > > + > > } > > > > static struct cpu_dev intel_cpu_dev __cpuinitdata = { > > diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c > > new file mode 100644 > > index 0000000..e2172c4 > > --- /dev/null > > +++ b/xen/arch/x86/pqos.c > > @@ -0,0 +1,95 @@ > > +/* > > + * pqos.c: Platform QoS related service for guest. > > + * > > + * Copyright (c) 2013, Intel Corporation > > + * Author: Jiongxi Li <jiongxi.li@intel.com> > > + * Author: Dongxiao Xu <dongxiao.xu@intel.com> > > + * > > + * This program is free software; you can redistribute it and/or modify it > > + * under the terms and conditions of the GNU General Public License, > > + * version 2, as published by the Free Software Foundation. > > + * > > + * This program is distributed in the hope it will be useful, but WITHOUT > > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY > or > > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public > License for > > + * more details. > > + * > > + * You should have received a copy of the GNU General Public License along > with > > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple > > + * Place - Suite 330, Boston, MA 02111-1307 USA. > > + */ > > +#include <asm/processor.h> > > +#include <xen/init.h> > > +#include <asm/pqos.h> > > + > > +static bool_t __initdata pqos_enabled = 1; > > +boolean_param("pqos", pqos_enabled); > > + > > +static unsigned int cqm_rmid_count = 256; > > +integer_param("cqm_rmid_count", cqm_rmid_count); > > + > > +unsigned int cqm_upscaling_factor = 0; > > +bool_t cqm_enabled = 0; > > +domid_t *cqm_rmid_array = NULL; > > I feel this would be more meaningful being called "rmid_to_dom" or > something similar. 
(but do admit that this is entirely personal taste).Yes, that''ll be better, thanks! Dongxiao> > ~Andrew > > > + > > +static void __init init_cqm(void) > > +{ > > + unsigned int rmid; > > + unsigned int eax, edx; > > + unsigned int max_cqm_rmid; > > + > > + cpuid_count(0xf, 1, &eax, &cqm_upscaling_factor, &max_cqm_rmid, > &edx); > > + if ( !(edx & QOS_MONITOR_EVTID_L3) ) > > + return; > > + > > + cqm_rmid_count = min(cqm_rmid_count, max_cqm_rmid + 1); > > + > > + cqm_rmid_array = xzalloc_array(domid_t, cqm_rmid_count); > > + if ( !cqm_rmid_array ) > > + { > > + cqm_rmid_count = 0; > > + return; > > + } > > + > > + /* Reserve RMID 0 for all domains not being monitored */ > > + cqm_rmid_array[0] = DOMID_XEN; > > + > > + for ( rmid = 1; rmid < cqm_rmid_count; rmid++ ) > > + cqm_rmid_array[rmid] = DOMID_INVALID; > > + > > + cqm_enabled = 1; > > + > > + printk(XENLOG_INFO "Cache QoS Monitoring Enabled.\n"); > > +} > > + > > +static void __init init_qos_monitor(void) > > +{ > > + unsigned int qm_features; > > + unsigned int eax, ebx, ecx; > > + > > + if ( !(boot_cpu_has(X86_FEATURE_QOSM)) ) > > + return; > > + > > + cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features); > > + > > + if ( qm_features & QOS_MONITOR_TYPE_L3 ) > > + init_cqm(); > > +} > > + > > +void __init init_platform_qos(void) > > +{ > > + if ( !pqos_enabled ) > > + return; > > + > > + init_qos_monitor(); > > +} > > + > > +/* > > + * Local variables: > > + * mode: C > > + * c-file-style: "BSD" > > + * c-basic-offset: 4 > > + * tab-width: 4 > > + * indent-tabs-mode: nil > > + * End: > > + */ > > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c > > index 5bf4ee0..95418e4 100644 > > --- a/xen/arch/x86/setup.c > > +++ b/xen/arch/x86/setup.c > > @@ -48,6 +48,7 @@ > > #include <asm/setup.h> > > #include <xen/cpu.h> > > #include <asm/nmi.h> > > +#include <asm/pqos.h> > > > > /* opt_nosmp: If true, secondary processors are ignored. 
*/ > > static bool_t __initdata opt_nosmp; > > @@ -1402,6 +1403,8 @@ void __init __start_xen(unsigned long mbi_p) > > > > domain_unpause_by_systemcontroller(dom0); > > > > + init_platform_qos(); > > + > > reset_stack_and_jump(init_done); > > } > > > > diff --git a/xen/include/asm-x86/cpufeature.h > b/xen/include/asm-x86/cpufeature.h > > index 1cfaf94..ca59668 100644 > > --- a/xen/include/asm-x86/cpufeature.h > > +++ b/xen/include/asm-x86/cpufeature.h > > @@ -147,6 +147,7 @@ > > #define X86_FEATURE_ERMS (7*32+ 9) /* Enhanced REP MOVSB/STOSB */ > > #define X86_FEATURE_INVPCID (7*32+10) /* Invalidate Process Context > ID */ > > #define X86_FEATURE_RTM (7*32+11) /* Restricted Transactional > Memory */ > > +#define X86_FEATURE_QOSM (7*32+12) /* Platform QoS monitoring > capability */ > > #define X86_FEATURE_NO_FPU_SEL (7*32+13) /* FPU CS/DS stored as > zero */ > > #define X86_FEATURE_SMAP (7*32+20) /* Supervisor Mode Access > Prevention */ > > > > diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h > > new file mode 100644 > > index 0000000..e8caca2 > > --- /dev/null > > +++ b/xen/include/asm-x86/pqos.h > > @@ -0,0 +1,32 @@ > > +/* > > + * pqos.h: Platform QoS related service for guest. > > + * > > + * Copyright (c) 2013, Intel Corporation > > + * Author: Jiongxi Li <jiongxi.li@intel.com> > > + * Author: Dongxiao Xu <dongxiao.xu@intel.com> > > + * > > + * This program is free software; you can redistribute it and/or modify it > > + * under the terms and conditions of the GNU General Public License, > > + * version 2, as published by the Free Software Foundation. > > + * > > + * This program is distributed in the hope it will be useful, but WITHOUT > > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY > or > > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public > License for > > + * more details. 
> > + * > > + * You should have received a copy of the GNU General Public License along > with > > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple > > + * Place - Suite 330, Boston, MA 02111-1307 USA. > > + */ > > +#ifndef ASM_PQOS_H > > +#define ASM_PQOS_H > > + > > +/* QoS Resource Type Enumeration */ > > +#define QOS_MONITOR_TYPE_L3 0x2 > > + > > +/* QoS Monitoring Event ID */ > > +#define QOS_MONITOR_EVTID_L3 0x1 > > + > > +void init_platform_qos(void); > > + > > +#endif
Xu, Dongxiao
2013-Nov-30 01:27 UTC
Re: [PATCH v3 4/7] x86: collect CQM information from all sockets
> -----Original Message----- > From: Andrew Cooper [mailto:andrew.cooper3@citrix.com] > Sent: Friday, November 29, 2013 10:54 PM > To: Xu, Dongxiao > Cc: xen-devel@lists.xen.org; keir@xen.org; JBeulich@suse.com; > Ian.Jackson@eu.citrix.com; Ian.Campbell@citrix.com; > stefano.stabellini@eu.citrix.com; konrad.wilk@oracle.com; > dgdegra@tycho.nsa.gov > Subject: Re: [PATCH v3 4/7] x86: collect CQM information from all sockets > > On 29/11/13 05:48, dongxiao.xu@intel.com wrote: > > From: Dongxiao Xu <dongxiao.xu@intel.com> > > > > Collect CQM information (L3 cache occupancy) from all sockets. > > Upper layer application can parse the data structure to get the > > information of guest''s L3 cache occupancy on certain sockets. > > > > Signed-off-by: Jiongxi Li <jiongxi.li@intel.com> > > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > > --- > > xen/arch/x86/pqos.c | 59 ++++++++++++++++++++++++++ > > xen/arch/x86/sysctl.c | 89 > +++++++++++++++++++++++++++++++++++++++ > > xen/include/asm-x86/msr-index.h | 4 ++ > > xen/include/asm-x86/pqos.h | 8 ++++ > > xen/include/public/domctl.h | 9 ++++ > > xen/include/public/sysctl.h | 11 +++++ > > 6 files changed, 180 insertions(+) > > > > diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c > > index 1148f3b..615c5ea 100644 > > --- a/xen/arch/x86/pqos.c > > +++ b/xen/arch/x86/pqos.c > > @@ -19,6 +19,7 @@ > > * Place - Suite 330, Boston, MA 02111-1307 USA. 
> > */ > > #include <asm/processor.h> > > +#include <asm/msr.h> > > #include <xen/init.h> > > #include <xen/spinlock.h> > > #include <asm/pqos.h> > > @@ -91,6 +92,26 @@ bool_t system_supports_cqm(void) > > return cqm_enabled; > > } > > > > +unsigned int get_cqm_count(void) > > +{ > > + return cqm_rmid_count; > > +} > > + > > +unsigned int get_cqm_avail(void) > > +{ > > + unsigned int rmid, cqm_avail = 0; > > + unsigned long flags; > > + > > + spin_lock_irqsave(&cqm_lock, flags); > > + /* RMID=0 is reserved, enumerate from 1 */ > > + for ( rmid = 1; rmid < cqm_rmid_count; rmid++ ) > > + if ( cqm_rmid_array[rmid] == DOMID_INVALID ) > > + cqm_avail++; > > + spin_unlock_irqrestore(&cqm_lock, flags); > > + > > + return cqm_avail; > > This cqm_avail is stale as soon as you release the lock.Okay, will remove this get_cqm_avail() function and get the CQM resource availability information by enumerating "sysctl_cqminfo_t *info" in tool stack side.> > > +} > > + > > int alloc_cqm_rmid(struct domain *d) > > { > > int rc = 0; > > @@ -139,6 +160,44 @@ void free_cqm_rmid(struct domain *d) > > d->arch.pqos_cqm_rmid = 0; > > } > > > > +static void read_cqm_data(void *arg) > > +{ > > + uint64_t cqm_data; > > + unsigned int rmid; > > + int socket = cpu_to_socket(smp_processor_id()); > > + struct xen_socket_cqmdata *data = arg; > > + unsigned long flags, i; > > + > > + if ( socket < 0 ) > > + return; > > + > > + spin_lock_irqsave(&cqm_lock, flags); > > + /* RMID=0 is reserved, enumerate from 1 */ > > + for ( rmid = 1; rmid < cqm_rmid_count; rmid++ ) > > + { > > + if ( cqm_rmid_array[rmid] == DOMID_INVALID ) > > + continue; > > + > > + wrmsr(MSR_IA32_QOSEVTSEL, QOS_MONITOR_EVTID_L3, rmid); > > + rdmsrl(MSR_IA32_QMC, cqm_data); > > + > > + i = socket * cqm_rmid_count + rmid; > > + data[i].valid = !(cqm_data & IA32_QM_CTR_ERROR_MASK); > > + if ( data[i].valid ) > > + { > > + data[i].l3c_occupancy = cqm_data * cqm_upscaling_factor; > > + data[i].socket = socket; > > + data[i].domid = 
cqm_rmid_array[rmid]; > > + } > > + } > > + spin_unlock_irqrestore(&cqm_lock, flags); > > +} > > + > > +void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct > xen_socket_cqmdata *data) > > +{ > > + on_selected_cpus(cpu_cqmdata_map, read_cqm_data, data, 1); > > +} > > + > > /* > > * Local variables: > > * mode: C > > diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c > > index 15d4b91..f916fe6 100644 > > --- a/xen/arch/x86/sysctl.c > > +++ b/xen/arch/x86/sysctl.c > > @@ -28,6 +28,7 @@ > > #include <xen/nodemask.h> > > #include <xen/cpu.h> > > #include <xsm/xsm.h> > > +#include <asm/pqos.h> > > > > #define get_xen_guest_handle(val, hnd) do { val = (hnd).p; } while (0) > > > > @@ -66,6 +67,47 @@ void arch_do_physinfo(xen_sysctl_physinfo_t *pi) > > pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm_directio; > > } > > > > +/* Select one random CPU for each socket */ > > I know this is not specifically a fault of your code, but these masks of > cpus on specific sockets is really information which should be set up on > boot and tweaked on cpu_up/down. > > It should certainly not be recalculated from scratch every time this > hypercall is made. (And that would prevent needing to make an > xalloc/xfree on the hypercall path)Okay, will move the detection of per socket cpu bitmap in system initialization code. 
Thanks, Dongxiao> > ~Andrew > > > +static void select_socket_cpu(cpumask_t *cpu_bitmap) > > +{ > > + int i; > > + unsigned int cpu; > > + cpumask_t *socket_cpuset; > > + int max_socket = 0; > > + unsigned int num_cpus = num_online_cpus(); > > + DECLARE_BITMAP(sockets, num_cpus); > > + > > + cpumask_clear(cpu_bitmap); > > + > > + for_each_online_cpu(cpu) > > + { > > + i = cpu_to_socket(cpu); > > + if ( i < 0 || test_and_set_bit(i, sockets) ) > > + continue; > > + max_socket = max(max_socket, i); > > + } > > + > > + socket_cpuset = xzalloc_array(cpumask_t, max_socket + 1); > > + if ( !socket_cpuset ) > > + return; > > + > > + for_each_online_cpu(cpu) > > + { > > + i = cpu_to_socket(cpu); > > + if ( i < 0 ) > > + continue; > > + cpumask_set_cpu(cpu, &socket_cpuset[i]); > > + } > > + > > + for ( i = 0; i <= max_socket; i++ ) > > + { > > + cpu = cpumask_any(&socket_cpuset[i]); > > + cpumask_set_cpu(cpu, cpu_bitmap); > > + } > > + > > + xfree(socket_cpuset); > > +} > > + > > long arch_do_sysctl( > > struct xen_sysctl *sysctl, XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) > u_sysctl) > > { > > @@ -101,6 +143,53 @@ long arch_do_sysctl( > > } > > break; > > > > + case XEN_SYSCTL_getcqminfo: > > + { > > + struct xen_socket_cqmdata *info; > > + uint32_t num_sockets; > > + uint32_t num_rmid; > > + cpumask_t cpu_cqmdata_map; > > + > > + if ( !system_supports_cqm() ) > > + { > > + ret = -ENODEV; > > + break; > > + } > > + > > + select_socket_cpu(&cpu_cqmdata_map); > > + > > + num_sockets = min((unsigned > int)cpumask_weight(&cpu_cqmdata_map), > > + sysctl->u.getcqminfo.num_sockets); > > + num_rmid = get_cqm_count(); > > + info = xzalloc_array(struct xen_socket_cqmdata, > > + num_rmid * num_sockets); > > + if ( !info ) > > + { > > + ret = -ENOMEM; > > + break; > > + } > > + > > + get_cqm_info(&cpu_cqmdata_map, info); > > + > > + if ( copy_to_guest_offset(sysctl->u.getcqminfo.buffer, > > + 0, info, num_rmid * num_sockets) ) > > + { > > + ret = -EFAULT; > > + xfree(info); > > + break; > > + 
} > > + > > + sysctl->u.getcqminfo.num_rmid = num_rmid; > > + sysctl->u.getcqminfo.num_rmid_avail = get_cqm_avail(); > > + sysctl->u.getcqminfo.num_sockets = num_sockets; > > + > > + if ( copy_to_guest(u_sysctl, sysctl, 1) ) > > + ret = -EFAULT; > > + > > + xfree(info); > > + } > > + break; > > + > > default: > > ret = -ENOSYS; > > break; > > diff --git a/xen/include/asm-x86/msr-index.h > b/xen/include/asm-x86/msr-index.h > > index e597a28..46ef165 100644 > > --- a/xen/include/asm-x86/msr-index.h > > +++ b/xen/include/asm-x86/msr-index.h > > @@ -488,4 +488,8 @@ > > /* Geode defined MSRs */ > > #define MSR_GEODE_BUSCONT_CONF0 0x00001900 > > > > +/* Platform QoS register */ > > +#define MSR_IA32_QOSEVTSEL 0x00000c8d > > +#define MSR_IA32_QMC 0x00000c8e > > + > > #endif /* __ASM_MSR_INDEX_H */ > > diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h > > index c54905b..2ab9277 100644 > > --- a/xen/include/asm-x86/pqos.h > > +++ b/xen/include/asm-x86/pqos.h > > @@ -21,6 +21,8 @@ > > #ifndef ASM_PQOS_H > > #define ASM_PQOS_H > > #include <xen/sched.h> > > +#include <xen/cpumask.h> > > +#include <public/domctl.h> > > > > /* QoS Resource Type Enumeration */ > > #define QOS_MONITOR_TYPE_L3 0x2 > > @@ -28,10 +30,16 @@ > > /* QoS Monitoring Event ID */ > > #define QOS_MONITOR_EVTID_L3 0x1 > > > > +/* IA32_QM_CTR */ > > +#define IA32_QM_CTR_ERROR_MASK (0x3ul << 62) > > + > > void init_platform_qos(void); > > > > bool_t system_supports_cqm(void); > > int alloc_cqm_rmid(struct domain *d); > > void free_cqm_rmid(struct domain *d); > > +unsigned int get_cqm_count(void); > > +unsigned int get_cqm_avail(void); > > +void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct > xen_socket_cqmdata *data); > > > > #endif > > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h > > index 800b2f4..53c740e 100644 > > --- a/xen/include/public/domctl.h > > +++ b/xen/include/public/domctl.h > > @@ -883,6 +883,15 @@ struct xen_domctl_qos_type { > > typedef struct 
xen_domctl_qos_type xen_domctl_qos_type_t; > > DEFINE_XEN_GUEST_HANDLE(xen_domctl_qos_type_t); > > > > +struct xen_socket_cqmdata { > > + uint64_t l3c_occupancy; > > + uint32_t socket; > > + domid_t domid; > > + uint8_t valid; > > +}; > > +typedef struct xen_socket_cqmdata xen_socket_cqmdata_t; > > +DEFINE_XEN_GUEST_HANDLE(xen_socket_cqmdata_t); > > + > > struct xen_domctl { > > uint32_t cmd; > > #define XEN_DOMCTL_createdomain 1 > > diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h > > index 8437d31..85eee16 100644 > > --- a/xen/include/public/sysctl.h > > +++ b/xen/include/public/sysctl.h > > @@ -632,6 +632,15 @@ struct xen_sysctl_coverage_op { > > typedef struct xen_sysctl_coverage_op xen_sysctl_coverage_op_t; > > DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t); > > > > +/* XEN_SYSCTL_getcqminfo */ > > +struct xen_sysctl_getcqminfo { > > + XEN_GUEST_HANDLE_64(xen_socket_cqmdata_t) buffer; /* OUT */ > > + uint32_t num_sockets; /* IN/OUT */ > > + uint32_t num_rmid; /* OUT */ > > + uint32_t num_rmid_avail; /* OUT */ > > +}; > > +typedef struct xen_sysctl_getcqminfo xen_sysctl_getcqminfo_t; > > +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_getcqminfo_t); > > > > struct xen_sysctl { > > uint32_t cmd; > > @@ -654,6 +663,7 @@ struct xen_sysctl { > > #define XEN_SYSCTL_cpupool_op 18 > > #define XEN_SYSCTL_scheduler_op 19 > > #define XEN_SYSCTL_coverage_op 20 > > +#define XEN_SYSCTL_getcqminfo 21 > > uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */ > > union { > > struct xen_sysctl_readconsole readconsole; > > @@ -675,6 +685,7 @@ struct xen_sysctl { > > struct xen_sysctl_cpupool_op cpupool_op; > > struct xen_sysctl_scheduler_op scheduler_op; > > struct xen_sysctl_coverage_op coverage_op; > > + struct xen_sysctl_getcqminfo getcqminfo; > > uint8_t pad[128]; > > } u; > > };
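
[Editorial sketch] read_cqm_data() in the quoted patch stores each sample at index socket * cqm_rmid_count + rmid of one flat array. A minimal sketch of how a consumer could walk that layout follows. It assumes the caller knows num_sockets and num_rmid, as returned by the sysctl; struct cqm_entry is an illustrative copy of the proposed xen_socket_cqmdata record (with domid_t taken to be uint16_t), and domain_l3_occupancy() is a hypothetical helper, not part of the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the record layout proposed in the patch. */
struct cqm_entry {
    uint64_t l3c_occupancy;
    uint32_t socket;
    uint16_t domid;
    uint8_t  valid;
};

/*
 * Sum the L3 occupancy reported for one domain across all sockets.
 * Entries are addressed as buf[socket * num_rmid + rmid]; RMID 0 is
 * reserved, and entries without the valid flag are skipped.
 */
static uint64_t domain_l3_occupancy(const struct cqm_entry *buf,
                                    uint32_t num_sockets, uint32_t num_rmid,
                                    uint16_t domid)
{
    uint64_t total = 0;

    for ( uint32_t s = 0; s < num_sockets; s++ )
        for ( uint32_t r = 1; r < num_rmid; r++ )
        {
            const struct cqm_entry *e = &buf[(size_t)s * num_rmid + r];

            if ( e->valid && e->domid == domid )
                total += e->l3c_occupancy;
        }

    return total;
}
```

The same walk could be used on the tools side to see which RMIDs currently carry data, which is the direction suggested in the reply above in place of get_cqm_avail().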
Xu, Dongxiao
2013-Nov-30 01:27 UTC
Re: [PATCH v3 5/7] x86: enable CQM monitoring for each domain RMID
> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Friday, November 29, 2013 10:57 PM
> To: Xu, Dongxiao
> Cc: xen-devel@lists.xen.org; keir@xen.org; JBeulich@suse.com;
> Ian.Jackson@eu.citrix.com; Ian.Campbell@citrix.com;
> stefano.stabellini@eu.citrix.com; konrad.wilk@oracle.com;
> dgdegra@tycho.nsa.gov
> Subject: Re: [PATCH v3 5/7] x86: enable CQM monitoring for each domain RMID
>
> On 29/11/13 05:48, dongxiao.xu@intel.com wrote:
> > From: Dongxiao Xu <dongxiao.xu@intel.com>
> >
> > If the CQM service is attached to a domain, its related RMID will be set
> > to hardware for monitoring when the domain's vcpu is scheduled in. When
> > the domain's vcpu is scheduled out, RMID 0 (system reserved) will be set
> > for monitoring.
> >
> > Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> > ---
> >  xen/arch/x86/domain.c           |    5 +++++
> >  xen/arch/x86/pqos.c             |   10 ++++++++++
> >  xen/include/asm-x86/msr-index.h |    1 +
> >  xen/include/asm-x86/pqos.h      |    1 +
> >  4 files changed, 17 insertions(+)
> >
> > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> > index 41e1fc6..628f7eb 100644
> > --- a/xen/arch/x86/domain.c
> > +++ b/xen/arch/x86/domain.c
> > @@ -1371,6 +1371,8 @@ static void __context_switch(void)
> >      {
> >          memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
> >          vcpu_save_fpu(p);
> > +        if ( system_supports_cqm() )
> > +            cqm_assoc_rmid(0);
> >          p->arch.ctxt_switch_from(p);
> >      }
> >
> > @@ -1395,6 +1397,9 @@ static void __context_switch(void)
> >          }
> >          vcpu_restore_fpu_eager(n);
> >          n->arch.ctxt_switch_to(n);
> > +
> > +        if ( system_supports_cqm() && n->domain->arch.pqos_cqm_rmid > 0 )
> > +            cqm_assoc_rmid(n->domain->arch.pqos_cqm_rmid);
> >      }
> >
> >      gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
> > diff --git a/xen/arch/x86/pqos.c b/xen/arch/x86/pqos.c
> > index 615c5ea..1faa650 100644
> > --- a/xen/arch/x86/pqos.c
> > +++ b/xen/arch/x86/pqos.c
> > @@ -29,6 +29,7 @@ boolean_param("pqos", pqos_enabled);
> >
> >  static unsigned int cqm_rmid_count = 256;
> >  integer_param("cqm_rmid_count", cqm_rmid_count);
> > +static uint64_t rmid_mask;
> >
> >  unsigned int cqm_upscaling_factor = 0;
> >  bool_t cqm_enabled = 0;
> > @@ -75,6 +76,8 @@ static void __init init_qos_monitor(void)
> >
> >      cpuid_count(0xf, 0, &eax, &ebx, &ecx, &qm_features);
> >
> > +    rmid_mask = ~(~0ull << get_count_order(ebx));
> > +
> >      if ( qm_features & QOS_MONITOR_TYPE_L3 )
> >          init_cqm();
> >  }
> > @@ -198,6 +201,13 @@ void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data)
> >      on_selected_cpus(cpu_cqmdata_map, read_cqm_data, data, 1);
> >  }
> >
> > +void cqm_assoc_rmid(unsigned int rmid)
> > +{
> > +    uint64_t val;
>
> Xen style requires a newline here.
>
> Otherwise,
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Will change it in next version.

Thanks,
Dongxiao

>
> > +    rdmsrl(MSR_IA32_PQR_ASSOC, val);
> > +    wrmsrl(MSR_IA32_PQR_ASSOC, (val & ~(rmid_mask)) | (rmid & rmid_mask));
> > +}
> > +
> >  /*
> >   * Local variables:
> >   * mode: C
> > diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
> > index 46ef165..45f4918 100644
> > --- a/xen/include/asm-x86/msr-index.h
> > +++ b/xen/include/asm-x86/msr-index.h
> > @@ -491,5 +491,6 @@
> >  /* Platform QoS register */
> >  #define MSR_IA32_QOSEVTSEL 0x00000c8d
> >  #define MSR_IA32_QMC 0x00000c8e
> > +#define MSR_IA32_PQR_ASSOC 0x00000c8f
> >
> >  #endif /* __ASM_MSR_INDEX_H */
> > diff --git a/xen/include/asm-x86/pqos.h b/xen/include/asm-x86/pqos.h
> > index 2ab9277..c75643a 100644
> > --- a/xen/include/asm-x86/pqos.h
> > +++ b/xen/include/asm-x86/pqos.h
> > @@ -41,5 +41,6 @@ void free_cqm_rmid(struct domain *d);
> >  unsigned int get_cqm_count(void);
> >  unsigned int get_cqm_avail(void);
> >  void get_cqm_info(cpumask_t *cpu_cqmdata_map, struct xen_socket_cqmdata *data);
> > +void cqm_assoc_rmid(unsigned int rmid);
> >
> >  #endif
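
[Editorial sketch] The mask arithmetic behind cqm_assoc_rmid() can be tried without touching any MSR. The sketch below works on plain integers. count_order() is a stand-in written for this example (smallest order with 2^order >= n) and is only assumed to behave like Xen's get_count_order() for n >= 1; make_rmid_mask() and assoc_rmid() are hypothetical names.

```c
#include <stdint.h>

/* Smallest order such that (1 << order) >= n; stand-in for get_count_order(). */
static unsigned int count_order(uint32_t n)
{
    unsigned int order = 0;

    while ( order < 32 && (UINT64_C(1) << order) < n )
        order++;

    return order;
}

/* Low-bit mask wide enough for the RMID range reported by CPUID. */
static uint64_t make_rmid_mask(uint32_t max_rmid)
{
    return ~(~UINT64_C(0) << count_order(max_rmid));
}

/*
 * Read-modify-write on a register value: replace only the RMID field
 * and leave every other bit of the old value untouched.
 */
static uint64_t assoc_rmid(uint64_t pqr_assoc, uint64_t rmid_mask,
                           unsigned int rmid)
{
    return (pqr_assoc & ~rmid_mask) | (rmid & rmid_mask);
}
```

With a maximum RMID of 255 the mask comes out as 0xff, so associating RMID 0x34 with an old value of 0x0000ABCD00000012 yields 0x0000ABCD00000034: the upper bits survive and only the RMID field changes.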
Xu, Dongxiao
2013-Dec-02 02:17 UTC
Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring feature
> -----Original Message-----
> From: Xu, Dongxiao
> Sent: Saturday, November 30, 2013 8:41 AM
> To: Jan Beulich; Andrew Cooper
> Cc: Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> Subject: RE: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring
> feature
>
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: Friday, November 29, 2013 11:05 PM
> > To: Andrew Cooper; Xu, Dongxiao
> > Cc: Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> > stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> > konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> > Subject: Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring
> > feature
> >
> > >>> On 29.11.13 at 14:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > > However, I feel that all options relating to platform QoS should be
> > > available under a "pqos" custom_param, (similar to iommu=), so we don't
> > > end up with loads of new command line options with common prefixes as
> > > new features are added.
> > >
> > > For this, I would suggest semantic like:
> > >
> > > pqos=[<boolean>],[max_rmids=<number>]
> > >
> > > Particularly, when a second pqos option is introduced, it might be
> > > sensible to have individual booleans for each option, as well as a
> > > global enable/disable.
> > >
> > >
> > > I would be interested in general views from others as far as this is
> > > concerned.
> >
> > +1
>
> Okay, will modify it.

For the parameter name, I think "opt_cqm_rmid_count" might be better, since:
- RMIDs may differ between QoS options, so we need to add a "cqm_" prefix
  here.
- opt_cqm_rmid_count indicates the number of RMIDs, while opt_max_rmids
  means the maximum value of an RMID. Here we need users to limit the
  number of RMIDs on the grub command line.

Thanks,
Dongxiao

>
> Thanks,
> Dongxiao
>
> >
> > Jan
Xu, Dongxiao
2013-Dec-02 09:22 UTC
Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring feature
> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org
> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Xu, Dongxiao
> Sent: Monday, December 02, 2013 10:18 AM
> To: Jan Beulich; Andrew Cooper
> Cc: keir@xen.org; Ian.Campbell@citrix.com; stefano.stabellini@eu.citrix.com;
> Ian.Jackson@eu.citrix.com; xen-devel@lists.xen.org; dgdegra@tycho.nsa.gov
> Subject: Re: [Xen-devel] [PATCH v3 1/7] x86: detect and initialize Cache QoS
> Monitoring feature
>
> > -----Original Message-----
> > From: Xu, Dongxiao
> > Sent: Saturday, November 30, 2013 8:41 AM
> > To: Jan Beulich; Andrew Cooper
> > Cc: Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> > stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> > konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> > Subject: RE: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring
> > feature
> >
> > > -----Original Message-----
> > > From: Jan Beulich [mailto:JBeulich@suse.com]
> > > Sent: Friday, November 29, 2013 11:05 PM
> > > To: Andrew Cooper; Xu, Dongxiao
> > > Cc: Ian.Campbell@citrix.com; Ian.Jackson@eu.citrix.com;
> > > stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org;
> > > konrad.wilk@oracle.com; dgdegra@tycho.nsa.gov; keir@xen.org
> > > Subject: Re: [PATCH v3 1/7] x86: detect and initialize Cache QoS Monitoring
> > > feature
> > >
> > > >>> On 29.11.13 at 14:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > > > However, I feel that all options relating to platform QoS should be
> > > > available under a "pqos" custom_param (similar to iommu=), so we don't
> > > > end up with loads of new command line options with common prefixes as
> > > > new features are added.
> > > >
> > > > For this, I would suggest semantics like:
> > > >
> > > > pqos=[<boolean>],[max_rmids=<number>]
> > > >
> > > > Particularly, when a second pqos option is introduced, it might be
> > > > sensible to have individual booleans for each option, as well as a
> > > > global enable/disable.
> > > >
> > > > I would be interested in general views from others as far as this is
> > > > concerned.
> > >
> > > +1
> >
> > Okay, will modify it.
>
> For the parameter name, I think "opt_cqm_rmid_count" might be better, since:
> - RMIDs may differ between QoS options, so we need a "cqm_" prefix here.
> - opt_cqm_rmid_count indicates the number of RMIDs, while opt_max_rmids
>   suggests the maximum RMID value. What we need here is for users to limit
>   the number of RMIDs on the GRUB command line.

On further thought, I have changed the parameter to opt_cqm_max_rmid, with a default value of 255.

Thanks,
Dongxiao

>
> Thanks,
> Dongxiao
>
> >
> > Thanks,
> > Dongxiao
> >
> > >
> > > Jan
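The difference between a "count" and a "max value" parameter is an off-by-one that is easy to get wrong, so here is a small sketch of the two readings. It relies on the cover letter's statement that RMID=0 is reserved for domains that do not enable CQM. The helper names are hypothetical, and whether a count would include the reserved RMID 0 is an assumption made for this illustration (here it does).

```python
RESERVED_RMID = 0  # per the cover letter: used for domains without CQM


def assignable_rmids_from_max(cqm_max_rmid: int = 255) -> range:
    """RMIDs a guest may receive when the option is the highest valid RMID.

    The default of 255 mirrors the default stated for opt_cqm_max_rmid.
    """
    return range(RESERVED_RMID + 1, cqm_max_rmid + 1)


def assignable_rmids_from_count(cqm_rmid_count: int) -> range:
    """RMIDs a guest may receive when the option is a count of RMIDs.

    Assumption: the count includes the reserved RMID 0, so a count of N
    covers RMIDs 0..N-1 and leaves N-1 of them for guests.
    """
    return range(RESERVED_RMID + 1, cqm_rmid_count)
```

Under these assumptions, a maximum RMID of 255 and a count of 256 describe the same set of assignable RMIDs (1 through 255), which is why the name should make clear which of the two the user is setting.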