Raghavendra K T
2011-Oct-23 19:03 UTC
[Xen-devel] [PATCH RFC V2 0/5] kvm : Paravirt-spinlock support for KVM guests
The 5-patch series to follow this email extends KVM-hypervisor and Linux guest running on KVM-hypervisor to support pv-ticket spinlocks, based heavily on Xen''s implementation. Two hypercalls are being introduced in KVM hypervisor, one that allows a vcpu (spinning on a lock) to block and another that allows a vcpu to kick another out of blocking state. This is discussed in : (posted by Srivatsa V). https://lkml.org/lkml/2010/7/26/24 https://lkml.org/lkml/2011/1/19/212 The BASE patch is tip rc9 + Jeremy''s following patches. compare exchange (https://lkml.org/lkml/2011/8/29/321), xadd (https://lkml.org/lkml/2011/10/4/328) x86/ticketlocklock (https://lkml.org/lkml/2011/10/12/496). Changes since last posting: - rebased patchesto -rc9 - synchronization related changes based on Jeremy''s changes (Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>) pointed by Stephan Diestelhorst <stephan.diestelhorst@amd.com> - enabling 32 bit guests - splitted patches into two more chunks Results: set up : Kernel for host/guest : 3.1-rc9 + Jeremys ticketlock cleanup, xadd, pv spinlock patches as BASE 3 guests with 8VCPU, 4GB RAM, 1 used for kernbench (kernbench -f -H -M -o 20) other for cpuhog (shell script while true with an instruction) scenario A: unpinned scenario B: each vcpu pinned to corresponding cpu 1x: no hogs 2x: 8hogs in one guest 3x: 8hogs each in two guest Result for Non PLE machine : Machine : IBM xSeries with Intel(R) Xeon(R) x5570 2.93GHz CPU with 8 core , 64GB RAM BASE BASE+patch %improvement mean (sd) mean (sd) Scenario A: case 1x: 157.495 (15.2924) 148.355 (7.35479) 5.80336 case 2x: 761.955 (581.384) 247.306 (64.4707) 67.5432 case 3x: 2804.36 (2613.57) 261.522 (85.0856) 90.6744 Scenario B case 1x: 1241.86 (812.155) 313.805 (140.99) 74.731 case 2x: 1277.32 (771.316) 308.325 (124.808) 75.8616 case 3x: 1239.45 (860.012) 307.535 (124.243) 75.1878 Result for PLE machine: Machine : IBM xSeries with Intel(R) Xeon(R) X7560 2.27GHz CPU with 32/64 core, with 8 online cores and 4*64GB RAM BASE BASE+patch %improvement mean (sd) mean (sd) Scenario A: case 1x: 72.7133 (43.7109) 70.0734 (37.7866) 3.63056 case 2x: 124.138 (79.9515) 119.28 (81.3597) 3.91339 case 3x: 174.561 (132.209) 159.283 (116.232) 8.75224 Scenario B case 1x: 166.101 (119.313) 160.056 (117.446) 3.63935 case 2x: 167.421 (120.767) 158.133 (115.022) 5.54769 case 3x: 169.317 (122.088) 159.353 (116.737) 5.88482 Srivatsa Vaddagiri, Suzuki Poulose, Raghavendra K T (5): Add debugfs support to print u32-arrays in debugfs Renaming of xen functions and change unsigned to u32 Add two hypercalls to KVM hypervisor to support pv-ticketlocks Added configuration support to enable debug information for KVM Guests pv-ticketlocks support for linux guests running on KVM hypervisor arch/x86/Kconfig | 9 ++ arch/x86/include/asm/kvm_para.h | 12 ++- arch/x86/kernel/head32.c | 3 + arch/x86/kernel/head64.c | 3 + arch/x86/kernel/kvm.c | 247 +++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 67 +++++++++++- arch/x86/xen/debugfs.c | 104 ---------------- arch/x86/xen/debugfs.h | 4 - arch/x86/xen/spinlock.c | 2 +- fs/debugfs/file.c | 108 +++++++++++++++++ include/linux/debugfs.h | 11 ++ include/linux/kvm.h | 1 + include/linux/kvm_host.h | 5 + include/linux/kvm_para.h | 2 + virt/kvm/kvm_main.c | 1 + 15 files changed, 468 insertions(+), 111 deletions(-) --- _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-23 19:04 UTC
[Xen-devel] [PATCH RFC V2 1/5] debugfs: Add support to print u32 array in debugfs
Add debugfs support to print u32-arrays in debugfs. Move the code from Xen to debugfs to make the code common for other users as well. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> --- diff --git a/arch/x86/xen/debugfs.c b/arch/x86/xen/debugfs.c index 7c0fedd..c8377fb 100644 --- a/arch/x86/xen/debugfs.c +++ b/arch/x86/xen/debugfs.c @@ -19,107 +19,3 @@ struct dentry * __init xen_init_debugfs(void) return d_xen_debug; } -struct array_data -{ - void *array; - unsigned elements; -}; - -static int u32_array_open(struct inode *inode, struct file *file) -{ - file->private_data = NULL; - return nonseekable_open(inode, file); -} - -static size_t format_array(char *buf, size_t bufsize, const char *fmt, - u32 *array, unsigned array_size) -{ - size_t ret = 0; - unsigned i; - - for(i = 0; i < array_size; i++) { - size_t len; - - len = snprintf(buf, bufsize, fmt, array[i]); - len++; /* '' '' or ''\n'' */ - ret += len; - - if (buf) { - buf += len; - bufsize -= len; - buf[-1] = (i == array_size-1) ? ''\n'' : '' ''; - } - } - - ret++; /* \0 */ - if (buf) - *buf = ''\0''; - - return ret; -} - -static char *format_array_alloc(const char *fmt, u32 *array, unsigned array_size) -{ - size_t len = format_array(NULL, 0, fmt, array, array_size); - char *ret; - - ret = kmalloc(len, GFP_KERNEL); - if (ret == NULL) - return NULL; - - format_array(ret, len, fmt, array, array_size); - return ret; -} - -static ssize_t u32_array_read(struct file *file, char __user *buf, size_t len, - loff_t *ppos) -{ - struct inode *inode = file->f_path.dentry->d_inode; - struct array_data *data = inode->i_private; - size_t size; - - if (*ppos == 0) { - if (file->private_data) { - kfree(file->private_data); - file->private_data = NULL; - } - - file->private_data = format_array_alloc("%u", data->array, data->elements); - } - - size = 0; - if (file->private_data) - size = strlen(file->private_data); - - return simple_read_from_buffer(buf, len, ppos, file->private_data, size); -} - -static int xen_array_release(struct inode *inode, struct file *file) -{ - kfree(file->private_data); - - return 0; -} - -static const struct file_operations u32_array_fops = { - .owner = THIS_MODULE, - .open = u32_array_open, - .release= xen_array_release, - .read = u32_array_read, - .llseek = no_llseek, -}; - -struct dentry *xen_debugfs_create_u32_array(const char *name, mode_t mode, - struct dentry *parent, - u32 *array, unsigned elements) -{ - struct array_data *data = kmalloc(sizeof(*data), GFP_KERNEL); - - if (data == NULL) - return NULL; - - data->array = array; - data->elements = elements; - - return debugfs_create_file(name, mode, parent, data, &u32_array_fops); -} diff --git a/arch/x86/xen/debugfs.h b/arch/x86/xen/debugfs.h index e281320..12ebf33 100644 --- a/arch/x86/xen/debugfs.h +++ b/arch/x86/xen/debugfs.h @@ -3,8 +3,4 @@ struct dentry * __init xen_init_debugfs(void); -struct dentry *xen_debugfs_create_u32_array(const char *name, mode_t mode, - struct dentry *parent, - u32 *array, unsigned elements); - #endif /* _XEN_DEBUGFS_H */ diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c index 90f7657..9aa12e3 100644 --- a/fs/debugfs/file.c +++ b/fs/debugfs/file.c @@ -18,6 +18,7 @@ #include <linux/pagemap.h> #include <linux/namei.h> #include <linux/debugfs.h> +#include <linux/slab.h> static ssize_t default_read_file(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -525,3 +526,110 @@ struct dentry *debugfs_create_blob(const char *name, mode_t mode, return debugfs_create_file(name, mode, parent, blob, &fops_blob); } EXPORT_SYMBOL_GPL(debugfs_create_blob); + +struct array_data { + void *array; + unsigned elements; +}; + +static int u32_array_open(struct inode *inode, struct file *file) +{ + file->private_data = NULL; + return nonseekable_open(inode, file); +} + +static size_t format_array(char *buf, size_t bufsize, const char *fmt, + u32 *array, unsigned array_size) +{ + size_t ret = 0; + unsigned i; + + for (i = 0; i < array_size; i++) { + size_t len; + + len = snprintf(buf, bufsize, fmt, array[i]); + len++; /* '' '' or ''\n'' */ + ret += len; + + if (buf) { + buf += len; + bufsize -= len; + buf[-1] = (i == array_size-1) ? ''\n'' : '' ''; + } + } + + ret++; /* \0 */ + if (buf) + *buf = ''\0''; + + return ret; +} + +static char *format_array_alloc(const char *fmt, u32 *array, + unsigned array_size) +{ + size_t len = format_array(NULL, 0, fmt, array, array_size); + char *ret; + + ret = kmalloc(len, GFP_KERNEL); + if (ret == NULL) + return NULL; + + format_array(ret, len, fmt, array, array_size); + return ret; +} + +static ssize_t u32_array_read(struct file *file, char __user *buf, size_t len, + loff_t *ppos) +{ + struct inode *inode = file->f_path.dentry->d_inode; + struct array_data *data = inode->i_private; + size_t size; + + if (*ppos == 0) { + if (file->private_data) { + kfree(file->private_data); + file->private_data = NULL; + } + + file->private_data = format_array_alloc("%u", data->array, + data->elements); + } + + size = 0; + if (file->private_data) + size = strlen(file->private_data); + + return simple_read_from_buffer(buf, len, ppos, + file->private_data, size); +} + +static int xen_array_release(struct inode *inode, struct file *file) +{ + kfree(file->private_data); + + return 0; +} + +static const struct file_operations u32_array_fops = { + .owner = THIS_MODULE, + .open = u32_array_open, + .release = xen_array_release, + .read = u32_array_read, + .llseek = no_llseek, +}; + +struct dentry *xen_debugfs_create_u32_array(const char *name, mode_t mode, + struct dentry *parent, + u32 *array, unsigned elements) +{ + struct array_data *data = kmalloc(sizeof(*data), GFP_KERNEL); + + if (data == NULL) + return NULL; + + data->array = array; + data->elements = elements; + + return debugfs_create_file(name, mode, parent, data, &u32_array_fops); +} diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h index e7d9b20..8c08636 100644 --- a/include/linux/debugfs.h +++ b/include/linux/debugfs.h @@ -74,6 +74,10 @@ struct dentry *debugfs_create_blob(const char *name, mode_t mode, struct dentry *parent, struct debugfs_blob_wrapper *blob); +struct dentry *xen_debugfs_create_u32_array(const char *name, mode_t mode, + struct dentry *parent, + u32 *array, unsigned elements); + bool debugfs_initialized(void); #else @@ -193,6 +197,13 @@ static inline bool debugfs_initialized(void) return false; } +struct dentry *xen_debugfs_create_u32_array(const char *name, mode_t mode, + struct dentry *parent, + u32 *array, unsigned elements) +{ + return ERR_PTR(-ENODEV); +} + #endif #endif _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-23 19:04 UTC
[Xen-devel] [PATCH RFC V2 2/5] debugfs: Renaming of xen functions and change unsigned to u32
Renaming of xen functions and change unsigned to u32. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> --- diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c index fc506e6..14a8961 100644 --- a/arch/x86/xen/spinlock.c +++ b/arch/x86/xen/spinlock.c @@ -286,7 +286,7 @@ static int __init xen_spinlock_debugfs(void) debugfs_create_u64("time_blocked", 0444, d_spin_debug, &spinlock_stats.time_blocked); - xen_debugfs_create_u32_array("histo_blocked", 0444, d_spin_debug, + debugfs_create_u32_array("histo_blocked", 0444, d_spin_debug, spinlock_stats.histo_spin_blocked, HISTO_BUCKETS + 1); return 0; diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c index 9aa12e3..68af40d 100644 --- a/fs/debugfs/file.c +++ b/fs/debugfs/file.c @@ -529,7 +529,7 @@ EXPORT_SYMBOL_GPL(debugfs_create_blob); struct array_data { void *array; - unsigned elements; + u32 elements; }; static int u32_array_open(struct inode *inode, struct file *file) @@ -539,10 +539,10 @@ static int u32_array_open(struct inode *inode, struct file *file) } static size_t format_array(char *buf, size_t bufsize, const char *fmt, - u32 *array, unsigned array_size) + u32 *array, u32 array_size) { size_t ret = 0; - unsigned i; + u32 i; for (i = 0; i < array_size; i++) { size_t len; @@ -566,7 +566,7 @@ static size_t format_array(char *buf, size_t bufsize, const char *fmt, } static char *format_array_alloc(const char *fmt, u32 *array, - unsigned array_size) + u32 array_size) { size_t len = format_array(NULL, 0, fmt, array, array_size); char *ret; @@ -604,7 +604,7 @@ static ssize_t u32_array_read(struct file *file, char __user *buf, size_t len, file->private_data, size); } -static int xen_array_release(struct inode *inode, struct file *file) +static int u32_array_release(struct inode *inode, struct file *file) { kfree(file->private_data); @@ -614,14 +614,14 @@ static int xen_array_release(struct inode *inode, struct file *file) static const struct file_operations u32_array_fops = { .owner = THIS_MODULE, .open = u32_array_open, - .release = xen_array_release, + .release = u32_array_release, .read = u32_array_read, .llseek = no_llseek, }; -struct dentry *xen_debugfs_create_u32_array(const char *name, mode_t mode, +struct dentry *debugfs_create_u32_array(const char *name, mode_t mode, struct dentry *parent, - u32 *array, unsigned elements) + u32 *array, u32 elements) { struct array_data *data = kmalloc(sizeof(*data), GFP_KERNEL); diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h index 8c08636..253e2fb 100644 --- a/include/linux/debugfs.h +++ b/include/linux/debugfs.h @@ -74,9 +74,9 @@ struct dentry *debugfs_create_blob(const char *name, mode_t mode, struct dentry *parent, struct debugfs_blob_wrapper *blob); -struct dentry *xen_debugfs_create_u32_array(const char *name, mode_t mode, - struct dentry *parent, - u32 *array, unsigned elements); +struct dentry *debugfs_create_u32_array(const char *name, mode_t mode, + struct dentry *parent, + u32 *array, u32 elements); bool debugfs_initialized(void); @@ -197,9 +197,9 @@ static inline bool debugfs_initialized(void) return false; } -struct dentry *xen_debugfs_create_u32_array(const char *name, mode_t mode, - struct dentry *parent, - u32 *array, unsigned elements) +struct dentry *debugfs_create_u32_array(const char *name, mode_t mode, + struct dentry *parent, + u32 *array, u32 elements) { return ERR_PTR(-ENODEV); } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-23 19:05 UTC
[Xen-devel] [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
Add two hypercalls to KVM hypervisor to support pv-ticketlocks. KVM_HC_WAIT_FOR_KICK blocks the calling vcpu until another vcpu kicks it or it is woken up because of an event like interrupt. KVM_HC_KICK_CPU allows the calling vcpu to kick another vcpu. The presence of these hypercalls is indicated to guest via KVM_FEATURE_WAIT_FOR_KICK/KVM_CAP_WAIT_FOR_KICK. Qemu needs a corresponding patch to pass up the presence of this feature to guest via cpuid. Patch to qemu will be sent separately. There is no Xen/KVM hypercall interface to await kick from. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> --- diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h index 734c376..2874c19 100644 --- a/arch/x86/include/asm/kvm_para.h +++ b/arch/x86/include/asm/kvm_para.h @@ -16,12 +16,14 @@ #define KVM_FEATURE_CLOCKSOURCE 0 #define KVM_FEATURE_NOP_IO_DELAY 1 #define KVM_FEATURE_MMU_OP 2 + /* This indicates that the new set of kvmclock msrs * are available. The use of 0x11 and 0x12 is deprecated */ #define KVM_FEATURE_CLOCKSOURCE2 3 #define KVM_FEATURE_ASYNC_PF 4 #define KVM_FEATURE_STEAL_TIME 5 +#define KVM_FEATURE_WAIT_FOR_KICK 6 /* The last 8 bits are used to indicate how to interpret the flags field * in pvclock structure. If no bits are set, all flags are ignored. diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 84a28ea..b43fd18 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext) case KVM_CAP_XSAVE: case KVM_CAP_ASYNC_PF: case KVM_CAP_GET_TSC_KHZ: + case KVM_CAP_WAIT_FOR_KICK: r = 1; break; case KVM_CAP_COALESCED_MMIO: @@ -2548,7 +2549,8 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, (1 << KVM_FEATURE_NOP_IO_DELAY) | (1 << KVM_FEATURE_CLOCKSOURCE2) | (1 << KVM_FEATURE_ASYNC_PF) | - (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT); + (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) | + (1 << KVM_FEATURE_WAIT_FOR_KICK); if (sched_info_on()) entry->eax |= (1 << KVM_FEATURE_STEAL_TIME); @@ -5231,6 +5233,61 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu) return 1; } +/* + * kvm_pv_wait_for_kick_op : Block until kicked by either a KVM_HC_KICK_CPU + * hypercall or a event like interrupt. + * + * @vcpu : vcpu which is blocking. + */ +static void kvm_pv_wait_for_kick_op(struct kvm_vcpu *vcpu) +{ + DEFINE_WAIT(wait); + + /* + * Blocking on vcpu->wq allows us to wake up sooner if required to + * service pending events (like interrupts). + * + * Also set state to TASK_INTERRUPTIBLE before checking vcpu->kicked to + * avoid racing with kvm_pv_kick_cpu_op(). + */ + prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE); + + /* + * Somebody has already tried kicking us. Acknowledge that + * and terminate the wait. + */ + if (vcpu->kicked) { + vcpu->kicked = 0; + goto end_wait; + } + + /* Let''s wait for either KVM_HC_KICK_CPU or someother event + * to wake us up. + */ + + srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx); + schedule(); + vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); + +end_wait: + finish_wait(&vcpu->wq, &wait); +} + +/* + * kvm_pv_kick_cpu_op: Kick a vcpu. + * + * @cpu - vcpu to be kicked. + */ +static void kvm_pv_kick_cpu_op(struct kvm *kvm, int cpu) +{ + struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, cpu); + + if (vcpu) { + vcpu->kicked = 1; + wake_up_interruptible(&vcpu->wq); + } +} + int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) { unsigned long nr, a0, a1, a2, a3, ret; @@ -5267,6 +5324,14 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) case KVM_HC_MMU_OP: r = kvm_pv_mmu_op(vcpu, a0, hc_gpa(vcpu, a1, a2), &ret); break; + case KVM_HC_WAIT_FOR_KICK: + kvm_pv_wait_for_kick_op(vcpu); + ret = 0; + break; + case KVM_HC_KICK_CPU: + kvm_pv_kick_cpu_op(vcpu->kvm, a0); + ret = 0; + break; default: ret = -KVM_ENOSYS; break; diff --git a/include/linux/kvm.h b/include/linux/kvm.h index aace6b8..736acc0 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -553,6 +553,7 @@ struct kvm_ppc_pvinfo { #define KVM_CAP_SPAPR_TCE 63 #define KVM_CAP_PPC_SMT 64 #define KVM_CAP_PPC_RMA 65 +#define KVM_CAP_WAIT_FOR_KICK 66 #define KVM_CAP_S390_GMAP 71 #ifdef KVM_CAP_IRQ_ROUTING diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index eabb21a..2946563 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -152,6 +152,11 @@ struct kvm_vcpu { #endif struct kvm_vcpu_arch arch; + + /* + * blocked vcpu wakes up by checking this flag set by unlocker + */ + int kicked; }; static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h index 47a070b..4efee4d 100644 --- a/include/linux/kvm_para.h +++ b/include/linux/kvm_para.h @@ -19,6 +19,8 @@ #define KVM_HC_MMU_OP 2 #define KVM_HC_FEATURES 3 #define KVM_HC_PPC_MAP_MAGIC_PAGE 4 +#define KVM_HC_WAIT_FOR_KICK 5 +#define KVM_HC_KICK_CPU 6 /* * hypercalls use architecture specific diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index aefdda3..d09982d 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -224,6 +224,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id) vcpu->kvm = kvm; vcpu->vcpu_id = id; vcpu->pid = NULL; + vcpu->kicked = 0; init_waitqueue_head(&vcpu->wq); kvm_async_pf_vcpu_init(vcpu); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-23 19:07 UTC
[Xen-devel] [PATCH RFC V2 4/5] kvm guest : Added configuration support to enable debug information for KVM Guests
Added configuration support to enable debug information for KVM Guests in debugfs Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> --- diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 1f03f82..ed34269 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -562,6 +562,15 @@ config KVM_GUEST This option enables various optimizations for running under the KVM hypervisor. +config KVM_DEBUG_FS + bool "Enable debug information for KVM Guests in debugfs" + depends on KVM_GUEST + default n + ---help--- + This option enables collection of various statistics for KVM guest. + Statistics are displayed in debugfs filesystem. Enabling this option + may incur significant overhead. + source "arch/x86/lguest/Kconfig" config PARAVIRT _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-23 19:07 UTC
[Xen-devel] [PATCH RFC V2 5/5] kvm guest : pv-ticketlocks support for linux guests running on KVM hypervisor
This patch extends Linux guests running on KVM hypervisor to support pv-ticketlocks. Very early during bootup, paravirtualied KVM guest detects if the hypervisor has required feature (KVM_FEATURE_WAIT_FOR_KICK) to support pv-ticketlocks. If so, support for pv-ticketlocks is registered via pv_lock_ops. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> --- diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h index 2874c19..c7f34b7 100644 --- a/arch/x86/include/asm/kvm_para.h +++ b/arch/x86/include/asm/kvm_para.h @@ -195,10 +195,18 @@ void kvm_async_pf_task_wait(u32 token); void kvm_async_pf_task_wake(u32 token); u32 kvm_read_and_reset_pf_reason(void); extern void kvm_disable_steal_time(void); -#else + +#ifdef CONFIG_PARAVIRT_SPINLOCKS +void __init kvm_guest_early_init(void); +#else /* CONFIG_PARAVIRT_SPINLOCKS */ +#define kvm_guest_early_init() do { } while (0) +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ + +#else /* CONFIG_KVM_GUEST */ #define kvm_guest_init() do { } while (0) #define kvm_async_pf_task_wait(T) do {} while(0) #define kvm_async_pf_task_wake(T) do {} while(0) +#define kvm_guest_early_init() do { } while (0) static inline u32 kvm_read_and_reset_pf_reason(void) { return 0; diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c index 3bb0850..fb25bca 100644 --- a/arch/x86/kernel/head32.c +++ b/arch/x86/kernel/head32.c @@ -9,6 +9,7 @@ #include <linux/start_kernel.h> #include <linux/mm.h> #include <linux/memblock.h> +#include <linux/kvm_para.h> #include <asm/setup.h> #include <asm/sections.h> @@ -59,6 +60,8 @@ void __init i386_start_kernel(void) break; } + kvm_guest_early_init(); + /* * At this point everything still needed from the boot loader * or BIOS or kernel text should be early reserved or marked not diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index 5655c22..cabf8ec 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -13,6 +13,7 @@ #include <linux/start_kernel.h> #include <linux/io.h> #include <linux/memblock.h> +#include <linux/kvm_para.h> #include <asm/processor.h> #include <asm/proto.h> @@ -115,6 +116,8 @@ void __init x86_64_start_reservations(char *real_mode_data) reserve_ebda_region(); + kvm_guest_early_init(); + /* * At this point everything still needed from the boot loader * or BIOS or kernel text should be early reserved or marked not diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index a9c2116..f4f341f 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -39,6 +39,16 @@ #include <asm/desc.h> #include <asm/tlbflush.h> +#ifdef CONFIG_PARAVIRT_SPINLOCKS + +#ifdef CONFIG_KVM_DEBUG_FS + +#include <linux/debugfs.h> + +#endif /* CONFIG_KVM_DEBUG_FS */ + +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ + #define MMU_QUEUE_SIZE 1024 static int kvmapf = 1; @@ -627,3 +637,240 @@ static __init int activate_jump_labels(void) return 0; } arch_initcall(activate_jump_labels); + +#ifdef CONFIG_PARAVIRT_SPINLOCKS + +#ifdef CONFIG_KVM_DEBUG_FS + +static struct kvm_spinlock_stats +{ + u32 taken_slow; + u32 taken_slow_pickup; + + u32 released_slow; + u32 released_slow_kicked; + +#define HISTO_BUCKETS 30 + u32 histo_spin_blocked[HISTO_BUCKETS+1]; + + u64 time_blocked; +} spinlock_stats; + +static u8 zero_stats; + +static inline void check_zero(void) +{ + if (unlikely(zero_stats)) { + memset(&spinlock_stats, 0, sizeof(spinlock_stats)); + zero_stats = 0; + } +} + +#define ADD_STATS(elem, val) \ + do { check_zero(); spinlock_stats.elem += (val); } while (0) + +static inline u64 spin_time_start(void) +{ + return sched_clock(); +} + +static void __spin_time_accum(u64 delta, u32 *array) +{ + unsigned index = ilog2(delta); + + check_zero(); + + if (index < HISTO_BUCKETS) + array[index]++; + else + array[HISTO_BUCKETS]++; +} + +static inline void spin_time_accum_blocked(u64 start) +{ + u32 delta = sched_clock() - start; + + __spin_time_accum(delta, spinlock_stats.histo_spin_blocked); + spinlock_stats.time_blocked += delta; +} + +static struct dentry *d_spin_debug; +static struct dentry *d_kvm_debug; + +struct dentry *kvm_init_debugfs(void) +{ + d_kvm_debug = debugfs_create_dir("kvm", NULL); + if (!d_kvm_debug) + printk(KERN_WARNING "Could not create ''kvm'' debugfs directory\n"); + + return d_kvm_debug; +} + +static int __init kvm_spinlock_debugfs(void) +{ + struct dentry *d_kvm = kvm_init_debugfs(); + + if (d_kvm == NULL) + return -ENOMEM; + + d_spin_debug = debugfs_create_dir("spinlocks", d_kvm); + + debugfs_create_u8("zero_stats", 0644, d_spin_debug, &zero_stats); + + debugfs_create_u32("taken_slow", 0444, d_spin_debug, + &spinlock_stats.taken_slow); + debugfs_create_u32("taken_slow_pickup", 0444, d_spin_debug, + &spinlock_stats.taken_slow_pickup); + + debugfs_create_u32("released_slow", 0444, d_spin_debug, + &spinlock_stats.released_slow); + debugfs_create_u32("released_slow_kicked", 0444, d_spin_debug, + &spinlock_stats.released_slow_kicked); + + debugfs_create_u64("time_blocked", 0444, d_spin_debug, + &spinlock_stats.time_blocked); + + debugfs_create_u32_array("histo_blocked", 0444, d_spin_debug, + spinlock_stats.histo_spin_blocked, HISTO_BUCKETS + 1); + + return 0; +} +fs_initcall(kvm_spinlock_debugfs); +#else /* !CONFIG_KVM_DEBUG_FS */ +#define TIMEOUT (1 << 10) +#define ADD_STATS(elem, val) do { (void)(val); } while (0) + +static inline u64 spin_time_start(void) +{ + return 0; +} + +static inline void spin_time_accum_blocked(u64 start) +{ +} +#endif /* CONFIG_KVM_DEBUG_FS */ + +struct kvm_lock_waiting { + struct arch_spinlock *lock; + __ticket_t want; +}; + +/* cpus ''waiting'' on a spinlock to become available */ +static cpumask_t waiting_cpus; + +/* Track spinlock on which a cpu is waiting */ +static DEFINE_PER_CPU(struct kvm_lock_waiting, lock_waiting); + +static inline void kvm_wait_for_kick(void) +{ + kvm_hypercall0(KVM_HC_WAIT_FOR_KICK); +} + +static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want) +{ + struct kvm_lock_waiting *w = &__get_cpu_var(lock_waiting); + int cpu = smp_processor_id(); + u64 start; + unsigned long flags; + + start = spin_time_start(); + + /* + * Make sure an interrupt handler can''t upset things in a + * partially setup state. + */ + local_irq_save(flags); + + /* + * The ordering protocol on this is that the "lock" pointer + * may only be set non-NULL if the "want" ticket is correct. + * If we''re updating "want", we must first clear "lock". + */ + w->lock = NULL; + smp_wmb(); + w->want = want; + smp_wmb(); + w->lock = lock; + + ADD_STATS(taken_slow, 1); + + /* + * This uses set_bit, which is atomic but we should not rely on its + * reordering gurantees. So barrier is needed after this call. + */ + cpumask_set_cpu(cpu, &waiting_cpus); + + barrier(); + + /* + * Mark entry to slowpath before doing the pickup test to make + * sure we don''t deadlock with an unlocker. + */ + __ticket_enter_slowpath(lock); + + /* + * check again make sure it didn''t become free while + * we weren''t looking. + */ + if (ACCESS_ONCE(lock->tickets.head) == want) { + ADD_STATS(taken_slow_pickup, 1); + goto out; + } + + /* Allow interrupts while blocked */ + local_irq_restore(flags); + + kvm_wait_for_kick(); + + local_irq_save(flags); +out: + cpumask_clear_cpu(cpu, &waiting_cpus); + w->lock = NULL; + local_irq_restore(flags); + spin_time_accum_blocked(start); +} +PV_CALLEE_SAVE_REGS_THUNK(kvm_lock_spinning); + +/* Kick a cpu */ +static inline void kvm_kick_cpu(int cpu) +{ + kvm_hypercall1(KVM_HC_KICK_CPU, cpu); +} + +/* Kick vcpu waiting on @lock->head to reach value @ticket */ +static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket) +{ + int cpu; + + ADD_STATS(released_slow, 1); + + for_each_cpu(cpu, &waiting_cpus) { + const struct kvm_lock_waiting *w = &per_cpu(lock_waiting, cpu); + if (ACCESS_ONCE(w->lock) == lock && + ACCESS_ONCE(w->want) == ticket) { + ADD_STATS(released_slow_kicked, 1); + kvm_kick_cpu(cpu); + break; + } + } +} + +/* + * Setup pv_lock_ops to exploit KVM_FEATURE_WAIT_FOR_KICK if present. + * This needs to be setup really early in boot, before the first call to + * spinlock is issued! + */ +void __init kvm_guest_early_init(void) +{ + if (!kvm_para_available()) + return; + /* Does host kernel support KVM_FEATURE_WAIT_FOR_KICK? */ + if (!kvm_para_has_feature(KVM_FEATURE_WAIT_FOR_KICK)) + return; + + jump_label_inc(¶virt_ticketlocks_enabled); + + pv_lock_ops.lock_spinning = PV_CALLEE_SAVE(kvm_lock_spinning); + pv_lock_ops.unlock_kick = kvm_unlock_kick; +} +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Greg KH
2011-Oct-23 22:19 UTC
[Xen-devel] Re: [PATCH RFC V2 2/5] debugfs: Renaming of xen functions and change unsigned to u32
On Mon, Oct 24, 2011 at 12:34:59AM +0530, Raghavendra K T wrote:> Renaming of xen functions and change unsigned to u32.Why not just rename when you move the functions? Why the extra step? greg k-h _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Greg KH
2011-Oct-23 22:20 UTC
[Xen-devel] Re: [PATCH RFC V2 1/5] debugfs: Add support to print u32 array in debugfs
On Mon, Oct 24, 2011 at 12:34:04AM +0530, Raghavendra K T wrote:> Add debugfs support to print u32-arrays in debugfs. Move the code from Xen to debugfs > to make the code common for other users as well.You forgot the kerneldoc for the function explaining what it is and how to use it, and the EXPORT_SYMBOL_GPL() marking for the global function as that''s the only way it will be able to be used, right? thanks, greg k-h _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-24 09:28 UTC
[Xen-devel] Re: [PATCH RFC V2 2/5] debugfs: Renaming of xen functions and change unsigned to u32
On 10/24/2011 03:49 AM, Greg KH wrote:> On Mon, Oct 24, 2011 at 12:34:59AM +0530, Raghavendra K T wrote: >> Renaming of xen functions and change unsigned to u32. > > Why not just rename when you move the functions? Why the extra step? >Intention was only clarity. Yes, if this patch is an overhead, I ''ll combine both the patches.> greg k-h >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-24 09:30 UTC
[Xen-devel] Re: [PATCH RFC V2 1/5] debugfs: Add support to print u32 array in debugfs
On 10/24/2011 03:50 AM, Greg KH wrote:> On Mon, Oct 24, 2011 at 12:34:04AM +0530, Raghavendra K T wrote: >> Add debugfs support to print u32-arrays in debugfs. Move the code from Xen to debugfs >> to make the code common for other users as well. > > You forgot the kerneldoc for the function explaining what it is and how > to use it, and the EXPORT_SYMBOL_GPL() marking for the global function > as that''s the only way it will be able to be used, right? >Greg right. Thanks for finding this. I ''ll update the patch for that.> thanks, > > greg k-h >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-24 09:33 UTC
[Xen-devel] Re: [PATCH RFC V2 5/5] kvm guest : pv-ticketlocks support for linux guests running on KVM hypervisor
On 10/24/2011 03:31 PM, Sasha Levin wrote:> On Mon, 2011-10-24 at 00:37 +0530, Raghavendra K T wrote: >> +#else /* CONFIG_PARAVIRT_SPINLOCKS */ >> +#define kvm_guest_early_init() do { } while (0) > > This should be defined as an empty function. >Yes Agree, I ''ll change to an empty function. - Raghu _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Greg KH
2011-Oct-24 09:38 UTC
[Xen-devel] Re: [PATCH RFC V2 2/5] debugfs: Renaming of xen functions and change unsigned to u32
On Mon, Oct 24, 2011 at 02:58:47PM +0530, Raghavendra K T wrote:> On 10/24/2011 03:49 AM, Greg KH wrote: > >On Mon, Oct 24, 2011 at 12:34:59AM +0530, Raghavendra K T wrote: > >>Renaming of xen functions and change unsigned to u32. > > > >Why not just rename when you move the functions? Why the extra step? > > > Intention was only clarity. Yes, if this patch is an overhead, I ''ll > combine both the patches.Yeah, it makes more sense as it originally confused me why you were adding a "xen_*" function to the debugfs core code :) thanks, greg k-h _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Sasha Levin
2011-Oct-24 10:01 UTC
[Xen-devel] Re: [PATCH RFC V2 5/5] kvm guest : pv-ticketlocks support for linux guests running on KVM hypervisor
On Mon, 2011-10-24 at 00:37 +0530, Raghavendra K T wrote:> This patch extends Linux guests running on KVM hypervisor to support > pv-ticketlocks. Very early during bootup, paravirtualied KVM guest detects if > the hypervisor has required feature (KVM_FEATURE_WAIT_FOR_KICK) to support > pv-ticketlocks. If so, support for pv-ticketlocks is registered via pv_lock_ops. > > Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> > Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> > Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> > --- > diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h > index 2874c19..c7f34b7 100644 > --- a/arch/x86/include/asm/kvm_para.h > +++ b/arch/x86/include/asm/kvm_para.h > @@ -195,10 +195,18 @@ void kvm_async_pf_task_wait(u32 token); > void kvm_async_pf_task_wake(u32 token); > u32 kvm_read_and_reset_pf_reason(void); > extern void kvm_disable_steal_time(void); > -#else > + > +#ifdef CONFIG_PARAVIRT_SPINLOCKS > +void __init kvm_guest_early_init(void); > +#else /* CONFIG_PARAVIRT_SPINLOCKS */ > +#define kvm_guest_early_init() do { } while (0)This should be defined as an empty function. -- Sasha. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2011-Oct-24 10:14 UTC
[Xen-devel] Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
On 10/23/2011 09:05 PM, Raghavendra K T wrote:> Add two hypercalls to KVM hypervisor to support pv-ticketlocks. > > KVM_HC_WAIT_FOR_KICK blocks the calling vcpu until another vcpu kicks it or it > is woken up because of an event like interrupt. > > KVM_HC_KICK_CPU allows the calling vcpu to kick another vcpu. > > The presence of these hypercalls is indicated to guest via > KVM_FEATURE_WAIT_FOR_KICK/KVM_CAP_WAIT_FOR_KICK. > > Qemu needs a corresponding patch to pass up the presence of this feature to > guest via cpuid. Patch to qemu will be sent separately. > > There is no Xen/KVM hypercall interface to await kick from. > > > +/* > + * kvm_pv_wait_for_kick_op : Block until kicked by either a KVM_HC_KICK_CPU > + * hypercall or a event like interrupt. > + * > + * @vcpu : vcpu which is blocking. > + */ > +static void kvm_pv_wait_for_kick_op(struct kvm_vcpu *vcpu) > +{ > + DEFINE_WAIT(wait); > + > + /* > + * Blocking on vcpu->wq allows us to wake up sooner if required to > + * service pending events (like interrupts). > + * > + * Also set state to TASK_INTERRUPTIBLE before checking vcpu->kicked to > + * avoid racing with kvm_pv_kick_cpu_op(). > + */ > + prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE); > + > + /* > + * Somebody has already tried kicking us. Acknowledge that > + * and terminate the wait. > + */ > + if (vcpu->kicked) { > + vcpu->kicked = 0; > + goto end_wait; > + } > + > + /* Let''s wait for either KVM_HC_KICK_CPU or someother event > + * to wake us up. > + */ > + > + srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx); > + schedule(); > + vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); > + > +end_wait: > + finish_wait(&vcpu->wq, &wait); > +}This hypercall can be replaced by a HLT instruction, no? I''m pretty sure this misses a lot of stuff from kvm_vcpu_block().> + > +/* > + * kvm_pv_kick_cpu_op: Kick a vcpu. > + * > + * @cpu - vcpu to be kicked. > + */ > +static void kvm_pv_kick_cpu_op(struct kvm *kvm, int cpu) > +{ > + struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, cpu); > +Is the vcpu number meaningful? We should reuse an existing identifier like the APIC ID.> + if (vcpu) { > + vcpu->kicked = 1;Need to use smp memory barriers here.> + wake_up_interruptible(&vcpu->wq); > + } > +} > + > int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) > { > unsigned long nr, a0, a1, a2, a3, ret; >-- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2011-Oct-24 10:15 UTC
[Xen-devel] Re: [PATCH RFC V2 4/5] kvm guest : Added configuration support to enable debug information for KVM Guests
On 10/23/2011 09:07 PM, Raghavendra K T wrote:> Added configuration support to enable debug information > for KVM Guests in debugfs > > Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> > Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> > Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> > --- > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index 1f03f82..ed34269 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -562,6 +562,15 @@ config KVM_GUEST > This option enables various optimizations for running under the KVM > hypervisor. > > +config KVM_DEBUG_FS > + bool "Enable debug information for KVM Guests in debugfs" > + depends on KVM_GUEST > + default n > + ---help--- > + This option enables collection of various statistics for KVM guest. > + Statistics are displayed in debugfs filesystem. Enabling this option > + may incur significant overhead. > + > source "arch/x86/lguest/Kconfig" > >This might be better implemented through tracepoints, which an be enabled dynamically. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-24 11:20 UTC
[Xen-devel] Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
On 10/24/2011 03:44 PM, Avi Kivity wrote:> On 10/23/2011 09:05 PM, Raghavendra K T wrote: >> Add two hypercalls to KVM hypervisor to support pv-ticketlocks. >> + >> +end_wait: >> + finish_wait(&vcpu->wq,&wait); >> +} > > This hypercall can be replaced by a HLT instruction, no? > > I''m pretty sure this misses a lot of stuff from kvm_vcpu_block().Yes.. agree. HLT sounds better idea. ''ll try this out.> >> + if (vcpu) { >> + vcpu->kicked = 1; > > Need to use smp memory barriers here.Agree.> >> + wake_up_interruptible(&vcpu->wq); >> + } >> +} >> + >> int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) >> { >> unsigned long nr, a0, a1, a2, a3, ret; >> >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Srivatsa Vaddagiri
2011-Oct-24 12:27 UTC
[Xen-devel] Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
* Avi Kivity <avi@redhat.com> [2011-10-24 12:14:21]:> > +/* > > + * kvm_pv_wait_for_kick_op : Block until kicked by either a KVM_HC_KICK_CPU > > + * hypercall or a event like interrupt. > > + * > > + * @vcpu : vcpu which is blocking. > > + */ > > +static void kvm_pv_wait_for_kick_op(struct kvm_vcpu *vcpu) > > +{[snip]> > +} > > This hypercall can be replaced by a HLT instruction, no?Good point. Assuming yield_on_hlt=1, that would allow the vcpu to be put to sleep and let other vcpus make progress. I guess with that change, we can also dropthe need for other hypercall introduced in this patch (kvm_pv_kick_cpu_op()). Essentially a vcpu sleeping because of HLT instruction can be woken up by a IPI issued by vcpu releasing a lock. - vatsa _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2011-Oct-24 13:09 UTC
[Xen-devel] Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
On 10/24/2011 02:27 PM, Srivatsa Vaddagiri wrote:> Good point. Assuming yield_on_hlt=1, that would allow the vcpu to be put > to sleep and let other vcpus make progress. > > I guess with that change, we can also dropthe need for other hypercall > introduced in this patch (kvm_pv_kick_cpu_op()). Essentially a vcpu sleeping > because of HLT instruction can be woken up by a IPI issued by vcpu releasing a > lock.Not if interrupts are disabled. My original plan was to use NMIs for wakeups, but it turns out NMIs can be coalesced under certain rare circumstances; this requires workarounds by the generic NMI code that make NMIs too slow. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Srivatsa Vaddagiri
2011-Oct-24 13:50 UTC
[Xen-devel] Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
* Avi Kivity <avi@redhat.com> [2011-10-24 15:09:25]:> > I guess with that change, we can also dropthe need for other hypercall > > introduced in this patch (kvm_pv_kick_cpu_op()). Essentially a vcpu sleeping > > because of HLT instruction can be woken up by a IPI issued by vcpu releasing a > > lock. > > Not if interrupts are disabled.Hmm yes ..so we need a kick hypercall then. - vatsa _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-25 18:11 UTC
[Xen-devel] Re: [PATCH RFC V2 4/5] kvm guest : Added configuration support to enable debug information for KVM Guests
On 10/24/2011 03:45 PM, Avi Kivity wrote:> On 10/23/2011 09:07 PM, Raghavendra K T wrote: >> Added configuration support to enable debug information >> for KVM Guests in debugfs >> >> Signed-off-by: Srivatsa Vaddagiri<vatsa@linux.vnet.ibm.com> >> Signed-off-by: Suzuki Poulose<suzuki@in.ibm.com> >> Signed-off-by: Raghavendra K T<raghavendra.kt@linux.vnet.ibm.com> >> --- >> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig >> index 1f03f82..ed34269 100644 >> --- a/arch/x86/Kconfig >> +++ b/arch/x86/Kconfig >> @@ -562,6 +562,15 @@ config KVM_GUEST >> This option enables various optimizations for running under the KVM >> hypervisor. >> >> +config KVM_DEBUG_FS >> + bool "Enable debug information for KVM Guests in debugfs" >> + depends on KVM_GUEST >> + default n >> + ---help--- >> + This option enables collection of various statistics for KVM guest. >> + Statistics are displayed in debugfs filesystem. Enabling this option >> + may incur significant overhead. >> + >> source "arch/x86/lguest/Kconfig" >> >> > > This might be better implemented through tracepoints, which an be > enabled dynamically. >Interesting option. I ''ll explore this one. Thanks for pointing. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-25 18:24 UTC
[Xen-devel] Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
On 10/24/2011 07:20 PM, Srivatsa Vaddagiri wrote:> * Avi Kivity<avi@redhat.com> [2011-10-24 15:09:25]: > >>> I guess with that change, we can also dropthe need for other hypercall >>> introduced in this patch (kvm_pv_kick_cpu_op()). Essentially a vcpu sleeping >>> because of HLT instruction can be woken up by a IPI issued by vcpu releasing a >>> lock. >> >> Not if interrupts are disabled. > > Hmm yes ..so we need a kick hypercall then. >So then do also you foresee the need for directed yield at some point, to address LHP? provided we have good improvements to prove. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Oct-25 18:30 UTC
[Xen-devel] Re: [PATCH RFC V2 4/5] kvm guest : Added configuration support to enable debug information for KVM Guests
On 10/24/2011 03:15 AM, Avi Kivity wrote:> On 10/23/2011 09:07 PM, Raghavendra K T wrote: >> Added configuration support to enable debug information >> for KVM Guests in debugfs >> >> Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> >> Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> >> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> >> --- >> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig >> index 1f03f82..ed34269 100644 >> --- a/arch/x86/Kconfig >> +++ b/arch/x86/Kconfig >> @@ -562,6 +562,15 @@ config KVM_GUEST >> This option enables various optimizations for running under the KVM >> hypervisor. >> >> +config KVM_DEBUG_FS >> + bool "Enable debug information for KVM Guests in debugfs" >> + depends on KVM_GUEST >> + default n >> + ---help--- >> + This option enables collection of various statistics for KVM guest. >> + Statistics are displayed in debugfs filesystem. Enabling this option >> + may incur significant overhead. >> + >> source "arch/x86/lguest/Kconfig" >> >> > This might be better implemented through tracepoints, which an be > enabled dynamically.Tracepoints use spinlocks, so that could get awkward. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Oct-25 18:34 UTC
[Xen-devel] Re: [PATCH RFC V2 5/5] kvm guest : pv-ticketlocks support for linux guests running on KVM hypervisor
On 10/23/2011 12:07 PM, Raghavendra K T wrote:> This patch extends Linux guests running on KVM hypervisor to support > pv-ticketlocks. Very early during bootup, paravirtualied KVM guest detects if > the hypervisor has required feature (KVM_FEATURE_WAIT_FOR_KICK) to support > pv-ticketlocks. If so, support for pv-ticketlocks is registered via pv_lock_ops. > > Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> > Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> > Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> > --- > diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h > index 2874c19..c7f34b7 100644 > --- a/arch/x86/include/asm/kvm_para.h > +++ b/arch/x86/include/asm/kvm_para.h > @@ -195,10 +195,18 @@ void kvm_async_pf_task_wait(u32 token); > void kvm_async_pf_task_wake(u32 token); > u32 kvm_read_and_reset_pf_reason(void); > extern void kvm_disable_steal_time(void); > -#else > + > +#ifdef CONFIG_PARAVIRT_SPINLOCKS > +void __init kvm_guest_early_init(void); > +#else /* CONFIG_PARAVIRT_SPINLOCKS */ > +#define kvm_guest_early_init() do { } while (0) > +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ > + > +#else /* CONFIG_KVM_GUEST */ > #define kvm_guest_init() do { } while (0) > #define kvm_async_pf_task_wait(T) do {} while(0) > #define kvm_async_pf_task_wake(T) do {} while(0) > +#define kvm_guest_early_init() do { } while (0) > static inline u32 kvm_read_and_reset_pf_reason(void) > { > return 0; > diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c > index 3bb0850..fb25bca 100644 > --- a/arch/x86/kernel/head32.c > +++ b/arch/x86/kernel/head32.c > @@ -9,6 +9,7 @@ > #include <linux/start_kernel.h> > #include <linux/mm.h> > #include <linux/memblock.h> > +#include <linux/kvm_para.h> > > #include <asm/setup.h> > #include <asm/sections.h> > @@ -59,6 +60,8 @@ void __init i386_start_kernel(void) > break; > } > > + kvm_guest_early_init(); > + > /* > * At this point everything still needed from the boot loader > * or BIOS or kernel text should be early reserved or marked not > diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c > index 5655c22..cabf8ec 100644 > --- a/arch/x86/kernel/head64.c > +++ b/arch/x86/kernel/head64.c > @@ -13,6 +13,7 @@ > #include <linux/start_kernel.h> > #include <linux/io.h> > #include <linux/memblock.h> > +#include <linux/kvm_para.h> > > #include <asm/processor.h> > #include <asm/proto.h> > @@ -115,6 +116,8 @@ void __init x86_64_start_reservations(char *real_mode_data) > > reserve_ebda_region(); > > + kvm_guest_early_init(); > + > /* > * At this point everything still needed from the boot loader > * or BIOS or kernel text should be early reserved or marked not > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > index a9c2116..f4f341f 100644 > --- a/arch/x86/kernel/kvm.c > +++ b/arch/x86/kernel/kvm.c > @@ -39,6 +39,16 @@ > #include <asm/desc.h> > #include <asm/tlbflush.h> > > +#ifdef CONFIG_PARAVIRT_SPINLOCKS > + > +#ifdef CONFIG_KVM_DEBUG_FS > + > +#include <linux/debugfs.h> > + > +#endif /* CONFIG_KVM_DEBUG_FS */ > + > +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ > + > #define MMU_QUEUE_SIZE 1024 > > static int kvmapf = 1; > @@ -627,3 +637,240 @@ static __init int activate_jump_labels(void) > return 0; > } > arch_initcall(activate_jump_labels); > + > +#ifdef CONFIG_PARAVIRT_SPINLOCKS > + > +#ifdef CONFIG_KVM_DEBUG_FS > + > +static struct kvm_spinlock_stats > +{ > + u32 taken_slow; > + u32 taken_slow_pickup; > + > + u32 released_slow; > + u32 released_slow_kicked; > + > +#define HISTO_BUCKETS 30 > + u32 histo_spin_blocked[HISTO_BUCKETS+1]; > + > + u64 time_blocked; > +} spinlock_stats; > + > +static u8 zero_stats; > + > +static inline void check_zero(void) > +{ > + if (unlikely(zero_stats)) { > + memset(&spinlock_stats, 0, sizeof(spinlock_stats)); > + zero_stats = 0; > + } > +} > + > +#define ADD_STATS(elem, val) \ > + do { check_zero(); spinlock_stats.elem += (val); } while (0) > + > +static inline u64 spin_time_start(void) > +{ > + return sched_clock(); > +} > + > +static void __spin_time_accum(u64 delta, u32 *array) > +{ > + unsigned index = ilog2(delta); > + > + check_zero(); > + > + if (index < HISTO_BUCKETS) > + array[index]++; > + else > + array[HISTO_BUCKETS]++; > +} > + > +static inline void spin_time_accum_blocked(u64 start) > +{ > + u32 delta = sched_clock() - start; > + > + __spin_time_accum(delta, spinlock_stats.histo_spin_blocked); > + spinlock_stats.time_blocked += delta; > +} > + > +static struct dentry *d_spin_debug; > +static struct dentry *d_kvm_debug; > + > +struct dentry *kvm_init_debugfs(void) > +{ > + d_kvm_debug = debugfs_create_dir("kvm", NULL); > + if (!d_kvm_debug) > + printk(KERN_WARNING "Could not create ''kvm'' debugfs directory\n"); > + > + return d_kvm_debug; > +} > + > +static int __init kvm_spinlock_debugfs(void) > +{ > + struct dentry *d_kvm = kvm_init_debugfs(); > + > + if (d_kvm == NULL) > + return -ENOMEM; > + > + d_spin_debug = debugfs_create_dir("spinlocks", d_kvm); > + > + debugfs_create_u8("zero_stats", 0644, d_spin_debug, &zero_stats); > + > + debugfs_create_u32("taken_slow", 0444, d_spin_debug, > + &spinlock_stats.taken_slow); > + debugfs_create_u32("taken_slow_pickup", 0444, d_spin_debug, > + &spinlock_stats.taken_slow_pickup); > + > + debugfs_create_u32("released_slow", 0444, d_spin_debug, > + &spinlock_stats.released_slow); > + debugfs_create_u32("released_slow_kicked", 0444, d_spin_debug, > + &spinlock_stats.released_slow_kicked); > + > + debugfs_create_u64("time_blocked", 0444, d_spin_debug, > + &spinlock_stats.time_blocked); > + > + debugfs_create_u32_array("histo_blocked", 0444, d_spin_debug, > + spinlock_stats.histo_spin_blocked, HISTO_BUCKETS + 1); > + > + return 0; > +} > +fs_initcall(kvm_spinlock_debugfs); > +#else /* !CONFIG_KVM_DEBUG_FS */ > +#define TIMEOUT (1 << 10) > +#define ADD_STATS(elem, val) do { (void)(val); } while (0) > + > +static inline u64 spin_time_start(void) > +{ > + return 0; > +} > + > +static inline void spin_time_accum_blocked(u64 start) > +{ > +} > +#endif /* CONFIG_KVM_DEBUG_FS */ > + > +struct kvm_lock_waiting { > + struct arch_spinlock *lock; > + __ticket_t want; > +}; > + > +/* cpus ''waiting'' on a spinlock to become available */ > +static cpumask_t waiting_cpus; > + > +/* Track spinlock on which a cpu is waiting */ > +static DEFINE_PER_CPU(struct kvm_lock_waiting, lock_waiting); > + > +static inline void kvm_wait_for_kick(void) > +{ > + kvm_hypercall0(KVM_HC_WAIT_FOR_KICK); > +} > + > +static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want) > +{ > + struct kvm_lock_waiting *w = &__get_cpu_var(lock_waiting); > + int cpu = smp_processor_id(); > + u64 start; > + unsigned long flags; > + > + start = spin_time_start(); > + > + /* > + * Make sure an interrupt handler can''t upset things in a > + * partially setup state. > + */ > + local_irq_save(flags); > + > + /* > + * The ordering protocol on this is that the "lock" pointer > + * may only be set non-NULL if the "want" ticket is correct. > + * If we''re updating "want", we must first clear "lock". > + */ > + w->lock = NULL; > + smp_wmb(); > + w->want = want; > + smp_wmb(); > + w->lock = lock; > + > + ADD_STATS(taken_slow, 1); > + > + /* > + * This uses set_bit, which is atomic but we should not rely on its > + * reordering gurantees. So barrier is needed after this call. > + */ > + cpumask_set_cpu(cpu, &waiting_cpus); > + > + barrier(); > + > + /* > + * Mark entry to slowpath before doing the pickup test to make > + * sure we don''t deadlock with an unlocker. > + */ > + __ticket_enter_slowpath(lock); > + > + /* > + * check again make sure it didn''t become free while > + * we weren''t looking. > + */ > + if (ACCESS_ONCE(lock->tickets.head) == want) { > + ADD_STATS(taken_slow_pickup, 1); > + goto out; > + } > + > + /* Allow interrupts while blocked */ > + local_irq_restore(flags); > + > + kvm_wait_for_kick(); > + > + local_irq_save(flags); > +out: > + cpumask_clear_cpu(cpu, &waiting_cpus); > + w->lock = NULL; > + local_irq_restore(flags); > + spin_time_accum_blocked(start); > +} > +PV_CALLEE_SAVE_REGS_THUNK(kvm_lock_spinning); > + > +/* Kick a cpu */ > +static inline void kvm_kick_cpu(int cpu) > +{ > + kvm_hypercall1(KVM_HC_KICK_CPU, cpu); > +} > + > +/* Kick vcpu waiting on @lock->head to reach value @ticket */ > +static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket) > +{ > + int cpu; > + > + ADD_STATS(released_slow, 1); > + > + for_each_cpu(cpu, &waiting_cpus) { > + const struct kvm_lock_waiting *w = &per_cpu(lock_waiting, cpu); > + if (ACCESS_ONCE(w->lock) == lock && > + ACCESS_ONCE(w->want) == ticket) { > + ADD_STATS(released_slow_kicked, 1); > + kvm_kick_cpu(cpu); > + break; > + } > + } > +} > + > +/* > + * Setup pv_lock_ops to exploit KVM_FEATURE_WAIT_FOR_KICK if present. > + * This needs to be setup really early in boot, before the first call to > + * spinlock is issued!Actually, it doesn''t matter that much. The in-memory format is the same for regular and PV spinlocks, and the PV paths only come into play if the "slowpath" flag is set in the lock, which it never will be by the non-PV code. In principle, you could defer initializing PV ticketlocks until some arbitrarily late point if you notice that the system is oversubscribed enough to require it. The main constraint at present is that you need to update the pv_lock_ops structure before pvops patching happens, or you won''t see any effect from making changes.> + */ > +void __init kvm_guest_early_init(void) > +{ > + if (!kvm_para_available()) > + return; > + /* Does host kernel support KVM_FEATURE_WAIT_FOR_KICK? */ > + if (!kvm_para_has_feature(KVM_FEATURE_WAIT_FOR_KICK)) > + return; > + > + jump_label_inc(¶virt_ticketlocks_enabled); > + > + pv_lock_ops.lock_spinning = PV_CALLEE_SAVE(kvm_lock_spinning); > + pv_lock_ops.unlock_kick = kvm_unlock_kick; > +} > +#endif /* CONFIG_PARAVIRT_SPINLOCKS */J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Oct-25 18:35 UTC
[Xen-devel] Re: [PATCH RFC V2 5/5] kvm guest : pv-ticketlocks support for linux guests running on KVM hypervisor
On 10/23/2011 12:07 PM, Raghavendra K T wrote:> This patch extends Linux guests running on KVM hypervisor to support > pv-ticketlocks. Very early during bootup, paravirtualied KVM guest detects if > the hypervisor has required feature (KVM_FEATURE_WAIT_FOR_KICK) to support > pv-ticketlocks. If so, support for pv-ticketlocks is registered via pv_lock_ops. > > Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> > Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com> > Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> > --- > diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h > index 2874c19..c7f34b7 100644 > --- a/arch/x86/include/asm/kvm_para.h > +++ b/arch/x86/include/asm/kvm_para.h > @@ -195,10 +195,18 @@ void kvm_async_pf_task_wait(u32 token); > void kvm_async_pf_task_wake(u32 token); > u32 kvm_read_and_reset_pf_reason(void); > extern void kvm_disable_steal_time(void); > -#else > + > +#ifdef CONFIG_PARAVIRT_SPINLOCKS > +void __init kvm_guest_early_init(void); > +#else /* CONFIG_PARAVIRT_SPINLOCKS */ > +#define kvm_guest_early_init() do { } while (0) > +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ > + > +#else /* CONFIG_KVM_GUEST */ > #define kvm_guest_init() do { } while (0) > #define kvm_async_pf_task_wait(T) do {} while(0) > #define kvm_async_pf_task_wake(T) do {} while(0) > +#define kvm_guest_early_init() do { } while (0) > static inline u32 kvm_read_and_reset_pf_reason(void) > { > return 0; > diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c > index 3bb0850..fb25bca 100644 > --- a/arch/x86/kernel/head32.c > +++ b/arch/x86/kernel/head32.c > @@ -9,6 +9,7 @@ > #include <linux/start_kernel.h> > #include <linux/mm.h> > #include <linux/memblock.h> > +#include <linux/kvm_para.h> > > #include <asm/setup.h> > #include <asm/sections.h> > @@ -59,6 +60,8 @@ void __init i386_start_kernel(void) > break; > } > > + kvm_guest_early_init(); > + > /* > * At this point everything still needed from the boot loader > * or BIOS or kernel text should be early reserved or marked not > diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c > index 5655c22..cabf8ec 100644 > --- a/arch/x86/kernel/head64.c > +++ b/arch/x86/kernel/head64.c > @@ -13,6 +13,7 @@ > #include <linux/start_kernel.h> > #include <linux/io.h> > #include <linux/memblock.h> > +#include <linux/kvm_para.h> > > #include <asm/processor.h> > #include <asm/proto.h> > @@ -115,6 +116,8 @@ void __init x86_64_start_reservations(char *real_mode_data) > > reserve_ebda_region(); > > + kvm_guest_early_init(); > + > /* > * At this point everything still needed from the boot loader > * or BIOS or kernel text should be early reserved or marked not > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > index a9c2116..f4f341f 100644 > --- a/arch/x86/kernel/kvm.c > +++ b/arch/x86/kernel/kvm.c > @@ -39,6 +39,16 @@ > #include <asm/desc.h> > #include <asm/tlbflush.h> > > +#ifdef CONFIG_PARAVIRT_SPINLOCKS > + > +#ifdef CONFIG_KVM_DEBUG_FS > + > +#include <linux/debugfs.h> > + > +#endif /* CONFIG_KVM_DEBUG_FS */ > + > +#endif /* CONFIG_PARAVIRT_SPINLOCKS */This is a big mess. Is there any problem with including linux/debugfs.h unconditionally? Or at least using "#if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_KVM_DEBUG_FS)"? J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2011-Oct-26 10:34 UTC
[Xen-devel] Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
On 10/25/2011 08:24 PM, Raghavendra K T wrote:> > So then do also you foresee the need for directed yield at some point, > to address LHP? provided we have good improvements to prove.Doesn''t this patchset completely eliminate lock holder preemption? -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-26 19:08 UTC
[Xen-devel] Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
On 10/26/2011 04:04 PM, Avi Kivity wrote:> On 10/25/2011 08:24 PM, Raghavendra K T wrote:CCing Ryan also>> >> So then do also you foresee the need for directed yield at some point, >> to address LHP? provided we have good improvements to prove. > > Doesn''t this patchset completely eliminate lock holder preemption? >Basically I was curious whether we can do more better with your directed yield discussions in https://lkml.org/lkml/2010/8/2/106 . I felt we can get little more improvement with doing directed yield to lock-holder in case of LHP than sleeping. But I may be wrong. So wanted to get the feedback, on whether I am thinking in right direction. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-26 19:23 UTC
[Xen-devel] Re: [PATCH RFC V2 5/5] kvm guest : pv-ticketlocks support for linux guests running on KVM hypervisor
On 10/26/2011 12:04 AM, Jeremy Fitzhardinge wrote:> On 10/23/2011 12:07 PM, Raghavendra K T wrote: >> This patch extends Linux guests running on KVM hypervisor to support >> +/* >> + * Setup pv_lock_ops to exploit KVM_FEATURE_WAIT_FOR_KICK if present. >> + * This needs to be setup really early in boot, before the first call to >> + * spinlock is issued! > > Actually, it doesn''t matter that much. The in-memory format is the same > for regular and PV spinlocks, and the PV paths only come into play if > the "slowpath" flag is set in the lock, which it never will be by the > non-PV code. > > In principle, you could defer initializing PV ticketlocks until some > arbitrarily late point if you notice that the system is oversubscribed > enough to require it.ok.. so this means it will not affect even if it is initialized in middle somewhere, but better to do it before we start seeing lock contention. our current aim was to have before any printk happens. So I ''ll trim the comment to somethings like : Setup pv_lock_ops to exploit KVM_FEATURE_WAIT_FOR_KICK if present. This needs to be setup early in boot. ?> > The main constraint at present is that you need to update the > pv_lock_ops structure before pvops patching happens, or you won''t see > any effect from making changes.Hmm. got it. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-26 19:25 UTC
[Xen-devel] Re: [PATCH RFC V2 5/5] kvm guest : pv-ticketlocks support for linux guests running on KVM hypervisor
On 10/26/2011 12:05 AM, Jeremy Fitzhardinge wrote:> On 10/23/2011 12:07 PM, Raghavendra K T wrote: >> >> +#ifdef CONFIG_PARAVIRT_SPINLOCKS >> + >> +#ifdef CONFIG_KVM_DEBUG_FS >> + >> +#include<linux/debugfs.h> >> + >> +#endif /* CONFIG_KVM_DEBUG_FS */ >> + >> +#endif /* CONFIG_PARAVIRT_SPINLOCKS */ > > This is a big mess. Is there any problem with including linux/debugfs.h > unconditionally? Or at least using "#if > defined(CONFIG_PARAVIRT_SPINLOCKS)&& defined(CONFIG_KVM_DEBUG_FS)"? >I agree. I ''ll include unconditionally unless somebody objects.> J_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Oct-26 19:46 UTC
[Xen-devel] Re: [PATCH RFC V2 5/5] kvm guest : pv-ticketlocks support for linux guests running on KVM hypervisor
On 10/26/2011 12:23 PM, Raghavendra K T wrote:> On 10/26/2011 12:04 AM, Jeremy Fitzhardinge wrote: >> On 10/23/2011 12:07 PM, Raghavendra K T wrote: >>> This patch extends Linux guests running on KVM hypervisor to support >>> +/* >>> + * Setup pv_lock_ops to exploit KVM_FEATURE_WAIT_FOR_KICK if present. >>> + * This needs to be setup really early in boot, before the first >>> call to >>> + * spinlock is issued! >> >> Actually, it doesn''t matter that much. The in-memory format is the same >> for regular and PV spinlocks, and the PV paths only come into play if >> the "slowpath" flag is set in the lock, which it never will be by the >> non-PV code. >> >> In principle, you could defer initializing PV ticketlocks until some >> arbitrarily late point if you notice that the system is oversubscribed >> enough to require it. > > ok.. so this means it will not affect even if it is initialized in > middle somewhere, but better to do it before we start seeing lock > contention.Right. Or more specifically, lock contention while you have VCPU overcommit.> our current aim was to have before any printk happens. > So I ''ll trim the comment to somethings like : > > Setup pv_lock_ops to exploit KVM_FEATURE_WAIT_FOR_KICK if present. > This needs to be setup early in boot. ?You can hook the smp_ops.smp_prepare_cpus call and initialize it there. There''s no need to add new hook code. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Raghavendra K T
2011-Oct-26 19:55 UTC
[Xen-devel] Re: [PATCH RFC V2 5/5] kvm guest : pv-ticketlocks support for linux guests running on KVM hypervisor
On 10/27/2011 01:16 AM, Jeremy Fitzhardinge wrote:> On 10/26/2011 12:23 PM, Raghavendra K T wrote: >> On 10/26/2011 12:04 AM, Jeremy Fitzhardinge wrote: >>> On 10/23/2011 12:07 PM, Raghavendra K T wrote: > >> our current aim was to have before any printk happens. >> So I ''ll trim the comment to somethings like : >> >> Setup pv_lock_ops to exploit KVM_FEATURE_WAIT_FOR_KICK if present. >> This needs to be setup early in boot. ? > > You can hook the smp_ops.smp_prepare_cpus call and initialize it there. > There''s no need to add new hook code. >Ok.. Thanks for pointer. ''ll check on that.> J >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Oct-26 20:09 UTC
[Xen-devel] Re: [PATCH RFC V2 3/5] kvm hypervisor : Add two hypercalls to support pv-ticketlock
On 10/26/2011 03:34 AM, Avi Kivity wrote:> On 10/25/2011 08:24 PM, Raghavendra K T wrote: >> So then do also you foresee the need for directed yield at some point, >> to address LHP? provided we have good improvements to prove. > Doesn''t this patchset completely eliminate lock holder preemption?Well, there''s the question of whether its better for someone waiting for a contended lock to just go to sleep and rely on the scheduler to give CPU time to whoever currently has the lock, or if the scheduler needs a little hint to boost the lock holder by giving it the waiter''s timeslice. I tend to prefer the former, since there''s no reason to suppose that the the lock holder vcpu is necessarily the scheduler''s top priority, and it may want to schedule something else anyway. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel