The first patch is Ian's fix for ARM xen_ulong_t, which is not in tree at the moment.

The Linux side patches are working. Since any change on the hypervisor side requires a corresponding change on the kernel side, I won't bother posting them at this point.

Changes since V3:
 * Dedicated EVTCHNOP for extended ABI query
 * Dedicated EVTCHNOP for 3-level ABI registration
 * 3-level ABI is registered in two phases:
    * register the bitmaps
    * register per-cpu L2 selector
 * libxl: evtchn_extended -> evtchn_extended_allowed

Changes since V2:
 * new interface to register extended event channel ABI
 * use vmap to simplify mapping
 * replace MAX_EVTCHNS macro with inline function
 * libxl: evtchn_l3 -> evtchn_extended

The most notable bit of this series is the interface change. In order to cope with future ABIs, the interface is renamed to EVTCHNOP_register_extended. It also provides a query for the supported ABIs, so that we can remove unused ABIs in the future. The semantics of EVTCHNOP_register_extended also change a bit: the `level' IN parameter becomes `cmd', which selects the specific ABI routine to go down to. ABI-specific structures are still embedded in the union.

Changes since V1:
 * move all evtchn related macros / struct definitions to event.h
 * only allow 3-level evtchn for Dom0 and driver domains
 * add evtchn_l3 flag in libxl

Diffstat:

 docs/man/xl.cfg.pod.5              |   10 +
 tools/libxl/libxl_create.c         |    4 +
 tools/libxl/libxl_types.idl        |    1 +
 tools/libxl/xl_cmdimpl.c           |    3 +
 xen/arch/arm/domain.c              |    1 +
 xen/arch/x86/domain.c              |    1 +
 xen/arch/x86/irq.c                 |    7 +-
 xen/common/domain.c                |    3 +
 xen/common/domctl.c                |    6 +-
 xen/common/event_channel.c         |  458 +++++++++++++++++++++++++++++++++---
 xen/common/keyhandler.c            |    6 +-
 xen/common/schedule.c              |    4 +-
 xen/include/asm-arm/config.h       |    4 +
 xen/include/asm-x86/config.h       |    7 +-
 xen/include/public/domctl.h        |    3 +
 xen/include/public/event_channel.h |   48 ++++
 xen/include/public/xen.h           |   35 ++-
 xen/include/xen/event.h            |   82 ++++++-
 xen/include/xen/sched.h            |   65 ++---
 xen/xsm/flask/hooks.c              |    1 +
 20 files changed, 641 insertions(+), 108 deletions(-)
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 01/18] xen: correct BITS_PER_EVTCHN_WORD on arm
From: Ian Campbell <ian.campbell@citrix.com> This is always 64-bit on ARM, not BITS_PER_LONG Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: keir@xen.org Cc: tim@xen.org Cc: stefano.stabellini@citrix.com Cc: jbuelich@suse.com --- xen/include/asm-arm/config.h | 3 +++ xen/include/asm-x86/config.h | 2 ++ xen/include/xen/sched.h | 4 ++-- 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h index 3910dd2..8be8563 100644 --- a/xen/include/asm-arm/config.h +++ b/xen/include/asm-arm/config.h @@ -22,6 +22,9 @@ #define BYTES_PER_LONG (1 << LONG_BYTEORDER) #define BITS_PER_LONG (BYTES_PER_LONG << 3) +/* xen_ulong_t is always 64 bits */ +#define BITS_PER_XEN_ULONG 64 + #define CONFIG_PAGING_ASSISTANCE 1 #define CONFIG_PAGING_LEVELS 3 diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h index 0a5f031..cf93bd5 100644 --- a/xen/include/asm-x86/config.h +++ b/xen/include/asm-x86/config.h @@ -14,6 +14,8 @@ #define BITS_PER_LONG (BYTES_PER_LONG << 3) #define BITS_PER_BYTE 8 +#define BITS_PER_XEN_ULONG BITS_PER_LONG + #define CONFIG_X86 1 #define CONFIG_X86_HT 1 #define CONFIG_PAGING_ASSISTANCE 1 diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index e108436..ccd0496 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -46,9 +46,9 @@ DEFINE_XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t); extern struct domain *dom0; #ifndef CONFIG_COMPAT -#define BITS_PER_EVTCHN_WORD(d) BITS_PER_LONG +#define BITS_PER_EVTCHN_WORD(d) BITS_PER_XEN_ULONG #else -#define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_LONG) +#define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_XEN_ULONG) #endif #define MAX_EVTCHNS(d) (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d)) #define EVTCHNS_PER_BUCKET 128 -- 1.7.10.4
Affected files: * event_channel.c * sched.h * event.h * xen.h Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 16 ++++++++-------- xen/include/public/xen.h | 22 +++++++++++----------- xen/include/xen/event.h | 4 ++-- xen/include/xen/sched.h | 6 +++--- 4 files changed, 24 insertions(+), 24 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index a0f293f..dabfa9e 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1,15 +1,15 @@ /****************************************************************************** * event_channel.c - * + * * Event notifications from VIRQs, PIRQs, and other domains. - * + * * Copyright (c) 2003-2006, K A Fraser. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -238,7 +238,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) lchn->u.interdomain.remote_dom = rd; lchn->u.interdomain.remote_port = (u16)rport; lchn->state = ECS_INTERDOMAIN; - + rchn->u.interdomain.remote_dom = ld; rchn->u.interdomain.remote_port = (u16)lport; rchn->state = ECS_INTERDOMAIN; @@ -255,7 +255,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) spin_unlock(&ld->event_lock); if ( ld != rd ) spin_unlock(&rd->event_lock); - + rcu_unlock_domain(rd); return rc; @@ -633,7 +633,7 @@ static void evtchn_set_pending(struct vcpu *v, int port) { vcpu_mark_events_pending(v); } - + /* Check if some VCPU might be polling for this event. */ if ( likely(bitmap_empty(d->poll_mask, d->max_vcpus)) ) return; @@ -930,7 +930,7 @@ int evtchn_unmask(unsigned int port) /* * These operations must happen in strict order. Based on - * include/xen/event.h:evtchn_set_pending(). + * include/xen/event.h:evtchn_set_pending(). */ if ( test_and_clear_bit(port, &shared_info(d, evtchn_mask)) && test_bit (port, &shared_info(d, evtchn_pending)) && diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h index e9431e2..ba9e1ab 100644 --- a/xen/include/public/xen.h +++ b/xen/include/public/xen.h @@ -1,8 +1,8 @@ /****************************************************************************** * xen.h - * + * * Guest OS interface to Xen. - * + * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to * deal in the Software without restriction, including without limitation the @@ -137,11 +137,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t); #define __HYPERVISOR_dom0_op __HYPERVISOR_platform_op #endif -/* +/* * VIRTUAL INTERRUPTS - * + * * Virtual interrupts that a guest OS may receive from Xen. - * + * * In the side comments, ''V.'' denotes a per-VCPU VIRQ while ''G.'' denotes a * global VIRQ. The former can be bound once per VCPU and cannot be re-bound. * The latter can be allocated only once per guest: they must initially be @@ -190,7 +190,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t); * (x) encodes the PFD as follows: * x == 0 => PFD == DOMID_SELF * x != 0 => PFD == x - 1 - * + * * Sub-commands: ptr[1:0] specifies the appropriate MMU_* command. 
* ------------- * ptr[1:0] == MMU_NORMAL_PT_UPDATE: @@ -236,13 +236,13 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t); * To deallocate the pages, the operations are the reverse of the steps * mentioned above. The argument is MMUEXT_UNPIN_TABLE for all levels and the * pagetable MUST not be in use (meaning that the cr3 is not set to it). - * + * * ptr[1:0] == MMU_MACHPHYS_UPDATE: * Updates an entry in the machine->pseudo-physical mapping table. * ptr[:2] -- Machine address within the frame whose mapping to modify. * The frame must belong to the FD, if one is specified. * val -- Value to write into the mapping entry. - * + * * ptr[1:0] == MMU_PT_UPDATE_PRESERVE_AD: * As MMU_NORMAL_PT_UPDATE above, but A/D bits currently in the PTE are ORed * with those in @val. @@ -588,7 +588,7 @@ typedef struct vcpu_time_info vcpu_time_info_t; struct vcpu_info { /* * ''evtchn_upcall_pending'' is written non-zero by Xen to indicate - * a pending notification for a particular VCPU. It is then cleared + * a pending notification for a particular VCPU. It is then cleared * by the guest OS /before/ checking for pending work, thus avoiding * a set-and-check race. Note that the mask is only accessed by Xen * on the CPU that is currently hosting the VCPU. This means that the @@ -646,7 +646,7 @@ struct shared_info { * 3. Virtual interrupts (''events''). A domain can bind an event-channel * port to a virtual interrupt source, such as the virtual-timer * device or the emergency console. - * + * * Event channels are addressed by a "port index". Each channel is * associated with two bits of information: * 1. PENDING -- notifies the domain that there is a pending notification @@ -657,7 +657,7 @@ struct shared_info { * becomes pending while the channel is masked then the ''edge'' is lost * (i.e., when the channel is unmasked, the guest must manually handle * pending notifications as no upcall will be scheduled by Xen). - * + * * To expedite scanning of pending notifications, any 0->1 pending * transition on an unmasked channel causes a corresponding bit in a * per-vcpu selector word to be set. Each bit in the selector covers a diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 71c3e92..65ac81a 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -1,8 +1,8 @@ /****************************************************************************** * event.h - * + * * A nice interface for passing asynchronous events to guest OSes. - * + * * Copyright (c) 2002-2006, K A Fraser */ diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index ccd0496..ae3cd07 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -92,7 +92,7 @@ void evtchn_destroy_final(struct domain *d); /* from complete_domain_destroy */ struct waitqueue_vcpu; -struct vcpu +struct vcpu { int vcpu_id; @@ -453,7 +453,7 @@ struct domain *domain_create( /* * rcu_lock_domain_by_id() is more efficient than get_domain_by_id(). * This is the preferred function if the returned domain reference - * is short lived, but it cannot be used if the domain reference needs + * is short lived, but it cannot be used if the domain reference needs * to be kept beyond the current scope (e.g., across a softirq). * The returned domain reference must be discarded using rcu_unlock_domain(). */ @@ -574,7 +574,7 @@ void sync_local_execstate(void); * sync_vcpu_execstate() will switch and commit @prev''s state. */ void context_switch( - struct vcpu *prev, + struct vcpu *prev, struct vcpu *next); /* -- 1.7.10.4
As we move to extended evtchn ABI we need bigger d->evtchn, as a result this will bloat struct domain. So move this array out of struct domain and allocate a dedicated array for it. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 14 ++++++++++++++ xen/include/xen/sched.h | 2 +- 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index dabfa9e..6cb082e 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1172,15 +1172,27 @@ void notify_via_xen_event_channel(struct domain *ld, int lport) int evtchn_init(struct domain *d) { + BUILD_BUG_ON(sizeof(struct evtchn *) * NR_EVTCHN_BUCKETS > PAGE_SIZE); + d->evtchn = xzalloc_array(struct evtchn *, NR_EVTCHN_BUCKETS); + + if ( d->evtchn == NULL ) + return -ENOMEM; + spin_lock_init(&d->event_lock); if ( get_free_port(d) != 0 ) + { + xfree(d->evtchn); return -EINVAL; + } evtchn_from_port(d, 0)->state = ECS_RESERVED; #if MAX_VIRT_CPUS > BITS_PER_LONG d->poll_mask = xmalloc_array(unsigned long, BITS_TO_LONGS(MAX_VIRT_CPUS)); if ( !d->poll_mask ) + { + xfree(d->evtchn); return -ENOMEM; + } bitmap_zero(d->poll_mask, MAX_VIRT_CPUS); #endif @@ -1214,6 +1226,8 @@ void evtchn_destroy(struct domain *d) spin_unlock(&d->event_lock); clear_global_virq_handlers(d); + + xfree(d->evtchn); } diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index ae3cd07..f869cf1 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -260,7 +260,7 @@ struct domain spinlock_t rangesets_lock; /* Event channel information. */ - struct evtchn *evtchn[NR_EVTCHN_BUCKETS]; + struct evtchn **evtchn; spinlock_t event_lock; struct grant_table *grant_table; -- 1.7.10.4
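As an aside, a quick sanity check of the BUILD_BUG_ON added above (an illustrative calculation only, not part of the patch; 64-bit pointer size and 4K pages assumed): the array of bucket pointers allocated for d->evtchn comfortably fits in a single page with the constants in place at this point in the series.

/* Illustrative check of the BUILD_BUG_ON in evtchn_init(): the array of
 * bucket pointers behind d->evtchn must fit in one page. */
#include <assert.h>

int main(void)
{
    const unsigned int nr_event_channels = 4096;   /* 8 * 8 * 64 on x86_64 */
    const unsigned int evtchns_per_bucket = 128;
    const unsigned int nr_buckets = nr_event_channels / evtchns_per_bucket;

    assert(nr_buckets * 8 <= 4096);                /* 32 * 8 = 256 bytes */
    return 0;
}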
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 04/18] Move event channel macros / struct definition to proper place
After remove reference to NR_EVTCHN_BUCKETS in struct domain, we can move those macros / struct definitions to event.h. Also update xen/xsm/flask/hooks.c to include the new header. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/xen/event.h | 46 ++++++++++++++++++++++++++++++++++++++++++++++ xen/include/xen/sched.h | 45 --------------------------------------------- xen/xsm/flask/hooks.c | 1 + 3 files changed, 47 insertions(+), 45 deletions(-) diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 65ac81a..271d792 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -15,6 +15,52 @@ #include <asm/bitops.h> #include <asm/event.h> +#ifndef CONFIG_COMPAT +#define BITS_PER_EVTCHN_WORD(d) BITS_PER_XEN_ULONG +#else +#define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_XEN_ULONG) +#endif +#define MAX_EVTCHNS(d) (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d)) + +#define EVTCHNS_PER_BUCKET 128 +#define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) + +struct evtchn +{ +#define ECS_FREE 0 /* Channel is available for use. */ +#define ECS_RESERVED 1 /* Channel is reserved. */ +#define ECS_UNBOUND 2 /* Channel is waiting to bind to a remote domain. */ +#define ECS_INTERDOMAIN 3 /* Channel is bound to another domain. */ +#define ECS_PIRQ 4 /* Channel is bound to a physical IRQ line. */ +#define ECS_VIRQ 5 /* Channel is bound to a virtual IRQ line. */ +#define ECS_IPI 6 /* Channel is bound to a virtual IPI line. */ + u8 state; /* ECS_* */ + u8 xen_consumer; /* Consumer in Xen, if any? (0 = send to guest) */ + u16 notify_vcpu_id; /* VCPU for local delivery notification */ + union { + struct { + domid_t remote_domid; + } unbound; /* state == ECS_UNBOUND */ + struct { + u16 remote_port; + struct domain *remote_dom; + } interdomain; /* state == ECS_INTERDOMAIN */ + struct { + u16 irq; + u16 next_port; + u16 prev_port; + } pirq; /* state == ECS_PIRQ */ + u16 virq; /* state == ECS_VIRQ */ + } u; +#ifdef FLASK_ENABLE + void *ssid; +#endif +}; + +int evtchn_init(struct domain *d); /* from domain_create */ +void evtchn_destroy(struct domain *d); /* from domain_kill */ +void evtchn_destroy_final(struct domain *d); /* from complete_domain_destroy */ + /* * send_guest_vcpu_virq: Notify guest via a per-VCPU VIRQ. * @v: VCPU to which virtual IRQ should be sent diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index f869cf1..58b7176 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -45,51 +45,6 @@ DEFINE_XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t); /* A global pointer to the initial domain (DOM0). */ extern struct domain *dom0; -#ifndef CONFIG_COMPAT -#define BITS_PER_EVTCHN_WORD(d) BITS_PER_XEN_ULONG -#else -#define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_XEN_ULONG) -#endif -#define MAX_EVTCHNS(d) (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d)) -#define EVTCHNS_PER_BUCKET 128 -#define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) - -struct evtchn -{ -#define ECS_FREE 0 /* Channel is available for use. */ -#define ECS_RESERVED 1 /* Channel is reserved. */ -#define ECS_UNBOUND 2 /* Channel is waiting to bind to a remote domain. */ -#define ECS_INTERDOMAIN 3 /* Channel is bound to another domain. */ -#define ECS_PIRQ 4 /* Channel is bound to a physical IRQ line. */ -#define ECS_VIRQ 5 /* Channel is bound to a virtual IRQ line. */ -#define ECS_IPI 6 /* Channel is bound to a virtual IPI line. */ - u8 state; /* ECS_* */ - u8 xen_consumer; /* Consumer in Xen, if any? 
(0 = send to guest) */ - u16 notify_vcpu_id; /* VCPU for local delivery notification */ - union { - struct { - domid_t remote_domid; - } unbound; /* state == ECS_UNBOUND */ - struct { - u16 remote_port; - struct domain *remote_dom; - } interdomain; /* state == ECS_INTERDOMAIN */ - struct { - u16 irq; - u16 next_port; - u16 prev_port; - } pirq; /* state == ECS_PIRQ */ - u16 virq; /* state == ECS_VIRQ */ - } u; -#ifdef FLASK_ENABLE - void *ssid; -#endif -}; - -int evtchn_init(struct domain *d); /* from domain_create */ -void evtchn_destroy(struct domain *d); /* from domain_kill */ -void evtchn_destroy_final(struct domain *d); /* from complete_domain_destroy */ - struct waitqueue_vcpu; struct vcpu diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c index 29a78dd..6d446ab 100644 --- a/xen/xsm/flask/hooks.c +++ b/xen/xsm/flask/hooks.c @@ -11,6 +11,7 @@ #include <xen/init.h> #include <xen/lib.h> #include <xen/sched.h> +#include <xen/event.h> #include <xen/paging.h> #include <xen/xmalloc.h> #include <xsm/xsm.h> -- 1.7.10.4
This variable indicates the maximum number of event channels a domain can use. Also replace MAX_EVTCHNS macro with inline function as this function will be used to calculate max event channels in the future. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 5 +++-- xen/common/schedule.c | 2 +- xen/include/xen/event.h | 7 +++++-- xen/include/xen/sched.h | 1 + 4 files changed, 10 insertions(+), 5 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 6cb082e..0205c73 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -134,7 +134,7 @@ static int get_free_port(struct domain *d) if ( evtchn_from_port(d, port)->state == ECS_FREE ) return port; - if ( port == MAX_EVTCHNS(d) ) + if ( port == d->max_evtchns ) return -ENOSPC; chn = xzalloc_array(struct evtchn, EVTCHNS_PER_BUCKET); @@ -1177,6 +1177,7 @@ int evtchn_init(struct domain *d) if ( d->evtchn == NULL ) return -ENOMEM; + d->max_evtchns = max_evtchns(d); spin_lock_init(&d->event_lock); if ( get_free_port(d) != 0 ) @@ -1270,7 +1271,7 @@ static void domain_dump_evtchn_info(struct domain *d) spin_lock(&d->event_lock); - for ( port = 1; port < MAX_EVTCHNS(d); ++port ) + for ( port = 1; port < d->max_evtchns; ++port ) { const struct evtchn *chn; char *ssid; diff --git a/xen/common/schedule.c b/xen/common/schedule.c index de11110..22b8ffe 100644 --- a/xen/common/schedule.c +++ b/xen/common/schedule.c @@ -698,7 +698,7 @@ static long do_poll(struct sched_poll *sched_poll) goto out; rc = -EINVAL; - if ( port >= MAX_EVTCHNS(d) ) + if ( port >= d->max_evtchns ) goto out; rc = 0; diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 271d792..c63b8b2 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -20,7 +20,10 @@ #else #define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_XEN_ULONG) #endif -#define MAX_EVTCHNS(d) (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d)) +static inline unsigned int max_evtchns(struct domain *d) +{ + return BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); +} #define EVTCHNS_PER_BUCKET 128 #define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) @@ -119,7 +122,7 @@ void notify_via_xen_event_channel(struct domain *ld, int lport); #define bucket_from_port(d,p) \ ((d)->evtchn[(p)/EVTCHNS_PER_BUCKET]) #define port_is_valid(d,p) \ - (((p) >= 0) && ((p) < MAX_EVTCHNS(d)) && \ + (((p) >= 0) && ((p) < d->max_evtchns) && \ (bucket_from_port(d,p) != NULL)) #define evtchn_from_port(d,p) \ (&(bucket_from_port(d,p))[(p)&(EVTCHNS_PER_BUCKET-1)]) diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 58b7176..ad0f042 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -217,6 +217,7 @@ struct domain /* Event channel information. */ struct evtchn **evtchn; spinlock_t event_lock; + unsigned int max_evtchns; struct grant_table *grant_table; -- 1.7.10.4
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 06/18] Add evtchn_is_{pending, masked} and evtchn_clear_pending
Some code paths access the arrays in shared info directly. This only works with 2-level event channel. Add functions to abstract away implementation details. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/arch/x86/irq.c | 7 +++---- xen/common/event_channel.c | 22 +++++++++++++++++++--- xen/common/keyhandler.c | 6 ++---- xen/common/schedule.c | 2 +- xen/include/xen/event.h | 6 ++++++ 5 files changed, 31 insertions(+), 12 deletions(-) diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c index ca829bb..4033328 100644 --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -1452,7 +1452,7 @@ int pirq_guest_unmask(struct domain *d) { pirq = pirqs[i]->pirq; if ( pirqs[i]->masked && - !test_bit(pirqs[i]->evtchn, &shared_info(d, evtchn_mask)) ) + !evtchn_is_masked(d, pirqs[i]->evtchn) ) pirq_guest_eoi(pirqs[i]); } } while ( ++pirq < d->nr_pirqs && n == ARRAY_SIZE(pirqs) ); @@ -2090,13 +2090,12 @@ static void dump_irqs(unsigned char key) info = pirq_info(d, pirq); printk("%u:%3d(%c%c%c%c)", d->domain_id, pirq, - (test_bit(info->evtchn, - &shared_info(d, evtchn_pending)) ? + (evtchn_is_pending(d, info->evtchn) ? ''P'' : ''-''), (test_bit(info->evtchn / BITS_PER_EVTCHN_WORD(d), &vcpu_info(d->vcpu[0], evtchn_pending_sel)) ? ''S'' : ''-''), - (test_bit(info->evtchn, &shared_info(d, evtchn_mask)) ? + (evtchn_is_masked(d, info->evtchn) ? ''M'' : ''-''), (info->masked ? ''M'' : ''-'')); if ( i != action->nr_guests ) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 0205c73..667fd89 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -95,6 +95,7 @@ static uint8_t get_xen_consumer(xen_event_channel_notification_t fn) #define xen_notification_fn(e) (xen_consumers[(e)->xen_consumer-1]) static void evtchn_set_pending(struct vcpu *v, int port); +static void evtchn_clear_pending(struct domain *d, int port); static int virq_is_global(uint32_t virq) { @@ -156,6 +157,16 @@ static int get_free_port(struct domain *d) return port; } +int evtchn_is_pending(struct domain *d, int port) +{ + return test_bit(port, &shared_info(d, evtchn_pending)); +} + +int evtchn_is_masked(struct domain *d, int port) +{ + return test_bit(port, &shared_info(d, evtchn_mask)); +} + static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) { @@ -529,7 +540,7 @@ static long __evtchn_close(struct domain *d1, int port1) } /* Clear pending event to avoid unexpected behavior on re-bind. */ - clear_bit(port1, &shared_info(d1, evtchn_pending)); + evtchn_clear_pending(d1, port1); /* Reset binding to vcpu0 when the channel is freed. 
*/ chn1->state = ECS_FREE; @@ -653,6 +664,11 @@ static void evtchn_set_pending(struct vcpu *v, int port) } } +static void evtchn_clear_pending(struct domain *d, int port) +{ + clear_bit(port, &shared_info(d, evtchn_pending)); +} + int guest_enabled_event(struct vcpu *v, uint32_t virq) { return ((v != NULL) && (v->virq_to_evtchn[virq] != 0)); @@ -1284,8 +1300,8 @@ static void domain_dump_evtchn_info(struct domain *d) printk(" %4u [%d/%d]: s=%d n=%d x=%d", port, - !!test_bit(port, &shared_info(d, evtchn_pending)), - !!test_bit(port, &shared_info(d, evtchn_mask)), + !!evtchn_is_pending(d, port), + !!evtchn_is_masked(d, port), chn->state, chn->notify_vcpu_id, chn->xen_consumer); switch ( chn->state ) diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c index e9ef45f..def3bf6 100644 --- a/xen/common/keyhandler.c +++ b/xen/common/keyhandler.c @@ -302,10 +302,8 @@ static void dump_domains(unsigned char key) printk("Notifying guest %d:%d (virq %d, port %d, stat %d/%d/%d)\n", d->domain_id, v->vcpu_id, VIRQ_DEBUG, v->virq_to_evtchn[VIRQ_DEBUG], - test_bit(v->virq_to_evtchn[VIRQ_DEBUG], - &shared_info(d, evtchn_pending)), - test_bit(v->virq_to_evtchn[VIRQ_DEBUG], - &shared_info(d, evtchn_mask)), + evtchn_is_pending(d, v->virq_to_evtchn[VIRQ_DEBUG]), + evtchn_is_masked(d, v->virq_to_evtchn[VIRQ_DEBUG]), test_bit(v->virq_to_evtchn[VIRQ_DEBUG] / BITS_PER_EVTCHN_WORD(d), &vcpu_info(v, evtchn_pending_sel))); diff --git a/xen/common/schedule.c b/xen/common/schedule.c index 22b8ffe..2e4a8d4 100644 --- a/xen/common/schedule.c +++ b/xen/common/schedule.c @@ -702,7 +702,7 @@ static long do_poll(struct sched_poll *sched_poll) goto out; rc = 0; - if ( test_bit(port, &shared_info(d, evtchn_pending)) ) + if ( evtchn_is_pending(d, port) ) goto out; } diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index c63b8b2..6912195 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -103,6 +103,12 @@ int evtchn_unmask(unsigned int port); /* Move all PIRQs after a vCPU was moved to another pCPU. */ void evtchn_move_pirqs(struct vcpu *v); +/* Tell a given event-channel port is pending or not */ +int evtchn_is_pending(struct domain *d, int port); + +/* Tell a given event-channel port is masked or not */ +int evtchn_is_masked(struct domain *d, int port); + /* Allocate/free a Xen-attached event channel port. */ typedef void (*xen_event_channel_notification_t)( struct vcpu *v, unsigned int port); -- 1.7.10.4
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 07/18] Implement extended event channel ABIs query
This bitmap is a 64 bits unsigned integer. Each bit represents one ABI. Bit zero is reserved. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 12 ++++++++++++ xen/include/public/event_channel.h | 14 ++++++++++++++ xen/include/xen/event.h | 2 ++ 3 files changed, 28 insertions(+) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 667fd89..6b23157 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -32,6 +32,9 @@ #include <public/event_channel.h> #include <xsm/xsm.h> +/* A bitmap of supported extended event channel ABIs */ +uint64_t extended_event_channel = EVTCHN_EXTENDED_NONE; + #define ERROR_EXIT(_errno) \ do { \ gdprintk(XENLOG_WARNING, \ @@ -1094,6 +1097,15 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) break; } + case EVTCHNOP_query_extended_abis: { + struct evtchn_query_extended_abis query; + query.abis = extended_event_channel; + rc = 0; + if ( __copy_to_guest(arg, &query, 1) ) + rc = -EFAULT; + break; + } + default: rc = -ENOSYS; break; diff --git a/xen/include/public/event_channel.h b/xen/include/public/event_channel.h index 472efdb..594ea76 100644 --- a/xen/include/public/event_channel.h +++ b/xen/include/public/event_channel.h @@ -71,6 +71,7 @@ #define EVTCHNOP_bind_vcpu 8 #define EVTCHNOP_unmask 9 #define EVTCHNOP_reset 10 +#define EVTCHNOP_query_extended_abis 11 /* ` } */ typedef uint32_t evtchn_port_t; @@ -258,6 +259,19 @@ struct evtchn_reset { typedef struct evtchn_reset evtchn_reset_t; /* + * EVTCHNOP_query_extended: Query the hypervisor for supported + * extended event channel ABIs. + */ +#define EVTCHN_EXTENDED_NONE 0 +#define _EVTCHN_EXTENDED_L3 1 +#define EVTCHN_EXTENDED_L3 (1UL << _EVTCHN_EXTENDED_L3) +struct evtchn_query_extended_abis { + /* OUT parameters. */ + uint64_t abis; +}; +typedef struct evtchn_query_extended_abis evtchn_query_extended_abis_t; + +/* * ` enum neg_errnoval * ` HYPERVISOR_event_channel_op_compat(struct evtchn_op *op) * ` diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 6912195..fbbe9dc 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -157,4 +157,6 @@ void notify_via_xen_event_channel(struct domain *ld, int lport); mb(); /* set blocked status /then/ caller does his work */ \ } while ( 0 ) +/* A bitmap of supported extended event channel ABIs */ +extern uint64_t extended_event_channel; #endif /* __XEN_EVENT_H__ */ -- 1.7.10.4
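A guest could use the new sub-op roughly as follows. This is a hypothetical guest-side sketch, not part of the patch: HYPERVISOR_event_channel_op is assumed to be the usual guest hypercall wrapper, while the structure and constant names come from the header change above.

/* Hypothetical guest-side sketch: ask Xen which extended event channel
 * ABIs it supports before trying to register one. */
static int query_extended_evtchn_abis(uint64_t *abis)
{
    struct evtchn_query_extended_abis query = { .abis = 0 };
    int rc;

    rc = HYPERVISOR_event_channel_op(EVTCHNOP_query_extended_abis, &query);
    if ( rc == 0 )
        *abis = query.abis;
    return rc;
}

A guest would then test the EVTCHN_EXTENDED_L3 bit in the returned mask before attempting 3-level registration.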
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 08/18] Define 3-level event channel registration interface
This event channel op has two sub-commands: * REGISTER_BITMAPS: register the shared pending / mask bitmaps * REGISTER_L2_SELECTOR: register L2 selector for the specific vcpu. The guest should issue REGISTER_BITMAPS first. If the registration of bitmaps succeed, it can issue REGISTER_L2_SELECTOR. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/public/event_channel.h | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/xen/include/public/event_channel.h b/xen/include/public/event_channel.h index 594ea76..59a9780 100644 --- a/xen/include/public/event_channel.h +++ b/xen/include/public/event_channel.h @@ -72,6 +72,7 @@ #define EVTCHNOP_unmask 9 #define EVTCHNOP_reset 10 #define EVTCHNOP_query_extended_abis 11 +#define EVTCHNOP_register_3level 12 /* ` } */ typedef uint32_t evtchn_port_t; @@ -272,6 +273,39 @@ struct evtchn_query_extended_abis { typedef struct evtchn_query_extended_abis evtchn_query_extended_abis_t; /* + * EVTCHNOP_register_3level: Register 3-level event channel. + */ +/* + * 64 bit guests need 8 pages for evtchn_pending and evtchn_mask for + * 256k event channels while 32 bit ones only need 1 page for 32k + * event channels. + */ +#define EVTCHN_MAX_L3_PAGES 8 +/* + * A guest should register the bitmaps first, then register L2 selector for + * individual cpu. + */ +#define REGISTER_BITMAPS 1 +#define REGISTER_L2_SELECTOR 2 +struct evtchn_register_3level { + /* IN parameters. */ + uint32_t cmd; + union { + struct { + uint32_t nr_pages; + XEN_GUEST_HANDLE(xen_pfn_t) evtchn_pending; + XEN_GUEST_HANDLE(xen_pfn_t) evtchn_mask; + } bitmaps; + struct { + uint32_t cpu_id; + xen_pfn_t mfn; /* mfn of L2 selector */ + xen_pfn_t offset; /* offset of L2 selector within page */ + } l2_selector; + } u; +}; +typedef struct evtchn_register_3level evtchn_register_3level_t; + +/* * ` enum neg_errnoval * ` HYPERVISOR_event_channel_op_compat(struct evtchn_op *op) * ` -- 1.7.10.4
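To illustrate the intended two-phase flow from the guest side, here is a hypothetical sketch (not part of the patch): HYPERVISOR_event_channel_op and set_xen_guest_handle are assumed to be the usual guest-side helpers, and the caller is assumed to have already allocated the bitmap pages and collected their MFNs.

/* Hypothetical guest-side sketch of the two-phase 3-level registration. */
static int register_l3_bitmaps(xen_pfn_t *pending_mfns, xen_pfn_t *mask_mfns,
                               uint32_t nr_pages)
{
    struct evtchn_register_3level reg = {
        .cmd = REGISTER_BITMAPS,
        .u.bitmaps.nr_pages = nr_pages,
    };

    set_xen_guest_handle(reg.u.bitmaps.evtchn_pending, pending_mfns);
    set_xen_guest_handle(reg.u.bitmaps.evtchn_mask, mask_mfns);

    return HYPERVISOR_event_channel_op(EVTCHNOP_register_3level, &reg);
}

static int register_l3_l2_selector(uint32_t cpu, xen_pfn_t mfn,
                                   xen_pfn_t offset)
{
    struct evtchn_register_3level reg = {
        .cmd = REGISTER_L2_SELECTOR,
        .u.l2_selector.cpu_id = cpu,
        .u.l2_selector.mfn = mfn,
        .u.l2_selector.offset = offset,
    };

    return HYPERVISOR_event_channel_op(EVTCHNOP_register_3level, &reg);
}

REGISTER_BITMAPS must succeed once for the domain before REGISTER_L2_SELECTOR is issued for each vCPU.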
This field is a bitmap of currently in use extended event channel ABI, which can have 0 (no extended event channel in use) or 1 bit set. It is manipulated by hypervisor only, so if anything goes wrong it is a bug. The default event channel ABI is EVTCHN_EXTENDED_NONE, which means no extended event channel is used. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 8 +++++++- xen/include/xen/event.h | 12 +++++++++++- xen/include/xen/sched.h | 1 + 3 files changed, 19 insertions(+), 2 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 6b23157..99af57e 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -988,6 +988,12 @@ out: return rc; } +static void __set_evtchn_abi(struct domain *d, uint64_t abi) +{ + d->evtchn_extended = abi; + /* This must go after setting ABI */ + d->max_evtchns = max_evtchns(d); +} long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) { @@ -1205,7 +1211,7 @@ int evtchn_init(struct domain *d) if ( d->evtchn == NULL ) return -ENOMEM; - d->max_evtchns = max_evtchns(d); + __set_evtchn_abi(d, EVTCHN_EXTENDED_NONE); spin_lock_init(&d->event_lock); if ( get_free_port(d) != 0 ) diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index fbbe9dc..f5a49a9 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -14,6 +14,7 @@ #include <xen/softirq.h> #include <asm/bitops.h> #include <asm/event.h> +#include <public/event_channel.h> #ifndef CONFIG_COMPAT #define BITS_PER_EVTCHN_WORD(d) BITS_PER_XEN_ULONG @@ -22,7 +23,16 @@ #endif static inline unsigned int max_evtchns(struct domain *d) { - return BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); + unsigned int ret = 0; + switch ( d->evtchn_extended ) + { + case EVTCHN_EXTENDED_NONE: + ret = BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); + break; + default: + BUG(); + } + return ret; } #define EVTCHNS_PER_BUCKET 128 diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index ad0f042..8bdf5ec 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -218,6 +218,7 @@ struct domain struct evtchn **evtchn; spinlock_t event_lock; unsigned int max_evtchns; + unsigned int evtchn_extended; struct grant_table *grant_table; -- 1.7.10.4
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 10/18] Calculate max event channels for EVTCHN_EXTENDED_L3
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/xen/event.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index f5a49a9..919f0e2 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -29,6 +29,10 @@ static inline unsigned int max_evtchns(struct domain *d) case EVTCHN_EXTENDED_NONE: ret = BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); break; + case EVTCHN_EXTENDED_L3: + ret = BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d) + * BITS_PER_EVTCHN_WORD(d); + break; default: BUG(); } -- 1.7.10.4
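For reference, the numbers this case produces (an illustrative calculation only): BITS_PER_EVTCHN_WORD(d) is 64 for a 64-bit guest and 32 for a 32-bit (compat) guest, so the 3-level limits are 64^3 and 32^3 event channels respectively.

/* Illustrative check of max_evtchns() for EVTCHN_EXTENDED_L3. */
#include <assert.h>

int main(void)
{
    assert(64u * 64u * 64u == 262144u);   /* 256k ports, 64-bit word */
    assert(32u * 32u * 32u == 32768u);    /*  32k ports, 32-bit word */
    return 0;
}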
For a 64-bit build with the 3-level event channel ABI and the original EVTCHNS_PER_BUCKET value of 128, the space needed to accommodate d->evtchn would be 4 pages (PAGE_SIZE = 4096). Given that not every domain needs the 3-level ABI, this wastes memory. Setting EVTCHNS_PER_BUCKET to 512 makes the array occupy exactly one page. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/xen/event.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 919f0e2..e2c3736 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -39,7 +39,7 @@ static inline unsigned int max_evtchns(struct domain *d) return ret; } -#define EVTCHNS_PER_BUCKET 128 +#define EVTCHNS_PER_BUCKET 512 #define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) struct evtchn -- 1.7.10.4
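The page arithmetic behind the new value (illustrative only; 64-bit pointers assumed):

/* d->evtchn is an array of bucket pointers sized, per the reasoning in
 * this patch, for the 3-level maximum of 256k event channels. */
#include <assert.h>

int main(void)
{
    const unsigned int nr_channels = 262144;            /* 256k */

    assert(nr_channels / 128 * 8 == 4 * 4096);  /* old: 2048 ptrs, 4 pages */
    assert(nr_channels / 512 * 8 == 4096);      /* new:  512 ptrs, 1 page  */
    return 0;
}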
Document limits of 2/3-level event channel ABIs. Also need to update event.h so that this change won''t break the build. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/public/xen.h | 13 ++++++++++++- xen/include/xen/event.h | 2 +- 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h index ba9e1ab..e6c33d4 100644 --- a/xen/include/public/xen.h +++ b/xen/include/public/xen.h @@ -554,9 +554,20 @@ DEFINE_XEN_GUEST_HANDLE(multicall_entry_t); /* * Event channel endpoints per domain: + * 2-level for x86: * 1024 if a long is 32 bits; 4096 if a long is 64 bits. + * 3-level for x86: + * 32k if a long is 32 bits; 256k if a long is 64 bits. + * 2-level for ARM: + * 4096 for both 32 bits and 64 bits. + * 3-level for ARM: + * 256k for both 32 bits and 64 bits. */ -#define NR_EVENT_CHANNELS (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64) +#define NR_EVENT_CHANNELS_L2 (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64) +#define NR_EVENT_CHANNELS_L3 (NR_EVENT_CHANNELS_L2 * sizeof(xen_ulong_t) * 8) +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) +#define NR_EVENT_CHANNELS NR_EVENT_CHANNELS_L2 /* for compatibility */ +#endif struct vcpu_time_info { /* diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index e2c3736..fa1f4b6 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -40,7 +40,7 @@ static inline unsigned int max_evtchns(struct domain *d) } #define EVTCHNS_PER_BUCKET 512 -#define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) +#define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS_L3 / EVTCHNS_PER_BUCKET) struct evtchn { -- 1.7.10.4
Use pointer in struct domain to reference evtchn_pending and evtchn_mask bitmaps. When building a domain, the default operation set is 2-level operation set. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/arch/arm/domain.c | 1 + xen/arch/x86/domain.c | 1 + xen/common/event_channel.c | 52 +++++++++++++++++++++++++++++++++++++------- xen/include/xen/event.h | 3 +++ xen/include/xen/sched.h | 2 ++ 5 files changed, 51 insertions(+), 8 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index bca3d89..f6a5560 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -471,6 +471,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) d->arch.vmpidr = boot_cpu_data.mpidr.bits; clear_page(d->shared_info); + evtchn_set_default_bitmap(d); share_xen_page_with_guest( virt_to_page(d->shared_info), d, XENSHARE_writable); diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 8d30d08..d7912b3 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -547,6 +547,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) goto fail; clear_page(d->shared_info); + evtchn_set_default_bitmap(d); share_xen_page_with_guest( virt_to_page(d->shared_info), d, XENSHARE_writable); diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 99af57e..4fb7794 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -162,15 +162,14 @@ static int get_free_port(struct domain *d) int evtchn_is_pending(struct domain *d, int port) { - return test_bit(port, &shared_info(d, evtchn_pending)); + return test_bit(port, d->evtchn_pending); } int evtchn_is_masked(struct domain *d, int port) { - return test_bit(port, &shared_info(d, evtchn_mask)); + return test_bit(port, d->evtchn_mask); } - static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) { struct evtchn *chn; @@ -626,7 +625,7 @@ out: return ret; } -static void evtchn_set_pending(struct vcpu *v, int port) +static void evtchn_set_pending_l2(struct vcpu *v, int port) { struct domain *d = v->domain; int vcpuid; @@ -667,9 +666,23 @@ static void evtchn_set_pending(struct vcpu *v, int port) } } +static void evtchn_set_pending(struct vcpu *v, int port) +{ + struct domain *d = v->domain; + + switch ( d->evtchn_extended ) + { + case EVTCHN_EXTENDED_NONE: + evtchn_set_pending_l2(v, port); + break; + default: + BUG(); + } +} + static void evtchn_clear_pending(struct domain *d, int port) { - clear_bit(port, &shared_info(d, evtchn_pending)); + clear_bit(port, d->evtchn_pending); } int guest_enabled_event(struct vcpu *v, uint32_t virq) @@ -935,7 +948,7 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id) } -int evtchn_unmask(unsigned int port) +static int evtchn_unmask_l2(unsigned int port) { struct domain *d = current->domain; struct vcpu *v; @@ -951,8 +964,8 @@ int evtchn_unmask(unsigned int port) * These operations must happen in strict order. Based on * include/xen/event.h:evtchn_set_pending(). 
*/ - if ( test_and_clear_bit(port, &shared_info(d, evtchn_mask)) && - test_bit (port, &shared_info(d, evtchn_pending)) && + if ( test_and_clear_bit(port, d->evtchn_mask) && + test_bit (port, d->evtchn_pending) && !test_and_set_bit (port / BITS_PER_EVTCHN_WORD(d), &vcpu_info(v, evtchn_pending_sel)) ) { @@ -962,6 +975,23 @@ int evtchn_unmask(unsigned int port) return 0; } +int evtchn_unmask(unsigned int port) +{ + struct domain *d = current->domain; + int rc = 0; + + switch ( d->evtchn_extended ) + { + case EVTCHN_EXTENDED_NONE: + rc = evtchn_unmask_l2(port); + break; + default: + BUG(); + } + + return rc; +} + static long evtchn_reset(evtchn_reset_t *r) { @@ -1203,6 +1233,12 @@ void notify_via_xen_event_channel(struct domain *ld, int lport) spin_unlock(&ld->event_lock); } +void evtchn_set_default_bitmap(struct domain *d) +{ + d->evtchn_pending = (xen_ulong_t *)shared_info(d, evtchn_pending); + d->evtchn_mask = (xen_ulong_t *)shared_info(d, evtchn_mask); +} + int evtchn_init(struct domain *d) { diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index fa1f4b6..382ce91 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -138,6 +138,9 @@ int guest_enabled_event(struct vcpu *v, uint32_t virq); /* Notify remote end of a Xen-attached event channel.*/ void notify_via_xen_event_channel(struct domain *ld, int lport); +/* This is called after domain''s shared info page is setup */ +void evtchn_set_default_bitmap(struct domain *d); + /* Internal event channel object accessors */ #define bucket_from_port(d,p) \ ((d)->evtchn[(p)/EVTCHNS_PER_BUCKET]) diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 8bdf5ec..74a8d43 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -217,6 +217,8 @@ struct domain /* Event channel information. */ struct evtchn **evtchn; spinlock_t event_lock; + xen_ulong_t *evtchn_pending; + xen_ulong_t *evtchn_mask; unsigned int max_evtchns; unsigned int evtchn_extended; -- 1.7.10.4
This macro is used to optimise calculation. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/asm-arm/config.h | 1 + xen/include/asm-x86/config.h | 5 ++++- xen/include/xen/event.h | 2 ++ 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h index 8be8563..3ba7df7 100644 --- a/xen/include/asm-arm/config.h +++ b/xen/include/asm-arm/config.h @@ -24,6 +24,7 @@ /* xen_ulong_t is always 64 bits */ #define BITS_PER_XEN_ULONG 64 +#define XEN_ULONG_BITORDER 6 #define CONFIG_PAGING_ASSISTANCE 1 diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h index cf93bd5..a43810d 100644 --- a/xen/include/asm-x86/config.h +++ b/xen/include/asm-x86/config.h @@ -8,13 +8,16 @@ #define __X86_CONFIG_H__ #define LONG_BYTEORDER 3 +#define BYTE_BITORDER 3 +#define LONG_BITORDER (BYTE_BITORDER + LONG_BYTEORDER) #define CONFIG_PAGING_LEVELS 4 #define BYTES_PER_LONG (1 << LONG_BYTEORDER) #define BITS_PER_LONG (BYTES_PER_LONG << 3) -#define BITS_PER_BYTE 8 +#define BITS_PER_BYTE (1 << BYTE_BITORDER) #define BITS_PER_XEN_ULONG BITS_PER_LONG +#define XEN_ULONG_BITORDER LONG_BITORDER #define CONFIG_X86 1 #define CONFIG_X86_HT 1 diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 382ce91..fd5db05 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -18,8 +18,10 @@ #ifndef CONFIG_COMPAT #define BITS_PER_EVTCHN_WORD(d) BITS_PER_XEN_ULONG +#define EVTCHN_WORD_BITORDER(d) XEN_ULONG_BITORDER #else #define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_XEN_ULONG) +#define EVTCHN_WORD_BITORDER(d) (has_32bit_shinfo(d) ? 5 : XEN_ULONG_BITORDER) #endif static inline unsigned int max_evtchns(struct domain *d) { -- 1.7.10.4
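The point of these bit-order constants is that the 3-level code added later in the series can derive the L2 and L1 selector bit indices with shifts instead of divisions; evtchn_set_pending_l3() and evtchn_unmask_l3() are the users. A minimal illustration (not part of the patch), assuming 64-bit event words so that EVTCHN_WORD_BITORDER(d) is 6:

/* Shifting by the word bit order is equivalent to dividing by the word
 * size, which is how a port number is turned into selector bit indices. */
#include <assert.h>

int main(void)
{
    const unsigned int bitorder = 6;      /* log2(64) */
    const unsigned int port = 100000;     /* arbitrary example port */

    assert((port >> bitorder) == port / 64);               /* L2 index */
    assert((port >> (bitorder << 1)) == port / (64 * 64));  /* L1 index */
    return 0;
}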
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 15/18] Infrastructure to manipulate 3-level event channel pages
Introduce __{,un}map_l3_bitmaps, __{,un}map_l2_selector for 3-level event channel ABI. Introduce evtchn_unregister_extended in the teardown path. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 210 ++++++++++++++++++++++++++++++++++++++++++++ xen/include/xen/sched.h | 3 + 2 files changed, 213 insertions(+) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 4fb7794..4cf172b 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1025,6 +1025,191 @@ static void __set_evtchn_abi(struct domain *d, uint64_t abi) d->max_evtchns = max_evtchns(d); } +static long __map_l3_bitmaps(struct domain *d, evtchn_register_3level_t *reg) +{ + int rc; + void *pending_mapping, *mask_mapping; + xen_pfn_t evtchn_pending[EVTCHN_MAX_L3_PAGES]; + xen_pfn_t evtchn_mask[EVTCHN_MAX_L3_PAGES]; + uint32_t nr_pages; + + /* Return if we''ve mapped those bitmaps */ + if ( d->evtchn_extended == EVTCHN_EXTENDED_L3 ) + return -EBUSY; + + nr_pages = reg->u.bitmaps.nr_pages; + + if ( nr_pages > EVTCHN_MAX_L3_PAGES ) + { + rc = -EINVAL; + goto out; + } + + memset(evtchn_pending, 0, sizeof(xen_pfn_t) * EVTCHN_MAX_L3_PAGES); + memset(evtchn_mask, 0, sizeof(xen_pfn_t) * EVTCHN_MAX_L3_PAGES); + + rc = -EFAULT; /* common error code for following operations */ + if ( copy_from_guest(evtchn_pending, reg->u.bitmaps.evtchn_pending, + nr_pages) ) + goto out; + if ( copy_from_guest(evtchn_mask, reg->u.bitmaps.evtchn_mask, + nr_pages) ) + goto out; + + rc = -ENOMEM; + pending_mapping = vmap(evtchn_pending, nr_pages); + if ( !pending_mapping ) + goto out; + + + mask_mapping = vmap(evtchn_mask, nr_pages); + if ( !mask_mapping ) + { + vunmap(pending_mapping); + goto out; + } + + d->evtchn_pending = pending_mapping; + d->evtchn_mask = mask_mapping; + + __set_evtchn_abi(d, EVTCHN_EXTENDED_L3); + + memcpy(d->evtchn_pending, &shared_info(d, evtchn_pending), + sizeof(shared_info(d, evtchn_pending))); + memcpy(d->evtchn_mask, &shared_info(d, evtchn_mask), + sizeof(shared_info(d, evtchn_mask))); + + rc = 0; + out: + return rc; +} + +static void __unmap_l3_bitmaps(struct domain *d) +{ + if ( d->evtchn_pending ) + { + vunmap(d->evtchn_pending); + d->evtchn_pending = NULL; + } + + if ( d->evtchn_mask ) + { + vunmap(d->evtchn_mask); + d->evtchn_mask = NULL; + } + + __set_evtchn_abi(d, EVTCHN_EXTENDED_NONE); +} + +static long __map_l2_selector(struct vcpu *v, evtchn_register_3level_t *reg) +{ + int rc; + void *mapping; + xen_pfn_t mfn = 0; + xen_pfn_t offset = 0; + + mfn = reg->u.l2_selector.mfn; + offset = reg->u.l2_selector.offset; + + /* Already mapped? 
*/ + if ( v->evtchn_pending_sel_l2 ) + return -EBUSY; + + /* must within one page */ + if ( offset + sizeof(xen_ulong_t)*sizeof(xen_ulong_t)*8 > PAGE_SIZE ) + { + rc = -EINVAL; + goto out; + } + + mapping = vmap(&mfn, 1); + + if ( mapping == NULL ) + { + rc = -ENOMEM; + goto out; + } + + v->evtchn_pending_sel_l2 = mapping + offset; + + memcpy(&v->evtchn_pending_sel_l2[0], + &vcpu_info(v, evtchn_pending_sel), + sizeof(vcpu_info(v, evtchn_pending_sel))); + memset(&vcpu_info(v, evtchn_pending_sel), 0, + sizeof(vcpu_info(v, evtchn_pending_sel))); + set_bit(0, &vcpu_info(v, evtchn_pending_sel)); + + rc = 0; + + out: + return rc; +} + +static void __unmap_l2_selector(struct vcpu *v) +{ + if ( v->evtchn_pending_sel_l2 ) + { + unsigned long addr + (unsigned long)(v->evtchn_pending_sel_l2) & PAGE_MASK; + vunmap((void *)addr); + v->evtchn_pending_sel_l2 = NULL; + } +} + +static void __evtchn_unmap_all_3level(struct domain *d) +{ + struct vcpu *v; + for_each_vcpu ( d, v ) + __unmap_l2_selector(v); + __unmap_l3_bitmaps(d); +} + +static long evtchn_register_3level(evtchn_register_3level_t *arg) +{ + struct domain *d = current->domain; + int rc; + + /* + * This domain must be in one of the two states: + * a) it has no active extended ABI in use and tries to register + * L3 bitmaps + * b) it has activated 3-level ABI and tries to register L2 + * selector + */ + if ( !((d->evtchn_extended == EVTCHN_EXTENDED_NONE && + arg->cmd == REGISTER_BITMAPS) || + (d->evtchn_extended == EVTCHN_EXTENDED_L3 && + arg->cmd == REGISTER_L2_SELECTOR)) ) + { + rc = -EINVAL; + goto out; + } + + switch ( arg->cmd ) + { + case REGISTER_BITMAPS: + rc = __map_l3_bitmaps(d, arg); + break; + case REGISTER_L2_SELECTOR: { + int vcpu_id = arg->u.l2_selector.cpu_id; + struct vcpu *v; + if ( vcpu_id >= d->max_vcpus ) + rc = -EINVAL; + else + { + v = d->vcpu[vcpu_id]; + rc = __map_l2_selector(v, arg); + } + break; + } + default: + rc = -EINVAL; + } + + out: + return rc; +} + long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) { long rc; @@ -1142,6 +1327,14 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) break; } + case EVTCHNOP_register_3level: { + struct evtchn_register_3level reg; + if ( copy_from_guest(®, arg, 1) != 0 ) + return -EFAULT; + rc = evtchn_register_3level(®); + break; + } + default: rc = -ENOSYS; break; @@ -1270,6 +1463,21 @@ int evtchn_init(struct domain *d) return 0; } +/* Clean up all extended event channel ABI mappings */ +static void evtchn_unregister_extended(struct domain *d) +{ + switch ( d->evtchn_extended ) + { + case EVTCHN_EXTENDED_NONE: + /* Nothing to do */ + break; + case EVTCHN_EXTENDED_L3: + __evtchn_unmap_all_3level(d); + break; + default: + BUG(); + } +} void evtchn_destroy(struct domain *d) { @@ -1298,6 +1506,8 @@ void evtchn_destroy(struct domain *d) clear_global_virq_handlers(d); + evtchn_unregister_extended(d); + xfree(d->evtchn); } diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 74a8d43..cca5e7f 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -57,6 +57,9 @@ struct vcpu struct domain *domain; + /* For 3-level event channel ABI */ + xen_ulong_t *evtchn_pending_sel_l2; + struct vcpu *next_in_list; s_time_t periodic_period; -- 1.7.10.4
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 16/18] Implement 3-level event channel routines
3-level event channel ABI is fully functional at this point, set corresponding bit in ABI bitmap as well. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 124 ++++++++++++++++++++++++++++++++++++++------ 1 file changed, 108 insertions(+), 16 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 4cf172b..504d769 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -33,7 +33,22 @@ #include <xsm/xsm.h> /* A bitmap of supported extended event channel ABIs */ -uint64_t extended_event_channel = EVTCHN_EXTENDED_NONE; +uint64_t extended_event_channel = (EVTCHN_EXTENDED_NONE | + EVTCHN_EXTENDED_L3); + +static inline const char * evtchn_abi_str(unsigned int abi) +{ + switch ( abi ) + { + case EVTCHN_EXTENDED_NONE: + return "2-level"; + case EVTCHN_EXTENDED_L3: + return "3-level"; + default: + BUG(); + } + return ""; /* make compiler happy */ +} #define ERROR_EXIT(_errno) \ do { \ @@ -625,10 +640,33 @@ out: return ret; } +static void __check_vcpu_polling(struct vcpu *v, int port) +{ + int vcpuid; + struct domain *d = v->domain; + + /* Check if some VCPU might be polling for this event. */ + if ( likely(bitmap_empty(d->poll_mask, d->max_vcpus)) ) + return; + + /* Wake any interested (or potentially interested) pollers. */ + for ( vcpuid = find_first_bit(d->poll_mask, d->max_vcpus); + vcpuid < d->max_vcpus; + vcpuid = find_next_bit(d->poll_mask, d->max_vcpus, vcpuid+1) ) + { + v = d->vcpu[vcpuid]; + if ( ((v->poll_evtchn <= 0) || (v->poll_evtchn == port)) && + test_and_clear_bit(vcpuid, d->poll_mask) ) + { + v->poll_evtchn = 0; + vcpu_unblock(v); + } + } +} + static void evtchn_set_pending_l2(struct vcpu *v, int port) { struct domain *d = v->domain; - int vcpuid; /* * The following bit operations must happen in strict order. @@ -647,23 +685,36 @@ static void evtchn_set_pending_l2(struct vcpu *v, int port) vcpu_mark_events_pending(v); } - /* Check if some VCPU might be polling for this event. */ - if ( likely(bitmap_empty(d->poll_mask, d->max_vcpus)) ) + __check_vcpu_polling(v, port); +} + +static void evtchn_set_pending_l3(struct vcpu *v, int port) +{ + struct domain *d = v->domain; + unsigned int l1bit = port >> (EVTCHN_WORD_BITORDER(d) << 1); + unsigned int l2bit = port >> EVTCHN_WORD_BITORDER(d); + + if (unlikely(!v->evtchn_pending_sel_l2)) return; - /* Wake any interested (or potentially interested) pollers. */ - for ( vcpuid = find_first_bit(d->poll_mask, d->max_vcpus); - vcpuid < d->max_vcpus; - vcpuid = find_next_bit(d->poll_mask, d->max_vcpus, vcpuid+1) ) + /* + * The following bit operations must happen in strict order. + * NB. On x86, the atomic bit operations also act as memory barriers. + * There is therefore sufficiently strict ordering for this architecture -- + * others may require explicit memory barriers. 
+ */ + + if ( test_and_set_bit(port, d->evtchn_pending) ) + return; + + if ( !test_bit(port, d->evtchn_mask) && + !test_and_set_bit(l2bit, v->evtchn_pending_sel_l2) && + !test_and_set_bit(l1bit, &vcpu_info(v, evtchn_pending_sel)) ) { - v = d->vcpu[vcpuid]; - if ( ((v->poll_evtchn <= 0) || (v->poll_evtchn == port)) && - test_and_clear_bit(vcpuid, d->poll_mask) ) - { - v->poll_evtchn = 0; - vcpu_unblock(v); - } + vcpu_mark_events_pending(v); } + + __check_vcpu_polling(v, port); } static void evtchn_set_pending(struct vcpu *v, int port) @@ -675,6 +726,9 @@ static void evtchn_set_pending(struct vcpu *v, int port) case EVTCHN_EXTENDED_NONE: evtchn_set_pending_l2(v, port); break; + case EVTCHN_EXTENDED_L3: + evtchn_set_pending_l3(v, port); + break; default: BUG(); } @@ -975,6 +1029,38 @@ static int evtchn_unmask_l2(unsigned int port) return 0; } +static int evtchn_unmask_l3(unsigned int port) +{ + struct domain *d = current->domain; + struct vcpu *v; + unsigned int l1bit = port >> (EVTCHN_WORD_BITORDER(d) << 1); + unsigned int l2bit = port >> EVTCHN_WORD_BITORDER(d); + + ASSERT(spin_is_locked(&d->event_lock)); + + if ( unlikely(!port_is_valid(d, port)) ) + return -EINVAL; + + v = d->vcpu[evtchn_from_port(d, port)->notify_vcpu_id]; + + if (unlikely(!v->evtchn_pending_sel_l2)) + return -EINVAL; + + /* + * These operations must happen in strict order. Based on + * include/xen/event.h:evtchn_set_pending(). + */ + if ( test_and_clear_bit(port, d->evtchn_mask) && + test_bit (port, d->evtchn_pending) && + !test_and_set_bit (l2bit, v->evtchn_pending_sel_l2) && + !test_and_set_bit (l1bit, &vcpu_info(v, evtchn_pending_sel)) ) + { + vcpu_mark_events_pending(v); + } + + return 0; +} + int evtchn_unmask(unsigned int port) { struct domain *d = current->domain; @@ -985,6 +1071,9 @@ int evtchn_unmask(unsigned int port) case EVTCHN_EXTENDED_NONE: rc = evtchn_unmask_l2(port); break; + case EVTCHN_EXTENDED_L3: + rc = evtchn_unmask_l3(port); + break; default: BUG(); } @@ -1546,8 +1635,11 @@ static void domain_dump_evtchn_info(struct domain *d) bitmap_scnlistprintf(keyhandler_scratch, sizeof(keyhandler_scratch), d->poll_mask, d->max_vcpus); printk("Event channel information for domain %d:\n" + "Using %s event channel ABI\n" "Polling vCPUs: {%s}\n" - " port [p/m]\n", d->domain_id, keyhandler_scratch); + " port [p/m]\n", + d->domain_id, evtchn_abi_str(d->evtchn_extended), + keyhandler_scratch); spin_lock(&d->event_lock); -- 1.7.10.4
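A concrete walk through the hierarchy these routines maintain may help (illustrative only, for a 64-bit guest): each bit in the per-vCPU L1 selector (vcpu_info's evtchn_pending_sel) covers one 64-bit word of the registered L2 selector, and each L2 bit covers one 64-bit word of the pending bitmap, so a single 64-bit L1 word spans 64 * 64 * 64 = 256k ports.

/* Worked example: delivering port 100000 to a 64-bit guest sets
 *   bit 100000 in the registered pending bitmap,
 *   bit 1562 (port >> 6)  in that vCPU's L2 selector, and
 *   bit 24   (port >> 12) in vcpu_info's evtchn_pending_sel,
 * and the guest walks L1 -> L2 -> pending word to find the port. */
#include <assert.h>

int main(void)
{
    const unsigned int port = 100000;

    assert((port >> 6) == 1562);    /* L2 selector bit */
    assert((port >> 12) == 24);     /* L1 selector bit */
    return 0;
}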
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 17/18] Only allow extended event channel on Dom0 and driver domains
For non-Dom0 domains, add a flag to indicate whether it can use any extended event channel ABIs. Admins can specify this flag when creating a driver domain. The rationale behind this option is, extended event channel ABIs will consume global mapping space in Xen, Admin should have control over these features. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/domain.c | 3 +++ xen/common/domctl.c | 6 +++++- xen/common/event_channel.c | 9 ++++++++- xen/include/public/domctl.h | 3 +++ xen/include/xen/sched.h | 5 +++++ 5 files changed, 24 insertions(+), 2 deletions(-) diff --git a/xen/common/domain.c b/xen/common/domain.c index b360de1..f648601 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -250,6 +250,9 @@ struct domain *domain_create( if ( domcr_flags & DOMCRF_dummy ) return d; + if ( domcr_flags & DOMCRF_evtchn_extended_allowed ) + d->evtchn_extended_allowed = 1; + if ( !is_idle_domain(d) ) { if ( (err = xsm_domain_create(XSM_HOOK, d, ssidref)) != 0 ) diff --git a/xen/common/domctl.c b/xen/common/domctl.c index b7f6619..bb15da4 100644 --- a/xen/common/domctl.c +++ b/xen/common/domctl.c @@ -369,7 +369,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) if ( supervisor_mode_kernel || (op->u.createdomain.flags & ~(XEN_DOMCTL_CDF_hvm_guest | XEN_DOMCTL_CDF_hap | - XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off)) ) + XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off | + XEN_DOMCTL_CDF_evtchn_extended_allowed)) ) break; dom = op->domain; @@ -405,6 +406,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) domcr_flags |= DOMCRF_s3_integrity; if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_oos_off ) domcr_flags |= DOMCRF_oos_off; + if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_evtchn_extended_allowed ) + domcr_flags |= DOMCRF_evtchn_extended_allowed; + d = domain_create(dom, domcr_flags, op->u.createdomain.ssidref); if ( IS_ERR(d) ) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 504d769..a49fe3b 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1409,7 +1409,11 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) case EVTCHNOP_query_extended_abis: { struct evtchn_query_extended_abis query; - query.abis = extended_event_channel; + struct domain *d = current->domain; + if ( d->domain_id == 0 || d->evtchn_extended_allowed ) + query.abis = extended_event_channel; + else + query.abis = 0; rc = 0; if ( __copy_to_guest(arg, &query, 1) ) rc = -EFAULT; @@ -1418,6 +1422,9 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) case EVTCHNOP_register_3level: { struct evtchn_register_3level reg; + struct domain *d = current->domain; + if ( d->domain_id != 0 && !d->evtchn_extended_allowed ) + return -EPERM; if ( copy_from_guest(®, arg, 1) != 0 ) return -EFAULT; rc = evtchn_register_3level(®); diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h index deb19db..40e9486 100644 --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -59,6 +59,9 @@ struct xen_domctl_createdomain { /* Disable out-of-sync shadow page tables? */ #define _XEN_DOMCTL_CDF_oos_off 3 #define XEN_DOMCTL_CDF_oos_off (1U<<_XEN_DOMCTL_CDF_oos_off) + /* Can this domain use any extended event channel ABIs? 
+#define _XEN_DOMCTL_CDF_evtchn_extended_allowed 4
+#define XEN_DOMCTL_CDF_evtchn_extended_allowed (1U<<_XEN_DOMCTL_CDF_evtchn_extended_allowed)
     uint32_t flags;
 };
 typedef struct xen_domctl_createdomain xen_domctl_createdomain_t;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index cca5e7f..b190fd0 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -256,6 +256,8 @@ struct domain
     bool_t           is_paused_by_controller;
     /* Domain's VCPUs are pinned 1:1 to physical CPUs? */
     bool_t           is_pinned;
+    /* Can this domain use any extended event channel ABIs? */
+    bool_t           evtchn_extended_allowed;

     /* Are any VCPUs polling event channels (SCHEDOP_poll)? */
 #if MAX_VIRT_CPUS <= BITS_PER_LONG
@@ -411,6 +413,9 @@ struct domain *domain_create(
 /* DOMCRF_oos_off: dont use out-of-sync optimization for shadow page tables */
 #define _DOMCRF_oos_off       4
 #define DOMCRF_oos_off        (1U<<_DOMCRF_oos_off)
+/* DOMCRF_evtchn_extended_allowed: this domain can use extended evtchn ABIs */
+#define _DOMCRF_evtchn_extended_allowed 5
+#define DOMCRF_evtchn_extended_allowed  (1U<<_DOMCRF_evtchn_extended_allowed)

 /*
  * rcu_lock_domain_by_id() is more efficient than get_domain_by_id().
--
1.7.10.4
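
The access check this patch adds boils down to a single predicate: Dom0 may
always see and register extended ABIs, other domains only if the creation
flag was set. The sketch below restates that rule outside Xen; the struct
and field names are simplified stand-ins, not the real hypervisor types.

/* Standalone sketch of the rule applied above to EVTCHNOP_query_extended_abis
 * and EVTCHNOP_register_3level.  Not Xen code. */
#include <stdbool.h>
#include <stdio.h>

struct domain_sketch {
    unsigned int domain_id;
    bool evtchn_extended_allowed;   /* set from DOMCRF_evtchn_extended_allowed */
};

static bool may_use_extended_evtchn(const struct domain_sketch *d)
{
    return d->domain_id == 0 || d->evtchn_extended_allowed;
}

int main(void)
{
    struct domain_sketch dom0   = { 0, false };
    struct domain_sketch drvdom = { 5, true  };   /* driver domain, flag set */
    struct domain_sketch domu   = { 7, false };   /* plain DomU */

    printf("dom0: %d, driver domain: %d, plain DomU: %d\n",
           may_use_extended_evtchn(&dom0),
           may_use_extended_evtchn(&drvdom),
           may_use_extended_evtchn(&domu));
    return 0;
}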
Wei Liu
2013-Mar-05 12:30 UTC
[RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag
Admins can add "evtchn_extended_allowed = 1" in domain config file to enable
extended event channel ABI for a domain.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
---
 docs/man/xl.cfg.pod.5       |   10 ++++++++++
 tools/libxl/libxl_create.c  |    4 ++++
 tools/libxl/libxl_types.idl |    1 +
 tools/libxl/xl_cmdimpl.c    |    3 +++
 4 files changed, 18 insertions(+)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 25523c9..6e9b4ff 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -133,6 +133,16 @@ the same time, achieving efficient utilization of the host's CPUs and RAM.

 =back

+=head3 Event Channel ABIs
+
+=over 4
+
+=item B<evtchn_extended_allowed=BOOLEAN>
+
+Flag for allowing domain to use any of the extended event channel ABIs.
+
+=back
+
 =head3 CPU Scheduling

 =over 4
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index efeebf2..541e34b 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -35,6 +35,8 @@ int libxl__domain_create_info_setdefault(libxl__gc *gc,
         libxl_defbool_setdefault(&c_info->oos, true);
     }

+    libxl_defbool_setdefault(&c_info->evtchn_extended_allowed, false);
+
     libxl_defbool_setdefault(&c_info->run_hotplug_scripts, true);

     return 0;
@@ -406,6 +408,8 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_create_info *info,
         flags |= libxl_defbool_val(info->hap) ? XEN_DOMCTL_CDF_hap : 0;
         flags |= libxl_defbool_val(info->oos) ? 0 : XEN_DOMCTL_CDF_oos_off;
     }
+    flags |= libxl_defbool_val(info->evtchn_extended_allowed) ?
+        XEN_DOMCTL_CDF_evtchn_extended_allowed : 0;

     *domid = -1;

     /* Ultimately, handle is an array of 16 uint8_t, same as uuid */
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 5b080ed..2430827 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -237,6 +237,7 @@ libxl_domain_create_info = Struct("domain_create_info",[
     ("type", libxl_domain_type),
     ("hap", libxl_defbool),
     ("oos", libxl_defbool),
+    ("evtchn_extended_allowed",libxl_defbool),
     ("ssidref", uint32),
     ("name", string),
     ("uuid", libxl_uuid),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index a98705e..1ad9b28 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -651,6 +651,9 @@ static void parse_config_data(const char *config_source,

     xlu_cfg_get_defbool(config, "oos", &c_info->oos, 0);

+    xlu_cfg_get_defbool(config, "evtchn_extended_allowed",
+                        &c_info->evtchn_extended_allowed, 0);
+
     if (!xlu_cfg_get_string (config, "pool", &buf, 0)) {
         c_info->poolid = -1;
         cpupool_qualifier_to_cpupoolid(buf, &c_info->poolid, NULL);
--
1.7.10.4
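
For reference, the flag introduced above is meant to be set from the xl
domain configuration file, as the commit message says. A minimal
illustrative fragment follows; the domain name and memory size are
placeholders, only the last line comes from this patch.

# Example xl config fragment for a driver domain that is allowed to
# negotiate an extended event channel ABI.
name    = "driverdom"
memory  = 512
evtchn_extended_allowed = 1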
Ian Jackson
2013-Mar-05 13:48 UTC
Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag
Wei Liu writes ("[RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag"):> Admins can add "evtchn_extended_allowed = 1" in domain config file to enable > extended event channel ABI for a domain.Why does it default to false ?> +=item B<evtchn_extended_allowed=BOOLEAN> > + > +Flag for allowing domain to use any of the extended event channel ABIs.This fails to explain why one might want to disable this. The only reason that comes to my mind is in case it has a security vulnerability, an admin who wasn''t currently using it could disable it. Are there other reasons ? This is an odd phrasing given that there is (currently in existence) only one extended event channel ABI. And in the future when we introduce more we probably want to be able to disable them individually ? Ian.
Jan Beulich
2013-Mar-05 14:22 UTC
Re: [RFC PATCH V4 15/18] Infrastructure to manipulate 3-level event channel pages
>>> On 05.03.13 at 13:30, Wei Liu <wei.liu2@citrix.com> wrote:
> +    memcpy(d->evtchn_pending, &shared_info(d, evtchn_pending),
> +           sizeof(shared_info(d, evtchn_pending)));
> +    memcpy(d->evtchn_mask, &shared_info(d, evtchn_mask),
> +           sizeof(shared_info(d, evtchn_mask)));

Is there any point in this copying? If events are allowed to exist (and
potentially trigger) at this point, then it is unsafe to do it this way.
And if no active events are permitted, then I don't see the need to copy
anything here (just memset() the space to the intended initial value).

Jan
Jan Beulich
2013-Mar-05 14:28 UTC
Re: [RFC PATCH V4 16/18] Implement 3-level event channel routines
>>> On 05.03.13 at 13:30, Wei Liu <wei.liu2@citrix.com> wrote:
> @@ -1546,8 +1635,11 @@ static void domain_dump_evtchn_info(struct domain *d)
>      bitmap_scnlistprintf(keyhandler_scratch, sizeof(keyhandler_scratch),
>                           d->poll_mask, d->max_vcpus);
>      printk("Event channel information for domain %d:\n"
> +           "Using %s event channel ABI\n"
>             "Polling vCPUs: {%s}\n"
> -           " port [p/m]\n", d->domain_id, keyhandler_scratch);
> +           " port [p/m]\n",
> +           d->domain_id, evtchn_abi_str(d->evtchn_extended),
> +           keyhandler_scratch);
>
>      spin_lock(&d->event_lock);
>

Afaics there's no guarding being added in the whole series against
the dumping taking overly long. Doing this for 4,000 ports is
already risky, but doing this for up to 256,000 ports is clearly
too much. So without adjustment the 'e' debug key becomes
unusable particularly on large systems (where the eventual need
for it may be highest, as having the highest chances of running
into problems).

Jan
Wei Liu
2013-Mar-05 16:07 UTC
Re: [RFC PATCH V4 16/18] Implement 3-level event channel routines
On Tue, 2013-03-05 at 14:28 +0000, Jan Beulich wrote:
> >>> On 05.03.13 at 13:30, Wei Liu <wei.liu2@citrix.com> wrote:
> > @@ -1546,8 +1635,11 @@ static void domain_dump_evtchn_info(struct domain *d)
> >      bitmap_scnlistprintf(keyhandler_scratch, sizeof(keyhandler_scratch),
> >                           d->poll_mask, d->max_vcpus);
> >      printk("Event channel information for domain %d:\n"
> > +           "Using %s event channel ABI\n"
> >             "Polling vCPUs: {%s}\n"
> > -           " port [p/m]\n", d->domain_id, keyhandler_scratch);
> > +           " port [p/m]\n",
> > +           d->domain_id, evtchn_abi_str(d->evtchn_extended),
> > +           keyhandler_scratch);
> >
> >      spin_lock(&d->event_lock);
> >
>
> Afaics there's no guarding being added in the whole series against
> the dumping taking overly long. Doing this for 4,000 ports is
> already risky, but doing this for up to 256,000 ports is clearly
> too much. So without adjustment the 'e' debug key becomes
> unusable particularly on large systems (where the eventual need
> for it may be highest, as having the highest chances of running
> into problems).
>

One solution I can think of is to print out information on per-bucket
basis, that's about 512 ports processed each time. Does this look
reasonable?

Wei.

> Jan
>
Jan Beulich
2013-Mar-05 16:13 UTC
Re: [RFC PATCH V4 16/18] Implement 3-level event channel routines
>>> On 05.03.13 at 17:07, Wei Liu <wei.liu2@citrix.com> wrote:
> On Tue, 2013-03-05 at 14:28 +0000, Jan Beulich wrote:
>> >>> On 05.03.13 at 13:30, Wei Liu <wei.liu2@citrix.com> wrote:
>> > @@ -1546,8 +1635,11 @@ static void domain_dump_evtchn_info(struct domain
> *d)
>> >      bitmap_scnlistprintf(keyhandler_scratch, sizeof(keyhandler_scratch),
>> >                           d->poll_mask, d->max_vcpus);
>> >      printk("Event channel information for domain %d:\n"
>> > +           "Using %s event channel ABI\n"
>> >             "Polling vCPUs: {%s}\n"
>> > -           " port [p/m]\n", d->domain_id, keyhandler_scratch);
>> > +           " port [p/m]\n",
>> > +           d->domain_id, evtchn_abi_str(d->evtchn_extended),
>> > +           keyhandler_scratch);
>> >
>> >      spin_lock(&d->event_lock);
>> >
>>
>> Afaics there's no guarding being added in the whole series against
>> the dumping taking overly long. Doing this for 4,000 ports is
>> already risky, but doing this for up to 256,000 ports is clearly
>> too much. So without adjustment the 'e' debug key becomes
>> unusable particularly on large systems (where the eventual need
>> for it may be highest, as having the highest chances of running
>> into problems).
>>
>
> One solution I can think of is to print out information on per-bucket
> basis, that's about 512 ports processed each time. Does this look
> reasonable?

Yes, sure.

Jan
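
The per-bucket dumping agreed on here bounds how much work is done before
letting other processing run. The following is only a standalone model of
that idea, not a posted patch: yield_point() stands in for what would be
process_pending_softirqs() in Xen, and in the real keyhandler the event
lock would have to be dropped and re-taken around it.

/* Standalone model of chunked dumping: handle EVTCHNS_PER_BUCKET ports per
 * iteration, then yield before the next bucket. */
#include <stdio.h>

#define EVTCHNS_PER_BUCKET 512

static void yield_point(void) { /* process_pending_softirqs() in Xen */ }

static void dump_bucket(unsigned int first, unsigned int last)
{
    printf("dumping ports %u-%u\n", first, last);  /* the " port [p/m]" lines */
}

int main(void)
{
    unsigned int max_ports = 262144;   /* 3-level limit for a 64-bit guest */

    for ( unsigned int port = 0; port < max_ports; port += EVTCHNS_PER_BUCKET )
    {
        dump_bucket(port, port + EVTCHNS_PER_BUCKET - 1);
        yield_point();                 /* lock dropped/re-taken here in Xen */
    }
    return 0;
}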
Wei Liu
2013-Mar-05 17:11 UTC
Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag
On Tue, 2013-03-05 at 13:48 +0000, Ian Jackson wrote:
> Wei Liu writes ("[RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag"):
> > Admins can add "evtchn_extended_allowed = 1" in domain config file to enable
> > extended event channel ABI for a domain.
>
> Why does it default to false ?
>
> > +=item B<evtchn_extended_allowed=BOOLEAN>
> > +
> > +Flag for allowing domain to use any of the extended event channel ABIs.
>
> This fails to explain why one might want to disable this. The only
> reason that comes to my mind is in case it has a security
> vulnerability, an admin who wasn't currently using it could disable
> it.
>
> Are there other reasons ?
>

It is not for security reason.

The main concern is that a) extended event channel might use too much
global mapping space in Xen; b) in 3-level ABI's case, normal DomU will
never consume so many event channels.

> This is an odd phrasing given that there is (currently in existence)
> only one extended event channel ABI. And in the future when we
> introduce more we probably want to be able to disable them
> individually ?
>

This option in fact was introduced as 3-level event channel ABI only,
the name was evtchn_l3 or something at first. But Jan later suggested
naming it something more generic.

If we are sure that we want to enable extended event channel for all
later ABIs, this option can be restrict to 3-level ABI.

Wei.

> Ian.
Ian Jackson
2013-Mar-05 17:38 UTC
Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag
Wei Liu writes ("Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag"):> On Tue, 2013-03-05 at 13:48 +0000, Ian Jackson wrote: > > This fails to explain why one might want to disable this. The only > > reason that comes to my mind is in case it has a security > > vulnerability, an admin who wasn''t currently using it could disable > > it. > > > > Are there other reasons ? > > It is not for security reason. > > The main concern is that a) extended event channel might use too much > global mapping space in Xen; b) in 3-level ABI''s case, normal DomU will > never consume so many event channels.This is rather opaque from the documentation as proposed. Perhaps a limit on the total number of event channels for a domain would make more sense ?> > This is an odd phrasing given that there is (currently in existence) > > only one extended event channel ABI. And in the future when we > > introduce more we probably want to be able to disable them > > individually ? > > This option in fact was introduced as 3-level event channel ABI only, > the name was evtchn_l3 or something at first. But Jan later suggested > naming it something more generic.Well, to an extent we''re trying to predict what we might want to enable/disable in the future.> If we are sure that we want to enable extended event channel for all > later ABIs, this option can be restrict to 3-level ABI.Maybe we want to enable/disable them separately. Ian.
Wei Liu
2013-Mar-05 17:51 UTC
Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag
On Tue, Mar 05, 2013 at 05:38:54PM +0000, Ian Jackson wrote:
> Wei Liu writes ("Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag"):
> > On Tue, 2013-03-05 at 13:48 +0000, Ian Jackson wrote:
> > > This fails to explain why one might want to disable this. The only
> > > reason that comes to my mind is in case it has a security
> > > vulnerability, an admin who wasn't currently using it could disable
> > > it.
> > >
> > > Are there other reasons ?
> >
> > It is not for security reason.
> >
> > The main concern is that a) extended event channel might use too much
> > global mapping space in Xen; b) in 3-level ABI's case, normal DomU will
> > never consume so many event channels.
>
> This is rather opaque from the documentation as proposed. Perhaps a
> limit on the total number of event channels for a domain would make
> more sense ?

I will improve the documentation. But putting a limit on total number of
event channels for a domain for now is not what I expect, because a)
having limit on 2/3-level event channels brings no significant
improvement, b) the infrastructure to notify a guest about its limit
doesn't exists.

> > > This is an odd phrasing given that there is (currently in existence)
> > > only one extended event channel ABI. And in the future when we
> > > introduce more we probably want to be able to disable them
> > > individually ?
> >
> > This option in fact was introduced as 3-level event channel ABI only,
> > the name was evtchn_l3 or something at first. But Jan later suggested
> > naming it something more generic.
>
> Well, to an extent we're trying to predict what we might want to
> enable/disable in the future.
>

It is hard to predict, as 3-level ABI is the only extended ABI that exists
at the moment...

> > If we are sure that we want to enable extended event channel for all
> > later ABIs, this option can be restrict to 3-level ABI.
>
> Maybe we want to enable/disable them separately.
>

Then let's restrict it to 3-level ABI, say, name it
evtchn_extended_l3_allowed.

Wei.

> Ian.
David Vrabel
2013-Mar-05 17:56 UTC
Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag
On 05/03/13 17:51, Wei Liu wrote:
> On Tue, Mar 05, 2013 at 05:38:54PM +0000, Ian Jackson wrote:
>> Wei Liu writes ("Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag"):
>>> On Tue, 2013-03-05 at 13:48 +0000, Ian Jackson wrote:
>>>> This fails to explain why one might want to disable this. The only
>>>> reason that comes to my mind is in case it has a security
>>>> vulnerability, an admin who wasn't currently using it could disable
>>>> it.
>>>>
>>>> Are there other reasons ?
>>>
>>> It is not for security reason.
>>>
>>> The main concern is that a) extended event channel might use too much
>>> global mapping space in Xen; b) in 3-level ABI's case, normal DomU will
>>> never consume so many event channels.
>>
>> This is rather opaque from the documentation as proposed. Perhaps a
>> limit on the total number of event channels for a domain would make
>> more sense ?
>
> I will improve the documentation. But putting a limit on total number of
> event channels for a domain for now is not what I expect, because a)
> having limit on 2/3-level event channels brings no significant
> improvement, b) the infrastructure to notify a guest about its limit
> doesn't exists.

The user-visible limit option could be toolstack only. i.e., internally
libxc decides a limit of > 4096 (> 1024 for a 32-bit x86 guest) requires
enabling extended event channels.

David
Wei Liu
2013-Mar-05 18:08 UTC
Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag
On Tue, Mar 05, 2013 at 05:56:12PM +0000, David Vrabel wrote:
> On 05/03/13 17:51, Wei Liu wrote:
> > On Tue, Mar 05, 2013 at 05:38:54PM +0000, Ian Jackson wrote:
> >> Wei Liu writes ("Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag"):
> >>> On Tue, 2013-03-05 at 13:48 +0000, Ian Jackson wrote:
> >>>> This fails to explain why one might want to disable this. The only
> >>>> reason that comes to my mind is in case it has a security
> >>>> vulnerability, an admin who wasn't currently using it could disable
> >>>> it.
> >>>>
> >>>> Are there other reasons ?
> >>>
> >>> It is not for security reason.
> >>>
> >>> The main concern is that a) extended event channel might use too much
> >>> global mapping space in Xen; b) in 3-level ABI's case, normal DomU will
> >>> never consume so many event channels.
> >>
> >> This is rather opaque from the documentation as proposed. Perhaps a
> >> limit on the total number of event channels for a domain would make
> >> more sense ?
> >
> > I will improve the documentation. But putting a limit on total number of
> > event channels for a domain for now is not what I expect, because a)
> > having limit on 2/3-level event channels brings no significant
> > improvement, b) the infrastructure to notify a guest about its limit
> > doesn't exists.
>
> The user-visible limit option could be toolstack only. i.e., internally
> libxc decides a limit of > 4096 (> 1024 for a 32-bit x86 guest) requires
> enabling extended event channels.

True. This can also benefit future ABIs. But I think I should add the
decision making logic in libxl, not libxc.

Wei.

>
> David
Wei Liu
2013-Mar-06 17:16 UTC
Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag
Let's start with documentation. I came up with two options for libxl, I
plan to choose one from them. The first one is generic, the second one
is 3-level ABI centric.

=item B<max_event_channels=NUMBER>

Maximum number of event channels a guest can use. The number should be
within (0, 32768] for a 32-bit guest and (0, 262144] for a 64-bit guest.
libxl will decide which ABI to use. In most cases the user doesn't need
to use this option. The default value for this option is 0, which is
interpreted by libxl as "use the default 2-level event channel ABI".

Users should be aware that specifying this option might enable extended
event channel ABIs, which consume a certain amount of Xen internal
resources (memory, address space). The amount of resources consumed by
the extended ABIs is implementation-specific.

Currently 2/3-level event channel ABIs are available. For a 32-bit guest,
if B<max_event_channels> > 1024, the 3-level ABI will be used. For a
64-bit guest, if B<max_event_channels> > 4096, the 3-level ABI will be
used. In other cases the 2-level ABI will be used. The internal logic of
libxl choosing the correct ABI might change if more ABIs become
available.

=item B<event_channel_3level_abi_allowed=BOOLEAN>

Flag for allowing a domain to use the 3-level event channel ABI, which is
designed for Dom0 and driver domains. This ABI provides 32K event
channels for a 32-bit guest and 256K event channels for a 64-bit guest.

The default value of this option is false, as this ABI consumes resources
inside Xen (memory, address space). A normal DomU will never use so many
event channels. Users may want to enable this ABI for driver domains
only.

Which one makes more sense to you?

Wei.
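
Option one implies a small piece of toolstack logic that maps the requested
number of event channels onto an ABI. The sketch below is hypothetical (no
such function exists in the series); it only encodes the 2-level limits
quoted in the draft text, 1024 ports for 32-bit guests and 4096 for 64-bit
guests.

/* Hypothetical toolstack-side ABI selection for option one. */
#include <stdio.h>

enum evtchn_abi { EVTCHN_ABI_2LEVEL, EVTCHN_ABI_3LEVEL };

static enum evtchn_abi choose_evtchn_abi(unsigned int max_event_channels,
                                         int guest_is_64bit)
{
    unsigned int l2_limit = guest_is_64bit ? 4096 : 1024;

    if ( max_event_channels == 0 || max_event_channels <= l2_limit )
        return EVTCHN_ABI_2LEVEL;   /* default 2-level ABI is enough */
    return EVTCHN_ABI_3LEVEL;       /* needs the extended 3-level ABI */
}

int main(void)
{
    printf("64-bit guest, 4096 channels   -> %s\n",
           choose_evtchn_abi(4096, 1) == EVTCHN_ABI_2LEVEL ? "2-level" : "3-level");
    printf("64-bit guest, 100000 channels -> %s\n",
           choose_evtchn_abi(100000, 1) == EVTCHN_ABI_2LEVEL ? "2-level" : "3-level");
    return 0;
}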
David Vrabel
2013-Mar-07 11:23 UTC
Re: [RFC PATCH V4 18/18] libxl: add evtchn_extended_allowed flag
On 06/03/13 17:16, Wei Liu wrote:
> Let's start with documentation. I came up with two options for libxl, I
> plan to choose one from them. The first one is generic, the second one
> is 3-level ABI centric.
>
> =item B<max_event_channels=NUMBER>
>

I would remove all references to specific ABIs where possible. Perhaps:

"=item B<max_event_channels=NUMBER>

Maximum number of event channels a guest can use.

Use this to limit the amount of Xen resources the guest may use for
event channels.

The default value for this option is 1024 which is sufficient for a
normal guest. Driver or service domains may need more.

Depending on architecture and the support available in the guest, the
guest may be restricted to less than this limit."

David
David Vrabel
2013-Mar-15 18:15 UTC
Re: [RFC PATCH V4 11/18] Bump EVTCHNS_PER_BUCKET to 512
On 05/03/13 12:30, Wei Liu wrote:
> For 64 bit build and 3-level event channel and the original value of
> EVTCHNS_PER_BUCKET (128), the space needed to accommodate d->evtchn would be 4
> pages (PAGE_SIZE = 4096). Given that not every domain needs 3-level event
> channel, this leads to waste of memory. Having EVTCHN_PER_BUCKETS to be 512
> can occupy exact one page.

This makes the list of buckets a page in size but each bucket is now 4
pages. I think another layer of indirection is required to keep all the
allocations <= PAGE_SIZE.

David

> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/include/xen/event.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
> index 919f0e2..e2c3736 100644
> --- a/xen/include/xen/event.h
> +++ b/xen/include/xen/event.h
> @@ -39,7 +39,7 @@ static inline unsigned int max_evtchns(struct domain *d)
>      return ret;
>  }
>
> -#define EVTCHNS_PER_BUCKET 128
> +#define EVTCHNS_PER_BUCKET 512
>  #define NR_EVTCHN_BUCKETS  (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET)
>
>  struct evtchn
On Fri, 2013-03-15 at 18:15 +0000, David Vrabel wrote:
> On 05/03/13 12:30, Wei Liu wrote:
> > For 64 bit build and 3-level event channel and the original value of
> > EVTCHNS_PER_BUCKET (128), the space needed to accommodate d->evtchn would be 4
> > pages (PAGE_SIZE = 4096). Given that not every domain needs 3-level event
> > channel, this leads to waste of memory. Having EVTCHN_PER_BUCKETS to be 512
> > can occupy exact one page.
>
> This makes the list of buckets a page in size but each bucket is now 4
> pages.
>

I'm confused.

On 64 bit w/ FLASK enabled, sizeof(struct evtchn) = 18B, bucket size
9216B = 2.25 pages; w/o FLASK enabled, sizeof(struct evtchn) = 10B,
bucket size = 5120B = 1.25 pages.

Yesterday I did start rewriting this bit to optimize space usage
though. :-)

Wei.
David Vrabel
2013-Mar-15 18:43 UTC
Re: [RFC PATCH V4 11/18] Bump EVTCHNS_PER_BUCKET to 512
On 15/03/13 18:37, Wei Liu wrote:
> On Fri, 2013-03-15 at 18:15 +0000, David Vrabel wrote:
>> On 05/03/13 12:30, Wei Liu wrote:
>>> For 64 bit build and 3-level event channel and the original value of
>>> EVTCHNS_PER_BUCKET (128), the space needed to accommodate d->evtchn would be 4
>>> pages (PAGE_SIZE = 4096). Given that not every domain needs 3-level event
>>> channel, this leads to waste of memory. Having EVTCHN_PER_BUCKETS to be 512
>>> can occupy exact one page.
>>
>> This makes the list of buckets a page in size but each bucket is now 4
>> pages.
>>
>
> I'm confused.

3 pages (not 4) with flask, 2 pages without.

> On 64 bit w/ FLASK enabled, sizeof(struct evtchn) = 18B, bucket size
> 9216B = 2.25 pages; w/o FLASK enabled, sizeof(struct evtchn) = 10B,
> bucket size = 5120B = 1.25 pages.

struct evtchn isn't packed so it is 24 bytes with flask and 16 without.

David
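
The bucket sizes being debated here follow directly from the padded
structure size. The quick check below takes the 24-byte and 16-byte
figures from the discussion rather than recomputing them from the real
struct evtchn.

/* Arithmetic check of the per-bucket allocation sizes quoted above. */
#include <stdio.h>

#define PAGE_SIZE          4096
#define EVTCHNS_PER_BUCKET 512

int main(void)
{
    unsigned int with_flask = 24, without_flask = 16;   /* padded sizes */

    printf("with FLASK:    %u B/bucket = %u pages\n",
           EVTCHNS_PER_BUCKET * with_flask,
           (EVTCHNS_PER_BUCKET * with_flask + PAGE_SIZE - 1) / PAGE_SIZE);
    printf("without FLASK: %u B/bucket = %u pages\n",
           EVTCHNS_PER_BUCKET * without_flask,
           (EVTCHNS_PER_BUCKET * without_flask + PAGE_SIZE - 1) / PAGE_SIZE);
    return 0;
}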