Hi all This is another version of the patch series. Unfortunately the kernel side is not available at the moment. :-( Keir, Jan, Ian and David, are you happy with this design in general? I would like to have explicit ACK / NACK on this if possible, as feature freeze for 4.3 is quite close now. Changes since V2: * new interface to register extended event channel ABI * use vmap to simplify mapping * replace MAX_EVTCHNS macro with inline function * libxl: evtchn_l3 -> evtchn_extended The most notable bit of this series is the interface change. In order to cope with future ABIs, the interface is renamed to EVTCHNOP_register_extended. It also provides supported ABI query, so that we can remove unused ABI in the future. The semantic meaning of EVTCHNOP_register_extended changes a bit. The `level'' in parameter now changes to `cmd'', which means we should go down to specific ABI routines. ABI-specific structures are still embedded in the union. Changes since V1: * move all evtchn related macros / struct definitions to event.h * only allow 3-level evtchn for Dom0 and driver domains * add evtchn_l3 flag in libxl Wei. Diffstat: tools/libxl/libxl_create.c | 4 + tools/libxl/libxl_types.idl | 1 + tools/libxl/xl_cmdimpl.c | 2 + tools/libxl/xl_sxp.c | 2 + xen/arch/arm/domain.c | 1 + xen/arch/x86/domain.c | 1 + xen/arch/x86/irq.c | 7 +- xen/common/domain.c | 3 + xen/common/domctl.c | 5 +- xen/common/event_channel.c | 458 +++++++++++++++++++++++++++++++++--- xen/common/keyhandler.c | 6 +- xen/common/schedule.c | 4 +- xen/include/asm-arm/types.h | 7 +- xen/include/asm-x86/config.h | 4 +- xen/include/public/domctl.h | 3 + xen/include/public/event_channel.h | 44 ++++ xen/include/public/xen.h | 35 ++- xen/include/xen/event.h | 85 ++++++- xen/include/xen/sched.h | 65 ++--- xen/xsm/flask/hooks.c | 1 + 20 files changed, 623 insertions(+), 115 deletions(-)
Affected files: * event_channel.c * sched.h * event.h * xen.h Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 16 ++++++++-------- xen/include/public/xen.h | 22 +++++++++++----------- xen/include/xen/event.h | 4 ++-- xen/include/xen/sched.h | 6 +++--- 4 files changed, 24 insertions(+), 24 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 2d7afc9..9231eb0 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1,15 +1,15 @@ /****************************************************************************** * event_channel.c - * + * * Event notifications from VIRQs, PIRQs, and other domains. - * + * * Copyright (c) 2003-2006, K A Fraser. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -238,7 +238,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) lchn->u.interdomain.remote_dom = rd; lchn->u.interdomain.remote_port = (u16)rport; lchn->state = ECS_INTERDOMAIN; - + rchn->u.interdomain.remote_dom = ld; rchn->u.interdomain.remote_port = (u16)lport; rchn->state = ECS_INTERDOMAIN; @@ -255,7 +255,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) spin_unlock(&ld->event_lock); if ( ld != rd ) spin_unlock(&rd->event_lock); - + rcu_unlock_domain(rd); return rc; @@ -633,7 +633,7 @@ static void evtchn_set_pending(struct vcpu *v, int port) { vcpu_mark_events_pending(v); } - + /* Check if some VCPU might be polling for this event. */ if ( likely(bitmap_empty(d->poll_mask, d->max_vcpus)) ) return; @@ -930,7 +930,7 @@ int evtchn_unmask(unsigned int port) /* * These operations must happen in strict order. Based on - * include/xen/event.h:evtchn_set_pending(). + * include/xen/event.h:evtchn_set_pending(). */ if ( test_and_clear_bit(port, &shared_info(d, evtchn_mask)) && test_bit (port, &shared_info(d, evtchn_pending)) && diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h index 99c8212..87380e6 100644 --- a/xen/include/public/xen.h +++ b/xen/include/public/xen.h @@ -1,8 +1,8 @@ /****************************************************************************** * xen.h - * + * * Guest OS interface to Xen. - * + * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to * deal in the Software without restriction, including without limitation the @@ -137,11 +137,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t); #define __HYPERVISOR_dom0_op __HYPERVISOR_platform_op #endif -/* +/* * VIRTUAL INTERRUPTS - * + * * Virtual interrupts that a guest OS may receive from Xen. - * + * * In the side comments, ''V.'' denotes a per-VCPU VIRQ while ''G.'' denotes a * global VIRQ. The former can be bound once per VCPU and cannot be re-bound. * The latter can be allocated only once per guest: they must initially be @@ -190,7 +190,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t); * (x) encodes the PFD as follows: * x == 0 => PFD == DOMID_SELF * x != 0 => PFD == x - 1 - * + * * Sub-commands: ptr[1:0] specifies the appropriate MMU_* command. * ------------- * ptr[1:0] == MMU_NORMAL_PT_UPDATE: @@ -236,13 +236,13 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t); * To deallocate the pages, the operations are the reverse of the steps * mentioned above. The argument is MMUEXT_UNPIN_TABLE for all levels and the * pagetable MUST not be in use (meaning that the cr3 is not set to it). - * + * * ptr[1:0] == MMU_MACHPHYS_UPDATE: * Updates an entry in the machine->pseudo-physical mapping table. * ptr[:2] -- Machine address within the frame whose mapping to modify. * The frame must belong to the FD, if one is specified. * val -- Value to write into the mapping entry. - * + * * ptr[1:0] == MMU_PT_UPDATE_PRESERVE_AD: * As MMU_NORMAL_PT_UPDATE above, but A/D bits currently in the PTE are ORed * with those in @val. @@ -588,7 +588,7 @@ typedef struct vcpu_time_info vcpu_time_info_t; struct vcpu_info { /* * ''evtchn_upcall_pending'' is written non-zero by Xen to indicate - * a pending notification for a particular VCPU. It is then cleared + * a pending notification for a particular VCPU. It is then cleared * by the guest OS /before/ checking for pending work, thus avoiding * a set-and-check race. Note that the mask is only accessed by Xen * on the CPU that is currently hosting the VCPU. This means that the @@ -646,7 +646,7 @@ struct shared_info { * 3. Virtual interrupts (''events''). A domain can bind an event-channel * port to a virtual interrupt source, such as the virtual-timer * device or the emergency console. - * + * * Event channels are addressed by a "port index". Each channel is * associated with two bits of information: * 1. PENDING -- notifies the domain that there is a pending notification @@ -657,7 +657,7 @@ struct shared_info { * becomes pending while the channel is masked then the ''edge'' is lost * (i.e., when the channel is unmasked, the guest must manually handle * pending notifications as no upcall will be scheduled by Xen). - * + * * To expedite scanning of pending notifications, any 0->1 pending * transition on an unmasked channel causes a corresponding bit in a * per-vcpu selector word to be set. Each bit in the selector covers a diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 71c3e92..65ac81a 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -1,8 +1,8 @@ /****************************************************************************** * event.h - * + * * A nice interface for passing asynchronous events to guest OSes. - * + * * Copyright (c) 2002-2006, K A Fraser */ diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index ba0f2f8..f6846d4 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -92,7 +92,7 @@ void evtchn_destroy_final(struct domain *d); /* from complete_domain_destroy */ struct waitqueue_vcpu; -struct vcpu +struct vcpu { int vcpu_id; @@ -453,7 +453,7 @@ struct domain *domain_create( /* * rcu_lock_domain_by_id() is more efficient than get_domain_by_id(). * This is the preferred function if the returned domain reference - * is short lived, but it cannot be used if the domain reference needs + * is short lived, but it cannot be used if the domain reference needs * to be kept beyond the current scope (e.g., across a softirq). * The returned domain reference must be discarded using rcu_unlock_domain(). */ @@ -574,7 +574,7 @@ void sync_local_execstate(void); * sync_vcpu_execstate() will switch and commit @prev''s state. */ void context_switch( - struct vcpu *prev, + struct vcpu *prev, struct vcpu *next); /* -- 1.7.10.4
As we move to N level evtchn we need bigger d->evtchn, as a result this will bloat struct domain. So move this array out of struct domain and allocate a dedicated page for it. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 17 +++++++++++++++-- xen/include/xen/sched.h | 2 +- 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 9231eb0..3293f91 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1172,15 +1172,26 @@ void notify_via_xen_event_channel(struct domain *ld, int lport) int evtchn_init(struct domain *d) { + BUILD_BUG_ON(sizeof(struct evtchn *) * NR_EVTCHN_BUCKETS > PAGE_SIZE); + d->evtchn = alloc_xenheap_page(); + + if ( d->evtchn == NULL ) + return -ENOMEM; + clear_page(d->evtchn); + spin_lock_init(&d->event_lock); - if ( get_free_port(d) != 0 ) + if ( get_free_port(d) != 0 ) { + free_xenheap_page(d->evtchn); return -EINVAL; + } evtchn_from_port(d, 0)->state = ECS_RESERVED; #if MAX_VIRT_CPUS > BITS_PER_LONG d->poll_mask = xmalloc_array(unsigned long, BITS_TO_LONGS(MAX_VIRT_CPUS)); - if ( !d->poll_mask ) + if ( !d->poll_mask ) { + free_xenheap_page(d->evtchn); return -ENOMEM; + } bitmap_zero(d->poll_mask, MAX_VIRT_CPUS); #endif @@ -1214,6 +1225,8 @@ void evtchn_destroy(struct domain *d) spin_unlock(&d->event_lock); clear_global_virq_handlers(d); + + free_xenheap_page(d->evtchn); } diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index f6846d4..812bd87 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -260,7 +260,7 @@ struct domain spinlock_t rangesets_lock; /* Event channel information. */ - struct evtchn *evtchn[NR_EVTCHN_BUCKETS]; + struct evtchn **evtchn; spinlock_t event_lock; struct grant_table *grant_table; -- 1.7.10.4
Wei Liu
2013-Feb-27 14:33 UTC
[RFC PATCH V3 03/22] Move event channel macros / struct definition to proper place
After remove reference to NR_EVTCHN_BUCKETS in struct domain, we can move those macros / struct definitions to event.h. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/xen/event.h | 46 ++++++++++++++++++++++++++++++++++++++++++++++ xen/include/xen/sched.h | 45 --------------------------------------------- 2 files changed, 46 insertions(+), 45 deletions(-) diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 65ac81a..a1574ea 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -15,6 +15,52 @@ #include <asm/bitops.h> #include <asm/event.h> +#ifndef CONFIG_COMPAT +#define BITS_PER_EVTCHN_WORD(d) BITS_PER_LONG +#else +#define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_LONG) +#endif +#define MAX_EVTCHNS(d) (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d)) + +#define EVTCHNS_PER_BUCKET 128 +#define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) + +struct evtchn +{ +#define ECS_FREE 0 /* Channel is available for use. */ +#define ECS_RESERVED 1 /* Channel is reserved. */ +#define ECS_UNBOUND 2 /* Channel is waiting to bind to a remote domain. */ +#define ECS_INTERDOMAIN 3 /* Channel is bound to another domain. */ +#define ECS_PIRQ 4 /* Channel is bound to a physical IRQ line. */ +#define ECS_VIRQ 5 /* Channel is bound to a virtual IRQ line. */ +#define ECS_IPI 6 /* Channel is bound to a virtual IPI line. */ + u8 state; /* ECS_* */ + u8 xen_consumer; /* Consumer in Xen, if any? (0 = send to guest) */ + u16 notify_vcpu_id; /* VCPU for local delivery notification */ + union { + struct { + domid_t remote_domid; + } unbound; /* state == ECS_UNBOUND */ + struct { + u16 remote_port; + struct domain *remote_dom; + } interdomain; /* state == ECS_INTERDOMAIN */ + struct { + u16 irq; + u16 next_port; + u16 prev_port; + } pirq; /* state == ECS_PIRQ */ + u16 virq; /* state == ECS_VIRQ */ + } u; +#ifdef FLASK_ENABLE + void *ssid; +#endif +}; + +int evtchn_init(struct domain *d); /* from domain_create */ +void evtchn_destroy(struct domain *d); /* from domain_kill */ +void evtchn_destroy_final(struct domain *d); /* from complete_domain_destroy */ + /* * send_guest_vcpu_virq: Notify guest via a per-VCPU VIRQ. * @v: VCPU to which virtual IRQ should be sent diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 812bd87..bb65cbf 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -45,51 +45,6 @@ DEFINE_XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t); /* A global pointer to the initial domain (DOM0). */ extern struct domain *dom0; -#ifndef CONFIG_COMPAT -#define BITS_PER_EVTCHN_WORD(d) BITS_PER_LONG -#else -#define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_LONG) -#endif -#define MAX_EVTCHNS(d) (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d)) -#define EVTCHNS_PER_BUCKET 128 -#define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) - -struct evtchn -{ -#define ECS_FREE 0 /* Channel is available for use. */ -#define ECS_RESERVED 1 /* Channel is reserved. */ -#define ECS_UNBOUND 2 /* Channel is waiting to bind to a remote domain. */ -#define ECS_INTERDOMAIN 3 /* Channel is bound to another domain. */ -#define ECS_PIRQ 4 /* Channel is bound to a physical IRQ line. */ -#define ECS_VIRQ 5 /* Channel is bound to a virtual IRQ line. */ -#define ECS_IPI 6 /* Channel is bound to a virtual IPI line. */ - u8 state; /* ECS_* */ - u8 xen_consumer; /* Consumer in Xen, if any? (0 = send to guest) */ - u16 notify_vcpu_id; /* VCPU for local delivery notification */ - union { - struct { - domid_t remote_domid; - } unbound; /* state == ECS_UNBOUND */ - struct { - u16 remote_port; - struct domain *remote_dom; - } interdomain; /* state == ECS_INTERDOMAIN */ - struct { - u16 irq; - u16 next_port; - u16 prev_port; - } pirq; /* state == ECS_PIRQ */ - u16 virq; /* state == ECS_VIRQ */ - } u; -#ifdef FLASK_ENABLE - void *ssid; -#endif -}; - -int evtchn_init(struct domain *d); /* from domain_create */ -void evtchn_destroy(struct domain *d); /* from domain_kill */ -void evtchn_destroy_final(struct domain *d); /* from complete_domain_destroy */ - struct waitqueue_vcpu; struct vcpu -- 1.7.10.4
Some definitions for event channel have been moved to new header file. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/xsm/flask/hooks.c | 1 + 1 file changed, 1 insertion(+) diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c index 29a78dd..6d446ab 100644 --- a/xen/xsm/flask/hooks.c +++ b/xen/xsm/flask/hooks.c @@ -11,6 +11,7 @@ #include <xen/init.h> #include <xen/lib.h> #include <xen/sched.h> +#include <xen/event.h> #include <xen/paging.h> #include <xen/xmalloc.h> #include <xsm/xsm.h> -- 1.7.10.4
Wei Liu
2013-Feb-27 14:33 UTC
[RFC PATCH V3 05/22] Change MAX_EVTCHNS macro to max_evtchns inline function
The calculation of max event channels depends on the actual ABI in use. Try to avoid gcc-ism macro. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 4 ++-- xen/common/schedule.c | 2 +- xen/include/xen/event.h | 7 +++++-- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 3293f91..684a021 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -134,7 +134,7 @@ static int get_free_port(struct domain *d) if ( evtchn_from_port(d, port)->state == ECS_FREE ) return port; - if ( port == MAX_EVTCHNS(d) ) + if ( port == max_evtchns(d) ) return -ENOSPC; chn = xzalloc_array(struct evtchn, EVTCHNS_PER_BUCKET); @@ -1269,7 +1269,7 @@ static void domain_dump_evtchn_info(struct domain *d) spin_lock(&d->event_lock); - for ( port = 1; port < MAX_EVTCHNS(d); ++port ) + for ( port = 1; port < max_evtchns(d); ++port ) { const struct evtchn *chn; char *ssid; diff --git a/xen/common/schedule.c b/xen/common/schedule.c index e6a90d8..e59eb4d 100644 --- a/xen/common/schedule.c +++ b/xen/common/schedule.c @@ -689,7 +689,7 @@ static long do_poll(struct sched_poll *sched_poll) goto out; rc = -EINVAL; - if ( port >= MAX_EVTCHNS(d) ) + if ( port >= max_evtchns(d) ) goto out; rc = 0; diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index a1574ea..f8de10b 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -20,7 +20,10 @@ #else #define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_LONG) #endif -#define MAX_EVTCHNS(d) (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d)) +static inline unsigned int max_evtchns(struct domain *d) +{ + return BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); +} #define EVTCHNS_PER_BUCKET 128 #define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) @@ -119,7 +122,7 @@ void notify_via_xen_event_channel(struct domain *ld, int lport); #define bucket_from_port(d,p) \ ((d)->evtchn[(p)/EVTCHNS_PER_BUCKET]) #define port_is_valid(d,p) \ - (((p) >= 0) && ((p) < MAX_EVTCHNS(d)) && \ + (((p) >= 0) && ((p) < max_evtchns(d)) && \ (bucket_from_port(d,p) != NULL)) #define evtchn_from_port(d,p) \ (&(bucket_from_port(d,p))[(p)&(EVTCHNS_PER_BUCKET-1)]) -- 1.7.10.4
Wei Liu
2013-Feb-27 14:33 UTC
[RFC PATCH V3 06/22] Define extended event channel registration interface
This interface allows user to query supported event channel types. If we need to disable a specific ABI in the future, we just need to remove the dead code and clear corresponding feature bit. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/public/event_channel.h | 44 ++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/xen/include/public/event_channel.h b/xen/include/public/event_channel.h index 07ff321..dff3364 100644 --- a/xen/include/public/event_channel.h +++ b/xen/include/public/event_channel.h @@ -71,6 +71,7 @@ #define EVTCHNOP_bind_vcpu 8 #define EVTCHNOP_unmask 9 #define EVTCHNOP_reset 10 +#define EVTCHNOP_register_extended 11 /* ` } */ typedef uint32_t evtchn_port_t; @@ -258,6 +259,49 @@ struct evtchn_reset { typedef struct evtchn_reset evtchn_reset_t; /* + * EVTCHNOP_register_extended: Register extended event channel + * NOTES: + * 1. Currently only 3-level is supported. + * 2. Should fall back to 2-level if this call fails. + */ +/* 64 bit guests need 8 pages for evtchn_pending and evtchn_mask for + * 256k event channels while 32 bit ones only need 1 page for 32k + * event channels. */ +#define EVTCHN_MAX_L3_PAGES 8 +struct evtchn_register_3level { + /* IN parameters. */ + uint32_t nr_pages; /* for evtchn_{pending,mask} */ + uint32_t nr_vcpus; /* for l2sel_{mfns,offsets} */ + XEN_GUEST_HANDLE(xen_pfn_t) evtchn_pending; + XEN_GUEST_HANDLE(xen_pfn_t) evtchn_mask; + XEN_GUEST_HANDLE(xen_pfn_t) l2sel_mfns; + XEN_GUEST_HANDLE(xen_pfn_t) l2sel_offsets; +}; +typedef struct evtchn_register_3level evtchn_register_3level_t; +DEFINE_XEN_GUEST_HANDLE(evtchn_register_3level_t); + +/* commands: + * EVTCHN_EXTENDED_QUERY(0): query supported extended event channel types, + * _NONE supported types are or''ed in return value of + * the hypercall. + * EVTCHN_EXTENDED_*: specific extended event channel subcommand. + */ +#define EVTCHN_EXTENDED_QUERY 0 +/* supported extended event channel */ +#define EVTCHN_EXTENDED_NONE 0 +#define _EVTCHN_EXTENDED_L3 0 +#define EVTCHN_EXTENDED_L3 (1U << _EVTCHN_EXTENDED_L3) +struct evtchn_register_extended { + /* IN parameters. */ + uint32_t cmd; + union { + evtchn_register_3level_t l3; + } u; +}; +typedef struct evtchn_register_extended evtchn_register_extended_t; +DEFINE_XEN_GUEST_HANDLE(evtchn_register_extended_t); + +/* * ` enum neg_errnoval * ` HYPERVISOR_event_channel_op_compat(struct evtchn_op *op) * ` -- 1.7.10.4
This field is a bitmap of currently in use extended event channel ABI, which can have 0 (no extended event channel in use) 1 bit set. It is manipulated by hypervisor only, so if anything goes wrong it is a bug. The default event channel ABI is EVTCHN_EXTENDED_NONE, which means no extended event channel is used. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 1 + xen/include/xen/event.h | 12 +++++++++++- xen/include/xen/sched.h | 1 + 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 684a021..bd4b1e8 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1179,6 +1179,7 @@ int evtchn_init(struct domain *d) return -ENOMEM; clear_page(d->evtchn); + d->evtchn_extended = EVTCHN_EXTENDED_NONE; spin_lock_init(&d->event_lock); if ( get_free_port(d) != 0 ) { free_xenheap_page(d->evtchn); diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index f8de10b..240344b 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -14,6 +14,7 @@ #include <xen/softirq.h> #include <asm/bitops.h> #include <asm/event.h> +#include <public/event_channel.h> #ifndef CONFIG_COMPAT #define BITS_PER_EVTCHN_WORD(d) BITS_PER_LONG @@ -22,7 +23,16 @@ #endif static inline unsigned int max_evtchns(struct domain *d) { - return BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); + unsigned int ret = 0; + switch ( d->evtchn_extended ) + { + case EVTCHN_EXTENDED_NONE: + ret = BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); + break; + default: + BUG(); + } + return ret; } #define EVTCHNS_PER_BUCKET 128 diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index bb65cbf..dd40444 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -217,6 +217,7 @@ struct domain /* Event channel information. */ struct evtchn **evtchn; spinlock_t event_lock; + unsigned int evtchn_extended; struct grant_table *grant_table; -- 1.7.10.4
Wei Liu
2013-Feb-27 14:34 UTC
[RFC PATCH V3 08/22] Calculate max event channels for EVTCHN_EXTENDED_L3
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/xen/event.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 240344b..2686960 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -29,6 +29,10 @@ static inline unsigned int max_evtchns(struct domain *d) case EVTCHN_EXTENDED_NONE: ret = BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); break; + case EVTCHN_EXTENDED_L3: + ret = BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d) + * BITS_PER_EVTCHN_WORD(d); + break; default: BUG(); } -- 1.7.10.4
For 64 bit build and 3-level event channel and the original value of EVTCHNS_PER_BUCKET (128), the space needed to accommodate d->evtchn would be 4 pages (PAGE_SIZE = 4096). Given that not every domain needs 3-level event channel, this leads to waste of memory. Also we''ve restricted d->evtchn to one page, if we move to 3-level event channel, Xen cannot build. Having EVTCHN_PER_BUCKETS to be 512 can occupy exact one page. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/xen/event.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 2686960..68ce2ae 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -39,7 +39,7 @@ static inline unsigned int max_evtchns(struct domain *d) return ret; } -#define EVTCHNS_PER_BUCKET 128 +#define EVTCHNS_PER_BUCKET 512 #define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) struct evtchn -- 1.7.10.4
Wei Liu
2013-Feb-27 14:34 UTC
[RFC PATCH V3 10/22] Add evtchn_is_{pending, masked} and evtchn_clear_pending
Some code paths access the arrays in shared info directly. This only works with 2-level event channel. Add functions to abstract away implementation details. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/arch/x86/irq.c | 7 +++---- xen/common/event_channel.c | 22 +++++++++++++++++++--- xen/common/keyhandler.c | 6 ++---- xen/common/schedule.c | 2 +- xen/include/xen/event.h | 6 ++++++ 5 files changed, 31 insertions(+), 12 deletions(-) diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c index b98deb5..d070868 100644 --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -1452,7 +1452,7 @@ int pirq_guest_unmask(struct domain *d) { pirq = pirqs[i]->pirq; if ( pirqs[i]->masked && - !test_bit(pirqs[i]->evtchn, &shared_info(d, evtchn_mask)) ) + !evtchn_is_masked(d, pirqs[i]->evtchn) ) pirq_guest_eoi(pirqs[i]); } } while ( ++pirq < d->nr_pirqs && n == ARRAY_SIZE(pirqs) ); @@ -2090,13 +2090,12 @@ static void dump_irqs(unsigned char key) info = pirq_info(d, pirq); printk("%u:%3d(%c%c%c%c)", d->domain_id, pirq, - (test_bit(info->evtchn, - &shared_info(d, evtchn_pending)) ? + (evtchn_is_pending(d, info->evtchn) ? ''P'' : ''-''), (test_bit(info->evtchn / BITS_PER_EVTCHN_WORD(d), &vcpu_info(d->vcpu[0], evtchn_pending_sel)) ? ''S'' : ''-''), - (test_bit(info->evtchn, &shared_info(d, evtchn_mask)) ? + (evtchn_is_masked(d, info->evtchn) ? ''M'' : ''-''), (info->masked ? ''M'' : ''-'')); if ( i != action->nr_guests ) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index bd4b1e8..c73c709 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -95,6 +95,7 @@ static uint8_t get_xen_consumer(xen_event_channel_notification_t fn) #define xen_notification_fn(e) (xen_consumers[(e)->xen_consumer-1]) static void evtchn_set_pending(struct vcpu *v, int port); +static void evtchn_clear_pending(struct domain *d, int port); static int virq_is_global(uint32_t virq) { @@ -156,6 +157,16 @@ static int get_free_port(struct domain *d) return port; } +int evtchn_is_pending(struct domain *d, int port) +{ + return test_bit(port, &shared_info(d, evtchn_pending)); +} + +int evtchn_is_masked(struct domain *d, int port) +{ + return test_bit(port, &shared_info(d, evtchn_mask)); +} + static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) { @@ -529,7 +540,7 @@ static long __evtchn_close(struct domain *d1, int port1) } /* Clear pending event to avoid unexpected behavior on re-bind. */ - clear_bit(port1, &shared_info(d1, evtchn_pending)); + evtchn_clear_pending(d1, port1); /* Reset binding to vcpu0 when the channel is freed. */ chn1->state = ECS_FREE; @@ -653,6 +664,11 @@ static void evtchn_set_pending(struct vcpu *v, int port) } } +static void evtchn_clear_pending(struct domain *d, int port) +{ + clear_bit(port, &shared_info(d, evtchn_pending)); +} + int guest_enabled_event(struct vcpu *v, uint32_t virq) { return ((v != NULL) && (v->virq_to_evtchn[virq] != 0)); @@ -1283,8 +1299,8 @@ static void domain_dump_evtchn_info(struct domain *d) printk(" %4u [%d/%d]: s=%d n=%d x=%d", port, - !!test_bit(port, &shared_info(d, evtchn_pending)), - !!test_bit(port, &shared_info(d, evtchn_mask)), + !!evtchn_is_pending(d, port), + !!evtchn_is_masked(d, port), chn->state, chn->notify_vcpu_id, chn->xen_consumer); switch ( chn->state ) diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c index 98a2c7f..1be3ca8 100644 --- a/xen/common/keyhandler.c +++ b/xen/common/keyhandler.c @@ -302,10 +302,8 @@ static void dump_domains(unsigned char key) printk("Notifying guest %d:%d (virq %d, port %d, stat %d/%d/%d)\n", d->domain_id, v->vcpu_id, VIRQ_DEBUG, v->virq_to_evtchn[VIRQ_DEBUG], - test_bit(v->virq_to_evtchn[VIRQ_DEBUG], - &shared_info(d, evtchn_pending)), - test_bit(v->virq_to_evtchn[VIRQ_DEBUG], - &shared_info(d, evtchn_mask)), + evtchn_is_pending(d, v->virq_to_evtchn[VIRQ_DEBUG]), + evtchn_is_masked(d, v->virq_to_evtchn[VIRQ_DEBUG]), test_bit(v->virq_to_evtchn[VIRQ_DEBUG] / BITS_PER_EVTCHN_WORD(d), &vcpu_info(v, evtchn_pending_sel))); diff --git a/xen/common/schedule.c b/xen/common/schedule.c index e59eb4d..35f16cf 100644 --- a/xen/common/schedule.c +++ b/xen/common/schedule.c @@ -693,7 +693,7 @@ static long do_poll(struct sched_poll *sched_poll) goto out; rc = 0; - if ( test_bit(port, &shared_info(d, evtchn_pending)) ) + if ( evtchn_is_pending(d, port) ) goto out; } diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 68ce2ae..6ad0745 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -117,6 +117,12 @@ int evtchn_unmask(unsigned int port); /* Move all PIRQs after a vCPU was moved to another pCPU. */ void evtchn_move_pirqs(struct vcpu *v); +/* Tell a given event-channel port is pending or not */ +int evtchn_is_pending(struct domain *d, int port); + +/* Tell a given event-channel port is masked or not */ +int evtchn_is_masked(struct domain *d, int port); + /* Allocate/free a Xen-attached event channel port. */ typedef void (*xen_event_channel_notification_t)( struct vcpu *v, unsigned int port); -- 1.7.10.4
Also need to update event.h so that this change won''t break the build. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/public/xen.h | 13 ++++++++++++- xen/include/xen/event.h | 2 +- 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h index 87380e6..6d6e319 100644 --- a/xen/include/public/xen.h +++ b/xen/include/public/xen.h @@ -554,9 +554,20 @@ DEFINE_XEN_GUEST_HANDLE(multicall_entry_t); /* * Event channel endpoints per domain: + * 2-level for x86: * 1024 if a long is 32 bits; 4096 if a long is 64 bits. + * 3-level for x86: + * 32k if a long is 32 bits; 256k if a long is 64 bits. + * 2-level for ARM: + * 4096 for both 32 bits and 64 bits. + * 3-level for ARM: + * 256k for both 32 bits and 64 bits. */ -#define NR_EVENT_CHANNELS (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64) +#define NR_EVENT_CHANNELS_L2 (sizeof(xen_ulong_t) * sizeof(xen_ulong_t) * 64) +#define NR_EVENT_CHANNELS_L3 (NR_EVENT_CHANNELS_L2 * 64) +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) +#define NR_EVENT_CHANNELS NR_EVENT_CHANNELS_L2 /* for compatibility */ +#endif struct vcpu_time_info { /* diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 6ad0745..aea61eb 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -40,7 +40,7 @@ static inline unsigned int max_evtchns(struct domain *d) } #define EVTCHNS_PER_BUCKET 512 -#define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET) +#define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS_L3 / EVTCHNS_PER_BUCKET) struct evtchn { -- 1.7.10.4
Wei Liu
2013-Feb-27 14:34 UTC
[RFC PATCH V3 12/22] Add supported extended event channel ABI bitmap
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 4 ++++ xen/include/xen/event.h | 1 + 2 files changed, 5 insertions(+) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index c73c709..470b8d2 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -32,6 +32,10 @@ #include <public/event_channel.h> #include <xsm/xsm.h> +/* A bitmap of supported extended event channel ABIs */ +uint32_t extended_event_channel = (EVTCHN_EXTENDED_NONE | + EVTCHN_EXTENDED_L3); + #define ERROR_EXIT(_errno) \ do { \ gdprintk(XENLOG_WARNING, \ diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index aea61eb..6a2ee27 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -171,4 +171,5 @@ void notify_via_xen_event_channel(struct domain *ld, int lport); mb(); /* set blocked status /then/ caller does his work */ \ } while ( 0 ) +extern uint32_t extended_event_channel; #endif /* __XEN_EVENT_H__ */ -- 1.7.10.4
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 470b8d2..abed10d 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -36,6 +36,20 @@ uint32_t extended_event_channel = (EVTCHN_EXTENDED_NONE | EVTCHN_EXTENDED_L3); +static inline const char * evtchn_abi_str(unsigned int abi) +{ + switch ( abi ) + { + case EVTCHN_EXTENDED_NONE: + return "2-level"; + case EVTCHN_EXTENDED_L3: + return "3-level"; + default: + BUG(); + } + return ""; /* make compiler happy */ +} + #define ERROR_EXIT(_errno) \ do { \ gdprintk(XENLOG_WARNING, \ -- 1.7.10.4
Wei Liu
2013-Feb-27 14:34 UTC
[RFC PATCH V3 14/22] Add control structures for 3-level event channel
The references to shared bitmap pending / mask are embedded in struct domain. And pointer to the second level selector is embedded in struct vcpu. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/xen/sched.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index dd40444..4a3e51c 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -57,6 +57,9 @@ struct vcpu struct domain *domain; + /* For 3-level event channels */ + unsigned long *evtchn_pending_sel_l2; + struct vcpu *next_in_list; s_time_t periodic_period; @@ -219,6 +222,9 @@ struct domain spinlock_t event_lock; unsigned int evtchn_extended; + unsigned long *evtchn_pending; + unsigned long *evtchn_mask; + struct grant_table *grant_table; /* -- 1.7.10.4
Use pointer in struct domain to reference evtchn_pending and evtchn_mask bitmaps. When building a domain, the default operation set is 2-level operation set. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/arch/arm/domain.c | 1 + xen/arch/x86/domain.c | 1 + xen/common/event_channel.c | 57 +++++++++++++++++++++++++++++++++++--------- xen/include/xen/event.h | 3 +++ 4 files changed, 51 insertions(+), 11 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index e37ec54..1c12c95 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -422,6 +422,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) goto fail; clear_page(d->shared_info); + evtchn_set_default_bitmap(d); share_xen_page_with_guest( virt_to_page(d->shared_info), d, XENSHARE_writable); diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index b7f6749..fba86b9 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -580,6 +580,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) goto fail; clear_page(d->shared_info); + evtchn_set_default_bitmap(d); share_xen_page_with_guest( virt_to_page(d->shared_info), d, XENSHARE_writable); diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index abed10d..807f05f 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -69,6 +69,9 @@ static inline const char * evtchn_abi_str(unsigned int abi) #define consumer_is_xen(e) (!!(e)->xen_consumer) +static void evtchn_set_pending(struct vcpu *v, int port); +static void evtchn_clear_pending(struct domain *d, int port); + /* * The function alloc_unbound_xen_event_channel() allows an arbitrary * notifier function to be specified. However, very few unique functions @@ -112,9 +115,6 @@ static uint8_t get_xen_consumer(xen_event_channel_notification_t fn) /* Get the notification function for a given Xen-bound event channel. */ #define xen_notification_fn(e) (xen_consumers[(e)->xen_consumer-1]) -static void evtchn_set_pending(struct vcpu *v, int port); -static void evtchn_clear_pending(struct domain *d, int port); - static int virq_is_global(uint32_t virq) { int rc; @@ -177,15 +177,14 @@ static int get_free_port(struct domain *d) int evtchn_is_pending(struct domain *d, int port) { - return test_bit(port, &shared_info(d, evtchn_pending)); + return test_bit(port, d->evtchn_pending); } int evtchn_is_masked(struct domain *d, int port) { - return test_bit(port, &shared_info(d, evtchn_mask)); + return test_bit(port, d->evtchn_mask); } - static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) { struct evtchn *chn; @@ -641,7 +640,7 @@ out: return ret; } -static void evtchn_set_pending(struct vcpu *v, int port) +static void evtchn_set_pending_l2(struct vcpu *v, int port) { struct domain *d = v->domain; int vcpuid; @@ -682,9 +681,23 @@ static void evtchn_set_pending(struct vcpu *v, int port) } } +static void evtchn_set_pending(struct vcpu *v, int port) +{ + struct domain *d = v->domain; + + switch ( d->evtchn_extended ) + { + case EVTCHN_EXTENDED_NONE: + evtchn_set_pending_l2(v, port); + break; + default: + BUG(); + } +} + static void evtchn_clear_pending(struct domain *d, int port) { - clear_bit(port, &shared_info(d, evtchn_pending)); + clear_bit(port, d->evtchn_pending); } int guest_enabled_event(struct vcpu *v, uint32_t virq) @@ -950,7 +963,7 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id) } -int evtchn_unmask(unsigned int port) +static int evtchn_unmask_l2(unsigned int port) { struct domain *d = current->domain; struct vcpu *v; @@ -966,8 +979,8 @@ int evtchn_unmask(unsigned int port) * These operations must happen in strict order. Based on * include/xen/event.h:evtchn_set_pending(). */ - if ( test_and_clear_bit(port, &shared_info(d, evtchn_mask)) && - test_bit (port, &shared_info(d, evtchn_pending)) && + if ( test_and_clear_bit(port, d->evtchn_mask) && + test_bit (port, d->evtchn_pending) && !test_and_set_bit (port / BITS_PER_EVTCHN_WORD(d), &vcpu_info(v, evtchn_pending_sel)) ) { @@ -977,6 +990,23 @@ int evtchn_unmask(unsigned int port) return 0; } +int evtchn_unmask(unsigned int port) +{ + struct domain *d = current->domain; + int rc = 0; + + switch ( d->evtchn_extended ) + { + case EVTCHN_EXTENDED_NONE: + rc = evtchn_unmask_l2(port); + break; + default: + BUG(); + } + + return rc; +} + static long evtchn_reset(evtchn_reset_t *r) { @@ -1203,6 +1233,11 @@ void notify_via_xen_event_channel(struct domain *ld, int lport) spin_unlock(&ld->event_lock); } +void evtchn_set_default_bitmap(struct domain *d) +{ + d->evtchn_pending = (unsigned long *)shared_info(d, evtchn_pending); + d->evtchn_mask = (unsigned long *)shared_info(d, evtchn_mask); +} int evtchn_init(struct domain *d) { diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 6a2ee27..8a38a80 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -138,6 +138,9 @@ int guest_enabled_event(struct vcpu *v, uint32_t virq); /* Notify remote end of a Xen-attached event channel.*/ void notify_via_xen_event_channel(struct domain *ld, int lport); +/* This is called after domain''s shared info page is setup */ +void evtchn_set_default_bitmap(struct domain *d); + /* Internal event channel object accessors */ #define bucket_from_port(d,p) \ ((d)->evtchn[(p)/EVTCHNS_PER_BUCKET]) -- 1.7.10.4
Wei Liu
2013-Feb-27 14:34 UTC
[RFC PATCH V3 16/22] Introduce some macros for event channels
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/include/asm-arm/types.h | 7 +++++-- xen/include/asm-x86/config.h | 4 +++- xen/include/xen/event.h | 6 ++++++ 3 files changed, 14 insertions(+), 3 deletions(-) diff --git a/xen/include/asm-arm/types.h b/xen/include/asm-arm/types.h index 48864f9..65562b8 100644 --- a/xen/include/asm-arm/types.h +++ b/xen/include/asm-arm/types.h @@ -41,10 +41,13 @@ typedef char bool_t; #define test_and_clear_bool(b) xchg(&(b), 0) #endif /* __ASSEMBLY__ */ +#define BYTE_BITORDER 3 +#define BITS_PER_BYTE (1 << BYTE_BITORDER) -#define BITS_PER_LONG 32 -#define BYTES_PER_LONG 4 +#define BITS_PER_LONG (1 << LONG_BITORDER) #define LONG_BYTEORDER 2 +#define LONG_BITORDER (LONG_BYTEORDER + BYTE_BITORDER) +#define BYTES_PER_LONG (1 << LONG_BYTEORDER) #endif /* __ARM_TYPES_H__ */ /* diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h index 202fda1..a88461b 100644 --- a/xen/include/asm-x86/config.h +++ b/xen/include/asm-x86/config.h @@ -8,11 +8,13 @@ #define __X86_CONFIG_H__ #define LONG_BYTEORDER 3 +#define BYTE_BITORDER 3 +#define LONG_BITORDER (BYTE_BITORDER + LONG_BYTEORDER) #define CONFIG_PAGING_LEVELS 4 #define BYTES_PER_LONG (1 << LONG_BYTEORDER) #define BITS_PER_LONG (BYTES_PER_LONG << 3) -#define BITS_PER_BYTE 8 +#define BITS_PER_BYTE (1 << BYTE_BITORDER) #define CONFIG_X86 1 #define CONFIG_X86_HT 1 diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 8a38a80..e4fce73 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -42,6 +42,12 @@ static inline unsigned int max_evtchns(struct domain *d) #define EVTCHNS_PER_BUCKET 512 #define NR_EVTCHN_BUCKETS (NR_EVENT_CHANNELS_L3 / EVTCHNS_PER_BUCKET) +#ifndef CONFIG_COMPAT +#define EVTCHN_WORD_BITORDER(d) LONG_BITORDER +#else +#define EVTCHN_WORD_BITORDER(d) (has_32bit_shinfo(d) ? 5 : LONG_BITORDER) +#endif + struct evtchn { #define ECS_FREE 0 /* Channel is available for use. */ -- 1.7.10.4
Wei Liu
2013-Feb-27 14:34 UTC
[RFC PATCH V3 17/22] Infrastructure to manipulate 3-level event channel pages
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 170 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 170 insertions(+) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 807f05f..26daa7e 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1034,6 +1034,176 @@ out: } +static long __map_l3_arrays(struct domain *d, xen_pfn_t *pending, + xen_pfn_t *mask, int nr_pages) +{ + int rc = -ENOMEM; + void *mapping; + + mapping = vmap(pending, nr_pages); + if ( !mapping ) + goto out; + d->evtchn_pending = mapping; + + mapping = vmap(mask, nr_pages); + if ( !mapping ) + { + vunmap(d->evtchn_pending); + d->evtchn_pending = NULL; + goto out; + } + d->evtchn_mask = mapping; + + rc = 0; + out: + return rc; +} + +static void __unmap_l3_arrays(struct domain *d) +{ + if ( d->evtchn_pending ) + { + vunmap(d->evtchn_pending); + d->evtchn_pending = NULL; + } + + if ( d->evtchn_mask ) + { + vunmap(d->evtchn_mask); + d->evtchn_mask = NULL; + } +} + +static long __map_l2_selector(struct vcpu *v, unsigned long gfn, + unsigned long off) +{ + int rc = -ENOMEM; + void *mapping; + + mapping = vmap(&gfn, 1); + + if ( mapping == NULL ) + goto out; + + v->evtchn_pending_sel_l2 = mapping + off; + rc = 0; + + out: + return rc; +} + +static void __unmap_l2_selector(struct vcpu *v) +{ + if ( v->evtchn_pending_sel_l2 ) + { + unsigned long addr + (unsigned long)(v->evtchn_pending_sel_l2) & PAGE_MASK; + vunmap((void *)addr); + v->evtchn_pending_sel_l2 = NULL; + } +} + +static void __evtchn_unmap_all_3level(struct domain *d) +{ + struct vcpu *v; + for_each_vcpu ( d, v ) + __unmap_l2_selector(v); + __unmap_l3_arrays(d); +} + +static void __evtchn_setup_bitmap_l3(struct domain *d) +{ + struct vcpu *v; + + /* Easy way to setup 3-level bitmap, just move existing selector + * to next level then copy pending array and mask array */ + for_each_vcpu ( d, v ) + { + memcpy(&v->evtchn_pending_sel_l2[0], + &vcpu_info(v, evtchn_pending_sel), + sizeof(vcpu_info(v, evtchn_pending_sel))); + memset(&vcpu_info(v, evtchn_pending_sel), 0, + sizeof(vcpu_info(v, evtchn_pending_sel))); + set_bit(0, &vcpu_info(v, evtchn_pending_sel)); + } + + memcpy(d->evtchn_pending, &shared_info(d, evtchn_pending), + sizeof(shared_info(d, evtchn_pending))); + memcpy(d->evtchn_mask, &shared_info(d, evtchn_mask), + sizeof(shared_info(d, evtchn_mask))); +} + +static long evtchn_register_3level(evtchn_register_3level_t *arg) +{ + struct domain *d = current->domain; + struct vcpu *v; + int rc = 0; + xen_pfn_t evtchn_pending[EVTCHN_MAX_L3_PAGES]; + xen_pfn_t evtchn_mask[EVTCHN_MAX_L3_PAGES]; + xen_pfn_t l2sel_mfn = 0; + xen_pfn_t l2sel_offset = 0; + + if ( d->evtchn_extended ) + { + rc = -EINVAL; + goto out; + } + + if ( arg->nr_vcpus > d->max_vcpus || + arg->nr_pages > EVTCHN_MAX_L3_PAGES ) + { + rc = -EINVAL; + goto out; + } + + memset(evtchn_pending, 0, sizeof(xen_pfn_t) * EVTCHN_MAX_L3_PAGES); + memset(evtchn_mask, 0, sizeof(xen_pfn_t) * EVTCHN_MAX_L3_PAGES); + + rc = -EFAULT; /* common error code for following operations */ + if ( copy_from_guest(evtchn_pending, arg->evtchn_pending, arg->nr_pages) ) + goto out; + if ( copy_from_guest(evtchn_mask, arg->evtchn_mask, arg->nr_pages) ) + goto out; + + rc = __map_l3_arrays(d, evtchn_pending, evtchn_mask, arg->nr_pages); + if ( rc ) + goto out; + + for_each_vcpu ( d, v ) + { + int vcpu_id = v->vcpu_id; + + rc = -EFAULT; /* common error code for following operations */ + if ( unlikely(copy_from_guest_offset(&l2sel_mfn, arg->l2sel_mfns, + vcpu_id, 1)) ) + { + __evtchn_unmap_all_3level(d); + goto out; + } + if ( unlikely(copy_from_guest_offset(&l2sel_offset, arg->l2sel_offsets, + vcpu_id, 1)) ) + { + __evtchn_unmap_all_3level(d); + goto out; + } + + if ( (rc = __map_l2_selector(v, l2sel_mfn, l2sel_offset)) ) + { + __evtchn_unmap_all_3level(d); + goto out; + } + } + + __evtchn_setup_bitmap_l3(d); + + d->evtchn_extended = EVTCHN_EXTENDED_L3; + + rc = 0; + + out: + return rc; +} + long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) { long rc; -- 1.7.10.4
Note: this call always fails as it is not yet completed. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 56 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 26daa7e..bb6e5f9 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1204,6 +1204,34 @@ static long evtchn_register_3level(evtchn_register_3level_t *arg) return rc; } +/* + * NOTE to extneded event channel users: + * extended channels are likely to consume lots large global mapping + * area in Xen. For example, 3-level event channel consumes 16 + + * nr_vcpus pages global mapping area. + */ +static long evtchn_register_extended(struct evtchn_register_extended *reg) +{ + struct domain *d = current->domain; + int rc; + + spin_lock(&d->event_lock); + + switch ( reg->cmd ) + { + case EVTCHN_EXTENDED_NONE: + default: + rc = -EINVAL; + case EVTCHN_EXTENDED_L3: + rc = evtchn_register_3level(®->u.l3); + break; + } + + spin_unlock(&d->event_lock); + + return rc; +} + long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) { long rc; @@ -1312,6 +1340,19 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) break; } + case EVTCHNOP_register_extended: { + struct evtchn_register_extended reg; + + if ( copy_from_guest(®, arg, 1) != 0 ) + return -EFAULT; + rc = evtchn_register_extended(®); + + /* XXX always fails this call because it is not yet completed */ + rc = -EINVAL; + + break; + } + default: rc = -ENOSYS; break; @@ -1438,6 +1479,19 @@ int evtchn_init(struct domain *d) return 0; } +static void evtchn_unmap_extended(struct domain *d) +{ + switch ( d->evtchn_extended ) + { + case EVTCHN_EXTENDED_NONE: + default: + break; + case EVTCHN_EXTENDED_L3: + __evtchn_unmap_all_3level(d); + break; + } +} + void evtchn_destroy(struct domain *d) { @@ -1466,6 +1520,8 @@ void evtchn_destroy(struct domain *d) clear_global_virq_handlers(d); + evtchn_unmap_extended(d); + free_xenheap_page(d->evtchn); } -- 1.7.10.4
Wei Liu
2013-Feb-27 14:34 UTC
[RFC PATCH V3 19/22] Enable exteneded event channel ABI query
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index bb6e5f9..88679d5 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1219,12 +1219,14 @@ static long evtchn_register_extended(struct evtchn_register_extended *reg) switch ( reg->cmd ) { - case EVTCHN_EXTENDED_NONE: - default: - rc = -EINVAL; + case EVTCHN_EXTENDED_QUERY: + rc = extended_event_channel; + break; case EVTCHN_EXTENDED_L3: rc = evtchn_register_3level(®->u.l3); break; + default: + rc = -EINVAL; } spin_unlock(&d->event_lock); -- 1.7.10.4
Wei Liu
2013-Feb-27 14:34 UTC
[RFC PATCH V3 20/22] Implement 3-level event channel routines
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/event_channel.c | 107 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 88 insertions(+), 19 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 88679d5..fac5dc9 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -640,10 +640,33 @@ out: return ret; } +static void __check_vcpu_polling(struct vcpu *v, int port) +{ + int vcpuid; + struct domain *d = v->domain; + + /* Check if some VCPU might be polling for this event. */ + if ( likely(bitmap_empty(d->poll_mask, d->max_vcpus)) ) + return; + + /* Wake any interested (or potentially interested) pollers. */ + for ( vcpuid = find_first_bit(d->poll_mask, d->max_vcpus); + vcpuid < d->max_vcpus; + vcpuid = find_next_bit(d->poll_mask, d->max_vcpus, vcpuid+1) ) + { + v = d->vcpu[vcpuid]; + if ( ((v->poll_evtchn <= 0) || (v->poll_evtchn == port)) && + test_and_clear_bit(vcpuid, d->poll_mask) ) + { + v->poll_evtchn = 0; + vcpu_unblock(v); + } + } +} + static void evtchn_set_pending_l2(struct vcpu *v, int port) { struct domain *d = v->domain; - int vcpuid; /* * The following bit operations must happen in strict order. @@ -662,23 +685,33 @@ static void evtchn_set_pending_l2(struct vcpu *v, int port) vcpu_mark_events_pending(v); } - /* Check if some VCPU might be polling for this event. */ - if ( likely(bitmap_empty(d->poll_mask, d->max_vcpus)) ) - return; + __check_vcpu_polling(v, port); +} - /* Wake any interested (or potentially interested) pollers. */ - for ( vcpuid = find_first_bit(d->poll_mask, d->max_vcpus); - vcpuid < d->max_vcpus; - vcpuid = find_next_bit(d->poll_mask, d->max_vcpus, vcpuid+1) ) +static void evtchn_set_pending_l3(struct vcpu *v, int port) +{ + struct domain *d = v->domain; + unsigned int l1bit = port >> (EVTCHN_WORD_BITORDER(d) << 1); + unsigned int l2bit = port >> EVTCHN_WORD_BITORDER(d); + + /* + * The following bit operations must happen in strict order. + * NB. On x86, the atomic bit operations also act as memory barriers. + * There is therefore sufficiently strict ordering for this architecture -- + * others may require explicit memory barriers. + */ + + if ( test_and_set_bit(port, d->evtchn_pending) ) + return; + + if ( !test_bit(port, d->evtchn_mask) && + !test_and_set_bit(l2bit, v->evtchn_pending_sel_l2) && + !test_and_set_bit(l1bit, &vcpu_info(v, evtchn_pending_sel)) ) { - v = d->vcpu[vcpuid]; - if ( ((v->poll_evtchn <= 0) || (v->poll_evtchn == port)) && - test_and_clear_bit(vcpuid, d->poll_mask) ) - { - v->poll_evtchn = 0; - vcpu_unblock(v); - } + vcpu_mark_events_pending(v); } + + __check_vcpu_polling(v, port); } static void evtchn_set_pending(struct vcpu *v, int port) @@ -690,6 +723,9 @@ static void evtchn_set_pending(struct vcpu *v, int port) case EVTCHN_EXTENDED_NONE: evtchn_set_pending_l2(v, port); break; + case EVTCHN_EXTENDED_L3: + evtchn_set_pending_l3(v, port); + break; default: BUG(); } @@ -990,6 +1026,35 @@ static int evtchn_unmask_l2(unsigned int port) return 0; } +static int evtchn_unmask_l3(unsigned int port) +{ + struct domain *d = current->domain; + struct vcpu *v; + unsigned int l1bit = port >> (EVTCHN_WORD_BITORDER(d) << 1); + unsigned int l2bit = port >> EVTCHN_WORD_BITORDER(d); + + ASSERT(spin_is_locked(&d->event_lock)); + + if ( unlikely(!port_is_valid(d, port)) ) + return -EINVAL; + + v = d->vcpu[evtchn_from_port(d, port)->notify_vcpu_id]; + + /* + * These operations must happen in strict order. Based on + * include/xen/event.h:evtchn_set_pending(). + */ + if ( test_and_clear_bit(port, d->evtchn_mask) && + test_bit (port, d->evtchn_pending) && + !test_and_set_bit (l2bit, v->evtchn_pending_sel_l2) && + !test_and_set_bit (l1bit, &vcpu_info(v, evtchn_pending_sel)) ) + { + vcpu_mark_events_pending(v); + } + + return 0; +} + int evtchn_unmask(unsigned int port) { struct domain *d = current->domain; @@ -1000,6 +1065,9 @@ int evtchn_unmask(unsigned int port) case EVTCHN_EXTENDED_NONE: rc = evtchn_unmask_l2(port); break; + case EVTCHN_EXTENDED_L3: + rc = evtchn_unmask_l3(port); + break; default: BUG(); } @@ -1347,10 +1415,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) if ( copy_from_guest(®, arg, 1) != 0 ) return -EFAULT; - rc = evtchn_register_extended(®); - /* XXX always fails this call because it is not yet completed */ - rc = -EINVAL; + rc = evtchn_register_extended(®); break; } @@ -1562,8 +1628,11 @@ static void domain_dump_evtchn_info(struct domain *d) bitmap_scnlistprintf(keyhandler_scratch, sizeof(keyhandler_scratch), d->poll_mask, d->max_vcpus); printk("Event channel information for domain %d:\n" + "Using %s event channel ABI\n" "Polling vCPUs: {%s}\n" - " port [p/m]\n", d->domain_id, keyhandler_scratch); + " port [p/m]\n", + d->domain_id, evtchn_abi_str(d->evtchn_extended), + keyhandler_scratch); spin_lock(&d->event_lock); -- 1.7.10.4
Wei Liu
2013-Feb-27 14:34 UTC
[RFC PATCH V3 21/22] Only allow extended event channel on Dom0 and driver domains
For non-Dom0 domains, add a flag to indicate whether it can use extended event channel, admins can specify this flag when creating a driver domain. The rationale behide this option is, a) extended event channel will consume global mapping space in Xen, b) likely that only Dom0 and driver domains need this. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- xen/common/domain.c | 3 +++ xen/common/domctl.c | 5 ++++- xen/common/event_channel.c | 6 +++++- xen/include/public/domctl.h | 3 +++ xen/include/xen/sched.h | 5 +++++ 5 files changed, 20 insertions(+), 2 deletions(-) diff --git a/xen/common/domain.c b/xen/common/domain.c index 07f62b3..293e4b1 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -250,6 +250,9 @@ struct domain *domain_create( if ( domcr_flags & DOMCRF_dummy ) return d; + if ( domcr_flags & DOMCRF_evtchn_extended ) + d->evtchn_extended_allowed = 1; + if ( !is_idle_domain(d) ) { if ( (err = xsm_domain_create(XSM_HOOK, d, ssidref)) != 0 ) diff --git a/xen/common/domctl.c b/xen/common/domctl.c index a713ce6..09b9752 100644 --- a/xen/common/domctl.c +++ b/xen/common/domctl.c @@ -369,7 +369,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) if ( supervisor_mode_kernel || (op->u.createdomain.flags & ~(XEN_DOMCTL_CDF_hvm_guest | XEN_DOMCTL_CDF_hap | - XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off)) ) + XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off | + XEN_DOMCTL_CDF_evtchn_extended)) ) break; dom = op->domain; @@ -405,6 +406,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) domcr_flags |= DOMCRF_s3_integrity; if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_oos_off ) domcr_flags |= DOMCRF_oos_off; + if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_evtchn_extended ) + domcr_flags |= DOMCRF_evtchn_extended; d = domain_create(dom, domcr_flags, op->u.createdomain.ssidref); if ( IS_ERR(d) ) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index fac5dc9..56c4fb0 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -1291,7 +1291,11 @@ static long evtchn_register_extended(struct evtchn_register_extended *reg) rc = extended_event_channel; break; case EVTCHN_EXTENDED_L3: - rc = evtchn_register_3level(®->u.l3); + if ( d->domain_id != 0 && + d->evtchn_extended_allowed == 0 ) + rc = -EPERM; + else + rc = evtchn_register_3level(®->u.l3); break; default: rc = -EINVAL; diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h index 74160b0..d5c468e 100644 --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -59,6 +59,9 @@ struct xen_domctl_createdomain { /* Disable out-of-sync shadow page tables? */ #define _XEN_DOMCTL_CDF_oos_off 3 #define XEN_DOMCTL_CDF_oos_off (1U<<_XEN_DOMCTL_CDF_oos_off) + /* Can this domain use extended event channel? */ +#define _XEN_DOMCTL_CDF_evtchn_extended 4 +#define XEN_DOMCTL_CDF_evtchn_extended (1U<<_XEN_DOMCTL_CDF_evtchn_extended) uint32_t flags; }; typedef struct xen_domctl_createdomain xen_domctl_createdomain_t; diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 4a3e51c..d0aba9a 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -224,6 +224,8 @@ struct domain unsigned long *evtchn_pending; unsigned long *evtchn_mask; + /* Can the domain use extended event channel (new ABI)? */ + bool_t evtchn_extended_allowed; struct grant_table *grant_table; @@ -411,6 +413,9 @@ struct domain *domain_create( /* DOMCRF_oos_off: dont use out-of-sync optimization for shadow page tables */ #define _DOMCRF_oos_off 4 #define DOMCRF_oos_off (1U<<_DOMCRF_oos_off) +/* DOMCRF_evtchn_extended: this domain can use extended event channel */ +#define _DOMCRF_evtchn_extended 5 +#define DOMCRF_evtchn_extended (1U<<_DOMCRF_evtchn_extended) /* * rcu_lock_domain_by_id() is more efficient than get_domain_by_id(). -- 1.7.10.4
Admins can add "evtchn_extended = 1" in domain config file to enable extended event channel ABI for a domain. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- tools/libxl/libxl_create.c | 4 ++++ tools/libxl/libxl_types.idl | 1 + tools/libxl/xl_cmdimpl.c | 2 ++ tools/libxl/xl_sxp.c | 2 ++ 4 files changed, 9 insertions(+) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index fa81f88..724e038 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -35,6 +35,8 @@ int libxl__domain_create_info_setdefault(libxl__gc *gc, libxl_defbool_setdefault(&c_info->oos, true); } + libxl_defbool_setdefault(&c_info->evtchn_extended, false); + libxl_defbool_setdefault(&c_info->run_hotplug_scripts, true); return 0; @@ -391,6 +393,8 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_create_info *info, flags |= libxl_defbool_val(info->hap) ? XEN_DOMCTL_CDF_hap : 0; flags |= libxl_defbool_val(info->oos) ? 0 : XEN_DOMCTL_CDF_oos_off; } + flags |= libxl_defbool_val(info->evtchn_extended) ? + XEN_DOMCTL_CDF_evtchn_extended : 0; *domid = -1; /* Ultimately, handle is an array of 16 uint8_t, same as uuid */ diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 89a8030..37a20c4 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -236,6 +236,7 @@ libxl_domain_create_info = Struct("domain_create_info",[ ("type", libxl_domain_type), ("hap", libxl_defbool), ("oos", libxl_defbool), + ("evtchn_extended",libxl_defbool), ("ssidref", uint32), ("name", string), ("uuid", libxl_uuid), diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c index 003b129..9a1dd1b 100644 --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -651,6 +651,8 @@ static void parse_config_data(const char *config_source, xlu_cfg_get_defbool(config, "oos", &c_info->oos, 0); + xlu_cfg_get_defbool(config, "evtchn_extended", &c_info->evtchn_extended, 0); + if (!xlu_cfg_get_string (config, "pool", &buf, 0)) { c_info->poolid = -1; cpupool_qualifier_to_cpupoolid(buf, &c_info->poolid, NULL); diff --git a/tools/libxl/xl_sxp.c b/tools/libxl/xl_sxp.c index a16a025..6d40e7a 100644 --- a/tools/libxl/xl_sxp.c +++ b/tools/libxl/xl_sxp.c @@ -44,6 +44,8 @@ void printf_info_sexp(int domid, libxl_domain_config *d_config) printf("\t(hvm %d)\n", c_info->type == LIBXL_DOMAIN_TYPE_HVM); printf("\t(hap %s)\n", libxl_defbool_to_string(c_info->hap)); printf("\t(oos %s)\n", libxl_defbool_to_string(c_info->oos)); + printf("\t(evtchn_extended %s)\n", + libxl_defbool_to_string(c_info->evtchn_extended)); printf("\t(ssidref %d)\n", c_info->ssidref); printf("\t(name %s)\n", c_info->name); -- 1.7.10.4
Keir Fraser
2013-Feb-27 16:28 UTC
Re: [RFC PATCH V3] Implement 3-level event channel in Xen
On 27/02/2013 14:33, "Wei Liu" <wei.liu2@citrix.com> wrote:> Hi all > > This is another version of the patch series. Unfortunately the kernel side is > not available at the moment. :-( > > Keir, Jan, Ian and David, are you happy with this design in general? I would > like to have explicit ACK / NACK on this if possible, as feature freeze for > 4.3 > is quite close now.I''m happy with the design, in that it solves a current need, and the implementation exists. It seems this is something we really do need for 4.3. -- Keir> Changes since V2: > * new interface to register extended event channel ABI > * use vmap to simplify mapping > * replace MAX_EVTCHNS macro with inline function > * libxl: evtchn_l3 -> evtchn_extended > > The most notable bit of this series is the interface change. In order to cope > with future ABIs, the interface is renamed to EVTCHNOP_register_extended. It > also provides supported ABI query, so that we can remove unused ABI in the > future. > > The semantic meaning of EVTCHNOP_register_extended changes a bit. The `level'' > in parameter now changes to `cmd'', which means we should go down to specific > ABI routines. ABI-specific structures are still embedded in the union. > > Changes since V1: > * move all evtchn related macros / struct definitions to event.h > * only allow 3-level evtchn for Dom0 and driver domains > * add evtchn_l3 flag in libxl > > > Wei. > > Diffstat: > tools/libxl/libxl_create.c | 4 + > tools/libxl/libxl_types.idl | 1 + > tools/libxl/xl_cmdimpl.c | 2 + > tools/libxl/xl_sxp.c | 2 + > xen/arch/arm/domain.c | 1 + > xen/arch/x86/domain.c | 1 + > xen/arch/x86/irq.c | 7 +- > xen/common/domain.c | 3 + > xen/common/domctl.c | 5 +- > xen/common/event_channel.c | 458 > +++++++++++++++++++++++++++++++++--- > xen/common/keyhandler.c | 6 +- > xen/common/schedule.c | 4 +- > xen/include/asm-arm/types.h | 7 +- > xen/include/asm-x86/config.h | 4 +- > xen/include/public/domctl.h | 3 + > xen/include/public/event_channel.h | 44 ++++ > xen/include/public/xen.h | 35 ++- > xen/include/xen/event.h | 85 ++++++- > xen/include/xen/sched.h | 65 ++--- > xen/xsm/flask/hooks.c | 1 + > 20 files changed, 623 insertions(+), 115 deletions(-) > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
>>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: > --- a/xen/common/event_channel.c > +++ b/xen/common/event_channel.c > @@ -1172,15 +1172,26 @@ void notify_via_xen_event_channel(struct domain *ld, int lport) > > int evtchn_init(struct domain *d) > { > + BUILD_BUG_ON(sizeof(struct evtchn *) * NR_EVTCHN_BUCKETS > PAGE_SIZE); > + d->evtchn = alloc_xenheap_page();You may not have monitored other changes that were done meanwhile. With fb034f42648ecac835a1666def468f673edd2725, please use xmalloc_array() here (but keep the BUILD_BUG_ON()).> + > + if ( d->evtchn == NULL ) > + return -ENOMEM; > + clear_page(d->evtchn); > + > spin_lock_init(&d->event_lock); > - if ( get_free_port(d) != 0 ) > + if ( get_free_port(d) != 0 ) {The opening brace belongs on its own line.> + free_xenheap_page(d->evtchn); > return -EINVAL; > + } > evtchn_from_port(d, 0)->state = ECS_RESERVED; > > #if MAX_VIRT_CPUS > BITS_PER_LONG > d->poll_mask = xmalloc_array(unsigned long, BITS_TO_LONGS(MAX_VIRT_CPUS)); > - if ( !d->poll_mask ) > + if ( !d->poll_mask ) {And here again. Please also check your other patches whether there are further instances of this. Jan> + free_xenheap_page(d->evtchn); > return -ENOMEM; > + } > bitmap_zero(d->poll_mask, MAX_VIRT_CPUS); > #endif > > @@ -1214,6 +1225,8 @@ void evtchn_destroy(struct domain *d) > spin_unlock(&d->event_lock); > > clear_global_virq_handlers(d); > + > + free_xenheap_page(d->evtchn); > } > > > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > index f6846d4..812bd87 100644 > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -260,7 +260,7 @@ struct domain > spinlock_t rangesets_lock; > > /* Event channel information. */ > - struct evtchn *evtchn[NR_EVTCHN_BUCKETS]; > + struct evtchn **evtchn; > spinlock_t event_lock; > > struct grant_table *grant_table; > -- > 1.7.10.4
Jan Beulich
2013-Feb-27 16:42 UTC
Re: [RFC PATCH V3 06/22] Define extended event channel registration interface
>>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: > +/* commands: > + * EVTCHN_EXTENDED_QUERY(0): query supported extended event channel types, > + * _NONE supported types are or''ed in return value of > + * the hypercall. > + * EVTCHN_EXTENDED_*: specific extended event channel subcommand. > + */ > +#define EVTCHN_EXTENDED_QUERY 0 > +/* supported extended event channel */ > +#define EVTCHN_EXTENDED_NONE 0 > +#define _EVTCHN_EXTENDED_L3 0 > +#define EVTCHN_EXTENDED_L3 (1U << _EVTCHN_EXTENDED_L3) > +struct evtchn_register_extended { > + /* IN parameters. */ > + uint32_t cmd;Looking at patch 18 you seem to indeed plan on passing a bit mask with a single bit set as command. Is that really reasonable? I can see the need for the query to return a bit mask, but that''s it. Jan
Jan Beulich
2013-Feb-27 16:47 UTC
Re: [RFC PATCH V3 12/22] Add supported extended event channel ABI bitmap
>>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > xen/common/event_channel.c | 4 ++++ > xen/include/xen/event.h | 1 + > 2 files changed, 5 insertions(+) > > diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c > index c73c709..470b8d2 100644 > --- a/xen/common/event_channel.c > +++ b/xen/common/event_channel.c > @@ -32,6 +32,10 @@ > #include <public/event_channel.h> > #include <xsm/xsm.h> > > +/* A bitmap of supported extended event channel ABIs */ > +uint32_t extended_event_channel = (EVTCHN_EXTENDED_NONE |This should be just EVTCHN_EXTENDED_NONE until the last patch that fully implements 3-level event channels. Jan> + EVTCHN_EXTENDED_L3); > + > #define ERROR_EXIT(_errno) \ > do { \ > gdprintk(XENLOG_WARNING, \
>>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: > --- a/xen/common/event_channel.c > +++ b/xen/common/event_channel.c > @@ -36,6 +36,20 @@ > uint32_t extended_event_channel = (EVTCHN_EXTENDED_NONE | > EVTCHN_EXTENDED_L3); > > +static inline const char * evtchn_abi_str(unsigned int abi) > +{ > + switch ( abi ) > + { > + case EVTCHN_EXTENDED_NONE: > + return "2-level"; > + case EVTCHN_EXTENDED_L3: > + return "3-level"; > + default: > + BUG(); > + } > + return ""; /* make compiler happy */ > +} > +This is the sort of change that looks completely bogus - even the next few patches don''t seem to make use of this. Why can''t this be added when the first user of it appears? It surely won''t make reviewing that patch more difficult... And that''s a general problem (for me at least) with how you break up patches: Having looked ahead at 18 and 19, the latter undoes quite a significant portion of what the former did. Both being far from huge and unreviewable, why don''t you fold them so things actually make sense to the reader? Jan
Jan Beulich
2013-Feb-27 16:53 UTC
Re: [RFC PATCH V3 16/22] Introduce some macros for event channels
>>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: > --- a/xen/include/asm-arm/types.h > +++ b/xen/include/asm-arm/types.h > @@ -41,10 +41,13 @@ typedef char bool_t; > #define test_and_clear_bool(b) xchg(&(b), 0) > > #endif /* __ASSEMBLY__ */ > +#define BYTE_BITORDER 3 > +#define BITS_PER_BYTE (1 << BYTE_BITORDER) > > -#define BITS_PER_LONG 32 > -#define BYTES_PER_LONG 4 > +#define BITS_PER_LONG (1 << LONG_BITORDER) > #define LONG_BYTEORDER 2 > +#define LONG_BITORDER (LONG_BYTEORDER + BYTE_BITORDER) > +#define BYTES_PER_LONG (1 << LONG_BYTEORDER)Is that all really correct and complete in the context of arm64 and an ABI-long not being 32 bits on arm32? Jan
Jan Beulich
2013-Feb-27 16:58 UTC
Re: [RFC PATCH V3 21/22] Only allow extended event channel on Dom0 and driver domains
>>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -224,6 +224,8 @@ struct domain > > unsigned long *evtchn_pending; > unsigned long *evtchn_mask; > + /* Can the domain use extended event channel (new ABI)? */ > + bool_t evtchn_extended_allowed; > > struct grant_table *grant_table; >Please find a better place for this - putting a 1-byte variable between two pointers adds unnecessary 7-byte padding. And there''s already a bunch of bool_t-s in struct domain... Jan
Jan Beulich
2013-Feb-27 17:01 UTC
Re: [RFC PATCH V3] Implement 3-level event channel in Xen
>>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: > Keir, Jan, Ian and David, are you happy with this design in general? I would > like to have explicit ACK / NACK on this if possible, as feature freeze for > 4.3 is quite close now.The patches look reasonable (apart from the comments I just gave on some of them), but other than Keir I''m not that eager to see this one go in for 4.3 in order to then likely be replaced by an implementation of David''s design in the 4.4 cycle. Jan
Ian Campbell
2013-Feb-27 17:04 UTC
Re: [RFC PATCH V3 16/22] Introduce some macros for event channels
On Wed, 2013-02-27 at 16:53 +0000, Jan Beulich wrote:> >>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: > > --- a/xen/include/asm-arm/types.h > > +++ b/xen/include/asm-arm/types.h > > @@ -41,10 +41,13 @@ typedef char bool_t; > > #define test_and_clear_bool(b) xchg(&(b), 0) > > > > #endif /* __ASSEMBLY__ */ > > +#define BYTE_BITORDER 3 > > +#define BITS_PER_BYTE (1 << BYTE_BITORDER) > > > > -#define BITS_PER_LONG 32 > > -#define BYTES_PER_LONG 4 > > +#define BITS_PER_LONG (1 << LONG_BITORDER) > > #define LONG_BYTEORDER 2 > > +#define LONG_BITORDER (LONG_BYTEORDER + BYTE_BITORDER) > > +#define BYTES_PER_LONG (1 << LONG_BYTEORDER) > > Is that all really correct and complete in the context of arm64 and > an ABI-long not being 32 bits on arm32?This header is about the internal types, I think, and so should represent the compiler''s idea of what the actual long type is for the benefit of common code. Of course if these are also being used for ABI things, this is a bug on ARM. I fixed the use of BITS_PER_LONG in the evtchn interface in a patch this morning. I couldn''t see any others. Ian.
Keir Fraser
2013-Feb-27 19:49 UTC
Re: [RFC PATCH V3] Implement 3-level event channel in Xen
On 27/02/2013 17:01, "Jan Beulich" <JBeulich@suse.com> wrote:>>>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: >> Keir, Jan, Ian and David, are you happy with this design in general? I would >> like to have explicit ACK / NACK on this if possible, as feature freeze for >> 4.3 is quite close now. > > The patches look reasonable (apart from the comments I just gave > on some of them), but other than Keir I''m not that eager to see this > one go in for 4.3 in order to then likely be replaced by an > implementation of David''s design in the 4.4 cycle.If this went in for 4.3, a re-design would really have to be implemented and measurably prove its worth to make it in as another replacement. -- Keir> Jan >
On Wed, Feb 27, 2013 at 07:49:34PM +0000, Keir Fraser wrote:> On 27/02/2013 17:01, "Jan Beulich" <JBeulich@suse.com> wrote: > > >>>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: > >> Keir, Jan, Ian and David, are you happy with this design in general? I would > >> like to have explicit ACK / NACK on this if possible, as feature freeze for > >> 4.3 is quite close now. > > > > The patches look reasonable (apart from the comments I just gave > > on some of them), but other than Keir I''m not that eager to see this > > one go in for 4.3 in order to then likely be replaced by an > > implementation of David''s design in the 4.4 cycle. > > If this went in for 4.3, a re-design would really have to be implemented and > measurably prove its worth to make it in as another replacement. >Even this design does not go in for 4.3, any new design will still have to be implemented and measurable prove its worth, as with this design. The value of this design lies in that a) it''s straight forward, easy to prove its correctness (or wrongness), b) it meets our need for now, c) a new interface will always be necessary, be it this design or any other design, d) it buys time for any better designs to become mature. I understand your (and Jan''s) concern for burdens of maintaining different ABIs, that''s why the interface has been made to allow user to query enabled / supported ABIs. If we need to add / remove ABIs, it''s just a matter of setting some feature bits. I don''t mean to push this design to get merged, it''s up to maintainers to decide. Actually I''m fine with any decision. Explicit Ack / Nack will help me arrange my future work better. Wei.> -- Keir > > > Jan > > > >
Keir Fraser
2013-Feb-28 05:58 UTC
Re: [RFC PATCH V3] Implement 3-level event channel in Xen
On 27/02/2013 23:19, "Wei Liu" <wei.liu2@citrix.com> wrote:> On Wed, Feb 27, 2013 at 07:49:34PM +0000, Keir Fraser wrote: >> On 27/02/2013 17:01, "Jan Beulich" <JBeulich@suse.com> wrote: >> >>>>>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: >>>> Keir, Jan, Ian and David, are you happy with this design in general? I >>>> would >>>> like to have explicit ACK / NACK on this if possible, as feature freeze for >>>> 4.3 is quite close now. >>> >>> The patches look reasonable (apart from the comments I just gave >>> on some of them), but other than Keir I''m not that eager to see this >>> one go in for 4.3 in order to then likely be replaced by an >>> implementation of David''s design in the 4.4 cycle. >> >> If this went in for 4.3, a re-design would really have to be implemented and >> measurably prove its worth to make it in as another replacement. > > Even this design does not go in for 4.3, any new design will still have > to be implemented and measurable prove its worth, as with this design.I believe there''s a requirement to solve the dom0 event-channel limit for 4.3. Either of the proposed designs obviously solves that, without need for implement and test. Of course, we need to do that to test for correctness and performance regressions anyway. But I mean with that primary requirement satisfied, yet another evtchn ABI really then has to have some compelling advantages to get committed. -- Keir> The value of this design lies in that a) it''s straight forward, easy to > prove its correctness (or wrongness), b) it meets our need for now, c) > a new interface will always be necessary, be it this design or any other > design, d) it buys time for any better designs to become mature. > > I understand your (and Jan''s) concern for burdens of maintaining > different ABIs, that''s why the interface has been made to allow user to > query enabled / supported ABIs. If we need to add / remove ABIs, it''s > just a matter of setting some feature bits. > > I don''t mean to push this design to get merged, it''s up to maintainers > to decide. Actually I''m fine with any decision. Explicit Ack / Nack will > help me arrange my future work better. > > > Wei. > >> -- Keir >> >>> Jan >>> >> >>
Jan Beulich
2013-Feb-28 07:23 UTC
Re: [RFC PATCH V3] Implement 3-level event channel in Xen
>>> On 27.02.13 at 20:49, Keir Fraser <keir.xen@gmail.com> wrote: > On 27/02/2013 17:01, "Jan Beulich" <JBeulich@suse.com> wrote: > >>>>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: >>> Keir, Jan, Ian and David, are you happy with this design in general? I would >>> like to have explicit ACK / NACK on this if possible, as feature freeze for >>> 4.3 is quite close now. >> >> The patches look reasonable (apart from the comments I just gave >> on some of them), but other than Keir I''m not that eager to see this >> one go in for 4.3 in order to then likely be replaced by an >> implementation of David''s design in the 4.4 cycle. > > If this went in for 4.3, a re-design would really have to be implemented and > measurably prove its worth to make it in as another replacement.Or be easier to maintain and/or extend further (which I expect would both apply to David''s alternative). Jan
Jan Beulich
2013-Feb-28 07:54 UTC
Re: [RFC PATCH V3 16/22] Introduce some macros for event channels
>>> On 27.02.13 at 18:04, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Wed, 2013-02-27 at 16:53 +0000, Jan Beulich wrote: >> >>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: >> > --- a/xen/include/asm-arm/types.h >> > +++ b/xen/include/asm-arm/types.h >> > @@ -41,10 +41,13 @@ typedef char bool_t; >> > #define test_and_clear_bool(b) xchg(&(b), 0) >> > >> > #endif /* __ASSEMBLY__ */ >> > +#define BYTE_BITORDER 3 >> > +#define BITS_PER_BYTE (1 << BYTE_BITORDER) >> > >> > -#define BITS_PER_LONG 32 >> > -#define BYTES_PER_LONG 4 >> > +#define BITS_PER_LONG (1 << LONG_BITORDER) >> > #define LONG_BYTEORDER 2 >> > +#define LONG_BITORDER (LONG_BYTEORDER + BYTE_BITORDER) >> > +#define BYTES_PER_LONG (1 << LONG_BYTEORDER) >> >> Is that all really correct and complete in the context of arm64 and >> an ABI-long not being 32 bits on arm32? > > This header is about the internal types, I think, and so should > represent the compiler''s idea of what the actual long type is for the > benefit of common code. > > Of course if these are also being used for ABI things, this is a bug on > ARM.Wei - could you please clarify/double check? I was under the impression that these were used e.g. for thing like the event channel bit maps and selector words, and those are clearly xen_ulong_t, not unsigned long. Thanks, Jan
Ian Campbell
2013-Feb-28 08:35 UTC
Re: [RFC PATCH V3 16/22] Introduce some macros for event channels
On Thu, 2013-02-28 at 07:54 +0000, Jan Beulich wrote:> Wei - could you please clarify/double check? I was under the > impression that these were used e.g. for thing like the event > channel bit maps and selector words, and those are clearly > xen_ulong_t, not unsigned long.This is fixed by <1361970894-12965-1-git-send-email-ian.campbell@citrix.com> "xen: correct BITS_PER_EVTCHN_WORD on arm" which is not yet applied only because I''m supposed to be packing for my trip to Linaro Connect and not checking email ;-) Ian.
Wei Liu
2013-Feb-28 11:17 UTC
Re: [RFC PATCH V3 16/22] Introduce some macros for event channels
On Thu, Feb 28, 2013 at 07:54:26AM +0000, Jan Beulich wrote:> >>> On 27.02.13 at 18:04, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > On Wed, 2013-02-27 at 16:53 +0000, Jan Beulich wrote: > >> >>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: > >> > --- a/xen/include/asm-arm/types.h > >> > +++ b/xen/include/asm-arm/types.h > >> > @@ -41,10 +41,13 @@ typedef char bool_t; > >> > #define test_and_clear_bool(b) xchg(&(b), 0) > >> > > >> > #endif /* __ASSEMBLY__ */ > >> > +#define BYTE_BITORDER 3 > >> > +#define BITS_PER_BYTE (1 << BYTE_BITORDER) > >> > > >> > -#define BITS_PER_LONG 32 > >> > -#define BYTES_PER_LONG 4 > >> > +#define BITS_PER_LONG (1 << LONG_BITORDER) > >> > #define LONG_BYTEORDER 2 > >> > +#define LONG_BITORDER (LONG_BYTEORDER + BYTE_BITORDER) > >> > +#define BYTES_PER_LONG (1 << LONG_BYTEORDER) > >> > >> Is that all really correct and complete in the context of arm64 and > >> an ABI-long not being 32 bits on arm32? > > > > This header is about the internal types, I think, and so should > > represent the compiler''s idea of what the actual long type is for the > > benefit of common code. > > > > Of course if these are also being used for ABI things, this is a bug on > > ARM. > > Wei - could you please clarify/double check? I was under the > impression that these were used e.g. for thing like the event > channel bit maps and selector words, and those are clearly > xen_ulong_t, not unsigned long. >Yes, this one needs to be fixed. The whole series was built on top of Ian''s xen_ulong_t patch, but I missed this one. :-( Also I need to pick up his latest patch. Wei.> Thanks, Jan >
Wei Liu
2013-Feb-28 11:19 UTC
Re: [RFC PATCH V3 21/22] Only allow extended event channel on Dom0 and driver domains
On Wed, Feb 27, 2013 at 04:58:46PM +0000, Jan Beulich wrote:> >>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: > > --- a/xen/include/xen/sched.h > > +++ b/xen/include/xen/sched.h > > @@ -224,6 +224,8 @@ struct domain > > > > unsigned long *evtchn_pending; > > unsigned long *evtchn_mask; > > + /* Can the domain use extended event channel (new ABI)? */ > > + bool_t evtchn_extended_allowed; > > > > struct grant_table *grant_table; > > > > Please find a better place for this - putting a 1-byte variable > between two pointers adds unnecessary 7-byte padding. And > there''s already a bunch of bool_t-s in struct domain... >Sure. Wei.
On 27/02/13 14:33, Wei Liu wrote:> Some definitions for event channel have been moved to new header file.This suggests the previous patch broke the build. Fold 3 and 4 together or you will break bisection. David> Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > xen/xsm/flask/hooks.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c > index 29a78dd..6d446ab 100644 > --- a/xen/xsm/flask/hooks.c > +++ b/xen/xsm/flask/hooks.c > @@ -11,6 +11,7 @@ > #include <xen/init.h> > #include <xen/lib.h> > #include <xen/sched.h> > +#include <xen/event.h> > #include <xen/paging.h> > #include <xen/xmalloc.h> > #include <xsm/xsm.h>
Wei Liu
2013-Feb-28 11:21 UTC
Re: [RFC PATCH V3 12/22] Add supported extended event channel ABI bitmap
On Wed, Feb 27, 2013 at 04:47:44PM +0000, Jan Beulich wrote:> >>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > --- > > xen/common/event_channel.c | 4 ++++ > > xen/include/xen/event.h | 1 + > > 2 files changed, 5 insertions(+) > > > > diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c > > index c73c709..470b8d2 100644 > > --- a/xen/common/event_channel.c > > +++ b/xen/common/event_channel.c > > @@ -32,6 +32,10 @@ > > #include <public/event_channel.h> > > #include <xsm/xsm.h> > > > > +/* A bitmap of supported extended event channel ABIs */ > > +uint32_t extended_event_channel = (EVTCHN_EXTENDED_NONE | > > This should be just EVTCHN_EXTENDED_NONE until the last > patch that fully implements 3-level event channels. >No problem. Wei.> Jan > > > + EVTCHN_EXTENDED_L3); > > + > > #define ERROR_EXIT(_errno) \ > > do { \ > > gdprintk(XENLOG_WARNING, \ > >
On Thu, Feb 28, 2013 at 11:20:40AM +0000, David Vrabel wrote:> On 27/02/13 14:33, Wei Liu wrote: > > Some definitions for event channel have been moved to new header file. > > This suggests the previous patch broke the build. Fold 3 and 4 together > or you will break bisection.OK, thanks. Wei.> > David > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > --- > > xen/xsm/flask/hooks.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c > > index 29a78dd..6d446ab 100644 > > --- a/xen/xsm/flask/hooks.c > > +++ b/xen/xsm/flask/hooks.c > > @@ -11,6 +11,7 @@ > > #include <xen/init.h> > > #include <xen/lib.h> > > #include <xen/sched.h> > > +#include <xen/event.h> > > #include <xen/paging.h> > > #include <xen/xmalloc.h> > > #include <xsm/xsm.h> >
Wei Liu
2013-Feb-28 11:25 UTC
Re: [RFC PATCH V3 06/22] Define extended event channel registration interface
On Wed, Feb 27, 2013 at 04:42:56PM +0000, Jan Beulich wrote:> >>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: > > +/* commands: > > + * EVTCHN_EXTENDED_QUERY(0): query supported extended event channel types, > > + * _NONE supported types are or''ed in return value of > > + * the hypercall. > > + * EVTCHN_EXTENDED_*: specific extended event channel subcommand. > > + */ > > +#define EVTCHN_EXTENDED_QUERY 0 > > +/* supported extended event channel */ > > +#define EVTCHN_EXTENDED_NONE 0 > > +#define _EVTCHN_EXTENDED_L3 0 > > +#define EVTCHN_EXTENDED_L3 (1U << _EVTCHN_EXTENDED_L3) > > +struct evtchn_register_extended { > > + /* IN parameters. */ > > + uint32_t cmd; > > Looking at patch 18 you seem to indeed plan on passing a bit mask > with a single bit set as command. Is that really reasonable? I can > see the need for the query to return a bit mask, but that''s it. >When passing as command, the cmd field is not a bit mask. It is just that I reuse the numeric representation of the bit mask, saving the need to assign different numbers to commands. Wei.> Jan >
On Wed, Feb 27, 2013 at 04:51:45PM +0000, Jan Beulich wrote:> >>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: > > --- a/xen/common/event_channel.c > > +++ b/xen/common/event_channel.c > > @@ -36,6 +36,20 @@ > > uint32_t extended_event_channel = (EVTCHN_EXTENDED_NONE | > > EVTCHN_EXTENDED_L3); > > > > +static inline const char * evtchn_abi_str(unsigned int abi) > > +{ > > + switch ( abi ) > > + { > > + case EVTCHN_EXTENDED_NONE: > > + return "2-level"; > > + case EVTCHN_EXTENDED_L3: > > + return "3-level"; > > + default: > > + BUG(); > > + } > > + return ""; /* make compiler happy */ > > +} > > + > > This is the sort of change that looks completely bogus - even the > next few patches don''t seem to make use of this. Why can''t this > be added when the first user of it appears? It surely won''t make > reviewing that patch more difficult... >Do you mean the implementation is bogus or the way I break my patches?> And that''s a general problem (for me at least) with how you break > up patches: Having looked ahead at 18 and 19, the latter undoes > quite a significant portion of what the former did. Both being far > from huge and unreviewable, why don''t you fold them so things > actually make sense to the reader? >Sure, this can be done. Wei.
On 27/02/13 14:33, Wei Liu wrote:> Affected files: > * event_channel.c > * sched.h > * event.h > * xen.hIs this sort of white space patch useful? David> Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > xen/common/event_channel.c | 16 ++++++++-------- > xen/include/public/xen.h | 22 +++++++++++----------- > xen/include/xen/event.h | 4 ++-- > xen/include/xen/sched.h | 6 +++--- > 4 files changed, 24 insertions(+), 24 deletions(-) > > diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c > index 2d7afc9..9231eb0 100644 > --- a/xen/common/event_channel.c > +++ b/xen/common/event_channel.c > @@ -1,15 +1,15 @@ > /****************************************************************************** > * event_channel.c > - * > + * > * Event notifications from VIRQs, PIRQs, and other domains. > - * > + * > * Copyright (c) 2003-2006, K A Fraser. > - * > + * > * This program is distributed in the hope that it will be useful, > * but WITHOUT ANY WARRANTY; without even the implied warranty of > * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > * GNU General Public License for more details. > - * > + * > * You should have received a copy of the GNU General Public License > * along with this program; if not, write to the Free Software > * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA > @@ -238,7 +238,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) > lchn->u.interdomain.remote_dom = rd; > lchn->u.interdomain.remote_port = (u16)rport; > lchn->state = ECS_INTERDOMAIN; > - > + > rchn->u.interdomain.remote_dom = ld; > rchn->u.interdomain.remote_port = (u16)lport; > rchn->state = ECS_INTERDOMAIN; > @@ -255,7 +255,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) > spin_unlock(&ld->event_lock); > if ( ld != rd ) > spin_unlock(&rd->event_lock); > - > + > rcu_unlock_domain(rd); > > return rc; > @@ -633,7 +633,7 @@ static void evtchn_set_pending(struct vcpu *v, int port) > { > vcpu_mark_events_pending(v); > } > - > + > /* Check if some VCPU might be polling for this event. */ > if ( likely(bitmap_empty(d->poll_mask, d->max_vcpus)) ) > return; > @@ -930,7 +930,7 @@ int evtchn_unmask(unsigned int port) > > /* > * These operations must happen in strict order. Based on > - * include/xen/event.h:evtchn_set_pending(). > + * include/xen/event.h:evtchn_set_pending(). > */ > if ( test_and_clear_bit(port, &shared_info(d, evtchn_mask)) && > test_bit (port, &shared_info(d, evtchn_pending)) && > diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h > index 99c8212..87380e6 100644 > --- a/xen/include/public/xen.h > +++ b/xen/include/public/xen.h > @@ -1,8 +1,8 @@ > /****************************************************************************** > * xen.h > - * > + * > * Guest OS interface to Xen. > - * > + * > * Permission is hereby granted, free of charge, to any person obtaining a copy > * of this software and associated documentation files (the "Software"), to > * deal in the Software without restriction, including without limitation the > @@ -137,11 +137,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t); > #define __HYPERVISOR_dom0_op __HYPERVISOR_platform_op > #endif > > -/* > +/* > * VIRTUAL INTERRUPTS > - * > + * > * Virtual interrupts that a guest OS may receive from Xen. > - * > + * > * In the side comments, ''V.'' denotes a per-VCPU VIRQ while ''G.'' denotes a > * global VIRQ. The former can be bound once per VCPU and cannot be re-bound. > * The latter can be allocated only once per guest: they must initially be > @@ -190,7 +190,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t); > * (x) encodes the PFD as follows: > * x == 0 => PFD == DOMID_SELF > * x != 0 => PFD == x - 1 > - * > + * > * Sub-commands: ptr[1:0] specifies the appropriate MMU_* command. > * ------------- > * ptr[1:0] == MMU_NORMAL_PT_UPDATE: > @@ -236,13 +236,13 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t); > * To deallocate the pages, the operations are the reverse of the steps > * mentioned above. The argument is MMUEXT_UNPIN_TABLE for all levels and the > * pagetable MUST not be in use (meaning that the cr3 is not set to it). > - * > + * > * ptr[1:0] == MMU_MACHPHYS_UPDATE: > * Updates an entry in the machine->pseudo-physical mapping table. > * ptr[:2] -- Machine address within the frame whose mapping to modify. > * The frame must belong to the FD, if one is specified. > * val -- Value to write into the mapping entry. > - * > + * > * ptr[1:0] == MMU_PT_UPDATE_PRESERVE_AD: > * As MMU_NORMAL_PT_UPDATE above, but A/D bits currently in the PTE are ORed > * with those in @val. > @@ -588,7 +588,7 @@ typedef struct vcpu_time_info vcpu_time_info_t; > struct vcpu_info { > /* > * ''evtchn_upcall_pending'' is written non-zero by Xen to indicate > - * a pending notification for a particular VCPU. It is then cleared > + * a pending notification for a particular VCPU. It is then cleared > * by the guest OS /before/ checking for pending work, thus avoiding > * a set-and-check race. Note that the mask is only accessed by Xen > * on the CPU that is currently hosting the VCPU. This means that the > @@ -646,7 +646,7 @@ struct shared_info { > * 3. Virtual interrupts (''events''). A domain can bind an event-channel > * port to a virtual interrupt source, such as the virtual-timer > * device or the emergency console. > - * > + * > * Event channels are addressed by a "port index". Each channel is > * associated with two bits of information: > * 1. PENDING -- notifies the domain that there is a pending notification > @@ -657,7 +657,7 @@ struct shared_info { > * becomes pending while the channel is masked then the ''edge'' is lost > * (i.e., when the channel is unmasked, the guest must manually handle > * pending notifications as no upcall will be scheduled by Xen). > - * > + * > * To expedite scanning of pending notifications, any 0->1 pending > * transition on an unmasked channel causes a corresponding bit in a > * per-vcpu selector word to be set. Each bit in the selector covers a > diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h > index 71c3e92..65ac81a 100644 > --- a/xen/include/xen/event.h > +++ b/xen/include/xen/event.h > @@ -1,8 +1,8 @@ > /****************************************************************************** > * event.h > - * > + * > * A nice interface for passing asynchronous events to guest OSes. > - * > + * > * Copyright (c) 2002-2006, K A Fraser > */ > > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > index ba0f2f8..f6846d4 100644 > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -92,7 +92,7 @@ void evtchn_destroy_final(struct domain *d); /* from complete_domain_destroy */ > > struct waitqueue_vcpu; > > -struct vcpu > +struct vcpu > { > int vcpu_id; > > @@ -453,7 +453,7 @@ struct domain *domain_create( > /* > * rcu_lock_domain_by_id() is more efficient than get_domain_by_id(). > * This is the preferred function if the returned domain reference > - * is short lived, but it cannot be used if the domain reference needs > + * is short lived, but it cannot be used if the domain reference needs > * to be kept beyond the current scope (e.g., across a softirq). > * The returned domain reference must be discarded using rcu_unlock_domain(). > */ > @@ -574,7 +574,7 @@ void sync_local_execstate(void); > * sync_vcpu_execstate() will switch and commit @prev''s state. > */ > void context_switch( > - struct vcpu *prev, > + struct vcpu *prev, > struct vcpu *next); > > /*
Jan Beulich
2013-Feb-28 11:32 UTC
Re: [RFC PATCH V3 06/22] Define extended event channel registration interface
>>> On 28.02.13 at 12:25, Wei Liu <wei.liu2@citrix.com> wrote: > On Wed, Feb 27, 2013 at 04:42:56PM +0000, Jan Beulich wrote: >> >>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: >> > +/* commands: >> > + * EVTCHN_EXTENDED_QUERY(0): query supported extended event channel types, >> > + * _NONE supported types are or''ed in return value of >> > + * the hypercall. >> > + * EVTCHN_EXTENDED_*: specific extended event channel subcommand. >> > + */ >> > +#define EVTCHN_EXTENDED_QUERY 0 >> > +/* supported extended event channel */ >> > +#define EVTCHN_EXTENDED_NONE 0 >> > +#define _EVTCHN_EXTENDED_L3 0 >> > +#define EVTCHN_EXTENDED_L3 (1U << _EVTCHN_EXTENDED_L3) >> > +struct evtchn_register_extended { >> > + /* IN parameters. */ >> > + uint32_t cmd; >> >> Looking at patch 18 you seem to indeed plan on passing a bit mask >> with a single bit set as command. Is that really reasonable? I can >> see the need for the query to return a bit mask, but that''s it. >> > > When passing as command, the cmd field is not a bit mask. It is just > that I reuse the numeric representation of the bit mask, saving the need > to assign different numbers to commands.I understand that, but this will limit you to 32 commands (plus zero for the query). I''d instead suggest to set _EVTCHN_EXTENDED_L3 to 1, and pass this as command. Bit 0 of the mask returned by the query will then remain unused, allowing it to become a meaning when e.g. the 31 bits don''t suffice anymore. Which isn''t to say that I expect 31 or even more different implementations here - I merely like to have interfaces be extensible from an abstract perspective, even if one can''t foresee the need for extensions (which might turn out necessary tens of years later). Jan
>>> On 28.02.13 at 12:28, Wei Liu <wei.liu2@citrix.com> wrote: > On Wed, Feb 27, 2013 at 04:51:45PM +0000, Jan Beulich wrote: >> >>> On 27.02.13 at 15:34, Wei Liu <wei.liu2@citrix.com> wrote: >> > --- a/xen/common/event_channel.c >> > +++ b/xen/common/event_channel.c >> > @@ -36,6 +36,20 @@ >> > uint32_t extended_event_channel = (EVTCHN_EXTENDED_NONE | >> > EVTCHN_EXTENDED_L3); >> > >> > +static inline const char * evtchn_abi_str(unsigned int abi) >> > +{ >> > + switch ( abi ) >> > + { >> > + case EVTCHN_EXTENDED_NONE: >> > + return "2-level"; >> > + case EVTCHN_EXTENDED_L3: >> > + return "3-level"; >> > + default: >> > + BUG(); >> > + } >> > + return ""; /* make compiler happy */ >> > +} >> > + >> >> This is the sort of change that looks completely bogus - even the >> next few patches don''t seem to make use of this. Why can''t this >> be added when the first user of it appears? It surely won''t make >> reviewing that patch more difficult... >> > > Do you mean the implementation is bogus or the way I break > my patches?The latter - the need for this function, as said, doesn''t even become visible looking at the next few patches. Jan
On Thu, Feb 28, 2013 at 11:30:56AM +0000, David Vrabel wrote:> On 27/02/13 14:33, Wei Liu wrote: > > Affected files: > > * event_channel.c > > * sched.h > > * event.h > > * xen.h > > Is this sort of white space patch useful? >Not really. But people who have their editors configured to eliminate white spaces automatically will trip over this and introduce stray white spaces changes in their patches. Feel free to drop this. Wei.
>>> On 28.02.13 at 12:30, David Vrabel <david.vrabel@citrix.com> wrote: > On 27/02/13 14:33, Wei Liu wrote: >> Affected files: >> * event_channel.c >> * sched.h >> * event.h >> * xen.h > > Is this sort of white space patch useful?I think so. It seems we still don''t have an in-tree coding style document, and I don''t recall where the out-of-tree one lives, but even if it doesn''t state the "no trailing white space" rule, I think it should. Jan
Wei Liu
2013-Feb-28 11:41 UTC
Re: [RFC PATCH V3 06/22] Define extended event channel registration interface
On Thu, Feb 28, 2013 at 11:32:24AM +0000, Jan Beulich wrote:> >>> On 28.02.13 at 12:25, Wei Liu <wei.liu2@citrix.com> wrote: > > On Wed, Feb 27, 2013 at 04:42:56PM +0000, Jan Beulich wrote: > >> >>> On 27.02.13 at 15:33, Wei Liu <wei.liu2@citrix.com> wrote: > >> > +/* commands: > >> > + * EVTCHN_EXTENDED_QUERY(0): query supported extended event channel types, > >> > + * _NONE supported types are or''ed in return value of > >> > + * the hypercall. > >> > + * EVTCHN_EXTENDED_*: specific extended event channel subcommand. > >> > + */ > >> > +#define EVTCHN_EXTENDED_QUERY 0 > >> > +/* supported extended event channel */ > >> > +#define EVTCHN_EXTENDED_NONE 0 > >> > +#define _EVTCHN_EXTENDED_L3 0 > >> > +#define EVTCHN_EXTENDED_L3 (1U << _EVTCHN_EXTENDED_L3) > >> > +struct evtchn_register_extended { > >> > + /* IN parameters. */ > >> > + uint32_t cmd; > >> > >> Looking at patch 18 you seem to indeed plan on passing a bit mask > >> with a single bit set as command. Is that really reasonable? I can > >> see the need for the query to return a bit mask, but that''s it. > >> > > > > When passing as command, the cmd field is not a bit mask. It is just > > that I reuse the numeric representation of the bit mask, saving the need > > to assign different numbers to commands. > > I understand that, but this will limit you to 32 commands (plus > zero for the query). > > I''d instead suggest to set _EVTCHN_EXTENDED_L3 to 1, and > pass this as command. Bit 0 of the mask returned by the query > will then remain unused, allowing it to become a meaning when > e.g. the 31 bits don''t suffice anymore. Which isn''t to say that > I expect 31 or even more different implementations here - I > merely like to have interfaces be extensible from an abstract > perspective, even if one can''t foresee the need for extensions > (which might turn out necessary tens of years later). >Sounds reasonable, I''ll make it that way. Wei.> Jan >
David Vrabel
2013-Feb-28 11:55 UTC
Re: [RFC PATCH V3 07/22] Add evtchn_extended in struct domain
On 27/02/13 14:34, Wei Liu wrote:> This field is a bitmap of currently in use extended event channel ABI, which > can have 0 (no extended event channel in use) 1 bit set. It is manipulated by > hypervisor only, so if anything goes wrong it is a bug. > > The default event channel ABI is EVTCHN_EXTENDED_NONE, which means no extended > event channel is used.[...]> --- a/xen/include/xen/event.h > +++ b/xen/include/xen/event.h > @@ -14,6 +14,7 @@ > #include <xen/softirq.h> > #include <asm/bitops.h> > #include <asm/event.h> > +#include <public/event_channel.h> > > #ifndef CONFIG_COMPAT > #define BITS_PER_EVTCHN_WORD(d) BITS_PER_LONG > @@ -22,7 +23,16 @@ > #endif > static inline unsigned int max_evtchns(struct domain *d) > { > - return BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); > + unsigned int ret = 0; > + switch ( d->evtchn_extended ) > + { > + case EVTCHN_EXTENDED_NONE: > + ret = BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); > + break; > + default: > + BUG(); > + }BUG''ing in every switch that uses d->evtchn_extended doesn''t seem useful and may add extra overhead in hot paths. Perhaps an ASSERT() but this this field is written in such a limited set of places this doesn''t seem useful. David
David Vrabel
2013-Feb-28 11:58 UTC
Re: [RFC PATCH V3 05/22] Change MAX_EVTCHNS macro to max_evtchns inline function
On 27/02/13 14:33, Wei Liu wrote:> The calculation of max event channels depends on the actual ABI in use. Try to > avoid gcc-ism macro.[...]> +static inline unsigned int max_evtchns(struct domain *d) > +{ > + return BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); > +}This value doesn''t change over the life of the domain. Calculate it once and save it in a new d->max_evtchns field? David
On 27/02/13 14:34, Wei Liu wrote: "Update Xen public header" Which header? Updated how and why? This subject and commit message needs to be more descriptive. David
On Thu, 2013-02-28 at 11:38 +0000, Jan Beulich wrote:> >>> On 28.02.13 at 12:30, David Vrabel <david.vrabel@citrix.com> wrote: > > On 27/02/13 14:33, Wei Liu wrote: > >> Affected files: > >> * event_channel.c > >> * sched.h > >> * event.h > >> * xen.h > > > > Is this sort of white space patch useful? > > I think so. It seems we still don''t have an in-tree coding style > document,We have CODING_STYLE at the top-level now. I did''nt hceck what it says about whitespace...
On Thu, 2013-02-28 at 12:14 +0000, Ian Campbell wrote:> On Thu, 2013-02-28 at 11:38 +0000, Jan Beulich wrote: > > >>> On 28.02.13 at 12:30, David Vrabel <david.vrabel@citrix.com> wrote: > > > On 27/02/13 14:33, Wei Liu wrote: > > >> Affected files: > > >> * event_channel.c > > >> * sched.h > > >> * event.h > > >> * xen.h > > > > > > Is this sort of white space patch useful? > > > > I think so. It seems we still don''t have an in-tree coding style > > document, > > We have CODING_STYLE at the top-level now. I did''nt hceck what it says > about whitespace... > >Oh yes there is a CODING_STYLE now. And it says: There should be no trailing white space at the end of lines (including after the opening /* of a comment block). So this patch makes sense now... Wei.
David Vrabel
2013-Feb-28 12:32 UTC
Re: [RFC PATCH V3 06/22] Define extended event channel registration interface
On 27/02/13 14:33, Wei Liu wrote:> This interface allows user to query supported event channel types. If we need > to disable a specific ABI in the future, we just need to remove the dead code > and clear corresponding feature bit. > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > xen/include/public/event_channel.h | 44 ++++++++++++++++++++++++++++++++++++ > 1 file changed, 44 insertions(+) > > diff --git a/xen/include/public/event_channel.h b/xen/include/public/event_channel.h > index 07ff321..dff3364 100644 > --- a/xen/include/public/event_channel.h > +++ b/xen/include/public/event_channel.h > @@ -71,6 +71,7 @@ > #define EVTCHNOP_bind_vcpu 8 > #define EVTCHNOP_unmask 9 > #define EVTCHNOP_reset 10 > +#define EVTCHNOP_register_extended 11 > /* ` } */ > > typedef uint32_t evtchn_port_t; > @@ -258,6 +259,49 @@ struct evtchn_reset { > typedef struct evtchn_reset evtchn_reset_t; > > /* > + * EVTCHNOP_register_extended: Register extended event channel > + * NOTES: > + * 1. Currently only 3-level is supported. > + * 2. Should fall back to 2-level if this call fails. > + */ > +/* 64 bit guests need 8 pages for evtchn_pending and evtchn_mask for > + * 256k event channels while 32 bit ones only need 1 page for 32k > + * event channels. */ > +#define EVTCHN_MAX_L3_PAGES 8 > +struct evtchn_register_3level { > + /* IN parameters. */ > + uint32_t nr_pages; /* for evtchn_{pending,mask} */ > + uint32_t nr_vcpus; /* for l2sel_{mfns,offsets} */ > + XEN_GUEST_HANDLE(xen_pfn_t) evtchn_pending; > + XEN_GUEST_HANDLE(xen_pfn_t) evtchn_mask; > + XEN_GUEST_HANDLE(xen_pfn_t) l2sel_mfns; > + XEN_GUEST_HANDLE(xen_pfn_t) l2sel_offsets; > +};Registering per-VCPU resources at start-of-day doesn''t seem like the right thing to do here. Why waste resources for VCPUs that are never going to be used? And I don''t think we want to constraint how VCPU hotplug works by requiring resource for all possible VCPUs to be allocated up-front. If there isn''t enough resource for all VCPUs to all use the 3-level ABI then I think the preferred trade off is to offline the VCPUs that cannot get resources and not fallback to the 2-level ABI. The event channel limit is a hard scalability limit, number of VCPUs is a soft limit, so a guest is more likely to gracefully degrade in usefulness if it loses VCPUs instead of available event channels. Obviously, if 3-level is requested and the boot VCPUs fails to enable it, then it should fallback to 2-level because this is better than just panicking.> +typedef struct evtchn_register_3level evtchn_register_3level_t; > +DEFINE_XEN_GUEST_HANDLE(evtchn_register_3level_t); > + > +/* commands: > + * EVTCHN_EXTENDED_QUERY(0): query supported extended event channel types, > + * _NONE supported types are or''ed in return value of > + * the hypercall. > + * EVTCHN_EXTENDED_*: specific extended event channel subcommand.Combining query and register makes no sense.> + */ > +#define EVTCHN_EXTENDED_QUERY 0 > +/* supported extended event channel */ > +#define EVTCHN_EXTENDED_NONE 0 > +#define _EVTCHN_EXTENDED_L3 0 > +#define EVTCHN_EXTENDED_L3 (1U << _EVTCHN_EXTENDED_L3) > +struct evtchn_register_extended { > + /* IN parameters. */ > + uint32_t cmd; > + union { > + evtchn_register_3level_t l3; > + } u; > +};Adding new members to this union as new event channels ABI are implemented are going to change its size and potentially the alignment of the union member. This seems a potential for future ABI compatibility problems. See also by comment to patch 18. There doesn''t seem to be any value in having a single register sub-op for all possible future ABIs. Just add a new sub-op for each new ABI. David
David Vrabel
2013-Feb-28 12:33 UTC
Re: [RFC PATCH V3 18/22] Implement EVTCHNOP_register_extended
On 27/02/13 14:34, Wei Liu wrote:> Note: this call always fails as it is not yet completed. > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > xen/common/event_channel.c | 56 ++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 56 insertions(+) > > diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c > index 26daa7e..bb6e5f9 100644 > --- a/xen/common/event_channel.c > +++ b/xen/common/event_channel.c > @@ -1204,6 +1204,34 @@ static long evtchn_register_3level(evtchn_register_3level_t *arg) > return rc; > } > > +/* > + * NOTE to extneded event channel users: > + * extended channels are likely to consume lots large global mapping > + * area in Xen. For example, 3-level event channel consumes 16 + > + * nr_vcpus pages global mapping area. > + */ > +static long evtchn_register_extended(struct evtchn_register_extended *reg) > +{ > + struct domain *d = current->domain; > + int rc; > + > + spin_lock(&d->event_lock); > + > + switch ( reg->cmd ) > + { > + case EVTCHN_EXTENDED_NONE: > + default: > + rc = -EINVAL; > + case EVTCHN_EXTENDED_L3: > + rc = evtchn_register_3level(®->u.l3); > + break; > + } > + > + spin_unlock(&d->event_lock); > + > + return rc; > +} > + > long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) > { > long rc; > @@ -1312,6 +1340,19 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) > break; > } > > + case EVTCHNOP_register_extended: { > + struct evtchn_register_extended reg; > + > + if ( copy_from_guest(®, arg, 1) != 0 ) > + return -EFAULT;If the guest''s idea of the size of struct evtchn_register_extended is smaller than Xen''s, then Xen may try to copy more data that is actually available. This may then return -EFAULT unexpectedly if the guest allocated the structure at the end of a page and the following page is not mapped. David
David Vrabel
2013-Feb-28 12:36 UTC
Re: [RFC PATCH V3 19/22] Enable exteneded event channel ABI query
On 27/02/13 14:34, Wei Liu wrote:> --- a/xen/common/event_channel.c > +++ b/xen/common/event_channel.c > @@ -1219,12 +1219,14 @@ static long evtchn_register_extended(struct evtchn_register_extended *reg) > > switch ( reg->cmd ) > { > - case EVTCHN_EXTENDED_NONE: > - default: > - rc = -EINVAL; > + case EVTCHN_EXTENDED_QUERY: > + rc = extended_event_channel; > + break;I don''t like overloading the return value with both success/failure and the data. Can this info be returned in a parameter? David
David Vrabel
2013-Feb-28 12:43 UTC
Re: [RFC PATCH V3 21/22] Only allow extended event channel on Dom0 and driver domains
On 27/02/13 14:34, Wei Liu wrote:> For non-Dom0 domains, add a flag to indicate whether it can use extended event > channel, admins can specify this flag when creating a driver domain. > > The rationale behide this option is, a) extended event channel will > consume global mapping space in Xen, b) likely that only Dom0 and driver > domains need this.I would prefer the toolstack to limit the max port that the guest can use (see the EVTCHNOP_set_limit sub-op I proposed), than this single bit (which isn''t useful for the FIFO-based proposal). David
David Vrabel
2013-Feb-28 12:48 UTC
Re: [RFC PATCH V3 22/22] libxl: add evtchn_extended flag
On 27/02/13 14:34, Wei Liu wrote:> Admins can add "evtchn_extended = 1" in domain config file to enable extended > event channel ABI for a domain.You haven''t added any documentation for this new option. I think an option to limit the number of event channels is more intuitive for the guest administrator. See also my comment on patch 21. When you come to write documentation that is useful for a host administrator or toolstack author for what "evtchn_extended" actually means you''ll see what I mean. Particularly when you have to say it means slightly different things depending on whether you''re on ARM or on 32 or 64-bit x86. David
>>> On 28.02.13 at 13:14, Ian Campbell <ian.campbell@citrix.com> wrote: > On Thu, 2013-02-28 at 11:38 +0000, Jan Beulich wrote: >> >>> On 28.02.13 at 12:30, David Vrabel <david.vrabel@citrix.com> wrote: >> > On 27/02/13 14:33, Wei Liu wrote: >> >> Affected files: >> >> * event_channel.c >> >> * sched.h >> >> * event.h >> >> * xen.h >> > >> > Is this sort of white space patch useful? >> >> I think so. It seems we still don''t have an in-tree coding style >> document, > > We have CODING_STYLE at the top-level now. I did''nt hceck what it says > about whitespace...Oh, and I checked under docs/. It says "There should be no trailing white space at the end of lines (including after the opening /* of a comment block)", so a patch like the one here ought to be welcome. Jan
Wei Liu
2013-Feb-28 13:59 UTC
Re: [RFC PATCH V3 05/22] Change MAX_EVTCHNS macro to max_evtchns inline function
On Thu, 2013-02-28 at 11:58 +0000, David Vrabel wrote:> On 27/02/13 14:33, Wei Liu wrote: > > The calculation of max event channels depends on the actual ABI in use. Try to > > avoid gcc-ism macro. > [...] > > +static inline unsigned int max_evtchns(struct domain *d) > > +{ > > + return BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); > > +} > > This value doesn''t change over the life of the domain. Calculate it once > and save it in a new d->max_evtchns field? >This can change during domain life cycle. But it is doable to save it in d->max_evtchns and change it when necessary. Wei.> David
Wei Liu
2013-Feb-28 15:04 UTC
Re: [RFC PATCH V3 06/22] Define extended event channel registration interface
On Thu, 2013-02-28 at 12:32 +0000, David Vrabel wrote:> On 27/02/13 14:33, Wei Liu wrote: > > This interface allows user to query supported event channel types. If we need > > to disable a specific ABI in the future, we just need to remove the dead code > > and clear corresponding feature bit. > > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > --- > > xen/include/public/event_channel.h | 44 ++++++++++++++++++++++++++++++++++++ > > 1 file changed, 44 insertions(+) > > > > diff --git a/xen/include/public/event_channel.h b/xen/include/public/event_channel.h > > index 07ff321..dff3364 100644 > > --- a/xen/include/public/event_channel.h > > +++ b/xen/include/public/event_channel.h > > @@ -71,6 +71,7 @@ > > #define EVTCHNOP_bind_vcpu 8 > > #define EVTCHNOP_unmask 9 > > #define EVTCHNOP_reset 10 > > +#define EVTCHNOP_register_extended 11 > > /* ` } */ > > > > typedef uint32_t evtchn_port_t; > > @@ -258,6 +259,49 @@ struct evtchn_reset { > > typedef struct evtchn_reset evtchn_reset_t; > > > > /* > > + * EVTCHNOP_register_extended: Register extended event channel > > + * NOTES: > > + * 1. Currently only 3-level is supported. > > + * 2. Should fall back to 2-level if this call fails. > > + */ > > +/* 64 bit guests need 8 pages for evtchn_pending and evtchn_mask for > > + * 256k event channels while 32 bit ones only need 1 page for 32k > > + * event channels. */ > > +#define EVTCHN_MAX_L3_PAGES 8 > > +struct evtchn_register_3level { > > + /* IN parameters. */ > > + uint32_t nr_pages; /* for evtchn_{pending,mask} */ > > + uint32_t nr_vcpus; /* for l2sel_{mfns,offsets} */ > > + XEN_GUEST_HANDLE(xen_pfn_t) evtchn_pending; > > + XEN_GUEST_HANDLE(xen_pfn_t) evtchn_mask; > > + XEN_GUEST_HANDLE(xen_pfn_t) l2sel_mfns; > > + XEN_GUEST_HANDLE(xen_pfn_t) l2sel_offsets; > > +}; > > Registering per-VCPU resources at start-of-day doesn''t seem like the > right thing to do here. Why waste resources for VCPUs that are never > going to be used? And I don''t think we want to constraint how VCPU > hotplug works by requiring resource for all possible VCPUs to be > allocated up-front. >My first thought was that it is important for Dom0 or driver domains to get what they want otherwise the whole system suffers from performance degradation. I''m considering per-cpu registration call now, for the reason of saving resources.> > > + */ > > +#define EVTCHN_EXTENDED_QUERY 0 > > +/* supported extended event channel */ > > +#define EVTCHN_EXTENDED_NONE 0 > > +#define _EVTCHN_EXTENDED_L3 0 > > +#define EVTCHN_EXTENDED_L3 (1U << _EVTCHN_EXTENDED_L3) > > +struct evtchn_register_extended { > > + /* IN parameters. */ > > + uint32_t cmd; > > + union { > > + evtchn_register_3level_t l3; > > + } u; > > +}; > > Adding new members to this union as new event channels ABI are > implemented are going to change its size and potentially the alignment > of the union member. This seems a potential for future ABI > compatibility problems. See also by comment to patch 18. >One way to solve this is to have that union contain pointers to specific structures, but as you suggested, just add sub-op for each new ABI is Ok. Wei.
Wei Liu
2013-Feb-28 19:29 UTC
Re: [RFC PATCH V3 21/22] Only allow extended event channel on Dom0 and driver domains
On Thu, 2013-02-28 at 12:43 +0000, David Vrabel wrote:> On 27/02/13 14:34, Wei Liu wrote: > > For non-Dom0 domains, add a flag to indicate whether it can use extended event > > channel, admins can specify this flag when creating a driver domain. > > > > The rationale behide this option is, a) extended event channel will > > consume global mapping space in Xen, b) likely that only Dom0 and driver > > domains need this. > > I would prefer the toolstack to limit the max port that the guest can > use (see the EVTCHNOP_set_limit sub-op I proposed), than this single bit > (which isn''t useful for the FIFO-based proposal). >I''m not very keen on adding that EVTCHNOP_set_limit now, because: * limiting number of event channels in 2/3-level brings no significant improvement. * you would need to way to notify guest kernel its limit (be another OP or via Xenstore), because in any case a guest uses user space evtchn driver it would need to know the limit. In any case, the purpose of this option is to indicate whether the guest can use *any* of the extended ABIs. A better naming should be evtchn_extended_allowed, which should default to 1 in FIFO-based ABIs. Wei.> David
Ian Jackson
2013-Mar-01 11:55 UTC
Re: [RFC PATCH V3 22/22] libxl: add evtchn_extended flag
Wei Liu writes ("[Xen-devel] [RFC PATCH V3 22/22] libxl: add evtchn_extended flag"):> Admins can add "evtchn_extended = 1" in domain config file to enable extended > event channel ABI for a domain....> diff --git a/tools/libxl/xl_sxp.c b/tools/libxl/xl_sxp.c > index a16a025..6d40e7a 100644 > --- a/tools/libxl/xl_sxp.c > +++ b/tools/libxl/xl_sxp.c > @@ -44,6 +44,8 @@ void printf_info_sexp(int domid, libxl_domain_config *d_config) > printf("\t(hvm %d)\n", c_info->type == LIBXL_DOMAIN_TYPE_HVM); > printf("\t(hap %s)\n", libxl_defbool_to_string(c_info->hap)); > printf("\t(oos %s)\n", libxl_defbool_to_string(c_info->oos)); > + printf("\t(evtchn_extended %s)\n", > + libxl_defbool_to_string(c_info->evtchn_extended));Please don''t add stuff to this. It''s for compatibility only. Thanks, Ian.
Ian Jackson
2013-Mar-01 12:00 UTC
Re: [RFC PATCH V3] Implement 3-level event channel in Xen
Jan Beulich writes ("Re: [Xen-devel] [RFC PATCH V3] Implement 3-level event channel in Xen"):> Or be easier to maintain and/or extend further (which I expect > would both apply to David''s alternative).I think that given that the design of David''s hasn''t been finalised and no code exists, it is clear that David''s version won''t make the deadline for the 4.3 feature freeze. So I think David''s version has missed the boat for 4.3. Since this scalability problem is on our list as critical for 4.3, the conclusion is that we should accept Wei''s approach. Ian.