Also fixed some minor bugs in vallina kernel, see patch 1 to 3. Changes from RFC V2: * Register 3-level event channel only for Dom0 * Dynamically allocate evtchn_pending / evtchn_mask * Avoid redirection with function pointers * Share routines between 2 and 3 level event channels Changes from RFC V1: * New debug interrupt handle, avoid flooding log with useless output * Register 3-level event channels in different locations, please see last changeset
Four things are fixed: a) the test result of globaly masked event; b) make the per-cpu selector L1 to be consistent with description in __xen_evtchn_do_upcall''s comment; c) the test result of L1 selector; d) add KERN_DEBUG in printk. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/events.c | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 7595581..2c94aad 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -1176,7 +1176,7 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) spin_lock_irqsave(&debug_lock, flags); - printk("\nvcpu %d\n ", cpu); + printk(KERN_DEBUG "\nvcpu %d\n ", cpu); for_each_online_cpu(i) { int pending; @@ -1184,56 +1184,56 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) pending = (get_irq_regs() && i == cpu) ? xen_irqs_disabled(get_irq_regs()) : v->evtchn_upcall_mask; - printk("%d: masked=%d pending=%d event_sel %0*lx\n ", i, + printk(KERN_DEBUG "%d: masked=%d pending=%d event_sel %0*lx\n ", i, pending, v->evtchn_upcall_pending, (int)(sizeof(v->evtchn_pending_sel)*2), v->evtchn_pending_sel); } v = per_cpu(xen_vcpu, cpu); - printk("\npending:\n "); + printk(KERN_DEBUG "\npending:\n "); for (i = ARRAY_SIZE(sh->evtchn_pending)-1; i >= 0; i--) - printk("%0*lx%s", (int)sizeof(sh->evtchn_pending[0])*2, + printk(KERN_DEBUG "%0*lx%s", (int)sizeof(sh->evtchn_pending[0])*2, sh->evtchn_pending[i], i % 8 == 0 ? "\n " : " "); - printk("\nglobal mask:\n "); + printk(KERN_DEBUG "\nglobal mask:\n "); for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) - printk("%0*lx%s", + printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(sh->evtchn_mask[0])*2), sh->evtchn_mask[i], i % 8 == 0 ? "\n " : " "); - printk("\nglobally unmasked:\n "); + printk(KERN_DEBUG "\nglobally unmasked:\n "); for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) - printk("%0*lx%s", (int)(sizeof(sh->evtchn_mask[0])*2), + printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(sh->evtchn_mask[0])*2), sh->evtchn_pending[i] & ~sh->evtchn_mask[i], i % 8 == 0 ? "\n " : " "); - printk("\nlocal cpu%d mask:\n ", cpu); + printk(KERN_DEBUG "\nlocal cpu%d mask:\n ", cpu); for (i = (NR_EVENT_CHANNELS/BITS_PER_LONG)-1; i >= 0; i--) - printk("%0*lx%s", (int)(sizeof(cpu_evtchn[0])*2), + printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(cpu_evtchn[0])*2), cpu_evtchn[i], i % 8 == 0 ? "\n " : " "); - printk("\nlocally unmasked:\n "); + printk(KERN_DEBUG "\nlocally unmasked:\n "); for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) { unsigned long pending = sh->evtchn_pending[i] & ~sh->evtchn_mask[i] & cpu_evtchn[i]; - printk("%0*lx%s", (int)(sizeof(sh->evtchn_mask[0])*2), + printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(sh->evtchn_mask[0])*2), pending, i % 8 == 0 ? "\n " : " "); } - printk("\npending list:\n"); + printk(KERN_DEBUG "\npending list:\n"); for (i = 0; i < NR_EVENT_CHANNELS; i++) { if (sync_test_bit(i, sh->evtchn_pending)) { int word_idx = i / BITS_PER_LONG; - printk(" %d: event %d -> irq %d%s%s%s\n", + printk(KERN_DEBUG " %d: event %d -> irq %d%s%s%s\n", cpu_from_evtchn(i), i, evtchn_to_irq[i], - sync_test_bit(word_idx, &v->evtchn_pending_sel) - ? "" : " l2-clear", - !sync_test_bit(i, sh->evtchn_mask) + !sync_test_bit(word_idx, &v->evtchn_pending_sel) + ? "" : " l1-clear", + sync_test_bit(i, sh->evtchn_mask) ? "" : " globally-masked", sync_test_bit(i, cpu_evtchn) ? "" : " locally-masked"); -- 1.7.10.4
Wei Liu
2013-Jan-31 14:46 UTC
[PATCH 02/13] xen: fix error handling path if xen_allocate_irq_dynamic fails
It is possible that the call to xen_allocate_irq_dynamic() returns negative number other than -1. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/events.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 2c94aad..93a3648 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -820,7 +820,7 @@ int bind_evtchn_to_irq(unsigned int evtchn) if (irq == -1) { irq = xen_allocate_irq_dynamic(); - if (irq == -1) + if (irq < 0) goto out; irq_set_chip_and_handler_name(irq, &xen_dynamic_chip, @@ -923,7 +923,7 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu) if (irq == -1) { irq = xen_allocate_irq_dynamic(); - if (irq == -1) + if (irq < 0) goto out; irq_set_chip_and_handler_name(irq, &xen_percpu_chip, -- 1.7.10.4
It is possible the port was allocated but the irq was not. Take care of this case. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/evtchn.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c index b1f60a0..d2bbea1 100644 --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -277,7 +277,17 @@ static void evtchn_unbind_from_user(struct per_user_data *u, int port) { int irq = irq_from_evtchn(port); - unbind_from_irqhandler(irq, (void *)(unsigned long)port); + /* It is possible that the port was allocated but the irq was + * not */ + if (irq >= 0) { + unbind_from_irqhandler(irq, (void *)(unsigned long)port); + } else { + struct evtchn_close close; + close.port = port; + if (port != 0 && /* port 0 is never used */ + HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) + BUG(); + } set_port_user(port, NULL); } -- 1.7.10.4
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- include/xen/interface/event_channel.h | 33 +++++++++++++++++++++++++++++++++ include/xen/interface/xen.h | 9 ++++++++- 2 files changed, 41 insertions(+), 1 deletion(-) diff --git a/include/xen/interface/event_channel.h b/include/xen/interface/event_channel.h index f494292..9d8b9e7 100644 --- a/include/xen/interface/event_channel.h +++ b/include/xen/interface/event_channel.h @@ -190,6 +190,39 @@ struct evtchn_reset { }; typedef struct evtchn_reset evtchn_reset_t; +/* + * EVTCHNOP_register_nlevel: Register N-level event channel + * NOTES: + * 1. Currently only 3-level is supported. + * 2. Should fall back to 2-level if this call fails. + */ +#define EVTCHNOP_register_nlevel 11 +/* 64 bit guests need 8 pages for evtchn_pending and evtchn_mask for + * 256k event channels while 32 bit ones only need 1 page for 32k + * event channels. */ +#define EVTCHN_MAX_L3_PAGES 8 +struct evtchn_register_3level { + /* IN parameters. */ + uint32_t nr_pages; /* for evtchn_{pending,mask} */ + uint32_t nr_vcpus; /* for l2sel_{mfns,mask} */ + GUEST_HANDLE(ulong) evtchn_pending; + GUEST_HANDLE(ulong) evtchn_mask; + GUEST_HANDLE(ulong) l2sel_mfns; + GUEST_HANDLE(ulong) l2sel_offsets; +}; +typedef struct evtchn_register_3level evtchn_register_3level_t; +DEFINE_GUEST_HANDLE(evtchn_register_3level_t); + +struct evtchn_register_nlevel { + /* IN parameters. */ + uint32_t level; + union { + evtchn_register_3level_t l3; + } u; +}; +typedef struct evtchn_register_nlevel evtchn_register_nlevel_t; +DEFINE_GUEST_HANDLE(evtchn_register_nlevel_t); + struct evtchn_op { uint32_t cmd; /* EVTCHNOP_* */ union { diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h index a890804..5220e33 100644 --- a/include/xen/interface/xen.h +++ b/include/xen/interface/xen.h @@ -283,9 +283,16 @@ DEFINE_GUEST_HANDLE_STRUCT(multicall_entry); /* * Event channel endpoints per domain: + * 2-level: * 1024 if a long is 32 bits; 4096 if a long is 64 bits. + * 3-level: + * 32k if a long is 32 bits; 256k if a long is 64 bits. */ -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * sizeof(unsigned long) * 64) +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * sizeof(unsigned long) * 64) +#define NR_EVENT_CHANNELS_L3 (NR_EVENT_CHANNELS_L2 * 64) +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) +#define NR_EVENT_CHANNELS NR_EVENT_CHANNELS_L2 /* for compatibility */ +#endif struct vcpu_time_info { /* -- 1.7.10.4
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/events.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 93a3648..0fae3f9 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -345,6 +345,12 @@ static inline int test_evtchn(int port) return sync_test_bit(port, &s->evtchn_pending[0]); } +static inline int test_and_set_mask(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + return sync_test_and_set_bit(port, &s->evtchn_mask[0]); +} + /** * notify_remote_via_irq - send event to remote end of event channel via irq @@ -1464,7 +1470,7 @@ int resend_irq_on_evtchn(unsigned int irq) if (!VALID_EVTCHN(evtchn)) return 1; - masked = sync_test_and_set_bit(evtchn, s->evtchn_mask); + masked = test_and_set_mask(evtchn); sync_set_bit(evtchn, s->evtchn_pending); if (!masked) unmask_evtchn(evtchn); @@ -1513,7 +1519,7 @@ static int retrigger_dynirq(struct irq_data *data) if (VALID_EVTCHN(evtchn)) { int masked; - masked = sync_test_and_set_bit(evtchn, sh->evtchn_mask); + masked = test_and_set_mask(evtchn); sync_set_bit(evtchn, sh->evtchn_pending); if (!masked) unmask_evtchn(evtchn); -- 1.7.10.4
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/events.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 0fae3f9..0679d27 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -1471,7 +1471,7 @@ int resend_irq_on_evtchn(unsigned int irq) return 1; masked = test_and_set_mask(evtchn); - sync_set_bit(evtchn, s->evtchn_pending); + set_evtchn(evtchn); if (!masked) unmask_evtchn(evtchn); @@ -1520,7 +1520,7 @@ static int retrigger_dynirq(struct irq_data *data) int masked; masked = test_and_set_mask(evtchn); - sync_set_bit(evtchn, sh->evtchn_pending); + set_evtchn(evtchn); if (!masked) unmask_evtchn(evtchn); ret = 1; -- 1.7.10.4
Use global pointers in common operations to allow for better code sharing between N-level event channel. Functions which are not suitable for sharing are also taken care of. Also update drivers/xen/evtchn.c to use exported variable instead of macro. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/events.c | 180 ++++++++++++++++++++++++++++++++------------------ drivers/xen/evtchn.c | 12 ++-- include/xen/events.h | 2 + 3 files changed, 123 insertions(+), 71 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 0679d27..4820a52 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -51,6 +51,16 @@ #include <xen/interface/hvm/hvm_op.h> #include <xen/interface/hvm/params.h> +/* N-level event channel, starting from 2 */ +unsigned int evtchn_level = 2; +EXPORT_SYMBOL_GPL(evtchn_level); +unsigned int nr_event_channels; +EXPORT_SYMBOL_GPL(nr_event_channels); + +/* The following pointers point to pending bitmap and mask bitmap. */ +static unsigned long *evtchn_pending; +static unsigned long *evtchn_mask; + /* * This lock protects updates to the following mapping and reference-count * arrays. The lock does not need to be acquired to read the mapping tables. @@ -113,7 +123,7 @@ static int *evtchn_to_irq; static unsigned long *pirq_eoi_map; static bool (*pirq_needs_eoi)(unsigned irq); -static DEFINE_PER_CPU(unsigned long [NR_EVENT_CHANNELS/BITS_PER_LONG], +static DEFINE_PER_CPU(unsigned long [NR_EVENT_CHANNELS_L2/BITS_PER_LONG], cpu_evtchn_mask); /* Xen will never allocate port zero for any purpose. */ @@ -286,12 +296,11 @@ static bool pirq_needs_eoi_flag(unsigned irq) } static inline unsigned long active_evtchns(unsigned int cpu, - struct shared_info *sh, unsigned int idx) { - return sh->evtchn_pending[idx] & + return evtchn_pending[idx] & per_cpu(cpu_evtchn_mask, cpu)[idx] & - ~sh->evtchn_mask[idx]; + ~evtchn_mask[idx]; } static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) @@ -329,26 +338,22 @@ static void init_evtchn_cpu_bindings(void) static inline void clear_evtchn(int port) { - struct shared_info *s = HYPERVISOR_shared_info; - sync_clear_bit(port, &s->evtchn_pending[0]); + sync_clear_bit(port, &evtchn_pending[0]); } static inline void set_evtchn(int port) { - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, &s->evtchn_pending[0]); + sync_set_bit(port, &evtchn_pending[0]); } static inline int test_evtchn(int port) { - struct shared_info *s = HYPERVISOR_shared_info; - return sync_test_bit(port, &s->evtchn_pending[0]); + return sync_test_bit(port, &evtchn_pending[0]); } static inline int test_and_set_mask(int port) { - struct shared_info *s = HYPERVISOR_shared_info; - return sync_test_and_set_bit(port, &s->evtchn_mask[0]); + return sync_test_and_set_bit(port, &evtchn_mask[0]); } @@ -371,13 +376,28 @@ EXPORT_SYMBOL_GPL(notify_remote_via_irq); static void mask_evtchn(int port) { - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, &s->evtchn_mask[0]); + sync_set_bit(port, &evtchn_mask[0]); +} + +static inline void __unmask_local_port_l2(int port) +{ + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); + + sync_clear_bit(port, &evtchn_mask[0]); + + /* + * The following is basically the equivalent of + * ''hw_resend_irq''. Just like a real IO-APIC we ''lose + * the interrupt edge'' if the channel is masked. + */ + if (sync_test_bit(port, &evtchn_pending[0]) && + !sync_test_and_set_bit(port / BITS_PER_LONG, + &vcpu_info->evtchn_pending_sel)) + vcpu_info->evtchn_upcall_pending = 1; } static void unmask_evtchn(int port) { - struct shared_info *s = HYPERVISOR_shared_info; unsigned int cpu = get_cpu(); BUG_ON(!irqs_disabled()); @@ -387,19 +407,13 @@ static void unmask_evtchn(int port) struct evtchn_unmask unmask = { .port = port }; (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); } else { - struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); - - sync_clear_bit(port, &s->evtchn_mask[0]); - - /* - * The following is basically the equivalent of - * ''hw_resend_irq''. Just like a real IO-APIC we ''lose - * the interrupt edge'' if the channel is masked. - */ - if (sync_test_bit(port, &s->evtchn_pending[0]) && - !sync_test_and_set_bit(port / BITS_PER_LONG, - &vcpu_info->evtchn_pending_sel)) - vcpu_info->evtchn_upcall_pending = 1; + switch (evtchn_level) { + case 2: + __unmask_local_port_l2(port); + break; + default: + BUG(); + } } put_cpu(); @@ -902,7 +916,7 @@ static int find_virq(unsigned int virq, unsigned int cpu) int port, rc = -ENOENT; memset(&status, 0, sizeof(status)); - for (port = 0; port <= NR_EVENT_CHANNELS; port++) { + for (port = 0; port <= nr_event_channels; port++) { status.dom = DOMID_SELF; status.port = port; rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status); @@ -1127,7 +1141,7 @@ int evtchn_get(unsigned int evtchn) struct irq_info *info; int err = -ENOENT; - if (evtchn >= NR_EVENT_CHANNELS) + if (evtchn >= nr_event_channels) return -EINVAL; mutex_lock(&irq_mapping_update_lock); @@ -1170,15 +1184,16 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) notify_remote_via_irq(irq); } +static irqreturn_t xen_debug_interrupt_l2(int irq, void *dev_id); + irqreturn_t xen_debug_interrupt(int irq, void *dev_id) { - struct shared_info *sh = HYPERVISOR_shared_info; - int cpu = smp_processor_id(); - unsigned long *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); - int i; - unsigned long flags; + irqreturn_t rc; static DEFINE_SPINLOCK(debug_lock); + unsigned long flags; + int cpu = smp_processor_id(); struct vcpu_info *v; + int i; spin_lock_irqsave(&debug_lock, flags); @@ -1195,24 +1210,45 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) (int)(sizeof(v->evtchn_pending_sel)*2), v->evtchn_pending_sel); } + + switch (evtchn_level) { + case 2: + rc = xen_debug_interrupt_l2(irq, dev_id); + break; + default: + BUG(); + } + + spin_unlock_irqrestore(&debug_lock, flags); + return rc; +} + +static irqreturn_t xen_debug_interrupt_l2(int irq, void *dev_id) +{ + int cpu = smp_processor_id(); + unsigned long *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); + int i; + unsigned long nr_elems = NR_EVENT_CHANNELS_L2 / BITS_PER_LONG; + struct vcpu_info *v; + v = per_cpu(xen_vcpu, cpu); printk(KERN_DEBUG "\npending:\n "); - for (i = ARRAY_SIZE(sh->evtchn_pending)-1; i >= 0; i--) - printk(KERN_DEBUG "%0*lx%s", (int)sizeof(sh->evtchn_pending[0])*2, - sh->evtchn_pending[i], + for (i = nr_elems; i >= 0; i--) + printk(KERN_DEBUG "%0*lx%s", (int)sizeof(evtchn_pending[0])*2, + evtchn_pending[i], i % 8 == 0 ? "\n " : " "); printk(KERN_DEBUG "\nglobal mask:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) + for (i = nr_elems-1; i >= 0; i--) printk(KERN_DEBUG "%0*lx%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - sh->evtchn_mask[i], + (int)(sizeof(evtchn_mask[0])*2), + evtchn_mask[i], i % 8 == 0 ? "\n " : " "); printk(KERN_DEBUG "\nglobally unmasked:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) - printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(sh->evtchn_mask[0])*2), - sh->evtchn_pending[i] & ~sh->evtchn_mask[i], + for (i = nr_elems-1; i >= 0; i--) + printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(evtchn_mask[0])*2), + evtchn_pending[i] & ~evtchn_mask[i], i % 8 == 0 ? "\n " : " "); printk(KERN_DEBUG "\nlocal cpu%d mask:\n ", cpu); @@ -1222,32 +1258,30 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) i % 8 == 0 ? "\n " : " "); printk(KERN_DEBUG "\nlocally unmasked:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) { - unsigned long pending = sh->evtchn_pending[i] - & ~sh->evtchn_mask[i] + for (i = nr_elems-1; i >= 0; i--) { + unsigned long pending = evtchn_pending[i] + & ~evtchn_mask[i] & cpu_evtchn[i]; - printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(sh->evtchn_mask[0])*2), + printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(evtchn_mask[0])*2), pending, i % 8 == 0 ? "\n " : " "); } printk(KERN_DEBUG "\npending list:\n"); - for (i = 0; i < NR_EVENT_CHANNELS; i++) { - if (sync_test_bit(i, sh->evtchn_pending)) { + for (i = 0; i < NR_EVENT_CHANNELS_L2; i++) { + if (sync_test_bit(i, evtchn_pending)) { int word_idx = i / BITS_PER_LONG; printk(KERN_DEBUG " %d: event %d -> irq %d%s%s%s\n", cpu_from_evtchn(i), i, evtchn_to_irq[i], !sync_test_bit(word_idx, &v->evtchn_pending_sel) ? "" : " l1-clear", - sync_test_bit(i, sh->evtchn_mask) + sync_test_bit(i, evtchn_mask) ? "" : " globally-masked", sync_test_bit(i, cpu_evtchn) ? "" : " locally-masked"); } } - spin_unlock_irqrestore(&debug_lock, flags); - return IRQ_HANDLED; } @@ -1269,13 +1303,12 @@ static DEFINE_PER_CPU(unsigned int, current_bit_idx); * a bitset of words which contain pending event bits. The second * level is a bitset of pending events themselves. */ -static void __xen_evtchn_do_upcall(void) +static void __xen_evtchn_do_upcall_l2(void) { int start_word_idx, start_bit_idx; int word_idx, bit_idx; int i; int cpu = get_cpu(); - struct shared_info *s = HYPERVISOR_shared_info; struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); unsigned count; @@ -1314,7 +1347,7 @@ static void __xen_evtchn_do_upcall(void) } word_idx = __ffs(words); - pending_bits = active_evtchns(cpu, s, word_idx); + pending_bits = active_evtchns(cpu, word_idx); bit_idx = 0; /* usually scan entire word from start */ if (word_idx == start_word_idx) { /* We scan the starting word in two parts */ @@ -1383,7 +1416,13 @@ void xen_evtchn_do_upcall(struct pt_regs *regs) exit_idle(); irq_enter(); - __xen_evtchn_do_upcall(); + switch (evtchn_level) { + case 2: + __xen_evtchn_do_upcall_l2(); + break; + default: + BUG(); + } irq_exit(); set_irq_regs(old_regs); @@ -1391,7 +1430,13 @@ void xen_evtchn_do_upcall(struct pt_regs *regs) void xen_hvm_evtchn_do_upcall(void) { - __xen_evtchn_do_upcall(); + switch (evtchn_level) { + case 2: + __xen_evtchn_do_upcall_l2(); + break; + default: + BUG(); + } } EXPORT_SYMBOL_GPL(xen_hvm_evtchn_do_upcall); @@ -1465,7 +1510,6 @@ static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, int resend_irq_on_evtchn(unsigned int irq) { int masked, evtchn = evtchn_from_irq(irq); - struct shared_info *s = HYPERVISOR_shared_info; if (!VALID_EVTCHN(evtchn)) return 1; @@ -1513,7 +1557,6 @@ static void mask_ack_dynirq(struct irq_data *data) static int retrigger_dynirq(struct irq_data *data) { int evtchn = evtchn_from_irq(data->irq); - struct shared_info *sh = HYPERVISOR_shared_info; int ret = 0; if (VALID_EVTCHN(evtchn)) { @@ -1689,14 +1732,14 @@ void xen_irq_resume(void) init_evtchn_cpu_bindings(); /* New event-channel space is not ''live'' yet. */ - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) + for (evtchn = 0; evtchn < nr_event_channels; evtchn++) mask_evtchn(evtchn); /* No IRQ <-> event-channel mappings. */ list_for_each_entry(info, &xen_irq_list_head, list) info->evtchn = 0; /* zap event-channel binding */ - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) + for (evtchn = 0; evtchn < nr_event_channels; evtchn++) evtchn_to_irq[evtchn] = -1; for_each_possible_cpu(cpu) { @@ -1792,17 +1835,24 @@ void xen_callback_vector(void) {} void __init xen_init_IRQ(void) { int i, rc; + struct shared_info *s = HYPERVISOR_shared_info; + + evtchn_pending = s->evtchn_pending; + evtchn_mask = s->evtchn_mask; + + evtchn_level = 2; + nr_event_channels = NR_EVENT_CHANNELS_L2; - evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), + evtchn_to_irq = kcalloc(nr_event_channels, sizeof(*evtchn_to_irq), GFP_KERNEL); BUG_ON(!evtchn_to_irq); - for (i = 0; i < NR_EVENT_CHANNELS; i++) + for (i = 0; i < nr_event_channels; i++) evtchn_to_irq[i] = -1; init_evtchn_cpu_bindings(); /* No event channels are ''live'' right now. */ - for (i = 0; i < NR_EVENT_CHANNELS; i++) + for (i = 0; i < nr_event_channels; i++) mask_evtchn(i); pirq_needs_eoi = pirq_needs_eoi_flag; diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c index d2bbea1..7515ecc 100644 --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -232,7 +232,7 @@ static ssize_t evtchn_write(struct file *file, const char __user *buf, for (i = 0; i < (count/sizeof(evtchn_port_t)); i++) { unsigned port = kbuf[i]; - if (port < NR_EVENT_CHANNELS && + if (port < nr_event_channels && get_port_user(port) == u && !get_port_enabled(port)) { set_port_enabled(port, true); @@ -374,7 +374,7 @@ static long evtchn_ioctl(struct file *file, break; rc = -EINVAL; - if (unbind.port >= NR_EVENT_CHANNELS) + if (unbind.port >= nr_event_channels) break; spin_lock_irq(&port_user_lock); @@ -402,7 +402,7 @@ static long evtchn_ioctl(struct file *file, if (copy_from_user(¬ify, uarg, sizeof(notify))) break; - if (notify.port >= NR_EVENT_CHANNELS) { + if (notify.port >= nr_event_channels) { rc = -EINVAL; } else if (get_port_user(notify.port) != u) { rc = -ENOTCONN; @@ -492,7 +492,7 @@ static int evtchn_release(struct inode *inode, struct file *filp) free_page((unsigned long)u->ring); - for (i = 0; i < NR_EVENT_CHANNELS; i++) { + for (i = 0; i < nr_event_channels; i++) { if (get_port_user(i) != u) continue; @@ -501,7 +501,7 @@ static int evtchn_release(struct inode *inode, struct file *filp) spin_unlock_irq(&port_user_lock); - for (i = 0; i < NR_EVENT_CHANNELS; i++) { + for (i = 0; i < nr_event_channels; i++) { if (get_port_user(i) != u) continue; @@ -538,7 +538,7 @@ static int __init evtchn_init(void) if (!xen_domain()) return -ENODEV; - port_user = kcalloc(NR_EVENT_CHANNELS, sizeof(*port_user), GFP_KERNEL); + port_user = kcalloc(nr_event_channels, sizeof(*port_user), GFP_KERNEL); if (port_user == NULL) return -ENOMEM; diff --git a/include/xen/events.h b/include/xen/events.h index 04399b2..6b117ac 100644 --- a/include/xen/events.h +++ b/include/xen/events.h @@ -109,4 +109,6 @@ int xen_irq_from_gsi(unsigned gsi); /* Determine whether to ignore this IRQ if it is passed to a guest. */ int xen_test_irq_shared(int irq); +extern unsigned int nr_event_channels; + #endif /* _XEN_EVENTS_H */ -- 1.7.10.4
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/events.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 4820a52..30ca620 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -123,8 +123,7 @@ static int *evtchn_to_irq; static unsigned long *pirq_eoi_map; static bool (*pirq_needs_eoi)(unsigned irq); -static DEFINE_PER_CPU(unsigned long [NR_EVENT_CHANNELS_L2/BITS_PER_LONG], - cpu_evtchn_mask); +static DEFINE_PER_CPU(unsigned long *, cpu_evtchn_mask); /* Xen will never allocate port zero for any purpose. */ #define VALID_EVTCHN(chn) ((chn) != 0) @@ -333,7 +332,8 @@ static void init_evtchn_cpu_bindings(void) for_each_possible_cpu(i) memset(per_cpu(cpu_evtchn_mask, i), - (i == 0) ? ~0 : 0, sizeof(*per_cpu(cpu_evtchn_mask, i))); + (i == 0) ? ~0 : 0, + sizeof(unsigned long)*(nr_event_channels/BITS_PER_LONG)); } static inline void clear_evtchn(int port) @@ -1835,6 +1835,7 @@ void xen_callback_vector(void) {} void __init xen_init_IRQ(void) { int i, rc; + int cpu; struct shared_info *s = HYPERVISOR_shared_info; evtchn_pending = s->evtchn_pending; @@ -1849,6 +1850,20 @@ void __init xen_init_IRQ(void) for (i = 0; i < nr_event_channels; i++) evtchn_to_irq[i] = -1; + for_each_possible_cpu(cpu) { + void *p; + unsigned int nr = nr_event_channels / BITS_PER_LONG; + + p = kzalloc_node(sizeof(unsigned long) * nr, + GFP_KERNEL, + cpu_to_node(cpu)); + if (!p) + p = kzalloc(sizeof(unsigned long) * nr, + GFP_KERNEL); + BUG_ON(!p); + per_cpu(cpu_evtchn_mask, cpu) = p; + } + init_evtchn_cpu_bindings(); /* No event channels are ''live'' right now. */ -- 1.7.10.4
Only do_upcall, debug_interrupt and unmask_evtchn are required. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/events.c | 291 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 291 insertions(+) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 30ca620..d953e81 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -51,6 +51,9 @@ #include <xen/interface/hvm/hvm_op.h> #include <xen/interface/hvm/params.h> +/* Helper macro(s) */ +#define LONG_BITORDER (BITS_PER_LONG == 64 ? 6 : 5) + /* N-level event channel, starting from 2 */ unsigned int evtchn_level = 2; EXPORT_SYMBOL_GPL(evtchn_level); @@ -61,6 +64,9 @@ EXPORT_SYMBOL_GPL(nr_event_channels); static unsigned long *evtchn_pending; static unsigned long *evtchn_mask; +/* 2nd level selector for 3-level event channel */ +static DEFINE_PER_CPU(unsigned long[sizeof(unsigned long) * 8], evtchn_sel_l2); + /* * This lock protects updates to the following mapping and reference-count * arrays. The lock does not need to be acquired to read the mapping tables. @@ -396,6 +402,28 @@ static inline void __unmask_local_port_l2(int port) vcpu_info->evtchn_upcall_pending = 1; } +static inline void __unmask_local_port_l3(int port) +{ + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); + int cpu = smp_processor_id(); + unsigned int l1bit = port >> (LONG_BITORDER << 1); + unsigned int l2bit = port >> LONG_BITORDER; + + sync_clear_bit(port, &evtchn_mask[0]); + + /* + * The following is basically the equivalent of + * ''hw_resend_irq''. Just like a real IO-APIC we ''lose + * the interrupt edge'' if the channel is masked. + */ + if (sync_test_bit(port, &evtchn_pending[0]) && + !sync_test_and_set_bit(l2bit, + &per_cpu(evtchn_sel_l2, cpu)[0]) && + !sync_test_and_set_bit(l1bit, + &vcpu_info->evtchn_pending_sel)) + vcpu_info->evtchn_upcall_pending = 1; +} + static void unmask_evtchn(int port) { unsigned int cpu = get_cpu(); @@ -411,6 +439,9 @@ static void unmask_evtchn(int port) case 2: __unmask_local_port_l2(port); break; + case 3: + __unmask_local_port_l3(port); + break; default: BUG(); } @@ -1185,6 +1216,7 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) } static irqreturn_t xen_debug_interrupt_l2(int irq, void *dev_id); +static irqreturn_t xen_debug_interrupt_l3(int irq, void *dev_id); irqreturn_t xen_debug_interrupt(int irq, void *dev_id) { @@ -1215,6 +1247,9 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) case 2: rc = xen_debug_interrupt_l2(irq, dev_id); break; + case 3: + rc = xen_debug_interrupt_l3(irq, dev_id); + break; default: BUG(); } @@ -1285,8 +1320,109 @@ static irqreturn_t xen_debug_interrupt_l2(int irq, void *dev_id) return IRQ_HANDLED; } +static irqreturn_t xen_debug_interrupt_l3(int irq, void *dev_id) +{ + int cpu = smp_processor_id(); + unsigned long *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); + unsigned long nr_elems = NR_EVENT_CHANNELS_L3 / BITS_PER_LONG; + int i; + struct vcpu_info *v; + + v = per_cpu(xen_vcpu, cpu); + + printk(KERN_DEBUG "\npending (only show words which have bits set to 1):\n "); + for (i = nr_elems-1; i >= 0; i--) + if (evtchn_pending[i] != 0UL) { + printk(KERN_DEBUG " word index %d %0*lx\n", + i, + (int)sizeof(evtchn_pending[0])*2, + evtchn_pending[i]); + } + + printk(KERN_DEBUG "\nglobal mask (only show words which have bits set to 0):\n "); + for (i = nr_elems-1; i >= 0; i--) + if (evtchn_mask[i] != ~0UL) { + printk(KERN_DEBUG " word index %d %0*lx\n", + i, + (int)sizeof(evtchn_mask[0])*2, + evtchn_mask[i]); + } + + printk(KERN_DEBUG "\nglobally unmasked (only show result words which have bits set to 1):\n "); + for (i = nr_elems-1; i >= 0; i--) + if ((evtchn_pending[i] & ~evtchn_mask[i]) != 0UL) { + printk(KERN_DEBUG " word index %d %0*lx\n", + i, + (int)(sizeof(evtchn_mask[0])*2), + evtchn_pending[i] & ~evtchn_mask[i]); + } + + printk(KERN_DEBUG "\nlocal cpu%d mask (only show words which have bits set to 1):\n ", cpu); + for (i = (NR_EVENT_CHANNELS_L3/BITS_PER_LONG)-1; i >= 0; i--) + if (cpu_evtchn[i] != 0UL) { + printk(KERN_DEBUG " word index %d %0*lx\n", + i, + (int)(sizeof(cpu_evtchn[0])*2), + cpu_evtchn[i]); + } + + printk(KERN_DEBUG "\nlocally unmasked (only show result words which have bits set to 1):\n "); + for (i = nr_elems-1; i >= 0; i--) { + unsigned long pending = evtchn_pending[i] + & ~evtchn_mask[i] + & cpu_evtchn[i]; + if (pending != 0UL) { + printk(KERN_DEBUG " word index %d %0*lx\n", + i, + (int)(sizeof(evtchn_mask[0])*2), + pending); + } + } + + printk(KERN_DEBUG "\npending list:\n"); + for (i = 0; i < NR_EVENT_CHANNELS_L3; i++) { + if (sync_test_bit(i, evtchn_pending)) { + int word_idx = i / (BITS_PER_LONG * BITS_PER_LONG); + int word_idx_l2 = i / BITS_PER_LONG; + printk(KERN_DEBUG " %d: event %d -> irq %d%s%s%s%s\n", + cpu_from_evtchn(i), i, + evtchn_to_irq[i], + !sync_test_bit(word_idx, &v->evtchn_pending_sel) + ? "" : " l1-clear", + !sync_test_bit(word_idx_l2, per_cpu(evtchn_sel_l2, cpu)) + ? "" : " l2-clear", + sync_test_bit(i, evtchn_mask) + ? "" : " globally-masked", + sync_test_bit(i, cpu_evtchn) + ? "" : " locally-masked"); + } + } + + return IRQ_HANDLED; +} + +/* The following per-cpu variables are used to save current state of event + * processing loop. + * + * 2-level event channel: + * current_word_idx is the bit index in L1 selector indicating the currently + * processing word in shared bitmap. + * current_bit_idx is the bit index in the currently processing word in shared + * bitmap. + * N.B. current_word_idx_l2 is not used. + * + * 3-level event channel: + * current_word_idx is the bit index in L1 selector indicating the currently + * processing word in L2 selector. + * current_word_idx_l2 is the bit index in L2 selector word indicating the + * currently processing word in shared bitmap. + * current_bit_idx is the bit index in the currently processing word in shared + * bitmap. + * + */ static DEFINE_PER_CPU(unsigned, xed_nesting_count); static DEFINE_PER_CPU(unsigned int, current_word_idx); +static DEFINE_PER_CPU(unsigned int, current_word_idx_l2); static DEFINE_PER_CPU(unsigned int, current_bit_idx); /* @@ -1409,6 +1545,155 @@ out: put_cpu(); } +/* + * In the 3-level event channel implementation, the first level is a + * bitset of words which contain pending bits in the second level. + * The second level is another bitsets which contain pending bits in + * the third level. The third level is a bit set of pending events + * themselves. + */ +static void __xen_evtchn_do_upcall_l3(void) +{ + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); + unsigned count; + int start_word_idx_l1, start_word_idx_l2, start_bit_idx; + int word_idx_l1, word_idx_l2, bit_idx; + int i, j; + int cpu = get_cpu(); + + do { + unsigned long pending_words_l1; + + vcpu_info->evtchn_upcall_pending = 0; + + if (__this_cpu_inc_return(xed_nesting_count) - 1) + goto out; +#ifndef CONFIG_X86 + /* No need for a barrier -- XCHG is a barrier on x86. */ + /* Clear master flag /before/ clearing selector flag. */ + wmb(); +#endif + /* here we get l1 pending selector */ + pending_words_l1 = xchg(&vcpu_info->evtchn_pending_sel, 0); + + start_word_idx_l1 = __this_cpu_read(current_word_idx); + start_word_idx_l2 = __this_cpu_read(current_word_idx_l2); + start_bit_idx = __this_cpu_read(current_bit_idx); + + word_idx_l1 = start_word_idx_l1; + + /* loop through l1, try to pick up l2 */ + for (i = 0; pending_words_l1 != 0; i++) { + unsigned long words_l1; + unsigned long pending_words_l2; + + words_l1 = MASK_LSBS(pending_words_l1, word_idx_l1); + + if (words_l1 == 0) { + word_idx_l1 = 0; + start_word_idx_l2 = 0; + continue; + } + + word_idx_l1 = __ffs(words_l1); + + pending_words_l2 + xchg(&per_cpu(evtchn_sel_l2, cpu)[word_idx_l1], + 0); + + word_idx_l2 = 0; + if (word_idx_l1 == start_word_idx_l1) { + if (i == 0) + word_idx_l2 = start_word_idx_l2; + else + word_idx_l2 &= (1UL << start_word_idx_l2) - 1; + } + + for (j = 0; pending_words_l2 != 0; j++) { + unsigned long pending_bits; + unsigned long words_l2; + unsigned long idx; + + words_l2 = MASK_LSBS(pending_words_l2, + word_idx_l2); + + if (words_l2 == 0) { + word_idx_l2 = 0; + bit_idx = 0; + continue; + } + + word_idx_l2 = __ffs(words_l2); + + idx = word_idx_l1*BITS_PER_LONG+word_idx_l2; + pending_bits + active_evtchns(cpu, idx); + + bit_idx = 0; + if (word_idx_l2 == start_word_idx_l2) { + if (j == 0) + bit_idx = start_bit_idx; + else + bit_idx &= (1UL<<start_bit_idx)-1; + } + + /* process port */ + do { + unsigned long bits; + int port, irq; + struct irq_desc *desc; + + bits = MASK_LSBS(pending_bits, bit_idx); + + if (bits == 0) + break; + + bit_idx = __ffs(bits); + + port = (word_idx_l1 << (LONG_BITORDER << 1)) + + (word_idx_l2 << LONG_BITORDER) + + bit_idx; + + irq = evtchn_to_irq[port]; + + if (irq != -1) { + desc = irq_to_desc(irq); + if (desc) + generic_handle_irq_desc(irq, desc); + } + + bit_idx = (bit_idx + 1) % BITS_PER_LONG; + + __this_cpu_write(current_bit_idx, bit_idx); + __this_cpu_write(current_word_idx_l2, + bit_idx ? word_idx_l2 : + (word_idx_l2+1) % BITS_PER_LONG); + __this_cpu_write(current_word_idx_l2, + word_idx_l2 ? word_idx_l1 : + (word_idx_l1+1) % BITS_PER_LONG); + } while (bit_idx != 0); + + if ((word_idx_l2 != start_word_idx_l2) || (j != 0)) + pending_words_l2 &= ~(1UL << word_idx_l2); + + word_idx_l2 = (word_idx_l2 + 1) % BITS_PER_LONG; + } + + if ((word_idx_l1 != start_word_idx_l1) || (i != 0)) + pending_words_l1 &= ~(1UL << word_idx_l1); + + word_idx_l1 = (word_idx_l1 + 1) % BITS_PER_LONG; + } + + BUG_ON(!irqs_disabled()); + count = __this_cpu_read(xed_nesting_count); + __this_cpu_write(xed_nesting_count, 0); + } while (count != 1 || vcpu_info->evtchn_upcall_pending); + +out: + put_cpu(); +} + void xen_evtchn_do_upcall(struct pt_regs *regs) { struct pt_regs *old_regs = set_irq_regs(regs); @@ -1420,6 +1705,9 @@ void xen_evtchn_do_upcall(struct pt_regs *regs) case 2: __xen_evtchn_do_upcall_l2(); break; + case 3: + __xen_evtchn_do_upcall_l3(); + break; default: BUG(); } @@ -1434,6 +1722,9 @@ void xen_hvm_evtchn_do_upcall(void) case 2: __xen_evtchn_do_upcall_l2(); break; + case 3: + __xen_evtchn_do_upcall_l3(); + break; default: BUG(); } -- 1.7.10.4
Wei Liu
2013-Jan-31 14:47 UTC
[PATCH 10/13] xen: introduce xen_event_channel_register_3level
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/events.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 94 insertions(+) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index d953e81..9038211 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -53,6 +53,9 @@ /* Helper macro(s) */ #define LONG_BITORDER (BITS_PER_LONG == 64 ? 6 : 5) +/* event bitmap size: 1 page for 32 bit and 8 pages for 64 bit */ +#define BITMAP_PG_ORDER (BITS_PER_LONG == 64 ? 3 : 1) +#define BITMAP_NR_PAGES (BITMAP_PG_ORDER == 3 ? 8 : 1) /* N-level event channel, starting from 2 */ unsigned int evtchn_level = 2; @@ -2123,6 +2126,97 @@ void xen_callback_vector(void) void xen_callback_vector(void) {} #endif +static int xen_event_channel_register_3level(void) +{ + evtchn_register_nlevel_t reg; + int i, cpu; + unsigned long *_evtchn_pending = NULL; + unsigned long *_evtchn_mask = NULL; + unsigned long *l2sel_mfns = NULL; + unsigned long *l2sel_offsets = NULL; + int rc; + + /* If we come from restore path, we don''t need to allocate + * pages. + */ + if (!evtchn_pending && !evtchn_mask) { + evtchn_pending + (unsigned long *)__get_free_pages(GFP_KERNEL, + BITMAP_PG_ORDER); + evtchn_mask + (unsigned long *)__get_free_pages(GFP_KERNEL, + BITMAP_PG_ORDER); + if (!evtchn_pending || !evtchn_mask) { + free_pages((unsigned long)evtchn_pending, BITMAP_NR_PAGES); + free_pages((unsigned long)evtchn_mask, BITMAP_NR_PAGES); + evtchn_pending = NULL; + evtchn_mask = NULL; + rc = -ENOMEM; + goto err; + } + } + + rc = -ENOMEM; /* Common error code for following operations */ +#define __ALLOC_ARRAY(_ptr, _nr) \ + do { \ + (_ptr) = kzalloc(sizeof(unsigned long) * (_nr), \ + GFP_KERNEL); \ + if (!(_ptr)) \ + goto out; \ + } while (0) + + __ALLOC_ARRAY(_evtchn_pending, BITMAP_NR_PAGES); + __ALLOC_ARRAY(_evtchn_mask, BITMAP_NR_PAGES); + __ALLOC_ARRAY(l2sel_mfns, nr_cpu_ids); + __ALLOC_ARRAY(l2sel_offsets, nr_cpu_ids); +#undef __ALLOC_ARRAY + + memset(®, 0, sizeof(reg)); + + for (i = 0; i < BITMAP_NR_PAGES; i++) { + unsigned long offset = PAGE_SIZE * i; + _evtchn_pending[i] + arbitrary_virt_to_mfn( + (void *)((unsigned long)evtchn_pending+offset)); + _evtchn_mask[i] + arbitrary_virt_to_mfn( + (void *)((unsigned long)evtchn_mask+offset)); + } + + for_each_possible_cpu(cpu) { + l2sel_mfns[cpu] + arbitrary_virt_to_mfn(&per_cpu(evtchn_sel_l2, cpu)); + l2sel_offsets[cpu] + offset_in_page(&per_cpu(evtchn_sel_l2, cpu)); + } + + reg.u.l3.nr_pages = BITMAP_NR_PAGES; + reg.u.l3.evtchn_pending = _evtchn_pending; + reg.u.l3.evtchn_mask = _evtchn_mask; + + reg.u.l3.nr_vcpus = nr_cpu_ids; + reg.u.l3.l2sel_mfns = l2sel_mfns; + reg.u.l3.l2sel_offsets = l2sel_offsets; + + reg.level = 3; + + rc = HYPERVISOR_event_channel_op(EVTCHNOP_register_nlevel, ®); + if (rc) { + free_pages((unsigned long)evtchn_pending, BITMAP_NR_PAGES); + free_pages((unsigned long)evtchn_mask, BITMAP_NR_PAGES); + evtchn_pending = NULL; + evtchn_mask = NULL; + } + +out: + kfree(_evtchn_pending); + kfree(_evtchn_mask); + kfree(l2sel_mfns); + kfree(l2sel_offsets); +err: + return rc; +} + void __init xen_init_IRQ(void) { int i, rc; -- 1.7.10.4
Wei Liu
2013-Jan-31 14:47 UTC
[PATCH 11/13] xen: introduce xen_event_channel_register_nlevel
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- drivers/xen/events.c | 46 ++++++++++++++++++++++++++++++++++++++++------ include/xen/events.h | 6 ++++++ 2 files changed, 46 insertions(+), 6 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 9038211..f77909d 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -2217,17 +2217,51 @@ err: return rc; } +int xen_event_channel_register_nlevel(unsigned int level) +{ + int rc; + + switch (level) { + case 3: + rc = xen_event_channel_register_3level(); + break; + default: + printk(KERN_INFO "Trying to register unsupported %d-level event channel\n", + level); + rc = -EINVAL; + } + + return rc; +} + +void xen_set_event_channel_nlevel(unsigned int level) +{ + struct shared_info *s = HYPERVISOR_shared_info; + switch (level) { + case 2: + evtchn_pending = s->evtchn_pending; + evtchn_mask = s->evtchn_mask; + evtchn_level = 2; + nr_event_channels = NR_EVENT_CHANNELS_L2; + break; + case 3: + /* evtchn_pending/mask already set */ + evtchn_level = 3; + nr_event_channels = NR_EVENT_CHANNELS_L3; + break; + default: + printk(KERN_EMERG "Trying to set unsupported %d-level event channel\n", + level); + BUG(); + } +} + void __init xen_init_IRQ(void) { int i, rc; int cpu; - struct shared_info *s = HYPERVISOR_shared_info; - - evtchn_pending = s->evtchn_pending; - evtchn_mask = s->evtchn_mask; - evtchn_level = 2; - nr_event_channels = NR_EVENT_CHANNELS_L2; + xen_set_event_channel_nlevel(2); evtchn_to_irq = kcalloc(nr_event_channels, sizeof(*evtchn_to_irq), GFP_KERNEL); diff --git a/include/xen/events.h b/include/xen/events.h index 6b117ac..cbc1b71 100644 --- a/include/xen/events.h +++ b/include/xen/events.h @@ -111,4 +111,10 @@ int xen_test_irq_shared(int irq); extern unsigned int nr_event_channels; +/* Register N-level event channel */ +int xen_event_channel_register_nlevel(unsigned int level); + +/* Set event channel to N-level if registration succeed */ +void xen_set_event_channel_nlevel(unsigned int level); + #endif /* _XEN_EVENTS_H */ -- 1.7.10.4
3-level event channel is registered in a) xen_init_IRQ(), when the guest is fresh started; b) xen_vcpu_restore(), when the guest is restored. Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- arch/x86/xen/enlighten.c | 13 +++++++++++++ drivers/xen/events.c | 11 ++++++++++- 2 files changed, 23 insertions(+), 1 deletion(-) diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index bc893e7..919c7ed 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -43,6 +43,7 @@ #include <xen/hvm.h> #include <xen/hvc-console.h> #include <xen/acpi.h> +#include <xen/events.h> #include <asm/paravirt.h> #include <asm/apic.h> @@ -177,6 +178,7 @@ static void xen_vcpu_setup(int cpu) void xen_vcpu_restore(void) { int cpu; + int rc; for_each_possible_cpu(cpu) { bool other_cpu = (cpu != smp_processor_id()); @@ -195,6 +197,17 @@ void xen_vcpu_restore(void) HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL)) BUG(); } + + rc = xen_event_channel_register_nlevel(3); + if (!rc) { + printk(KERN_INFO "Register 3-level event channel succeeded.\n"); + xen_set_event_channel_nlevel(3); + } else { + printk(KERN_INFO "Register 3-level event channel failed with %d.\n" + "Fall back to default 2-level event channel.\n", + rc); + xen_set_event_channel_nlevel(2); + } } static void __init xen_banner(void) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index f77909d..87088b2 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -2261,7 +2261,16 @@ void __init xen_init_IRQ(void) int i, rc; int cpu; - xen_set_event_channel_nlevel(2); + rc = xen_event_channel_register_nlevel(3); + if (!rc) { + printk(KERN_INFO "Register 3-level event channel succeeded.\n"); + xen_set_event_channel_nlevel(3); + } else { + printk(KERN_INFO "Register 3-level event channel failed with %d.\n" + "Fall back to default 2-level event channel.\n", + rc); + xen_set_event_channel_nlevel(2); + } evtchn_to_irq = kcalloc(nr_event_channels, sizeof(*evtchn_to_irq), GFP_KERNEL); -- 1.7.10.4
Wei Liu
2013-Jan-31 14:47 UTC
[PATCH 13/13] xen: only register 3-level event channel for Dom0
Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- arch/x86/xen/enlighten.c | 23 +++++++++++++---------- drivers/xen/events.c | 22 +++++++++++++--------- 2 files changed, 26 insertions(+), 19 deletions(-) diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 919c7ed..10b7565 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -178,7 +178,6 @@ static void xen_vcpu_setup(int cpu) void xen_vcpu_restore(void) { int cpu; - int rc; for_each_possible_cpu(cpu) { bool other_cpu = (cpu != smp_processor_id()); @@ -198,15 +197,19 @@ void xen_vcpu_restore(void) BUG(); } - rc = xen_event_channel_register_nlevel(3); - if (!rc) { - printk(KERN_INFO "Register 3-level event channel succeeded.\n"); - xen_set_event_channel_nlevel(3); - } else { - printk(KERN_INFO "Register 3-level event channel failed with %d.\n" - "Fall back to default 2-level event channel.\n", - rc); - xen_set_event_channel_nlevel(2); + if (xen_initial_domain()) { + int rc = xen_event_channel_register_nlevel(3); + if (!rc) { + printk(KERN_INFO + "Register 3-level event channel succeeded.\n"); + xen_set_event_channel_nlevel(3); + } else { + printk(KERN_INFO + "Register 3-level event channel failed with %d.\n" + "Fall back to default 2-level event channel.\n", + rc); + xen_set_event_channel_nlevel(2); + } } } diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 87088b2..cdf7e84 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -2261,15 +2261,19 @@ void __init xen_init_IRQ(void) int i, rc; int cpu; - rc = xen_event_channel_register_nlevel(3); - if (!rc) { - printk(KERN_INFO "Register 3-level event channel succeeded.\n"); - xen_set_event_channel_nlevel(3); - } else { - printk(KERN_INFO "Register 3-level event channel failed with %d.\n" - "Fall back to default 2-level event channel.\n", - rc); - xen_set_event_channel_nlevel(2); + if (xen_initial_domain()) { + rc = xen_event_channel_register_nlevel(3); + if (!rc) { + printk(KERN_INFO + "Register 3-level event channel succeeded.\n"); + xen_set_event_channel_nlevel(3); + } else { + printk(KERN_INFO + "Register 3-level event channel failed with %d.\n" + "Fall back to default 2-level event channel.\n", + rc); + xen_set_event_channel_nlevel(2); + } } evtchn_to_irq = kcalloc(nr_event_channels, sizeof(*evtchn_to_irq), -- 1.7.10.4
Wei Liu
2013-Jan-31 16:39 UTC
Re: [PATCH 13/13] xen: only register 3-level event channel for Dom0
The previous patch is wrong, correct version attached. ----8<----- commit e12daae7f69d6ddc22d3268e8d97048489d4d9ff Author: Wei Liu <wei.liu2@citrix.com> Date: Thu Jan 31 14:31:40 2013 +0000 xen: only register 3-level event channel for Dom0 Signed-off-by: Wei Liu <wei.liu2@citrix.com> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 919c7ed..e397c51 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -178,7 +178,6 @@ static void xen_vcpu_setup(int cpu) void xen_vcpu_restore(void) { int cpu; - int rc; for_each_possible_cpu(cpu) { bool other_cpu = (cpu != smp_processor_id()); @@ -198,14 +197,20 @@ void xen_vcpu_restore(void) BUG(); } - rc = xen_event_channel_register_nlevel(3); - if (!rc) { - printk(KERN_INFO "Register 3-level event channel succeeded.\n"); - xen_set_event_channel_nlevel(3); + if (xen_initial_domain()) { + int rc = xen_event_channel_register_nlevel(3); + if (!rc) { + printk(KERN_INFO + "Register 3-level event channel succeeded.\n"); + xen_set_event_channel_nlevel(3); + } else { + printk(KERN_INFO + "Register 3-level event channel failed with %d.\n" + "Fall back to default 2-level event channel.\n", + rc); + xen_set_event_channel_nlevel(2); + } } else { - printk(KERN_INFO "Register 3-level event channel failed with %d.\n" - "Fall back to default 2-level event channel.\n", - rc); xen_set_event_channel_nlevel(2); } } diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 87088b2..8467d6a 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -2261,14 +2261,20 @@ void __init xen_init_IRQ(void) int i, rc; int cpu; - rc = xen_event_channel_register_nlevel(3); - if (!rc) { - printk(KERN_INFO "Register 3-level event channel succeeded.\n"); - xen_set_event_channel_nlevel(3); + if (xen_initial_domain()) { + rc = xen_event_channel_register_nlevel(3); + if (!rc) { + printk(KERN_INFO + "Register 3-level event channel succeeded.\n"); + xen_set_event_channel_nlevel(3); + } else { + printk(KERN_INFO + "Register 3-level event channel failed with %d.\n" + "Fall back to default 2-level event channel.\n", + rc); + xen_set_event_channel_nlevel(2); + } } else { - printk(KERN_INFO "Register 3-level event channel failed with %d.\n" - "Fall back to default 2-level event channel.\n", - rc); xen_set_event_channel_nlevel(2); }
David Vrabel
2013-Feb-01 11:06 UTC
Re: [PATCH 01/13] xen: fix output of xen_debug_interrupt
On 31/01/13 14:46, Wei Liu wrote:> Four things are fixed: > a) the test result of globaly masked event; > b) make the per-cpu selector L1 to be consistent with description in > __xen_evtchn_do_upcall''s comment; > c) the test result of L1 selector; > d) add KERN_DEBUG in printk.It''s hard to pick out the correctness fixes with all the cosmetic KERN_DEBUG fixes in the same patch.> Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/xen/events.c | 34 +++++++++++++++++----------------- > 1 file changed, 17 insertions(+), 17 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 7595581..2c94aad 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c[...]> + !sync_test_bit(word_idx, &v->evtchn_pending_sel) > + ? "" : " l1-clear",This looks backwards now.> + sync_test_bit(i, sh->evtchn_mask) > ? "" : " globally-masked", > sync_test_bit(i, cpu_evtchn) > ? "" : " locally-masked");David
On 31/01/13 14:46, Wei Liu wrote:> It is possible the port was allocated but the irq was not. Take care of this > case.I think the port should be closed when the evtchn_bind_to_user() fails otherwise the evtchn driver is leaving the event channel in an inconsistent state. David> Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/xen/evtchn.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c > index b1f60a0..d2bbea1 100644 > --- a/drivers/xen/evtchn.c > +++ b/drivers/xen/evtchn.c > @@ -277,7 +277,17 @@ static void evtchn_unbind_from_user(struct per_user_data *u, int port) > { > int irq = irq_from_evtchn(port); > > - unbind_from_irqhandler(irq, (void *)(unsigned long)port); > + /* It is possible that the port was allocated but the irq was > + * not */ > + if (irq >= 0) { > + unbind_from_irqhandler(irq, (void *)(unsigned long)port); > + } else { > + struct evtchn_close close; > + close.port = port; > + if (port != 0 && /* port 0 is never used */ > + HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) > + BUG(); > + } > > set_port_user(port, NULL); > }
David Vrabel
2013-Feb-01 11:19 UTC
Re: [PATCH 02/13] xen: fix error handling path if xen_allocate_irq_dynamic fails
On 31/01/13 14:46, Wei Liu wrote:> It is possible that the call to xen_allocate_irq_dynamic() returns negative > number other than -1.Yes, this ultimately calls__irq_alloc_descs() which returns -ve error codes.> Signed-off-by: Wei Liu <wei.liu2@citrix.com>Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: stable@vger.kernel.org perhaps? David
Ian Campbell
2013-Feb-01 11:26 UTC
Re: [PATCH 02/13] xen: fix error handling path if xen_allocate_irq_dynamic fails
On Thu, 2013-01-31 at 14:46 +0000, Wei Liu wrote:> It is possible that the call to xen_allocate_irq_dynamic() returns negative > number other than -1.Hopefully this means it always returns -ERRNO, rather than a mixture of -ERRNO and literal -1? Ian.
David Vrabel
2013-Feb-01 11:29 UTC
Re: [PATCH 08/13] xen: dynamically allocate cpu_evtchn_mask
On 31/01/13 14:47, Wei Liu wrote:> Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/xen/events.c | 21 ++++++++++++++++++--- > 1 file changed, 18 insertions(+), 3 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 4820a52..30ca620 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c[...]> + for_each_possible_cpu(cpu) { > + void *p; > + unsigned int nr = nr_event_channels / BITS_PER_LONG; > + > + p = kzalloc_node(sizeof(unsigned long) * nr, > + GFP_KERNEL, > + cpu_to_node(cpu)); > + if (!p) > + p = kzalloc(sizeof(unsigned long) * nr, > + GFP_KERNEL);kzalloc_node() will fallback to allocating from other nodes, no need to try as well. David
On 31/01/13 14:47, Wei Liu wrote:> 3-level event channel is registered in > a) xen_init_IRQ(), when the guest is fresh started; > b) xen_vcpu_restore(), when the guest is restored. > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > arch/x86/xen/enlighten.c | 13 +++++++++++++ > drivers/xen/events.c | 11 ++++++++++- > 2 files changed, 23 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index bc893e7..919c7ed 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c[...]> + > + rc = xen_event_channel_register_nlevel(3); > + if (!rc) { > + printk(KERN_INFO "Register 3-level event channel succeeded.\n"); > + xen_set_event_channel_nlevel(3); > + } else { > + printk(KERN_INFO "Register 3-level event channel failed with %d.\n" > + "Fall back to default 2-level event channel.\n", > + rc); > + xen_set_event_channel_nlevel(2); > + }You should have a function for this, instead of copy-and-paste in two different places. David
On 31/01/13 14:46, Wei Liu wrote:> Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/xen/events.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 93a3648..0fae3f9 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -345,6 +345,12 @@ static inline int test_evtchn(int port) > return sync_test_bit(port, &s->evtchn_pending[0]); > } > > +static inline int test_and_set_mask(int port)This name is too generic. Suggest test_and_set_evtchn_mask() or similar. David
On Fri, 2013-02-01 at 11:15 +0000, David Vrabel wrote:> On 31/01/13 14:46, Wei Liu wrote: > > It is possible the port was allocated but the irq was not. Take care of this > > case. > > I think the port should be closed when the evtchn_bind_to_user() fails > otherwise the evtchn driver is leaving the event channel in an > inconsistent state. >Good point! I will move the fix there. Wei.
Wei Liu
2013-Feb-01 12:36 UTC
Re: [PATCH 02/13] xen: fix error handling path if xen_allocate_irq_dynamic fails
On Fri, 2013-02-01 at 11:26 +0000, Ian Campbell wrote:> On Thu, 2013-01-31 at 14:46 +0000, Wei Liu wrote: > > It is possible that the call to xen_allocate_irq_dynamic() returns negative > > number other than -1. > > Hopefully this means it always returns -ERRNO, rather than a mixture of > -ERRNO and literal -1? > > Ian. >Yes, xen_allocate_irq_dynamic() ultimately calls __irq_alloc_descs() which returns -ERRNO. Wei.
On Fri, 2013-02-01 at 11:06 +0000, David Vrabel wrote:> On 31/01/13 14:46, Wei Liu wrote: > > Four things are fixed: > > a) the test result of globaly masked event; > > b) make the per-cpu selector L1 to be consistent with description in > > __xen_evtchn_do_upcall''s comment; > > c) the test result of L1 selector; > > d) add KERN_DEBUG in printk. > > It''s hard to pick out the correctness fixes with all the cosmetic > KERN_DEBUG fixes in the same patch. >Moved to separate changeset.> > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > --- > > drivers/xen/events.c | 34 +++++++++++++++++----------------- > > 1 file changed, 17 insertions(+), 17 deletions(-) > > > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > > index 7595581..2c94aad 100644 > > --- a/drivers/xen/events.c > > +++ b/drivers/xen/events.c > [...] > > + !sync_test_bit(word_idx, &v->evtchn_pending_sel) > > + ? "" : " l1-clear", > > This looks backwards now. >Did you mean the "globally-masked" below? I think I made a mistake here, because in global mask 1 means masked while in per cpu mask 0 means masked... Wei.> > + sync_test_bit(i, sh->evtchn_mask) > > ? "" : " globally-masked", > > sync_test_bit(i, cpu_evtchn) > > ? "" : " locally-masked"); > > David >
On Fri, 2013-02-01 at 11:06 +0000, David Vrabel wrote:> On 31/01/13 14:46, Wei Liu wrote: > > Four things are fixed: > > a) the test result of globaly masked event; > > b) make the per-cpu selector L1 to be consistent with description in > > __xen_evtchn_do_upcall''s comment; > > c) the test result of L1 selector; > > d) add KERN_DEBUG in printk. > > It''s hard to pick out the correctness fixes with all the cosmetic > KERN_DEBUG fixes in the same patch. > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > --- > > drivers/xen/events.c | 34 +++++++++++++++++----------------- > > 1 file changed, 17 insertions(+), 17 deletions(-) > > > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > > index 7595581..2c94aad 100644 > > --- a/drivers/xen/events.c > > +++ b/drivers/xen/events.c > [...] > > + !sync_test_bit(word_idx, &v->evtchn_pending_sel) > > + ? "" : " l1-clear", > > This looks backwards now.Ah, I understand now. You''re right, this fix is not necessary. Wei.> > > + sync_test_bit(i, sh->evtchn_mask) > > ? "" : " globally-masked", > > sync_test_bit(i, cpu_evtchn) > > ? "" : " locally-masked"); > > David >
On Fri, 2013-02-01 at 11:35 +0000, David Vrabel wrote:> On 31/01/13 14:46, Wei Liu wrote: > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > --- > > drivers/xen/events.c | 10 ++++++++-- > > 1 file changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > > index 93a3648..0fae3f9 100644 > > --- a/drivers/xen/events.c > > +++ b/drivers/xen/events.c > > @@ -345,6 +345,12 @@ static inline int test_evtchn(int port) > > return sync_test_bit(port, &s->evtchn_pending[0]); > > } > > > > +static inline int test_and_set_mask(int port) > > This name is too generic. Suggest test_and_set_evtchn_mask() or similar.For a static function inside drivers/xen/events.c that seems like overkill. Ian.
Jan Beulich
2013-Feb-04 08:56 UTC
Re: [PATCH 10/13] xen: introduce xen_event_channel_register_3level
>>> On 31.01.13 at 15:47, Wei Liu <wei.liu2@citrix.com> wrote: > @@ -2123,6 +2126,97 @@ void xen_callback_vector(void) > void xen_callback_vector(void) {} > #endif > > +static int xen_event_channel_register_3level(void) > +{ > + evtchn_register_nlevel_t reg; > + int i, cpu; > + unsigned long *_evtchn_pending = NULL; > + unsigned long *_evtchn_mask = NULL; > + unsigned long *l2sel_mfns = NULL; > + unsigned long *l2sel_offsets = NULL; > + int rc; > + > + /* If we come from restore path, we don''t need to allocate > + * pages. > + */ > + if (!evtchn_pending && !evtchn_mask) { > + evtchn_pending > + (unsigned long *)__get_free_pages(GFP_KERNEL, > + BITMAP_PG_ORDER); > + evtchn_mask > + (unsigned long *)__get_free_pages(GFP_KERNEL, > + BITMAP_PG_ORDER); > + if (!evtchn_pending || !evtchn_mask) { > + free_pages((unsigned long)evtchn_pending, BITMAP_NR_PAGES); > + free_pages((unsigned long)evtchn_mask, BITMAP_NR_PAGES);free_pages() takes an order just like __get_free_pages() does.> + evtchn_pending = NULL; > + evtchn_mask = NULL; > + rc = -ENOMEM; > + goto err; > + } > + } > + > + rc = -ENOMEM; /* Common error code for following operations */ > +#define __ALLOC_ARRAY(_ptr, _nr) \ > + do { \ > + (_ptr) = kzalloc(sizeof(unsigned long) * (_nr), \ > + GFP_KERNEL); \ > + if (!(_ptr)) \ > + goto out; \ > + } while (0) > + > + __ALLOC_ARRAY(_evtchn_pending, BITMAP_NR_PAGES); > + __ALLOC_ARRAY(_evtchn_mask, BITMAP_NR_PAGES); > + __ALLOC_ARRAY(l2sel_mfns, nr_cpu_ids); > + __ALLOC_ARRAY(l2sel_offsets, nr_cpu_ids); > +#undef __ALLOC_ARRAY > + > + memset(®, 0, sizeof(reg)); > + > + for (i = 0; i < BITMAP_NR_PAGES; i++) { > + unsigned long offset = PAGE_SIZE * i; > + _evtchn_pending[i] > + arbitrary_virt_to_mfn( > + (void *)((unsigned long)evtchn_pending+offset)); > + _evtchn_mask[i] > + arbitrary_virt_to_mfn( > + (void *)((unsigned long)evtchn_mask+offset)); > + } > + > + for_each_possible_cpu(cpu) { > + l2sel_mfns[cpu] > + arbitrary_virt_to_mfn(&per_cpu(evtchn_sel_l2, cpu)); > + l2sel_offsets[cpu] > + offset_in_page(&per_cpu(evtchn_sel_l2, cpu)); > + } > + > + reg.u.l3.nr_pages = BITMAP_NR_PAGES; > + reg.u.l3.evtchn_pending = _evtchn_pending; > + reg.u.l3.evtchn_mask = _evtchn_mask; > + > + reg.u.l3.nr_vcpus = nr_cpu_ids; > + reg.u.l3.l2sel_mfns = l2sel_mfns; > + reg.u.l3.l2sel_offsets = l2sel_offsets; > + > + reg.level = 3; > + > + rc = HYPERVISOR_event_channel_op(EVTCHNOP_register_nlevel, ®); > + if (rc) { > + free_pages((unsigned long)evtchn_pending, BITMAP_NR_PAGES); > + free_pages((unsigned long)evtchn_mask, BITMAP_NR_PAGES);Same here. Jan> + evtchn_pending = NULL; > + evtchn_mask = NULL; > + } > + > +out: > + kfree(_evtchn_pending); > + kfree(_evtchn_mask); > + kfree(l2sel_mfns); > + kfree(l2sel_offsets); > +err: > + return rc; > +} > + > void __init xen_init_IRQ(void) > { > int i, rc; > -- > 1.7.10.4
Wei Liu
2013-Feb-04 10:36 UTC
Re: [PATCH 10/13] xen: introduce xen_event_channel_register_3level
On Mon, 2013-02-04 at 08:56 +0000, Jan Beulich wrote:> >>> On 31.01.13 at 15:47, Wei Liu <wei.liu2@citrix.com> wrote: > > @@ -2123,6 +2126,97 @@ void xen_callback_vector(void) > > void xen_callback_vector(void) {} > > #endif > > > > +static int xen_event_channel_register_3level(void) > > +{ > > + evtchn_register_nlevel_t reg; > > + int i, cpu; > > + unsigned long *_evtchn_pending = NULL; > > + unsigned long *_evtchn_mask = NULL; > > + unsigned long *l2sel_mfns = NULL; > > + unsigned long *l2sel_offsets = NULL; > > + int rc; > > + > > + /* If we come from restore path, we don''t need to allocate > > + * pages. > > + */ > > + if (!evtchn_pending && !evtchn_mask) { > > + evtchn_pending > > + (unsigned long *)__get_free_pages(GFP_KERNEL, > > + BITMAP_PG_ORDER); > > + evtchn_mask > > + (unsigned long *)__get_free_pages(GFP_KERNEL, > > + BITMAP_PG_ORDER); > > + if (!evtchn_pending || !evtchn_mask) { > > + free_pages((unsigned long)evtchn_pending, BITMAP_NR_PAGES); > > + free_pages((unsigned long)evtchn_mask, BITMAP_NR_PAGES); > > free_pages() takes an order just like __get_free_pages() does.Good catch, thanks! Wei.
Konrad Rzeszutek Wilk
2013-Feb-05 16:55 UTC
Re: [PATCH 10/13] xen: introduce xen_event_channel_register_3level
On Thu, Jan 31, 2013 at 02:47:04PM +0000, Wei Liu wrote:> Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/xen/events.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 94 insertions(+) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index d953e81..9038211 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -53,6 +53,9 @@ > > /* Helper macro(s) */ > #define LONG_BITORDER (BITS_PER_LONG == 64 ? 6 : 5)Can you provide a comment explaining why the ''6'' or ''5'' value?> +/* event bitmap size: 1 page for 32 bit and 8 pages for 64 bit */ > +#define BITMAP_PG_ORDER (BITS_PER_LONG == 64 ? 3 : 1) > +#define BITMAP_NR_PAGES (BITMAP_PG_ORDER == 3 ? 8 : 1)Is there some math behind this? Could you provide a comment explaining the reason for needing so much for a 64-bit vs only needing on page on 32-bit?> > /* N-level event channel, starting from 2 */ > unsigned int evtchn_level = 2; > @@ -2123,6 +2126,97 @@ void xen_callback_vector(void) > void xen_callback_vector(void) {} > #endif > > +static int xen_event_channel_register_3level(void) > +{ > + evtchn_register_nlevel_t reg;Please no typdefs.> + int i, cpu; > + unsigned long *_evtchn_pending = NULL; > + unsigned long *_evtchn_mask = NULL; > + unsigned long *l2sel_mfns = NULL; > + unsigned long *l2sel_offsets = NULL; > + int rc; > + > + /* If we come from restore path, we don''t need to allocate > + * pages. > + */ > + if (!evtchn_pending && !evtchn_mask) { > + evtchn_pending > + (unsigned long *)__get_free_pages(GFP_KERNEL, > + BITMAP_PG_ORDER); > + evtchn_mask > + (unsigned long *)__get_free_pages(GFP_KERNEL, > + BITMAP_PG_ORDER); > + if (!evtchn_pending || !evtchn_mask) { > + free_pages((unsigned long)evtchn_pending, BITMAP_NR_PAGES); > + free_pages((unsigned long)evtchn_mask, BITMAP_NR_PAGES); > + evtchn_pending = NULL; > + evtchn_mask = NULL; > + rc = -ENOMEM; > + goto err; > + } > + } > + > + rc = -ENOMEM; /* Common error code for following operations */ > +#define __ALLOC_ARRAY(_ptr, _nr) \ > + do { \ > + (_ptr) = kzalloc(sizeof(unsigned long) * (_nr), \ > + GFP_KERNEL); \ > + if (!(_ptr)) \ > + goto out; \ > + } while (0) > + > + __ALLOC_ARRAY(_evtchn_pending, BITMAP_NR_PAGES); > + __ALLOC_ARRAY(_evtchn_mask, BITMAP_NR_PAGES); > + __ALLOC_ARRAY(l2sel_mfns, nr_cpu_ids); > + __ALLOC_ARRAY(l2sel_offsets, nr_cpu_ids); > +#undef __ALLOC_ARRAY > + > + memset(®, 0, sizeof(reg)); > + > + for (i = 0; i < BITMAP_NR_PAGES; i++) { > + unsigned long offset = PAGE_SIZE * i; > + _evtchn_pending[i] > + arbitrary_virt_to_mfn( > + (void *)((unsigned long)evtchn_pending+offset)); > + _evtchn_mask[i] > + arbitrary_virt_to_mfn( > + (void *)((unsigned long)evtchn_mask+offset)); > + } > + > + for_each_possible_cpu(cpu) { > + l2sel_mfns[cpu] > + arbitrary_virt_to_mfn(&per_cpu(evtchn_sel_l2, cpu)); > + l2sel_offsets[cpu] > + offset_in_page(&per_cpu(evtchn_sel_l2, cpu)); > + } > + > + reg.u.l3.nr_pages = BITMAP_NR_PAGES; > + reg.u.l3.evtchn_pending = _evtchn_pending; > + reg.u.l3.evtchn_mask = _evtchn_mask; > + > + reg.u.l3.nr_vcpus = nr_cpu_ids; > + reg.u.l3.l2sel_mfns = l2sel_mfns; > + reg.u.l3.l2sel_offsets = l2sel_offsets; > + > + reg.level = 3; > + > + rc = HYPERVISOR_event_channel_op(EVTCHNOP_register_nlevel, ®); > + if (rc) { > + free_pages((unsigned long)evtchn_pending, BITMAP_NR_PAGES); > + free_pages((unsigned long)evtchn_mask, BITMAP_NR_PAGES); > + evtchn_pending = NULL; > + evtchn_mask = NULL; > + } > + > +out: > + kfree(_evtchn_pending); > + kfree(_evtchn_mask); > + kfree(l2sel_mfns); > + kfree(l2sel_offsets);So it is OK to just free it even on success??> +err: > + return rc; > +} > + > void __init xen_init_IRQ(void) > { > int i, rc; > -- > 1.7.10.4 >
Konrad Rzeszutek Wilk
2013-Feb-05 16:57 UTC
Re: [PATCH 03/13] xen: fix evtchn_unbind_from_user
On Fri, Feb 01, 2013 at 12:33:05PM +0000, Wei Liu wrote:> On Fri, 2013-02-01 at 11:15 +0000, David Vrabel wrote: > > On 31/01/13 14:46, Wei Liu wrote: > > > It is possible the port was allocated but the irq was not. Take care of this > > > case. > > > > I think the port should be closed when the evtchn_bind_to_user() fails > > otherwise the evtchn driver is leaving the event channel in an > > inconsistent state. > > > > Good point! I will move the fix there.Ok, and in this function just do a BUG_ON(xx) if the case is still hit.> > Wei. >
On Thu, Jan 31, 2013 at 02:46:58PM +0000, Wei Liu wrote:> Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > include/xen/interface/event_channel.h | 33 +++++++++++++++++++++++++++++++++ > include/xen/interface/xen.h | 9 ++++++++- > 2 files changed, 41 insertions(+), 1 deletion(-) > > diff --git a/include/xen/interface/event_channel.h b/include/xen/interface/event_channel.h > index f494292..9d8b9e7 100644 > --- a/include/xen/interface/event_channel.h > +++ b/include/xen/interface/event_channel.h > @@ -190,6 +190,39 @@ struct evtchn_reset { > }; > typedef struct evtchn_reset evtchn_reset_t; > > +/* > + * EVTCHNOP_register_nlevel: Register N-level event channel > + * NOTES: > + * 1. Currently only 3-level is supported. > + * 2. Should fall back to 2-level if this call fails. > + */ > +#define EVTCHNOP_register_nlevel 11 > +/* 64 bit guests need 8 pages for evtchn_pending and evtchn_mask for > + * 256k event channels while 32 bit ones only need 1 page for 32k > + * event channels. */ > +#define EVTCHN_MAX_L3_PAGES 8 > +struct evtchn_register_3level { > + /* IN parameters. */ > + uint32_t nr_pages; /* for evtchn_{pending,mask} */ > + uint32_t nr_vcpus; /* for l2sel_{mfns,mask} */ > + GUEST_HANDLE(ulong) evtchn_pending; > + GUEST_HANDLE(ulong) evtchn_mask; > + GUEST_HANDLE(ulong) l2sel_mfns; > + GUEST_HANDLE(ulong) l2sel_offsets; > +}; > +typedef struct evtchn_register_3level evtchn_register_3level_t; > +DEFINE_GUEST_HANDLE(evtchn_register_3level_t); > + > +struct evtchn_register_nlevel { > + /* IN parameters. */ > + uint32_t level; > + union { > + evtchn_register_3level_t l3; > + } u; > +}; > +typedef struct evtchn_register_nlevel evtchn_register_nlevel_t; > +DEFINE_GUEST_HANDLE(evtchn_register_nlevel_t); > + > struct evtchn_op { > uint32_t cmd; /* EVTCHNOP_* */ > union { > diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h > index a890804..5220e33 100644 > --- a/include/xen/interface/xen.h > +++ b/include/xen/interface/xen.h > @@ -283,9 +283,16 @@ DEFINE_GUEST_HANDLE_STRUCT(multicall_entry); > > /* > * Event channel endpoints per domain: > + * 2-level: > * 1024 if a long is 32 bits; 4096 if a long is 64 bits. > + * 3-level: > + * 32k if a long is 32 bits; 256k if a long is 64 bits. > */ > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * sizeof(unsigned long) * 64) > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * sizeof(unsigned long) * 64)We did a bit of header change to make the ARM code always use ''unsinged long long'' on 32-bit (so it is a nice 8-bytes) and ''unsigned long'' on 64-bit. This was all done using the xen_pfn_t and xen_mfn_t typdefs. Any chance you can do that here? That way on ARM - irregardless if it is 32-bit or 64-bit it is always of the 64-bit size? Or do we not care about the ARM case here?> +#define NR_EVENT_CHANNELS_L3 (NR_EVENT_CHANNELS_L2 * 64) > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) > +#define NR_EVENT_CHANNELS NR_EVENT_CHANNELS_L2 /* for compatibility */ > +#endif > > struct vcpu_time_info { > /* > -- > 1.7.10.4 >
Konrad Rzeszutek Wilk
2013-Feb-05 17:04 UTC
Re: [PATCH 07/13] xen: generalized event channel operations
On Thu, Jan 31, 2013 at 02:47:01PM +0000, Wei Liu wrote:> Use global pointers in common operations to allow for better code sharing > between N-level event channel. > > Functions which are not suitable for sharing are also taken care of. > > Also update drivers/xen/evtchn.c to use exported variable instead of macro. > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/xen/events.c | 180 ++++++++++++++++++++++++++++++++------------------ > drivers/xen/evtchn.c | 12 ++-- > include/xen/events.h | 2 + > 3 files changed, 123 insertions(+), 71 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 0679d27..4820a52 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -51,6 +51,16 @@ > #include <xen/interface/hvm/hvm_op.h> > #include <xen/interface/hvm/params.h> > > +/* N-level event channel, starting from 2 */ > +unsigned int evtchn_level = 2;What if the hypervisor does not support that? Shouldn''t be by default at 1?> +EXPORT_SYMBOL_GPL(evtchn_level);Prefix it please with ''xen''> +unsigned int nr_event_channels; > +EXPORT_SYMBOL_GPL(nr_event_channels);Ditto here.> + > +/* The following pointers point to pending bitmap and mask bitmap. */ > +static unsigned long *evtchn_pending; > +static unsigned long *evtchn_mask; > + > /* > * This lock protects updates to the following mapping and reference-count > * arrays. The lock does not need to be acquired to read the mapping tables. > @@ -113,7 +123,7 @@ static int *evtchn_to_irq; > static unsigned long *pirq_eoi_map; > static bool (*pirq_needs_eoi)(unsigned irq); > > -static DEFINE_PER_CPU(unsigned long [NR_EVENT_CHANNELS/BITS_PER_LONG], > +static DEFINE_PER_CPU(unsigned long [NR_EVENT_CHANNELS_L2/BITS_PER_LONG], > cpu_evtchn_mask); > > /* Xen will never allocate port zero for any purpose. */ > @@ -286,12 +296,11 @@ static bool pirq_needs_eoi_flag(unsigned irq) > } > > static inline unsigned long active_evtchns(unsigned int cpu, > - struct shared_info *sh, > unsigned int idx) > { > - return sh->evtchn_pending[idx] & > + return evtchn_pending[idx] & > per_cpu(cpu_evtchn_mask, cpu)[idx] & > - ~sh->evtchn_mask[idx]; > + ~evtchn_mask[idx]; > } > > static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) > @@ -329,26 +338,22 @@ static void init_evtchn_cpu_bindings(void) > > static inline void clear_evtchn(int port) > { > - struct shared_info *s = HYPERVISOR_shared_info; > - sync_clear_bit(port, &s->evtchn_pending[0]); > + sync_clear_bit(port, &evtchn_pending[0]); > } > > static inline void set_evtchn(int port) > { > - struct shared_info *s = HYPERVISOR_shared_info; > - sync_set_bit(port, &s->evtchn_pending[0]); > + sync_set_bit(port, &evtchn_pending[0]); > } > > static inline int test_evtchn(int port) > { > - struct shared_info *s = HYPERVISOR_shared_info; > - return sync_test_bit(port, &s->evtchn_pending[0]); > + return sync_test_bit(port, &evtchn_pending[0]); > } > > static inline int test_and_set_mask(int port) > { > - struct shared_info *s = HYPERVISOR_shared_info; > - return sync_test_and_set_bit(port, &s->evtchn_mask[0]); > + return sync_test_and_set_bit(port, &evtchn_mask[0]); > } > > > @@ -371,13 +376,28 @@ EXPORT_SYMBOL_GPL(notify_remote_via_irq); > > static void mask_evtchn(int port) > { > - struct shared_info *s = HYPERVISOR_shared_info; > - sync_set_bit(port, &s->evtchn_mask[0]); > + sync_set_bit(port, &evtchn_mask[0]); > +} > + > +static inline void __unmask_local_port_l2(int port) > +{ > + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); > + > + sync_clear_bit(port, &evtchn_mask[0]); > + > + /* > + * The following is basically the equivalent of > + * ''hw_resend_irq''. Just like a real IO-APIC we ''lose > + * the interrupt edge'' if the channel is masked. > + */ > + if (sync_test_bit(port, &evtchn_pending[0]) && > + !sync_test_and_set_bit(port / BITS_PER_LONG, > + &vcpu_info->evtchn_pending_sel)) > + vcpu_info->evtchn_upcall_pending = 1; > } > > static void unmask_evtchn(int port) > { > - struct shared_info *s = HYPERVISOR_shared_info; > unsigned int cpu = get_cpu(); > > BUG_ON(!irqs_disabled()); > @@ -387,19 +407,13 @@ static void unmask_evtchn(int port) > struct evtchn_unmask unmask = { .port = port }; > (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); > } else { > - struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); > - > - sync_clear_bit(port, &s->evtchn_mask[0]); > - > - /* > - * The following is basically the equivalent of > - * ''hw_resend_irq''. Just like a real IO-APIC we ''lose > - * the interrupt edge'' if the channel is masked. > - */ > - if (sync_test_bit(port, &s->evtchn_pending[0]) && > - !sync_test_and_set_bit(port / BITS_PER_LONG, > - &vcpu_info->evtchn_pending_sel)) > - vcpu_info->evtchn_upcall_pending = 1; > + switch (evtchn_level) { > + case 2: > + __unmask_local_port_l2(port); > + break; > + default: > + BUG(); > + } > } > > put_cpu(); > @@ -902,7 +916,7 @@ static int find_virq(unsigned int virq, unsigned int cpu) > int port, rc = -ENOENT; > > memset(&status, 0, sizeof(status)); > - for (port = 0; port <= NR_EVENT_CHANNELS; port++) { > + for (port = 0; port <= nr_event_channels; port++) { > status.dom = DOMID_SELF; > status.port = port; > rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status); > @@ -1127,7 +1141,7 @@ int evtchn_get(unsigned int evtchn) > struct irq_info *info; > int err = -ENOENT; > > - if (evtchn >= NR_EVENT_CHANNELS) > + if (evtchn >= nr_event_channels) > return -EINVAL; > > mutex_lock(&irq_mapping_update_lock); > @@ -1170,15 +1184,16 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) > notify_remote_via_irq(irq); > } > > +static irqreturn_t xen_debug_interrupt_l2(int irq, void *dev_id); > + > irqreturn_t xen_debug_interrupt(int irq, void *dev_id) > { > - struct shared_info *sh = HYPERVISOR_shared_info; > - int cpu = smp_processor_id(); > - unsigned long *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); > - int i; > - unsigned long flags; > + irqreturn_t rc; > static DEFINE_SPINLOCK(debug_lock); > + unsigned long flags; > + int cpu = smp_processor_id(); > struct vcpu_info *v; > + int i; > > spin_lock_irqsave(&debug_lock, flags); > > @@ -1195,24 +1210,45 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) > (int)(sizeof(v->evtchn_pending_sel)*2), > v->evtchn_pending_sel); > } > + > + switch (evtchn_level) { > + case 2: > + rc = xen_debug_interrupt_l2(irq, dev_id); > + break; > + default: > + BUG(); > + } > + > + spin_unlock_irqrestore(&debug_lock, flags); > + return rc; > +} > + > +static irqreturn_t xen_debug_interrupt_l2(int irq, void *dev_id) > +{ > + int cpu = smp_processor_id(); > + unsigned long *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); > + int i; > + unsigned long nr_elems = NR_EVENT_CHANNELS_L2 / BITS_PER_LONG; > + struct vcpu_info *v; > + > v = per_cpu(xen_vcpu, cpu); > > printk(KERN_DEBUG "\npending:\n "); > - for (i = ARRAY_SIZE(sh->evtchn_pending)-1; i >= 0; i--) > - printk(KERN_DEBUG "%0*lx%s", (int)sizeof(sh->evtchn_pending[0])*2, > - sh->evtchn_pending[i], > + for (i = nr_elems; i >= 0; i--) > + printk(KERN_DEBUG "%0*lx%s", (int)sizeof(evtchn_pending[0])*2, > + evtchn_pending[i], > i % 8 == 0 ? "\n " : " "); > printk(KERN_DEBUG "\nglobal mask:\n "); > - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) > + for (i = nr_elems-1; i >= 0; i--) > printk(KERN_DEBUG "%0*lx%s", > - (int)(sizeof(sh->evtchn_mask[0])*2), > - sh->evtchn_mask[i], > + (int)(sizeof(evtchn_mask[0])*2), > + evtchn_mask[i], > i % 8 == 0 ? "\n " : " "); > > printk(KERN_DEBUG "\nglobally unmasked:\n "); > - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) > - printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(sh->evtchn_mask[0])*2), > - sh->evtchn_pending[i] & ~sh->evtchn_mask[i], > + for (i = nr_elems-1; i >= 0; i--) > + printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(evtchn_mask[0])*2), > + evtchn_pending[i] & ~evtchn_mask[i], > i % 8 == 0 ? "\n " : " "); > > printk(KERN_DEBUG "\nlocal cpu%d mask:\n ", cpu); > @@ -1222,32 +1258,30 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) > i % 8 == 0 ? "\n " : " "); > > printk(KERN_DEBUG "\nlocally unmasked:\n "); > - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) { > - unsigned long pending = sh->evtchn_pending[i] > - & ~sh->evtchn_mask[i] > + for (i = nr_elems-1; i >= 0; i--) { > + unsigned long pending = evtchn_pending[i] > + & ~evtchn_mask[i] > & cpu_evtchn[i]; > - printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(sh->evtchn_mask[0])*2), > + printk(KERN_DEBUG "%0*lx%s", (int)(sizeof(evtchn_mask[0])*2), > pending, i % 8 == 0 ? "\n " : " "); > } > > printk(KERN_DEBUG "\npending list:\n"); > - for (i = 0; i < NR_EVENT_CHANNELS; i++) { > - if (sync_test_bit(i, sh->evtchn_pending)) { > + for (i = 0; i < NR_EVENT_CHANNELS_L2; i++) { > + if (sync_test_bit(i, evtchn_pending)) { > int word_idx = i / BITS_PER_LONG; > printk(KERN_DEBUG " %d: event %d -> irq %d%s%s%s\n", > cpu_from_evtchn(i), i, > evtchn_to_irq[i], > !sync_test_bit(word_idx, &v->evtchn_pending_sel) > ? "" : " l1-clear", > - sync_test_bit(i, sh->evtchn_mask) > + sync_test_bit(i, evtchn_mask) > ? "" : " globally-masked", > sync_test_bit(i, cpu_evtchn) > ? "" : " locally-masked"); > } > } > > - spin_unlock_irqrestore(&debug_lock, flags); > - > return IRQ_HANDLED; > } > > @@ -1269,13 +1303,12 @@ static DEFINE_PER_CPU(unsigned int, current_bit_idx); > * a bitset of words which contain pending event bits. The second > * level is a bitset of pending events themselves. > */ > -static void __xen_evtchn_do_upcall(void) > +static void __xen_evtchn_do_upcall_l2(void) > { > int start_word_idx, start_bit_idx; > int word_idx, bit_idx; > int i; > int cpu = get_cpu(); > - struct shared_info *s = HYPERVISOR_shared_info; > struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); > unsigned count; > > @@ -1314,7 +1347,7 @@ static void __xen_evtchn_do_upcall(void) > } > word_idx = __ffs(words); > > - pending_bits = active_evtchns(cpu, s, word_idx); > + pending_bits = active_evtchns(cpu, word_idx); > bit_idx = 0; /* usually scan entire word from start */ > if (word_idx == start_word_idx) { > /* We scan the starting word in two parts */ > @@ -1383,7 +1416,13 @@ void xen_evtchn_do_upcall(struct pt_regs *regs) > exit_idle(); > irq_enter(); > > - __xen_evtchn_do_upcall(); > + switch (evtchn_level) { > + case 2: > + __xen_evtchn_do_upcall_l2(); > + break; > + default: > + BUG(); > + } > > irq_exit(); > set_irq_regs(old_regs); > @@ -1391,7 +1430,13 @@ void xen_evtchn_do_upcall(struct pt_regs *regs) > > void xen_hvm_evtchn_do_upcall(void) > { > - __xen_evtchn_do_upcall(); > + switch (evtchn_level) { > + case 2: > + __xen_evtchn_do_upcall_l2(); > + break; > + default: > + BUG(); > + } > } > EXPORT_SYMBOL_GPL(xen_hvm_evtchn_do_upcall); > > @@ -1465,7 +1510,6 @@ static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, > int resend_irq_on_evtchn(unsigned int irq) > { > int masked, evtchn = evtchn_from_irq(irq); > - struct shared_info *s = HYPERVISOR_shared_info; > > if (!VALID_EVTCHN(evtchn)) > return 1; > @@ -1513,7 +1557,6 @@ static void mask_ack_dynirq(struct irq_data *data) > static int retrigger_dynirq(struct irq_data *data) > { > int evtchn = evtchn_from_irq(data->irq); > - struct shared_info *sh = HYPERVISOR_shared_info; > int ret = 0; > > if (VALID_EVTCHN(evtchn)) { > @@ -1689,14 +1732,14 @@ void xen_irq_resume(void) > init_evtchn_cpu_bindings(); > > /* New event-channel space is not ''live'' yet. */ > - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) > + for (evtchn = 0; evtchn < nr_event_channels; evtchn++) > mask_evtchn(evtchn); > > /* No IRQ <-> event-channel mappings. */ > list_for_each_entry(info, &xen_irq_list_head, list) > info->evtchn = 0; /* zap event-channel binding */ > > - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) > + for (evtchn = 0; evtchn < nr_event_channels; evtchn++) > evtchn_to_irq[evtchn] = -1; > > for_each_possible_cpu(cpu) { > @@ -1792,17 +1835,24 @@ void xen_callback_vector(void) {} > void __init xen_init_IRQ(void) > { > int i, rc; > + struct shared_info *s = HYPERVISOR_shared_info; > + > + evtchn_pending = s->evtchn_pending; > + evtchn_mask = s->evtchn_mask; > + > + evtchn_level = 2; > + nr_event_channels = NR_EVENT_CHANNELS_L2; > > - evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), > + evtchn_to_irq = kcalloc(nr_event_channels, sizeof(*evtchn_to_irq), > GFP_KERNEL); > BUG_ON(!evtchn_to_irq); > - for (i = 0; i < NR_EVENT_CHANNELS; i++) > + for (i = 0; i < nr_event_channels; i++) > evtchn_to_irq[i] = -1; > > init_evtchn_cpu_bindings(); > > /* No event channels are ''live'' right now. */ > - for (i = 0; i < NR_EVENT_CHANNELS; i++) > + for (i = 0; i < nr_event_channels; i++) > mask_evtchn(i); > > pirq_needs_eoi = pirq_needs_eoi_flag; > diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c > index d2bbea1..7515ecc 100644 > --- a/drivers/xen/evtchn.c > +++ b/drivers/xen/evtchn.c > @@ -232,7 +232,7 @@ static ssize_t evtchn_write(struct file *file, const char __user *buf, > for (i = 0; i < (count/sizeof(evtchn_port_t)); i++) { > unsigned port = kbuf[i]; > > - if (port < NR_EVENT_CHANNELS && > + if (port < nr_event_channels && > get_port_user(port) == u && > !get_port_enabled(port)) { > set_port_enabled(port, true); > @@ -374,7 +374,7 @@ static long evtchn_ioctl(struct file *file, > break; > > rc = -EINVAL; > - if (unbind.port >= NR_EVENT_CHANNELS) > + if (unbind.port >= nr_event_channels) > break; > > spin_lock_irq(&port_user_lock); > @@ -402,7 +402,7 @@ static long evtchn_ioctl(struct file *file, > if (copy_from_user(¬ify, uarg, sizeof(notify))) > break; > > - if (notify.port >= NR_EVENT_CHANNELS) { > + if (notify.port >= nr_event_channels) { > rc = -EINVAL; > } else if (get_port_user(notify.port) != u) { > rc = -ENOTCONN; > @@ -492,7 +492,7 @@ static int evtchn_release(struct inode *inode, struct file *filp) > > free_page((unsigned long)u->ring); > > - for (i = 0; i < NR_EVENT_CHANNELS; i++) { > + for (i = 0; i < nr_event_channels; i++) { > if (get_port_user(i) != u) > continue; > > @@ -501,7 +501,7 @@ static int evtchn_release(struct inode *inode, struct file *filp) > > spin_unlock_irq(&port_user_lock); > > - for (i = 0; i < NR_EVENT_CHANNELS; i++) { > + for (i = 0; i < nr_event_channels; i++) { > if (get_port_user(i) != u) > continue; > > @@ -538,7 +538,7 @@ static int __init evtchn_init(void) > if (!xen_domain()) > return -ENODEV; > > - port_user = kcalloc(NR_EVENT_CHANNELS, sizeof(*port_user), GFP_KERNEL); > + port_user = kcalloc(nr_event_channels, sizeof(*port_user), GFP_KERNEL); > if (port_user == NULL) > return -ENOMEM; > > diff --git a/include/xen/events.h b/include/xen/events.h > index 04399b2..6b117ac 100644 > --- a/include/xen/events.h > +++ b/include/xen/events.h > @@ -109,4 +109,6 @@ int xen_irq_from_gsi(unsigned gsi); > /* Determine whether to ignore this IRQ if it is passed to a guest. */ > int xen_test_irq_shared(int irq); > > +extern unsigned int nr_event_channels; > + > #endif /* _XEN_EVENTS_H */ > -- > 1.7.10.4 >
Wei Liu
2013-Feb-05 17:05 UTC
Re: [PATCH 10/13] xen: introduce xen_event_channel_register_3level
On Tue, 2013-02-05 at 16:55 +0000, Konrad Rzeszutek Wilk wrote:> > + > > + rc = HYPERVISOR_event_channel_op(EVTCHNOP_register_nlevel, ®); > > + if (rc) { > > + free_pages((unsigned long)evtchn_pending, BITMAP_NR_PAGES); > > + free_pages((unsigned long)evtchn_mask, BITMAP_NR_PAGES); > > + evtchn_pending = NULL; > > + evtchn_mask = NULL; > > + } > > + > > +out: > > + kfree(_evtchn_pending); > > + kfree(_evtchn_mask); > > + kfree(l2sel_mfns); > > + kfree(l2sel_offsets); > > So it is OK to just free it even on success??Yes. They are only used for registration. And for all the above stuffs regarding comments, I will fix them in later post. Wei.
On Tue, 2013-02-05 at 17:04 +0000, Konrad Rzeszutek Wilk wrote:> On Thu, Jan 31, 2013 at 02:47:01PM +0000, Wei Liu wrote: > > Use global pointers in common operations to allow for better code sharing > > between N-level event channel. > > > > Functions which are not suitable for sharing are also taken care of. > > > > Also update drivers/xen/evtchn.c to use exported variable instead of macro. > > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > --- > > drivers/xen/events.c | 180 ++++++++++++++++++++++++++++++++------------------ > > drivers/xen/evtchn.c | 12 ++-- > > include/xen/events.h | 2 + > > 3 files changed, 123 insertions(+), 71 deletions(-) > > > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > > index 0679d27..4820a52 100644 > > --- a/drivers/xen/events.c > > +++ b/drivers/xen/events.c > > @@ -51,6 +51,16 @@ > > #include <xen/interface/hvm/hvm_op.h> > > #include <xen/interface/hvm/params.h> > > > > +/* N-level event channel, starting from 2 */ > > +unsigned int evtchn_level = 2; > > What if the hypervisor does not support that? Shouldn''t be by default > at 1? >the default implementation is 2 level. ;-)> > +EXPORT_SYMBOL_GPL(evtchn_level); > > Prefix it please with ''xen'' > > > +unsigned int nr_event_channels; > > +EXPORT_SYMBOL_GPL(nr_event_channels); > > Ditto here. >NP. Wei.
Konrad Rzeszutek Wilk
2013-Feb-05 17:09 UTC
Re: [PATCH 09/13] xen: implement 3-level event channel routines
On Thu, Jan 31, 2013 at 02:47:03PM +0000, Wei Liu wrote:> Only do_upcall, debug_interrupt and unmask_evtchn are required. > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/xen/events.c | 291 ++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 291 insertions(+) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 30ca620..d953e81 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -51,6 +51,9 @@ > #include <xen/interface/hvm/hvm_op.h> > #include <xen/interface/hvm/params.h> > > +/* Helper macro(s) */ > +#define LONG_BITORDER (BITS_PER_LONG == 64 ? 6 : 5)That really needs an explanation.> + > /* N-level event channel, starting from 2 */ > unsigned int evtchn_level = 2; > EXPORT_SYMBOL_GPL(evtchn_level); > @@ -61,6 +64,9 @@ EXPORT_SYMBOL_GPL(nr_event_channels); > static unsigned long *evtchn_pending; > static unsigned long *evtchn_mask; > > +/* 2nd level selector for 3-level event channel */And that ''8'' there needs a #define> +static DEFINE_PER_CPU(unsigned long[sizeof(unsigned long) * 8], evtchn_sel_l2); > + > /* > * This lock protects updates to the following mapping and reference-count > * arrays. The lock does not need to be acquired to read the mapping tables. > @@ -396,6 +402,28 @@ static inline void __unmask_local_port_l2(int port) > vcpu_info->evtchn_upcall_pending = 1; > } > > +static inline void __unmask_local_port_l3(int port) > +{ > + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); > + int cpu = smp_processor_id(); > + unsigned int l1bit = port >> (LONG_BITORDER << 1); > + unsigned int l2bit = port >> LONG_BITORDER; > + > + sync_clear_bit(port, &evtchn_mask[0]); > + > + /* > + * The following is basically the equivalent of > + * ''hw_resend_irq''. Just like a real IO-APIC we ''lose > + * the interrupt edge'' if the channel is masked. > + */ > + if (sync_test_bit(port, &evtchn_pending[0]) && > + !sync_test_and_set_bit(l2bit, > + &per_cpu(evtchn_sel_l2, cpu)[0]) && > + !sync_test_and_set_bit(l1bit, > + &vcpu_info->evtchn_pending_sel)) > + vcpu_info->evtchn_upcall_pending = 1; > +} > + > static void unmask_evtchn(int port) > { > unsigned int cpu = get_cpu(); > @@ -411,6 +439,9 @@ static void unmask_evtchn(int port) > case 2: > __unmask_local_port_l2(port); > break; > + case 3: > + __unmask_local_port_l3(port); > + break; > default: > BUG(); > } > @@ -1185,6 +1216,7 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) > } > > static irqreturn_t xen_debug_interrupt_l2(int irq, void *dev_id); > +static irqreturn_t xen_debug_interrupt_l3(int irq, void *dev_id); > > irqreturn_t xen_debug_interrupt(int irq, void *dev_id) > { > @@ -1215,6 +1247,9 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) > case 2: > rc = xen_debug_interrupt_l2(irq, dev_id); > break; > + case 3: > + rc = xen_debug_interrupt_l3(irq, dev_id); > + break; > default: > BUG(); > } > @@ -1285,8 +1320,109 @@ static irqreturn_t xen_debug_interrupt_l2(int irq, void *dev_id) > return IRQ_HANDLED; > } > > +static irqreturn_t xen_debug_interrupt_l3(int irq, void *dev_id) > +{ > + int cpu = smp_processor_id(); > + unsigned long *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); > + unsigned long nr_elems = NR_EVENT_CHANNELS_L3 / BITS_PER_LONG; > + int i; > + struct vcpu_info *v; > + > + v = per_cpu(xen_vcpu, cpu); > + > + printk(KERN_DEBUG "\npending (only show words which have bits set to 1):\n "); > + for (i = nr_elems-1; i >= 0; i--) > + if (evtchn_pending[i] != 0UL) { > + printk(KERN_DEBUG " word index %d %0*lx\n", > + i, > + (int)sizeof(evtchn_pending[0])*2, > + evtchn_pending[i]); > + } > + > + printk(KERN_DEBUG "\nglobal mask (only show words which have bits set to 0):\n "); > + for (i = nr_elems-1; i >= 0; i--) > + if (evtchn_mask[i] != ~0UL) { > + printk(KERN_DEBUG " word index %d %0*lx\n", > + i, > + (int)sizeof(evtchn_mask[0])*2, > + evtchn_mask[i]); > + } > + > + printk(KERN_DEBUG "\nglobally unmasked (only show result words which have bits set to 1):\n "); > + for (i = nr_elems-1; i >= 0; i--) > + if ((evtchn_pending[i] & ~evtchn_mask[i]) != 0UL) { > + printk(KERN_DEBUG " word index %d %0*lx\n", > + i, > + (int)(sizeof(evtchn_mask[0])*2), > + evtchn_pending[i] & ~evtchn_mask[i]); > + } > + > + printk(KERN_DEBUG "\nlocal cpu%d mask (only show words which have bits set to 1):\n ", cpu); > + for (i = (NR_EVENT_CHANNELS_L3/BITS_PER_LONG)-1; i >= 0; i--) > + if (cpu_evtchn[i] != 0UL) { > + printk(KERN_DEBUG " word index %d %0*lx\n", > + i, > + (int)(sizeof(cpu_evtchn[0])*2), > + cpu_evtchn[i]); > + } > + > + printk(KERN_DEBUG "\nlocally unmasked (only show result words which have bits set to 1):\n "); > + for (i = nr_elems-1; i >= 0; i--) { > + unsigned long pending = evtchn_pending[i] > + & ~evtchn_mask[i] > + & cpu_evtchn[i]; > + if (pending != 0UL) { > + printk(KERN_DEBUG " word index %d %0*lx\n", > + i, > + (int)(sizeof(evtchn_mask[0])*2), > + pending); > + } > + } > + > + printk(KERN_DEBUG "\npending list:\n"); > + for (i = 0; i < NR_EVENT_CHANNELS_L3; i++) { > + if (sync_test_bit(i, evtchn_pending)) { > + int word_idx = i / (BITS_PER_LONG * BITS_PER_LONG); > + int word_idx_l2 = i / BITS_PER_LONG; > + printk(KERN_DEBUG " %d: event %d -> irq %d%s%s%s%s\n", > + cpu_from_evtchn(i), i, > + evtchn_to_irq[i], > + !sync_test_bit(word_idx, &v->evtchn_pending_sel) > + ? "" : " l1-clear", > + !sync_test_bit(word_idx_l2, per_cpu(evtchn_sel_l2, cpu)) > + ? "" : " l2-clear", > + sync_test_bit(i, evtchn_mask) > + ? "" : " globally-masked", > + sync_test_bit(i, cpu_evtchn) > + ? "" : " locally-masked"); > + } > + } > + > + return IRQ_HANDLED;Um, there has to be a way to fold the most common cases of the L2 and L3 of this function in one?> +} > + > +/* The following per-cpu variables are used to save current state of event > + * processing loop. > + * > + * 2-level event channel: > + * current_word_idx is the bit index in L1 selector indicating the currently > + * processing word in shared bitmap. > + * current_bit_idx is the bit index in the currently processing word in shared > + * bitmap. > + * N.B. current_word_idx_l2 is not used. > + * > + * 3-level event channel: > + * current_word_idx is the bit index in L1 selector indicating the currently > + * processing word in L2 selector. > + * current_word_idx_l2 is the bit index in L2 selector word indicating the > + * currently processing word in shared bitmap. > + * current_bit_idx is the bit index in the currently processing word in shared > + * bitmap. > + * > + */ > static DEFINE_PER_CPU(unsigned, xed_nesting_count); > static DEFINE_PER_CPU(unsigned int, current_word_idx); > +static DEFINE_PER_CPU(unsigned int, current_word_idx_l2); > static DEFINE_PER_CPU(unsigned int, current_bit_idx); > > /* > @@ -1409,6 +1545,155 @@ out: > put_cpu(); > } > > +/* > + * In the 3-level event channel implementation, the first level is a > + * bitset of words which contain pending bits in the second level. > + * The second level is another bitsets which contain pending bits in > + * the third level. The third level is a bit set of pending events > + * themselves. > + */ > +static void __xen_evtchn_do_upcall_l3(void) > +{ > + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); > + unsigned count; > + int start_word_idx_l1, start_word_idx_l2, start_bit_idx; > + int word_idx_l1, word_idx_l2, bit_idx; > + int i, j; > + int cpu = get_cpu(); > + > + do { > + unsigned long pending_words_l1; > + > + vcpu_info->evtchn_upcall_pending = 0; > + > + if (__this_cpu_inc_return(xed_nesting_count) - 1) > + goto out; > +#ifndef CONFIG_X86 > + /* No need for a barrier -- XCHG is a barrier on x86. */ > + /* Clear master flag /before/ clearing selector flag. */ > + wmb(); > +#endif > + /* here we get l1 pending selector */ > + pending_words_l1 = xchg(&vcpu_info->evtchn_pending_sel, 0); > + > + start_word_idx_l1 = __this_cpu_read(current_word_idx); > + start_word_idx_l2 = __this_cpu_read(current_word_idx_l2); > + start_bit_idx = __this_cpu_read(current_bit_idx); > + > + word_idx_l1 = start_word_idx_l1; > + > + /* loop through l1, try to pick up l2 */ > + for (i = 0; pending_words_l1 != 0; i++) { > + unsigned long words_l1; > + unsigned long pending_words_l2; > + > + words_l1 = MASK_LSBS(pending_words_l1, word_idx_l1); > + > + if (words_l1 == 0) { > + word_idx_l1 = 0; > + start_word_idx_l2 = 0; > + continue; > + } > + > + word_idx_l1 = __ffs(words_l1); > + > + pending_words_l2 > + xchg(&per_cpu(evtchn_sel_l2, cpu)[word_idx_l1], > + 0); > + > + word_idx_l2 = 0; > + if (word_idx_l1 == start_word_idx_l1) { > + if (i == 0) > + word_idx_l2 = start_word_idx_l2; > + else > + word_idx_l2 &= (1UL << start_word_idx_l2) - 1; > + } > + > + for (j = 0; pending_words_l2 != 0; j++) { > + unsigned long pending_bits; > + unsigned long words_l2; > + unsigned long idx; > + > + words_l2 = MASK_LSBS(pending_words_l2, > + word_idx_l2); > + > + if (words_l2 == 0) { > + word_idx_l2 = 0; > + bit_idx = 0; > + continue; > + } > + > + word_idx_l2 = __ffs(words_l2); > + > + idx = word_idx_l1*BITS_PER_LONG+word_idx_l2; > + pending_bits > + active_evtchns(cpu, idx); > + > + bit_idx = 0; > + if (word_idx_l2 == start_word_idx_l2) { > + if (j == 0) > + bit_idx = start_bit_idx; > + else > + bit_idx &= (1UL<<start_bit_idx)-1; > + } > + > + /* process port */ > + do { > + unsigned long bits; > + int port, irq; > + struct irq_desc *desc; > + > + bits = MASK_LSBS(pending_bits, bit_idx); > + > + if (bits == 0) > + break; > + > + bit_idx = __ffs(bits); > + > + port = (word_idx_l1 << (LONG_BITORDER << 1)) + > + (word_idx_l2 << LONG_BITORDER) + > + bit_idx; > + > + irq = evtchn_to_irq[port]; > + > + if (irq != -1) { > + desc = irq_to_desc(irq); > + if (desc) > + generic_handle_irq_desc(irq, desc); > + } > + > + bit_idx = (bit_idx + 1) % BITS_PER_LONG; > + > + __this_cpu_write(current_bit_idx, bit_idx); > + __this_cpu_write(current_word_idx_l2, > + bit_idx ? word_idx_l2 : > + (word_idx_l2+1) % BITS_PER_LONG); > + __this_cpu_write(current_word_idx_l2, > + word_idx_l2 ? word_idx_l1 : > + (word_idx_l1+1) % BITS_PER_LONG); > + } while (bit_idx != 0); > + > + if ((word_idx_l2 != start_word_idx_l2) || (j != 0)) > + pending_words_l2 &= ~(1UL << word_idx_l2); > + > + word_idx_l2 = (word_idx_l2 + 1) % BITS_PER_LONG; > + }This is a bit of a complex code. Is there any way you can split this up in smaller inline functions?> + > + if ((word_idx_l1 != start_word_idx_l1) || (i != 0)) > + pending_words_l1 &= ~(1UL << word_idx_l1); > + > + word_idx_l1 = (word_idx_l1 + 1) % BITS_PER_LONG; > + } > + > + BUG_ON(!irqs_disabled()); > + count = __this_cpu_read(xed_nesting_count); > + __this_cpu_write(xed_nesting_count, 0); > + } while (count != 1 || vcpu_info->evtchn_upcall_pending); > + > +out: > + put_cpu(); > +} > + > void xen_evtchn_do_upcall(struct pt_regs *regs) > { > struct pt_regs *old_regs = set_irq_regs(regs); > @@ -1420,6 +1705,9 @@ void xen_evtchn_do_upcall(struct pt_regs *regs) > case 2: > __xen_evtchn_do_upcall_l2(); > break; > + case 3: > + __xen_evtchn_do_upcall_l3(); > + break; > default: > BUG(); > } > @@ -1434,6 +1722,9 @@ void xen_hvm_evtchn_do_upcall(void) > case 2: > __xen_evtchn_do_upcall_l2(); > break; > + case 3: > + __xen_evtchn_do_upcall_l3(); > + break; > default: > BUG(); > } > -- > 1.7.10.4 >
Konrad Rzeszutek Wilk
2013-Feb-05 17:14 UTC
Re: [PATCH 10/13] xen: introduce xen_event_channel_register_3level
On Tue, Feb 05, 2013 at 05:05:00PM +0000, Wei Liu wrote:> On Tue, 2013-02-05 at 16:55 +0000, Konrad Rzeszutek Wilk wrote: > > > + > > > + rc = HYPERVISOR_event_channel_op(EVTCHNOP_register_nlevel, ®); > > > + if (rc) { > > > + free_pages((unsigned long)evtchn_pending, BITMAP_NR_PAGES); > > > + free_pages((unsigned long)evtchn_mask, BITMAP_NR_PAGES); > > > + evtchn_pending = NULL; > > > + evtchn_mask = NULL; > > > + } > > > + > > > +out: > > > + kfree(_evtchn_pending); > > > + kfree(_evtchn_mask); > > > + kfree(l2sel_mfns); > > > + kfree(l2sel_offsets); > > > > So it is OK to just free it even on success?? > > Yes. They are only used for registration.OK. Might want to provide a comment saying that right above the __ALLOC_PAGES. Hm, might even call the macro ''ALLOC_TMP_PAGES''> > And for all the above stuffs regarding comments, I will fix them in > later post. > > > Wei. > >
Konrad Rzeszutek Wilk
2013-Feb-05 17:19 UTC
Re: [PATCH 07/13] xen: generalized event channel operations
On Tue, Feb 05, 2013 at 05:08:51PM +0000, Wei Liu wrote:> On Tue, 2013-02-05 at 17:04 +0000, Konrad Rzeszutek Wilk wrote: > > On Thu, Jan 31, 2013 at 02:47:01PM +0000, Wei Liu wrote: > > > Use global pointers in common operations to allow for better code sharing > > > between N-level event channel. > > > > > > Functions which are not suitable for sharing are also taken care of. > > > > > > Also update drivers/xen/evtchn.c to use exported variable instead of macro. > > > > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > > --- > > > drivers/xen/events.c | 180 ++++++++++++++++++++++++++++++++------------------ > > > drivers/xen/evtchn.c | 12 ++-- > > > include/xen/events.h | 2 + > > > 3 files changed, 123 insertions(+), 71 deletions(-) > > > > > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > > > index 0679d27..4820a52 100644 > > > --- a/drivers/xen/events.c > > > +++ b/drivers/xen/events.c > > > @@ -51,6 +51,16 @@ > > > #include <xen/interface/hvm/hvm_op.h> > > > #include <xen/interface/hvm/params.h> > > > > > > +/* N-level event channel, starting from 2 */ > > > +unsigned int evtchn_level = 2; > > > > What if the hypervisor does not support that? Shouldn''t be by default > > at 1? > > > > the default implementation is 2 level. ;-)I am looking at evtchn_pending and evtchn_mask in the ''struct shared_info'' (include/xen/interface/xen.h) and it is not obvious from that. What am I missing?> > > > +EXPORT_SYMBOL_GPL(evtchn_level); > > > > Prefix it please with ''xen'' > > > > > +unsigned int nr_event_channels; > > > +EXPORT_SYMBOL_GPL(nr_event_channels); > > > > Ditto here. > > > > NP. > > > Wei. > >
On Tue, 2013-02-05 at 17:19 +0000, Konrad Rzeszutek Wilk wrote:> On Tue, Feb 05, 2013 at 05:08:51PM +0000, Wei Liu wrote: > > On Tue, 2013-02-05 at 17:04 +0000, Konrad Rzeszutek Wilk wrote: > > > On Thu, Jan 31, 2013 at 02:47:01PM +0000, Wei Liu wrote: > > > > Use global pointers in common operations to allow for better code sharing > > > > between N-level event channel. > > > > > > > > Functions which are not suitable for sharing are also taken care of. > > > > > > > > Also update drivers/xen/evtchn.c to use exported variable instead of macro. > > > > > > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > > > --- > > > > drivers/xen/events.c | 180 ++++++++++++++++++++++++++++++++------------------ > > > > drivers/xen/evtchn.c | 12 ++-- > > > > include/xen/events.h | 2 + > > > > 3 files changed, 123 insertions(+), 71 deletions(-) > > > > > > > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > > > > index 0679d27..4820a52 100644 > > > > --- a/drivers/xen/events.c > > > > +++ b/drivers/xen/events.c > > > > @@ -51,6 +51,16 @@ > > > > #include <xen/interface/hvm/hvm_op.h> > > > > #include <xen/interface/hvm/params.h> > > > > > > > > +/* N-level event channel, starting from 2 */ > > > > +unsigned int evtchn_level = 2; > > > > > > What if the hypervisor does not support that? Shouldn''t be by default > > > at 1? > > > > > > > the default implementation is 2 level. ;-) > > I am looking at evtchn_pending and evtchn_mask in the ''struct > shared_info'' (include/xen/interface/xen.h) and it is not obvious from > that. What am I missing? >To avoid scanning the whole bitmap when doing upcall, every vcpu is equipped with a selector - embedded in struct vcpu_info. So it is a 2-level event lookup path. Wei
On Tue, 2013-02-05 at 17:00 +0000, Konrad Rzeszutek Wilk wrote:> On Thu, Jan 31, 2013 at 02:46:58PM +0000, Wei Liu wrote: > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > --- > > include/xen/interface/event_channel.h | 33 +++++++++++++++++++++++++++++++++ > > include/xen/interface/xen.h | 9 ++++++++- > > 2 files changed, 41 insertions(+), 1 deletion(-) > > > > diff --git a/include/xen/interface/event_channel.h b/include/xen/interface/event_channel.h > > index f494292..9d8b9e7 100644 > > --- a/include/xen/interface/event_channel.h > > +++ b/include/xen/interface/event_channel.h > > @@ -190,6 +190,39 @@ struct evtchn_reset { > > }; > > typedef struct evtchn_reset evtchn_reset_t; > > > > +/* > > + * EVTCHNOP_register_nlevel: Register N-level event channel > > + * NOTES: > > + * 1. Currently only 3-level is supported. > > + * 2. Should fall back to 2-level if this call fails. > > + */ > > +#define EVTCHNOP_register_nlevel 11 > > +/* 64 bit guests need 8 pages for evtchn_pending and evtchn_mask for > > + * 256k event channels while 32 bit ones only need 1 page for 32k > > + * event channels. */ > > +#define EVTCHN_MAX_L3_PAGES 8 > > +struct evtchn_register_3level { > > + /* IN parameters. */ > > + uint32_t nr_pages; /* for evtchn_{pending,mask} */ > > + uint32_t nr_vcpus; /* for l2sel_{mfns,mask} */ > > + GUEST_HANDLE(ulong) evtchn_pending; > > + GUEST_HANDLE(ulong) evtchn_mask; > > + GUEST_HANDLE(ulong) l2sel_mfns; > > + GUEST_HANDLE(ulong) l2sel_offsets; > > +}; > > +typedef struct evtchn_register_3level evtchn_register_3level_t; > > +DEFINE_GUEST_HANDLE(evtchn_register_3level_t); > > + > > +struct evtchn_register_nlevel { > > + /* IN parameters. */ > > + uint32_t level; > > + union { > > + evtchn_register_3level_t l3; > > + } u; > > +}; > > +typedef struct evtchn_register_nlevel evtchn_register_nlevel_t; > > +DEFINE_GUEST_HANDLE(evtchn_register_nlevel_t); > > + > > struct evtchn_op { > > uint32_t cmd; /* EVTCHNOP_* */ > > union { > > diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h > > index a890804..5220e33 100644 > > --- a/include/xen/interface/xen.h > > +++ b/include/xen/interface/xen.h > > @@ -283,9 +283,16 @@ DEFINE_GUEST_HANDLE_STRUCT(multicall_entry); > > > > /* > > * Event channel endpoints per domain: > > + * 2-level: > > * 1024 if a long is 32 bits; 4096 if a long is 64 bits. > > + * 3-level: > > + * 32k if a long is 32 bits; 256k if a long is 64 bits. > > */ > > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * sizeof(unsigned long) * 64) > > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * sizeof(unsigned long) * 64) > > We did a bit of header change to make the ARM code always use ''unsinged > long long'' on 32-bit (so it is a nice 8-bytes) and ''unsigned long'' on > 64-bit. This was all done using the xen_pfn_t and xen_mfn_t typdefs. > > Any chance you can do that here? That way on ARM - irregardless if it is > 32-bit or 64-bit it is always of the 64-bit size? >I don''t think so. The reason to use unsigned long here is to guarantee each selector (in 2-level case there is only L1 selector) fits into a word.> Or do we not care about the ARM case here? >What do you mean? How would this definition affect ARM case? Wei.> > +#define NR_EVENT_CHANNELS_L3 (NR_EVENT_CHANNELS_L2 * 64) > > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) > > +#define NR_EVENT_CHANNELS NR_EVENT_CHANNELS_L2 /* for compatibility */ > > +#endif > > > > struct vcpu_time_info { > > /* > > -- > > 1.7.10.4 > >
Wei Liu
2013-Feb-05 17:39 UTC
Re: [PATCH 09/13] xen: implement 3-level event channel routines
On Tue, 2013-02-05 at 17:09 +0000, Konrad Rzeszutek Wilk wrote:> > > > +static irqreturn_t xen_debug_interrupt_l3(int irq, void *dev_id) > > +{ > > + int cpu = smp_processor_id(); > > + unsigned long *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); > > + unsigned long nr_elems = NR_EVENT_CHANNELS_L3 / BITS_PER_LONG; > > + int i; > > + struct vcpu_info *v; > > + > > + v = per_cpu(xen_vcpu, cpu); > > + > > + printk(KERN_DEBUG "\npending (only show words which have bits set to 1):\n "); > > + for (i = nr_elems-1; i >= 0; i--) > > + if (evtchn_pending[i] != 0UL) { > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > + i, > > + (int)sizeof(evtchn_pending[0])*2, > > + evtchn_pending[i]); > > + } > > + > > + printk(KERN_DEBUG "\nglobal mask (only show words which have bits set to 0):\n "); > > + for (i = nr_elems-1; i >= 0; i--) > > + if (evtchn_mask[i] != ~0UL) { > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > + i, > > + (int)sizeof(evtchn_mask[0])*2, > > + evtchn_mask[i]); > > + } > > + > > + printk(KERN_DEBUG "\nglobally unmasked (only show result words which have bits set to 1):\n "); > > + for (i = nr_elems-1; i >= 0; i--) > > + if ((evtchn_pending[i] & ~evtchn_mask[i]) != 0UL) { > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > + i, > > + (int)(sizeof(evtchn_mask[0])*2), > > + evtchn_pending[i] & ~evtchn_mask[i]); > > + } > > + > > + printk(KERN_DEBUG "\nlocal cpu%d mask (only show words which have bits set to 1):\n ", cpu); > > + for (i = (NR_EVENT_CHANNELS_L3/BITS_PER_LONG)-1; i >= 0; i--) > > + if (cpu_evtchn[i] != 0UL) { > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > + i, > > + (int)(sizeof(cpu_evtchn[0])*2), > > + cpu_evtchn[i]); > > + } > > + > > + printk(KERN_DEBUG "\nlocally unmasked (only show result words which have bits set to 1):\n "); > > + for (i = nr_elems-1; i >= 0; i--) { > > + unsigned long pending = evtchn_pending[i] > > + & ~evtchn_mask[i] > > + & cpu_evtchn[i]; > > + if (pending != 0UL) { > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > + i, > > + (int)(sizeof(evtchn_mask[0])*2), > > + pending); > > + } > > + } > > + > > + printk(KERN_DEBUG "\npending list:\n"); > > + for (i = 0; i < NR_EVENT_CHANNELS_L3; i++) { > > + if (sync_test_bit(i, evtchn_pending)) { > > + int word_idx = i / (BITS_PER_LONG * BITS_PER_LONG); > > + int word_idx_l2 = i / BITS_PER_LONG; > > + printk(KERN_DEBUG " %d: event %d -> irq %d%s%s%s%s\n", > > + cpu_from_evtchn(i), i, > > + evtchn_to_irq[i], > > + !sync_test_bit(word_idx, &v->evtchn_pending_sel) > > + ? "" : " l1-clear", > > + !sync_test_bit(word_idx_l2, per_cpu(evtchn_sel_l2, cpu)) > > + ? "" : " l2-clear", > > + sync_test_bit(i, evtchn_mask) > > + ? "" : " globally-masked", > > + sync_test_bit(i, cpu_evtchn) > > + ? "" : " locally-masked"); > > + } > > + } > > + > > + return IRQ_HANDLED; > > Um, there has to be a way to fold the most common cases of the L2 and L3 > of this function in one? >The only common part is the for-loop. I could try to move statements in for-loops to dedicated functions, then the file will be filled with 10+ functions like: void output_l{2,3}_{globally,locally}_{masked,unmasked}() void output_l{2,3}_pending() void print_heading_for_l{2,3}()> > + pending_words_l2 &= ~(1UL << word_idx_l2); > > + > > + word_idx_l2 = (word_idx_l2 + 1) % BITS_PER_LONG; > > + } > > This is a bit of a complex code. Is there any way you can split this up > in smaller inline functions?I will try. Wei.
Konrad Rzeszutek Wilk
2013-Feb-05 19:44 UTC
Re: [PATCH 07/13] xen: generalized event channel operations
On Tue, Feb 05, 2013 at 05:23:21PM +0000, Wei Liu wrote:> On Tue, 2013-02-05 at 17:19 +0000, Konrad Rzeszutek Wilk wrote: > > On Tue, Feb 05, 2013 at 05:08:51PM +0000, Wei Liu wrote: > > > On Tue, 2013-02-05 at 17:04 +0000, Konrad Rzeszutek Wilk wrote: > > > > On Thu, Jan 31, 2013 at 02:47:01PM +0000, Wei Liu wrote: > > > > > Use global pointers in common operations to allow for better code sharing > > > > > between N-level event channel. > > > > > > > > > > Functions which are not suitable for sharing are also taken care of. > > > > > > > > > > Also update drivers/xen/evtchn.c to use exported variable instead of macro. > > > > > > > > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > > > > --- > > > > > drivers/xen/events.c | 180 ++++++++++++++++++++++++++++++++------------------ > > > > > drivers/xen/evtchn.c | 12 ++-- > > > > > include/xen/events.h | 2 + > > > > > 3 files changed, 123 insertions(+), 71 deletions(-) > > > > > > > > > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > > > > > index 0679d27..4820a52 100644 > > > > > --- a/drivers/xen/events.c > > > > > +++ b/drivers/xen/events.c > > > > > @@ -51,6 +51,16 @@ > > > > > #include <xen/interface/hvm/hvm_op.h> > > > > > #include <xen/interface/hvm/params.h> > > > > > > > > > > +/* N-level event channel, starting from 2 */ > > > > > +unsigned int evtchn_level = 2; > > > > > > > > What if the hypervisor does not support that? Shouldn''t be by default > > > > at 1? > > > > > > > > > > the default implementation is 2 level. ;-) > > > > I am looking at evtchn_pending and evtchn_mask in the ''struct > > shared_info'' (include/xen/interface/xen.h) and it is not obvious from > > that. What am I missing? > > > > To avoid scanning the whole bitmap when doing upcall, every vcpu is > equipped with a selector - embedded in struct vcpu_info. > > So it is a 2-level event lookup path. >Ah, OK. I was thinking of the normal 2 level lookup - like a page-table. Could you include the explanation in the patch please? Thank you.> > Wei >
Konrad Rzeszutek Wilk
2013-Feb-05 19:46 UTC
Re: [PATCH 09/13] xen: implement 3-level event channel routines
On Tue, Feb 05, 2013 at 05:39:02PM +0000, Wei Liu wrote:> On Tue, 2013-02-05 at 17:09 +0000, Konrad Rzeszutek Wilk wrote: > > > > > > +static irqreturn_t xen_debug_interrupt_l3(int irq, void *dev_id) > > > +{ > > > + int cpu = smp_processor_id(); > > > + unsigned long *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); > > > + unsigned long nr_elems = NR_EVENT_CHANNELS_L3 / BITS_PER_LONG; > > > + int i; > > > + struct vcpu_info *v; > > > + > > > + v = per_cpu(xen_vcpu, cpu); > > > + > > > + printk(KERN_DEBUG "\npending (only show words which have bits set to 1):\n "); > > > + for (i = nr_elems-1; i >= 0; i--) > > > + if (evtchn_pending[i] != 0UL) { > > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > > + i, > > > + (int)sizeof(evtchn_pending[0])*2, > > > + evtchn_pending[i]); > > > + } > > > + > > > + printk(KERN_DEBUG "\nglobal mask (only show words which have bits set to 0):\n "); > > > + for (i = nr_elems-1; i >= 0; i--) > > > + if (evtchn_mask[i] != ~0UL) { > > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > > + i, > > > + (int)sizeof(evtchn_mask[0])*2, > > > + evtchn_mask[i]); > > > + } > > > + > > > + printk(KERN_DEBUG "\nglobally unmasked (only show result words which have bits set to 1):\n "); > > > + for (i = nr_elems-1; i >= 0; i--) > > > + if ((evtchn_pending[i] & ~evtchn_mask[i]) != 0UL) { > > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > > + i, > > > + (int)(sizeof(evtchn_mask[0])*2), > > > + evtchn_pending[i] & ~evtchn_mask[i]); > > > + } > > > + > > > + printk(KERN_DEBUG "\nlocal cpu%d mask (only show words which have bits set to 1):\n ", cpu); > > > + for (i = (NR_EVENT_CHANNELS_L3/BITS_PER_LONG)-1; i >= 0; i--) > > > + if (cpu_evtchn[i] != 0UL) { > > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > > + i, > > > + (int)(sizeof(cpu_evtchn[0])*2), > > > + cpu_evtchn[i]); > > > + } > > > + > > > + printk(KERN_DEBUG "\nlocally unmasked (only show result words which have bits set to 1):\n "); > > > + for (i = nr_elems-1; i >= 0; i--) { > > > + unsigned long pending = evtchn_pending[i] > > > + & ~evtchn_mask[i] > > > + & cpu_evtchn[i]; > > > + if (pending != 0UL) { > > > + printk(KERN_DEBUG " word index %d %0*lx\n", > > > + i, > > > + (int)(sizeof(evtchn_mask[0])*2), > > > + pending); > > > + } > > > + } > > > + > > > + printk(KERN_DEBUG "\npending list:\n"); > > > + for (i = 0; i < NR_EVENT_CHANNELS_L3; i++) { > > > + if (sync_test_bit(i, evtchn_pending)) { > > > + int word_idx = i / (BITS_PER_LONG * BITS_PER_LONG); > > > + int word_idx_l2 = i / BITS_PER_LONG; > > > + printk(KERN_DEBUG " %d: event %d -> irq %d%s%s%s%s\n", > > > + cpu_from_evtchn(i), i, > > > + evtchn_to_irq[i], > > > + !sync_test_bit(word_idx, &v->evtchn_pending_sel) > > > + ? "" : " l1-clear", > > > + !sync_test_bit(word_idx_l2, per_cpu(evtchn_sel_l2, cpu)) > > > + ? "" : " l2-clear", > > > + sync_test_bit(i, evtchn_mask) > > > + ? "" : " globally-masked", > > > + sync_test_bit(i, cpu_evtchn) > > > + ? "" : " locally-masked"); > > > + } > > > + } > > > + > > > + return IRQ_HANDLED; > > > > Um, there has to be a way to fold the most common cases of the L2 and L3 > > of this function in one? > > > > The only common part is the for-loop. I could try to move statements in > for-loops to dedicated functions, then the file will be filled with 10+ > functions like: > > void output_l{2,3}_{globally,locally}_{masked,unmasked}() > > void output_l{2,3}_pending() > > void print_heading_for_l{2,3}()Can it be via macros? Like arch/x86/mm/dump_pagetables.c has it for the pagetable walker?> > > > > + pending_words_l2 &= ~(1UL << word_idx_l2); > > > + > > > + word_idx_l2 = (word_idx_l2 + 1) % BITS_PER_LONG; > > > + } > > > > This is a bit of a complex code. Is there any way you can split this up > > in smaller inline functions? > > I will try. > > > Wei. > >
Konrad Rzeszutek Wilk
2013-Feb-06 15:49 UTC
Re: [PATCH 02/13] xen: fix error handling path if xen_allocate_irq_dynamic fails
On Thu, Jan 31, 2013 at 02:46:56PM +0000, Wei Liu wrote:> It is possible that the call to xen_allocate_irq_dynamic() returns negative > number other than -1.Applied.> > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > drivers/xen/events.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 2c94aad..93a3648 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -820,7 +820,7 @@ int bind_evtchn_to_irq(unsigned int evtchn) > > if (irq == -1) { > irq = xen_allocate_irq_dynamic(); > - if (irq == -1) > + if (irq < 0) > goto out; > > irq_set_chip_and_handler_name(irq, &xen_dynamic_chip, > @@ -923,7 +923,7 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu) > > if (irq == -1) { > irq = xen_allocate_irq_dynamic(); > - if (irq == -1) > + if (irq < 0) > goto out; > > irq_set_chip_and_handler_name(irq, &xen_percpu_chip, > -- > 1.7.10.4 >
On Tue, Feb 05, 2013 at 05:23:40PM +0000, Wei Liu wrote:> On Tue, 2013-02-05 at 17:00 +0000, Konrad Rzeszutek Wilk wrote: > > On Thu, Jan 31, 2013 at 02:46:58PM +0000, Wei Liu wrote: > > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > > > --- > > > include/xen/interface/event_channel.h | 33 +++++++++++++++++++++++++++++++++ > > > include/xen/interface/xen.h | 9 ++++++++- > > > 2 files changed, 41 insertions(+), 1 deletion(-) > > > > > > diff --git a/include/xen/interface/event_channel.h b/include/xen/interface/event_channel.h > > > index f494292..9d8b9e7 100644 > > > --- a/include/xen/interface/event_channel.h > > > +++ b/include/xen/interface/event_channel.h > > > @@ -190,6 +190,39 @@ struct evtchn_reset { > > > }; > > > typedef struct evtchn_reset evtchn_reset_t; > > > > > > +/* > > > + * EVTCHNOP_register_nlevel: Register N-level event channel > > > + * NOTES: > > > + * 1. Currently only 3-level is supported. > > > + * 2. Should fall back to 2-level if this call fails. > > > + */ > > > +#define EVTCHNOP_register_nlevel 11 > > > +/* 64 bit guests need 8 pages for evtchn_pending and evtchn_mask for > > > + * 256k event channels while 32 bit ones only need 1 page for 32k > > > + * event channels. */ > > > +#define EVTCHN_MAX_L3_PAGES 8 > > > +struct evtchn_register_3level { > > > + /* IN parameters. */ > > > + uint32_t nr_pages; /* for evtchn_{pending,mask} */ > > > + uint32_t nr_vcpus; /* for l2sel_{mfns,mask} */ > > > + GUEST_HANDLE(ulong) evtchn_pending; > > > + GUEST_HANDLE(ulong) evtchn_mask; > > > + GUEST_HANDLE(ulong) l2sel_mfns; > > > + GUEST_HANDLE(ulong) l2sel_offsets; > > > +}; > > > +typedef struct evtchn_register_3level evtchn_register_3level_t;Please please not put them in. I know that there are some typdefs in there but they should actually be removed.> > > +DEFINE_GUEST_HANDLE(evtchn_register_3level_t); > > > + > > > +struct evtchn_register_nlevel { > > > + /* IN parameters. */ > > > + uint32_t level; > > > + union { > > > + evtchn_register_3level_t l3; > > > + } u; > > > +}; > > > +typedef struct evtchn_register_nlevel evtchn_register_nlevel_t; > > > +DEFINE_GUEST_HANDLE(evtchn_register_nlevel_t); > > > + > > > struct evtchn_op { > > > uint32_t cmd; /* EVTCHNOP_* */ > > > union { > > > diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h > > > index a890804..5220e33 100644 > > > --- a/include/xen/interface/xen.h > > > +++ b/include/xen/interface/xen.h > > > @@ -283,9 +283,16 @@ DEFINE_GUEST_HANDLE_STRUCT(multicall_entry); > > > > > > /* > > > * Event channel endpoints per domain: > > > + * 2-level: > > > * 1024 if a long is 32 bits; 4096 if a long is 64 bits. > > > + * 3-level: > > > + * 32k if a long is 32 bits; 256k if a long is 64 bits. > > > */ > > > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * sizeof(unsigned long) * 64) > > > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * sizeof(unsigned long) * 64) > > > > We did a bit of header change to make the ARM code always use ''unsinged > > long long'' on 32-bit (so it is a nice 8-bytes) and ''unsigned long'' on > > 64-bit. This was all done using the xen_pfn_t and xen_mfn_t typdefs. > > > > Any chance you can do that here? That way on ARM - irregardless if it is > > 32-bit or 64-bit it is always of the 64-bit size? > > > > I don''t think so. The reason to use unsigned long here is to guarantee > each selector (in 2-level case there is only L1 selector) fits into a > word.OK, so we do depend on this being of different size on 32-bit and 64-bit.> > > Or do we not care about the ARM case here? > > > > What do you mean? How would this definition affect ARM case? >Xen ARM uses 64-bit values even if the host/guest is 32-bit.> > Wei. > > > > +#define NR_EVENT_CHANNELS_L3 (NR_EVENT_CHANNELS_L2 * 64) > > > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) > > > +#define NR_EVENT_CHANNELS NR_EVENT_CHANNELS_L2 /* for compatibility */ > > > +#endif > > > > > > struct vcpu_time_info { > > > /* > > > -- > > > 1.7.10.4 > > > > >
> -----Original Message-----[snip]> > > > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * > > > > sizeof(unsigned long) * 64) > > > > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * > > > > +sizeof(unsigned long) * 64) > > > > > > We did a bit of header change to make the ARM code always use > > > ''unsinged long long'' on 32-bit (so it is a nice 8-bytes) and > > > ''unsigned long'' on 64-bit. This was all done using the xen_pfn_t and > xen_mfn_t typdefs. > > > > > > Any chance you can do that here? That way on ARM - irregardless if > > > it is 32-bit or 64-bit it is always of the 64-bit size? > > > > > > > I don''t think so. The reason to use unsigned long here is to guarantee > > each selector (in 2-level case there is only L1 selector) fits into a > > word. >That''s still going to be a problem with Windows drivers. Unfortunately MSVC uses a 64-bit model where longs are still 32-bit. The only thing that is word size is a pointer. Any chance we can use uintptr_t rather than an unsigned long? (At the moment I have to sed all the public headers to replace long with LONG_PTR and it''s a PITA). Paul
On Thu, 2013-02-07 at 09:22 +0000, Paul Durrant wrote:> > -----Original Message----- > [snip] > > > > > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * > > > > > sizeof(unsigned long) * 64) > > > > > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * > > > > > +sizeof(unsigned long) * 64) > > > > > > > > We did a bit of header change to make the ARM code always use > > > > ''unsinged long long'' on 32-bit (so it is a nice 8-bytes) and > > > > ''unsigned long'' on 64-bit. This was all done using the xen_pfn_t and > > xen_mfn_t typdefs. > > > > > > > > Any chance you can do that here? That way on ARM - irregardless if > > > > it is 32-bit or 64-bit it is always of the 64-bit size? > > > > > > > > > > I don''t think so. The reason to use unsigned long here is to guarantee > > > each selector (in 2-level case there is only L1 selector) fits into a > > > word. > > > > That''s still going to be a problem with Windows drivers. Unfortunately MSVC uses a 64-bit model where longs are still 32-bit. The only thing that is word size is a pointer. Any chance we can use uintptr_t rather than an unsigned long? (At the moment I have to sed all the public headers to replace long with LONG_PTR and it''s a PITA). >TBH I don''t know much about Windows. But are you suggesting replace all the relevant bit in the header or just the specific event channel interface? Wei.> Paul
On Thu, 2013-02-07 at 09:22 +0000, Paul Durrant wrote:> > -----Original Message----- > [snip] > > > > > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * > > > > > sizeof(unsigned long) * 64) > > > > > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * > > > > > +sizeof(unsigned long) * 64) > > > > > > > > We did a bit of header change to make the ARM code always use > > > > ''unsinged long long'' on 32-bit (so it is a nice 8-bytes) and > > > > ''unsigned long'' on 64-bit. This was all done using the xen_pfn_t and > > xen_mfn_t typdefs. > > > > > > > > Any chance you can do that here? That way on ARM - irregardless if > > > > it is 32-bit or 64-bit it is always of the 64-bit size? > > > > > > > > > > I don''t think so. The reason to use unsigned long here is to guarantee > > > each selector (in 2-level case there is only L1 selector) fits into a > > > word. > > > > That''s still going to be a problem with Windows drivers. Unfortunately > MSVC uses a 64-bit model where longs are still 32-bit. The only thing > that is word size is a pointer. Any chance we can use uintptr_t rather > than an unsigned long? (At the moment I have to sed all the public > headers to replace long with LONG_PTR and it''s a PITA).It''s considered pretty normal these days that users of these headers import them and manipulate them to fit their environment (often by hand and manual synching). For example in the Linux copy we remove the struct typedefs because Linux coding style doesn''t use them. Ian.
> -----Original Message----- > From: Wei Liu [mailto:wei.liu2@citrix.com] > Sent: 07 February 2013 11:58 > To: Paul Durrant > Cc: Wei Liu; Konrad Rzeszutek Wilk; David Vrabel; Ian Campbell; > jbeulich@suse.com; xen-devel@lists.xen.org > Subject: Re: [Xen-devel] [PATCH 04/13] xen: sync public headers > > On Thu, 2013-02-07 at 09:22 +0000, Paul Durrant wrote: > > > -----Original Message----- > > [snip] > > > > > > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * > > > > > > sizeof(unsigned long) * 64) > > > > > > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * > > > > > > +sizeof(unsigned long) * 64) > > > > > > > > > > We did a bit of header change to make the ARM code always use > > > > > ''unsinged long long'' on 32-bit (so it is a nice 8-bytes) and > > > > > ''unsigned long'' on 64-bit. This was all done using the xen_pfn_t > > > > > and > > > xen_mfn_t typdefs. > > > > > > > > > > Any chance you can do that here? That way on ARM - irregardless > > > > > if it is 32-bit or 64-bit it is always of the 64-bit size? > > > > > > > > > > > > > I don''t think so. The reason to use unsigned long here is to > > > > guarantee each selector (in 2-level case there is only L1 > > > > selector) fits into a word. > > > > > > > That''s still going to be a problem with Windows drivers. Unfortunately > MSVC uses a 64-bit model where longs are still 32-bit. The only thing that is > word size is a pointer. Any chance we can use uintptr_t rather than an > unsigned long? (At the moment I have to sed all the public headers to > replace long with LONG_PTR and it''s a PITA). > > > > TBH I don''t know much about Windows. But are you suggesting replace all > the relevant bit in the header or just the specific event channel interface? >I was just pointing out that assumption that unsigned long == native word size does not hold for 64-bit Windows. So, if we''re adding something new can we avoid use of unsigned long and use an abstract type defined to be the native word size? Paul
On Fri, 2013-02-08 at 16:06 +0000, Paul Durrant wrote:> > -----Original Message----- > > From: Wei Liu [mailto:wei.liu2@citrix.com] > > Sent: 07 February 2013 11:58 > > To: Paul Durrant > > Cc: Wei Liu; Konrad Rzeszutek Wilk; David Vrabel; Ian Campbell; > > jbeulich@suse.com; xen-devel@lists.xen.org > > Subject: Re: [Xen-devel] [PATCH 04/13] xen: sync public headers > > > > On Thu, 2013-02-07 at 09:22 +0000, Paul Durrant wrote: > > > > -----Original Message----- > > > [snip] > > > > > > > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * > > > > > > > sizeof(unsigned long) * 64) > > > > > > > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * > > > > > > > +sizeof(unsigned long) * 64) > > > > > > > > > > > > We did a bit of header change to make the ARM code always use > > > > > > ''unsinged long long'' on 32-bit (so it is a nice 8-bytes) and > > > > > > ''unsigned long'' on 64-bit. This was all done using the xen_pfn_t > > > > > > and > > > > xen_mfn_t typdefs. > > > > > > > > > > > > Any chance you can do that here? That way on ARM - irregardless > > > > > > if it is 32-bit or 64-bit it is always of the 64-bit size? > > > > > > > > > > > > > > > > I don''t think so. The reason to use unsigned long here is to > > > > > guarantee each selector (in 2-level case there is only L1 > > > > > selector) fits into a word. > > > > > > > > > > That''s still going to be a problem with Windows drivers. Unfortunately > > MSVC uses a 64-bit model where longs are still 32-bit. The only thing that is > > word size is a pointer. Any chance we can use uintptr_t rather than an > > unsigned long? (At the moment I have to sed all the public headers to > > replace long with LONG_PTR and it''s a PITA). > > > > > > > TBH I don''t know much about Windows. But are you suggesting replace all > > the relevant bit in the header or just the specific event channel interface? > > > > I was just pointing out that assumption that unsigned long == native > word size does not hold for 64-bit Windows. So, if we''re adding > something new can we avoid use of unsigned long and use an abstract > type defined to be the native word size?Probably ought to be the existing xen_ulong_t. Ian.
> -----Original Message----- > From: Ian Campbell > Sent: 08 February 2013 16:23 > To: Paul Durrant > Cc: Wei Liu; Konrad Rzeszutek Wilk; David Vrabel; jbeulich@suse.com; xen- > devel@lists.xen.org > Subject: Re: [Xen-devel] [PATCH 04/13] xen: sync public headers > > On Fri, 2013-02-08 at 16:06 +0000, Paul Durrant wrote: > > > -----Original Message----- > > > From: Wei Liu [mailto:wei.liu2@citrix.com] > > > Sent: 07 February 2013 11:58 > > > To: Paul Durrant > > > Cc: Wei Liu; Konrad Rzeszutek Wilk; David Vrabel; Ian Campbell; > > > jbeulich@suse.com; xen-devel@lists.xen.org > > > Subject: Re: [Xen-devel] [PATCH 04/13] xen: sync public headers > > > > > > On Thu, 2013-02-07 at 09:22 +0000, Paul Durrant wrote: > > > > > -----Original Message----- > > > > [snip] > > > > > > > > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * > > > > > > > > sizeof(unsigned long) * 64) > > > > > > > > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * > > > > > > > > +sizeof(unsigned long) * 64) > > > > > > > > > > > > > > We did a bit of header change to make the ARM code always > > > > > > > use ''unsinged long long'' on 32-bit (so it is a nice 8-bytes) > > > > > > > and ''unsigned long'' on 64-bit. This was all done using the > > > > > > > xen_pfn_t and > > > > > xen_mfn_t typdefs. > > > > > > > > > > > > > > Any chance you can do that here? That way on ARM - > > > > > > > irregardless if it is 32-bit or 64-bit it is always of the 64-bit size? > > > > > > > > > > > > > > > > > > > I don''t think so. The reason to use unsigned long here is to > > > > > > guarantee each selector (in 2-level case there is only L1 > > > > > > selector) fits into a word. > > > > > > > > > > > > > That''s still going to be a problem with Windows drivers. > > > > Unfortunately > > > MSVC uses a 64-bit model where longs are still 32-bit. The only > > > thing that is word size is a pointer. Any chance we can use > > > uintptr_t rather than an unsigned long? (At the moment I have to sed > > > all the public headers to replace long with LONG_PTR and it''s a PITA). > > > > > > > > > > TBH I don''t know much about Windows. But are you suggesting replace > > > all the relevant bit in the header or just the specific event channel > interface? > > > > > > > I was just pointing out that assumption that unsigned long == native > > word size does not hold for 64-bit Windows. So, if we''re adding > > something new can we avoid use of unsigned long and use an abstract > > type defined to be the native word size? > > Probably ought to be the existing xen_ulong_t. >That works for me :-) Paul
On Fri, 2013-02-08 at 16:22 +0000, Ian Campbell wrote:> On Fri, 2013-02-08 at 16:06 +0000, Paul Durrant wrote: > > > -----Original Message----- > > > From: Wei Liu [mailto:wei.liu2@citrix.com] > > > Sent: 07 February 2013 11:58 > > > To: Paul Durrant > > > Cc: Wei Liu; Konrad Rzeszutek Wilk; David Vrabel; Ian Campbell; > > > jbeulich@suse.com; xen-devel@lists.xen.org > > > Subject: Re: [Xen-devel] [PATCH 04/13] xen: sync public headers > > > > > > On Thu, 2013-02-07 at 09:22 +0000, Paul Durrant wrote: > > > > > -----Original Message----- > > > > [snip] > > > > > > > > -#define NR_EVENT_CHANNELS (sizeof(unsigned long) * > > > > > > > > sizeof(unsigned long) * 64) > > > > > > > > +#define NR_EVENT_CHANNELS_L2 (sizeof(unsigned long) * > > > > > > > > +sizeof(unsigned long) * 64) > > > > > > > > > > > > > > We did a bit of header change to make the ARM code always use > > > > > > > ''unsinged long long'' on 32-bit (so it is a nice 8-bytes) and > > > > > > > ''unsigned long'' on 64-bit. This was all done using the xen_pfn_t > > > > > > > and > > > > > xen_mfn_t typdefs. > > > > > > > > > > > > > > Any chance you can do that here? That way on ARM - irregardless > > > > > > > if it is 32-bit or 64-bit it is always of the 64-bit size? > > > > > > > > > > > > > > > > > > > I don''t think so. The reason to use unsigned long here is to > > > > > > guarantee each selector (in 2-level case there is only L1 > > > > > > selector) fits into a word. > > > > > > > > > > > > > That''s still going to be a problem with Windows drivers. Unfortunately > > > MSVC uses a 64-bit model where longs are still 32-bit. The only thing that is > > > word size is a pointer. Any chance we can use uintptr_t rather than an > > > unsigned long? (At the moment I have to sed all the public headers to > > > replace long with LONG_PTR and it''s a PITA). > > > > > > > > > > TBH I don''t know much about Windows. But are you suggesting replace all > > > the relevant bit in the header or just the specific event channel interface? > > > > > > > I was just pointing out that assumption that unsigned long == native > > word size does not hold for 64-bit Windows. So, if we''re adding > > something new can we avoid use of unsigned long and use an abstract > > type defined to be the native word size? > > Probably ought to be the existing xen_ulong_t.AFAICT there is no typedef for xen_ulong_t in Linux, we need to add it manually then. Wei.> Ian. >
On Fri, 2013-02-08 at 16:37 +0000, Wei Liu wrote:> AFAICT there is no typedef for xen_ulong_t in Linux, we need to add it > manually then.There is from ~3.7 onwards, since we had to add it for ARM (where for ABI reason xen_ulong_t is 8 bytes, not 4 like unsigned long). Ian.
At 16:36 +0000 on 08 Feb (1360341398), Paul Durrant wrote:> > > > > > > I don''t think so. The reason to use unsigned long here is to > > > > > > > guarantee each selector (in 2-level case there is only L1 > > > > > > > selector) fits into a word. > > > > > > > > > > > > > > > > That''s still going to be a problem with Windows drivers. > > > > > Unfortunately > > > > MSVC uses a 64-bit model where longs are still 32-bit. The only > > > > thing that is word size is a pointer. Any chance we can use > > > > uintptr_t rather than an unsigned long? (At the moment I have to sed > > > > all the public headers to replace long with LONG_PTR and it''s a PITA). > > > > > > > > > > > > > TBH I don''t know much about Windows. But are you suggesting replace > > > > all the relevant bit in the header or just the specific event channel > > interface? > > > > > > > > > > I was just pointing out that assumption that unsigned long == native > > > word size does not hold for 64-bit Windows. So, if we''re adding > > > something new can we avoid use of unsigned long and use an abstract > > > type defined to be the native word size? > > > > Probably ought to be the existing xen_ulong_t. > > That works for me :-)Hmm. xen_ulong_t is a typedef for unsigned long on x86. Are you suggesting we change that? On the grounds that any platform where it would make a difference to the ABI has already had to adjust its headers? I think I can buy that. In any case, if we are defining up a new native-word-size type, it oughtn''t to be uintptr_t --- there are systems where pointers are not word-sized (e.g. x32) and we might want to support one of them at some point. Explicitly setting it to uint32_t/uint64_t also avoids any further problems with compiler-specific variations. Tim.
>>> On 08.02.13 at 17:49, Tim Deegan <tim@xen.org> wrote: > Hmm. xen_ulong_t is a typedef for unsigned long on x86. Are you > suggesting we change that? On the grounds that any platform where it > would make a difference to the ABI has already had to adjust its > headers? I think I can buy that.I don''t think we should be changing the underlying definition on x86. Jan
On Fri, 2013-02-08 at 16:49 +0000, Tim Deegan wrote:> At 16:36 +0000 on 08 Feb (1360341398), Paul Durrant wrote: > > > > > > > > I don''t think so. The reason to use unsigned long here is to > > > > > > > > guarantee each selector (in 2-level case there is only L1 > > > > > > > > selector) fits into a word. > > > > > > > > > > > > > > > > > > > That''s still going to be a problem with Windows drivers. > > > > > > Unfortunately > > > > > MSVC uses a 64-bit model where longs are still 32-bit. The only > > > > > thing that is word size is a pointer. Any chance we can use > > > > > uintptr_t rather than an unsigned long? (At the moment I have to sed > > > > > all the public headers to replace long with LONG_PTR and it''s a PITA). > > > > > > > > > > > > > > > > TBH I don''t know much about Windows. But are you suggesting replace > > > > > all the relevant bit in the header or just the specific event channel > > > interface? > > > > > > > > > > > > > I was just pointing out that assumption that unsigned long == native > > > > word size does not hold for 64-bit Windows. So, if we''re adding > > > > something new can we avoid use of unsigned long and use an abstract > > > > type defined to be the native word size? > > > > > > Probably ought to be the existing xen_ulong_t. > > > > That works for me :-) > > Hmm. xen_ulong_t is a typedef for unsigned long on x86. Are you > suggesting we change that?I don''t think that would be a good idea, but using xen_ulong_t does at least mean he only needs to change one place rather than sed''ing up the whole lot. Ian.
At 16:59 +0000 on 08 Feb (1360342741), Ian Campbell wrote:> On Fri, 2013-02-08 at 16:49 +0000, Tim Deegan wrote: > > At 16:36 +0000 on 08 Feb (1360341398), Paul Durrant wrote: > > > > > > > > > I don''t think so. The reason to use unsigned long here is to > > > > > > > > > guarantee each selector (in 2-level case there is only L1 > > > > > > > > > selector) fits into a word. > > > > > > > > > > > > > > > > > > > > > > That''s still going to be a problem with Windows drivers. > > > > > > > Unfortunately > > > > > > MSVC uses a 64-bit model where longs are still 32-bit. The only > > > > > > thing that is word size is a pointer. Any chance we can use > > > > > > uintptr_t rather than an unsigned long? (At the moment I have to sed > > > > > > all the public headers to replace long with LONG_PTR and it''s a PITA). > > > > > > > > > > > > > > > > > > > TBH I don''t know much about Windows. But are you suggesting replace > > > > > > all the relevant bit in the header or just the specific event channel > > > > interface? > > > > > > > > > > > > > > > > I was just pointing out that assumption that unsigned long == native > > > > > word size does not hold for 64-bit Windows. So, if we''re adding > > > > > something new can we avoid use of unsigned long and use an abstract > > > > > type defined to be the native word size? > > > > > > > > Probably ought to be the existing xen_ulong_t. > > > > > > That works for me :-) > > > > Hmm. xen_ulong_t is a typedef for unsigned long on x86. Are you > > suggesting we change that? > > I don''t think that would be a good idea, but using xen_ulong_t does at > least mean he only needs to change one place rather than sed''ing up the > whole lot.Ah, OK. But on arm32 it''s explicitly the wrong thing (i.e. bigger than a word). Is that just going to be part of the cost of using explicit sizes everywhere on arm? Tim.
> -----Original Message----- > From: Tim Deegan [mailto:tim@xen.org] > Sent: 08 February 2013 16:50 > To: Paul Durrant > Cc: Ian Campbell; xen-devel@lists.xen.org; Wei Liu; David Vrabel; > jbeulich@suse.com; Konrad Rzeszutek Wilk > Subject: Re: [Xen-devel] [PATCH 04/13] xen: sync public headers > > At 16:36 +0000 on 08 Feb (1360341398), Paul Durrant wrote: > > > > > > > > I don''t think so. The reason to use unsigned long here is > > > > > > > > to guarantee each selector (in 2-level case there is only > > > > > > > > L1 > > > > > > > > selector) fits into a word. > > > > > > > > > > > > > > > > > > > That''s still going to be a problem with Windows drivers. > > > > > > Unfortunately > > > > > MSVC uses a 64-bit model where longs are still 32-bit. The only > > > > > thing that is word size is a pointer. Any chance we can use > > > > > uintptr_t rather than an unsigned long? (At the moment I have to > > > > > sed all the public headers to replace long with LONG_PTR and it''s a > PITA). > > > > > > > > > > > > > > > > TBH I don''t know much about Windows. But are you suggesting > > > > > replace all the relevant bit in the header or just the specific > > > > > event channel > > > interface? > > > > > > > > > > > > > I was just pointing out that assumption that unsigned long => > > > native word size does not hold for 64-bit Windows. So, if we''re > > > > adding something new can we avoid use of unsigned long and use an > > > > abstract type defined to be the native word size? > > > > > > Probably ought to be the existing xen_ulong_t. > > > > That works for me :-) > > Hmm. xen_ulong_t is a typedef for unsigned long on x86. Are you > suggesting we change that? On the grounds that any platform where it > would make a difference to the ABI has already had to adjust its headers? I > think I can buy that. >Yes, the crux of the issue is what *is* an unsigned long? If Xen uses one model and the guest OS uses a different one then use of non-abstract types in the public headers becomes awkward. Paul
On Fri, 2013-02-08 at 17:06 +0000, Tim Deegan wrote:> At 16:59 +0000 on 08 Feb (1360342741), Ian Campbell wrote: > > On Fri, 2013-02-08 at 16:49 +0000, Tim Deegan wrote: > > > At 16:36 +0000 on 08 Feb (1360341398), Paul Durrant wrote: > > > > > > > > > > I don''t think so. The reason to use unsigned long here is to > > > > > > > > > > guarantee each selector (in 2-level case there is only L1 > > > > > > > > > > selector) fits into a word. > > > > > > > > > > > > > > > > > > > > > > > > > That''s still going to be a problem with Windows drivers. > > > > > > > > Unfortunately > > > > > > > MSVC uses a 64-bit model where longs are still 32-bit. The only > > > > > > > thing that is word size is a pointer. Any chance we can use > > > > > > > uintptr_t rather than an unsigned long? (At the moment I have to sed > > > > > > > all the public headers to replace long with LONG_PTR and it''s a PITA). > > > > > > > > > > > > > > > > > > > > > > TBH I don''t know much about Windows. But are you suggesting replace > > > > > > > all the relevant bit in the header or just the specific event channel > > > > > interface? > > > > > > > > > > > > > > > > > > > I was just pointing out that assumption that unsigned long == native > > > > > > word size does not hold for 64-bit Windows. So, if we''re adding > > > > > > something new can we avoid use of unsigned long and use an abstract > > > > > > type defined to be the native word size? > > > > > > > > > > Probably ought to be the existing xen_ulong_t. > > > > > > > > That works for me :-) > > > > > > Hmm. xen_ulong_t is a typedef for unsigned long on x86. Are you > > > suggesting we change that? > > > > I don''t think that would be a good idea, but using xen_ulong_t does at > > least mean he only needs to change one place rather than sed''ing up the > > whole lot. > > Ah, OK. > > But on arm32 it''s explicitly the wrong thing (i.e. bigger than a word). > Is that just going to be part of the cost of using explicit sizes > everywhere on arm?Actually, Stefano and I were discussing this the other day, the use of unsigned long in the evtchn stuff on ARM32 is a 32/64 ABI snafu and should be fixed -- using xen_ulong_t would fix this too... Ian.
On 08/02/13 17:09, Ian Campbell wrote:> > Actually, Stefano and I were discussing this the other day, the use of > unsigned long in the evtchn stuff on ARM32 is a 32/64 ABI snafu and > should be fixed -- using xen_ulong_t would fix this too...Keep in mind that it must be possible to do an atomic xchg() on the vcpu_info''s evtchn_pending_sel word. I don''t think 64-bit words will work with a 32-bit ARM guest. David
On Fri, 2013-02-08 at 19:45 +0000, David Vrabel wrote:> On 08/02/13 17:09, Ian Campbell wrote: > > > > Actually, Stefano and I were discussing this the other day, the use of > > unsigned long in the evtchn stuff on ARM32 is a 32/64 ABI snafu and > > should be fixed -- using xen_ulong_t would fix this too... > > Keep in mind that it must be possible to do an atomic xchg() on the > vcpu_info''s evtchn_pending_sel word. I don''t think 64-bit words will > work with a 32-bit ARM guest.ARM (at least the variants we care about) has load/store exclusive, including a double word sized option (ldrexd/strexd), so this is fine. I think the double word variants were introduced along with LPAE, so you can update a PTE atomically. I also suspect it would be fine to treat a 64-bit word as two independent 32-bit halves which must be accessed atomically, at least for the purposes of updating the pending sel word(assumnig h/v and guest agree on this), but thankfully due to the above I don''t have to think about it *too* hard to be sure ;-) Ian.