This is an RFC of the Linux guest-side implementation of the FIFO-based event channel ABI described in this design document:

http://xenbits.xen.org/people/dvrabel/event-channels-E.pdf

Refer also to the Xen series.

Remaining work:

* Add a function to set an event channel's priority and use this to set the VIRQ timer to the highest priority.

Patch 1 is an obvious refactoring of common code.

Patches 2-6 prepare for supporting multiple ABIs.

Patch 7 adds the low-level evtchn_ops hooks.

Patches 8-9 add an additional hook for ABI-specific per-port setup (used for expanding the event array as more events are bound).

Patch 10 allows many more event channels to be supported by altering how the event channel to irq map is allocated.  Note that other factors limit the number of supported IRQs (the limit is 8192 + 64 * NR_CPUS).

Patch 11 is some trivial refactoring.

Patches 12-13 add the ABI and the implementation.

David
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 01/13] xen/events: refactor retrigger_dynirq() and resend_irq_on_evtchn()
From: David Vrabel <david.vrabel@citrix.com> These two functions did the same thing with different parameters; put the common bits in retrigger_evtchn(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events.c | 27 +++++++++------------------ 1 files changed, 9 insertions(+), 18 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 4035e83..ddcdbb5 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -1558,13 +1558,13 @@ static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, return rebind_irq_to_cpu(data->irq, tcpu); } -int resend_irq_on_evtchn(unsigned int irq) +static int retrigger_evtchn(int evtchn) { - int masked, evtchn = evtchn_from_irq(irq); + int masked; struct shared_info *s = HYPERVISOR_shared_info; if (!VALID_EVTCHN(evtchn)) - return 1; + return 0; masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask)); sync_set_bit(evtchn, BM(s->evtchn_pending)); @@ -1574,6 +1574,11 @@ int resend_irq_on_evtchn(unsigned int irq) return 1; } +int resend_irq_on_evtchn(unsigned int irq) +{ + return retrigger_evtchn(evtchn_from_irq(irq)); +} + static void enable_dynirq(struct irq_data *data) { int evtchn = evtchn_from_irq(data->irq); @@ -1608,21 +1613,7 @@ static void mask_ack_dynirq(struct irq_data *data) static int retrigger_dynirq(struct irq_data *data) { - int evtchn = evtchn_from_irq(data->irq); - struct shared_info *sh = HYPERVISOR_shared_info; - int ret = 0; - - if (VALID_EVTCHN(evtchn)) { - int masked; - - masked = sync_test_and_set_bit(evtchn, BM(sh->evtchn_mask)); - sync_set_bit(evtchn, BM(sh->evtchn_pending)); - if (!masked) - unmask_evtchn(evtchn); - ret = 1; - } - - return ret; + return retrigger_evtchn(evtchn_from_irq(data->irq)); } static void restore_pirqs(void) -- 1.7.2.5
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 02/13] xen/events: remove unnecessary init_evtchn_cpu_bindings()
From: David Vrabel <david.vrabel@citrix.com> Event channels are always explicitly bound to a specific VCPU before they are first enabled. There is no need to initialize all possible events as bound to VCPU 0 at start of day or after a resume. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events.c | 22 ---------------------- 1 files changed, 0 insertions(+), 22 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index ddcdbb5..1e2c74b 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -334,24 +334,6 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) info_for_irq(irq)->cpu = cpu; } -static void init_evtchn_cpu_bindings(void) -{ - int i; -#ifdef CONFIG_SMP - struct irq_info *info; - - /* By default all event channels notify CPU#0. */ - list_for_each_entry(info, &xen_irq_list_head, list) { - struct irq_desc *desc = irq_to_desc(info->irq); - cpumask_copy(desc->irq_data.affinity, cpumask_of(0)); - } -#endif - - for_each_possible_cpu(i) - memset(per_cpu(cpu_evtchn_mask, i), - (i == 0) ? ~0 : 0, NR_EVENT_CHANNELS/8); -} - static inline void clear_evtchn(int port) { struct shared_info *s = HYPERVISOR_shared_info; @@ -1778,8 +1760,6 @@ void xen_irq_resume(void) unsigned int cpu, evtchn; struct irq_info *info; - init_evtchn_cpu_bindings(); - /* New event-channel space is not ''live'' yet. */ for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) mask_evtchn(evtchn); @@ -1890,8 +1870,6 @@ void __init xen_init_IRQ(void) for (i = 0; i < NR_EVENT_CHANNELS; i++) evtchn_to_irq[i] = -1; - init_evtchn_cpu_bindings(); - /* No event channels are ''live'' right now. */ for (i = 0; i < NR_EVENT_CHANNELS; i++) mask_evtchn(i); -- 1.7.2.5
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 03/13] xen/events: introduce test_and_set_mask()
From: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events.c | 11 ++++++++--- 1 files changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 1e2c74b..359e983 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -352,6 +352,12 @@ static inline int test_evtchn(int port) return sync_test_bit(port, BM(&s->evtchn_pending[0])); } +static inline int test_and_set_mask(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); +} + /** * notify_remote_via_irq - send event to remote end of event channel via irq @@ -1493,7 +1499,6 @@ void rebind_evtchn_irq(int evtchn, int irq) /* Rebind an evtchn so that it gets delivered to a specific cpu */ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu) { - struct shared_info *s = HYPERVISOR_shared_info; struct evtchn_bind_vcpu bind_vcpu; int evtchn = evtchn_from_irq(irq); int masked; @@ -1516,7 +1521,7 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu) * Mask the event while changing the VCPU binding to prevent * it being delivered on an unexpected VCPU. */ - masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask)); + masked = test_and_set_mask(evtchn); /* * If this fails, it usually just indicates that we''re dealing with a @@ -1548,7 +1553,7 @@ static int retrigger_evtchn(int evtchn) if (!VALID_EVTCHN(evtchn)) return 0; - masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask)); + masked = test_and_set_mask(evtchn); sync_set_bit(evtchn, BM(s->evtchn_pending)); if (!masked) unmask_evtchn(evtchn); -- 1.7.2.5
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 04/13] xen/events: replace raw bit ops with functions
From: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events.c | 3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 359e983..fec5da4 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -1548,13 +1548,12 @@ static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, static int retrigger_evtchn(int evtchn) { int masked; - struct shared_info *s = HYPERVISOR_shared_info; if (!VALID_EVTCHN(evtchn)) return 0; masked = test_and_set_mask(evtchn); - sync_set_bit(evtchn, BM(s->evtchn_pending)); + set_evtchn(evtchn); if (!masked) unmask_evtchn(evtchn); -- 1.7.2.5
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 05/13] xen/events: move drivers/xen/events.c into drivers/xen/events/
From: David Vrabel <david.vrabel@citrix.com> events.c will be split into multiple files so move it into its own directory. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/Makefile | 3 ++- drivers/xen/events/Makefile | 1 + drivers/xen/{ => events}/events.c | 0 3 files changed, 3 insertions(+), 1 deletions(-) create mode 100644 drivers/xen/events/Makefile rename drivers/xen/{ => events}/events.c (100%) diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile index 14fe79d..d75c811 100644 --- a/drivers/xen/Makefile +++ b/drivers/xen/Makefile @@ -2,7 +2,8 @@ ifeq ($(filter y, $(CONFIG_ARM) $(CONFIG_ARM64)),) obj-$(CONFIG_HOTPLUG_CPU) += cpu_hotplug.o endif obj-$(CONFIG_X86) += fallback.o -obj-y += grant-table.o features.o events.o balloon.o manage.o +obj-y += grant-table.o features.o balloon.o manage.o +obj-y += events/ obj-y += xenbus/ nostackp := $(call cc-option, -fno-stack-protector) diff --git a/drivers/xen/events/Makefile b/drivers/xen/events/Makefile new file mode 100644 index 0000000..86fd7c1 --- /dev/null +++ b/drivers/xen/events/Makefile @@ -0,0 +1 @@ +obj-y += events.o diff --git a/drivers/xen/events.c b/drivers/xen/events/events.c similarity index 100% rename from drivers/xen/events.c rename to drivers/xen/events/events.c -- 1.7.2.5
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 06/13] xen/events: move 2-level specific code into its own file
From: David Vrabel <david.vrabel@citrix.com> In preparation for alternative event channel ABIs, move all the functions accessing the shared data structures into their own file. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events/Makefile | 1 + drivers/xen/events/events.c | 372 +--------------------------------- drivers/xen/events/events_internal.h | 74 +++++++ drivers/xen/events/n-level.c | 322 +++++++++++++++++++++++++++++ 4 files changed, 408 insertions(+), 361 deletions(-) create mode 100644 drivers/xen/events/events_internal.h create mode 100644 drivers/xen/events/n-level.c diff --git a/drivers/xen/events/Makefile b/drivers/xen/events/Makefile index 86fd7c1..aea331e 100644 --- a/drivers/xen/events/Makefile +++ b/drivers/xen/events/Makefile @@ -1 +1,2 @@ obj-y += events.o +obj-y += n-level.o diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c index fec5da4..2700c12 100644 --- a/drivers/xen/events/events.c +++ b/drivers/xen/events/events.c @@ -59,6 +59,8 @@ #include <xen/interface/vcpu.h> #include <asm/hw_irq.h> +#include "events_internal.h" + /* * This lock protects updates to the following mapping and reference-count * arrays. The lock does not need to be acquired to read the mapping tables. @@ -73,72 +75,12 @@ static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; /* IRQ <-> IPI mapping */ static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1}; -/* Interrupt types. 
*/ -enum xen_irq_type { - IRQT_UNBOUND = 0, - IRQT_PIRQ, - IRQT_VIRQ, - IRQT_IPI, - IRQT_EVTCHN -}; - -/* - * Packed IRQ information: - * type - enum xen_irq_type - * event channel - irq->event channel mapping - * cpu - cpu this event channel is bound to - * index - type-specific information: - * PIRQ - physical IRQ, GSI, flags, and owner domain - * VIRQ - virq number - * IPI - IPI vector - * EVTCHN - - */ -struct irq_info { - struct list_head list; - int refcnt; - enum xen_irq_type type; /* type */ - unsigned irq; - unsigned short evtchn; /* event channel */ - unsigned short cpu; /* cpu bound */ - - union { - unsigned short virq; - enum ipi_vector ipi; - struct { - unsigned short pirq; - unsigned short gsi; - unsigned char flags; - uint16_t domid; - } pirq; - } u; -}; -#define PIRQ_NEEDS_EOI (1 << 0) -#define PIRQ_SHAREABLE (1 << 1) - -static int *evtchn_to_irq; +int *evtchn_to_irq; #ifdef CONFIG_X86 static unsigned long *pirq_eoi_map; #endif static bool (*pirq_needs_eoi)(unsigned irq); -/* - * Note sizeof(xen_ulong_t) can be more than sizeof(unsigned long). Be - * careful to only use bitops which allow for this (e.g - * test_bit/find_first_bit and friends but not __ffs) and to pass - * BITS_PER_EVTCHN_WORD as the bitmask length. - */ -#define BITS_PER_EVTCHN_WORD (sizeof(xen_ulong_t)*8) -/* - * Make a bitmask (i.e. unsigned long *) of a xen_ulong_t - * array. Primarily to avoid long lines (hence the terse name). - */ -#define BM(x) (unsigned long *)(x) -/* Find the first set bit in a evtchn mask */ -#define EVTCHN_FIRST_BIT(w) find_first_bit(BM(&(w)), BITS_PER_EVTCHN_WORD) - -static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], - cpu_evtchn_mask); - /* Xen will never allocate port zero for any purpose. 
*/ #define VALID_EVTCHN(chn) ((chn) != 0) @@ -149,7 +91,7 @@ static void enable_dynirq(struct irq_data *data); static void disable_dynirq(struct irq_data *data); /* Get info for IRQ */ -static struct irq_info *info_for_irq(unsigned irq) +struct irq_info *info_for_irq(unsigned irq) { return irq_get_handler_data(irq); } @@ -279,12 +221,12 @@ static enum xen_irq_type type_from_irq(unsigned irq) return info_for_irq(irq)->type; } -static unsigned cpu_from_irq(unsigned irq) +unsigned cpu_from_irq(unsigned irq) { return info_for_irq(irq)->cpu; } -static unsigned int cpu_from_evtchn(unsigned int evtchn) +unsigned int cpu_from_evtchn(unsigned int evtchn) { int irq = evtchn_to_irq[evtchn]; unsigned ret = 0; @@ -310,55 +252,21 @@ static bool pirq_needs_eoi_flag(unsigned irq) return info->u.pirq.flags & PIRQ_NEEDS_EOI; } -static inline xen_ulong_t active_evtchns(unsigned int cpu, - struct shared_info *sh, - unsigned int idx) -{ - return sh->evtchn_pending[idx] & - per_cpu(cpu_evtchn_mask, cpu)[idx] & - ~sh->evtchn_mask[idx]; -} - static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) { int irq = evtchn_to_irq[chn]; + struct irq_info *info = info_for_irq(irq); BUG_ON(irq == -1); #ifdef CONFIG_SMP cpumask_copy(irq_to_desc(irq)->irq_data.affinity, cpumask_of(cpu)); #endif - clear_bit(chn, BM(per_cpu(cpu_evtchn_mask, cpu_from_irq(irq)))); - set_bit(chn, BM(per_cpu(cpu_evtchn_mask, cpu))); - - info_for_irq(irq)->cpu = cpu; -} + xen_evtchn_port_bind_to_cpu(info, cpu); -static inline void clear_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_clear_bit(port, BM(&s->evtchn_pending[0])); -} - -static inline void set_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, BM(&s->evtchn_pending[0])); -} - -static inline int test_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - return sync_test_bit(port, BM(&s->evtchn_pending[0])); -} - -static inline int test_and_set_mask(int port) -{ - 
struct shared_info *s = HYPERVISOR_shared_info; - return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); + info->cpu = cpu; } - /** * notify_remote_via_irq - send event to remote end of event channel via irq * @irq: irq of event channel to send event to @@ -376,63 +284,6 @@ void notify_remote_via_irq(int irq) } EXPORT_SYMBOL_GPL(notify_remote_via_irq); -static void mask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - sync_set_bit(port, BM(&s->evtchn_mask[0])); -} - -static void unmask_evtchn(int port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - unsigned int cpu = get_cpu(); - int do_hypercall = 0, evtchn_pending = 0; - - BUG_ON(!irqs_disabled()); - - if (unlikely((cpu != cpu_from_evtchn(port)))) - do_hypercall = 1; - else { - /* - * Need to clear the mask before checking pending to - * avoid a race with an event becoming pending. - * - * EVTCHNOP_unmask will only trigger an upcall if the - * mask bit was set, so if a hypercall is needed - * remask the event. - */ - sync_clear_bit(port, BM(&s->evtchn_mask[0])); - evtchn_pending = sync_test_bit(port, BM(&s->evtchn_pending[0])); - - if (unlikely(evtchn_pending && xen_hvm_domain())) { - sync_set_bit(port, BM(&s->evtchn_mask[0])); - do_hypercall = 1; - } - } - - /* Slow path (hypercall) if this is a non-local port or if this is - * an hvm domain and an event is pending (hvm domains don''t have - * their own implementation of irq_enable). */ - if (do_hypercall) { - struct evtchn_unmask unmask = { .port = port }; - (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); - } else { - struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); - - /* - * The following is basically the equivalent of - * ''hw_resend_irq''. Just like a real IO-APIC we ''lose - * the interrupt edge'' if the channel is masked. 
- */ - if (evtchn_pending && - !sync_test_and_set_bit(port / BITS_PER_EVTCHN_WORD, - BM(&vcpu_info->evtchn_pending_sel))) - vcpu_info->evtchn_upcall_pending = 1; - } - - put_cpu(); -} - static void xen_irq_init(unsigned irq) { struct irq_info *info; @@ -1216,222 +1067,21 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) notify_remote_via_irq(irq); } -irqreturn_t xen_debug_interrupt(int irq, void *dev_id) -{ - struct shared_info *sh = HYPERVISOR_shared_info; - int cpu = smp_processor_id(); - xen_ulong_t *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); - int i; - unsigned long flags; - static DEFINE_SPINLOCK(debug_lock); - struct vcpu_info *v; - - spin_lock_irqsave(&debug_lock, flags); - - printk("\nvcpu %d\n ", cpu); - - for_each_online_cpu(i) { - int pending; - v = per_cpu(xen_vcpu, i); - pending = (get_irq_regs() && i == cpu) - ? xen_irqs_disabled(get_irq_regs()) - : v->evtchn_upcall_mask; - printk("%d: masked=%d pending=%d event_sel %0*"PRI_xen_ulong"\n ", i, - pending, v->evtchn_upcall_pending, - (int)(sizeof(v->evtchn_pending_sel)*2), - v->evtchn_pending_sel); - } - v = per_cpu(xen_vcpu, cpu); - - printk("\npending:\n "); - for (i = ARRAY_SIZE(sh->evtchn_pending)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", - (int)sizeof(sh->evtchn_pending[0])*2, - sh->evtchn_pending[i], - i % 8 == 0 ? "\n " : " "); - printk("\nglobal mask:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - sh->evtchn_mask[i], - i % 8 == 0 ? "\n " : " "); - - printk("\nglobally unmasked:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - sh->evtchn_pending[i] & ~sh->evtchn_mask[i], - i % 8 == 0 ? 
"\n " : " "); - - printk("\nlocal cpu%d mask:\n ", cpu); - for (i = (NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD)-1; i >= 0; i--) - printk("%0*"PRI_xen_ulong"%s", (int)(sizeof(cpu_evtchn[0])*2), - cpu_evtchn[i], - i % 8 == 0 ? "\n " : " "); - - printk("\nlocally unmasked:\n "); - for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) { - xen_ulong_t pending = sh->evtchn_pending[i] - & ~sh->evtchn_mask[i] - & cpu_evtchn[i]; - printk("%0*"PRI_xen_ulong"%s", - (int)(sizeof(sh->evtchn_mask[0])*2), - pending, i % 8 == 0 ? "\n " : " "); - } - - printk("\npending list:\n"); - for (i = 0; i < NR_EVENT_CHANNELS; i++) { - if (sync_test_bit(i, BM(sh->evtchn_pending))) { - int word_idx = i / BITS_PER_EVTCHN_WORD; - printk(" %d: event %d -> irq %d%s%s%s\n", - cpu_from_evtchn(i), i, - evtchn_to_irq[i], - sync_test_bit(word_idx, BM(&v->evtchn_pending_sel)) - ? "" : " l2-clear", - !sync_test_bit(i, BM(sh->evtchn_mask)) - ? "" : " globally-masked", - sync_test_bit(i, BM(cpu_evtchn)) - ? "" : " locally-masked"); - } - } - - spin_unlock_irqrestore(&debug_lock, flags); - - return IRQ_HANDLED; -} - static DEFINE_PER_CPU(unsigned, xed_nesting_count); -static DEFINE_PER_CPU(unsigned int, current_word_idx); -static DEFINE_PER_CPU(unsigned int, current_bit_idx); -/* - * Mask out the i least significant bits of w - */ -#define MASK_LSBS(w, i) (w & ((~((xen_ulong_t)0UL)) << i)) - -/* - * Search the CPUs pending events bitmasks. For each one found, map - * the event number to an irq, and feed it into do_IRQ() for - * handling. - * - * Xen uses a two-level bitmap to speed searching. The first level is - * a bitset of words which contain pending event bits. The second - * level is a bitset of pending events themselves. 
- */ static void __xen_evtchn_do_upcall(void) { - int start_word_idx, start_bit_idx; - int word_idx, bit_idx; - int i, irq; - int cpu = get_cpu(); - struct shared_info *s = HYPERVISOR_shared_info; struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); + int cpu = get_cpu(); unsigned count; do { - xen_ulong_t pending_words; - xen_ulong_t pending_bits; - struct irq_desc *desc; - vcpu_info->evtchn_upcall_pending = 0; if (__this_cpu_inc_return(xed_nesting_count) - 1) goto out; - /* - * Master flag must be cleared /before/ clearing - * selector flag. xchg_xen_ulong must contain an - * appropriate barrier. - */ - if ((irq = per_cpu(virq_to_irq, cpu)[VIRQ_TIMER]) != -1) { - int evtchn = evtchn_from_irq(irq); - word_idx = evtchn / BITS_PER_LONG; - pending_bits = evtchn % BITS_PER_LONG; - if (active_evtchns(cpu, s, word_idx) & (1ULL << pending_bits)) { - desc = irq_to_desc(irq); - if (desc) - generic_handle_irq_desc(irq, desc); - } - } - - pending_words = xchg_xen_ulong(&vcpu_info->evtchn_pending_sel, 0); - - start_word_idx = __this_cpu_read(current_word_idx); - start_bit_idx = __this_cpu_read(current_bit_idx); - - word_idx = start_word_idx; - - for (i = 0; pending_words != 0; i++) { - xen_ulong_t words; - - words = MASK_LSBS(pending_words, word_idx); - - /* - * If we masked out all events, wrap to beginning. - */ - if (words == 0) { - word_idx = 0; - bit_idx = 0; - continue; - } - word_idx = EVTCHN_FIRST_BIT(words); - - pending_bits = active_evtchns(cpu, s, word_idx); - bit_idx = 0; /* usually scan entire word from start */ - /* - * We scan the starting word in two parts. - * - * 1st time: start in the middle, scanning the - * upper bits. - * - * 2nd time: scan the whole word (not just the - * parts skipped in the first pass) -- if an - * event in the previously scanned bits is - * pending again it would just be scanned on - * the next loop anyway. 
- */ - if (word_idx == start_word_idx) { - if (i == 0) - bit_idx = start_bit_idx; - } - - do { - xen_ulong_t bits; - int port; - - bits = MASK_LSBS(pending_bits, bit_idx); - - /* If we masked out all events, move on. */ - if (bits == 0) - break; - - bit_idx = EVTCHN_FIRST_BIT(bits); - - /* Process port. */ - port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; - irq = evtchn_to_irq[port]; - - if (irq != -1) { - desc = irq_to_desc(irq); - if (desc) - generic_handle_irq_desc(irq, desc); - } - - bit_idx = (bit_idx + 1) % BITS_PER_EVTCHN_WORD; - - /* Next caller starts at last processed + 1 */ - __this_cpu_write(current_word_idx, - bit_idx ? word_idx : - (word_idx+1) % BITS_PER_EVTCHN_WORD); - __this_cpu_write(current_bit_idx, bit_idx); - } while (bit_idx != 0); - - /* Scan start_l1i twice; all others once. */ - if ((word_idx != start_word_idx) || (i != 0)) - pending_words &= ~(1UL << word_idx); - - word_idx = (word_idx + 1) % BITS_PER_EVTCHN_WORD; - } + xen_evtchn_handle_events(cpu); BUG_ON(!irqs_disabled()); diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h new file mode 100644 index 0000000..79ac70b --- /dev/null +++ b/drivers/xen/events/events_internal.h @@ -0,0 +1,74 @@ +/* + * Xen Event Channels (internal header) + * + * Copyright (C) 2013 Citrix Systems R&D Ltd. + * + * This source code is licensed under the GNU General Public License, + * Version 2 or later. See the file COPYING for more details. + */ +#ifndef __EVENTS_INTERNAL_H__ +#define __EVENTS_INTERNAL_H__ + +/* Interrupt types. */ +enum xen_irq_type { + IRQT_UNBOUND = 0, + IRQT_PIRQ, + IRQT_VIRQ, + IRQT_IPI, + IRQT_EVTCHN +}; + +/* + * Packed IRQ information: + * type - enum xen_irq_type + * event channel - irq->event channel mapping + * cpu - cpu this event channel is bound to + * index - type-specific information: + * PIRQ - vector, with MSB being "needs EIO", or physical IRQ of the HVM + * guest, or GSI (real passthrough IRQ) of the device. 
+ * VIRQ - virq number + * IPI - IPI vector + * EVTCHN - + */ +struct irq_info { + struct list_head list; + int refcnt; + enum xen_irq_type type; /* type */ + unsigned irq; + unsigned short evtchn; /* event channel */ + unsigned short cpu; /* cpu bound */ + + union { + unsigned short virq; + enum ipi_vector ipi; + struct { + unsigned short pirq; + unsigned short gsi; + unsigned char vector; + unsigned char flags; + uint16_t domid; + } pirq; + } u; +}; + +#define PIRQ_NEEDS_EOI (1 << 0) +#define PIRQ_SHAREABLE (1 << 1) + +extern int *evtchn_to_irq; + +struct irq_info *info_for_irq(unsigned irq); +unsigned cpu_from_irq(unsigned irq); +unsigned cpu_from_evtchn(unsigned int evtchn); + +void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu); + +void clear_evtchn(int port); +void set_evtchn(int port); +int test_evtchn(int port); +int test_and_set_mask(int port); +void mask_evtchn(int port); +void unmask_evtchn(int port); + +void xen_evtchn_handle_events(int cpu); + +#endif /* #ifndef __EVENTS_INTERNAL_H__ */ diff --git a/drivers/xen/events/n-level.c b/drivers/xen/events/n-level.c new file mode 100644 index 0000000..b86a853 --- /dev/null +++ b/drivers/xen/events/n-level.c @@ -0,0 +1,322 @@ +/* + * Xen event channels (N-level ABI) + * + * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007 + */ + +#include <linux/linkage.h> +#include <linux/interrupt.h> +#include <linux/irq.h> +#include <linux/module.h> + +#include <asm/sync_bitops.h> +#include <asm/xen/hypercall.h> +#include <asm/xen/hypervisor.h> + +#include <xen/xen.h> +#include <xen/xen-ops.h> +#include <xen/events.h> +#include <xen/interface/xen.h> +#include <xen/interface/event_channel.h> + +#include "events_internal.h" + +/* + * Note sizeof(xen_ulong_t) can be more than sizeof(unsigned long). Be + * careful to only use bitops which allow for this (e.g + * test_bit/find_first_bit and friends but not __ffs) and to pass + * BITS_PER_EVTCHN_WORD as the bitmask length. 
+ */ +#define BITS_PER_EVTCHN_WORD (sizeof(xen_ulong_t)*8) +/* + * Make a bitmask (i.e. unsigned long *) of a xen_ulong_t + * array. Primarily to avoid long lines (hence the terse name). + */ +#define BM(x) (unsigned long *)(x) +/* Find the first set bit in a evtchn mask */ +#define EVTCHN_FIRST_BIT(w) find_first_bit(BM(&(w)), BITS_PER_EVTCHN_WORD) + +static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], + cpu_evtchn_mask); + +void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu) +{ + clear_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, info->cpu))); + set_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, cpu))); +} + +void clear_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_clear_bit(port, BM(&s->evtchn_pending[0])); +} + +void set_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_set_bit(port, BM(&s->evtchn_pending[0])); +} + +int test_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + return sync_test_bit(port, BM(&s->evtchn_pending[0])); +} + +int test_and_set_mask(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); +} + +void mask_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + sync_set_bit(port, BM(&s->evtchn_mask[0])); +} + +void unmask_evtchn(int port) +{ + struct shared_info *s = HYPERVISOR_shared_info; + unsigned int cpu = get_cpu(); + int do_hypercall = 0, evtchn_pending = 0; + + BUG_ON(!irqs_disabled()); + + if (unlikely((cpu != cpu_from_evtchn(port)))) + do_hypercall = 1; + else { + sync_clear_bit(port, BM(&s->evtchn_mask[0])); + evtchn_pending = sync_test_bit(port, BM(&s->evtchn_pending[0])); + + if (unlikely(evtchn_pending && xen_hvm_domain())) + do_hypercall = 1; + } + + /* Slow path (hypercall) if this is a non-local port or if this is + * an hvm domain and an event is pending (hvm domains don''t have + * their own implementation of irq_enable). 
*/ + if (do_hypercall) { + struct evtchn_unmask unmask = { .port = port }; + (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); + } else { + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); + + /* + * The following is basically the equivalent of + * ''hw_resend_irq''. Just like a real IO-APIC we ''lose + * the interrupt edge'' if the channel is masked. + */ + if (evtchn_pending && + !sync_test_and_set_bit(port / BITS_PER_EVTCHN_WORD, + BM(&vcpu_info->evtchn_pending_sel))) + vcpu_info->evtchn_upcall_pending = 1; + } + + put_cpu(); +} + +static DEFINE_PER_CPU(unsigned int, current_word_idx); +static DEFINE_PER_CPU(unsigned int, current_bit_idx); + +/* + * Mask out the i least significant bits of w + */ +#define MASK_LSBS(w, i) (w & ((~((xen_ulong_t)0UL)) << i)) + +static inline xen_ulong_t active_evtchns(unsigned int cpu, + struct shared_info *sh, + unsigned int idx) +{ + return sh->evtchn_pending[idx] & + per_cpu(cpu_evtchn_mask, cpu)[idx] & + ~sh->evtchn_mask[idx]; +} + +/* + * Search the CPU''s pending events bitmasks. For each one found, map + * the event number to an irq, and feed it into do_IRQ() for handling. + * + * Xen uses a two-level bitmap to speed searching. The first level is + * a bitset of words which contain pending event bits. The second + * level is a bitset of pending events themselves. + */ +void xen_evtchn_handle_events(int cpu) +{ + xen_ulong_t pending_words; + int start_word_idx, start_bit_idx; + int word_idx, bit_idx; + int i; + struct shared_info *s = HYPERVISOR_shared_info; + struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); + + /* + * Master flag must be cleared /before/ clearing + * selector flag. xchg_xen_ulong must contain an + * appropriate barrier. 
+ */ + pending_words = xchg_xen_ulong(&vcpu_info->evtchn_pending_sel, 0); + + start_word_idx = __this_cpu_read(current_word_idx); + start_bit_idx = __this_cpu_read(current_bit_idx); + + word_idx = start_word_idx; + + for (i = 0; pending_words != 0; i++) { + xen_ulong_t pending_bits; + xen_ulong_t words; + + words = MASK_LSBS(pending_words, word_idx); + + /* + * If we masked out all events, wrap to beginning. + */ + if (words == 0) { + word_idx = 0; + bit_idx = 0; + continue; + } + word_idx = EVTCHN_FIRST_BIT(words); + + pending_bits = active_evtchns(cpu, s, word_idx); + bit_idx = 0; /* usually scan entire word from start */ + /* + * We scan the starting word in two parts. + * + * 1st time: start in the middle, scanning the + * upper bits. + * + * 2nd time: scan the whole word (not just the + * parts skipped in the first pass) -- if an + * event in the previously scanned bits is + * pending again it would just be scanned on + * the next loop anyway. + */ + if (word_idx == start_word_idx) { + if (i == 0) + bit_idx = start_bit_idx; + } + + do { + xen_ulong_t bits; + int port, irq; + struct irq_desc *desc; + + bits = MASK_LSBS(pending_bits, bit_idx); + + /* If we masked out all events, move on. */ + if (bits == 0) + break; + + bit_idx = EVTCHN_FIRST_BIT(bits); + + /* Process port. */ + port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; + irq = evtchn_to_irq[port]; + + if (irq != -1) { + desc = irq_to_desc(irq); + if (desc) + generic_handle_irq_desc(irq, desc); + } + + bit_idx = (bit_idx + 1) % BITS_PER_EVTCHN_WORD; + + /* Next caller starts at last processed + 1 */ + __this_cpu_write(current_word_idx, + bit_idx ? word_idx : + (word_idx+1) % BITS_PER_EVTCHN_WORD); + __this_cpu_write(current_bit_idx, bit_idx); + } while (bit_idx != 0); + + /* Scan start_l1i twice; all others once. 
*/ + if ((word_idx != start_word_idx) || (i != 0)) + pending_words &= ~(1UL << word_idx); + + word_idx = (word_idx + 1) % BITS_PER_EVTCHN_WORD; + } +} + +irqreturn_t xen_debug_interrupt(int irq, void *dev_id) +{ + struct shared_info *sh = HYPERVISOR_shared_info; + int cpu = smp_processor_id(); + xen_ulong_t *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); + int i; + unsigned long flags; + static DEFINE_SPINLOCK(debug_lock); + struct vcpu_info *v; + + spin_lock_irqsave(&debug_lock, flags); + + printk("\nvcpu %d\n ", cpu); + + for_each_online_cpu(i) { + int pending; + v = per_cpu(xen_vcpu, i); + pending = (get_irq_regs() && i == cpu) + ? xen_irqs_disabled(get_irq_regs()) + : v->evtchn_upcall_mask; + printk("%d: masked=%d pending=%d event_sel %0*"PRI_xen_ulong"\n ", i, + pending, v->evtchn_upcall_pending, + (int)(sizeof(v->evtchn_pending_sel)*2), + v->evtchn_pending_sel); + } + v = per_cpu(xen_vcpu, cpu); + + printk("\npending:\n "); + for (i = ARRAY_SIZE(sh->evtchn_pending)-1; i >= 0; i--) + printk("%0*"PRI_xen_ulong"%s", + (int)sizeof(sh->evtchn_pending[0])*2, + sh->evtchn_pending[i], + i % 8 == 0 ? "\n " : " "); + printk("\nglobal mask:\n "); + for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) + printk("%0*"PRI_xen_ulong"%s", + (int)(sizeof(sh->evtchn_mask[0])*2), + sh->evtchn_mask[i], + i % 8 == 0 ? "\n " : " "); + + printk("\nglobally unmasked:\n "); + for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) + printk("%0*"PRI_xen_ulong"%s", + (int)(sizeof(sh->evtchn_mask[0])*2), + sh->evtchn_pending[i] & ~sh->evtchn_mask[i], + i % 8 == 0 ? "\n " : " "); + + printk("\nlocal cpu%d mask:\n ", cpu); + for (i = (NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD)-1; i >= 0; i--) + printk("%0*"PRI_xen_ulong"%s", (int)(sizeof(cpu_evtchn[0])*2), + cpu_evtchn[i], + i % 8 == 0 ? 
"\n " : " "); + + printk("\nlocally unmasked:\n "); + for (i = ARRAY_SIZE(sh->evtchn_mask)-1; i >= 0; i--) { + xen_ulong_t pending = sh->evtchn_pending[i] + & ~sh->evtchn_mask[i] + & cpu_evtchn[i]; + printk("%0*"PRI_xen_ulong"%s", + (int)(sizeof(sh->evtchn_mask[0])*2), + pending, i % 8 == 0 ? "\n " : " "); + } + + printk("\npending list:\n"); + for (i = 0; i < NR_EVENT_CHANNELS; i++) { + if (sync_test_bit(i, BM(sh->evtchn_pending))) { + int word_idx = i / BITS_PER_EVTCHN_WORD; + printk(" %d: event %d -> irq %d%s%s%s\n", + cpu_from_evtchn(i), i, + evtchn_to_irq[i], + sync_test_bit(word_idx, BM(&v->evtchn_pending_sel)) + ? "" : " l2-clear", + !sync_test_bit(i, BM(sh->evtchn_mask)) + ? "" : " globally-masked", + sync_test_bit(i, BM(cpu_evtchn)) + ? "" : " locally-masked"); + } + } + + spin_unlock_irqrestore(&debug_lock, flags); + + return IRQ_HANDLED; +} -- 1.7.2.5
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 07/13] xen/events: add struct evtchn_ops for the low-level port operations
From: David Vrabel <david.vrabel@citrix.com> evtchn_ops contains the low-level operations that access the shared data structures. This allows alternate ABIs to be supported. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events/events.c | 4 ++ drivers/xen/events/events_internal.h | 62 +++++++++++++++++++++++++++++---- drivers/xen/events/n-level.c | 32 +++++++++++++---- 3 files changed, 82 insertions(+), 16 deletions(-) diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c index 2700c12..8ecde63 100644 --- a/drivers/xen/events/events.c +++ b/drivers/xen/events/events.c @@ -61,6 +61,8 @@ #include "events_internal.h" +const struct evtchn_ops *evtchn_ops; + /* * This lock protects updates to the following mapping and reference-count * arrays. The lock does not need to be acquired to read the mapping tables. @@ -1518,6 +1520,8 @@ void __init xen_init_IRQ(void) { int i; + xen_evtchn_init_nlevel(); + evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), GFP_KERNEL); BUG_ON(!evtchn_to_irq); diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index 79ac70b..513c34d 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -54,21 +54,67 @@ struct irq_info { #define PIRQ_NEEDS_EOI (1 << 0) #define PIRQ_SHAREABLE (1 << 1) +struct evtchn_ops { + void (*bind_to_cpu)(struct irq_info *info, int cpu); + + void (*clear_pending)(int port); + void (*set_pending)(int port); + bool (*is_pending)(int port); + bool (*test_and_set_mask)(int port); + void (*mask)(int port); + void (*unmask)(int port); + + void (*handle_events)(int cpu); +}; + +extern const struct evtchn_ops *evtchn_ops; + extern int *evtchn_to_irq; struct irq_info *info_for_irq(unsigned irq); unsigned cpu_from_irq(unsigned irq); unsigned cpu_from_evtchn(unsigned int evtchn); -void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu); +static inline void xen_evtchn_port_bind_to_cpu(struct 
irq_info *info, int cpu) +{ + evtchn_ops->bind_to_cpu(info, cpu); +} + +static inline void clear_evtchn(int port) +{ + evtchn_ops->clear_pending(port); +} + +static inline void set_evtchn(int port) +{ + evtchn_ops->set_pending(port); +} + +static inline bool test_evtchn(int port) +{ + return evtchn_ops->is_pending(port); +} + +static inline bool test_and_set_mask(int port) +{ + return evtchn_ops->test_and_set_mask(port); +} + +static inline void mask_evtchn(int port) +{ + return evtchn_ops->mask(port); +} + +static inline void unmask_evtchn(int port) +{ + return evtchn_ops->unmask(port); +} -void clear_evtchn(int port); -void set_evtchn(int port); -int test_evtchn(int port); -int test_and_set_mask(int port); -void mask_evtchn(int port); -void unmask_evtchn(int port); +static inline void xen_evtchn_handle_events(int cpu) +{ + return evtchn_ops->handle_events(cpu); +} -void xen_evtchn_handle_events(int cpu); +void xen_evtchn_init_nlevel(void); #endif /* #ifndef __EVENTS_INTERNAL_H__ */ diff --git a/drivers/xen/events/n-level.c b/drivers/xen/events/n-level.c index b86a853..6c0662e 100644 --- a/drivers/xen/events/n-level.c +++ b/drivers/xen/events/n-level.c @@ -39,43 +39,43 @@ static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], cpu_evtchn_mask); -void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu) +static void nlevel_bind_to_cpu(struct irq_info *info, int cpu) { clear_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, info->cpu))); set_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, cpu))); } -void clear_evtchn(int port) +static void nlevel_clear_pending(int port) { struct shared_info *s = HYPERVISOR_shared_info; sync_clear_bit(port, BM(&s->evtchn_pending[0])); } -void set_evtchn(int port) +static void nlevel_set_pending(int port) { struct shared_info *s = HYPERVISOR_shared_info; sync_set_bit(port, BM(&s->evtchn_pending[0])); } -int test_evtchn(int port) +static bool nlevel_is_pending(int port) { struct shared_info *s = 
HYPERVISOR_shared_info; return sync_test_bit(port, BM(&s->evtchn_pending[0])); } -int test_and_set_mask(int port) +static bool nlevel_test_and_set_mask(int port) { struct shared_info *s = HYPERVISOR_shared_info; return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); } -void mask_evtchn(int port) +static void nlevel_mask(int port) { struct shared_info *s = HYPERVISOR_shared_info; sync_set_bit(port, BM(&s->evtchn_mask[0])); } -void unmask_evtchn(int port) +static void nlevel_unmask(int port) { struct shared_info *s = HYPERVISOR_shared_info; unsigned int cpu = get_cpu(); @@ -141,7 +141,7 @@ static inline xen_ulong_t active_evtchns(unsigned int cpu, * a bitset of words which contain pending event bits. The second * level is a bitset of pending events themselves. */ -void xen_evtchn_handle_events(int cpu) +static void nlevel_handle_events(int cpu) { xen_ulong_t pending_words; int start_word_idx, start_bit_idx; @@ -320,3 +320,19 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) return IRQ_HANDLED; } + +static const struct evtchn_ops evtchn_ops_nlevel = { + .bind_to_cpu = nlevel_bind_to_cpu, + .clear_pending = nlevel_clear_pending, + .set_pending = nlevel_set_pending, + .is_pending = nlevel_is_pending, + .test_and_set_mask = nlevel_test_and_set_mask, + .mask = nlevel_mask, + .unmask = nlevel_unmask, + .handle_events = nlevel_handle_events, +}; + +void __init xen_evtchn_init_nlevel(void) +{ + evtchn_ops = &evtchn_ops_nlevel; +} -- 1.7.2.5
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 08/13] xen/events: allow setup of irq_info to fail
From: David Vrabel <david.vrabel@citrix.com> The FIFO-based event ABI requires additional setup of newly bound events (it may need to expand the event array) and this setup may fail. xen_irq_info_common_init() is a useful place to put this setup so allow this call to fail. This call and the other similar calls are renamed to be *_setup() to reflect that they may now fail. This failure can only occur with new event channels not on rebind. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events/events.c | 156 +++++++++++++++++++++++++------------------ 1 files changed, 91 insertions(+), 65 deletions(-) diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c index 8ecde63..8f55a49 100644 --- a/drivers/xen/events/events.c +++ b/drivers/xen/events/events.c @@ -99,7 +99,7 @@ struct irq_info *info_for_irq(unsigned irq) } /* Constructors for packed IRQ information. */ -static void xen_irq_info_common_init(struct irq_info *info, +static int xen_irq_info_common_setup(struct irq_info *info, unsigned irq, enum xen_irq_type type, unsigned short evtchn, @@ -116,45 +116,47 @@ static void xen_irq_info_common_init(struct irq_info *info, evtchn_to_irq[evtchn] = irq; irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN); + + return 0; } -static void xen_irq_info_evtchn_init(unsigned irq, +static int xen_irq_info_evtchn_setup(unsigned irq, unsigned short evtchn) { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, irq, IRQT_EVTCHN, evtchn, 0); + return xen_irq_info_common_setup(info, irq, IRQT_EVTCHN, evtchn, 0); } -static void xen_irq_info_ipi_init(unsigned cpu, +static int xen_irq_info_ipi_setup(unsigned cpu, unsigned irq, unsigned short evtchn, enum ipi_vector ipi) { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, irq, IRQT_IPI, evtchn, 0); - info->u.ipi = ipi; per_cpu(ipi_to_irq, cpu)[ipi] = irq; + + return xen_irq_info_common_setup(info, irq, IRQT_IPI, evtchn, 0); } -static void 
xen_irq_info_virq_init(unsigned cpu, +static int xen_irq_info_virq_setup(unsigned cpu, unsigned irq, unsigned short evtchn, unsigned short virq) { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, irq, IRQT_VIRQ, evtchn, 0); - info->u.virq = virq; per_cpu(virq_to_irq, cpu)[virq] = irq; + + return xen_irq_info_common_setup(info, irq, IRQT_VIRQ, evtchn, 0); } -static void xen_irq_info_pirq_init(unsigned irq, +static int xen_irq_info_pirq_setup(unsigned irq, unsigned short evtchn, unsigned short pirq, unsigned short gsi, @@ -163,12 +165,12 @@ static void xen_irq_info_pirq_init(unsigned irq, { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, irq, IRQT_PIRQ, evtchn, 0); - info->u.pirq.pirq = pirq; info->u.pirq.gsi = gsi; info->u.pirq.domid = domid; info->u.pirq.flags = flags; + + return xen_irq_info_common_setup(info, irq, IRQT_PIRQ, evtchn, 0); } /* @@ -516,6 +518,47 @@ int xen_irq_from_gsi(unsigned gsi) } EXPORT_SYMBOL_GPL(xen_irq_from_gsi); +static void __unbind_from_irq(unsigned int irq) +{ + struct evtchn_close close; + int evtchn = evtchn_from_irq(irq); + struct irq_info *info = irq_get_handler_data(irq); + + if (info->refcnt > 0) { + info->refcnt--; + if (info->refcnt != 0) + return; + } + + if (VALID_EVTCHN(evtchn)) { + close.port = evtchn; + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) + BUG(); + + switch (type_from_irq(irq)) { + case IRQT_VIRQ: + per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) + [virq_from_irq(irq)] = -1; + break; + case IRQT_IPI: + per_cpu(ipi_to_irq, cpu_from_evtchn(evtchn)) + [ipi_from_irq(irq)] = -1; + break; + default: + break; + } + + /* Closed ports are implicitly re-bound to VCPU0. */ + bind_evtchn_to_cpu(evtchn, 0); + + evtchn_to_irq[evtchn] = -1; + } + + BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); + + xen_free_irq(irq); +} + /* * Do not make any assumptions regarding the relationship between the * IRQ number returned here and the Xen pirq argument. 
@@ -531,6 +574,7 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, { int irq = -1; struct physdev_irq irq_op; + int ret; mutex_lock(&irq_mapping_update_lock); @@ -558,8 +602,13 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, goto out; } - xen_irq_info_pirq_init(irq, 0, pirq, gsi, DOMID_SELF, + ret = xen_irq_info_pirq_setup(irq, 0, pirq, gsi, DOMID_SELF, shareable ? PIRQ_SHAREABLE : 0); + if (ret < 0) { + __unbind_from_irq(irq); + irq = ret; + goto out; + } pirq_query_unmask(irq); /* We try to use the handler with the appropriate semantic for the @@ -619,7 +668,9 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc, irq_set_chip_and_handler_name(irq, &xen_pirq_chip, handle_edge_irq, name); - xen_irq_info_pirq_init(irq, 0, pirq, 0, domid, 0); + ret = xen_irq_info_pirq_setup(irq, 0, pirq, 0, domid, 0); + if (ret < 0) + goto error_irq; ret = irq_set_msi_desc(irq, msidesc); if (ret < 0) goto error_irq; @@ -627,8 +678,8 @@ out: mutex_unlock(&irq_mapping_update_lock); return irq; error_irq: + __unbind_from_irq(irq); mutex_unlock(&irq_mapping_update_lock); - xen_free_irq(irq); return ret; } #endif @@ -698,9 +749,11 @@ int xen_pirq_from_irq(unsigned irq) return pirq_from_irq(irq); } EXPORT_SYMBOL_GPL(xen_pirq_from_irq); + int bind_evtchn_to_irq(unsigned int evtchn) { int irq; + int ret; mutex_lock(&irq_mapping_update_lock); @@ -714,7 +767,12 @@ int bind_evtchn_to_irq(unsigned int evtchn) irq_set_chip_and_handler_name(irq, &xen_dynamic_chip, handle_edge_irq, "event"); - xen_irq_info_evtchn_init(irq, evtchn); + ret = xen_irq_info_evtchn_setup(irq, evtchn); + if (ret < 0) { + __unbind_from_irq(irq); + irq = ret; + goto out; + } } else { struct irq_info *info = info_for_irq(irq); WARN_ON(info == NULL || info->type != IRQT_EVTCHN); @@ -731,6 +789,7 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) { struct evtchn_bind_ipi bind_ipi; int evtchn, irq; + int ret; mutex_lock(&irq_mapping_update_lock); @@ -750,8 +809,12 @@ static int 
bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) BUG(); evtchn = bind_ipi.port; - xen_irq_info_ipi_init(cpu, irq, evtchn, ipi); - + ret = xen_irq_info_ipi_setup(cpu, irq, evtchn, ipi); + if (ret < 0) { + __unbind_from_irq(irq); + irq = ret; + goto out; + } bind_evtchn_to_cpu(evtchn, cpu); } else { struct irq_info *info = info_for_irq(irq); @@ -830,7 +893,12 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu) evtchn = ret; } - xen_irq_info_virq_init(cpu, irq, evtchn, virq); + ret = xen_irq_info_virq_setup(cpu, irq, evtchn, virq); + if (ret < 0) { + __unbind_from_irq(irq); + irq = ret; + goto out; + } bind_evtchn_to_cpu(evtchn, cpu); } else { @@ -846,50 +914,8 @@ out: static void unbind_from_irq(unsigned int irq) { - struct evtchn_close close; - int evtchn = evtchn_from_irq(irq); - struct irq_info *info = irq_get_handler_data(irq); - - if (WARN_ON(!info)) - return; - mutex_lock(&irq_mapping_update_lock); - - if (info->refcnt > 0) { - info->refcnt--; - if (info->refcnt != 0) - goto done; - } - - if (VALID_EVTCHN(evtchn)) { - close.port = evtchn; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) - BUG(); - - switch (type_from_irq(irq)) { - case IRQT_VIRQ: - per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) - [virq_from_irq(irq)] = -1; - break; - case IRQT_IPI: - per_cpu(ipi_to_irq, cpu_from_evtchn(evtchn)) - [ipi_from_irq(irq)] = -1; - break; - default: - break; - } - - /* Closed ports are implicitly re-bound to VCPU0. 
*/ - bind_evtchn_to_cpu(evtchn, 0); - - evtchn_to_irq[evtchn] = -1; - } - - BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); - - xen_free_irq(irq); - - done: + __unbind_from_irq(irq); mutex_unlock(&irq_mapping_update_lock); } @@ -1137,7 +1163,7 @@ void rebind_evtchn_irq(int evtchn, int irq) so there should be a proper type */ BUG_ON(info->type == IRQT_UNBOUND); - xen_irq_info_evtchn_init(irq, evtchn); + xen_irq_info_evtchn_setup(irq, evtchn); mutex_unlock(&irq_mapping_update_lock); @@ -1312,7 +1338,7 @@ static void restore_cpu_virqs(unsigned int cpu) evtchn = bind_virq.port; /* Record the new mapping. */ - xen_irq_info_virq_init(cpu, irq, evtchn, virq); + xen_irq_info_virq_setup(cpu, irq, evtchn, virq); bind_evtchn_to_cpu(evtchn, cpu); } } @@ -1336,7 +1362,7 @@ static void restore_cpu_ipis(unsigned int cpu) evtchn = bind_ipi.port; /* Record the new mapping. */ - xen_irq_info_ipi_init(cpu, irq, evtchn, ipi); + xen_irq_info_ipi_setup(cpu, irq, evtchn, ipi); bind_evtchn_to_cpu(evtchn, cpu); } } -- 1.7.2.5
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 09/13] xen/events: add an evtchn_op for port setup
From: David Vrabel <david.vrabel@citrix.com>

Add a hook for port-specific setup and call it from
xen_irq_info_common_setup().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/events/events.c          |    2 +-
 drivers/xen/events/events_internal.h |    8 ++++++++
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c
index 8f55a49..cf1c7ba 100644
--- a/drivers/xen/events/events.c
+++ b/drivers/xen/events/events.c
@@ -117,7 +117,7 @@ static int xen_irq_info_common_setup(struct irq_info *info,
 
 	irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN);
 
-	return 0;
+	return xen_evtchn_port_setup(info);
 }
 
 static int xen_irq_info_evtchn_setup(unsigned irq,
diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h
index 513c34d..32cb928 100644
--- a/drivers/xen/events/events_internal.h
+++ b/drivers/xen/events/events_internal.h
@@ -55,6 +55,7 @@ struct irq_info {
 #define PIRQ_SHAREABLE	(1 << 1)
 
 struct evtchn_ops {
+	int (*setup)(struct irq_info *info);
 	void (*bind_to_cpu)(struct irq_info *info, int cpu);
@@ -75,6 +76,13 @@ struct evtchn_ops {
 struct irq_info *info_for_irq(unsigned irq);
 unsigned cpu_from_irq(unsigned irq);
 unsigned cpu_from_evtchn(unsigned int evtchn);
 
+static inline int xen_evtchn_port_setup(struct irq_info *info)
+{
+	if (evtchn_ops->setup)
+		return evtchn_ops->setup(info);
+	return 0;
+}
+
 static inline void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu)
 {
 	evtchn_ops->bind_to_cpu(info, cpu);
-- 
1.7.2.5
David Vrabel
2013-Sep-13 16:59 UTC
[PATCH 10/13] xen/events: Refactor evtchn_to_irq array to be dynamically allocated
From: Malcolm Crossley <malcolm.crossley@citrix.com> Refactor static array evtchn_to_irq array to be dynamically allocated by implementing get and set functions for accesses to the array. Two new port ops are added: max_channels (maximum supported number of event channels) and nr_channels (number of currently usable event channels). For the N-level ABI, these numbers are both the same as the shared data structure is a fixed size. For the FIFO ABI, these will be different as the event array is expanded dynamically. This allows more than 65000 event channels so an unsigned short is no longer sufficient for an event channel port number and unsigned int is used instead. Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events/events.c | 129 +++++++++++++++++++++++++--------- drivers/xen/events/events_internal.h | 18 ++++- drivers/xen/events/n-level.c | 11 +++- 3 files changed, 121 insertions(+), 37 deletions(-) diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c index cf1c7ba..8660459 100644 --- a/drivers/xen/events/events.c +++ b/drivers/xen/events/events.c @@ -77,12 +77,16 @@ static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; /* IRQ <-> IPI mapping */ static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1}; -int *evtchn_to_irq; +int **evtchn_to_irq; #ifdef CONFIG_X86 static unsigned long *pirq_eoi_map; #endif static bool (*pirq_needs_eoi)(unsigned irq); +#define EVTCHN_ROW(e) (e / (PAGE_SIZE/sizeof(**evtchn_to_irq))) +#define EVTCHN_COL(e) (e % (PAGE_SIZE/sizeof(**evtchn_to_irq))) +#define EVTCHN_PER_ROW (PAGE_SIZE / sizeof(**evtchn_to_irq)) + /* Xen will never allocate port zero for any purpose. 
*/ #define VALID_EVTCHN(chn) ((chn) != 0) @@ -92,6 +96,61 @@ static struct irq_chip xen_pirq_chip; static void enable_dynirq(struct irq_data *data); static void disable_dynirq(struct irq_data *data); +static void clear_evtchn_to_irq_row(unsigned row) +{ + unsigned col; + + for (col = 0; col < EVTCHN_PER_ROW; col++) + evtchn_to_irq[row][col] = -1; +} + +static void clear_evtchn_to_irq_all(void) +{ + unsigned row; + + for (row = 0; row < EVTCHN_ROW(xen_evtchn_max_channels()); row++) { + if (evtchn_to_irq[row] == NULL) + continue; + clear_evtchn_to_irq_row(row); + } +} + +static int set_evtchn_to_irq(unsigned evtchn, unsigned irq) +{ + unsigned row; + unsigned col; + + if (evtchn >= xen_evtchn_max_channels()) + return -EINVAL; + + row = EVTCHN_ROW(evtchn); + col = EVTCHN_COL(evtchn); + + if (evtchn_to_irq[row] == NULL) { + /* Unallocated irq entries return -1 anyway */ + if (irq == -1) + return 0; + + evtchn_to_irq[row] = (int *)get_zeroed_page(GFP_KERNEL); + if (evtchn_to_irq[row] == NULL) + return -ENOMEM; + + clear_evtchn_to_irq_row(row); + } + + evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)] = irq; + return 0; +} + +int get_evtchn_to_irq(unsigned evtchn) +{ + if (evtchn >= xen_evtchn_max_channels()) + return -1; + if (evtchn_to_irq[EVTCHN_ROW(evtchn)] == NULL) + return -1; + return evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)]; +} + /* Get info for IRQ */ struct irq_info *info_for_irq(unsigned irq) { @@ -102,7 +161,7 @@ struct irq_info *info_for_irq(unsigned irq) static int xen_irq_info_common_setup(struct irq_info *info, unsigned irq, enum xen_irq_type type, - unsigned short evtchn, + unsigned evtchn, unsigned short cpu) { @@ -113,7 +172,8 @@ static int xen_irq_info_common_setup(struct irq_info *info, info->evtchn = evtchn; info->cpu = cpu; - evtchn_to_irq[evtchn] = irq; + if (set_evtchn_to_irq(evtchn, irq)) + return -ENOMEM; irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN); @@ -121,7 +181,7 @@ static int xen_irq_info_common_setup(struct 
irq_info *info, } static int xen_irq_info_evtchn_setup(unsigned irq, - unsigned short evtchn) + unsigned evtchn) { struct irq_info *info = info_for_irq(irq); @@ -130,7 +190,7 @@ static int xen_irq_info_evtchn_setup(unsigned irq, static int xen_irq_info_ipi_setup(unsigned cpu, unsigned irq, - unsigned short evtchn, + unsigned evtchn, enum ipi_vector ipi) { struct irq_info *info = info_for_irq(irq); @@ -144,8 +204,8 @@ static int xen_irq_info_ipi_setup(unsigned cpu, static int xen_irq_info_virq_setup(unsigned cpu, unsigned irq, - unsigned short evtchn, - unsigned short virq) + unsigned evtchn, + unsigned virq) { struct irq_info *info = info_for_irq(irq); @@ -157,9 +217,9 @@ static int xen_irq_info_virq_setup(unsigned cpu, } static int xen_irq_info_pirq_setup(unsigned irq, - unsigned short evtchn, - unsigned short pirq, - unsigned short gsi, + unsigned evtchn, + unsigned pirq, + unsigned gsi, uint16_t domid, unsigned char flags) { @@ -186,7 +246,7 @@ static unsigned int evtchn_from_irq(unsigned irq) unsigned irq_from_evtchn(unsigned int evtchn) { - return evtchn_to_irq[evtchn]; + return get_evtchn_to_irq(evtchn); } EXPORT_SYMBOL_GPL(irq_from_evtchn); @@ -232,7 +292,7 @@ unsigned cpu_from_irq(unsigned irq) unsigned int cpu_from_evtchn(unsigned int evtchn) { - int irq = evtchn_to_irq[evtchn]; + int irq = get_evtchn_to_irq(evtchn); unsigned ret = 0; if (irq != -1) @@ -258,7 +318,7 @@ static bool pirq_needs_eoi_flag(unsigned irq) static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) { - int irq = evtchn_to_irq[chn]; + int irq = get_evtchn_to_irq(chn); struct irq_info *info = info_for_irq(irq); BUG_ON(irq == -1); @@ -453,7 +513,9 @@ static unsigned int __startup_pirq(unsigned int irq) pirq_query_unmask(irq); - evtchn_to_irq[evtchn] = irq; + rc = set_evtchn_to_irq(evtchn, irq); + if (rc != 0) + return 0; bind_evtchn_to_cpu(evtchn, 0); info->evtchn = evtchn; @@ -474,7 +536,7 @@ static void shutdown_pirq(struct irq_data *data) struct evtchn_close close; unsigned 
int irq = data->irq; struct irq_info *info = info_for_irq(irq); - int evtchn = evtchn_from_irq(irq); + int rc, evtchn = evtchn_from_irq(irq); BUG_ON(info->type != IRQT_PIRQ); @@ -488,7 +550,7 @@ static void shutdown_pirq(struct irq_data *data) BUG(); bind_evtchn_to_cpu(evtchn, 0); - evtchn_to_irq[evtchn] = -1; + rc = set_evtchn_to_irq(evtchn, -1); info->evtchn = 0; } @@ -551,7 +613,7 @@ static void __unbind_from_irq(unsigned int irq) /* Closed ports are implicitly re-bound to VCPU0. */ bind_evtchn_to_cpu(evtchn, 0); - evtchn_to_irq[evtchn] = -1; + set_evtchn_to_irq(evtchn, -1); } BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); @@ -755,9 +817,12 @@ int bind_evtchn_to_irq(unsigned int evtchn) int irq; int ret; + if (evtchn >= xen_evtchn_max_channels()) + return -ENOMEM; + mutex_lock(&irq_mapping_update_lock); - irq = evtchn_to_irq[evtchn]; + irq = get_evtchn_to_irq(evtchn); if (irq == -1) { irq = xen_allocate_irq_dynamic(); @@ -847,7 +912,7 @@ static int find_virq(unsigned int virq, unsigned int cpu) int port, rc = -ENOENT; memset(&status, 0, sizeof(status)); - for (port = 0; port <= NR_EVENT_CHANNELS; port++) { + for (port = 0; port < xen_evtchn_max_channels(); port++) { status.dom = DOMID_SELF; status.port = port; rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status); @@ -927,8 +992,9 @@ int bind_evtchn_to_irqhandler(unsigned int evtchn, int irq, retval; irq = bind_evtchn_to_irq(evtchn); - if (irq < 0) + if (irq < 0){ return irq; + } retval = request_irq(irq, handler, irqflags, devname, dev_id); if (retval != 0) { unbind_from_irq(irq); @@ -1017,7 +1083,7 @@ EXPORT_SYMBOL_GPL(unbind_from_irqhandler); int evtchn_make_refcounted(unsigned int evtchn) { - int irq = evtchn_to_irq[evtchn]; + int irq = get_evtchn_to_irq(evtchn); struct irq_info *info; if (irq == -1) @@ -1042,12 +1108,12 @@ int evtchn_get(unsigned int evtchn) struct irq_info *info; int err = -ENOENT; - if (evtchn >= NR_EVENT_CHANNELS) + if (evtchn >= xen_evtchn_max_channels()) return -EINVAL; 
mutex_lock(&irq_mapping_update_lock); - irq = evtchn_to_irq[evtchn]; + irq = get_evtchn_to_irq(evtchn); if (irq == -1) goto done; @@ -1071,7 +1137,7 @@ EXPORT_SYMBOL_GPL(evtchn_get); void evtchn_put(unsigned int evtchn) { - int irq = evtchn_to_irq[evtchn]; + int irq = get_evtchn_to_irq(evtchn); if (WARN_ON(irq == -1)) return; unbind_from_irq(irq); @@ -1158,7 +1224,7 @@ void rebind_evtchn_irq(int evtchn, int irq) mutex_lock(&irq_mapping_update_lock); /* After resume the irq<->evtchn mappings are all cleared out */ - BUG_ON(evtchn_to_irq[evtchn] != -1); + BUG_ON(get_evtchn_to_irq(evtchn) != -1); /* Expect irq to have been bound before, so there should be a proper type */ BUG_ON(info->type == IRQT_UNBOUND); @@ -1443,15 +1509,14 @@ void xen_irq_resume(void) struct irq_info *info; /* New event-channel space is not ''live'' yet. */ - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) + for (evtchn = 0; evtchn < xen_evtchn_nr_channels(); evtchn++) mask_evtchn(evtchn); /* No IRQ <-> event-channel mappings. */ list_for_each_entry(info, &xen_irq_list_head, list) info->evtchn = 0; /* zap event-channel binding */ - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) - evtchn_to_irq[evtchn] = -1; + clear_evtchn_to_irq_all(); for_each_possible_cpu(cpu) { restore_cpu_virqs(cpu); @@ -1548,14 +1613,12 @@ void __init xen_init_IRQ(void) xen_evtchn_init_nlevel(); - evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), - GFP_KERNEL); + evtchn_to_irq = kcalloc(EVTCHN_ROW(xen_evtchn_max_channels()), + sizeof(*evtchn_to_irq), GFP_KERNEL); BUG_ON(!evtchn_to_irq); - for (i = 0; i < NR_EVENT_CHANNELS; i++) - evtchn_to_irq[i] = -1; /* No event channels are ''live'' right now. 
*/ - for (i = 0; i < NR_EVENT_CHANNELS; i++) + for (i = 0; i < xen_evtchn_nr_channels(); i++) mask_evtchn(i); pirq_needs_eoi = pirq_needs_eoi_flag; diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index 32cb928..9d8b70c 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -35,7 +35,7 @@ struct irq_info { int refcnt; enum xen_irq_type type; /* type */ unsigned irq; - unsigned short evtchn; /* event channel */ + unsigned int evtchn; /* event channel */ unsigned short cpu; /* cpu bound */ union { @@ -55,6 +55,9 @@ struct irq_info { #define PIRQ_SHAREABLE (1 << 1) struct evtchn_ops { + unsigned (*max_channels)(void); + unsigned (*nr_channels)(void); + int (*setup)(struct irq_info *info); void (*bind_to_cpu)(struct irq_info *info, int cpu); @@ -70,12 +73,23 @@ struct evtchn_ops { extern const struct evtchn_ops *evtchn_ops; -extern int *evtchn_to_irq; +extern int **evtchn_to_irq; +int get_evtchn_to_irq(unsigned int evtchn); struct irq_info *info_for_irq(unsigned irq); unsigned cpu_from_irq(unsigned irq); unsigned cpu_from_evtchn(unsigned int evtchn); +static inline unsigned xen_evtchn_max_channels(void) +{ + return evtchn_ops->max_channels(); +} + +static inline unsigned xen_evtchn_nr_channels(void) +{ + return evtchn_ops->nr_channels(); +} + static inline int xen_evtchn_port_setup(struct irq_info *info) { if (evtchn_ops->setup) diff --git a/drivers/xen/events/n-level.c b/drivers/xen/events/n-level.c index 6c0662e..da8d0aa 100644 --- a/drivers/xen/events/n-level.c +++ b/drivers/xen/events/n-level.c @@ -39,6 +39,11 @@ static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], cpu_evtchn_mask); +static unsigned nlevel_max_channels(void) +{ + return NR_EVENT_CHANNELS; +} + static void nlevel_bind_to_cpu(struct irq_info *info, int cpu) { clear_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, info->cpu))); @@ -212,7 +217,7 @@ static void nlevel_handle_events(int cpu) /* Process 
port. */ port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; - irq = evtchn_to_irq[port]; + irq = get_evtchn_to_irq(port); if (irq != -1) { desc = irq_to_desc(irq); @@ -306,7 +311,7 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) int word_idx = i / BITS_PER_EVTCHN_WORD; printk(" %d: event %d -> irq %d%s%s%s\n", cpu_from_evtchn(i), i, - evtchn_to_irq[i], + get_evtchn_to_irq(i), sync_test_bit(word_idx, BM(&v->evtchn_pending_sel)) ? "" : " l2-clear", !sync_test_bit(i, BM(sh->evtchn_mask)) @@ -322,6 +327,8 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) } static const struct evtchn_ops evtchn_ops_nlevel = { + .max_channels = nlevel_max_channels, + .nr_channels = nlevel_max_channels, .bind_to_cpu = nlevel_bind_to_cpu, .clear_pending = nlevel_clear_pending, .set_pending = nlevel_set_pending, -- 1.7.2.5
From: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events/events.c | 18 ++++++++++++------ 1 files changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c index 8660459..6c55b1f 100644 --- a/drivers/xen/events/events.c +++ b/drivers/xen/events/events.c @@ -331,6 +331,14 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) info->cpu = cpu; } +static void xen_evtchn_mask_all(void) +{ + unsigned evtchn; + + for (evtchn = 0; evtchn < xen_evtchn_nr_channels(); evtchn++) + mask_evtchn(evtchn); +} + /** * notify_remote_via_irq - send event to remote end of event channel via irq * @irq: irq of event channel to send event to @@ -1505,12 +1513,11 @@ EXPORT_SYMBOL_GPL(xen_test_irq_shared); void xen_irq_resume(void) { - unsigned int cpu, evtchn; + unsigned int cpu; struct irq_info *info; /* New event-channel space is not ''live'' yet. */ - for (evtchn = 0; evtchn < xen_evtchn_nr_channels(); evtchn++) - mask_evtchn(evtchn); + xen_evtchn_mask_all(); /* No IRQ <-> event-channel mappings. */ list_for_each_entry(info, &xen_irq_list_head, list) @@ -1609,7 +1616,7 @@ void xen_callback_vector(void) {} void __init xen_init_IRQ(void) { - int i; + int ret; xen_evtchn_init_nlevel(); @@ -1618,8 +1625,7 @@ void __init xen_init_IRQ(void) BUG_ON(!evtchn_to_irq); /* No event channels are ''live'' right now. */ - for (i = 0; i < xen_evtchn_nr_channels(); i++) - mask_evtchn(i); + xen_evtchn_mask_all(); pirq_needs_eoi = pirq_needs_eoi_flag; -- 1.7.2.5
David Vrabel
2013-Sep-13 17:00 UTC
[PATCH 12/13] xen/events: Add the hypervisor interface for the FIFO-based event channels
From: David Vrabel <david.vrabel@citrix.com> Add the hypercall sub-ops and the structures for the shared data used in the FIFO-based event channel ABI. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- include/xen/interface/event_channel.h | 67 +++++++++++++++++++++++++++++++++ 1 files changed, 67 insertions(+), 0 deletions(-) diff --git a/include/xen/interface/event_channel.h b/include/xen/interface/event_channel.h index f494292..f54ad0a 100644 --- a/include/xen/interface/event_channel.h +++ b/include/xen/interface/event_channel.h @@ -190,6 +190,48 @@ struct evtchn_reset { }; typedef struct evtchn_reset evtchn_reset_t; +/* + * EVTCHNOP_init_control: initialize the control block for the FIFO ABI. + */ +#define EVTCHNOP_init_control 11 +struct evtchn_init_control { + /* IN parameters. */ + uint64_t control_mfn; + uint32_t offset; + uint32_t vcpu; + /* OUT parameters. */ + uint8_t link_bits; +}; + +/* + * EVTCHNOP_expand_array: add an additional page to the event array. + */ +#define EVTCHNOP_expand_array 12 +struct evtchn_expand_array { + /* IN parameters. */ + uint64_t array_mfn; +}; + +/* + * EVTCHNOP_set_priority: set the priority for an event channel. + */ +#define EVTCHNOP_set_priority 13 +struct evtchn_set_priority { + /* IN parameters. */ + uint32_t port; + uint32_t priority; +}; + +/* + * EVTCHNOP_set_limit: set the maximum event channel port that may be bound. + */ +#define EVTCHNOP_set_limit 14 +struct evtchn_set_limit { + /* IN parameters. */ + uint32_t domid; + uint32_t max_port; +}; + struct evtchn_op { uint32_t cmd; /* EVTCHNOP_* */ union { @@ -207,4 +249,29 @@ struct evtchn_op { }; DEFINE_GUEST_HANDLE_STRUCT(evtchn_op); +/* + * FIFO ABI + */ + +/* Events may have priorities from 0 (highest) to 15 (lowest). 
*/ +#define EVTCHN_FIFO_PRIORITY_MIN 15 +#define EVTCHN_FIFO_PRIORITY_DEFAULT 7 + +#define EVTCHN_FIFO_MAX_QUEUES (EVTCHN_FIFO_PRIORITY_MIN + 1) + +typedef uint32_t event_word_t; + +#define EVTCHN_FIFO_PENDING 31 +#define EVTCHN_FIFO_MASKED 30 +#define EVTCHN_FIFO_LINKED 29 + +#define EVTCHN_FIFO_LINK_BITS 17 +#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1) + +struct evtchn_fifo_control_block { + uint32_t ready; + uint32_t _rsvd1; + event_word_t head[EVTCHN_FIFO_MAX_QUEUES]; +}; + #endif /* __XEN_PUBLIC_EVENT_CHANNEL_H__ */ -- 1.7.2.5
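[Editorial note: to illustrate the event-word layout patch 12 defines, here is a user-space sketch (not part of the series) that decodes a FIFO event word. The top three bits are the PENDING/MASKED/LINKED flags and the low 17 bits hold the link to the next port in the queue, which also bounds the ABI at 2^17 = 131072 event channels. The helper names are hypothetical; the bit positions come from the header above.]

```c
#include <stdint.h>

typedef uint32_t event_word_t;

/* Bit positions from the proposed ABI header (patch 12). */
#define EVTCHN_FIFO_PENDING 31
#define EVTCHN_FIFO_MASKED  30
#define EVTCHN_FIFO_LINKED  29
#define EVTCHN_FIFO_LINK_BITS 17
#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1)

/* Extract the link (port number of the next event in the queue). */
static inline uint32_t fifo_link(event_word_t w)
{
	return w & EVTCHN_FIFO_LINK_MASK;
}

static inline int fifo_is_pending(event_word_t w)
{
	return (w >> EVTCHN_FIFO_PENDING) & 1;
}

static inline int fifo_is_masked(event_word_t w)
{
	return (w >> EVTCHN_FIFO_MASKED) & 1;
}
```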
David Vrabel
2013-Sep-13 17:00 UTC
[PATCH 13/13] xen/events: use the FIFO-based ABI if available
From: David Vrabel <david.vrabel@citrix.com> If the hypervisor supports the FIFO-based ABI, enable it by initializing the control block for the boot VCPU and subsequent VCPUs as they are brought up. The event array is expanded as required when event ports are setup. This implementation has some known limitations (which will be corrected in a subsequent patch): - The timer VIRQ which previously was treated as the highest priority event has the default priority. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- drivers/xen/events/Makefile | 1 + drivers/xen/events/events.c | 7 +- drivers/xen/events/events_internal.h | 8 + drivers/xen/events/fifo.c | 395 ++++++++++++++++++++++++++++++++++ 4 files changed, 410 insertions(+), 1 deletions(-) create mode 100644 drivers/xen/events/fifo.c diff --git a/drivers/xen/events/Makefile b/drivers/xen/events/Makefile index aea331e..74644d0 100644 --- a/drivers/xen/events/Makefile +++ b/drivers/xen/events/Makefile @@ -1,2 +1,3 @@ obj-y += events.o +obj-y += fifo.o obj-y += n-level.o diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c index 6c55b1f..561ae7b 100644 --- a/drivers/xen/events/events.c +++ b/drivers/xen/events/events.c @@ -1518,6 +1518,7 @@ void xen_irq_resume(void) /* New event-channel space is not ''live'' yet. */ xen_evtchn_mask_all(); + xen_evtchn_resume(); /* No IRQ <-> event-channel mappings. 
*/ list_for_each_entry(info, &xen_irq_list_head, list) @@ -1618,7 +1619,11 @@ void __init xen_init_IRQ(void) { int ret; - xen_evtchn_init_nlevel(); + ret = xen_evtchn_init_fifo(); + if (ret < 0) { + printk(KERN_INFO "xen: falling back to n-level event channels"); + xen_evtchn_init_nlevel(); + } evtchn_to_irq = kcalloc(EVTCHN_ROW(xen_evtchn_max_channels()), sizeof(*evtchn_to_irq), GFP_KERNEL); diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index 9d8b70c..f7fbcea 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -69,6 +69,7 @@ struct evtchn_ops { void (*unmask)(int port); void (*handle_events)(int cpu); + void (*resume)(void); }; extern const struct evtchn_ops *evtchn_ops; @@ -137,6 +138,13 @@ static inline void xen_evtchn_handle_events(int cpu) return evtchn_ops->handle_events(cpu); } +static inline void xen_evtchn_resume(void) +{ + if (evtchn_ops->resume) + evtchn_ops->resume(); +} + void xen_evtchn_init_nlevel(void); +int xen_evtchn_init_fifo(void); #endif /* #ifndef __EVENTS_INTERNAL_H__ */ diff --git a/drivers/xen/events/fifo.c b/drivers/xen/events/fifo.c new file mode 100644 index 0000000..8047430 --- /dev/null +++ b/drivers/xen/events/fifo.c @@ -0,0 +1,395 @@ +/* + * Xen event channels (FIFO-based ABI) + * + * Copyright (C) 2013 Citrix Systems R&D ltd. + * + * This source code is licensed under the GNU General Public License, + * Version 2 or later. See the file COPYING for more details. 
+ */ + +#include <linux/linkage.h> +#include <linux/interrupt.h> +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/smp.h> +#include <linux/percpu.h> +#include <linux/cpu.h> + +#include <asm/sync_bitops.h> +#include <asm/xen/hypercall.h> +#include <asm/xen/hypervisor.h> +#include <asm/xen/page.h> + +#include <xen/xen.h> +#include <xen/xen-ops.h> +#include <xen/events.h> +#include <xen/interface/xen.h> +#include <xen/interface/event_channel.h> + +#include "events_internal.h" + +#define EVENT_WORDS_PER_PAGE (PAGE_SIZE / sizeof(event_word_t)) +#define MAX_EVENT_ARRAY_PAGES ((1 << EVTCHN_FIFO_LINK_BITS) \ + / EVENT_WORDS_PER_PAGE) + +struct evtchn_fifo_queue { + uint32_t head[EVTCHN_FIFO_MAX_QUEUES]; +}; + +static DEFINE_PER_CPU(struct evtchn_fifo_control_block *, cpu_control_block); +static DEFINE_PER_CPU(struct evtchn_fifo_queue, cpu_queue); +static event_word_t *event_array[MAX_EVENT_ARRAY_PAGES]; +static unsigned event_array_pages; + +#define BM(w) ((unsigned long *)(w)) + +static inline event_word_t *event_word_from_port(int port) +{ + int i = port / EVENT_WORDS_PER_PAGE; + + if (i >= event_array_pages) + return NULL; + return event_array[i] + port % EVENT_WORDS_PER_PAGE; +} + +static unsigned fifo_max_channels(void) +{ + return 1 << EVTCHN_FIFO_LINK_BITS; +} + +static unsigned fifo_nr_channels(void) +{ + return event_array_pages * EVENT_WORDS_PER_PAGE; +} + +static void free_unused_array_pages(void) +{ + unsigned i; + + for (i = event_array_pages; i < MAX_EVENT_ARRAY_PAGES; i++) { + if (!event_array[i]) + break; + free_page((unsigned long)event_array[i]); + event_array[i] = NULL; + } +} + +static int fifo_setup(struct irq_info *info) +{ + int port = info->evtchn; + int i; + int ret = -ENOMEM; + + i = port / EVENT_WORDS_PER_PAGE; + + if (i >= MAX_EVENT_ARRAY_PAGES) + return -EINVAL; + + while (i >= event_array_pages) { + void *array_page; + struct evtchn_expand_array expand_array; + + /* Might already have a page if we''ve resumed. 
*/ + array_page = event_array[event_array_pages]; + if (!array_page) { + array_page = (void *)get_zeroed_page(GFP_KERNEL); + if (array_page == NULL) + goto error; + event_array[event_array_pages] = array_page; + } else + memset(array_page, 0, PAGE_SIZE); + + expand_array.array_mfn = virt_to_mfn(array_page); + + ret = HYPERVISOR_event_channel_op(EVTCHNOP_expand_array, &expand_array); + if (ret < 0) + goto error; + + event_array_pages++; + } + return 0; + + error: + if (event_array_pages == 0) + panic("xen: unable to expand event array with initial page (%d)\n", ret); + else + printk(KERN_ERR "xen: unable to expand event array (%d)\n", ret); + free_unused_array_pages(); + return ret; +} + +static void fifo_bind_to_cpu(struct irq_info *info, int cpu) +{ + /* no-op */ +} + +static void fifo_clear_pending(int port) +{ + event_word_t *word = event_word_from_port(port); + sync_clear_bit(EVTCHN_FIFO_PENDING, BM(word)); +} + +static void fifo_set_pending(int port) +{ + event_word_t *word = event_word_from_port(port); + sync_set_bit(EVTCHN_FIFO_PENDING, BM(word)); +} + +static bool fifo_is_pending(int port) +{ + event_word_t *word = event_word_from_port(port); + return sync_test_bit(EVTCHN_FIFO_PENDING, BM(word)); +} + +static bool fifo_test_and_set_mask(int port) +{ + event_word_t *word = event_word_from_port(port); + return sync_test_and_set_bit(EVTCHN_FIFO_MASKED, BM(word)); +} + +static void fifo_mask(int port) +{ + event_word_t *word = event_word_from_port(port); + if (word) + sync_set_bit(EVTCHN_FIFO_MASKED, BM(word)); +} + +static void fifo_unmask(int port) +{ + unsigned int cpu = get_cpu(); + bool do_hypercall = false; + bool evtchn_pending = false; + + BUG_ON(!irqs_disabled()); + + if (unlikely((cpu != cpu_from_evtchn(port)))) + do_hypercall = true; + else { + event_word_t *word = event_word_from_port(port); + + sync_clear_bit(EVTCHN_FIFO_MASKED, BM(word)); + evtchn_pending = sync_test_bit(EVTCHN_FIFO_PENDING, BM(word)); + if (evtchn_pending) + do_hypercall = true; 
+ } + + if (do_hypercall) { + struct evtchn_unmask unmask = { .port = port }; + (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); + } + + put_cpu(); +} + +static uint32_t clear_linked(volatile event_word_t *word) +{ + event_word_t n, o, w; + + w = *word; + + do { + o = w; + n = (w & ~((1 << EVTCHN_FIFO_LINKED) | EVTCHN_FIFO_LINK_MASK)); + } while ( (w = sync_cmpxchg(word, o, n)) != o ); + + return w & EVTCHN_FIFO_LINK_MASK; +} + +static void handle_irq_for_port(int port) +{ + int irq; + struct irq_desc *desc; + + irq = get_evtchn_to_irq(port); + if (irq != -1) { + desc = irq_to_desc(irq); + if (desc) + generic_handle_irq_desc(irq, desc); + } +} + +static void consume_one_event(int cpu, + struct evtchn_fifo_control_block *control_block, + int priority, uint32_t *ready) +{ + struct evtchn_fifo_queue *q = &per_cpu(cpu_queue, cpu); + uint32_t head; + int port; + event_word_t *word; + + head = q->head[priority]; + + /* Reached the tail last time? Read the new HEAD from the + control block. */ + if (head == 0) { + rmb(); /* Ensure word is up-to-date before reading head. */ + head = control_block->head[priority]; + } + + port = head; + word = event_word_from_port(port); + head = clear_linked(word); + + /* + * If the link is non-zero, there are more events in the + * queue, otherwise the queue is empty. + * + * If the queue is empty, clear this priority from our local + * copy of the ready word. 
+ */ + if (head == 0) + clear_bit(priority, BM(ready)); + + if (sync_test_bit(EVTCHN_FIFO_PENDING, BM(word)) + && !sync_test_bit(EVTCHN_FIFO_MASKED, BM(word))) + handle_irq_for_port(port); + + q->head[priority] = head; +} + +#define EVTCHN_FIFO_READY_MASK ((1 << EVTCHN_FIFO_MAX_QUEUES) - 1) + +static void fifo_handle_events(int cpu) +{ + struct evtchn_fifo_control_block *control_block; + uint32_t ready; + int q; + + control_block = per_cpu(cpu_control_block, cpu); + + ready = xchg(&control_block->ready, 0); + + while (ready & EVTCHN_FIFO_READY_MASK) { + q = find_first_bit(BM(&ready), EVTCHN_FIFO_MAX_QUEUES); + consume_one_event(cpu, control_block, q, &ready); + ready |= xchg(&control_block->ready, 0); + } +} + +static void fifo_resume(void) +{ + unsigned cpu; + + for_each_possible_cpu(cpu) { + void *control_block = per_cpu(cpu_control_block, cpu); + struct evtchn_init_control init_control; + int ret; + + if (!control_block) + continue; + + /* + * If this CPU is offline, take the opportunity to + * free the control block while it is not being + * used. + */ + if (!cpu_online(cpu)) { + free_page((unsigned long)control_block); + per_cpu(cpu_control_block, cpu) = NULL; + continue; + } + + init_control.control_mfn = virt_to_mfn(control_block); + init_control.offset = 0; + init_control.vcpu = cpu; + + ret = HYPERVISOR_event_channel_op(EVTCHNOP_init_control, + &init_control); + if (ret < 0) + BUG(); + } + + /* + * The event array starts out as empty again and is extended + * as normal when events are bound. The existing pages will + * be reused. 
+ */ + event_array_pages = 0; +} + +static const struct evtchn_ops evtchn_ops_fifo = { + .max_channels = fifo_max_channels, + .nr_channels = fifo_nr_channels, + .setup = fifo_setup, + .bind_to_cpu = fifo_bind_to_cpu, + .clear_pending = fifo_clear_pending, + .set_pending = fifo_set_pending, + .is_pending = fifo_is_pending, + .test_and_set_mask = fifo_test_and_set_mask, + .mask = fifo_mask, + .unmask = fifo_unmask, + .handle_events = fifo_handle_events, + .resume = fifo_resume, +}; + +static int __cpuinit fifo_init_control_block(int cpu) +{ + struct page *control_block = NULL; + struct evtchn_init_control init_control; + int ret = -ENOMEM; + + control_block = alloc_page(GFP_KERNEL|__GFP_ZERO); + if (control_block == NULL) + goto error; + + init_control.control_mfn = virt_to_mfn(page_address(control_block)); + init_control.offset = 0; + init_control.vcpu = cpu; + + ret = HYPERVISOR_event_channel_op(EVTCHNOP_init_control, &init_control); + if (ret < 0) + goto error; + + per_cpu(cpu_control_block, cpu) = page_address(control_block); + + return 0; + + error: + __free_page(control_block); + return ret; +} + +static int __cpuinit fifo_cpu_notification(struct notifier_block *self, + unsigned long action, void *hcpu) +{ + int cpu = (long)hcpu; + int ret = 0; + + switch (action) { + case CPU_UP_PREPARE: + if (!per_cpu(cpu_control_block, cpu)) + ret = fifo_init_control_block(cpu); + break; + default: + break; + } + return ret < 0 ? NOTIFY_BAD : NOTIFY_OK; +} + +static struct notifier_block fifo_cpu_notifier __cpuinitdata = { + .notifier_call = fifo_cpu_notification, +}; + + +int __init xen_evtchn_init_fifo(void) +{ + int cpu = get_cpu(); + int ret; + + ret = fifo_init_control_block(cpu); + if (ret < 0) + goto error; + + printk(KERN_INFO "xen: switching to FIFO-based event channels\n"); + + evtchn_ops = &evtchn_ops_fifo; + + register_cpu_notifier(&fifo_cpu_notifier); + + put_cpu(); + return 0; + + error: + put_cpu(); + return ret; +} -- 1.7.2.5
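[Editorial note: the trickiest part of the consumer above is clear_linked(), which must atomically clear both the LINKED bit and the link field while returning the old link, so that a concurrent hypervisor append is not lost. A user-space model (not the kernel code; it substitutes the GCC __sync builtin for the kernel's sync_cmpxchg()) shows the intended semantics:]

```c
#include <stdint.h>

typedef uint32_t event_word_t;

#define EVTCHN_FIFO_LINKED 29
#define EVTCHN_FIFO_LINK_BITS 17
#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1)

/* A word with PENDING (bit 31) and LINKED set, linking to port 123. */
static event_word_t demo_word = (1u << 31) | (1u << 29) | 123;

/* Atomically clear LINKED and the link field, returning the old link.
 * All other bits (PENDING, MASKED) are preserved. */
static uint32_t model_clear_linked(volatile event_word_t *word)
{
	event_word_t n, o, w;

	w = *word;
	do {
		o = w;
		n = w & ~((1 << EVTCHN_FIFO_LINKED) | EVTCHN_FIFO_LINK_MASK);
	} while ((w = __sync_val_compare_and_swap(word, o, n)) != o);

	return w & EVTCHN_FIFO_LINK_MASK;
}
```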
David Vrabel
2013-Sep-13 17:03 UTC
Re: [RFC PATCHv3 00/12] Linux: FIFO-based event channel ABI
Oops, no title and no list of changes, sorry.

Changes in v3:

- Support suspend/resume by reinitializing the control blocks on resume.
- Only init control block if one does not exist (should fix CPU hotplug).

David
Konrad Rzeszutek Wilk
2013-Sep-24 14:37 UTC
Re: [PATCH 01/13] xen/events: refactor retrigger_dynirq() and resend_irq_on_evtchn()
On Fri, Sep 13, 2013 at 05:59:49PM +0100, David Vrabel wrote:> From: David Vrabel <david.vrabel@citrix.com> > > These two function did the same thing with different parameters, put > the common bits in retrigger_evtchn().Plus the ''resend_irq_on_evtchn'' does not look to be used by anybody. Looks good to me.> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > drivers/xen/events.c | 27 +++++++++------------------ > 1 files changed, 9 insertions(+), 18 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 4035e83..ddcdbb5 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -1558,13 +1558,13 @@ static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, > return rebind_irq_to_cpu(data->irq, tcpu); > } > > -int resend_irq_on_evtchn(unsigned int irq) > +static int retrigger_evtchn(int evtchn) > { > - int masked, evtchn = evtchn_from_irq(irq); > + int masked; > struct shared_info *s = HYPERVISOR_shared_info; > > if (!VALID_EVTCHN(evtchn)) > - return 1; > + return 0; > > masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask)); > sync_set_bit(evtchn, BM(s->evtchn_pending)); > @@ -1574,6 +1574,11 @@ int resend_irq_on_evtchn(unsigned int irq) > return 1; > } > > +int resend_irq_on_evtchn(unsigned int irq) > +{ > + return retrigger_evtchn(evtchn_from_irq(irq)); > +} > + > static void enable_dynirq(struct irq_data *data) > { > int evtchn = evtchn_from_irq(data->irq); > @@ -1608,21 +1613,7 @@ static void mask_ack_dynirq(struct irq_data *data) > > static int retrigger_dynirq(struct irq_data *data) > { > - int evtchn = evtchn_from_irq(data->irq); > - struct shared_info *sh = HYPERVISOR_shared_info; > - int ret = 0; > - > - if (VALID_EVTCHN(evtchn)) { > - int masked; > - > - masked = sync_test_and_set_bit(evtchn, BM(sh->evtchn_mask)); > - sync_set_bit(evtchn, BM(sh->evtchn_pending)); > - if (!masked) > - unmask_evtchn(evtchn); > - ret = 1; > - } > - > - return ret; > + return 
retrigger_evtchn(evtchn_from_irq(data->irq)); > } > > static void restore_pirqs(void) > -- > 1.7.2.5 >
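[Editorial note: for readers following the refactor, the shared retrigger_evtchn() logic is: mask the port (remembering whether it was already masked), latch the pending bit, then unmask only if it was previously unmasked, so the unmask delivers the event while an already-masked port just stays pending. A user-space model with plain bitmaps (illustrative only, no hypervisor involved) follows:]

```c
#include <stdbool.h>

/* Toy mask/pending words standing in for shared_info bitmaps. */
static unsigned long mask_word, pending_word;

static bool model_test_and_set_mask(int port)
{
	bool was_masked = mask_word & (1UL << port);

	mask_word |= 1UL << port;
	return was_masked;
}

/* Model of retrigger_evtchn(): set pending under the mask, then
 * unmask if and only if the port was not masked beforehand. */
static int model_retrigger(int port)
{
	bool masked = model_test_and_set_mask(port);

	pending_word |= 1UL << port;		/* set_evtchn() */
	if (!masked)
		mask_word &= ~(1UL << port);	/* unmask_evtchn() */
	return 1;
}
```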
Konrad Rzeszutek Wilk
2013-Sep-24 14:40 UTC
Re: [PATCH 02/13] xen/events: remove unnecessary init_evtchn_cpu_bindings()
On Fri, Sep 13, 2013 at 05:59:50PM +0100, David Vrabel wrote:> From: David Vrabel <david.vrabel@citrix.com> > > Event channels are always explicitly bound to a specific VCPU before > they are first enabled. There is no need to initialize all possible > events as bound to VCPU 0 at start of day or after a resume.How long has this presumption about explicit bounding been in the hypervisor? Should there also be a patch in the Xen headers documenting this behavior? Looks good to me - thought I would add the comment mentioning how long this behavior has been with the hypervisor - as we still can run on Xen 4.0 hypervisors and should not break that (and perhaps they don''t rebind there?).> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > drivers/xen/events.c | 22 ---------------------- > 1 files changed, 0 insertions(+), 22 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index ddcdbb5..1e2c74b 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -334,24 +334,6 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) > info_for_irq(irq)->cpu = cpu; > } > > -static void init_evtchn_cpu_bindings(void) > -{ > - int i; > -#ifdef CONFIG_SMP > - struct irq_info *info; > - > - /* By default all event channels notify CPU#0. */ > - list_for_each_entry(info, &xen_irq_list_head, list) { > - struct irq_desc *desc = irq_to_desc(info->irq); > - cpumask_copy(desc->irq_data.affinity, cpumask_of(0)); > - } > -#endif > - > - for_each_possible_cpu(i) > - memset(per_cpu(cpu_evtchn_mask, i), > - (i == 0) ? ~0 : 0, NR_EVENT_CHANNELS/8); > -} > - > static inline void clear_evtchn(int port) > { > struct shared_info *s = HYPERVISOR_shared_info; > @@ -1778,8 +1760,6 @@ void xen_irq_resume(void) > unsigned int cpu, evtchn; > struct irq_info *info; > > - init_evtchn_cpu_bindings(); > - > /* New event-channel space is not ''live'' yet. 
*/ > for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) > mask_evtchn(evtchn); > @@ -1890,8 +1870,6 @@ void __init xen_init_IRQ(void) > for (i = 0; i < NR_EVENT_CHANNELS; i++) > evtchn_to_irq[i] = -1; > > - init_evtchn_cpu_bindings(); > - > /* No event channels are ''live'' right now. */ > for (i = 0; i < NR_EVENT_CHANNELS; i++) > mask_evtchn(i); > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-24 14:40 UTC
Re: [PATCH 03/13] xen/events: introduce test_and_set_mask
On Fri, Sep 13, 2013 at 05:59:51PM +0100, David Vrabel wrote:> From: Wei Liu <wei.liu2@citrix.com> > > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > Signed-off-by: David Vrabel <david.vrabel@citrix.com>Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>> --- > drivers/xen/events.c | 11 ++++++++--- > 1 files changed, 8 insertions(+), 3 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 1e2c74b..359e983 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -352,6 +352,12 @@ static inline int test_evtchn(int port) > return sync_test_bit(port, BM(&s->evtchn_pending[0])); > } > > +static inline int test_and_set_mask(int port) > +{ > + struct shared_info *s = HYPERVISOR_shared_info; > + return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); > +} > + > > /** > * notify_remote_via_irq - send event to remote end of event channel via irq > @@ -1493,7 +1499,6 @@ void rebind_evtchn_irq(int evtchn, int irq) > /* Rebind an evtchn so that it gets delivered to a specific cpu */ > static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu) > { > - struct shared_info *s = HYPERVISOR_shared_info; > struct evtchn_bind_vcpu bind_vcpu; > int evtchn = evtchn_from_irq(irq); > int masked; > @@ -1516,7 +1521,7 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu) > * Mask the event while changing the VCPU binding to prevent > * it being delivered on an unexpected VCPU. > */ > - masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask)); > + masked = test_and_set_mask(evtchn); > > /* > * If this fails, it usually just indicates that we''re dealing with a > @@ -1548,7 +1553,7 @@ static int retrigger_evtchn(int evtchn) > if (!VALID_EVTCHN(evtchn)) > return 0; > > - masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask)); > + masked = test_and_set_mask(evtchn); > sync_set_bit(evtchn, BM(s->evtchn_pending)); > if (!masked) > unmask_evtchn(evtchn); > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-24 14:41 UTC
Re: [PATCH 04/13] xen/events: replace raw bit ops with functions
On Fri, Sep 13, 2013 at 05:59:52PM +0100, David Vrabel wrote:> From: Wei Liu <wei.liu2@citrix.com>Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>> > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > drivers/xen/events.c | 3 +-- > 1 files changed, 1 insertions(+), 2 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 359e983..fec5da4 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -1548,13 +1548,12 @@ static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, > static int retrigger_evtchn(int evtchn) > { > int masked; > - struct shared_info *s = HYPERVISOR_shared_info; > > if (!VALID_EVTCHN(evtchn)) > return 0; > > masked = test_and_set_mask(evtchn); > - sync_set_bit(evtchn, BM(s->evtchn_pending)); > + set_evtchn(evtchn); > if (!masked) > unmask_evtchn(evtchn); > > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-24 14:44 UTC
Re: [PATCH 07/13] xen/events: add struct evtchn_ops for the low-level port operations
On Fri, Sep 13, 2013 at 05:59:55PM +0100, David Vrabel wrote:> From: David Vrabel <david.vrabel@citrix.com> > > evtchn_ops contains the low-level operations that access the shared > data structures. This allows alternate ABIs to be supported.Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > drivers/xen/events/events.c | 4 ++ > drivers/xen/events/events_internal.h | 62 +++++++++++++++++++++++++++++---- > drivers/xen/events/n-level.c | 32 +++++++++++++---- > 3 files changed, 82 insertions(+), 16 deletions(-) > > diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c > index 2700c12..8ecde63 100644 > --- a/drivers/xen/events/events.c > +++ b/drivers/xen/events/events.c > @@ -61,6 +61,8 @@ > > #include "events_internal.h" > > +const struct evtchn_ops *evtchn_ops; > + > /* > * This lock protects updates to the following mapping and reference-count > * arrays. The lock does not need to be acquired to read the mapping tables. 
> @@ -1518,6 +1520,8 @@ void __init xen_init_IRQ(void) > { > int i; > > + xen_evtchn_init_nlevel(); > + > evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), > GFP_KERNEL); > BUG_ON(!evtchn_to_irq); > diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h > index 79ac70b..513c34d 100644 > --- a/drivers/xen/events/events_internal.h > +++ b/drivers/xen/events/events_internal.h > @@ -54,21 +54,67 @@ struct irq_info { > #define PIRQ_NEEDS_EOI (1 << 0) > #define PIRQ_SHAREABLE (1 << 1) > > +struct evtchn_ops { > + void (*bind_to_cpu)(struct irq_info *info, int cpu); > + > + void (*clear_pending)(int port); > + void (*set_pending)(int port); > + bool (*is_pending)(int port); > + bool (*test_and_set_mask)(int port); > + void (*mask)(int port); > + void (*unmask)(int port); > + > + void (*handle_events)(int cpu); > +}; > + > +extern const struct evtchn_ops *evtchn_ops; > + > extern int *evtchn_to_irq; > > struct irq_info *info_for_irq(unsigned irq); > unsigned cpu_from_irq(unsigned irq); > unsigned cpu_from_evtchn(unsigned int evtchn); > > -void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu); > +static inline void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu) > +{ > + evtchn_ops->bind_to_cpu(info, cpu); > +} > + > +static inline void clear_evtchn(int port) > +{ > + evtchn_ops->clear_pending(port); > +} > + > +static inline void set_evtchn(int port) > +{ > + evtchn_ops->set_pending(port); > +} > + > +static inline bool test_evtchn(int port) > +{ > + return evtchn_ops->is_pending(port); > +} > + > +static inline bool test_and_set_mask(int port) > +{ > + return evtchn_ops->test_and_set_mask(port); > +} > + > +static inline void mask_evtchn(int port) > +{ > + return evtchn_ops->mask(port); > +} > + > +static inline void unmask_evtchn(int port) > +{ > + return evtchn_ops->unmask(port); > +} > > -void clear_evtchn(int port); > -void set_evtchn(int port); > -int test_evtchn(int port); > -int 
test_and_set_mask(int port); > -void mask_evtchn(int port); > -void unmask_evtchn(int port); > +static inline void xen_evtchn_handle_events(int cpu) > +{ > + return evtchn_ops->handle_events(cpu); > +} > > -void xen_evtchn_handle_events(int cpu); > +void xen_evtchn_init_nlevel(void); > > #endif /* #ifndef __EVENTS_INTERNAL_H__ */ > diff --git a/drivers/xen/events/n-level.c b/drivers/xen/events/n-level.c > index b86a853..6c0662e 100644 > --- a/drivers/xen/events/n-level.c > +++ b/drivers/xen/events/n-level.c > @@ -39,43 +39,43 @@ > static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], > cpu_evtchn_mask); > > -void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu) > +static void nlevel_bind_to_cpu(struct irq_info *info, int cpu) > { > clear_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, info->cpu))); > set_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, cpu))); > } > > -void clear_evtchn(int port) > +static void nlevel_clear_pending(int port) > { > struct shared_info *s = HYPERVISOR_shared_info; > sync_clear_bit(port, BM(&s->evtchn_pending[0])); > } > > -void set_evtchn(int port) > +static void nlevel_set_pending(int port) > { > struct shared_info *s = HYPERVISOR_shared_info; > sync_set_bit(port, BM(&s->evtchn_pending[0])); > } > > -int test_evtchn(int port) > +static bool nlevel_is_pending(int port) > { > struct shared_info *s = HYPERVISOR_shared_info; > return sync_test_bit(port, BM(&s->evtchn_pending[0])); > } > > -int test_and_set_mask(int port) > +static bool nlevel_test_and_set_mask(int port) > { > struct shared_info *s = HYPERVISOR_shared_info; > return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); > } > > -void mask_evtchn(int port) > +static void nlevel_mask(int port) > { > struct shared_info *s = HYPERVISOR_shared_info; > sync_set_bit(port, BM(&s->evtchn_mask[0])); > } > > -void unmask_evtchn(int port) > +static void nlevel_unmask(int port) > { > struct shared_info *s = HYPERVISOR_shared_info; > unsigned int cpu = 
get_cpu(); > @@ -141,7 +141,7 @@ static inline xen_ulong_t active_evtchns(unsigned int cpu, > * a bitset of words which contain pending event bits. The second > * level is a bitset of pending events themselves. > */ > -void xen_evtchn_handle_events(int cpu) > +static void nlevel_handle_events(int cpu) > { > xen_ulong_t pending_words; > int start_word_idx, start_bit_idx; > @@ -320,3 +320,19 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) > > return IRQ_HANDLED; > } > + > +static const struct evtchn_ops evtchn_ops_nlevel = { > + .bind_to_cpu = nlevel_bind_to_cpu, > + .clear_pending = nlevel_clear_pending, > + .set_pending = nlevel_set_pending, > + .is_pending = nlevel_is_pending, > + .test_and_set_mask = nlevel_test_and_set_mask, > + .mask = nlevel_mask, > + .unmask = nlevel_unmask, > + .handle_events = nlevel_handle_events, > +}; > + > +void __init xen_evtchn_init_nlevel(void) > +{ > + evtchn_ops = &evtchn_ops_nlevel; > +} > -- > 1.7.2.5 >
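[Editorial note: the evtchn_ops indirection above is the standard ops-table pattern: each ABI supplies a table of function pointers, a single global selects the active ABI at boot, and callers go through inline wrappers. A miniature sketch (hypothetical names, not the kernel code) of the same shape:]

```c
#include <stdbool.h>

struct port_ops {
	void (*mask)(int port);
	bool (*is_pending)(int port);
};

static unsigned long masked_bits, pending_bits;

/* One ABI's implementation, here backed by toy bitmaps. */
static void nlevel_model_mask(int port)
{
	masked_bits |= 1UL << port;
}

static bool nlevel_model_pending(int port)
{
	return pending_bits & (1UL << port);
}

static const struct port_ops nlevel_model_ops = {
	.mask = nlevel_model_mask,
	.is_pending = nlevel_model_pending,
};

/* Selected once at init; a FIFO table could be swapped in instead. */
static const struct port_ops *port_ops = &nlevel_model_ops;

/* Callers use inline wrappers, as events_internal.h does. */
static inline void model_mask(int port)
{
	port_ops->mask(port);
}

static inline bool model_pending(int port)
{
	return port_ops->is_pending(port);
}
```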
Konrad Rzeszutek Wilk
2013-Sep-24 14:47 UTC
Re: [PATCH 08/13] xen/events: allow setup of irq_info to fail
On Fri, Sep 13, 2013 at 05:59:56PM +0100, David Vrabel wrote:> From: David Vrabel <david.vrabel@citrix.com> > > The FIFO-based event ABI requires additional setup of newly bound > events (it may need to expand the event array) and this setup may > fail. > > xen_irq_info_common_init() is a useful place to put this setup so > allow this call to fail. This call and the other similar calls are > renamed to be *_setup() to reflect that they may now fail. > > This failure can only occur with new event channels not on rebind.Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > drivers/xen/events/events.c | 156 +++++++++++++++++++++++++------------------ > 1 files changed, 91 insertions(+), 65 deletions(-) > > diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c > index 8ecde63..8f55a49 100644 > --- a/drivers/xen/events/events.c > +++ b/drivers/xen/events/events.c > @@ -99,7 +99,7 @@ struct irq_info *info_for_irq(unsigned irq) > } > > /* Constructors for packed IRQ information. 
*/ > -static void xen_irq_info_common_init(struct irq_info *info, > +static int xen_irq_info_common_setup(struct irq_info *info, > unsigned irq, > enum xen_irq_type type, > unsigned short evtchn, > @@ -116,45 +116,47 @@ static void xen_irq_info_common_init(struct irq_info *info, > evtchn_to_irq[evtchn] = irq; > > irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN); > + > + return 0; > } > > -static void xen_irq_info_evtchn_init(unsigned irq, > +static int xen_irq_info_evtchn_setup(unsigned irq, > unsigned short evtchn) > { > struct irq_info *info = info_for_irq(irq); > > - xen_irq_info_common_init(info, irq, IRQT_EVTCHN, evtchn, 0); > + return xen_irq_info_common_setup(info, irq, IRQT_EVTCHN, evtchn, 0); > } > > -static void xen_irq_info_ipi_init(unsigned cpu, > +static int xen_irq_info_ipi_setup(unsigned cpu, > unsigned irq, > unsigned short evtchn, > enum ipi_vector ipi) > { > struct irq_info *info = info_for_irq(irq); > > - xen_irq_info_common_init(info, irq, IRQT_IPI, evtchn, 0); > - > info->u.ipi = ipi; > > per_cpu(ipi_to_irq, cpu)[ipi] = irq; > + > + return xen_irq_info_common_setup(info, irq, IRQT_IPI, evtchn, 0); > } > > -static void xen_irq_info_virq_init(unsigned cpu, > +static int xen_irq_info_virq_setup(unsigned cpu, > unsigned irq, > unsigned short evtchn, > unsigned short virq) > { > struct irq_info *info = info_for_irq(irq); > > - xen_irq_info_common_init(info, irq, IRQT_VIRQ, evtchn, 0); > - > info->u.virq = virq; > > per_cpu(virq_to_irq, cpu)[virq] = irq; > + > + return xen_irq_info_common_setup(info, irq, IRQT_VIRQ, evtchn, 0); > } > > -static void xen_irq_info_pirq_init(unsigned irq, > +static int xen_irq_info_pirq_setup(unsigned irq, > unsigned short evtchn, > unsigned short pirq, > unsigned short gsi, > @@ -163,12 +165,12 @@ static void xen_irq_info_pirq_init(unsigned irq, > { > struct irq_info *info = info_for_irq(irq); > > - xen_irq_info_common_init(info, irq, IRQT_PIRQ, evtchn, 0); > - > info->u.pirq.pirq = pirq; > info->u.pirq.gsi = 
gsi; > info->u.pirq.domid = domid; > info->u.pirq.flags = flags; > + > + return xen_irq_info_common_setup(info, irq, IRQT_PIRQ, evtchn, 0); > } > > /* > @@ -516,6 +518,47 @@ int xen_irq_from_gsi(unsigned gsi) > } > EXPORT_SYMBOL_GPL(xen_irq_from_gsi); > > +static void __unbind_from_irq(unsigned int irq) > +{ > + struct evtchn_close close; > + int evtchn = evtchn_from_irq(irq); > + struct irq_info *info = irq_get_handler_data(irq); > + > + if (info->refcnt > 0) { > + info->refcnt--; > + if (info->refcnt != 0) > + return; > + } > + > + if (VALID_EVTCHN(evtchn)) { > + close.port = evtchn; > + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) > + BUG(); > + > + switch (type_from_irq(irq)) { > + case IRQT_VIRQ: > + per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) > + [virq_from_irq(irq)] = -1; > + break; > + case IRQT_IPI: > + per_cpu(ipi_to_irq, cpu_from_evtchn(evtchn)) > + [ipi_from_irq(irq)] = -1; > + break; > + default: > + break; > + } > + > + /* Closed ports are implicitly re-bound to VCPU0. */ > + bind_evtchn_to_cpu(evtchn, 0); > + > + evtchn_to_irq[evtchn] = -1; > + } > + > + BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); > + > + xen_free_irq(irq); > +} > + > /* > * Do not make any assumptions regarding the relationship between the > * IRQ number returned here and the Xen pirq argument. > @@ -531,6 +574,7 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, > { > int irq = -1; > struct physdev_irq irq_op; > + int ret; > > mutex_lock(&irq_mapping_update_lock); > > @@ -558,8 +602,13 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, > goto out; > } > > - xen_irq_info_pirq_init(irq, 0, pirq, gsi, DOMID_SELF, > + ret = xen_irq_info_pirq_setup(irq, 0, pirq, gsi, DOMID_SELF, > shareable ? 
PIRQ_SHAREABLE : 0); > + if (ret < 0) { > + __unbind_from_irq(irq); > + irq = ret; > + goto out; > + } > > pirq_query_unmask(irq); > /* We try to use the handler with the appropriate semantic for the > @@ -619,7 +668,9 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc, > irq_set_chip_and_handler_name(irq, &xen_pirq_chip, handle_edge_irq, > name); > > - xen_irq_info_pirq_init(irq, 0, pirq, 0, domid, 0); > + ret = xen_irq_info_pirq_setup(irq, 0, pirq, 0, domid, 0); > + if (ret < 0) > + goto error_irq; > ret = irq_set_msi_desc(irq, msidesc); > if (ret < 0) > goto error_irq; > @@ -627,8 +678,8 @@ out: > mutex_unlock(&irq_mapping_update_lock); > return irq; > error_irq: > + __unbind_from_irq(irq); > mutex_unlock(&irq_mapping_update_lock); > - xen_free_irq(irq); > return ret; > } > #endif > @@ -698,9 +749,11 @@ int xen_pirq_from_irq(unsigned irq) > return pirq_from_irq(irq); > } > EXPORT_SYMBOL_GPL(xen_pirq_from_irq); > + > int bind_evtchn_to_irq(unsigned int evtchn) > { > int irq; > + int ret; > > mutex_lock(&irq_mapping_update_lock); > > @@ -714,7 +767,12 @@ int bind_evtchn_to_irq(unsigned int evtchn) > irq_set_chip_and_handler_name(irq, &xen_dynamic_chip, > handle_edge_irq, "event"); > > - xen_irq_info_evtchn_init(irq, evtchn); > + ret = xen_irq_info_evtchn_setup(irq, evtchn); > + if (ret < 0) { > + __unbind_from_irq(irq); > + irq = ret; > + goto out; > + } > } else { > struct irq_info *info = info_for_irq(irq); > WARN_ON(info == NULL || info->type != IRQT_EVTCHN); > @@ -731,6 +789,7 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) > { > struct evtchn_bind_ipi bind_ipi; > int evtchn, irq; > + int ret; > > mutex_lock(&irq_mapping_update_lock); > > @@ -750,8 +809,12 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) > BUG(); > evtchn = bind_ipi.port; > > - xen_irq_info_ipi_init(cpu, irq, evtchn, ipi); > - > + ret = xen_irq_info_ipi_setup(cpu, irq, evtchn, ipi); > + if (ret < 0) { > + __unbind_from_irq(irq); > + 
irq = ret; > + goto out; > + } > bind_evtchn_to_cpu(evtchn, cpu); > } else { > struct irq_info *info = info_for_irq(irq); > @@ -830,7 +893,12 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu) > evtchn = ret; > } > > - xen_irq_info_virq_init(cpu, irq, evtchn, virq); > + ret = xen_irq_info_virq_setup(cpu, irq, evtchn, virq); > + if (ret < 0) { > + __unbind_from_irq(irq); > + irq = ret; > + goto out; > + } > > bind_evtchn_to_cpu(evtchn, cpu); > } else { > @@ -846,50 +914,8 @@ out: > > static void unbind_from_irq(unsigned int irq) > { > - struct evtchn_close close; > - int evtchn = evtchn_from_irq(irq); > - struct irq_info *info = irq_get_handler_data(irq); > - > - if (WARN_ON(!info)) > - return; > - > mutex_lock(&irq_mapping_update_lock); > - > - if (info->refcnt > 0) { > - info->refcnt--; > - if (info->refcnt != 0) > - goto done; > - } > - > - if (VALID_EVTCHN(evtchn)) { > - close.port = evtchn; > - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0) > - BUG(); > - > - switch (type_from_irq(irq)) { > - case IRQT_VIRQ: > - per_cpu(virq_to_irq, cpu_from_evtchn(evtchn)) > - [virq_from_irq(irq)] = -1; > - break; > - case IRQT_IPI: > - per_cpu(ipi_to_irq, cpu_from_evtchn(evtchn)) > - [ipi_from_irq(irq)] = -1; > - break; > - default: > - break; > - } > - > - /* Closed ports are implicitly re-bound to VCPU0. 
*/ > - bind_evtchn_to_cpu(evtchn, 0); > - > - evtchn_to_irq[evtchn] = -1; > - } > - > - BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); > - > - xen_free_irq(irq); > - > - done: > + __unbind_from_irq(irq); > mutex_unlock(&irq_mapping_update_lock); > } > > @@ -1137,7 +1163,7 @@ void rebind_evtchn_irq(int evtchn, int irq) > so there should be a proper type */ > BUG_ON(info->type == IRQT_UNBOUND); > > - xen_irq_info_evtchn_init(irq, evtchn); > + xen_irq_info_evtchn_setup(irq, evtchn); > > mutex_unlock(&irq_mapping_update_lock); > > @@ -1312,7 +1338,7 @@ static void restore_cpu_virqs(unsigned int cpu) > evtchn = bind_virq.port; > > /* Record the new mapping. */ > - xen_irq_info_virq_init(cpu, irq, evtchn, virq); > + xen_irq_info_virq_setup(cpu, irq, evtchn, virq); > bind_evtchn_to_cpu(evtchn, cpu); > } > } > @@ -1336,7 +1362,7 @@ static void restore_cpu_ipis(unsigned int cpu) > evtchn = bind_ipi.port; > > /* Record the new mapping. */ > - xen_irq_info_ipi_init(cpu, irq, evtchn, ipi); > + xen_irq_info_ipi_setup(cpu, irq, evtchn, ipi); > bind_evtchn_to_cpu(evtchn, cpu); > } > } > -- > 1.7.2.5 >
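The unbind_from_irq() rework above leaves the public function holding irq_mapping_update_lock around a __unbind_from_irq() helper, so bind-path error handling (which already holds the lock) can call the helper directly. A minimal user-space sketch of this lock-in-wrapper idiom, with pthreads standing in for the mutex and an illustrative bound_count standing in for the irq bookkeeping (none of these names are from the patch):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static int bound_count = 3;	/* illustrative shared state */

/* Double-underscore helper: caller must already hold map_lock. */
static int __unbind(void)
{
	if (bound_count > 0)
		bound_count--;
	return bound_count;
}

/* Public entry point: take the lock, then delegate to the helper. */
static int unbind(void)
{
	int left;

	pthread_mutex_lock(&map_lock);
	left = __unbind();
	pthread_mutex_unlock(&map_lock);
	return left;
}
```

An error path that already holds map_lock calls __unbind() directly, rather than calling the public wrapper and deadlocking on a second lock attempt.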
Konrad Rzeszutek Wilk
2013-Sep-24 14:47 UTC
Re: [PATCH 09/13] xen/events: add a evtchn_op for port setup
On Fri, Sep 13, 2013 at 05:59:57PM +0100, David Vrabel wrote:> From: David Vrabel <david.vrabel@citrix.com> > > Add a hook for port-specific setup and call it from > xen_irq_info_common_setup().Could you explain a bit about its semantics please?> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > drivers/xen/events/events.c | 2 +- > drivers/xen/events/events_internal.h | 8 ++++++++ > 2 files changed, 9 insertions(+), 1 deletions(-) > > diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c > index 8f55a49..cf1c7ba 100644 > --- a/drivers/xen/events/events.c > +++ b/drivers/xen/events/events.c > @@ -117,7 +117,7 @@ static int xen_irq_info_common_setup(struct irq_info *info, > > irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN); > > - return 0; > + return xen_evtchn_port_setup(info); > } > > static int xen_irq_info_evtchn_setup(unsigned irq, > diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h > index 513c34d..32cb928 100644 > --- a/drivers/xen/events/events_internal.h > +++ b/drivers/xen/events/events_internal.h > @@ -55,6 +55,7 @@ struct irq_info { > #define PIRQ_SHAREABLE (1 << 1) > > struct evtchn_ops { > + int (*setup)(struct irq_info *info); > void (*bind_to_cpu)(struct irq_info *info, int cpu); > > void (*clear_pending)(int port); > @@ -75,6 +76,13 @@ struct irq_info *info_for_irq(unsigned irq); > unsigned cpu_from_irq(unsigned irq); > unsigned cpu_from_evtchn(unsigned int evtchn); > > +static inline int xen_evtchn_port_setup(struct irq_info *info) > +{ > + if (evtchn_ops->setup) > + return evtchn_ops->setup(info); > + return 0; > +} > + > static inline void xen_evtchn_port_bind_to_cpu(struct irq_info *info, int cpu) > { > evtchn_ops->bind_to_cpu(info, cpu); > -- > 1.7.2.5 >
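The xen_evtchn_port_setup() wrapper in the patch above treats the setup hook as optional: an absent hook means success. A standalone sketch of that optional-hook dispatch pattern (fifo_like_setup and the two ops instances are made up for illustration; only the dispatch shape mirrors the patch):

```c
#include <assert.h>
#include <stddef.h>

struct irq_info { unsigned evtchn; };

struct evtchn_ops {
	/* Optional hook: NULL means the ABI has no per-port setup to do. */
	int (*setup)(struct irq_info *info);
};

/* Hypothetical ABI-specific hook, for illustration only. */
static int fifo_like_setup(struct irq_info *info)
{
	return info->evtchn != 0 ? 0 : -1;
}

static const struct evtchn_ops nlevel_like = { .setup = NULL };
static const struct evtchn_ops fifo_like = { .setup = fifo_like_setup };

/* Mirrors xen_evtchn_port_setup(): an absent hook is treated as success. */
static int port_setup(const struct evtchn_ops *ops, struct irq_info *info)
{
	if (ops->setup)
		return ops->setup(info);
	return 0;
}
```

This keeps the common bind paths ABI-agnostic: the N-level ABI needs no per-port work, while the FIFO ABI uses the hook to grow the event array on demand.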
Konrad Rzeszutek Wilk
2013-Sep-24 14:58 UTC
Re: [PATCH 10/13] xen/events: Refactor evtchn_to_irq array to be dynamically allocated
On Fri, Sep 13, 2013 at 05:59:58PM +0100, David Vrabel wrote:> From: Malcolm Crossley <malcolm.crossley@citrix.com> > > Refactor static array evtchn_to_irq array to be dynamically allocated by > implementing get and set functions for accesses to the array. > > Two new port ops are added: max_channels (maximum supported number of > event channels) and nr_channels (number of currently usable event > channels). For the N-level ABI, these numbers are both the same as > the shared data structure is a fixed size. For the FIFO ABI, these^^ has> will be different as the event array is expanded dynamically. > > This allows more than 65000 event channels so an unsigned short is no > longer sufficient for an event channel port number and unsigned int is > used instead. > > Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > drivers/xen/events/events.c | 129 +++++++++++++++++++++++++--------- > drivers/xen/events/events_internal.h | 18 ++++- > drivers/xen/events/n-level.c | 11 +++- > 3 files changed, 121 insertions(+), 37 deletions(-) > > diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c > index cf1c7ba..8660459 100644 > --- a/drivers/xen/events/events.c > +++ b/drivers/xen/events/events.c > @@ -77,12 +77,16 @@ static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; > /* IRQ <-> IPI mapping */ > static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1}; > > -int *evtchn_to_irq; > +int **evtchn_to_irq; > #ifdef CONFIG_X86 > static unsigned long *pirq_eoi_map; > #endif > static bool (*pirq_needs_eoi)(unsigned irq); > > +#define EVTCHN_ROW(e) (e / (PAGE_SIZE/sizeof(**evtchn_to_irq))) > +#define EVTCHN_COL(e) (e % (PAGE_SIZE/sizeof(**evtchn_to_irq))) > +#define EVTCHN_PER_ROW (PAGE_SIZE / sizeof(**evtchn_to_irq)) > + > /* Xen will never allocate port zero for any purpose. 
*/ > #define VALID_EVTCHN(chn) ((chn) != 0) > > @@ -92,6 +96,61 @@ static struct irq_chip xen_pirq_chip; > static void enable_dynirq(struct irq_data *data); > static void disable_dynirq(struct irq_data *data); > > +static void clear_evtchn_to_irq_row(unsigned row) > +{ > + unsigned col; > + > + for (col = 0; col < EVTCHN_PER_ROW; col++) > + evtchn_to_irq[row][col] = -1; > +} > + > +static void clear_evtchn_to_irq_all(void) > +{ > + unsigned row; > + > + for (row = 0; row < EVTCHN_ROW(xen_evtchn_max_channels()); row++) { > + if (evtchn_to_irq[row] == NULL) > + continue; > + clear_evtchn_to_irq_row(row); > + } > +} > + > +static int set_evtchn_to_irq(unsigned evtchn, unsigned irq) > +{ > + unsigned row; > + unsigned col; > + > + if (evtchn >= xen_evtchn_max_channels()) > + return -EINVAL; > + > + row = EVTCHN_ROW(evtchn); > + col = EVTCHN_COL(evtchn); > + > + if (evtchn_to_irq[row] == NULL) { > + /* Unallocated irq entries return -1 anyway */^^^^^ - are> + if (irq == -1) > + return 0; > + > + evtchn_to_irq[row] = (int *)get_zeroed_page(GFP_KERNEL); > + if (evtchn_to_irq[row] == NULL) > + return -ENOMEM; > + > + clear_evtchn_to_irq_row(row); > + } > + > + evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)] = irq; > + return 0; > +} > + > +int get_evtchn_to_irq(unsigned evtchn) > +{ > + if (evtchn >= xen_evtchn_max_channels()) > + return -1; > + if (evtchn_to_irq[EVTCHN_ROW(evtchn)] == NULL) > + return -1; > + return evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)]; > +} > + > /* Get info for IRQ */ > struct irq_info *info_for_irq(unsigned irq) > { > @@ -102,7 +161,7 @@ struct irq_info *info_for_irq(unsigned irq) > static int xen_irq_info_common_setup(struct irq_info *info, > unsigned irq, > enum xen_irq_type type, > - unsigned short evtchn, > + unsigned evtchn, > unsigned short cpu) > { > > @@ -113,7 +172,8 @@ static int xen_irq_info_common_setup(struct irq_info *info, > info->evtchn = evtchn; > info->cpu = cpu; > > - evtchn_to_irq[evtchn] = irq; > + if 
(set_evtchn_to_irq(evtchn, irq)) > + return -ENOMEM; > > irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN); > > @@ -121,7 +181,7 @@ static int xen_irq_info_common_setup(struct irq_info *info, > } > > static int xen_irq_info_evtchn_setup(unsigned irq, > - unsigned short evtchn) > + unsigned evtchn) > { > struct irq_info *info = info_for_irq(irq); > > @@ -130,7 +190,7 @@ static int xen_irq_info_evtchn_setup(unsigned irq, > > static int xen_irq_info_ipi_setup(unsigned cpu, > unsigned irq, > - unsigned short evtchn, > + unsigned evtchn, > enum ipi_vector ipi) > { > struct irq_info *info = info_for_irq(irq); > @@ -144,8 +204,8 @@ static int xen_irq_info_ipi_setup(unsigned cpu, > > static int xen_irq_info_virq_setup(unsigned cpu, > unsigned irq, > - unsigned short evtchn, > - unsigned short virq) > + unsigned evtchn, > + unsigned virq) > { > struct irq_info *info = info_for_irq(irq); > > @@ -157,9 +217,9 @@ static int xen_irq_info_virq_setup(unsigned cpu, > } > > static int xen_irq_info_pirq_setup(unsigned irq, > - unsigned short evtchn, > - unsigned short pirq, > - unsigned short gsi, > + unsigned evtchn, > + unsigned pirq, > + unsigned gsi, > uint16_t domid, > unsigned char flags) > { > @@ -186,7 +246,7 @@ static unsigned int evtchn_from_irq(unsigned irq) > > unsigned irq_from_evtchn(unsigned int evtchn) > { > - return evtchn_to_irq[evtchn]; > + return get_evtchn_to_irq(evtchn); > } > EXPORT_SYMBOL_GPL(irq_from_evtchn); > > @@ -232,7 +292,7 @@ unsigned cpu_from_irq(unsigned irq) > > unsigned int cpu_from_evtchn(unsigned int evtchn) > { > - int irq = evtchn_to_irq[evtchn]; > + int irq = get_evtchn_to_irq(evtchn); > unsigned ret = 0; > > if (irq != -1) > @@ -258,7 +318,7 @@ static bool pirq_needs_eoi_flag(unsigned irq) > > static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) > { > - int irq = evtchn_to_irq[chn]; > + int irq = get_evtchn_to_irq(chn); > struct irq_info *info = info_for_irq(irq); > > BUG_ON(irq == -1); > @@ -453,7 +513,9 @@ static 
unsigned int __startup_pirq(unsigned int irq) > > pirq_query_unmask(irq); > > - evtchn_to_irq[evtchn] = irq; > + rc = set_evtchn_to_irq(evtchn, irq); > + if (rc != 0) > + return 0;I think you also need to unroll the hypercall you did (EVTCHNOP_bind_pirq). I think doing EVTCHNOP_close should suffice?> bind_evtchn_to_cpu(evtchn, 0); > info->evtchn = evtchn; > > @@ -474,7 +536,7 @@ static void shutdown_pirq(struct irq_data *data) > struct evtchn_close close; > unsigned int irq = data->irq; > struct irq_info *info = info_for_irq(irq); > - int evtchn = evtchn_from_irq(irq); > + int rc, evtchn = evtchn_from_irq(irq); > > BUG_ON(info->type != IRQT_PIRQ); > > @@ -488,7 +550,7 @@ static void shutdown_pirq(struct irq_data *data) > BUG(); > > bind_evtchn_to_cpu(evtchn, 0); > - evtchn_to_irq[evtchn] = -1; > + rc = set_evtchn_to_irq(evtchn, -1); > info->evtchn = 0; > } > > @@ -551,7 +613,7 @@ static void __unbind_from_irq(unsigned int irq) > /* Closed ports are implicitly re-bound to VCPU0. */ > bind_evtchn_to_cpu(evtchn, 0); > > - evtchn_to_irq[evtchn] = -1; > + set_evtchn_to_irq(evtchn, -1); > } > > BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); > @@ -755,9 +817,12 @@ int bind_evtchn_to_irq(unsigned int evtchn) > int irq; > int ret; > > + if (evtchn >= xen_evtchn_max_channels()) > + return -ENOMEM; > + > mutex_lock(&irq_mapping_update_lock); > > - irq = evtchn_to_irq[evtchn]; > + irq = get_evtchn_to_irq(evtchn); > > if (irq == -1) { > irq = xen_allocate_irq_dynamic(); > @@ -847,7 +912,7 @@ static int find_virq(unsigned int virq, unsigned int cpu) > int port, rc = -ENOENT; > > memset(&status, 0, sizeof(status)); > - for (port = 0; port <= NR_EVENT_CHANNELS; port++) { > + for (port = 0; port < xen_evtchn_max_channels(); port++) { > status.dom = DOMID_SELF; > status.port = port; > rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status); > @@ -927,8 +992,9 @@ int bind_evtchn_to_irqhandler(unsigned int evtchn, > int irq, retval; > > irq = bind_evtchn_to_irq(evtchn); > - if 
(irq < 0) > + if (irq < 0){ > return irq; > + } > retval = request_irq(irq, handler, irqflags, devname, dev_id); > if (retval != 0) { > unbind_from_irq(irq); > @@ -1017,7 +1083,7 @@ EXPORT_SYMBOL_GPL(unbind_from_irqhandler); > > int evtchn_make_refcounted(unsigned int evtchn) > { > - int irq = evtchn_to_irq[evtchn]; > + int irq = get_evtchn_to_irq(evtchn); > struct irq_info *info; > > if (irq == -1) > @@ -1042,12 +1108,12 @@ int evtchn_get(unsigned int evtchn) > struct irq_info *info; > int err = -ENOENT; > > - if (evtchn >= NR_EVENT_CHANNELS) > + if (evtchn >= xen_evtchn_max_channels()) > return -EINVAL; > > mutex_lock(&irq_mapping_update_lock); > > - irq = evtchn_to_irq[evtchn]; > + irq = get_evtchn_to_irq(evtchn); > if (irq == -1) > goto done; > > @@ -1071,7 +1137,7 @@ EXPORT_SYMBOL_GPL(evtchn_get); > > void evtchn_put(unsigned int evtchn) > { > - int irq = evtchn_to_irq[evtchn]; > + int irq = get_evtchn_to_irq(evtchn); > if (WARN_ON(irq == -1)) > return; > unbind_from_irq(irq); > @@ -1158,7 +1224,7 @@ void rebind_evtchn_irq(int evtchn, int irq) > mutex_lock(&irq_mapping_update_lock); > > /* After resume the irq<->evtchn mappings are all cleared out */ > - BUG_ON(evtchn_to_irq[evtchn] != -1); > + BUG_ON(get_evtchn_to_irq(evtchn) != -1); > /* Expect irq to have been bound before, > so there should be a proper type */ > BUG_ON(info->type == IRQT_UNBOUND); > @@ -1443,15 +1509,14 @@ void xen_irq_resume(void) > struct irq_info *info; > > /* New event-channel space is not ''live'' yet. */ > - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) > + for (evtchn = 0; evtchn < xen_evtchn_nr_channels(); evtchn++) > mask_evtchn(evtchn); > > /* No IRQ <-> event-channel mappings. 
*/ > list_for_each_entry(info, &xen_irq_list_head, list) > info->evtchn = 0; /* zap event-channel binding */ > > - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) > - evtchn_to_irq[evtchn] = -1; > + clear_evtchn_to_irq_all(); > > for_each_possible_cpu(cpu) { > restore_cpu_virqs(cpu); > @@ -1548,14 +1613,12 @@ void __init xen_init_IRQ(void) > > xen_evtchn_init_nlevel(); > > - evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq), > - GFP_KERNEL); > + evtchn_to_irq = kcalloc(EVTCHN_ROW(xen_evtchn_max_channels()), > + sizeof(*evtchn_to_irq), GFP_KERNEL); > BUG_ON(!evtchn_to_irq); > - for (i = 0; i < NR_EVENT_CHANNELS; i++) > - evtchn_to_irq[i] = -1; > > /* No event channels are ''live'' right now. */ > - for (i = 0; i < NR_EVENT_CHANNELS; i++) > + for (i = 0; i < xen_evtchn_nr_channels(); i++) > mask_evtchn(i); > > pirq_needs_eoi = pirq_needs_eoi_flag; > diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h > index 32cb928..9d8b70c 100644 > --- a/drivers/xen/events/events_internal.h > +++ b/drivers/xen/events/events_internal.h > @@ -35,7 +35,7 @@ struct irq_info { > int refcnt; > enum xen_irq_type type; /* type */ > unsigned irq; > - unsigned short evtchn; /* event channel */ > + unsigned int evtchn; /* event channel */ > unsigned short cpu; /* cpu bound */ > > union { > @@ -55,6 +55,9 @@ struct irq_info { > #define PIRQ_SHAREABLE (1 << 1) > > struct evtchn_ops { > + unsigned (*max_channels)(void); > + unsigned (*nr_channels)(void); > + > int (*setup)(struct irq_info *info); > void (*bind_to_cpu)(struct irq_info *info, int cpu); > > @@ -70,12 +73,23 @@ struct evtchn_ops { > > extern const struct evtchn_ops *evtchn_ops; > > -extern int *evtchn_to_irq; > +extern int **evtchn_to_irq; > +int get_evtchn_to_irq(unsigned int evtchn); > > struct irq_info *info_for_irq(unsigned irq); > unsigned cpu_from_irq(unsigned irq); > unsigned cpu_from_evtchn(unsigned int evtchn); > > +static inline unsigned 
xen_evtchn_max_channels(void) > +{ > + return evtchn_ops->max_channels(); > +} > + > +static inline unsigned xen_evtchn_nr_channels(void) > +{ > + return evtchn_ops->nr_channels(); > +} > + > static inline int xen_evtchn_port_setup(struct irq_info *info) > { > if (evtchn_ops->setup) > diff --git a/drivers/xen/events/n-level.c b/drivers/xen/events/n-level.c > index 6c0662e..da8d0aa 100644 > --- a/drivers/xen/events/n-level.c > +++ b/drivers/xen/events/n-level.c > @@ -39,6 +39,11 @@ > static DEFINE_PER_CPU(xen_ulong_t [NR_EVENT_CHANNELS/BITS_PER_EVTCHN_WORD], > cpu_evtchn_mask); > > +static unsigned nlevel_max_channels(void) > +{ > + return NR_EVENT_CHANNELS; > +} > + > static void nlevel_bind_to_cpu(struct irq_info *info, int cpu) > { > clear_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, info->cpu))); > @@ -212,7 +217,7 @@ static void nlevel_handle_events(int cpu) > > /* Process port. */ > port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; > - irq = evtchn_to_irq[port]; > + irq = get_evtchn_to_irq(port); > > if (irq != -1) { > desc = irq_to_desc(irq); > @@ -306,7 +311,7 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) > int word_idx = i / BITS_PER_EVTCHN_WORD; > printk(" %d: event %d -> irq %d%s%s%s\n", > cpu_from_evtchn(i), i, > - evtchn_to_irq[i], > + get_evtchn_to_irq(i), > sync_test_bit(word_idx, BM(&v->evtchn_pending_sel)) > ? "" : " l2-clear", > !sync_test_bit(i, BM(sh->evtchn_mask)) > @@ -322,6 +327,8 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) > } > > static const struct evtchn_ops evtchn_ops_nlevel = { > + .max_channels = nlevel_max_channels, > + .nr_channels = nlevel_max_channels, > .bind_to_cpu = nlevel_bind_to_cpu, > .clear_pending = nlevel_clear_pending, > .set_pending = nlevel_set_pending, > -- > 1.7.2.5 >
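The EVTCHN_ROW/EVTCHN_COL split in the patch above replaces one flat NR_EVENT_CHANNELS-sized array with an array of pointers to page-sized rows, allocated on demand. A user-space sketch of the same arithmetic (PAGE_SZ, set_mapping and get_mapping are illustrative stand-ins for the kernel names, assuming a 4096-byte page and 4-byte int, i.e. 1024 ports per row):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ 4096u
#define EVTCHN_PER_ROW (PAGE_SZ / sizeof(int))	/* 1024 ports per row */
#define MAX_CHANNELS (1u << 17)			/* FIFO ABI link width */
#define NR_ROWS (MAX_CHANNELS / EVTCHN_PER_ROW)

static int *rows[NR_ROWS];	/* one lazily allocated page-sized row each */

/* Mirrors set_evtchn_to_irq(): allocate a row only on the first real bind. */
static int set_mapping(unsigned evtchn, int irq)
{
	unsigned row = evtchn / EVTCHN_PER_ROW;

	if (evtchn >= MAX_CHANNELS)
		return -1;

	if (!rows[row]) {
		if (irq == -1)		/* unallocated rows already read as -1 */
			return 0;
		rows[row] = malloc(PAGE_SZ);
		if (!rows[row])
			return -1;
		memset(rows[row], 0xff, PAGE_SZ); /* every int becomes -1 */
	}

	rows[row][evtchn % EVTCHN_PER_ROW] = irq;
	return 0;
}

/* Mirrors get_evtchn_to_irq(): out-of-range or unallocated rows read as -1. */
static int get_mapping(unsigned evtchn)
{
	if (evtchn >= MAX_CHANNELS || !rows[evtchn / EVTCHN_PER_ROW])
		return -1;
	return rows[evtchn / EVTCHN_PER_ROW][evtchn % EVTCHN_PER_ROW];
}
```

Only rows that actually contain a bound port consume memory, which is what makes the 2^17-port FIFO limit affordable compared to a fully populated flat array.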
Konrad Rzeszutek Wilk
2013-Sep-24 14:58 UTC
Re: [PATCH 11/13] xen/events: add xen_evtchn_mask_all()
On Fri, Sep 13, 2013 at 05:59:59PM +0100, David Vrabel wrote:> From: David Vrabel <david.vrabel@citrix.com> > > Signed-off-by: David Vrabel <david.vrabel@citrix.com>Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>> --- > drivers/xen/events/events.c | 18 ++++++++++++------ > 1 files changed, 12 insertions(+), 6 deletions(-) > > diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c > index 8660459..6c55b1f 100644 > --- a/drivers/xen/events/events.c > +++ b/drivers/xen/events/events.c > @@ -331,6 +331,14 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) > info->cpu = cpu; > } > > +static void xen_evtchn_mask_all(void) > +{ > + unsigned evtchn; > + > + for (evtchn = 0; evtchn < xen_evtchn_nr_channels(); evtchn++) > + mask_evtchn(evtchn); > +} > + > /** > * notify_remote_via_irq - send event to remote end of event channel via irq > * @irq: irq of event channel to send event to > @@ -1505,12 +1513,11 @@ EXPORT_SYMBOL_GPL(xen_test_irq_shared); > > void xen_irq_resume(void) > { > - unsigned int cpu, evtchn; > + unsigned int cpu; > struct irq_info *info; > > /* New event-channel space is not ''live'' yet. */ > - for (evtchn = 0; evtchn < xen_evtchn_nr_channels(); evtchn++) > - mask_evtchn(evtchn); > + xen_evtchn_mask_all(); > > /* No IRQ <-> event-channel mappings. */ > list_for_each_entry(info, &xen_irq_list_head, list) > @@ -1609,7 +1616,7 @@ void xen_callback_vector(void) {} > > void __init xen_init_IRQ(void) > { > - int i; > + int ret; > > xen_evtchn_init_nlevel(); > > @@ -1618,8 +1625,7 @@ void __init xen_init_IRQ(void) > BUG_ON(!evtchn_to_irq); > > /* No event channels are ''live'' right now. */ > - for (i = 0; i < xen_evtchn_nr_channels(); i++) > - mask_evtchn(i); > + xen_evtchn_mask_all(); > > pirq_needs_eoi = pirq_needs_eoi_flag; > > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-24 15:06 UTC
Re: [PATCH 12/13] xen/events: Add the hypervisor interface for the FIFO-based event channels
On Fri, Sep 13, 2013 at 06:00:00PM +0100, David Vrabel wrote:> From: David Vrabel <david.vrabel@citrix.com> > > Add the hypercall sub-ops and the structures for the shared data used > in the FIFO-based event channel ABI.And you should include an URL to the design document. Or include the text version of it (if possible) in this description. Perhaps also say in which version of the hypervisor this functionality became available.> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > include/xen/interface/event_channel.h | 67 +++++++++++++++++++++++++++++++++ > 1 files changed, 67 insertions(+), 0 deletions(-) > > diff --git a/include/xen/interface/event_channel.h b/include/xen/interface/event_channel.h > index f494292..f54ad0a 100644 > --- a/include/xen/interface/event_channel.h > +++ b/include/xen/interface/event_channel.h > @@ -190,6 +190,48 @@ struct evtchn_reset { > }; > typedef struct evtchn_reset evtchn_reset_t; > > +/* > + * EVTCHNOP_init_control: initialize the control block for the FIFO ABI. > + */ > +#define EVTCHNOP_init_control 11 > +struct evtchn_init_control { > + /* IN parameters. */ > + uint64_t control_mfn;Odd. The design document mentions ''control_gfn'' ? Should it be ''gmfn'' ?> + uint32_t offset; > + uint32_t vcpu; > + /* OUT parameters. */ > + uint8_t link_bits;No __pad[7] ?> +}; > + > +/* > + * EVTCHNOP_expand_array: add an additional page to the event array. > + */ > +#define EVTCHNOP_expand_array 12 > +struct evtchn_expand_array { > + /* IN parameters. */ > + uint64_t array_mfn;Design says ''array_gfn'' ?> +}; > + > +/* > + * EVTCHNOP_set_priority: set the priority for an event channel. > + */ > +#define EVTCHNOP_set_priority 13 > +struct evtchn_set_priority { > + /* IN parameters. */ > + uint32_t port; > + uint32_t priority; > +}; > + > +/* > + * EVTCHNOP_set_limit: set the maximum event channel port that may be bound. > + */ > +#define EVTCHNOP_set_limit 14 > +struct evtchn_set_limit { > + /* IN parameters. 
*/ > + uint32_t domid; > + uint32_t max_port; > +}; > + > struct evtchn_op { > uint32_t cmd; /* EVTCHNOP_* */ > union { > @@ -207,4 +249,29 @@ struct evtchn_op { > }; > DEFINE_GUEST_HANDLE_STRUCT(evtchn_op); > > +/* > + * FIFO ABI > + */ > + > +/* Events may have priorities from 0 (highest) to 15 (lowest). */ > +#define EVTCHN_FIFO_PRIORITY_MIN 15 > +#define EVTCHN_FIFO_PRIORITY_DEFAULT 7 > + > +#define EVTCHN_FIFO_MAX_QUEUES (EVTCHN_FIFO_PRIORITY_MIN + 1) > + > +typedef uint32_t event_word_t; > + > +#define EVTCHN_FIFO_PENDING 31 > +#define EVTCHN_FIFO_MASKED 30 > +#define EVTCHN_FIFO_LINKED 29 > + > +#define EVTCHN_FIFO_LINK_BITS 17 > +#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1) > + > +struct evtchn_fifo_control_block { > + uint32_t ready; > + uint32_t _rsvd1;The design document looks to have ''READY'' be 2 bytes? Octet 0 and 1? (figure 2). Which one is correct?> + event_word_t head[EVTCHN_FIFO_MAX_QUEUES]; > +}; > + > #endif /* __XEN_PUBLIC_EVENT_CHANNEL_H__ */ > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-24 15:08 UTC
Re: [PATCH 12/13] xen/events: Add the hypervisor interface for the FIFO-based event channels
> +/* > + * FIFO ABI > + */ > + > +/* Events may have priorities from 0 (highest) to 15 (lowest). */ > +#define EVTCHN_FIFO_PRIORITY_MIN 15 > +#define EVTCHN_FIFO_PRIORITY_DEFAULT 7 > + > +#define EVTCHN_FIFO_MAX_QUEUES (EVTCHN_FIFO_PRIORITY_MIN + 1) > + > +typedef uint32_t event_word_t;Typedefs are frowned upon in the Linux kernel. Can you just use uint32_t please?
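The constants quoted in this patch pack three status bits (PENDING, MASKED, LINKED) and a 17-bit link field into each 32-bit event word, which is also what fixes the maximum port number at 2^17 = 131072. A few illustrative helpers (not from the patch, and using plain uint32_t per the review comment) show the layout:

```c
#include <assert.h>
#include <stdint.h>

#define EVTCHN_FIFO_PENDING	31
#define EVTCHN_FIFO_MASKED	30
#define EVTCHN_FIFO_LINKED	29
#define EVTCHN_FIFO_LINK_BITS	17
#define EVTCHN_FIFO_LINK_MASK	((1u << EVTCHN_FIFO_LINK_BITS) - 1)

/* Store the next port in the low 17 bits and mark the word linked. */
static uint32_t word_set_link(uint32_t w, uint32_t next)
{
	return (w & ~EVTCHN_FIFO_LINK_MASK)
		| (next & EVTCHN_FIFO_LINK_MASK)
		| (1u << EVTCHN_FIFO_LINKED);
}

/* Extract the link (next port in the queue) from an event word. */
static uint32_t word_link(uint32_t w)
{
	return w & EVTCHN_FIFO_LINK_MASK;
}

/* Test one of the flag bits (PENDING, MASKED or LINKED). */
static int word_test(uint32_t w, unsigned bit)
{
	return (w >> bit) & 1;
}
```

Because flags and link share one word, the hypervisor and guest can update an event's state and its queue position with single-word atomic operations.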
Konrad Rzeszutek Wilk
2013-Sep-24 15:50 UTC
Re: [PATCH 13/13] xen/events: use the FIFO-based ABI if available
On Fri, Sep 13, 2013 at 06:00:01PM +0100, David Vrabel wrote:> From: David Vrabel <david.vrabel@citrix.com> > > If the hypervisor supports the FIFO-based ABI, enable it by > initializing the control block for the boot VCPU and subsequent VCPUs > as they are brought up. The event array is expanded as required when > event ports are setup. > > This implementation has some known limitations (which will be > corrected in a subsequent patch): > > - The timer VIRQ which previously was treated as the highest priority > event has the default priority. > > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > drivers/xen/events/Makefile | 1 + > drivers/xen/events/events.c | 7 +- > drivers/xen/events/events_internal.h | 8 + > drivers/xen/events/fifo.c | 395 ++++++++++++++++++++++++++++++++++ > 4 files changed, 410 insertions(+), 1 deletions(-) > create mode 100644 drivers/xen/events/fifo.c > > diff --git a/drivers/xen/events/Makefile b/drivers/xen/events/Makefile > index aea331e..74644d0 100644 > --- a/drivers/xen/events/Makefile > +++ b/drivers/xen/events/Makefile > @@ -1,2 +1,3 @@ > obj-y += events.o > +obj-y += fifo.o > obj-y += n-level.o > diff --git a/drivers/xen/events/events.c b/drivers/xen/events/events.c > index 6c55b1f..561ae7b 100644 > --- a/drivers/xen/events/events.c > +++ b/drivers/xen/events/events.c > @@ -1518,6 +1518,7 @@ void xen_irq_resume(void) > > /* New event-channel space is not ''live'' yet. */ > xen_evtchn_mask_all(); > + xen_evtchn_resume(); > > /* No IRQ <-> event-channel mappings. 
*/ > list_for_each_entry(info, &xen_irq_list_head, list) > @@ -1618,7 +1619,11 @@ void __init xen_init_IRQ(void) > { > int ret; > > - xen_evtchn_init_nlevel(); > + ret = xen_evtchn_init_fifo(); > + if (ret < 0) { > + printk(KERN_INFO "xen: falling back to n-level event channels"); > + xen_evtchn_init_nlevel(); > + } > > evtchn_to_irq = kcalloc(EVTCHN_ROW(xen_evtchn_max_channels()), > sizeof(*evtchn_to_irq), GFP_KERNEL); > diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h > index 9d8b70c..f7fbcea 100644 > --- a/drivers/xen/events/events_internal.h > +++ b/drivers/xen/events/events_internal.h > @@ -69,6 +69,7 @@ struct evtchn_ops { > void (*unmask)(int port); > > void (*handle_events)(int cpu); > + void (*resume)(void); > }; > > extern const struct evtchn_ops *evtchn_ops; > @@ -137,6 +138,13 @@ static inline void xen_evtchn_handle_events(int cpu) > return evtchn_ops->handle_events(cpu); > } > > +static inline void xen_evtchn_resume(void) > +{ > + if (evtchn_ops->resume) > + evtchn_ops->resume(); > +} > + > void xen_evtchn_init_nlevel(void); > +int xen_evtchn_init_fifo(void); > > #endif /* #ifndef __EVENTS_INTERNAL_H__ */ > diff --git a/drivers/xen/events/fifo.c b/drivers/xen/events/fifo.c > new file mode 100644 > index 0000000..8047430 > --- /dev/null > +++ b/drivers/xen/events/fifo.c > @@ -0,0 +1,395 @@ > +/* > + * Xen event channels (FIFO-based ABI) > + * > + * Copyright (C) 2013 Citrix Systems R&D ltd. > + * > + * This source code is licensed under the GNU General Public License, > + * Version 2 or later. See the file COPYING for more details. 
> + */ > + > +#include <linux/linkage.h> > +#include <linux/interrupt.h> > +#include <linux/irq.h> > +#include <linux/module.h> > +#include <linux/smp.h> > +#include <linux/percpu.h> > +#include <linux/cpu.h> > + > +#include <asm/sync_bitops.h> > +#include <asm/xen/hypercall.h> > +#include <asm/xen/hypervisor.h> > +#include <asm/xen/page.h> > + > +#include <xen/xen.h> > +#include <xen/xen-ops.h> > +#include <xen/events.h> > +#include <xen/interface/xen.h> > +#include <xen/interface/event_channel.h> > + > +#include "events_internal.h" > + > +#define EVENT_WORDS_PER_PAGE (PAGE_SIZE / sizeof(event_word_t)) > +#define MAX_EVENT_ARRAY_PAGES ((1 << EVTCHN_FIFO_LINK_BITS) \ > + / EVENT_WORDS_PER_PAGE) > + > +struct evtchn_fifo_queue { > + uint32_t head[EVTCHN_FIFO_MAX_QUEUES]; > +}; > + > +static DEFINE_PER_CPU(struct evtchn_fifo_control_block *, cpu_control_block); > +static DEFINE_PER_CPU(struct evtchn_fifo_queue, cpu_queue); > +static event_word_t *event_array[MAX_EVENT_ARRAY_PAGES]; > +static unsigned event_array_pages;__read_mostly> + > +#define BM(w) ((unsigned long *)(w)) > + > +static inline event_word_t *event_word_from_port(int port)Should port be ''unsigned int''?> +{ > + int i = port / EVENT_WORDS_PER_PAGE;And this be ''unsigned'' as well?> + > + if (i >= event_array_pages) > + return NULL; > + return event_array[i] + port % EVENT_WORDS_PER_PAGE; > +} > + > +static unsigned fifo_max_channels(void) > +{ > + return 1 << EVTCHN_FIFO_LINK_BITS; > +} > + > +static unsigned fifo_nr_channels(void) > +{ > + return event_array_pages * EVENT_WORDS_PER_PAGE; > +} > + > +static void free_unused_array_pages(void) > +{ > + unsigned i; > + > + for (i = event_array_pages; i < MAX_EVENT_ARRAY_PAGES; i++) { > + if (!event_array[i]) > + break; > + free_page((unsigned long)event_array[i]); > + event_array[i] = NULL; > + } > +} > + > +static int fifo_setup(struct irq_info *info) > +{ > + int port = info->evtchn;unsigned int?> + int i; > + int ret = -ENOMEM; > + > + i = port / 
EVENT_WORDS_PER_PAGE; > + > + if (i >= MAX_EVENT_ARRAY_PAGES) > + return -EINVAL; > + > + while (i >= event_array_pages) { > + void *array_page; > + struct evtchn_expand_array expand_array; > + > + /* Might already have a page if we''ve resumed. */ > + array_page = event_array[event_array_pages]; > + if (!array_page) { > + array_page = (void *)get_zeroed_page(GFP_KERNEL); > + if (array_page == NULL) > + goto error; > + event_array[event_array_pages] = array_page; > + } else > + memset(array_page, 0, PAGE_SIZE); > + > + expand_array.array_mfn = virt_to_mfn(array_page); > + > + ret = HYPERVISOR_event_channel_op(EVTCHNOP_expand_array, &expand_array); > + if (ret < 0) > + goto error; > + > + event_array_pages++; > + } > + return 0; > + > + error: > + if (event_array_pages == 0) > + panic("xen: unable to expand event array with initial page (%d)\n", ret);Shouldn''t we just printk and return a negative value so we can use the old style event channels?> + else > + printk(KERN_ERR "xen: unable to expand event array (%d)\n", ret); > + free_unused_array_pages(); > + return ret; > +} > + > +static void fifo_bind_to_cpu(struct irq_info *info, int cpu) > +{ > + /* no-op */Perhaps you can also say: "The event array is shared between all of the guests VCPUs." 
(from design-E).> +} > + > +static void fifo_clear_pending(int port) > +{ > + event_word_t *word = event_word_from_port(port); > + sync_clear_bit(EVTCHN_FIFO_PENDING, BM(word)); > +} > + > +static void fifo_set_pending(int port) > +{ > + event_word_t *word = event_word_from_port(port); > + sync_set_bit(EVTCHN_FIFO_PENDING, BM(word)); > +} > + > +static bool fifo_is_pending(int port) > +{ > + event_word_t *word = event_word_from_port(port); > + return sync_test_bit(EVTCHN_FIFO_PENDING, BM(word)); > +} > + > +static bool fifo_test_and_set_mask(int port) > +{ > + event_word_t *word = event_word_from_port(port); > + return sync_test_and_set_bit(EVTCHN_FIFO_MASKED, BM(word)); > +} > + > +static void fifo_mask(int port) > +{ > + event_word_t *word = event_word_from_port(port); > + if (word) > + sync_set_bit(EVTCHN_FIFO_MASKED, BM(word)); > +} > + > +static void fifo_unmask(int port) > +{ > + unsigned int cpu = get_cpu(); > + bool do_hypercall = false; > + bool evtchn_pending = false; > + > + BUG_ON(!irqs_disabled()); > + > + if (unlikely((cpu != cpu_from_evtchn(port)))) > + do_hypercall = true;Since the event array is shared across all of the vCPUs is this still needed?> + else { > + event_word_t *word = event_word_from_port(port); > + > + sync_clear_bit(EVTCHN_FIFO_MASKED, BM(word)); > + evtchn_pending = sync_test_bit(EVTCHN_FIFO_PENDING, BM(word)); > + if (evtchn_pending) > + do_hypercall = true; > + } > + > + if (do_hypercall) { > + struct evtchn_unmask unmask = { .port = port }; > + (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); > + } > + > + put_cpu(); > +} > + > +static uint32_t clear_linked(volatile event_word_t *word) > +{ > + event_word_t n, o, w;How about just ''new'', ''old'', ''temp'' ?> + > + w = *word; > + > + do { > + o = w; > + n = (w & ~((1 << EVTCHN_FIFO_LINKED) | EVTCHN_FIFO_LINK_MASK)); > + } while ( (w = sync_cmpxchg(word, o, n)) != o ); > + > + return w & EVTCHN_FIFO_LINK_MASK; > +} > + > +static void handle_irq_for_port(int 
port)unsigned int ?> +{ > + int irq; > + struct irq_desc *desc; > + > + irq = get_evtchn_to_irq(port); > + if (irq != -1) { > + desc = irq_to_desc(irq); > + if (desc) > + generic_handle_irq_desc(irq, desc); > + } > +} > + > +static void consume_one_event(int cpu, > + struct evtchn_fifo_control_block *control_block, > + int priority, uint32_t *ready) > +{ > + struct evtchn_fifo_queue *q = &per_cpu(cpu_queue, cpu); > + uint32_t head; > + int port;unsigned?> + event_word_t *word; > + > + head = q->head[priority]; > + > + /* Reached the tail last time? Read the new HEAD from the > + control block. */ > + if (head == 0) { > + rmb(); /* Ensure word is up-to-date before reading head. */ > + head = control_block->head[priority]; > + } > + > + port = head; > + word = event_word_from_port(port); > + head = clear_linked(word); > + > + /* > + * If the link is non-zero, there are more events in the > + * queue, otherwise the queue is empty. > + * > + * If the queue is empty, clear this priority from our local > + * copy of the ready word. > + */ > + if (head == 0) > + clear_bit(priority, BM(ready)); > + > + if (sync_test_bit(EVTCHN_FIFO_PENDING, BM(word)) > + && !sync_test_bit(EVTCHN_FIFO_MASKED, BM(word))) > + handle_irq_for_port(port); > + > + q->head[priority] = head; > +} > + > +#define EVTCHN_FIFO_READY_MASK ((1 << EVTCHN_FIFO_MAX_QUEUES) - 1)That could go to the header that defines EVTCHN_FIFO_MAX_QUEUES.> + > +static void fifo_handle_events(int cpu) > +{ > + struct evtchn_fifo_control_block *control_block; > + uint32_t ready; > + int q; > + > + control_block = per_cpu(cpu_control_block, cpu); > + > + ready = xchg(&control_block->ready, 0); > + > + while (ready & EVTCHN_FIFO_READY_MASK) { > + q = find_first_bit(BM(&ready), EVTCHN_FIFO_MAX_QUEUES); > + consume_one_event(cpu, control_block, q, &ready); > + ready |= xchg(&control_block->ready, 0);Say priority 0 is where VIRQ_TIMER is set and the TIMER is set to be incredibly short. 
Couldn't this cause a scenario where the guest clears the ready in
consume_one_event (clear_bit(0, BM(ready))) and the hypervisor at the
same time can set the bit. We then process the IRQ (virq_timer).

Then we get to the xchg here and end up with the ready having the
priority 0 bit set again.

And then loop again and again..

> +	}
> +}
> +
> +static void fifo_resume(void)
> +{
> +	unsigned cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		void *control_block = per_cpu(cpu_control_block, cpu);
> +		struct evtchn_init_control init_control;
> +		int ret;
> +
> +		if (!control_block)
> +			continue;
> +
> +		/*
> +		 * If this CPU is offline, take the opportunity to
> +		 * free the control block while it is not being
> +		 * used.
> +		 */
> +		if (!cpu_online(cpu)) {
> +			free_page((unsigned long)control_block);
> +			per_cpu(cpu_control_block, cpu) = NULL;
> +			continue;
> +		}
> +
> +		init_control.control_mfn = virt_to_mfn(control_block);
> +		init_control.offset = 0;
> +		init_control.vcpu = cpu;
> +
> +		ret = HYPERVISOR_event_channel_op(EVTCHNOP_init_control,
> +						  &init_control);
> +		if (ret < 0)
> +			BUG();

So if this is run after migration to an older hypervisor won't we
blow up here? Shouldn't we just exit with a return and use the
old style classic events?

> +	}
> +
> +	/*
> +	 * The event array starts out as empty again and is extended
> +	 * as normal when events are bound.  The existing pages will
> +	 * be reused.
> +	 */
> +	event_array_pages = 0;
> +}
> +
> +static const struct evtchn_ops evtchn_ops_fifo = {
> +	.max_channels      = fifo_max_channels,
> +	.nr_channels       = fifo_nr_channels,
> +	.setup             = fifo_setup,
> +	.bind_to_cpu       = fifo_bind_to_cpu,
> +	.clear_pending     = fifo_clear_pending,
> +	.set_pending       = fifo_set_pending,
> +	.is_pending        = fifo_is_pending,
> +	.test_and_set_mask = fifo_test_and_set_mask,
> +	.mask              = fifo_mask,
> +	.unmask            = fifo_unmask,
> +	.handle_events     = fifo_handle_events,
> +	.resume            = fifo_resume,
> +};
> +
> +static int __cpuinit fifo_init_control_block(int cpu)
> +{
> +	struct page *control_block = NULL;
> +	struct evtchn_init_control init_control;
> +	int ret = -ENOMEM;
> +
> +	control_block = alloc_page(GFP_KERNEL|__GFP_ZERO);
> +	if (control_block == NULL)
> +		goto error;
> +
> +	init_control.control_mfn = virt_to_mfn(page_address(control_block));
> +	init_control.offset = 0;
> +	init_control.vcpu = cpu;
> +
> +	ret = HYPERVISOR_event_channel_op(EVTCHNOP_init_control, &init_control);
> +	if (ret < 0)
> +		goto error;
> +
> +	per_cpu(cpu_control_block, cpu) = page_address(control_block);
> +
> +	return 0;
> +
> + error:
> +	__free_page(control_block);
> +	return ret;
> +}
> +
> +static int __cpuinit fifo_cpu_notification(struct notifier_block *self,
> +					   unsigned long action, void *hcpu)
> +{
> +	int cpu = (long)hcpu;
> +	int ret = 0;
> +
> +	switch (action) {
> +	case CPU_UP_PREPARE:
> +		if (!per_cpu(cpu_control_block, cpu))
> +			ret = fifo_init_control_block(cpu);
> +		break;
> +	default:
> +		break;
> +	}
> +	return ret < 0 ? NOTIFY_BAD : NOTIFY_OK;

I think you should return NOTIFY_OK.

Say you do this:

 1) Launch guest on a new hypervisor (which has FIFO ABI)
 2) Bring down all CPUs but one.
 3) Migrate to an older hypervisor (no FIFO ABI)
 4) Bring up all of the CPUs.

We shouldn't return NOTIFY_BAD b/c the hypervisor we are resuming
on is too old.
We should just continue on without the FIFO mechanism in place.

> +}
> +
> +static struct notifier_block fifo_cpu_notifier __cpuinitdata = {
> +	.notifier_call	= fifo_cpu_notification,
> +};
> +
> +
> +int __init xen_evtchn_init_fifo(void)
> +{
> +	int cpu = get_cpu();
> +	int ret;
> +
> +	ret = fifo_init_control_block(cpu);
> +	if (ret < 0)
> +		goto error;
> +
> +	printk(KERN_INFO "xen: switching to FIFO-based event channels\n");
> +
> +	evtchn_ops = &evtchn_ops_fifo;
> +
> +	register_cpu_notifier(&fifo_cpu_notifier);
> +
> +	put_cpu();
> +	return 0;
> +
> + error:
> +	put_cpu();
> +	return ret;
> +}
> --
> 1.7.2.5
>
David Vrabel
2013-Sep-24 16:04 UTC
Re: [PATCH 01/13] xen/events: refactor retrigger_dynirq() and resend_irq_on_evtchn()
On 24/09/2013 15:37, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 13, 2013 at 05:59:49PM +0100, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> These two functions did the same thing with different parameters, put
>> the common bits in retrigger_evtchn().
>
> Plus the 'resend_irq_on_evtchn' does not look to be used by anybody.

It's used by ia64, but I guess that counts as not used by anybody...

The caller in ia64 ignores the return value, which is why we get away
with this refactoring.

David
David Vrabel
2013-Sep-24 16:06 UTC
Re: [PATCH 02/13] xen/events: remove unnecessary init_evtchn_cpu_bindings()
On 24/09/2013 15:40, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 13, 2013 at 05:59:50PM +0100, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> Event channels are always explicitly bound to a specific VCPU before
>> they are first enabled. There is no need to initialize all possible
>> events as bound to VCPU 0 at start of day or after a resume.
>
> How long has this presumption about explicit binding been in the
> hypervisor?

This is about the Linux-side binding -- i.e., the adjustment of the
per-cpu masks.  There's always an explicit call to update these masks
after an event channel is bound.

David
David Vrabel
2013-Sep-24 16:11 UTC
Re: [PATCH 12/13] xen/events: Add the hypervisor interface for the FIFO-based event channels
On 24/09/2013 16:08, Konrad Rzeszutek Wilk wrote:
>> +/*
>> + * FIFO ABI
>> + */
>> +
>> +/* Events may have priorities from 0 (highest) to 15 (lowest). */
>> +#define EVTCHN_FIFO_PRIORITY_MIN     15
>> +#define EVTCHN_FIFO_PRIORITY_DEFAULT 7
>> +
>> +#define EVTCHN_FIFO_MAX_QUEUES (EVTCHN_FIFO_PRIORITY_MIN + 1)
>> +
>> +typedef uint32_t event_word_t;
>
> Typedefs are frowned upon in the Linux kernel. Can you just use uint32_t
> please?

Checkpatch frowned at this too, but I just frowned back.  I think this
specific typedef improves maintainability, so I'm not inclined to remove
it to satisfy an arbitrary rule.

David
David Vrabel
2013-Sep-24 16:48 UTC
Re: [PATCH 13/13] xen/events: use the FIFO-based ABI if available
On 24/09/2013 16:50, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 13, 2013 at 06:00:01PM +0100, David Vrabel wrote:
>> +
>> + error:
>> +	if (event_array_pages == 0)
>> +		panic("xen: unable to expand event array with initial page (%d)\n", ret);
>
> Shouldn't we just printk and return a negative value so we can use
> the old style event channels?

No, once you've called EVTCHNOP_init_control for at least one VCPU the
hypervisor has switched ABIs and there is (deliberately[*]) no mechanism
to switch back.

[*] it greatly simplifies the hypervisor ABI and implementation if we
don't have to unwind a half-initialized switch over.  This failure case
will never happen in practice as the hypervisor will run out of domheap
pages for the guest memory before it runs out of global mapping space.

>> +	else
>> +		printk(KERN_ERR "xen: unable to expand event array (%d)\n", ret);
>> +	free_unused_array_pages();
>> +	return ret;
>> +}
>> +
>> +static void fifo_bind_to_cpu(struct irq_info *info, int cpu)
>> +{
>> +	/* no-op */
>
> Perhaps you can also say: "The event array is shared between all of the
> guest's VCPUs." (from design-E).

That's not relevant here.  This is a no-op because there is no guest
side state necessary for tracking which VCPU an event channel is bound
to -- this is all done by the per-VCPU queues.

>> +static void fifo_unmask(int port)
>> +{
>> +	unsigned int cpu = get_cpu();
>> +	bool do_hypercall = false;
>> +	bool evtchn_pending = false;
>> +
>> +	BUG_ON(!irqs_disabled());
>> +
>> +	if (unlikely((cpu != cpu_from_evtchn(port))))
>> +		do_hypercall = true;
>
> Since the event array is shared across all of the vCPUs is this still
> needed?

The shared event array isn't relevant (the 2-level ABI also has shared
pending and mask arrays).
But, since a hypercall is always required if an event is pending, we
can omit the above check -- it's not possible (as in the 2-level case)
to optimize out the hypercall if we're already on the VCPU.

>> +static uint32_t clear_linked(volatile event_word_t *word)
>> +{
>> +	event_word_t n, o, w;
>
> How about just 'new', 'old', 'temp'?

The letters match the design doc.

>> +
>> +static void fifo_handle_events(int cpu)
>> +{
>> +	struct evtchn_fifo_control_block *control_block;
>> +	uint32_t ready;
>> +	int q;
>> +
>> +	control_block = per_cpu(cpu_control_block, cpu);
>> +
>> +	ready = xchg(&control_block->ready, 0);
>> +
>> +	while (ready & EVTCHN_FIFO_READY_MASK) {
>> +		q = find_first_bit(BM(&ready), EVTCHN_FIFO_MAX_QUEUES);
>> +		consume_one_event(cpu, control_block, q, &ready);
>> +		ready |= xchg(&control_block->ready, 0);
>
> Say priority 0 is where VIRQ_TIMER is set and the TIMER is set to
> be incredibly short.
>
> Couldn't this cause a scenario where the guest clears the ready in
> consume_one_event (clear_bit(0, BM(ready))) and the hypervisor
> at the same time can set the bit. We then process the IRQ (virq_timer).
>
> Then we get to the xchg here and end up with the ready having the
> priority 0 bit set again.
>
> And then loop again and again..

Don't do that then.  This is no different to the behaviour of hardware
interrupts.

>> +	ret = HYPERVISOR_event_channel_op(EVTCHNOP_init_control,
>> +					  &init_control);
>> +	if (ret < 0)
>> +		BUG();
>
> So if this is run after migration to an older hypervisor won't we
> blow up here? Shouldn't we just exit with a return and use the
> old style classic events?

Migrating to older hypervisors has never been supported.
You can only go forwards.

>> +static int __cpuinit fifo_cpu_notification(struct notifier_block *self,
>> +					   unsigned long action, void *hcpu)
>> +{
>> +	int cpu = (long)hcpu;
>> +	int ret = 0;
>> +
>> +	switch (action) {
>> +	case CPU_UP_PREPARE:
>> +		if (!per_cpu(cpu_control_block, cpu))
>> +			ret = fifo_init_control_block(cpu);
>> +		break;
>> +	default:
>> +		break;
>> +	}
>> +	return ret < 0 ? NOTIFY_BAD : NOTIFY_OK;
>
> I think you should return NOTIFY_OK.
>
> Say you do this:
>
>  1) Launch guest on a new hypervisor (which has FIFO ABI)
>  2) Bring down all CPUs but one.
>  3) Migrate to an older hypervisor (no FIFO ABI)
>  4) Bring up all of the CPUs.
>
> We shouldn't return NOTIFY_BAD b/c the hypervisor we are resuming
> on is too old. We should just continue on without the FIFO mechanism
> in place.

Again, migrating to an older hypervisor has never been supported.  Also,
once we have switched to the FIFO-based ABI for one VCPU we cannot go
back, and it's better to fail to bring up the VCPU than to have one that
cannot receive events.

David
Konrad Rzeszutek Wilk
2013-Sep-24 16:49 UTC
Re: [PATCH 02/13] xen/events: remove unnecessary init_evtchn_cpu_bindings()
On Tue, Sep 24, 2013 at 05:06:06PM +0100, David Vrabel wrote:
> On 24/09/2013 15:40, Konrad Rzeszutek Wilk wrote:
> > On Fri, Sep 13, 2013 at 05:59:50PM +0100, David Vrabel wrote:
> >> From: David Vrabel <david.vrabel@citrix.com>
> >>
> >> Event channels are always explicitly bound to a specific VCPU before
> >> they are first enabled. There is no need to initialize all possible
> >> events as bound to VCPU 0 at start of day or after a resume.
> >
> > How long has this presumption about explicit binding been in the
> > hypervisor?
>
> This is about the Linux-side binding -- i.e., the adjustment of the
> per-cpu masks.  There's always an explicit call to update these masks
> after an event channel is bound.

Ooh. Could you add that please in the commit description? Thank you.

> David
Konrad Rzeszutek Wilk
2013-Sep-24 16:51 UTC
Re: [PATCH 12/13] xen/events: Add the hypervisor interface for the FIFO-based event channels
On Tue, Sep 24, 2013 at 05:11:54PM +0100, David Vrabel wrote:
> On 24/09/2013 16:08, Konrad Rzeszutek Wilk wrote:
> >> +/*
> >> + * FIFO ABI
> >> + */
> >> +
> >> +/* Events may have priorities from 0 (highest) to 15 (lowest). */
> >> +#define EVTCHN_FIFO_PRIORITY_MIN     15
> >> +#define EVTCHN_FIFO_PRIORITY_DEFAULT 7
> >> +
> >> +#define EVTCHN_FIFO_MAX_QUEUES (EVTCHN_FIFO_PRIORITY_MIN + 1)
> >> +
> >> +typedef uint32_t event_word_t;
> >
> > Typedefs are frowned upon in the Linux kernel. Can you just use uint32_t
> > please?
>
> Checkpatch frowned at this too, but I just frowned back.  I think this
> specific typedef improves maintainability, so I'm not inclined to remove
> it to satisfy an arbitrary rule.

<laughs>

<laughs some more>

typedefs it is then.

> David
Konrad Rzeszutek Wilk
2013-Sep-24 17:04 UTC
Re: [PATCH 13/13] xen/events: use the FIFO-based ABI if available
On Tue, Sep 24, 2013 at 05:48:58PM +0100, David Vrabel wrote:
> On 24/09/2013 16:50, Konrad Rzeszutek Wilk wrote:
> > On Fri, Sep 13, 2013 at 06:00:01PM +0100, David Vrabel wrote:
> >> +
> >> + error:
> >> +	if (event_array_pages == 0)
> >> +		panic("xen: unable to expand event array with initial page (%d)\n", ret);
> >
> > Shouldn't we just printk and return a negative value so we can use
> > the old style event channels?
>
> No, once you've called EVTCHNOP_init_control for at least one VCPU the
> hypervisor has switched ABIs and there is (deliberately[*]) no mechanism
> to switch back.
>
> [*] it greatly simplifies the hypervisor ABI and implementation if we
> don't have to unwind a half-initialized switch over.  This failure case
> will never happen in practice as the hypervisor will run out of domheap
> pages for the guest memory before it runs out of global mapping space.

OK. And that is nicely mentioned in the design doc which I failed to
spot until now.

> >> +	else
> >> +		printk(KERN_ERR "xen: unable to expand event array (%d)\n", ret);
> >> +	free_unused_array_pages();
> >> +	return ret;
> >> +}
> >> +
> >> +static void fifo_bind_to_cpu(struct irq_info *info, int cpu)
> >> +{
> >> +	/* no-op */
> >
> > Perhaps you can also say: "The event array is shared between all of the
> > guest's VCPUs." (from design-E).
>
> That's not relevant here.  This is a no-op because there is no guest
> side state necessary for tracking which VCPU an event channel is bound
> to -- this is all done by the per-VCPU queues.

Ah, gotcha.

> >> +static void fifo_unmask(int port)
> >> +{
> >> +	unsigned int cpu = get_cpu();
> >> +	bool do_hypercall = false;
> >> +	bool evtchn_pending = false;
> >> +
> >> +	BUG_ON(!irqs_disabled());
> >> +
> >> +	if (unlikely((cpu != cpu_from_evtchn(port))))
> >> +		do_hypercall = true;
> >
> > Since the event array is shared across all of the vCPUs is this still
> > needed?
>
> The shared event array isn't relevant (the 2-level ABI also has shared
> pending and mask arrays).
>
> But, since a hypercall is always required if an event is pending, we
> can omit the above check -- it's not possible (as in the 2-level case)
> to optimize out the hypercall if we're already on the VCPU.
>
> >> +static uint32_t clear_linked(volatile event_word_t *word)
> >> +{
> >> +	event_word_t n, o, w;
> >
> > How about just 'new', 'old', 'temp'?
>
> The letters match the design doc.

Since you have to modify the design document anyway would it make sense
to alter it as well to have less cryptic variables?

> >> +
> >> +static void fifo_handle_events(int cpu)
> >> +{
> >> +	struct evtchn_fifo_control_block *control_block;
> >> +	uint32_t ready;
> >> +	int q;
> >> +
> >> +	control_block = per_cpu(cpu_control_block, cpu);
> >> +
> >> +	ready = xchg(&control_block->ready, 0);
> >> +
> >> +	while (ready & EVTCHN_FIFO_READY_MASK) {
> >> +		q = find_first_bit(BM(&ready), EVTCHN_FIFO_MAX_QUEUES);
> >> +		consume_one_event(cpu, control_block, q, &ready);
> >> +		ready |= xchg(&control_block->ready, 0);
> >
> > Say priority 0 is where VIRQ_TIMER is set and the TIMER is set to
> > be incredibly short.
> >
> > Couldn't this cause a scenario where the guest clears the ready in
> > consume_one_event (clear_bit(0, BM(ready))) and the hypervisor
> > at the same time can set the bit. We then process the IRQ (virq_timer).
> >
> > Then we get to the xchg here and end up with the ready having the
> > priority 0 bit set again.
> >
> > And then loop again and again..
>
> Don't do that then.  This is no different to the behaviour of hardware
> interrupts.

I was thinking there might be some mitigation technique - and there is.
The Linux kernel would disable the IRQ, which hopefully means that this
defective port ends up being masked.

> >> +	ret = HYPERVISOR_event_channel_op(EVTCHNOP_init_control,
> >> +					  &init_control);
> >> +	if (ret < 0)
> >> +		BUG();
> >
> > So if this is run after migration to an older hypervisor won't we
> > blow up here? Shouldn't we just exit with a return and use the
> > old style classic events?
>
> Migrating to older hypervisors has never been supported. You can only
> go forwards.

OK.

> >> +static int __cpuinit fifo_cpu_notification(struct notifier_block *self,
> >> +					   unsigned long action, void *hcpu)
> >> +{
> >> +	int cpu = (long)hcpu;
> >> +	int ret = 0;
> >> +
> >> +	switch (action) {
> >> +	case CPU_UP_PREPARE:
> >> +		if (!per_cpu(cpu_control_block, cpu))
> >> +			ret = fifo_init_control_block(cpu);
> >> +		break;
> >> +	default:
> >> +		break;
> >> +	}
> >> +	return ret < 0 ? NOTIFY_BAD : NOTIFY_OK;
> >
> > I think you should return NOTIFY_OK.
> >
> > Say you do this:
> >
> >  1) Launch guest on a new hypervisor (which has FIFO ABI)
> >  2) Bring down all CPUs but one.
> >  3) Migrate to an older hypervisor (no FIFO ABI)
> >  4) Bring up all of the CPUs.
> >
> > We shouldn't return NOTIFY_BAD b/c the hypervisor we are resuming
> > on is too old. We should just continue on without the FIFO mechanism
> > in place.
>
> Again, migrating to an older hypervisor has never been supported.  Also,
> once we have switched to the FIFO-based ABI for one VCPU we cannot go
> back, and it's better to fail to bring up the VCPU than to have one that
> cannot receive events.

OK.

> David
David Vrabel
2013-Sep-26 10:53 UTC
Re: [PATCH 10/13] xen/events: Refactor evtchn_to_irq array to be dynamically allocated
On 24/09/13 15:58, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 13, 2013 at 05:59:58PM +0100, David Vrabel wrote:
>> From: Malcolm Crossley <malcolm.crossley@citrix.com>
>>
>> Refactor static array evtchn_to_irq array to be dynamically allocated by
>> implementing get and set functions for accesses to the array.
>>
>> Two new port ops are added: max_channels (maximum supported number of
>> event channels) and nr_channels (number of currently usable event
>> channels). For the N-level ABI, these numbers are both the same as
>> the shared data structure is a fixed size. For the FIFO ABI, these
>                                          ^^ has

I think either usage is fine.

>> +	if (evtchn_to_irq[row] == NULL) {
>> +		/* Unallocated irq entries return -1 anyway */
>                   ^^^^^ - are

Unallocated entries can't have a value because they're, well,
unallocated.  So I think the original phrasing is more correct.

"get_evtchn_to_irq() returns -1 for unallocated entries anyway" would be
slightly better though.

David

ps. Can you also trim your replies in future?  As in between the grammar
nit-picking there was an actual bug that needed fixing and this is hard
to find with the untrimmed quoting.
David Vrabel
2013-Sep-26 11:06 UTC
Re: [PATCH 12/13] xen/events: Add the hypervisor interface for the FIFO-based event channels
On 24/09/13 16:06, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 13, 2013 at 06:00:00PM +0100, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> Add the hypercall sub-ops and the structures for the shared data used
>> in the FIFO-based event channel ABI.
>
> And you should include a URL to the design document. Or include the
> text version of it (if possible) in this description.

The design doc is 16 pages and has diagrams, not something that could be
included in a commit message.  I'll add a link.

>> +struct evtchn_fifo_control_block {
>> +	uint32_t     ready;
>> +	uint32_t     _rsvd1;
>
> The design document looks to have 'READY' be 2 bytes? Octet 0 and 1?
> (figure 2).

The docs need to be updated -- we only require the guest to be able to
do atomic ops on 32-bit values.

David
Konrad Rzeszutek Wilk
2013-Sep-26 12:51 UTC
Re: [PATCH 10/13] xen/events: Refactor evtchn_to_irq array to be dynamically allocated
On Thu, Sep 26, 2013 at 11:53:07AM +0100, David Vrabel wrote:
> On 24/09/13 15:58, Konrad Rzeszutek Wilk wrote:
> > On Fri, Sep 13, 2013 at 05:59:58PM +0100, David Vrabel wrote:
> >> From: Malcolm Crossley <malcolm.crossley@citrix.com>
> >>
> >> Refactor static array evtchn_to_irq array to be dynamically allocated by
> >> implementing get and set functions for accesses to the array.
> >>
> >> Two new port ops are added: max_channels (maximum supported number of
> >> event channels) and nr_channels (number of currently usable event
> >> channels). For the N-level ABI, these numbers are both the same as
> >> the shared data structure is a fixed size. For the FIFO ABI, these
> >                                          ^^ has
>
> I think either usage is fine.
>
> >> +	if (evtchn_to_irq[row] == NULL) {
> >> +		/* Unallocated irq entries return -1 anyway */
> >                   ^^^^^ - are
>
> Unallocated entries can't have a value because they're, well,
> unallocated.  So I think the original phrasing is more correct.

But right after that you have 'return 0;' which threw me out.

> "get_evtchn_to_irq() returns -1 for unallocated entries anyway" would be
> slightly better though.

That would do it too.

> David
>
> ps. Can you also trim your replies in future?  As in between the grammar
> nit-picking there was an actual bug that needed fixing and this is hard
> to find with the untrimmed quoting.