Stefano Stabellini
2010-Aug-30 11:23 UTC
[Xen-devel] [PATCH 0 of 4] interrupt remapping in HVM guests
Hi all,
this patch series implements the mechanisms needed by a PV on HVM guest
to remap interrupts and MSIs into pirqs in order to receive those
interrupts and MSIs as xen events.
This allows the guest to avoid any reads and writes to the emulated
LAPIC.

The series consists of 4 patches, 2 patches for the hypervisor and 2
patches to qemu-xen; the changes are not interdependent.

Xen needs to export some pirq related physdev_op hypercalls to hvm
guests and keep track of all the remapped interrupts, which can either
be interrupts corresponding to emulated devices or passthrough devices.
A new physdev_op hypercall has been added to export the number of pirqs
available: considering that the guest is allowed to choose the pirq
number, this lets it make sure the number it chooses is in the allowed
range.

The first patch to qemu-xen is to support MSI remapping: in order to
remap an MSI into a pirq the guest enables the MSI passing 0 as vector
number; in response qemu-xen reads the address and uses it as the pirq
number for the following mapping request to xen.
Finally we avoid pirq conflicts by letting xen pick the pirq number for
us in qemu-xen.

The list of patches follows:

xen: interrupt remapping in HVM guests
xen: introduce PHYSDEVOP_get_nr_pirqs
qemu-xen: support PV on HVM MSI remapping
qemu-xen: let xen choose the pirq number

Cheers,

Stefano Stabellini

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
stefano.stabellini@eu.citrix.com
2010-Aug-30 11:25 UTC
[Xen-devel] [PATCH 1 of 4] xen: interrupt remapping in HVM guests
This patch allows HVM guests to remap interrupts and MSIs into pirqs;
once the mapping is in place the guest will receive the interrupt (or
the MSI) as an event.
The interrupt to be remapped can either be an interrupt of an emulated
device or an interrupt of a passthrough device and we keep track of
that.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff -r 5218db847b58 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/arch/x86/domain.c Fri Aug 20 16:47:36 2010 +0100
@@ -490,6 +490,16 @@ int arch_domain_create(struct domain *d,
             if ( !IO_APIC_IRQ(i) )
                 d->arch.irq_pirq[i] = d->arch.pirq_irq[i] = i;

+        d->arch.pirq_emuirq = xmalloc_array(int, d->nr_pirqs);
+        d->arch.emuirq_pirq = xmalloc_array(int, nr_irqs);
+        if ( !d->arch.pirq_emuirq || !d->arch.emuirq_pirq )
+            goto fail;
+        memset(d->arch.pirq_emuirq, IRQ_UNBOUND,
+               d->nr_pirqs * sizeof(*d->arch.pirq_emuirq));
+        memset(d->arch.emuirq_pirq, IRQ_UNBOUND,
+               d->nr_pirqs * sizeof(*d->arch.emuirq_pirq));
+
+
         if ( (rc = iommu_domain_init(d)) != 0 )
             goto fail;

diff -r 5218db847b58 xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/arch/x86/hvm/hvm.c Fri Aug 20 16:47:36 2010 +0100
@@ -2281,6 +2281,20 @@ static long hvm_memory_op(int cmd, XEN_G
     return rc;
 }

+static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE(void) arg)
+{
+    switch ( cmd )
+    {
+        case PHYSDEVOP_map_pirq:
+        case PHYSDEVOP_unmap_pirq:
+        case PHYSDEVOP_eoi:
+        case PHYSDEVOP_irq_status_query:
+            return do_physdev_op(cmd, arg);
+        default:
+            return -ENOSYS;
+    }
+}
+
 static long hvm_vcpu_op(
     int cmd, int vcpuid, XEN_GUEST_HANDLE(void) arg)
 {
@@ -2316,6 +2330,7 @@ static hvm_hypercall_t *hvm_hypercall32_
     [ __HYPERVISOR_memory_op ] = (hvm_hypercall_t *)hvm_memory_op,
     [ __HYPERVISOR_grant_table_op ] = (hvm_hypercall_t *)hvm_grant_table_op,
     [ __HYPERVISOR_vcpu_op ] = (hvm_hypercall_t *)hvm_vcpu_op,
+    [ __HYPERVISOR_physdev_op ] = (hvm_hypercall_t *)hvm_physdev_op,
     HYPERCALL(xen_version),
     HYPERCALL(event_channel_op),
     HYPERCALL(sched_op),
@@ -2366,10 +2381,28 @@ static long hvm_vcpu_op_compat32(
     return rc;
 }

+static long hvm_physdev_op_compat32(
+    int cmd, XEN_GUEST_HANDLE(void) arg)
+{
+    switch ( cmd )
+    {
+        case PHYSDEVOP_map_pirq:
+        case PHYSDEVOP_unmap_pirq:
+        case PHYSDEVOP_eoi:
+        case PHYSDEVOP_irq_status_query:
+            return compat_physdev_op(cmd, arg);
+        break;
+        default:
+            return -ENOSYS;
+        break;
+    }
+}
+
 static hvm_hypercall_t *hvm_hypercall64_table[NR_hypercalls] = {
     [ __HYPERVISOR_memory_op ] = (hvm_hypercall_t *)hvm_memory_op,
     [ __HYPERVISOR_grant_table_op ] = (hvm_hypercall_t *)hvm_grant_table_op,
     [ __HYPERVISOR_vcpu_op ] = (hvm_hypercall_t *)hvm_vcpu_op,
+    [ __HYPERVISOR_physdev_op ] = (hvm_hypercall_t *)hvm_physdev_op,
     HYPERCALL(xen_version),
     HYPERCALL(event_channel_op),
     HYPERCALL(sched_op),
@@ -2382,6 +2415,7 @@ static hvm_hypercall_t *hvm_hypercall32_
     [ __HYPERVISOR_memory_op ] = (hvm_hypercall_t *)hvm_memory_op_compat32,
     [ __HYPERVISOR_grant_table_op ] = (hvm_hypercall_t *)hvm_grant_table_op_compat32,
     [ __HYPERVISOR_vcpu_op ] = (hvm_hypercall_t *)hvm_vcpu_op_compat32,
+    [ __HYPERVISOR_physdev_op ] = (hvm_hypercall_t *)hvm_physdev_op_compat32,
     HYPERCALL(xen_version),
     HYPERCALL(event_channel_op),
     HYPERCALL(sched_op),
diff -r 5218db847b58 xen/arch/x86/hvm/irq.c
--- a/xen/arch/x86/hvm/irq.c Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/arch/x86/hvm/irq.c Fri Aug 20 16:47:36 2010 +0100
@@ -23,9 +23,30 @@
 #include <xen/types.h>
 #include <xen/event.h>
 #include <xen/sched.h>
+#include <xen/irq.h>
 #include <asm/hvm/domain.h>
 #include <asm/hvm/support.h>

+/* Must be called with hvm_domain->irq_lock hold */
+static void assert_irq(struct domain *d, unsigned ioapic_gsi, unsigned pic_irq)
+{
+    int pirq = domain_emuirq_to_pirq(d, ioapic_gsi);
+    if ( pirq != IRQ_UNBOUND )
+    {
+        send_guest_pirq(d, pirq);
+        return;
+    }
+    vioapic_irq_positive_edge(d, ioapic_gsi);
+    vpic_irq_positive_edge(d, pic_irq);
+}
+
+/* Must be called with hvm_domain->irq_lock hold */
+static void deassert_irq(struct domain *d, unsigned isa_irq)
+{
+    if ( domain_emuirq_to_pirq(d, isa_irq) != IRQ_UNBOUND )
+        vpic_irq_negative_edge(d, isa_irq);
+}
+
 static void __hvm_pci_intx_assert(
     struct domain *d, unsigned int device, unsigned int intx)
 {
@@ -45,10 +66,7 @@ static void __hvm_pci_intx_assert(
     isa_irq = hvm_irq->pci_link.route[link];
     if ( (hvm_irq->pci_link_assert_count[link]++ == 0) && isa_irq &&
          (hvm_irq->gsi_assert_count[isa_irq]++ == 0) )
-    {
-        vioapic_irq_positive_edge(d, isa_irq);
-        vpic_irq_positive_edge(d, isa_irq);
-    }
+        assert_irq(d, isa_irq, isa_irq);
 }

 void hvm_pci_intx_assert(
@@ -77,7 +95,7 @@ static void __hvm_pci_intx_deassert(
     isa_irq = hvm_irq->pci_link.route[link];
     if ( (--hvm_irq->pci_link_assert_count[link] == 0) && isa_irq &&
          (--hvm_irq->gsi_assert_count[isa_irq] == 0) )
-        vpic_irq_negative_edge(d, isa_irq);
+        deassert_irq(d, isa_irq);
 }

 void hvm_pci_intx_deassert(
@@ -100,10 +118,7 @@ void hvm_isa_irq_assert(

     if ( !__test_and_set_bit(isa_irq, &hvm_irq->isa_irq.i) &&
          (hvm_irq->gsi_assert_count[gsi]++ == 0) )
-    {
-        vioapic_irq_positive_edge(d, gsi);
-        vpic_irq_positive_edge(d, isa_irq);
-    }
+        assert_irq(d, gsi, isa_irq);

     spin_unlock(&d->arch.hvm_domain.irq_lock);
 }
@@ -120,7 +135,7 @@ void hvm_isa_irq_deassert(

     if ( __test_and_clear_bit(isa_irq, &hvm_irq->isa_irq.i) &&
          (--hvm_irq->gsi_assert_count[gsi] == 0) )
-        vpic_irq_negative_edge(d, isa_irq);
+        deassert_irq(d, isa_irq);

     spin_unlock(&d->arch.hvm_domain.irq_lock);
 }
diff -r 5218db847b58 xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/arch/x86/irq.c Fri Aug 20 16:47:36 2010 +0100
@@ -1404,7 +1404,11 @@ int get_free_pirq(struct domain *d, int
     {
         for ( i = 16; i < nr_irqs_gsi; i++ )
             if ( !d->arch.pirq_irq[i] )
-                break;
+            {
+                if ( !is_hvm_domain(d) ||
+                        d->arch.pirq_emuirq[i] == IRQ_UNBOUND )
+                    break;
+            }
         if ( i == nr_irqs_gsi )
             return -ENOSPC;
     }
@@ -1412,7 +1416,11 @@ int get_free_pirq(struct domain *d, int
     {
         for ( i = d->nr_pirqs - 1; i >= nr_irqs_gsi; i-- )
             if ( !d->arch.pirq_irq[i] )
-                break;
+            {
+                if ( !is_hvm_domain(d) ||
+                        d->arch.pirq_emuirq[i] == IRQ_UNBOUND )
+                    break;
+            }
         if ( i < nr_irqs_gsi )
             return -ENOSPC;
     }
@@ -1743,3 +1751,82 @@ void fixup_irqs(void)
         peoi[sp].ready = 1;
     flush_ready_eoi();
 }
+
+int map_domain_emuirq_pirq(struct domain *d, int pirq, int emuirq)
+{
+    int old_emuirq = IRQ_UNBOUND, old_pirq = IRQ_UNBOUND;
+
+    ASSERT(spin_is_locked(&d->event_lock));
+
+    if ( !is_hvm_domain(d) )
+        return -EINVAL;
+
+    if ( pirq < 0 || pirq >= d->nr_pirqs ||
+            emuirq == IRQ_UNBOUND || emuirq >= (int) nr_irqs )
+    {
+        dprintk(XENLOG_G_ERR, "dom%d: invalid pirq %d or emuirq %d\n",
+                d->domain_id, pirq, emuirq);
+        return -EINVAL;
+    }
+
+    old_emuirq = domain_pirq_to_emuirq(d, pirq);
+    if ( emuirq != IRQ_PT )
+        old_pirq = domain_emuirq_to_pirq(d, emuirq);
+
+    if ( (old_emuirq != IRQ_UNBOUND && (old_emuirq != emuirq) ) ||
+         (old_pirq != IRQ_UNBOUND && (old_pirq != pirq)) )
+    {
+        dprintk(XENLOG_G_WARNING, "dom%d: pirq %d or emuirq %d already mapped\n",
+                d->domain_id, pirq, emuirq);
+        return 0;
+    }
+
+    d->arch.pirq_emuirq[pirq] = emuirq;
+    /* do not store emuirq mappings for pt devices */
+    if ( emuirq != IRQ_PT )
+        d->arch.emuirq_pirq[emuirq] = pirq;
+
+    return 0;
+}
+
+int unmap_domain_pirq_emuirq(struct domain *d, int pirq)
+{
+    int emuirq, ret = 0;
+
+    if ( !is_hvm_domain(d) )
+        return -EINVAL;
+
+    if ( (pirq < 0) || (pirq >= d->nr_pirqs) )
+        return -EINVAL;
+
+    ASSERT(spin_is_locked(&d->event_lock));
+
+    emuirq = domain_pirq_to_emuirq(d, pirq);
+    if ( emuirq == IRQ_UNBOUND )
+    {
+        dprintk(XENLOG_G_ERR, "dom%d: pirq %d not mapped\n",
+                d->domain_id, pirq);
+        ret = -EINVAL;
+        goto done;
+    }
+
+    d->arch.pirq_emuirq[pirq] = IRQ_UNBOUND;
+    d->arch.emuirq_pirq[emuirq] = IRQ_UNBOUND;
+
+ done:
+    return ret;
+}
+
+int hvm_domain_use_pirq(struct domain *d, int pirq)
+{
+    int emuirq;
+
+    if ( !is_hvm_domain(d) )
+        return 0;
+
+    emuirq = domain_pirq_to_emuirq(d, pirq);
+    if ( emuirq != IRQ_UNBOUND && d->pirq_to_evtchn[pirq] != 0 )
+        return 1;
+    else
+        return 0;
+}
diff -r 5218db847b58 xen/arch/x86/physdev.c
--- a/xen/arch/x86/physdev.c Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/arch/x86/physdev.c Fri Aug 20 16:47:36 2010 +0100
@@ -45,6 +45,60 @@ static int physdev_map_pirq(struct physd
     if ( d == NULL )
         return -ESRCH;

+    if ( map->domid == DOMID_SELF && is_hvm_domain(d) )
+    {
+        spin_lock(&d->event_lock);
+        switch ( map->type )
+        {
+        case MAP_PIRQ_TYPE_GSI :
+        {
+            struct hvm_irq_dpci *hvm_irq_dpci;
+            struct hvm_girq_dpci_mapping *girq;
+            uint32_t machine_gsi = 0;
+
+            /* find the machine gsi corresponding to the
+             * emulated gsi */
+            hvm_irq_dpci = domain_get_irq_dpci(d);
+            if (hvm_irq_dpci) {
+                list_for_each_entry ( girq,
+                                      &hvm_irq_dpci->girq[map->index],
+                                      list ) {
+                    machine_gsi = girq->machine_gsi;
+                }
+            }
+            /* found one, this mean we are dealing with a pt device */
+            if ( machine_gsi )
+            {
+                map->index = domain_pirq_to_irq(d, machine_gsi);
+                pirq = machine_gsi;
+                if ( pirq > 0 )
+                    ret = 0;
+                else
+                    ret = pirq;
+            }
+            /* we didn't find any, this means we are dealing
+             * with an emulated device */
+            else
+            {
+                pirq = map->pirq;
+                if ( pirq < 0 )
+                {
+                    pirq = get_free_pirq(d, map->type, map->index);
+                }
+                ret = map_domain_emuirq_pirq(d, pirq, map->index);
+            }
+            map->pirq = pirq;
+        }
+        break;
+        default :
+            ret = -EINVAL;
+            dprintk(XENLOG_G_WARNING, "map type %d not supported yet\n", map->type);
+            break;
+        }
+        spin_unlock(&d->event_lock);
+        return ret;
+    }
+
     if ( !IS_PRIV_FOR(current->domain, d) )
     {
         ret = -EPERM;
@@ -150,6 +204,9 @@ static int physdev_map_pirq(struct physd
     if ( ret == 0 )
         map->pirq = pirq;

+    if ( !ret && is_hvm_domain(d) )
+        map_domain_emuirq_pirq(d, pirq, IRQ_PT);
+
  done:
     spin_unlock(&d->event_lock);
     spin_unlock(&pcidevs_lock);
@@ -173,6 +230,14 @@ static int physdev_unmap_pirq(struct phy
     if ( d == NULL )
         return -ESRCH;

+    if ( is_hvm_domain(d) )
+    {
+        spin_lock(&d->event_lock);
+        ret = unmap_domain_pirq_emuirq(d, unmap->pirq);
+        spin_unlock(&d->event_lock);
+        goto free_domain;
+    }
+
     ret = -EPERM;
     if ( !IS_PRIV_FOR(current->domain, d) )
         goto free_domain;
@@ -206,7 +271,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
             break;
         if ( v->domain->arch.pirq_eoi_map )
             evtchn_unmask(v->domain->pirq_to_evtchn[eoi.irq]);
-        ret = pirq_guest_eoi(v->domain, eoi.irq);
+        if ( !is_hvm_domain(v->domain) ||
+             domain_pirq_to_emuirq(v->domain, eoi.irq) == IRQ_PT )
+            ret = pirq_guest_eoi(v->domain, eoi.irq);
+        else
+            ret = 0;
         break;
     }

@@ -261,6 +330,13 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
         if ( (irq < 0) || (irq >= v->domain->nr_pirqs) )
             break;
         irq_status_query.flags = 0;
+        if ( is_hvm_domain(v->domain) &&
+             domain_pirq_to_emuirq(v->domain, irq) != IRQ_PT )
+        {
+            ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
+            break;
+        }
+
         /*
          * Even edge-triggered or message-based IRQs can need masking from
          * time to time. If teh guest is not dynamically checking for this
diff -r 5218db847b58 xen/common/event_channel.c
--- a/xen/common/event_channel.c Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/common/event_channel.c Fri Aug 20 16:47:36 2010 +0100
@@ -331,7 +331,7 @@ static long evtchn_bind_pirq(evtchn_bind
     if ( (pirq < 0) || (pirq >= d->nr_pirqs) )
         return -EINVAL;

-    if ( !irq_access_permitted(d, pirq) )
+    if ( !is_hvm_domain(d) && !irq_access_permitted(d, pirq) )
         return -EPERM;

     spin_lock(&d->event_lock);
@@ -345,12 +345,15 @@ static long evtchn_bind_pirq(evtchn_bind
     chn = evtchn_from_port(d, port);

     d->pirq_to_evtchn[pirq] = port;
-    rc = pirq_guest_bind(v, pirq,
-                         !!(bind->flags & BIND_PIRQ__WILL_SHARE));
-    if ( rc != 0 )
+    if ( !is_hvm_domain(d) )
     {
-        d->pirq_to_evtchn[pirq] = 0;
-        goto out;
+        rc = pirq_guest_bind(v, pirq,
+                             !!(bind->flags & BIND_PIRQ__WILL_SHARE));
+        if ( rc != 0 )
+        {
+            d->pirq_to_evtchn[pirq] = 0;
+            goto out;
+        }
     }

     chn->state = ECS_PIRQ;
@@ -403,7 +406,8 @@ static long __evtchn_close(struct domain
         break;

     case ECS_PIRQ:
-        pirq_guest_unbind(d1, chn1->u.pirq.irq);
+        if ( !is_hvm_domain(d1) )
+            pirq_guest_unbind(d1, chn1->u.pirq.irq);
         d1->pirq_to_evtchn[chn1->u.pirq.irq] = 0;
         unlink_pirq_port(chn1, d1->vcpu[chn1->notify_vcpu_id]);
         break;
@@ -664,8 +668,17 @@ int send_guest_pirq(struct domain *d, in
     /*
      * It should not be possible to race with __evtchn_close():
      * The caller of this function must synchronise with pirq_guest_unbind().
+     *
+     * In the HVM case port is 0 when the guest disable the
+     * emulated interrupt\evtchn.
      */
-    ASSERT(port != 0);
+    if (!port)
+    {
+        if ( is_hvm_domain(d) )
+            return 0;
+        else
+            return -EINVAL;
+    }

     chn = evtchn_from_port(d, port);
     return evtchn_set_pending(d->vcpu[chn->notify_vcpu_id], port);
diff -r 5218db847b58 xen/common/kernel.c
--- a/xen/common/kernel.c Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/common/kernel.c Fri Aug 20 16:47:36 2010 +0100
@@ -261,7 +261,8 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDL
                              (1U << XENFEAT_gnttab_map_avail_bits);
             else
                 fi.submap |= (1U << XENFEAT_hvm_safe_pvclock) |
-                             (1U << XENFEAT_hvm_callback_vector);
+                             (1U << XENFEAT_hvm_callback_vector) |
+                             (1U << XENFEAT_hvm_pirqs);
 #endif
             break;
         default:
diff -r 5218db847b58 xen/drivers/passthrough/io.c
--- a/xen/drivers/passthrough/io.c Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/drivers/passthrough/io.c Fri Aug 20 16:47:36 2010 +0100
@@ -373,6 +373,7 @@ int pt_irq_destroy_bind_vtd(
             hvm_irq_dpci->mirq[machine_gsi].dom = NULL;
             hvm_irq_dpci->mirq[machine_gsi].flags = 0;
             clear_bit(machine_gsi, hvm_irq_dpci->mapping);
+            unmap_domain_pirq_emuirq(d, machine_gsi);
         }
     }
     spin_unlock(&d->event_lock);
@@ -452,7 +453,10 @@ void hvm_dpci_msi_eoi(struct domain *d,
 extern int vmsi_deliver(struct domain *d, int pirq);
 static int hvm_pci_msi_assert(struct domain *d, int pirq)
 {
-    return vmsi_deliver(d, pirq);
+    if ( hvm_domain_use_pirq(d, pirq) )
+        return send_guest_pirq(d, pirq);
+    else
+        return vmsi_deliver(d, pirq);
 }
 #endif

@@ -486,7 +490,10 @@ static void hvm_dirq_assist(unsigned lon
             {
                 device = digl->device;
                 intx = digl->intx;
-                hvm_pci_intx_assert(d, device, intx);
+                if ( hvm_domain_use_pirq(d, pirq) )
+                    send_guest_pirq(d, pirq);
+                else
+                    hvm_pci_intx_assert(d, device, intx);
                 hvm_irq_dpci->mirq[pirq].pending++;

 #ifdef SUPPORT_MSI_REMAPPING
diff -r 5218db847b58 xen/include/asm-x86/domain.h
--- a/xen/include/asm-x86/domain.h Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/include/asm-x86/domain.h Fri Aug 20 16:47:36 2010 +0100
@@ -258,6 +258,9 @@ struct arch_domain
     /* NB. protected by d->event_lock and by irq_desc[irq].lock */
     int *irq_pirq;
     int *pirq_irq;
+    /* pirq to emulated irq and vice versa */
+    int *emuirq_pirq;
+    int *pirq_emuirq;

     /* Shared page for notifying that explicit PIRQ EOI is required. */
     unsigned long *pirq_eoi_map;
diff -r 5218db847b58 xen/include/asm-x86/irq.h
--- a/xen/include/asm-x86/irq.h Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/include/asm-x86/irq.h Fri Aug 20 16:47:36 2010 +0100
@@ -113,6 +113,9 @@ int map_domain_pirq(struct domain *d, in
 int unmap_domain_pirq(struct domain *d, int pirq);
 int get_free_pirq(struct domain *d, int type, int index);
 void free_domain_pirqs(struct domain *d);
+int map_domain_emuirq_pirq(struct domain *d, int pirq, int irq);
+int unmap_domain_pirq_emuirq(struct domain *d, int pirq);
+int hvm_domain_use_pirq(struct domain *d, int irq);

 int init_irq_data(void);

@@ -146,5 +149,9 @@ void irq_set_affinity(struct irq_desc *,

 #define domain_pirq_to_irq(d, pirq) ((d)->arch.pirq_irq[pirq])
 #define domain_irq_to_pirq(d, irq) ((d)->arch.irq_pirq[irq])
+#define domain_pirq_to_emuirq(d, pirq) ((d)->arch.pirq_emuirq[pirq])
+#define domain_emuirq_to_pirq(d, emuirq) ((d)->arch.emuirq_pirq[emuirq])
+#define IRQ_UNBOUND -1
+#define IRQ_PT -2

 #endif /* _ASM_HW_IRQ_H */
diff -r 5218db847b58 xen/include/public/features.h
--- a/xen/include/public/features.h Tue Aug 17 19:32:37 2010 +0100
+++ b/xen/include/public/features.h Fri Aug 20 16:47:36 2010 +0100
@@ -74,6 +74,9 @@
 /* x86: pvclock algorithm is safe to use on HVM */
 #define XENFEAT_hvm_safe_pvclock 9

+/* x86: pirq can be used by HVM guests */
+#define XENFEAT_hvm_pirqs 10
+
 #define XENFEAT_NR_SUBMAPS 1

 #endif /* __XEN_PUBLIC_FEATURES_H__ */
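To make the bookkeeping in this patch easier to follow, here is a small standalone model of the two tables (pirq to emulated irq and back). It is only a sketch: the struct, the function names and the fixed array sizes are made up for illustration, the "already mapped" check of map_domain_emuirq_pirq() is left out, and skipping the reverse table when unbinding a passthrough entry is a choice made in this sketch rather than something taken from the patch.

```c
#include <assert.h>

#define NR_PIRQS    32    /* hypothetical table sizes, for the model only */
#define NR_IRQS     32
#define IRQ_UNBOUND (-1)  /* pirq not remapped */
#define IRQ_PT      (-2)  /* pirq belongs to a passthrough device */

struct irq_map {
    int pirq_emuirq[NR_PIRQS];  /* pirq -> emulated irq, or IRQ_PT */
    int emuirq_pirq[NR_IRQS];   /* emulated irq -> pirq */
};

/* Mark every entry of both tables as unbound. */
static void irq_map_init(struct irq_map *m)
{
    int i;

    for ( i = 0; i < NR_PIRQS; i++ )
        m->pirq_emuirq[i] = IRQ_UNBOUND;
    for ( i = 0; i < NR_IRQS; i++ )
        m->emuirq_pirq[i] = IRQ_UNBOUND;
}

/* Bind pirq to emuirq; passthrough entries get no reverse mapping. */
static int irq_map_bind(struct irq_map *m, int pirq, int emuirq)
{
    if ( pirq < 0 || pirq >= NR_PIRQS || emuirq >= NR_IRQS ||
         (emuirq < 0 && emuirq != IRQ_PT) )
        return -1;

    m->pirq_emuirq[pirq] = emuirq;
    if ( emuirq != IRQ_PT )
        m->emuirq_pirq[emuirq] = pirq;
    return 0;
}

/* Drop the binding of pirq; fails if pirq is not bound. */
static int irq_map_unbind(struct irq_map *m, int pirq)
{
    int emuirq;

    if ( pirq < 0 || pirq >= NR_PIRQS )
        return -1;

    emuirq = m->pirq_emuirq[pirq];
    if ( emuirq == IRQ_UNBOUND )
        return -1;

    m->pirq_emuirq[pirq] = IRQ_UNBOUND;
    if ( emuirq != IRQ_PT )
        m->emuirq_pirq[emuirq] = IRQ_UNBOUND;
    return 0;
}
```

With this model, an interrupt raised on an emulated irq is delivered as an event only when emuirq_pirq[] holds a pirq for it, which is the lookup assert_irq() performs in the patch.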
stefano.stabellini@eu.citrix.com
2010-Aug-30 11:25 UTC
[Xen-devel] [PATCH 2 of 4] xen: introduce PHYSDEVOP_get_nr_pirqs
Introduce a new physdevop called PHYSDEVOP_get_nr_pirqs that allows PV
and PV on HVM guests to get the number of pirqs available.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff -r f2dc396b92aa xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c Wed Aug 18 14:14:23 2010 +0100
+++ b/xen/arch/x86/hvm/hvm.c Fri Aug 20 14:49:52 2010 +0100
@@ -2290,6 +2290,7 @@ static long hvm_physdev_op(int cmd, XEN_
         case PHYSDEVOP_unmap_pirq:
         case PHYSDEVOP_eoi:
         case PHYSDEVOP_irq_status_query:
+        case PHYSDEVOP_get_nr_pirqs:
             return do_physdev_op(cmd, arg);
         default:
             return -ENOSYS;
@@ -2393,6 +2394,8 @@ static long hvm_physdev_op_compat32(
         case PHYSDEVOP_eoi:
         case PHYSDEVOP_irq_status_query:
             return compat_physdev_op(cmd, arg);
+        case PHYSDEVOP_get_nr_pirqs:
+            return do_physdev_op(cmd, arg);
         break;
         default:
             return -ENOSYS;
diff -r f2dc396b92aa xen/arch/x86/physdev.c
--- a/xen/arch/x86/physdev.c Wed Aug 18 14:14:23 2010 +0100
+++ b/xen/arch/x86/physdev.c Fri Aug 20 14:49:52 2010 +0100
@@ -572,6 +572,12 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
                                 setup_gsi.polarity);
         break;
     }
+    case PHYSDEVOP_get_nr_pirqs: {
+        struct physdev_nr_pirqs out;
+        out.nr_pirqs = v->domain->nr_pirqs;
+        ret = copy_to_guest(arg, &out, 1) ? -EFAULT : 0;
+        break;
+    }
     default:
         ret = -ENOSYS;
         break;
diff -r f2dc396b92aa xen/include/public/physdev.h
--- a/xen/include/public/physdev.h Wed Aug 18 14:14:23 2010 +0100
+++ b/xen/include/public/physdev.h Fri Aug 20 14:49:52 2010 +0100
@@ -240,6 +240,15 @@ struct physdev_setup_gsi {
 typedef struct physdev_setup_gsi physdev_setup_gsi_t;
 DEFINE_XEN_GUEST_HANDLE(physdev_setup_gsi_t);

+#define PHYSDEVOP_get_nr_pirqs 22
+struct physdev_nr_pirqs {
+    /* OUT */
+    uint32_t nr_pirqs;
+};
+
+typedef struct physdev_nr_pirqs physdev_nr_pirqs_t;
+DEFINE_XEN_GUEST_HANDLE(physdev_nr_pirqs_t);
+
 /*
  * Notify that some PIRQ-bound event channels have been unmasked.
  * ** This command is obsolete since interface version 0x00030202 and is **
stefano.stabellini@eu.citrix.com
2010-Aug-30 11:25 UTC
[Xen-devel] [PATCH 3 of 4] qemu-xen: support PV on HVM MSI remapping
If the guest enables an MSI passing 0 as vector number, then read the
address and use it as pirq number for the following mapping request to
Xen.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff --git a/hw/pt-msi.c b/hw/pt-msi.c
index b59b4fa..f0fb3e3 100644
--- a/hw/pt-msi.c
+++ b/hw/pt-msi.c
@@ -65,6 +65,7 @@ static void msix_set_enable(struct pt_dev *dev, int en)
 int pt_msi_setup(struct pt_dev *dev)
 {
     int pirq = -1;
+    uint8_t gvec = 0;

     if ( !(dev->msi->flags & MSI_FLAG_UNINIT) )
     {
@@ -72,6 +73,15 @@ int pt_msi_setup(struct pt_dev *dev)
         return -1;
     }

+    gvec = dev->msi->data & 0xFF;
+    if (!gvec) {
+        /* if gvec is 0, the guest is asking for a particular pirq that
+         * is passed as dest_id */
+        pirq = (dev->msi->addr_hi & 0xffffff00) |
+               ((dev->msi->addr_lo >> MSI_TARGET_CPU_SHIFT) & 0xff);
+        PT_LOG("pt_msi_setup requested pirq = %d\n", pirq);
+    }
+
     if ( xc_physdev_map_pirq_msi(xc_handle, domid, AUTO_ASSIGN, &pirq,
                                  PCI_DEVFN(dev->pci_dev->dev, dev->pci_dev->func),
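The encoding this patch decodes can be illustrated with a standalone round trip. This is a sketch under assumptions, not qemu-xen or guest code: struct msi_msg and both function names are hypothetical, MSI_TARGET_CPU_SHIFT is assumed to be 12 (the destination ID field of the x86 MSI address), and the guest-side encoder is only a guess at what a guest would write so that the decode expression from the patch recovers the pirq.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the destination ID occupies bits 12..19 of the low address. */
#define MSI_TARGET_CPU_SHIFT 12

/* Hypothetical container for the MSI registers a guest programs. */
struct msi_msg {
    uint32_t addr_lo;
    uint32_t addr_hi;
    uint32_t data;
};

/* Guest side (hypothetical): request pirq by writing vector 0. */
static struct msi_msg encode_pirq_request(uint32_t pirq)
{
    struct msi_msg m;

    m.addr_hi = pirq & 0xffffff00u;              /* upper 24 bits of the pirq */
    m.addr_lo = 0xfee00000u |                    /* x86 MSI address base */
                ((pirq & 0xffu) << MSI_TARGET_CPU_SHIFT);
    m.data = 0;                                  /* vector 0 marks a pirq request */
    return m;
}

/* Device model side: same expression as the patch; -1 if vector != 0. */
static int decode_pirq_request(const struct msi_msg *m)
{
    uint8_t gvec = m->data & 0xff;

    if ( gvec )
        return -1;
    return (int)((m->addr_hi & 0xffffff00u) |
                 ((m->addr_lo >> MSI_TARGET_CPU_SHIFT) & 0xffu));
}
```

A non-zero vector leaves pirq at -1, which in the patch means the following xc_physdev_map_pirq_msi() call lets Xen assign one.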
stefano.stabellini@eu.citrix.com
2010-Aug-30 11:25 UTC
[Xen-devel] [PATCH 4 of 4] qemu-xen: let xen choose the pirq number
When remapping an interrupt into the guest, let xen choose the pirq,
otherwise we might have to handle possible conflicts.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff --git a/hw/pass-through.c b/hw/pass-through.c
index f6ae4b2..81ae3f3 100644
--- a/hw/pass-through.c
+++ b/hw/pass-through.c
@@ -4265,7 +4265,7 @@ static struct pt_dev * register_real_device(PCIBus *e_bus,

     if ( PT_MACHINE_IRQ_AUTO == machine_irq )
     {
-        int pirq = pci_dev->irq;
+        int pirq = -1;

         machine_irq = pci_dev->irq;
         rc = xc_physdev_map_pirq(xc_handle, domid, machine_irq, &pirq);
Jan Beulich
2010-Aug-30 12:16 UTC
Re: [Xen-devel] [PATCH 1 of 4] xen: interrupt remapping in HVM guests
>>> On 30.08.10 at 13:25, <stefano.stabellini@eu.citrix.com> wrote:
> --- a/xen/arch/x86/domain.c Tue Aug 17 19:32:37 2010 +0100
> +++ b/xen/arch/x86/domain.c Fri Aug 20 16:47:36 2010 +0100
> @@ -490,6 +490,16 @@ int arch_domain_create(struct domain *d,
>              if ( !IO_APIC_IRQ(i) )
>                  d->arch.irq_pirq[i] = d->arch.pirq_irq[i] = i;
>
> +        d->arch.pirq_emuirq = xmalloc_array(int, d->nr_pirqs);
> +        d->arch.emuirq_pirq = xmalloc_array(int, nr_irqs);
> +        if ( !d->arch.pirq_emuirq || !d->arch.emuirq_pirq )
> +            goto fail;
> +        memset(d->arch.pirq_emuirq, IRQ_UNBOUND,
> +               d->nr_pirqs * sizeof(*d->arch.pirq_emuirq));
> +        memset(d->arch.emuirq_pirq, IRQ_UNBOUND,
> +               d->nr_pirqs * sizeof(*d->arch.emuirq_pirq));
> +
> +
>          if ( (rc = iommu_domain_init(d)) != 0 )
>              goto fail;
>

Shouldn't this be done for HVM domains only, and should you free
these arrays both in the error path of that function and in e.g.
arch_domain_destroy()?

Additionally, shouldn't you add a build time check making sure
that IRQ_UNBOUND is actually suitable for initialization via
memset() (or alternatively use a loop)?

Jan
Stefano Stabellini
2010-Aug-30 12:40 UTC
Re: [Xen-devel] [PATCH 1 of 4] xen: interrupt remapping in HVM guests
On Mon, 30 Aug 2010, Jan Beulich wrote:
> >>> On 30.08.10 at 13:25, <stefano.stabellini@eu.citrix.com> wrote:
> > --- a/xen/arch/x86/domain.c Tue Aug 17 19:32:37 2010 +0100
> > +++ b/xen/arch/x86/domain.c Fri Aug 20 16:47:36 2010 +0100
> > @@ -490,6 +490,16 @@ int arch_domain_create(struct domain *d,
> >              if ( !IO_APIC_IRQ(i) )
> >                  d->arch.irq_pirq[i] = d->arch.pirq_irq[i] = i;
> >
> > +        d->arch.pirq_emuirq = xmalloc_array(int, d->nr_pirqs);
> > +        d->arch.emuirq_pirq = xmalloc_array(int, nr_irqs);
> > +        if ( !d->arch.pirq_emuirq || !d->arch.emuirq_pirq )
> > +            goto fail;
> > +        memset(d->arch.pirq_emuirq, IRQ_UNBOUND,
> > +               d->nr_pirqs * sizeof(*d->arch.pirq_emuirq));
> > +        memset(d->arch.emuirq_pirq, IRQ_UNBOUND,
> > +               d->nr_pirqs * sizeof(*d->arch.emuirq_pirq));
> > +
> > +
> >          if ( (rc = iommu_domain_init(d)) != 0 )
> >              goto fail;
> >
>
> Shouldn't this be done for HVM domains only, and should you free
> these arrays both in the error path of that function and in e.g.
> arch_domain_destroy()?

Yes, you are right about that, I'll fix and resend.

>
> Additionally, shouldn't you add a build time check making sure
> that IRQ_UNBOUND is actually suitable for initialization via
> memset() (or alternatively use a loop)?
>

something like typeof?
Jan Beulich
2010-Aug-30 12:59 UTC
Re: [Xen-devel] [PATCH 1 of 4] xen: interrupt remapping in HVM guests
>>> On 30.08.10 at 14:40, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> On Mon, 30 Aug 2010, Jan Beulich wrote:
>> Additionally, shouldn't you add a build time check making sure
>> that IRQ_UNBOUND is actually suitable for initialization via
>> memset() (or alternatively use a loop)?
>>
>
> something like typeof?

No, something that makes sure all bytes of the int are identical (since
of the int you pass to memset() only the low 8 bits actually get used).
For the moment, checking the value is either 0 or -1 would probably
do.

Jan
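The point about memset() can be checked with a few lines of standalone C. This is an illustrative sketch, not code from the patch: the helper names are made up, and C11 _Static_assert stands in for whatever build-time check mechanism the hypervisor tree would actually use. It assumes a two's complement int.

```c
#include <assert.h>
#include <string.h>

#define IRQ_UNBOUND (-1)
#define IRQ_PT      (-2)

/* Build-time check along the lines suggested above: only 0 and -1 are
 * accepted as memset() fill values for an int array. */
_Static_assert(IRQ_UNBOUND == 0 || IRQ_UNBOUND == -1,
               "IRQ_UNBOUND cannot be used as a memset() fill value");

/* Return 1 when every byte of v's representation is identical, i.e.
 * when memset(p, v, n) really fills an int array with the value v. */
static int memset_safe(int v)
{
    unsigned char bytes[sizeof(int)];
    size_t i;

    memcpy(bytes, &v, sizeof(v));
    for ( i = 1; i < sizeof(v); i++ )
        if ( bytes[i] != bytes[0] )
            return 0;
    return 1;
}

/* Fill a small int array with memset() and return its first element. */
static int first_after_memset(int v)
{
    int arr[4];

    memset(arr, v, sizeof(arr));
    return arr[0];
}
```

IRQ_UNBOUND (-1) passes because all of its bytes are 0xff; a value such as IRQ_PT (-2) would leave each int holding four copies of 0xfe instead of -2.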
Stefano Stabellini
2010-Aug-30 13:28 UTC
Re: [Xen-devel] [PATCH 1 of 4] xen: interrupt remapping in HVM guests
On Mon, 30 Aug 2010, Jan Beulich wrote:
> >>> On 30.08.10 at 14:40, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> > On Mon, 30 Aug 2010, Jan Beulich wrote:
> >> Additionally, shouldn't you add a build time check making sure
> >> that IRQ_UNBOUND is actually suitable for initialization via
> >> memset() (or alternatively use a loop)?
> >>
> >
> > something like typeof?
>
> No, something that makes sure all bytes of the int are identical (since
> of the int you pass to memset() only the low 8 bits actually get used).
> For the moment, checking the value is either 0 or -1 would probably
> do.

I understand now, but I think I prefer to use a loop instead :)
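The loop alternative could look roughly like the sketch below. The helper names are hypothetical and this is not the code that was eventually resent; it only shows that a loop works for any fill value and takes each array's own element count (the patch allocates pirq_emuirq with d->nr_pirqs entries and emuirq_pirq with nr_irqs entries).

```c
#include <assert.h>
#include <stddef.h>

#define IRQ_UNBOUND (-1)
#define IRQ_PT      (-2)

/* Set all nr elements of arr to value; unlike memset() this works for
 * any int, not only values whose bytes are all identical. */
static void fill_irq_array(int *arr, size_t nr, int value)
{
    size_t i;

    for ( i = 0; i < nr; i++ )
        arr[i] = value;
}

/* Return 1 when all nr elements of arr equal value. */
static int all_equal(const int *arr, size_t nr, int value)
{
    size_t i;

    for ( i = 0; i < nr; i++ )
        if ( arr[i] != value )
            return 0;
    return 1;
}
```

Each table would then be initialized with its own length, e.g. fill_irq_array(pirq_emuirq, nr_pirqs, IRQ_UNBOUND) and fill_irq_array(emuirq_pirq, nr_irqs, IRQ_UNBOUND).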
Keir Fraser
2010-Aug-31 11:19 UTC
Re: [Xen-devel] [PATCH 2 of 4] xen: introduce PHYSDEVOP_get_nr_pirqs
Useful why? What if we'd like nr_pirqs to be per-domain dynamic in future?

 -- Keir

On 30/08/2010 12:25, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com> wrote:

> Introduce a new physdevop called PHYSDEVOP_get_nr_pirqs that allows PV
> and PV on HVM guests to get the number of pirqs available.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>
> diff -r f2dc396b92aa xen/arch/x86/hvm/hvm.c
> --- a/xen/arch/x86/hvm/hvm.c Wed Aug 18 14:14:23 2010 +0100
> +++ b/xen/arch/x86/hvm/hvm.c Fri Aug 20 14:49:52 2010 +0100
> @@ -2290,6 +2290,7 @@ static long hvm_physdev_op(int cmd, XEN_
>          case PHYSDEVOP_unmap_pirq:
>          case PHYSDEVOP_eoi:
>          case PHYSDEVOP_irq_status_query:
> +        case PHYSDEVOP_get_nr_pirqs:
>              return do_physdev_op(cmd, arg);
>          default:
>              return -ENOSYS;
> @@ -2393,6 +2394,8 @@ static long hvm_physdev_op_compat32(
>          case PHYSDEVOP_eoi:
>          case PHYSDEVOP_irq_status_query:
>              return compat_physdev_op(cmd, arg);
> +        case PHYSDEVOP_get_nr_pirqs:
> +            return do_physdev_op(cmd, arg);
>          break;
>          default:
>              return -ENOSYS;
> diff -r f2dc396b92aa xen/arch/x86/physdev.c
> --- a/xen/arch/x86/physdev.c Wed Aug 18 14:14:23 2010 +0100
> +++ b/xen/arch/x86/physdev.c Fri Aug 20 14:49:52 2010 +0100
> @@ -572,6 +572,12 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>                                  setup_gsi.polarity);
>          break;
>      }
> +    case PHYSDEVOP_get_nr_pirqs: {
> +        struct physdev_nr_pirqs out;
> +        out.nr_pirqs = v->domain->nr_pirqs;
> +        ret = copy_to_guest(arg, &out, 1) ? -EFAULT : 0;
> +        break;
> +    }
>      default:
>          ret = -ENOSYS;
>          break;
> diff -r f2dc396b92aa xen/include/public/physdev.h
> --- a/xen/include/public/physdev.h Wed Aug 18 14:14:23 2010 +0100
> +++ b/xen/include/public/physdev.h Fri Aug 20 14:49:52 2010 +0100
> @@ -240,6 +240,15 @@ struct physdev_setup_gsi {
>  typedef struct physdev_setup_gsi physdev_setup_gsi_t;
>  DEFINE_XEN_GUEST_HANDLE(physdev_setup_gsi_t);
>
> +#define PHYSDEVOP_get_nr_pirqs 22
> +struct physdev_nr_pirqs {
> +    /* OUT */
> +    uint32_t nr_pirqs;
> +};
> +
> +typedef struct physdev_nr_pirqs physdev_nr_pirqs_t;
> +DEFINE_XEN_GUEST_HANDLE(physdev_nr_pirqs_t);
> +
>  /*
>   * Notify that some PIRQ-bound event channels have been unmasked.
>   * ** This command is obsolete since interface version 0x00030202 and is **
Stefano Stabellini
2010-Aug-31 12:24 UTC
Re: [Xen-devel] [PATCH 2 of 4] xen: introduce PHYSDEVOP_get_nr_pirqs
On Tue, 31 Aug 2010, Keir Fraser wrote:
> Useful why? What if we'd like nr_pirqs to be per-domain dynamic in future?
>

That wouldn't be a good idea considering that the current
PHYSDEVOP_map_pirq interface allows the guest to choose the pirq
number (unless the guest explicitly passes map.pirq == -1, but only PV
on HVM guests and qemu do it, after my series).
Keir Fraser
2010-Aug-31 12:38 UTC
Re: [Xen-devel] [PATCH 2 of 4] xen: introduce PHYSDEVOP_get_nr_pirqs
On 31/08/2010 13:24, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com> wrote:

> On Tue, 31 Aug 2010, Keir Fraser wrote:
>> Useful why? What if we'd like nr_pirqs to be per-domain dynamic in future?
>>
>
> That wouldn't be a good idea considering that the current
> PHYSDEVOP_map_pirq interface allows the guest to choose the pirq
> number

So what significance does nr_pirqs have to the guest?

> (unless the guest explicitly passes map.pirq == -1, but only PV
> on HVM guests and qemu do it, after my series).
>
Stefano Stabellini
2010-Aug-31 12:46 UTC
Re: [Xen-devel] [PATCH 2 of 4] xen: introduce PHYSDEVOP_get_nr_pirqs
On Tue, 31 Aug 2010, Keir Fraser wrote:
> On 31/08/2010 13:24, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com>
> wrote:
>
> > On Tue, 31 Aug 2010, Keir Fraser wrote:
> >> Useful why? What if we'd like nr_pirqs to be per-domain dynamic in future?
> >>
> >
> > That wouldn't be a good idea considering that the current
> > PHYSDEVOP_map_pirq interface allows the guest to choose the pirq
> > number
>
> So what significance does nr_pirqs have to the guest?

the upper limit of the pirq number it might choose
Keir Fraser
2010-Aug-31 12:46 UTC
Re: [Xen-devel] [PATCH 2 of 4] xen: introduce PHYSDEVOP_get_nr_pirqs
On 31/08/2010 13:46, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com> wrote:

>>> That wouldn't be a good idea considering that the current
>>> PHYSDEVOP_map_pirq interface allows the guest to choose the pirq
>>> number
>>
>> So what significance does nr_pirqs have to the guest?
>
> the upper limit of the pirq number it might choose

Couldn't it, like, pick the smallest available? :-)
Stefano Stabellini
2010-Aug-31 12:49 UTC
Re: [Xen-devel] [PATCH 2 of 4] xen: introduce PHYSDEVOP_get_nr_pirqs
On Tue, 31 Aug 2010, Keir Fraser wrote:
> On 31/08/2010 13:46, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com>
> wrote:
>
> >>> That wouldn't be a good idea considering that the current
> >>> PHYSDEVOP_map_pirq interface allows the guest to choose the pirq
> >>> number
> >>
> >> So what significance does nr_pirqs have to the guest?
> >
> > the upper limit of the pirq number it might choose
>
> Couldn't it, like, pick the smallest available? :-)
>

Well, it might still be useful to know the upper limit, besides linux
tends to choose the pirq == irq and the irqs for MSIs are high.
Xen does the same thing by the way.
Keir Fraser
2010-Aug-31 12:55 UTC
Re: [Xen-devel] [PATCH 2 of 4] xen: introduce PHYSDEVOP_get_nr_pirqs
On 31/08/2010 13:49, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com> wrote:

>> Couldn't it, like, pick the smallest available? :-)
>>
>
> Well, it might still be useful to know the upper limit, besides linux
> tends to choose the pirq == irq and the irqs for MSIs are high.
> Xen does the same thing by the way.

Well I'm just being fussy because it's yet another irq related interface and
we seem to have so many already. Might-be-useful is different from
must-have-now, and if you are allocating pirq==irq then that allocation
strategy is not influenced by knowing nr_pirqs is it?

 -- Keir
Stefano Stabellini
2010-Aug-31 13:05 UTC
Re: [Xen-devel] [PATCH 2 of 4] xen: introduce PHYSDEVOP_get_nr_pirqs
On Tue, 31 Aug 2010, Keir Fraser wrote:
> On 31/08/2010 13:49, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com>
> wrote:
>
> >> Couldn't it, like, pick the smallest available? :-)
> >>
> >
> > Well, it might still be useful to know the upper limit, besides linux
> > tends to choose the pirq == irq and the irqs for MSIs are high.
> > Xen does the same thing by the way.
>
> Well I'm just being fussy because it's yet another irq related interface and
> we seem to have so many already. Might-be-useful is different from
> must-have-now, and if you are allocating pirq==irq then that allocation
> strategy is not influenced by knowing nr_pirqs is it?

Knowing the pirq number upper limit is important for PV on HVM guests to
be able to remap MSIs into pirqs while minimizing the chances of
conflicts.
If we had another way of knowing the maximum pirq number from within the
guest I would gladly use it.
It is also useful for dom0, which up to now just assumed that the pirq
number is always identical to the irq number (which for MSIs might
actually be higher than nr_pirqs).

In other words: if the guest is allowed to choose the pirq number it
must be able to know what the range is.
Could we always initialize nr_pirqs to the same value in xen?
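As a rough illustration of the argument, a guest that prefers pirq == irq could use the limit like this. The function is hypothetical (not taken from Linux or from the series); nr_pirqs stands for the value PHYSDEVOP_get_nr_pirqs would return, and -1 is the "let Xen choose" value for map.pirq mentioned earlier in the thread.

```c
#include <assert.h>

/* Hypothetical guest-side policy: keep pirq == irq when irq fits in the
 * range [0, nr_pirqs), otherwise return -1 so that Xen picks a free pirq. */
static int choose_pirq(int irq, int nr_pirqs)
{
    if ( irq >= 0 && irq < nr_pirqs )
        return irq;
    return -1;
}
```

Without the limit, a guest asking for a high MSI irq number as its pirq would only find out from the -EINVAL returned by the map request.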