With the earlier patches now committed, this series is much shorter. I have split out as many changes as I think I can unpick from the main functional change. These now live in patches 1 and 3. Some of the changes pulled forwards into patch 1 make a surprising improvement to the clarity of patch 2. Patch 2 is the main functional change, which completely reimplements how HPET interrupts work, how ownership works, and how waking works. It is RFC, and is currently tested as "Doesn't blow up when used". This includes more modern systems as well, using the "no-arat" command line parameter to cause Xen to use this codepath. Patches 4 and 5 are debugging, and possibly for inclusion if deemed useful. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> CC: Tim Deegan <tim@xen.org> -- 1.7.10.4
This is a set of changes which might not make sense alone, but are used/required for the main purpose of the series, and simplify the mammoth patch, making it easier to review. * Make hpet_msi_write conditionally disable MSIs while updating the route register. This removes the requirement that it must be masked while writing. * Defer read of cfg register in hpet_setup_msi_irq. As a result, an intremap failure prevents us suffering a pointless MMIO read. * Change some instances of per_cpu($foo, cpu) to this_cpu($foo). It is cleaner to read, and makes it more obvious when the code is poking around in another cpu's data. * Convert hpet_next_event() to taking a struct hpet_event_channel *, and rename to __hpet_set_counter() for a more accurate description of its actions. * Insert some assertions about expected interrupt state. The following patch is rather stricter about its locking. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> CC: Tim Deegan <tim@xen.org> --- xen/arch/x86/hpet.c | 38 ++++++++++++++++++++++++++------------ 1 file changed, 26 insertions(+), 12 deletions(-) diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index 3a4f7e8..14e49e5 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -94,7 +94,7 @@ static inline unsigned long ns2ticks(unsigned long nsec, int shift, return (unsigned long) tmp; } -static int hpet_next_event(unsigned long delta, int timer) +static int __hpet_set_counter(struct hpet_event_channel *ch, unsigned long delta) { uint32_t cnt, cmp; unsigned long flags; @@ -102,7 +102,7 @@ static int hpet_next_event(unsigned long delta, int timer) local_irq_save(flags); cnt = hpet_read32(HPET_COUNTER); cmp = cnt + delta; - hpet_write32(cmp, HPET_Tn_CMP(timer)); + hpet_write32(cmp, HPET_Tn_CMP(ch->idx)); cmp = hpet_read32(HPET_COUNTER); local_irq_restore(flags); @@ -143,11 +143,11 @@ static int reprogram_hpet_evt_channel( delta = max_t(int64_t, delta, 
MIN_DELTA_NS); delta = ns2ticks(delta, ch->shift, ch->mult); - ret = hpet_next_event(delta, ch->idx); + ret = __hpet_set_counter(ch, delta); while ( ret && force ) { delta += delta; - ret = hpet_next_event(delta, ch->idx); + ret = __hpet_set_counter(ch, delta); } return ret; @@ -254,8 +254,10 @@ static void hpet_msi_mask(struct irq_desc *desc) ch->msi.msi_attrib.masked = 1; } -static int hpet_msi_write(struct hpet_event_channel *ch, struct msi_msg *msg) +static int __hpet_msi_write(struct hpet_event_channel *ch, struct msi_msg *msg) { + u32 cfg; + ch->msi.msg = *msg; if ( iommu_intremap ) @@ -266,9 +268,16 @@ static int hpet_msi_write(struct hpet_event_channel *ch, struct msi_msg *msg) return rc; } + cfg = hpet_read32(HPET_Tn_CFG(ch->idx)); + if ( cfg & HPET_TN_ENABLE ) + hpet_write32(cfg & ~HPET_TN_ENABLE, HPET_Tn_CFG(ch->idx)); + hpet_write32(msg->data, HPET_Tn_ROUTE(ch->idx)); hpet_write32(msg->address_lo, HPET_Tn_ROUTE(ch->idx) + 4); + if ( cfg & HPET_TN_ENABLE ) + hpet_write32(cfg, HPET_Tn_CFG(ch->idx)); + return 0; } @@ -311,7 +320,7 @@ static void hpet_msi_set_affinity(struct irq_desc *desc, const cpumask_t *mask) msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK; msg.address_lo |= MSI_ADDR_DEST_ID(msg.dest32); if ( msg.data != ch->msi.msg.data || msg.dest32 != ch->msi.msg.dest32 ) - hpet_msi_write(ch, &msg); + __hpet_msi_write(ch, &msg); } /* @@ -332,13 +341,13 @@ static int __hpet_setup_msi_irq(struct irq_desc *desc) struct msi_msg msg; msi_compose_msg(desc->arch.vector, desc->arch.cpu_mask, &msg); - return hpet_msi_write(desc->action->dev_id, &msg); + return __hpet_msi_write(desc->action->dev_id, &msg); } static int __init hpet_setup_msi_irq(struct hpet_event_channel *ch) { int ret; - u32 cfg = hpet_read32(HPET_Tn_CFG(ch->idx)); + u32 cfg; irq_desc_t *desc = irq_to_desc(ch->msi.irq); if ( iommu_intremap ) @@ -350,6 +359,7 @@ static int __init hpet_setup_msi_irq(struct hpet_event_channel *ch) } /* set HPET Tn as oneshot */ + cfg = hpet_read32(HPET_Tn_CFG(ch->idx)); 
cfg &= ~(HPET_TN_LEVEL | HPET_TN_PERIODIC); cfg |= HPET_TN_FSB | HPET_TN_32BIT; hpet_write32(cfg, HPET_Tn_CFG(ch->idx)); @@ -697,14 +707,16 @@ void hpet_broadcast_enter(void) { unsigned int cpu = smp_processor_id(); struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu); + s_time_t deadline = this_cpu(timer_deadline); + + ASSERT(!local_irq_is_enabled()); - if ( per_cpu(timer_deadline, cpu) == 0 ) + if ( deadline == 0 ) return; if ( !ch ) ch = hpet_get_channel(cpu); - ASSERT(!local_irq_is_enabled()); if ( !(ch->flags & HPET_EVT_LEGACY) ) hpet_attach_channel(cpu, ch); @@ -725,7 +737,9 @@ void hpet_broadcast_exit(void) unsigned int cpu = smp_processor_id(); struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu); - if ( per_cpu(timer_deadline, cpu) == 0 ) + ASSERT(local_irq_is_enabled()); + + if ( this_cpu(timer_deadline) == 0 ) return; if ( !ch ) @@ -733,7 +747,7 @@ void hpet_broadcast_exit(void) /* Reprogram the deadline; trigger timer work now if it has passed. */ enable_APIC_timer(); - if ( !reprogram_timer(per_cpu(timer_deadline, cpu)) ) + if ( !reprogram_timer(this_cpu(timer_deadline)) ) raise_softirq(TIMER_SOFTIRQ); cpumask_clear_cpu(cpu, ch->cpumask); -- 1.7.10.4
Andrew Cooper
2013-Nov-07 15:28 UTC
[PATCH v2 2/5] x86/hpet: Use single apic vector rather than irq_descs for HPET interrupts
This involves rewriting most of the MSI related HPET code, and as a result this patch looks very complicated. It is probably best viewed as an end result, with the following notes explaining what is going on. The new logic is as follows: * A single high priority vector is allocated and used on all cpus. * Reliance on the irq infrastructure is completely removed. * Tracking of free hpet channels has changed. It is now an individual bitmap, and allocation is based on winning a test_and_clear_bit() operation. * There is a notion of strict ownership of hpet channels. ** A cpu which owns an HPET channel can program it for a desired deadline. ** A cpu which can't find a free HPET channel to own may register for being woken up by another in-use HPET which will fire at an appropriate time. * Some functions have been renamed to be more descriptive. Some functions have parameters changed to be more consistent. * Any function with a __hpet prefix expects the appropriate lock to be held. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> CC: Tim Deegan <tim@xen.org> --- xen/arch/x86/hpet.c | 429 +++++++++++++++++---------------------------------- 1 file changed, 144 insertions(+), 285 deletions(-) diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index 14e49e5..47d643c 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -4,26 +4,21 @@ * HPET management. 
*/ -#include <xen/config.h> +#include <xen/lib.h> +#include <xen/init.h> +#include <xen/cpuidle.h> #include <xen/errno.h> -#include <xen/time.h> -#include <xen/timer.h> -#include <xen/smp.h> #include <xen/softirq.h> -#include <xen/irq.h> -#include <xen/numa.h> + +#include <mach_apic.h> + #include <asm/fixmap.h> #include <asm/div64.h> #include <asm/hpet.h> -#include <asm/msi.h> -#include <mach_apic.h> -#include <xen/cpuidle.h> #define MAX_DELTA_NS MILLISECS(10*1000) #define MIN_DELTA_NS MICROSECS(20) -#define HPET_EVT_USED_BIT 0 -#define HPET_EVT_USED (1 << HPET_EVT_USED_BIT) #define HPET_EVT_DISABLE_BIT 1 #define HPET_EVT_DISABLE (1 << HPET_EVT_DISABLE_BIT) #define HPET_EVT_LEGACY_BIT 2 @@ -36,8 +31,6 @@ struct hpet_event_channel s_time_t next_event; cpumask_var_t cpumask; spinlock_t lock; - void (*event_handler)(struct hpet_event_channel *); - unsigned int idx; /* physical channel idx */ unsigned int cpu; /* msi target */ struct msi_desc msi;/* msi state */ @@ -48,8 +41,20 @@ static struct hpet_event_channel *__read_mostly hpet_events; /* msi hpet channels used for broadcast */ static unsigned int __read_mostly num_hpets_used; -DEFINE_PER_CPU(struct hpet_event_channel *, cpu_bc_channel); +/* High-priority vector for HPET interrupts */ +static u8 __read_mostly hpet_vector; + +/* + * HPET channel used for idling. Either the HPET channel this cpu owns + * (indicated by channel->cpu pointing back), or the HPET channel belonging to + * another cpu with which we have requested to be woken. + */ +static DEFINE_PER_CPU(struct hpet_event_channel *, hpet_channel); + +/* Bitmask of currently-free HPET channels. 
*/ +static uint32_t free_channels; +/* Data from the HPET ACPI table */ unsigned long __initdata hpet_address; u8 __initdata hpet_blockid; @@ -97,22 +102,20 @@ static inline unsigned long ns2ticks(unsigned long nsec, int shift, static int __hpet_set_counter(struct hpet_event_channel *ch, unsigned long delta) { uint32_t cnt, cmp; - unsigned long flags; - local_irq_save(flags); + ASSERT(!local_irq_is_enabled()); + cnt = hpet_read32(HPET_COUNTER); cmp = cnt + delta; hpet_write32(cmp, HPET_Tn_CMP(ch->idx)); cmp = hpet_read32(HPET_COUNTER); - local_irq_restore(flags); /* Are we within two ticks of the deadline passing? Then we may miss. */ return ((cmp + 2 - cnt) > delta) ? -ETIME : 0; } -static int reprogram_hpet_evt_channel( - struct hpet_event_channel *ch, - s_time_t expire, s_time_t now, int force) +static int __hpet_program_time(struct hpet_event_channel *ch, + s_time_t expire, s_time_t now, int force) { int64_t delta; int ret; @@ -153,89 +156,40 @@ static int reprogram_hpet_evt_channel( return ret; } -static void evt_do_broadcast(cpumask_t *mask) +static void __hpet_wake_cpus(cpumask_t *mask) { - unsigned int cpu = smp_processor_id(); - - if ( cpumask_test_and_clear_cpu(cpu, mask) ) - raise_softirq(TIMER_SOFTIRQ); - cpuidle_wakeup_mwait(mask); if ( !cpumask_empty(mask) ) cpumask_raise_softirq(mask, TIMER_SOFTIRQ); } -static void handle_hpet_broadcast(struct hpet_event_channel *ch) +static void __hpet_interrupt(struct hpet_event_channel *ch) { - cpumask_t mask; - s_time_t now, next_event; - unsigned int cpu; - unsigned long flags; - - spin_lock_irqsave(&ch->lock, flags); - -again: - ch->next_event = STIME_MAX; - - spin_unlock_irqrestore(&ch->lock, flags); - - next_event = STIME_MAX; - cpumask_clear(&mask); - now = NOW(); - - /* find all expired events */ - for_each_cpu(cpu, ch->cpumask) - { - s_time_t deadline; - - rmb(); - deadline = per_cpu(timer_deadline, cpu); - rmb(); - if ( !cpumask_test_cpu(cpu, ch->cpumask) ) - continue; - - if ( deadline <= now ) - 
cpumask_set_cpu(cpu, &mask); - else if ( deadline < next_event ) - next_event = deadline; - } - - /* wakeup the cpus which have an expired event. */ - evt_do_broadcast(&mask); - - if ( next_event != STIME_MAX ) - { - spin_lock_irqsave(&ch->lock, flags); - - if ( next_event < ch->next_event && - reprogram_hpet_evt_channel(ch, next_event, now, 0) ) - goto again; - - spin_unlock_irqrestore(&ch->lock, flags); - } + __hpet_wake_cpus(ch->cpumask); + __hpet_program_time(ch, this_cpu(timer_deadline), NOW(), 1); + raise_softirq(TIMER_SOFTIRQ); } -static void hpet_interrupt_handler(int irq, void *data, - struct cpu_user_regs *regs) +static void hpet_interrupt_handler(struct cpu_user_regs *regs) { - struct hpet_event_channel *ch = (struct hpet_event_channel *)data; - - this_cpu(irq_count)--; + unsigned int cpu = smp_processor_id(); + struct hpet_event_channel *ch = this_cpu(hpet_channel); - if ( !ch->event_handler ) + if ( ch ) { - printk(XENLOG_WARNING "Spurious HPET timer interrupt on HPET timer %d\n", ch->idx); - return; + spin_lock(&ch->lock); + if ( ch->cpu == cpu ) + __hpet_interrupt(ch); + spin_unlock(&ch->lock); } - ch->event_handler(ch); + ack_APIC_irq(); } -static void hpet_msi_unmask(struct irq_desc *desc) +static void __hpet_msi_unmask(struct hpet_event_channel *ch) { u32 cfg; - struct hpet_event_channel *ch = desc->action->dev_id; cfg = hpet_read32(HPET_Tn_CFG(ch->idx)); cfg |= HPET_TN_ENABLE; @@ -243,10 +197,9 @@ static void hpet_msi_unmask(struct irq_desc *desc) ch->msi.msi_attrib.masked = 0; } -static void hpet_msi_mask(struct irq_desc *desc) +static void __hpet_msi_mask(struct hpet_event_channel *ch) { u32 cfg; - struct hpet_event_channel *ch = desc->action->dev_id; cfg = hpet_read32(HPET_Tn_CFG(ch->idx)); cfg &= ~HPET_TN_ENABLE; @@ -281,74 +234,20 @@ static int __hpet_msi_write(struct hpet_event_channel *ch, struct msi_msg *msg) return 0; } -static void __maybe_unused -hpet_msi_read(struct hpet_event_channel *ch, struct msi_msg *msg) -{ - msg->data = 
hpet_read32(HPET_Tn_ROUTE(ch->idx)); - msg->address_lo = hpet_read32(HPET_Tn_ROUTE(ch->idx) + 4); - msg->address_hi = MSI_ADDR_BASE_HI; - if ( iommu_intremap ) - iommu_read_msi_from_ire(&ch->msi, msg); -} - -static unsigned int hpet_msi_startup(struct irq_desc *desc) -{ - hpet_msi_unmask(desc); - return 0; -} - -#define hpet_msi_shutdown hpet_msi_mask - -static void hpet_msi_ack(struct irq_desc *desc) -{ - irq_complete_move(desc); - move_native_irq(desc); - ack_APIC_irq(); -} - -static void hpet_msi_set_affinity(struct irq_desc *desc, const cpumask_t *mask) -{ - struct hpet_event_channel *ch = desc->action->dev_id; - struct msi_msg msg = ch->msi.msg; - - msg.dest32 = set_desc_affinity(desc, mask); - if ( msg.dest32 == BAD_APICID ) - return; - - msg.data &= ~MSI_DATA_VECTOR_MASK; - msg.data |= MSI_DATA_VECTOR(desc->arch.vector); - msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK; - msg.address_lo |= MSI_ADDR_DEST_ID(msg.dest32); - if ( msg.data != ch->msi.msg.data || msg.dest32 != ch->msi.msg.dest32 ) - __hpet_msi_write(ch, &msg); -} - -/* - * IRQ Chip for MSI HPET Devices, - */ -static hw_irq_controller hpet_msi_type = { - .typename = "HPET-MSI", - .startup = hpet_msi_startup, - .shutdown = hpet_msi_shutdown, - .enable = hpet_msi_unmask, - .disable = hpet_msi_mask, - .ack = hpet_msi_ack, - .set_affinity = hpet_msi_set_affinity, -}; - -static int __hpet_setup_msi_irq(struct irq_desc *desc) +static int __hpet_setup_msi(struct hpet_event_channel *ch) { struct msi_msg msg; - msi_compose_msg(desc->arch.vector, desc->arch.cpu_mask, &msg); - return __hpet_msi_write(desc->action->dev_id, &msg); + ASSERT(ch->cpu != -1); + + msi_compose_msg(hpet_vector, cpumask_of(ch->cpu), &msg); + return __hpet_msi_write(ch, &msg); } -static int __init hpet_setup_msi_irq(struct hpet_event_channel *ch) +static int __init hpet_setup_msi(struct hpet_event_channel *ch) { int ret; u32 cfg; - irq_desc_t *desc = irq_to_desc(ch->msi.irq); if ( iommu_intremap ) { @@ -364,10 +263,7 @@ static int __init 
hpet_setup_msi_irq(struct hpet_event_channel *ch) cfg |= HPET_TN_FSB | HPET_TN_32BIT; hpet_write32(cfg, HPET_Tn_CFG(ch->idx)); - desc->handler = &hpet_msi_type; - ret = request_irq(ch->msi.irq, hpet_interrupt_handler, "HPET", ch); - if ( ret >= 0 ) - ret = __hpet_setup_msi_irq(desc); + ret = __hpet_setup_msi(ch); if ( ret < 0 ) { if ( iommu_intremap ) @@ -375,25 +271,6 @@ static int __init hpet_setup_msi_irq(struct hpet_event_channel *ch) return ret; } - desc->msi_desc = &ch->msi; - - return 0; -} - -static int __init hpet_assign_irq(struct hpet_event_channel *ch) -{ - int irq; - - if ( (irq = create_irq(NUMA_NO_NODE)) < 0 ) - return irq; - - ch->msi.irq = irq; - if ( hpet_setup_msi_irq(ch) ) - { - destroy_irq(irq); - return -EINVAL; - } - return 0; } @@ -414,6 +291,8 @@ static void __init hpet_fsb_cap_lookup(void) if ( !hpet_events ) return; + alloc_direct_apic_vector(&hpet_vector, hpet_interrupt_handler); + for ( i = 0; i < num_chs && num_hpets_used < nr_cpu_ids; i++ ) { struct hpet_event_channel *ch = &hpet_events[num_hpets_used]; @@ -436,7 +315,7 @@ static void __init hpet_fsb_cap_lookup(void) ch->flags = 0; ch->idx = i; - if ( hpet_assign_irq(ch) == 0 ) + if ( hpet_setup_msi(ch) == 0 ) num_hpets_used++; } @@ -444,102 +323,28 @@ static void __init hpet_fsb_cap_lookup(void) num_hpets_used, num_chs); } -static struct hpet_event_channel *hpet_get_channel(unsigned int cpu) +/* + * Search for, and allocate, a free HPET channel. Returns a pointer to the + * channel, or NULL in the case that none were free. The caller is + * responsible for returning the channel to the free pool. 
+ */ +static struct hpet_event_channel *hpet_get_free_channel(void) { - static unsigned int next_channel; - unsigned int i, next; - struct hpet_event_channel *ch; - - if ( num_hpets_used == 0 ) - return hpet_events; + unsigned ch, tries; - if ( num_hpets_used >= nr_cpu_ids ) - return &hpet_events[cpu]; - - do { - next = next_channel; - if ( (i = next + 1) == num_hpets_used ) - i = 0; - } while ( cmpxchg(&next_channel, next, i) != next ); - - /* try unused channel first */ - for ( i = next; i < next + num_hpets_used; i++ ) + for ( tries = num_hpets_used; tries; --tries ) { - ch = &hpet_events[i % num_hpets_used]; - if ( !test_and_set_bit(HPET_EVT_USED_BIT, &ch->flags) ) - { - ch->cpu = cpu; - return ch; - } - } - - /* share a in-use channel */ - ch = &hpet_events[next]; - if ( !test_and_set_bit(HPET_EVT_USED_BIT, &ch->flags) ) - ch->cpu = cpu; - - return ch; -} - -static void set_channel_irq_affinity(struct hpet_event_channel *ch) -{ - struct irq_desc *desc = irq_to_desc(ch->msi.irq); - - ASSERT(!local_irq_is_enabled()); - spin_lock(&desc->lock); - hpet_msi_mask(desc); - hpet_msi_set_affinity(desc, cpumask_of(ch->cpu)); - hpet_msi_unmask(desc); - spin_unlock(&desc->lock); - - spin_unlock(&ch->lock); - - /* We may have missed an interrupt due to the temporary masking. 
*/ - if ( ch->event_handler && ch->next_event < NOW() ) - ch->event_handler(ch); -} - -static void hpet_attach_channel(unsigned int cpu, - struct hpet_event_channel *ch) -{ - ASSERT(!local_irq_is_enabled()); - spin_lock(&ch->lock); - - per_cpu(cpu_bc_channel, cpu) = ch; - - /* try to be the channel owner again while holding the lock */ - if ( !test_and_set_bit(HPET_EVT_USED_BIT, &ch->flags) ) - ch->cpu = cpu; - - if ( ch->cpu != cpu ) - spin_unlock(&ch->lock); - else - set_channel_irq_affinity(ch); -} - -static void hpet_detach_channel(unsigned int cpu, - struct hpet_event_channel *ch) -{ - spin_lock_irq(&ch->lock); - - ASSERT(ch == per_cpu(cpu_bc_channel, cpu)); + if ( (ch = ffs(free_channels)) == 0 ) + break; - per_cpu(cpu_bc_channel, cpu) = NULL; + --ch; + ASSERT(ch < num_hpets_used); - if ( cpu != ch->cpu ) - spin_unlock_irq(&ch->lock); - else if ( cpumask_empty(ch->cpumask) ) - { - ch->cpu = -1; - clear_bit(HPET_EVT_USED_BIT, &ch->flags); - spin_unlock_irq(&ch->lock); - } - else - { - ch->cpu = cpumask_first(ch->cpumask); - set_channel_irq_affinity(ch); - local_irq_enable(); + if ( test_and_clear_bit(ch, &free_channels) ) + return &hpet_events[ch]; } + + return NULL; } #include <asm/mc146818rtc.h> @@ -576,6 +381,7 @@ void __init hpet_broadcast_init(void) /* Stop HPET legacy interrupts */ cfg &= ~HPET_CFG_LEGACY; n = num_hpets_used; + free_channels = (1U << n) - 1; } else { @@ -619,9 +425,8 @@ void __init hpet_broadcast_init(void) hpet_events[i].shift = 32; hpet_events[i].next_event = STIME_MAX; spin_lock_init(&hpet_events[i].lock); - wmb(); - hpet_events[i].event_handler = handle_hpet_broadcast; + hpet_events[1].msi.irq = -1; hpet_events[i].msi.msi_attrib.maskbit = 1; hpet_events[i].msi.msi_attrib.pos = MSI_TYPE_HPET; } @@ -661,9 +466,6 @@ void hpet_broadcast_resume(void) for ( i = 0; i < n; i++ ) { - if ( hpet_events[i].msi.irq >= 0 ) - __hpet_setup_msi_irq(irq_to_desc(hpet_events[i].msi.irq)); - /* set HPET Tn as oneshot */ cfg = 
hpet_read32(HPET_Tn_CFG(hpet_events[i].idx)); cfg &= ~(HPET_TN_LEVEL | HPET_TN_PERIODIC); @@ -706,36 +508,76 @@ void hpet_disable_legacy_broadcast(void) void hpet_broadcast_enter(void) { unsigned int cpu = smp_processor_id(); - struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu); + struct hpet_event_channel *ch = this_cpu(hpet_channel); s_time_t deadline = this_cpu(timer_deadline); ASSERT(!local_irq_is_enabled()); + ASSERT(ch == NULL); if ( deadline == 0 ) return; - if ( !ch ) - ch = hpet_get_channel(cpu); + ch = hpet_get_free_channel(); + if ( ch ) + { + /* This really should be an MSI channel by this point */ + ASSERT( !(ch->flags & HPET_EVT_LEGACY) ); + + spin_lock(&ch->lock); + + this_cpu(hpet_channel) = ch; + ch->cpu = cpu; + cpumask_set_cpu(cpu, ch->cpumask); + + __hpet_setup_msi(ch); + __hpet_program_time(ch, deadline, NOW(), 1); + __hpet_msi_unmask(ch); + + spin_unlock(&ch->lock); + + } + else + { + /* TODO - this seems very ugly */ + unsigned i; + + for ( i = 0; i < num_hpets_used; ++i ) + { + ch = &hpet_events[i]; + spin_lock(&ch->lock); + + if ( ch->cpu == -1 ) + goto continue_search; + + if ( ch->next_event >= deadline - MICROSECS(50) && + ch->next_event <= deadline ) + break; + + continue_search: + spin_unlock(&ch->lock); + ch = NULL; + } + + if ( ch ) + { + cpumask_set_cpu(cpu, ch->cpumask); + this_cpu(hpet_channel) = ch; + spin_unlock(&ch->lock); + } + else + this_cpu(timer_deadline) = NOW(); - if ( !(ch->flags & HPET_EVT_LEGACY) ) - hpet_attach_channel(cpu, ch); + } /* Disable LAPIC timer interrupts. 
*/ disable_APIC_timer(); - cpumask_set_cpu(cpu, ch->cpumask); - - spin_lock(&ch->lock); - /* reprogram if current cpu expire time is nearer */ - if ( per_cpu(timer_deadline, cpu) < ch->next_event ) - reprogram_hpet_evt_channel(ch, per_cpu(timer_deadline, cpu), NOW(), 1); - spin_unlock(&ch->lock); } void hpet_broadcast_exit(void) { unsigned int cpu = smp_processor_id(); - struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu); + struct hpet_event_channel *ch = this_cpu(hpet_channel); ASSERT(local_irq_is_enabled()); @@ -743,17 +585,29 @@ void hpet_broadcast_exit(void) return; if ( !ch ) - ch = hpet_get_channel(cpu); + return; + + spin_lock_irq(&ch->lock); + + cpumask_clear_cpu(cpu, ch->cpumask); + + /* If we own the channel, detach it */ + if ( ch->cpu == cpu ) + { + __hpet_msi_mask(ch); + __hpet_wake_cpus(ch->cpumask); + ch->cpu = -1; + set_bit(ch->idx, &free_channels); + } + + this_cpu(hpet_channel) = NULL; + + spin_unlock_irq(&ch->lock); /* Reprogram the deadline; trigger timer work now if it has passed. */ enable_APIC_timer(); if ( !reprogram_timer(this_cpu(timer_deadline)) ) raise_softirq(TIMER_SOFTIRQ); - - cpumask_clear_cpu(cpu, ch->cpumask); - - if ( !(ch->flags & HPET_EVT_LEGACY) ) - hpet_detach_channel(cpu, ch); } int hpet_broadcast_is_available(void) @@ -770,7 +624,12 @@ int hpet_legacy_irq_tick(void) (hpet_events->flags & (HPET_EVT_DISABLE|HPET_EVT_LEGACY)) != HPET_EVT_LEGACY ) return 0; - hpet_events->event_handler(hpet_events); + + /* TODO - Does this really make sense for legacy ticks ? */ + spin_lock_irq(&hpet_events->lock); + __hpet_interrupt(hpet_events); + spin_unlock_irq(&hpet_events->lock); + return 1; } -- 1.7.10.4
These changes are ones which were able to be pulled out of the previous patch. They are all misc cleanup without functional implications. * Shift HPET_EVT_* definitions up now that USED has moved out * Shuffle struct hpet_event_channel ** Reflow horizontally and comment current use ** Promote 'shift' to unsigned. It is the constant 32 but can be more easily optimised. ** Move 'flags' up to fill 4 byte hole ** Move 'cpumask' and 'lock' into second cache line as they are dirtied from other cpus Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> CC: Tim Deegan <tim@xen.org> --- xen/arch/x86/hpet.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index 47d643c..e467a78 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -19,22 +19,22 @@ #define MAX_DELTA_NS MILLISECS(10*1000) #define MIN_DELTA_NS MICROSECS(20) -#define HPET_EVT_DISABLE_BIT 1 +#define HPET_EVT_DISABLE_BIT 0 #define HPET_EVT_DISABLE (1 << HPET_EVT_DISABLE_BIT) -#define HPET_EVT_LEGACY_BIT 2 +#define HPET_EVT_LEGACY_BIT 1 #define HPET_EVT_LEGACY (1 << HPET_EVT_LEGACY_BIT) struct hpet_event_channel { - unsigned long mult; - int shift; - s_time_t next_event; - cpumask_var_t cpumask; - spinlock_t lock; - unsigned int idx; /* physical channel idx */ - unsigned int cpu; /* msi target */ - struct msi_desc msi;/* msi state */ - unsigned int flags; /* HPET_EVT_x */ + unsigned long mult; /* tick <-> time conversion */ + unsigned int shift; /* tick <-> time conversion */ + unsigned int flags; /* HPET_EVT_x */ + s_time_t next_event; /* expected time of next interrupt */ + unsigned int idx; /* HPET counter index */ + unsigned int cpu; /* owner of channel (or -1) */ + struct msi_desc msi; /* msi state */ + cpumask_var_t cpumask; /* cpus wishing to be woken */ + spinlock_t lock; } __cacheline_aligned; static struct hpet_event_channel *__read_mostly 
hpet_events; -- 1.7.10.4
Andrew Cooper
2013-Nov-07 15:28 UTC
[PATCH v2 4/5] x86/hpet: Debug and verbose hpet logging
This was for debugging purposes, but might perhaps be more useful generally. I am happy to keep none, some or all of it, depending on how useful people think it might be. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> --- xen/arch/x86/hpet.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index e467a78..6deb730 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -66,6 +66,36 @@ u8 __initdata hpet_blockid; static bool_t __initdata force_hpet_broadcast; boolean_param("hpetbroadcast", force_hpet_broadcast); +static bool_t __read_mostly hpet_verbose; +static bool_t __read_mostly hpet_debug; +static void __init parse_hpet_param(char * s) +{ + char *ss; + int val; + + do { + val = !!strncmp(s, "no-", 3); + if ( !val ) + s += 3; + + ss = strchr(s, ','); + if ( ss ) + *ss = '\0'; + + if ( !strcmp(s, "verbose") ) + hpet_verbose = val; + else if ( !strcmp(s, "debug") ) + { + hpet_debug = val; + if ( val ) + hpet_verbose = 1; + } + + s = ss + 1; + } while ( ss ); +} +custom_param("hpet", parse_hpet_param); + /* * Calculate a multiplication factor for scaled math, which is used to convert * nanoseconds based values to clock ticks: @@ -99,6 +129,35 @@ static inline unsigned long ns2ticks(unsigned long nsec, int shift, return (unsigned long) tmp; } +static void dump_hpet_timer(unsigned timer) +{ + u32 cfg = hpet_read32(HPET_Tn_CFG(timer)); + + printk(XENLOG_INFO "HPET: Timer %02u CFG: raw 0x%08"PRIx32 + " Caps: %d %c%c", timer, cfg, + cfg & HPET_TN_64BIT_CAP ? 64 : 32, + cfg & HPET_TN_FSB_CAP ? 'M' : '-', + cfg & HPET_TN_PERIODIC_CAP ? 'P' : '-'); + + printk("\n Setup: "); + + if ( (cfg & HPET_TN_FSB_CAP) && (cfg & HPET_TN_FSB) ) + printk("FSB "); + + if ( !(cfg & HPET_TN_FSB) ) + printk("GSI %#x ", + (cfg & HPET_TN_ROUTE) >> HPET_TN_ROUTE_SHIFT); + + if ( cfg & HPET_TN_32BIT ) + printk("32bit "); + + if ( cfg & HPET_TN_PERIODIC ) + printk("Periodic "); + + printk("%sabled ", cfg & HPET_TN_ENABLE ? "En" : "Dis"); + printk("%s\n", cfg & HPET_TN_LEVEL ? "Level" : "Edge"); +} + static int __hpet_set_counter(struct hpet_event_channel *ch, unsigned long delta) { uint32_t cnt, cmp; @@ -642,7 +701,14 @@ u64 __init hpet_setup(void) unsigned int last; if ( hpet_rate ) + { + if ( hpet_debug ) + printk(XENLOG_DEBUG "HPET: Skipping re-setup\n"); return hpet_rate; + } + + if ( hpet_debug ) + printk(XENLOG_DEBUG "HPET: Setting up hpet data\n"); if ( hpet_address == 0 ) return 0; @@ -656,6 +722,20 @@ u64 __init hpet_setup(void) return 0; } + if ( hpet_verbose ) + { + printk(XENLOG_INFO "HPET: Vendor: %04"PRIx16", Rev: %u, %u timers\n", + hpet_id >> HPET_ID_VENDOR_SHIFT, + hpet_id & HPET_ID_REV, + ((hpet_id & HPET_ID_NUMBER) >> HPET_ID_NUMBER_SHIFT) + 1); + printk(XENLOG_INFO "HPET: Caps: "); + if ( hpet_id & HPET_ID_LEGSUP ) + printk("Legacy "); + if ( hpet_id & HPET_ID_64BIT ) + printk("64bit "); + printk("\n"); + } + /* Check for sane period (100ps <= period <= 100ns). */ hpet_period = hpet_read32(HPET_PERIOD); if ( (hpet_period > 100000000) || (hpet_period < 100000) ) @@ -713,6 +793,9 @@ void hpet_resume(u32 *boot_cfg) cfg &= ~HPET_TN_RESERVED; } hpet_write32(cfg, HPET_Tn_CFG(i)); + + if ( hpet_verbose ) + dump_hpet_timer(i); } cfg = hpet_read32(HPET_CFG); -- 1.7.10.4
Debug key for dumping HPET state. This patch is not intended for committing. --- xen/arch/x86/hpet.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index 6deb730..c96a283 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -9,6 +9,7 @@ #include <xen/cpuidle.h> #include <xen/errno.h> #include <xen/softirq.h> +#include <xen/keyhandler.h> #include <mach_apic.h> @@ -694,6 +695,21 @@ int hpet_legacy_irq_tick(void) static u32 *hpet_boot_cfg; +static void do_hpet_dump_state(unsigned char key) +{ + unsigned i; + printk("'%c' pressed - dumping HPET state\n", key); + + for ( i = 0; i < num_hpets_used; ++i ) + dump_hpet_timer(i); +} + +static struct keyhandler hpet_dump_state = { + .irq_callback = 0, + .u.fn = do_hpet_dump_state, + .desc = "Dump hpet state" +}; + u64 __init hpet_setup(void) { static u64 __initdata hpet_rate; @@ -751,6 +767,8 @@ u64 __init hpet_setup(void) hpet_rate = 1000000000000000ULL; /* 10^15 */ (void)do_div(hpet_rate, hpet_period); + register_keyhandler('1', &hpet_dump_state); + return hpet_rate; } -- 1.7.10.4
Tim Deegan
2013-Nov-08 01:08 UTC
Re: [PATCH v2 2/5] x86/hpet: Use single apic vector rather than irq_descs for HPET interrupts
Hi, At 15:28 +0000 on 07 Nov (1383834502), Andrew Cooper wrote:> This involves rewriting most of the MSI related HPET code, and as a result > this patch looks very complicated. It is probably best viewed as an end > result, with the following notes explaining what is going on. > > The new logic is as follows: > * A single high priority vector is allocated and used on all cpus. > * Reliance on the irq infrastructure is completely removed. > * Tracking of free hpet channels has changed. It is now an individual > bitmap, and allocation is based on winning a test_and_clear_bit() > operation. > * There is a notion of strict ownership of hpet channels. > ** A cpu which owns an HPET channel can program it for a desired deadline. > ** A cpu which can't find a free HPET channel to own may register for being > woken up by another in-use HPET which will fire at an appropriate time. > * Some functions have been renamed to be more descriptive. Some functions > have parameters changed to be more consistent. > * Any function with a __hpet prefix expects the appropriate lock to be held.It certainly looks like the code should be easier to understand after this! I'll try to read through the end result later in the week, but I have a few questions from the patch:> -static void handle_hpet_broadcast(struct hpet_event_channel *ch) > +static void __hpet_interrupt(struct hpet_event_channel *ch) > {[...]> + __hpet_program_time(ch, this_cpu(timer_deadline), NOW(), 1); > + raise_softirq(TIMER_SOFTIRQ); > }What's the __hpet_program_time() doing? It looks like we're reprogramming the hpet for our next event, before we handle the event we woke up for (i.e. always setting it to fire immediately). 
Or have I misunderstood?

> @@ -619,9 +425,8 @@ void __init hpet_broadcast_init(void)
>          hpet_events[i].shift = 32;
>          hpet_events[i].next_event = STIME_MAX;
>          spin_lock_init(&hpet_events[i].lock);
> -        wmb();
> -        hpet_events[i].event_handler = handle_hpet_broadcast;
>
> +        hpet_events[1].msi.irq = -1;

Really [1] or DYM [i]?

>          hpet_events[i].msi.msi_attrib.maskbit = 1;
>          hpet_events[i].msi.msi_attrib.pos = MSI_TYPE_HPET;
>      }
> @@ -661,9 +466,6 @@ void hpet_broadcast_resume(void)
>
>      for ( i = 0; i < n; i++ )
>      {
> -        if ( hpet_events[i].msi.irq >= 0 )
> -            __hpet_setup_msi_irq(irq_to_desc(hpet_events[i].msi.irq));
> -
>          /* set HPET Tn as oneshot */
>          cfg = hpet_read32(HPET_Tn_CFG(hpet_events[i].idx));
>          cfg &= ~(HPET_TN_LEVEL | HPET_TN_PERIODIC);
> @@ -706,36 +508,76 @@ void hpet_disable_legacy_broadcast(void)
>  void hpet_broadcast_enter(void)
>  {
>      unsigned int cpu = smp_processor_id();
> -    struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu);
> +    struct hpet_event_channel *ch = this_cpu(hpet_channel);
>      s_time_t deadline = this_cpu(timer_deadline);
>
>      ASSERT(!local_irq_is_enabled());
> +    ASSERT(ch == NULL);
>
>      if ( deadline == 0 )
>          return;
>
> -    if ( !ch )
> -        ch = hpet_get_channel(cpu);
> +    ch = hpet_get_free_channel();
>
> +    if ( ch )
> +    {
> +        /* This really should be an MSI channel by this point */
> +        ASSERT( !(ch->flags & HPET_EVT_LEGACY) );
> +
> +        spin_lock(&ch->lock);
> +
> +        this_cpu(hpet_channel) = ch;
> +        ch->cpu = cpu;
> +        cpumask_set_cpu(cpu, ch->cpumask);
> +
> +        __hpet_setup_msi(ch);
> +        __hpet_program_time(ch, deadline, NOW(), 1);
> +        __hpet_msi_unmask(ch);
> +
> +        spin_unlock(&ch->lock);
> +
> +    }
> +    else
> +    {
> +        /* TODO - this seems very ugly */
> +        unsigned i;
> +
> +        for ( i = 0; i < num_hpets_used; ++i )
> +        {
> +            ch = &hpet_events[i];
> +            spin_lock(&ch->lock);
> +
> +            if ( ch->cpu == -1 )
> +                goto continue_search;
> +
> +            if ( ch->next_event >= deadline - MICROSECS(50) &&
> +                 ch->next_event <= deadline )
> +                break;
> +
> +    continue_search:
> +            spin_unlock(&ch->lock);
> +            ch = NULL;
> +        }
> +
> +        if ( ch )
> +        {
> +            cpumask_set_cpu(cpu, ch->cpumask);
> +            this_cpu(hpet_channel) = ch;
> +            spin_unlock(&ch->lock);
> +        }
> +        else
> +            this_cpu(timer_deadline) = NOW();

Hmm.  So if we can't find a channel that fits the deadline we want,
we give up?  I thought the plan was to cause some other channel to
fire early.

Tim.
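[For reference while reviewing: the "winning a test_and_clear_bit()
operation" allocation scheme the patch notes describe can be sketched
stand-alone roughly as below.  This is a simplified model, not the
patch's actual code - the names, GCC __atomic builtins in place of
Xen's bitops, and the plain bitmap are all illustrative assumptions.]

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CHANNELS 8

/* One bit per channel: set == free.  Xen keeps a struct
 * hpet_event_channel array alongside; this sketch stands alone. */
static unsigned long free_channels = (1UL << NUM_CHANNELS) - 1;

/* Atomically claim bit 'i': returns true iff this caller won the race. */
static bool claim_bit(unsigned int i, unsigned long *addr)
{
    unsigned long mask = 1UL << i;
    return (__atomic_fetch_and(addr, ~mask, __ATOMIC_SEQ_CST) & mask) != 0;
}

static void release_bit(unsigned int i, unsigned long *addr)
{
    __atomic_fetch_or(addr, 1UL << i, __ATOMIC_SEQ_CST);
}

/* Allocation is "winning" the atomic test-and-clear: scan the bitmap
 * and try to claim each apparently-free bit.  Returns a channel index,
 * or -1 if every channel is already owned. */
static int hpet_get_free_channel_idx(void)
{
    unsigned int i;

    for ( i = 0; i < NUM_CHANNELS; ++i )
        if ( (free_channels & (1UL << i)) && claim_bit(i, &free_channels) )
            return i;

    return -1;
}

static void hpet_put_channel_idx(unsigned int i)
{
    release_bit(i, &free_channels);
}
```

The point of the scheme is that two cpus racing for the same channel
cannot both succeed: only the one whose atomic fetch-and observes the
bit still set becomes the owner; the loser just moves on to the next
bit or falls back to piggy-backing on an in-use channel.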
>>> On 07.11.13 at 16:28, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> This is a set of changes which might not make sense alone, but are
> used/required for the main purpose of the series, and simplify the
> mammoth patch, making it easier to review.
>
> * Make hpet_msi_write conditionally disable MSIs while updating the route
>   register.  This removes the requirement that it must be masked while
>   writing.
>
> * Defer read of cfg register in hpet_setup_msi_irq.  As a result, an
>   intremap failure prevents us suffering a pointless MMIO read.

Rather benign I would say - this isn't a hot code path.

> * Change some instances of per_cpu($foo, cpu) to this_cpu($foo).  It is
>   cleaner to read, and makes it more obvious when the code is poking
>   around in another cpu's data.

Documentation wise this is certainly desirable, but since our this_cpu(),
other than e.g. Linux's equivalents, internally does another
smp_processor_id(), generated code becomes worse.

> * Convert hpet_next_event() to taking a struct hpet_event_channel *, and
>   rename to __hpet_set_counter() for a more accurate description of its
>   actions.

I very much think we should get away from the habit of prefixing
symbols with two underscores: Such names are reserved to the
compiler/platform, and the standard specifically reserves names
starting with one underscore (and not followed by an upper case
letter) for use as file scope symbols.  This is what I'd like to
request be done here.

> @@ -697,14 +707,16 @@ void hpet_broadcast_enter(void)
>  {
>      unsigned int cpu = smp_processor_id();
>      struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu);

And you're not even consistent with your conversion - you don't
convert the above, ...

> +    s_time_t deadline = this_cpu(timer_deadline);
> +
> +    ASSERT(!local_irq_is_enabled());
>
> -    if ( per_cpu(timer_deadline, cpu) == 0 )

... but you do convert this one.

> @@ -725,7 +737,9 @@ void hpet_broadcast_exit(void)
>      unsigned int cpu = smp_processor_id();
>      struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu);
>
> -    if ( per_cpu(timer_deadline, cpu) == 0 )
> +    ASSERT(local_irq_is_enabled());
> +
> +    if ( this_cpu(timer_deadline) == 0 )

Same here.  Anyway - I'd prefer these to be left alone.

Jan
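[Aside, for readers unfamiliar with the macros being discussed: the
codegen point hinges on this_cpu() re-deriving the cpu id on every use.
A deliberately simplified model, not Xen's actual per-cpu
implementation - the array-based storage and the fake cpu id are
illustrative assumptions:]

```c
#include <assert.h>

#define NR_CPUS 4

/* Stand-in for smp_processor_id(); in a real kernel this reads
 * per-cpu state, which is why every this_cpu() use costs a lookup. */
static int fake_current_cpu;
static int model_smp_processor_id(void) { return fake_current_cpu; }

/* Per-cpu variables modelled as plain arrays indexed by cpu. */
static long timer_deadline_pcpu[NR_CPUS];

/* per_cpu(var, cpu): explicit cpu index, evaluated once by the caller.
 * this_cpu(var): re-runs model_smp_processor_id() at each use site. */
#define per_cpu(var, cpu)  (var##_pcpu[(cpu)])
#define this_cpu(var)      per_cpu(var, model_smp_processor_id())
```

With "unsigned int cpu = smp_processor_id()" already in scope the
compiler can usually fold the repeated lookups together, which is
Andrew's counter-point below; Jan's concern is the cases where it
cannot, e.g. across an intervening function call.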
On 08/11/13 09:49, Jan Beulich wrote:
>>>> On 07.11.13 at 16:28, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> This is a set of changes which might not make sense alone, but are
>> used/required for the main purpose of the series, and simplify the
>> mammoth patch, making it easier to review.
>>
>> * Make hpet_msi_write conditionally disable MSIs while updating the route
>>   register.  This removes the requirement that it must be masked while
>>   writing.
>>
>> * Defer read of cfg register in hpet_setup_msi_irq.  As a result, an
>>   intremap failure prevents us suffering a pointless MMIO read.
>
> Rather benign I would say - this isn't a hot code path.
>
>> * Change some instances of per_cpu($foo, cpu) to this_cpu($foo).  It is
>>   cleaner to read, and makes it more obvious when the code is poking
>>   around in another cpu's data.
>
> Documentation wise this is certainly desirable, but since our this_cpu(),
> other than e.g. Linux's equivalents, internally does another
> smp_processor_id(), generated code becomes worse.

With an "unsigned int cpu = smp_processor_id()" in context, the
generated code appears to be identical.

>> * Convert hpet_next_event() to taking a struct hpet_event_channel *, and
>>   rename to __hpet_set_counter() for a more accurate description of its
>>   actions.
>
> I very much think we should get away from the habit of prefixing
> symbols with two underscores: Such names are reserved to the
> compiler/platform, and the standard specifically reserves names
> starting with one underscore (and not followed by an upper case
> letter) for use as file scope symbols.  This is what I'd like to
> request be done here.

So what would you suggest?  Here, all the __$foo() are designed to end
up similar to large swathes of other Xen code, implying that the
appropriate spinlock should be held by the caller.
I am not too fussed which convention we use, so long as it is used
consistently.

>> @@ -697,14 +707,16 @@ void hpet_broadcast_enter(void)
>>  {
>>      unsigned int cpu = smp_processor_id();
>>      struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu);
>
> And you're not even consistent with your conversion - you don't
> convert the above, ...

Because cpu_bc_channel disappears in the next patch.  As stated in the
description, the primary purpose of this patch is to make the following
patch easier to read, rather than to strictly make sense itself.

>
>> +    s_time_t deadline = this_cpu(timer_deadline);
>> +
>> +    ASSERT(!local_irq_is_enabled());
>>
>> -    if ( per_cpu(timer_deadline, cpu) == 0 )
>
> ... but you do convert this one.

Because the following patch needs 'deadline' in the new code added here.

~Andrew

>
>> @@ -725,7 +737,9 @@ void hpet_broadcast_exit(void)
>>      unsigned int cpu = smp_processor_id();
>>      struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu);
>>
>> -    if ( per_cpu(timer_deadline, cpu) == 0 )
>> +    ASSERT(local_irq_is_enabled());
>> +
>> +    if ( this_cpu(timer_deadline) == 0 )
>
> Same here.  Anyway - I'd prefer these to be left alone.
>
> Jan
Andrew Cooper
2013-Nov-08 11:05 UTC
Re: [PATCH v2 2/5] x86/hpet: Use single apic vector rather than irq_descs for HPET interrupts
On 08/11/13 01:08, Tim Deegan wrote:
> Hi,
>
> At 15:28 +0000 on 07 Nov (1383834502), Andrew Cooper wrote:
>> This involves rewriting most of the MSI related HPET code, and as a result
>> this patch looks very complicated.  It is probably best viewed as an end
>> result, with the following notes explaining what is going on.
>>
>> The new logic is as follows:
>>  * A single high priority vector is allocated and used on all cpus.
>>  * Reliance on the irq infrastructure is completely removed.
>>  * Tracking of free hpet channels has changed.  It is now an individual
>>    bitmap, and allocation is based on winning a test_and_clear_bit()
>>    operation.
>>  * There is a notion of strict ownership of hpet channels.
>>  ** A cpu which owns an HPET channel can program it for a desired deadline.
>>  ** A cpu which can't find a free HPET channel to own may register for
>>     being woken up by another in-use HPET which will fire at an
>>     appropriate time.
>>  * Some functions have been renamed to be more descriptive.  Some functions
>>    have parameters changed to be more consistent.
>>  * Any function with a __hpet prefix expects the appropriate lock to be
>>    held.
>
> It certainly looks like the code should be easier to understand after
> this!  I'll try to read through the end result later in the week, but
> I have a few questions from the patch:
>
>> -static void handle_hpet_broadcast(struct hpet_event_channel *ch)
>> +static void __hpet_interrupt(struct hpet_event_channel *ch)
>> {
> [...]
>> +    __hpet_program_time(ch, this_cpu(timer_deadline), NOW(), 1);
>> +    raise_softirq(TIMER_SOFTIRQ);
>> }
>
> What's the __hpet_program_time() doing?  It looks like we're
> reprogramming the hpet for our next event, before we handle the event
> we woke up for (i.e. always setting it to fire immediately).  Or have
> I misunderstood?

Hmm yes - on further consideration this is silly.  When moving the
logic around I did try to err on the side of what the old logic did
wrt the internal timing.

However, as we will unconditionally wander through
hpet_broadcast_exit() and around to hpet_broadcast_enter() again,
reprogramming the channel at this point is crazy.

>> @@ -619,9 +425,8 @@ void __init hpet_broadcast_init(void)
>>          hpet_events[i].shift = 32;
>>          hpet_events[i].next_event = STIME_MAX;
>>          spin_lock_init(&hpet_events[i].lock);
>> -        wmb();
>> -        hpet_events[i].event_handler = handle_hpet_broadcast;
>>
>> +        hpet_events[1].msi.irq = -1;
>
> Really [1] or DYM [i]?

Oops.  Good catch.  The font in my terminal makes those even harder to
distinguish.

>>          hpet_events[i].msi.msi_attrib.maskbit = 1;
>>          hpet_events[i].msi.msi_attrib.pos = MSI_TYPE_HPET;
>>      }
>> @@ -661,9 +466,6 @@ void hpet_broadcast_resume(void)
>>
>>      for ( i = 0; i < n; i++ )
>>      {
>> -        if ( hpet_events[i].msi.irq >= 0 )
>> -            __hpet_setup_msi_irq(irq_to_desc(hpet_events[i].msi.irq));
>> -
>>          /* set HPET Tn as oneshot */
>>          cfg = hpet_read32(HPET_Tn_CFG(hpet_events[i].idx));
>>          cfg &= ~(HPET_TN_LEVEL | HPET_TN_PERIODIC);
>> @@ -706,36 +508,76 @@ void hpet_disable_legacy_broadcast(void)
>>  void hpet_broadcast_enter(void)
>>  {
>>      unsigned int cpu = smp_processor_id();
>> -    struct hpet_event_channel *ch = per_cpu(cpu_bc_channel, cpu);
>> +    struct hpet_event_channel *ch = this_cpu(hpet_channel);
>>      s_time_t deadline = this_cpu(timer_deadline);
>>
>>      ASSERT(!local_irq_is_enabled());
>> +    ASSERT(ch == NULL);
>>
>>      if ( deadline == 0 )
>>          return;
>>
>> -    if ( !ch )
>> -        ch = hpet_get_channel(cpu);
>> +    ch = hpet_get_free_channel();
>>
>> +    if ( ch )
>> +    {
>> +        /* This really should be an MSI channel by this point */
>> +        ASSERT( !(ch->flags & HPET_EVT_LEGACY) );
>> +
>> +        spin_lock(&ch->lock);
>> +
>> +        this_cpu(hpet_channel) = ch;
>> +        ch->cpu = cpu;
>> +        cpumask_set_cpu(cpu, ch->cpumask);
>> +
>> +        __hpet_setup_msi(ch);
>> +        __hpet_program_time(ch, deadline, NOW(), 1);
>> +        __hpet_msi_unmask(ch);
>> +
>> +        spin_unlock(&ch->lock);
>> +
>> +    }
>> +    else
>> +    {
>> +        /* TODO - this seems very ugly */
>> +        unsigned i;
>> +
>> +        for ( i = 0; i < num_hpets_used; ++i )
>> +        {
>> +            ch = &hpet_events[i];
>> +            spin_lock(&ch->lock);
>> +
>> +            if ( ch->cpu == -1 )
>> +                goto continue_search;
>> +
>> +            if ( ch->next_event >= deadline - MICROSECS(50) &&
>> +                 ch->next_event <= deadline )
>> +                break;
>> +
>> +    continue_search:
>> +            spin_unlock(&ch->lock);
>> +            ch = NULL;
>> +        }
>> +
>> +        if ( ch )
>> +        {
>> +            cpumask_set_cpu(cpu, ch->cpumask);
>> +            this_cpu(hpet_channel) = ch;
>> +            spin_unlock(&ch->lock);
>> +        }
>> +        else
>> +            this_cpu(timer_deadline) = NOW();
>
> Hmm.  So if we can't find a channel that fits the deadline we want,
> we give up?  I thought the plan was to cause some other channel to
> fire early.
>
> Tim.

I considered that, but (experimentally) the typical period of time
asked for is forced upwards by MIN_DELTA_NS, meaning that another cpu
which can't find an HPET is almost certainly going to find a valid one
given 50us slack.

I decided that, in the rare case where this might occur, wandering
around the idle loop and trying again was rather safer than
reprogramming the hpet to be sooner, which then needs to have -ETIME
logic and emulated wakeup in the case that the deadline was missed.

~Andrew
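[For reference: the selection criterion being debated above - "an
in-use channel already programmed to fire no more than 50us before our
deadline, and not after it" - can be modelled outside Xen as below.
Locking and MMIO are elided, and the struct and function names are
illustrative, not the patch's:]

```c
#include <assert.h>
#include <stdint.h>

#define NUM_HPETS 4
#define MICROSECS(us) ((int64_t)(us) * 1000)   /* nanoseconds, as in Xen */

struct hpet_chan {
    int cpu;                 /* -1 => unowned; cannot piggy-back on it */
    int64_t next_event;      /* absolute deadline of its next interrupt, ns */
};

/* Mirror of the loop in hpet_broadcast_enter(), minus the per-channel
 * locking: find an owned channel whose programmed event falls inside
 * [deadline - 50us, deadline].  Returns its index, or -1. */
static int find_suitable_channel(const struct hpet_chan *ch, unsigned int n,
                                 int64_t deadline)
{
    unsigned int i;

    for ( i = 0; i < n; ++i )
    {
        if ( ch[i].cpu == -1 )
            continue;

        if ( ch[i].next_event >= deadline - MICROSECS(50) &&
             ch[i].next_event <= deadline )
            return i;
    }

    return -1;
}
```

The two-sided window is the crux of Tim's question: a channel firing
after the deadline is never acceptable, and rather than shortening such
a channel (which would need -ETIME handling if the reprogramming
races), the patch lets the cpu fall back round the idle loop and retry.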
>>> On 08.11.13 at 11:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 08/11/13 09:49, Jan Beulich wrote:
>>>>> On 07.11.13 at 16:28, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> * Change some instances of per_cpu($foo, cpu) to this_cpu($foo).  It is
>>>   cleaner to read, and makes it more obvious when the code is poking
>>>   around in another cpu's data.
>> Documentation wise this is certainly desirable, but since our this_cpu(),
>> other than e.g. Linux's equivalents, internally does another
>> smp_processor_id(), generated code becomes worse.
>
> With an "unsigned int cpu = smp_processor_id()" in context, the
> generated code appears to be identical.

In the case here yes.  But that changes if there's e.g. a function
call between the two accesses.

>>> * Convert hpet_next_event() to taking a struct hpet_event_channel *, and
>>>   rename to __hpet_set_counter() for a more accurate description of its
>>>   actions.
>> I very much think we should get away from the habit of prefixing
>> symbols with two underscores: Such names are reserved to the
>> compiler/platform, and the standard specifically reserves names
>> starting with one underscore (and not followed by an upper case
>> letter) for use as file scope symbols.  This is what I'd like to
>> request be done here.
>
> So what would you suggest?  Here, all the __$foo() are designed to end
> up similar to large swathes of other Xen code, implying that the
> appropriate spinlock should be held by the caller.
>
> I am not too fussed which convention we use, so long as it is used
> consistently.

We obviously can't switch to the single-underscore model in one go.
But there are examples of this in the tree already, and I'm trying to
stick to this unless someone tells me not to.  In the end it'll really
depend on others' opinions (albeit avoiding to violate the standard is
imo sufficiently strong an argument by itself).

Jan
At 11:40 +0000 on 08 Nov (1383907234), Jan Beulich wrote:
> >>> On 08.11.13 at 11:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > On 08/11/13 09:49, Jan Beulich wrote:
> >> I very much think we should get away from the habit of prefixing
> >> symbols with two underscores: Such names are reserved to the
> >> compiler/platform, and the standard specifically reserves names
> >> starting with one underscore (and not followed by an upper case
> >> letter) for use as file scope symbols.  This is what I'd like to
> >> request be done here.
> >
> > So what would you suggest?  Here, all the __$foo() are designed to end
> > up similar to large swathes of other Xen code, implying that the
> > appropriate spinlock should be held by the caller.
> >
> > I am not too fussed which convention we use, so long as it is used
> > consistently.
>
> We obviously can't switch to the single-underscore model in one
> go.  But there are examples of this in the tree already, and I'm
> trying to stick to this unless someone tells me not to.  In the end
> it'll really depend on others' opinions (albeit avoiding to violate the
> standard is imo sufficiently strong an argument by itself).

IIRC the __ reservation is a posix thing, not a C one, so it doesn't
apply to us kernel hackers. :)  But actually I would prefer to avoid
using underscore prefixes altogether; I think they're ugly and
sometimes slightly lazy.  (This is a relatively recent conversion -- I
realise the shadow code is full of them!)

I'd be happy with an explicit CODING_STYLE convention that said we
would use them only for the trivial locked vs unlocked distinction.

Cheers,

Tim.
>>> On 08.11.13 at 17:39, Tim Deegan <tim@xen.org> wrote:
> At 11:40 +0000 on 08 Nov (1383907234), Jan Beulich wrote:
>> >>> On 08.11.13 at 11:37, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> > On 08/11/13 09:49, Jan Beulich wrote:
>> >> I very much think we should get away from the habit of prefixing
>> >> symbols with two underscores: Such names are reserved to the
>> >> compiler/platform, and the standard specifically reserves names
>> >> starting with one underscore (and not followed by an upper case
>> >> letter) for use as file scope symbols.  This is what I'd like to
>> >> request be done here.
>> >
>> > So what would you suggest?  Here, all the __$foo() are designed to end
>> > up similar to large swathes of other Xen code, implying that the
>> > appropriate spinlock should be held by the caller.
>> >
>> > I am not too fussed which convention we use, so long as it is used
>> > consistently.
>>
>> We obviously can't switch to the single-underscore model in one
>> go.  But there are examples of this in the tree already, and I'm
>> trying to stick to this unless someone tells me not to.  In the end
>> it'll really depend on others' opinions (albeit avoiding to violate the
>> standard is imo sufficiently strong an argument by itself).
>
> IIRC the __ reservation is a posix thing, not a C one, so it doesn't
> apply to us kernel hackers. :)  But actually I would prefer to avoid
> using underscore prefixes altogether; I think they're ugly and
> sometimes slightly lazy.  (This is a relatively recent conversion -- I
> realise the shadow code is full of them!)

No, this is part of the C standard, in its description of the
requirements of a conforming library implementation:

"— All identifiers that begin with an underscore and either an
uppercase letter or another underscore are always reserved for any
use.
— All identifiers that begin with an underscore are always reserved
for use as identifiers with file scope in both the ordinary and tag
name spaces."

Irrespective of its placement in the library description, I take this
as a general rule - after all, the compiler itself also has to have a
name space it can use for itself or for communicating with the
library.  And while gcc (as an example) generally prefixes internal
stuff with __builtin_, that's neither uniformly the case there
(exceptions are e.g. extended keywords), nor a guarantee for the
future, or for other compilers.  Hence best for us to stay away from
using reserved names in ways other than what they're reserved for.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
On 08/11/13 10:37, Andrew Cooper wrote:
>
> So what would you suggest?  Here, all the __$foo() are designed to end
> up similar to large swathes of other Xen code, implying that the
> appropriate spinlock should be held by the caller.
>
> I am not too fussed which convention we use, so long as it is used
> consistently.

I've sometimes used a _locked suffix for this.

David
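[As a footnote: the "_locked" convention David mentions can be
illustrated with a small stand-alone sketch.  The names and the toy
single-threaded lock are hypothetical, chosen only to keep the example
self-contained; they are not Xen code:]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock so the demo needs no threading library. */
struct counter {
    bool locked;
    int value;
};

static void ctr_lock(struct counter *c)   { assert(!c->locked); c->locked = true; }
static void ctr_unlock(struct counter *c) { assert(c->locked);  c->locked = false; }

/* The "_locked" suffix states the contract in the name: the caller
 * must already hold the lock.  This conveys the same information as a
 * __ prefix without treading on names the C standard reserves. */
static void counter_add_locked(struct counter *c, int n)
{
    assert(c->locked);      /* contract check */
    c->value += n;
}

/* Public wrapper: takes the lock, then calls the _locked worker. */
static void counter_add(struct counter *c, int n)
{
    ctr_lock(c);
    counter_add_locked(c, n);
    ctr_unlock(c);
}
```

The wrapper/worker split also lets a caller that already holds the lock
(e.g. from inside another _locked function) call the worker directly
without deadlocking on a non-recursive lock.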