George Dunlap
2010-Apr-14 10:26 UTC
[Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
This patch series introduces the credit2 scheduler. The first three patches introduce changes needed for credit2's shared-runqueue functionality to work properly; the last two implement the scheduler itself. The scheduler is still in the experimental phase. There's plenty of opportunity to contribute along independent lines of development; email George Dunlap <george.dunlap@eu.citrix.com> or check the wiki page http://wiki.xensource.com/xenwiki/Credit2_Scheduler_Development for ideas and status updates.

19 files changed, 1453 insertions(+), 21 deletions(-)

 tools/libxc/Makefile                      |    1
 tools/libxc/xc_csched2.c                  |   50 +
 tools/libxc/xenctrl.h                     |    8
 tools/python/xen/lowlevel/xc/xc.c         |   58 +
 tools/python/xen/xend/XendAPI.py          |    3
 tools/python/xen/xend/XendDomain.py       |   54 +
 tools/python/xen/xend/XendDomainInfo.py   |    4
 tools/python/xen/xend/XendNode.py         |    4
 tools/python/xen/xend/XendVMMetrics.py    |    1
 tools/python/xen/xend/server/SrvDomain.py |   14
 tools/python/xen/xm/main.py               |   82 ++
 xen/arch/ia64/vmx/vmmu.c                  |    6
 xen/common/Makefile                       |    1
 xen/common/sched_credit.c                 |    8
 xen/common/sched_credit2.c                | 1125 +++++++++++++++++++++++++++++
 xen/common/schedule.c                     |   22
 xen/include/public/domctl.h               |    4
 xen/include/public/trace.h                |    1
 xen/include/xen/sched-if.h                |   28

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
George Dunlap
2010-Apr-14 10:26 UTC
[Xen-devel] [PATCH 1 of 5] credit2: Add context_saved scheduler callback
2 files changed, 3 insertions(+)

 xen/common/schedule.c      | 2 ++
 xen/include/xen/sched-if.h | 1 +

Because credit2 shares a runqueue between several cpus, it needs to know when a scheduled-out vcpu has finally been context-switched away, so that it can be added to the runqueue again. (Otherwise it may be grabbed by another processor before its context has been properly saved.)

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r c02cc832cb2d -r 2631707c54b3 xen/common/schedule.c
--- a/xen/common/schedule.c	Tue Apr 13 18:19:33 2010 +0100
+++ b/xen/common/schedule.c	Wed Apr 14 11:16:58 2010 +0100
@@ -923,6 +923,8 @@
     /* Check for migration request /after/ clearing running flag. */
     smp_mb();

+    SCHED_OP(context_saved, prev);
+
     if ( unlikely(test_bit(_VPF_migrating, &prev->pause_flags)) )
         vcpu_migrate(prev);
 }
diff -r c02cc832cb2d -r 2631707c54b3 xen/include/xen/sched-if.h
--- a/xen/include/xen/sched-if.h	Tue Apr 13 18:19:33 2010 +0100
+++ b/xen/include/xen/sched-if.h	Wed Apr 14 11:16:58 2010 +0100
@@ -70,6 +70,7 @@
     void         (*sleep)          (struct vcpu *);
     void         (*wake)           (struct vcpu *);
+    void         (*context_saved)  (struct vcpu *);

     struct task_slice (*do_schedule) (s_time_t);
George Dunlap
2010-Apr-14 10:26 UTC
[Xen-devel] [PATCH 2 of 5] credit2: Flexible cpu-to-schedule-spinlock mappings
4 files changed, 40 insertions(+), 19 deletions(-)

 xen/arch/ia64/vmx/vmmu.c   |  6 +++---
 xen/common/sched_credit.c  |  8 ++++----
 xen/common/schedule.c      | 18 ++++++++++--------
 xen/include/xen/sched-if.h | 27 +++++++++++++++++++++++----

Credit2 shares a runqueue between several cpus. Rather than use double locking and deal with cpu-to-runqueue races, allow the scheduler to redefine the sched_lock-to-cpu mapping.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 2631707c54b3 -r 21d0f640b0c0 xen/arch/ia64/vmx/vmmu.c
--- a/xen/arch/ia64/vmx/vmmu.c	Wed Apr 14 11:16:58 2010 +0100
+++ b/xen/arch/ia64/vmx/vmmu.c	Wed Apr 14 11:16:58 2010 +0100
@@ -394,7 +394,7 @@
     if (cpu != current->processor)
         return;
     local_irq_save(flags);
-    if (!spin_trylock(&per_cpu(schedule_data, cpu).schedule_lock))
+    if (!spin_trylock(per_cpu(schedule_data, cpu).schedule_lock))
         goto bail2;
     if (v->processor != cpu)
         goto bail1;
@@ -416,7 +416,7 @@
     ia64_dv_serialize_data();
     args->vcpu = NULL;
 bail1:
-    spin_unlock(&per_cpu(schedule_data, cpu).schedule_lock);
+    spin_unlock(per_cpu(schedule_data, cpu).schedule_lock);
 bail2:
     local_irq_restore(flags);
 }
@@ -446,7 +446,7 @@
     do {
         cpu = v->processor;
         if (cpu != current->processor) {
-            spin_barrier(&per_cpu(schedule_data, cpu).schedule_lock);
+            spin_barrier(per_cpu(schedule_data, cpu).schedule_lock);
             /* Flush VHPT on remote processors. */
             smp_call_function_single(cpu, &ptc_ga_remote_func, &args, 1);
         } else {
diff -r 2631707c54b3 -r 21d0f640b0c0 xen/common/sched_credit.c
--- a/xen/common/sched_credit.c	Wed Apr 14 11:16:58 2010 +0100
+++ b/xen/common/sched_credit.c	Wed Apr 14 11:16:58 2010 +0100
@@ -789,7 +789,7 @@
     spc->runq_sort_last = sort_epoch;

-    spin_lock_irqsave(&per_cpu(schedule_data, cpu).schedule_lock, flags);
+    spin_lock_irqsave(per_cpu(schedule_data, cpu).schedule_lock, flags);

     runq = &spc->runq;
     elem = runq->next;
@@ -814,7 +814,7 @@
         elem = next;
     }

-    spin_unlock_irqrestore(&per_cpu(schedule_data, cpu).schedule_lock, flags);
+    spin_unlock_irqrestore(per_cpu(schedule_data, cpu).schedule_lock, flags);
 }

 static void
@@ -1130,7 +1130,7 @@
          * cause a deadlock if the peer CPU is also load balancing and trying
          * to lock this CPU.
          */
-        if ( !spin_trylock(&per_cpu(schedule_data, peer_cpu).schedule_lock) )
+        if ( !spin_trylock(per_cpu(schedule_data, peer_cpu).schedule_lock) )
         {
             CSCHED_STAT_CRANK(steal_trylock_failed);
             continue;
@@ -1140,7 +1140,7 @@
          * Any work over there to steal?
          */
         speer = csched_runq_steal(peer_cpu, cpu, snext->pri);
-        spin_unlock(&per_cpu(schedule_data, peer_cpu).schedule_lock);
+        spin_unlock(per_cpu(schedule_data, peer_cpu).schedule_lock);
         if ( speer != NULL )
             return speer;
     }
diff -r 2631707c54b3 -r 21d0f640b0c0 xen/common/schedule.c
--- a/xen/common/schedule.c	Wed Apr 14 11:16:58 2010 +0100
+++ b/xen/common/schedule.c	Wed Apr 14 11:16:58 2010 +0100
@@ -131,7 +131,7 @@
     s_time_t delta;

     ASSERT(v->runstate.state != new_state);
-    ASSERT(spin_is_locked(&per_cpu(schedule_data,v->processor).schedule_lock));
+    ASSERT(spin_is_locked(per_cpu(schedule_data,v->processor).schedule_lock));

     vcpu_urgent_count_update(v);

@@ -340,7 +340,7 @@
     /* Switch to new CPU, then unlock old CPU. */
     v->processor = new_cpu;
     spin_unlock_irqrestore(
-        &per_cpu(schedule_data, old_cpu).schedule_lock, flags);
+        per_cpu(schedule_data, old_cpu).schedule_lock, flags);

     /* Wake on new CPU. */
     vcpu_wake(v);
@@ -846,7 +846,7 @@
     sd = &this_cpu(schedule_data);

-    spin_lock_irq(&sd->schedule_lock);
+    spin_lock_irq(sd->schedule_lock);

     stop_timer(&sd->s_timer);

@@ -862,7 +862,7 @@
     if ( unlikely(prev == next) )
     {
-        spin_unlock_irq(&sd->schedule_lock);
+        spin_unlock_irq(sd->schedule_lock);
         trace_continue_running(next);
         return continue_running(prev);
     }
@@ -900,7 +900,7 @@
     ASSERT(!next->is_running);
     next->is_running = 1;

-    spin_unlock_irq(&sd->schedule_lock);
+    spin_unlock_irq(sd->schedule_lock);

     perfc_incr(sched_ctx);

@@ -968,7 +968,9 @@
     for_each_possible_cpu ( i )
     {
-        spin_lock_init(&per_cpu(schedule_data, i).schedule_lock);
+        spin_lock_init(&per_cpu(schedule_data, i)._lock);
+        per_cpu(schedule_data, i).schedule_lock
+            = &per_cpu(schedule_data, i)._lock;
         init_timer(&per_cpu(schedule_data, i).s_timer, s_timer_fn, NULL, i);
     }

@@ -1005,10 +1007,10 @@
     for_each_online_cpu ( i )
     {
-        spin_lock(&per_cpu(schedule_data, i).schedule_lock);
+        spin_lock(per_cpu(schedule_data, i).schedule_lock);
         printk("CPU[%02d] ", i);
         SCHED_OP(dump_cpu_state, i);
-        spin_unlock(&per_cpu(schedule_data, i).schedule_lock);
+        spin_unlock(per_cpu(schedule_data, i).schedule_lock);
     }

     local_irq_restore(flags);
diff -r 2631707c54b3 -r 21d0f640b0c0 xen/include/xen/sched-if.h
--- a/xen/include/xen/sched-if.h	Wed Apr 14 11:16:58 2010 +0100
+++ b/xen/include/xen/sched-if.h	Wed Apr 14 11:16:58 2010 +0100
@@ -10,8 +10,19 @@

 #include <xen/percpu.h>

+/*
+ * In order to allow a scheduler to remap the lock->cpu mapping,
+ * we have a per-cpu pointer, along with a pre-allocated set of
+ * locks.  The generic schedule init code will point each schedule lock
+ * pointer to the schedule lock; if the scheduler wants to remap them,
+ * it can simply modify the schedule locks.
+ *
+ * For cache efficiency, keep the actual lock in the same cache area
+ * as the rest of the struct.  Just have the scheduler point to the
+ * one it wants (this may be the one right in front of it).
+ */
 struct schedule_data {
-    spinlock_t          schedule_lock;  /* spinlock protecting curr        */
+    spinlock_t         *schedule_lock,
+                        _lock;
     struct vcpu        *curr;           /* current task                    */
     struct vcpu        *idle;           /* idle task for this cpu          */
     void               *sched_priv;
@@ -27,11 +38,19 @@

     for ( ; ; )
     {
+        /* NB: For schedulers with multiple cores per runqueue,
+         * a vcpu may change processor w/o changing runqueues;
+         * so we may release a lock only to grab it again.
+         *
+         * If that is measured to be an issue, then the check
+         * should be changed to checking if the locks pointed to
+         * by cpu and v->processor are still the same.
+         */
         cpu = v->processor;
-        spin_lock(&per_cpu(schedule_data, cpu).schedule_lock);
+        spin_lock(per_cpu(schedule_data, cpu).schedule_lock);
         if ( likely(v->processor == cpu) )
             break;
-        spin_unlock(&per_cpu(schedule_data, cpu).schedule_lock);
+        spin_unlock(per_cpu(schedule_data, cpu).schedule_lock);
     }
 }

@@ -42,7 +61,7 @@
 static inline void vcpu_schedule_unlock(struct vcpu *v)
 {
-    spin_unlock(&per_cpu(schedule_data, v->processor).schedule_lock);
+    spin_unlock(per_cpu(schedule_data, v->processor).schedule_lock);
 }

 #define vcpu_schedule_unlock_irq(v)  \
George Dunlap
2010-Apr-14 10:26 UTC
[Xen-devel] [PATCH 3 of 5] credit2: Add a scheduler-specific schedule trace class
1 file changed, 1 insertion(+)

 xen/include/public/trace.h | 1 +

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 21d0f640b0c0 -r 68636d5fb3df xen/include/public/trace.h
--- a/xen/include/public/trace.h	Wed Apr 14 11:16:58 2010 +0100
+++ b/xen/include/public/trace.h	Wed Apr 14 11:16:58 2010 +0100
@@ -53,6 +53,7 @@
 #define TRC_HVM_HANDLER  0x00082000    /* various HVM handlers      */

 #define TRC_SCHED_MIN    0x00021000    /* Just runstate changes */
+#define TRC_SCHED_CLASS  0x00022000    /* Scheduler-specific    */
 #define TRC_SCHED_VERBOSE 0x00028000   /* More inclusive scheduling */

 /* Trace events per class */
George Dunlap
2010-Apr-14 10:26 UTC
[Xen-devel] [PATCH 4 of 5] credit2: Add credit2 scheduler to hypervisor
4 files changed, 1132 insertions(+)

 xen/common/Makefile         |    1
 xen/common/sched_credit2.c  | 1125 +++++++++++++++++++++++++++++++++++++++++++
 xen/common/schedule.c       |    2
 xen/include/public/domctl.h |    4

This is the core credit2 patch. It adds the new credit2 scheduler to the hypervisor, as the non-default scheduler. It should be emphasized that this is still in the development phase, and is probably still unstable. It is known to be suboptimal for multi-socket systems.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 68636d5fb3df -r 1cdbec67f224 xen/common/Makefile
--- a/xen/common/Makefile	Wed Apr 14 11:16:58 2010 +0100
+++ b/xen/common/Makefile	Wed Apr 14 11:16:58 2010 +0100
@@ -13,6 +13,7 @@
 obj-y += page_alloc.o
 obj-y += rangeset.o
 obj-y += sched_credit.o
+obj-y += sched_credit2.o
 obj-y += sched_sedf.o
 obj-y += schedule.o
 obj-y += shutdown.o
diff -r 68636d5fb3df -r 1cdbec67f224 xen/common/sched_credit2.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/common/sched_credit2.c	Wed Apr 14 11:16:58 2010 +0100
@@ -0,0 +1,1125 @@
+
+/****************************************************************************
+ * (C) 2009 - George Dunlap - Citrix Systems R&D UK, Ltd
+ ****************************************************************************
+ *
+ *        File: common/csched_credit2.c
+ *      Author: George Dunlap
+ *
+ * Description: Credit-based SMP CPU scheduler
+ * Based on an earlier version by Emmanuel Ackaouy.
+ */
+
+#include <xen/config.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/sched.h>
+#include <xen/domain.h>
+#include <xen/delay.h>
+#include <xen/event.h>
+#include <xen/time.h>
+#include <xen/perfc.h>
+#include <xen/sched-if.h>
+#include <xen/softirq.h>
+#include <asm/atomic.h>
+#include <xen/errno.h>
+#include <xen/trace.h>
+
+#if __i386__
+#define PRI_stime "lld"
+#else
+#define PRI_stime "ld"
+#endif
+
+#define d2printk(x...)
+//#define d2printk printk
+
+#define TRC_CSCHED2_TICK         TRC_SCHED_CLASS + 1
+#define TRC_CSCHED2_RUNQ_POS     TRC_SCHED_CLASS + 2
+#define TRC_CSCHED2_CREDIT_BURN  TRC_SCHED_CLASS + 3
+#define TRC_CSCHED2_CREDIT_ADD   TRC_SCHED_CLASS + 4
+#define TRC_CSCHED2_TICKLE_CHECK TRC_SCHED_CLASS + 5
+
+/*
+ * WARNING: This is still in an experimental phase.  Status and work can be
+ * found at the credit2 wiki page:
+ *  http://wiki.xensource.com/xenwiki/Credit2_Scheduler_Development
+ * TODO:
+ * + Immediate bug-fixes
+ *  - Do per-runqueue, grab proper lock for dump debugkey
+ * + Multiple sockets
+ *  - Detect cpu layout and make runqueue map, one per L2 (make_runq_map())
+ *  - Simple load balancer / runqueue assignment
+ *  - Runqueue load measurement
+ *  - Load-based load balancer
+ * + Hyperthreading
+ *  - Look for non-busy core if possible
+ *  - "Discount" time run on a thread with busy siblings
+ * + Algorithm:
+ *  - "Mixed work" problem: if a VM is playing audio (5%) but also burning
+ *    cpu (e.g., a flash animation in the background), can we schedule it
+ *    with low enough latency so that audio doesn't skip?
+ *  - Cap and reservation: How to implement with the current system?
+ * + Optimizing
+ *  - Profiling, making new algorithms, making math more efficient
+ *    (no long division)
+ */
+
+/*
+ * Design:
+ *
+ * VMs "burn" credits based on their weight; higher weight means
+ * credits burn more slowly.  The highest weight vcpu burns credits at
+ * a rate of 1 credit per nanosecond.  Others burn proportionally
+ * more.
+ *
+ * vcpus are inserted into the runqueue by credit order.
+ *
+ * Credits are "reset" when the next vcpu in the runqueue is less than
+ * or equal to zero.  At that point, everyone's credits are "clipped"
+ * to a small value, and a fixed credit is added to everyone.
+ *
+ * The plan is for all cores that share an L2 to share the same
+ * runqueue.  At the moment, there is one global runqueue for all
+ * cores.
+ */
+
+/*
+ * Locking:
+ * - Schedule-lock is per-runqueue
+ *  + Protects runqueue data, runqueue insertion, &c
+ *  + Also protects updates to private sched vcpu structure
+ *  + Must be grabbed using vcpu_schedule_lock_irq() to make sure
+ *    vcpu->processor doesn't change under our feet.
+ * - Private data lock
+ *  + Protects access to global domain list
+ *  + All other private data is written at init and only read afterwards.
+ * Ordering:
+ * - We grab private->schedule when updating domain weight; so we
+ *   must never grab private if a schedule lock is held.
+ */
+
+/*
+ * Basic constants
+ */
+/* Default weight: How much a new domain starts with */
+#define CSCHED_DEFAULT_WEIGHT       256
+/* Min timer: Minimum length a timer will be set, to
+ * achieve efficiency */
+#define CSCHED_MIN_TIMER            MICROSECS(500)
+/* Amount of credit VMs begin with, and are reset to.
+ * ATM, set so that highest-weight VMs can only run for 10ms
+ * before a reset event. */
+#define CSCHED_CREDIT_INIT          MILLISECS(10)
+/* Carryover: How much "extra" credit may be carried over after
+ * a reset. */
+#define CSCHED_CARRYOVER_MAX        CSCHED_MIN_TIMER
+/* Reset: Value below which credit will be reset. */
+#define CSCHED_CREDIT_RESET         0
+/* Max timer: Maximum time a guest can be run for. */
+#define CSCHED_MAX_TIMER            MILLISECS(2)
+
+
+#define CSCHED_IDLE_CREDIT          (-(1<<30))
+
+/*
+ * Flags
+ */
+/* CSFLAG_scheduled: Is this vcpu either running on, or context-switching off,
+ * a physical cpu?
+ * + Accessed only with runqueue lock held
+ * + Set when chosen as next in csched_schedule().
+ * + Cleared after context switch has been saved in csched_context_saved()
+ * + Checked in vcpu_wake to see if we can add to the runqueue, or if we should
+ *   set CSFLAG_delayed_runq_add
+ * + Checked to be false in runq_insert.
+ */
+#define __CSFLAG_scheduled 1
+#define CSFLAG_scheduled (1<<__CSFLAG_scheduled)
+/* CSFLAG_delayed_runq_add: Do we need to add this to the runqueue once it's
+ * done being context switched out?
+ * + Set when scheduling out in csched_schedule() if prev is runnable
+ * + Set in csched_vcpu_wake if it finds CSFLAG_scheduled set
+ * + Read in csched_context_saved().  If set, it adds prev to the runqueue and
+ *   clears the bit.
+ */
+#define __CSFLAG_delayed_runq_add 2
+#define CSFLAG_delayed_runq_add (1<<__CSFLAG_delayed_runq_add)
+
+
+/*
+ * Useful macros
+ */
+#define CSCHED_VCPU(_vcpu)  ((struct csched_vcpu *) (_vcpu)->sched_priv)
+#define CSCHED_DOM(_dom)    ((struct csched_dom *) (_dom)->sched_priv)
+/* CPU to runq_id macro */
+#define c2r(_cpu)           (csched_priv.runq_map[(_cpu)])
+/* CPU to runqueue struct macro */
+#define RQD(_cpu)           (&csched_priv.rqd[c2r(_cpu)])
+
+/*
+ * Per-runqueue data
+ */
+struct csched_runqueue_data {
+    int id;
+    struct list_head runq; /* Ordered list of runnable vms */
+    struct list_head svc;  /* List of all vcpus assigned to this runqueue */
+    int max_weight;
+    int cpu_min, cpu_max;  /* Range of physical cpus this runqueue runs */
+};
+
+/*
+ * System-wide private data
+ */
+struct csched_private {
+    spinlock_t lock;
+    uint32_t ncpus;
+    struct domain *idle_domain;
+
+    struct list_head sdom; /* Used mostly for dump keyhandler. */
+
+    int runq_map[NR_CPUS];
+    uint32_t runq_count;
+    struct csched_runqueue_data rqd[NR_CPUS];
+};
+
+/*
+ * Virtual CPU
+ */
+struct csched_vcpu {
+    struct list_head rqd_elem;  /* On the runqueue data list */
+    struct list_head sdom_elem; /* On the domain vcpu list */
+    struct list_head runq_elem; /* On the runqueue */
+
+    /* Up-pointers */
+    struct csched_dom *sdom;
+    struct vcpu *vcpu;
+
+    int weight;
+
+    int credit;
+    s_time_t start_time; /* When we were scheduled (used for credit) */
+    unsigned flags;      /* 16 bits doesn't seem to play well with clear_bit() */
+
+};
+
+/*
+ * Domain
+ */
+struct csched_dom {
+    struct list_head vcpu;
+    struct list_head sdom_elem;
+    struct domain *dom;
+    uint16_t weight;
+    uint16_t nr_vcpus;
+};
+
+
+/*
+ * Global variables
+ */
+static struct csched_private csched_priv;
+
+/*
+ * Time-to-credit, credit-to-time.
+ * FIXME: Do pre-calculated division?
+ */
+static s_time_t t2c(struct csched_runqueue_data *rqd, s_time_t time, struct csched_vcpu *svc)
+{
+    return time * rqd->max_weight / svc->weight;
+}
+
+static s_time_t c2t(struct csched_runqueue_data *rqd, s_time_t credit, struct csched_vcpu *svc)
+{
+    return credit * svc->weight / rqd->max_weight;
+}
+
+/*
+ * Runqueue related code
+ */
+
+static /*inline*/ int
+__vcpu_on_runq(struct csched_vcpu *svc)
+{
+    return !list_empty(&svc->runq_elem);
+}
+
+static /*inline*/ struct csched_vcpu *
+__runq_elem(struct list_head *elem)
+{
+    return list_entry(elem, struct csched_vcpu, runq_elem);
+}
+
+static int
+__runq_insert(struct list_head *runq, struct csched_vcpu *svc)
+{
+    struct list_head *iter;
+    int pos = 0;
+
+    d2printk("rqi d%dv%d\n",
+             svc->vcpu->domain->domain_id,
+             svc->vcpu->vcpu_id);
+
+    /* Idle vcpus not allowed on the runqueue anymore */
+    BUG_ON(is_idle_vcpu(svc->vcpu));
+    BUG_ON(svc->vcpu->is_running);
+    BUG_ON(test_bit(__CSFLAG_scheduled, &svc->flags));
+
+    list_for_each( iter, runq )
+    {
+        struct csched_vcpu * iter_svc = __runq_elem(iter);
+
+        if ( svc->credit > iter_svc->credit )
+        {
+            d2printk(" p%d d%dv%d\n",
+                     pos,
+                     iter_svc->vcpu->domain->domain_id,
+                     iter_svc->vcpu->vcpu_id);
+            break;
+        }
+        pos++;
+    }
+
+    list_add_tail(&svc->runq_elem, iter);
+
+    return pos;
+}
+
+static void
+runq_insert(unsigned int cpu, struct csched_vcpu *svc)
+{
+    struct list_head * runq = &RQD(cpu)->runq;
+    int pos = 0;
+
+    ASSERT( spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock) );
+
+    BUG_ON( __vcpu_on_runq(svc) );
+    BUG_ON( c2r(cpu) != c2r(svc->vcpu->processor) );
+
+    pos = __runq_insert(runq, svc);
+
+    {
+        struct {
+            unsigned dom:16,vcpu:16;
+            unsigned pos;
+        } d;
+        d.dom = svc->vcpu->domain->domain_id;
+        d.vcpu = svc->vcpu->vcpu_id;
+        d.pos = pos;
+        trace_var(TRC_CSCHED2_RUNQ_POS, 1,
+                  sizeof(d),
+                  (unsigned char *)&d);
+    }
+
+    return;
+}
+
+static inline void
+__runq_remove(struct csched_vcpu *svc)
+{
+    BUG_ON( !__vcpu_on_runq(svc) );
+    list_del_init(&svc->runq_elem);
+}
+
+void burn_credits(struct csched_runqueue_data *rqd, struct csched_vcpu *, s_time_t);
+
+/* Check to see if the item on the runqueue is higher priority than what's
+ * currently running; if so, wake up the processor */
+static /*inline*/ void
+runq_tickle(unsigned int cpu, struct csched_vcpu *new, s_time_t now)
+{
+    int i, ipid=-1;
+    s_time_t lowest=(1<<30);
+    struct csched_runqueue_data *rqd = RQD(cpu);
+
+    d2printk("rqt d%dv%d cd%dv%d\n",
+             new->vcpu->domain->domain_id,
+             new->vcpu->vcpu_id,
+             current->domain->domain_id,
+             current->vcpu_id);
+
+    /* Find the cpu in this queue group that has the lowest credits */
+    for ( i=rqd->cpu_min ; i < rqd->cpu_max ; i++ )
+    {
+        struct csched_vcpu * cur;
+
+        /* Skip cpus that aren't online */
+        if ( !cpu_online(i) )
+            continue;
+
+        cur = CSCHED_VCPU(per_cpu(schedule_data, i).curr);
+
+        /* FIXME: keep track of idlers, choose from the mask */
+        if ( is_idle_vcpu(cur->vcpu) )
+        {
+            ipid = i;
+            lowest = CSCHED_IDLE_CREDIT;
+            break;
+        }
+        else
+        {
+            /* Update credits for current to see if we want to preempt */
+            burn_credits(rqd, cur, now);
+
+            if ( cur->credit < lowest )
+            {
+                ipid = i;
+                lowest = cur->credit;
+            }
+
+            /* TRACE */ {
+                struct {
+                    unsigned dom:16,vcpu:16;
+                    unsigned credit;
+                } d;
+                d.dom = cur->vcpu->domain->domain_id;
+                d.vcpu = cur->vcpu->vcpu_id;
+                d.credit = cur->credit;
+                trace_var(TRC_CSCHED2_TICKLE_CHECK, 1,
+                          sizeof(d),
+                          (unsigned char *)&d);
+            }
+        }
+    }
+
+    if ( ipid != -1 )
+    {
+        int cdiff = lowest - new->credit;
+
+        if ( lowest == CSCHED_IDLE_CREDIT || cdiff < 0 ) {
+            d2printk("si %d\n", ipid);
+            cpu_raise_softirq(ipid, SCHEDULE_SOFTIRQ);
+        }
+        else
+            /* FIXME: Wake up later? */;
+    }
+}
+
+/*
+ * Credit-related code
+ */
+static void reset_credit(int cpu, s_time_t now)
+{
+    struct list_head *iter;
+
+    list_for_each( iter, &RQD(cpu)->svc )
+    {
+        struct csched_vcpu * svc = list_entry(iter, struct csched_vcpu, rqd_elem);
+
+        BUG_ON( is_idle_vcpu(svc->vcpu) );
+
+        /* "Clip" credits to max carryover */
+        if ( svc->credit > CSCHED_CARRYOVER_MAX )
+            svc->credit = CSCHED_CARRYOVER_MAX;
+        /* And add INIT */
+        svc->credit += CSCHED_CREDIT_INIT;
+        svc->start_time = now;
+
+        /* FIXME: Trace credit */
+    }
+
+    /* No need to resort runqueue, as everyone's order should be the same. */
+}
+
+void burn_credits(struct csched_runqueue_data *rqd, struct csched_vcpu *svc, s_time_t now)
+{
+    s_time_t delta;
+
+    /* Assert svc is current */
+    ASSERT(svc==CSCHED_VCPU(per_cpu(schedule_data, svc->vcpu->processor).curr));
+
+    if ( is_idle_vcpu(svc->vcpu) )
+    {
+        BUG_ON(svc->credit != CSCHED_IDLE_CREDIT);
+        return;
+    }
+
+    delta = now - svc->start_time;
+
+    if ( delta > 0 ) {
+        /* This will round down; should we consider rounding up...? */
+        svc->credit -= t2c(rqd, delta, svc);
+        svc->start_time = now;
+
+        d2printk("b d%dv%d c%d\n",
+                 svc->vcpu->domain->domain_id,
+                 svc->vcpu->vcpu_id,
+                 svc->credit);
+    } else {
+        d2printk("%s: Time went backwards? now %"PRI_stime" start %"PRI_stime"\n",
+                 __func__, now, svc->start_time);
+    }
+
+    /* TRACE */
+    {
+        struct {
+            unsigned dom:16,vcpu:16;
+            unsigned credit;
+            int delta;
+        } d;
+        d.dom = svc->vcpu->domain->domain_id;
+        d.vcpu = svc->vcpu->vcpu_id;
+        d.credit = svc->credit;
+        d.delta = delta;
+        trace_var(TRC_CSCHED2_CREDIT_BURN, 1,
+                  sizeof(d),
+                  (unsigned char *)&d);
+    }
+}
+
+/* Find the domain with the highest weight. */
+void update_max_weight(struct csched_runqueue_data *rqd, int new_weight, int old_weight)
+{
+    /* Try to avoid brute-force search:
+     * - If new_weight is larger, max_weight <- new_weight
+     * - If old_weight != max_weight, someone else is still max_weight
+     *   (No action required)
+     * - If old_weight == max_weight, brute-force search for max weight
+     */
+    if ( new_weight > rqd->max_weight )
+    {
+        rqd->max_weight = new_weight;
+        printk("%s: Runqueue id %d max weight %d\n", __func__, rqd->id, rqd->max_weight);
+    }
+    else if ( old_weight == rqd->max_weight )
+    {
+        struct list_head *iter;
+        int max_weight = 1;
+
+        list_for_each( iter, &rqd->svc )
+        {
+            struct csched_vcpu * svc = list_entry(iter, struct csched_vcpu, rqd_elem);
+
+            if ( svc->weight > max_weight )
+                max_weight = svc->weight;
+        }
+
+        rqd->max_weight = max_weight;
+        printk("%s: Runqueue %d max weight %d\n", __func__, rqd->id, rqd->max_weight);
+    }
+}
+
+#ifndef NDEBUG
+static /*inline*/ void
+__csched_vcpu_check(struct vcpu *vc)
+{
+    struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+    struct csched_dom * const sdom = svc->sdom;
+
+    BUG_ON( svc->vcpu != vc );
+    BUG_ON( sdom != CSCHED_DOM(vc->domain) );
+    if ( sdom )
+    {
+        BUG_ON( is_idle_vcpu(vc) );
+        BUG_ON( sdom->dom != vc->domain );
+    }
+    else
+    {
+        BUG_ON( !is_idle_vcpu(vc) );
+    }
+}
+#define CSCHED_VCPU_CHECK(_vc)  (__csched_vcpu_check(_vc))
+#else
+#define CSCHED_VCPU_CHECK(_vc)
+#endif
+
+static int
+csched_vcpu_init(struct vcpu *vc)
+{
+    struct domain * const dom = vc->domain;
+    struct csched_dom *sdom = CSCHED_DOM(dom);
+    struct csched_vcpu *svc;
+
+    printk("%s: Initializing d%dv%d\n",
+           __func__, dom->domain_id, vc->vcpu_id);
+
+    /* Allocate per-VCPU info */
+    svc = xmalloc(struct csched_vcpu);
+    if ( svc == NULL )
+        return -1;
+
+    INIT_LIST_HEAD(&svc->rqd_elem);
+    INIT_LIST_HEAD(&svc->sdom_elem);
+    INIT_LIST_HEAD(&svc->runq_elem);
+
+    svc->sdom = sdom;
+    svc->vcpu = vc;
+    svc->flags = 0U;
+    vc->sched_priv = svc;
+
+    if ( ! is_idle_vcpu(vc) )
+    {
+        BUG_ON( sdom == NULL );
+
+        svc->credit = CSCHED_CREDIT_INIT;
+        svc->weight = sdom->weight;
+
+        /* FIXME: Do we need the private lock here? */
+        list_add_tail(&svc->sdom_elem, &sdom->vcpu);
+
+        /* Add vcpu to runqueue of initial processor */
+        /* FIXME: Abstract for multiple runqueues */
+        vcpu_schedule_lock_irq(vc);
+
+        list_add_tail(&svc->rqd_elem, &RQD(vc->processor)->svc);
+        update_max_weight(RQD(vc->processor), svc->weight, 0);
+
+        vcpu_schedule_unlock_irq(vc);
+
+        sdom->nr_vcpus++;
+    }
+    else
+    {
+        BUG_ON( sdom != NULL );
+        svc->credit = CSCHED_IDLE_CREDIT;
+        svc->weight = 0;
+        if ( csched_priv.idle_domain == NULL )
+            csched_priv.idle_domain = dom;
+    }
+
+    CSCHED_VCPU_CHECK(vc);
+    return 0;
+}
+
+static void
+csched_vcpu_destroy(struct vcpu *vc)
+{
+    struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+    struct csched_dom * const sdom = svc->sdom;
+
+    BUG_ON( sdom == NULL );
+    BUG_ON( !list_empty(&svc->runq_elem) );
+
+    /* Remove from runqueue */
+    vcpu_schedule_lock_irq(vc);
+
+    list_del_init(&svc->rqd_elem);
+    update_max_weight(RQD(vc->processor), 0, svc->weight);
+
+    vcpu_schedule_unlock_irq(vc);
+
+    /* Remove from sdom list.  Don't need a lock for this, as it's called
+     * synchronously when nothing else can happen.
+     */
+    list_del_init(&svc->sdom_elem);
+
+    sdom->nr_vcpus--;
+
+    xfree(svc);
+}
+
+static void
+csched_vcpu_sleep(struct vcpu *vc)
+{
+    struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+
+    BUG_ON( is_idle_vcpu(vc) );
+
+    if ( per_cpu(schedule_data, vc->processor).curr == vc )
+        cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
+    else if ( __vcpu_on_runq(svc) )
+        __runq_remove(svc);
+}
+
+static void
+csched_vcpu_wake(struct vcpu *vc)
+{
+    struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+    const unsigned int cpu = vc->processor;
+    s_time_t now = 0;
+
+    /* Schedule lock should be held at this point. */
+
+    d2printk("w d%dv%d\n", vc->domain->domain_id, vc->vcpu_id);
+
+    BUG_ON( is_idle_vcpu(vc) );
+
+    /* Make sure svc priority mod happens before runq check */
+    if ( unlikely(per_cpu(schedule_data, cpu).curr == vc) )
+    {
+        goto out;
+    }
+
+    if ( unlikely(__vcpu_on_runq(svc)) )
+    {
+        /* If we've boosted someone that's already on a runqueue, prioritize
+         * it and inform the cpu in question. */
+        goto out;
+    }
+
+    /* If the context hasn't been saved for this vcpu yet, we can't put it on
+     * another runqueue.  Instead, we set a flag so that it will be put on
+     * the runqueue after the context has been saved. */
+    if ( unlikely (test_bit(__CSFLAG_scheduled, &svc->flags) ) )
+    {
+        set_bit(__CSFLAG_delayed_runq_add, &svc->flags);
+        goto out;
+    }
+
+    now = NOW();
+
+    /* Put the VCPU on the runq */
+    runq_insert(cpu, svc);
+    runq_tickle(cpu, svc, now);
+
+out:
+    d2printk("w-\n");
+    return;
+}
+
+static void
+csched_context_saved(struct vcpu *vc)
+{
+    struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+
+    vcpu_schedule_lock_irq(vc);
+
+    /* This vcpu is now eligible to be put on the runqueue again */
+    clear_bit(__CSFLAG_scheduled, &svc->flags);
+
+    /* If someone wants it on the runqueue, put it there. */
+    /*
+     * NB: We can get rid of CSFLAG_scheduled by checking for
+     * vc->is_running and __vcpu_on_runq(svc) here.  However,
+     * since we're accessing the flags cacheline anyway,
+     * it seems a bit pointless; especially as we have plenty of
+     * bits free.
+     */
+    if ( test_bit(__CSFLAG_delayed_runq_add, &svc->flags) )
+    {
+        const unsigned int cpu = vc->processor;
+
+        clear_bit(__CSFLAG_delayed_runq_add, &svc->flags);
+
+        BUG_ON(__vcpu_on_runq(svc));
+
+        runq_insert(cpu, svc);
+        runq_tickle(cpu, svc, NOW());
+    }
+
+    vcpu_schedule_unlock_irq(vc);
+}
+
+static int
+csched_cpu_pick(struct vcpu *vc)
+{
+    /* FIXME: Choose a schedule group based on load */
+    /* FIXME: Migrate the vcpu to the new runqueue list, updating
+       max_weight for each runqueue */
+    return 0;
+}
+
+static int
+csched_dom_cntl(
+    struct domain *d,
+    struct xen_domctl_scheduler_op *op)
+{
+    struct csched_dom * const sdom = CSCHED_DOM(d);
+    unsigned long flags;
+
+    if ( op->cmd == XEN_DOMCTL_SCHEDOP_getinfo )
+    {
+        op->u.credit2.weight = sdom->weight;
+    }
+    else
+    {
+        ASSERT(op->cmd == XEN_DOMCTL_SCHEDOP_putinfo);
+
+        if ( op->u.credit2.weight != 0 )
+        {
+            struct list_head *iter;
+            int old_weight;
+
+            /* Must hold csched_priv lock to update sdom, runq lock to
+             * update csvcs. */
+            spin_lock_irqsave(&csched_priv.lock, flags);
+
+            old_weight = sdom->weight;
+
+            sdom->weight = op->u.credit2.weight;
+
+            /* Update weights for vcpus, and max_weight for runqueues on which
+             * they reside */
+            list_for_each ( iter, &sdom->vcpu )
+            {
+                struct csched_vcpu *svc = list_entry(iter, struct csched_vcpu, sdom_elem);
+
+                /* NB: Locking order is important here.  Because we grab this
+                 * lock here, we must never lock csched_priv.lock if we're
+                 * holding a runqueue lock.
+                 */
+                vcpu_schedule_lock_irq(svc->vcpu);
+
+                svc->weight = sdom->weight;
+                update_max_weight(RQD(svc->vcpu->processor), svc->weight, old_weight);
+
+                vcpu_schedule_unlock_irq(svc->vcpu);
+            }
+
+            spin_unlock_irqrestore(&csched_priv.lock, flags);
+        }
+    }
+
+    return 0;
+}
+
+static int
+csched_dom_init(struct domain *dom)
+{
+    struct csched_dom *sdom;
+    int flags;
+
+    printk("%s: Initializing domain %d\n", __func__, dom->domain_id);
+
+    if ( is_idle_domain(dom) )
+        return 0;
+
+    sdom = xmalloc(struct csched_dom);
+    if ( sdom == NULL )
+        return -ENOMEM;
+
+    /* Initialize credit and weight */
+    INIT_LIST_HEAD(&sdom->vcpu);
+    INIT_LIST_HEAD(&sdom->sdom_elem);
+    sdom->dom = dom;
+    sdom->weight = CSCHED_DEFAULT_WEIGHT;
+    sdom->nr_vcpus = 0;
+
+    dom->sched_priv = sdom;
+
+    spin_lock_irqsave(&csched_priv.lock, flags);
+
+    list_add_tail(&sdom->sdom_elem, &csched_priv.sdom);
+
+    spin_unlock_irqrestore(&csched_priv.lock, flags);
+
+    return 0;
+}
+
+static void
+csched_dom_destroy(struct domain *dom)
+{
+    struct csched_dom *sdom = CSCHED_DOM(dom);
+    int flags;
+
+    BUG_ON(!list_empty(&sdom->vcpu));
+
+    spin_lock_irqsave(&csched_priv.lock, flags);
+
+    list_del_init(&sdom->sdom_elem);
+
+    spin_unlock_irqrestore(&csched_priv.lock, flags);
+
+    xfree(CSCHED_DOM(dom));
+}
+
+/* How long should we let this vcpu run for? */
+static s_time_t
+csched_runtime(int cpu, struct csched_vcpu *snext)
+{
+    s_time_t time = CSCHED_MAX_TIMER;
+    struct csched_runqueue_data *rqd = RQD(cpu);
+    struct list_head *runq = &rqd->runq;
+
+    if ( is_idle_vcpu(snext->vcpu) )
+        return CSCHED_MAX_TIMER;
+
+    /* Basic time */
+    time = c2t(rqd, snext->credit, snext);
+
+    /* Next guy on runqueue */
+    if ( ! list_empty(runq) )
+    {
+        struct csched_vcpu *svc = __runq_elem(runq->next);
+        s_time_t ntime;
+
+        if ( ! is_idle_vcpu(svc->vcpu) )
+        {
+            ntime = c2t(rqd, snext->credit - svc->credit, snext);
+
+            if ( time > ntime )
+                time = ntime;
+        }
+    }
+
+    /* Check limits */
+    if ( time < CSCHED_MIN_TIMER )
+        time = CSCHED_MIN_TIMER;
+    else if ( time > CSCHED_MAX_TIMER )
+        time = CSCHED_MAX_TIMER;
+
+    return time;
+}
+
+void __dump_execstate(void *unused);
+
+/*
+ * This function is in the critical path.  It is designed to be simple and
+ * fast for the common case.
+ */
+static struct task_slice
+csched_schedule(s_time_t now)
+{
+    const int cpu = smp_processor_id();
+    struct csched_runqueue_data *rqd = RQD(cpu);
+    struct list_head * const runq = &rqd->runq;
+    struct csched_vcpu * const scurr = CSCHED_VCPU(current);
+    struct csched_vcpu *snext = NULL;
+    struct task_slice ret;
+
+    CSCHED_VCPU_CHECK(current);
+
+    d2printk("sc p%d c d%dv%d now %"PRI_stime"\n",
+             cpu,
+             scurr->vcpu->domain->domain_id,
+             scurr->vcpu->vcpu_id,
+             now);
+
+
+    /* Protected by runqueue lock */
+
+    /* Update credits */
+    burn_credits(rqd, scurr, now);
+
+    /*
+     * Select next runnable local VCPU (ie top of local runq).
+     *
+     * If the current vcpu is runnable, and has higher credit than
+     * the next guy on the queue (or there is no one else), we want to
+     * run it again.
+     *
+     * If the current vcpu is runnable, and the next guy on the queue
+     * has higher credit, we want to mark current for delayed runqueue
+     * add, and remove the next guy from the queue.
+     *
+     * If the current vcpu is not runnable, we want to choose the idle
+     * vcpu for this processor.
+     */
+    if ( list_empty(runq) )
+        snext = CSCHED_VCPU(csched_priv.idle_domain->vcpu[cpu]);
+    else
+        snext = __runq_elem(runq->next);
+
+    if ( !is_idle_vcpu(current) && vcpu_runnable(current) )
+    {
+        /* If the current vcpu is runnable, and has higher credit
+         * than the next on the runqueue, run it again.
+         * Otherwise, set it for delayed runq add.
+         */
+        if ( scurr->credit > snext->credit )
+            snext = scurr;
+        else
+            set_bit(__CSFLAG_delayed_runq_add, &scurr->flags);
+    }
+
+    if ( snext != scurr && !is_idle_vcpu(snext->vcpu) )
+    {
+        __runq_remove(snext);
+        if ( snext->vcpu->is_running )
+        {
+            printk("p%d: snext d%dv%d running on p%d! scurr d%dv%d\n",
+                   cpu,
+                   snext->vcpu->domain->domain_id, snext->vcpu->vcpu_id,
+                   snext->vcpu->processor,
+                   scurr->vcpu->domain->domain_id,
+                   scurr->vcpu->vcpu_id);
+            BUG();
+        }
+        set_bit(__CSFLAG_scheduled, &snext->flags);
+    }
+
+    if ( !is_idle_vcpu(snext->vcpu) && snext->credit <= CSCHED_CREDIT_RESET )
+        reset_credit(cpu, now);
+
+#if 0
+    /*
+     * Update idlers mask if necessary.  When we're idling, other CPUs
+     * will tickle us when they get extra work.
+     */
+    if ( is_idle_vcpu(snext->vcpu) )
+    {
+        if ( !cpu_isset(cpu, csched_priv.idlers) )
+            cpu_set(cpu, csched_priv.idlers);
+    }
+    else if ( cpu_isset(cpu, csched_priv.idlers) )
+    {
+        cpu_clear(cpu, csched_priv.idlers);
+    }
+#endif
+
+    if ( !is_idle_vcpu(snext->vcpu) )
+    {
+        snext->start_time = now;
+        snext->vcpu->processor = cpu; /* Safe because lock for old processor is held */
+    }
+    /*
+     * Return task to run next...
+ */ + ret.time = csched_runtime(cpu, snext); + ret.task = snext->vcpu; + + CSCHED_VCPU_CHECK(ret.task); + return ret; +} + +static void +csched_dump_vcpu(struct csched_vcpu *svc) +{ + printk("[%i.%i] flags=%x cpu=%i", + svc->vcpu->domain->domain_id, + svc->vcpu->vcpu_id, + svc->flags, + svc->vcpu->processor); + + printk(" credit=%" PRIi32" [w=%u]", svc->credit, svc->weight); + + printk("\n"); +} + +static void +csched_dump_pcpu(int cpu) +{ + struct list_head *runq, *iter; + struct csched_vcpu *svc; + int loop; + char cpustr[100]; + + /* FIXME: Do locking properly for access to runqueue structures */ + + runq = &RQD(cpu)->runq; + + cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_sibling_map,cpu)); + printk(" sibling=%s, ", cpustr); + cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_core_map,cpu)); + printk("core=%s\n", cpustr); + + /* current VCPU */ + svc = CSCHED_VCPU(per_cpu(schedule_data, cpu).curr); + if ( svc ) + { + printk("\trun: "); + csched_dump_vcpu(svc); + } + + loop = 0; + list_for_each( iter, runq ) + { + svc = __runq_elem(iter); + if ( svc ) + { + printk("\t%3d: ", ++loop); + csched_dump_vcpu(svc); + } + } +} + +static void +csched_dump(void) +{ + struct list_head *iter_sdom, *iter_svc; + int loop; + + printk("info:\n" + "\tncpus = %u\n" + "\tdefault-weight = %d\n", + csched_priv.ncpus, + CSCHED_DEFAULT_WEIGHT); + + /* FIXME: Locking! 
*/ + + printk("active vcpus:\n"); + loop = 0; + list_for_each( iter_sdom, &csched_priv.sdom ) + { + struct csched_dom *sdom; + sdom = list_entry(iter_sdom, struct csched_dom, sdom_elem); + + list_for_each( iter_svc, &sdom->vcpu ) + { + struct csched_vcpu *svc; + svc = list_entry(iter_svc, struct csched_vcpu, sdom_elem); + + printk("\t%3d: ", ++loop); + csched_dump_vcpu(svc); + } + } +} + +static void +make_runq_map(void) +{ + int cpu, cpu_count=0; + + /* FIXME: Read pcpu layout and do this properly */ + for_each_possible_cpu( cpu ) + { + csched_priv.runq_map[cpu] = 0; + cpu_count++; + } + csched_priv.runq_count = 1; + + /* Move to the init code...? */ + csched_priv.rqd[0].cpu_min = 0; + csched_priv.rqd[0].cpu_max = cpu_count; +} + +static void +csched_init(void) +{ + int i; + + printk("Initializing Credit2 scheduler\n" \ + " WARNING: This is experimental software in development.\n" \ + " Use at your own risk.\n"); + + spin_lock_init(&csched_priv.lock); + INIT_LIST_HEAD(&csched_priv.sdom); + + csched_priv.ncpus = 0; + + make_runq_map(); + + for ( i=0; i<csched_priv.runq_count ; i++ ) + { + struct csched_runqueue_data *rqd = csched_priv.rqd + i; + + rqd->max_weight = 1; + rqd->id = i; + INIT_LIST_HEAD(&rqd->svc); + INIT_LIST_HEAD(&rqd->runq); + } + + /* Initialize pcpu structures */ + for_each_possible_cpu(i) + { + int runq_id; + spinlock_t *lock; + + /* Point the per-cpu schedule lock to the runq_id lock */ + runq_id = csched_priv.runq_map[i]; + lock = &per_cpu(schedule_data, runq_id)._lock; + + per_cpu(schedule_data, i).schedule_lock = lock; + + csched_priv.ncpus++; + } +} + +struct scheduler sched_credit2_def = { + .name = "SMP Credit Scheduler rev2", + .opt_name = "credit2", + .sched_id = XEN_SCHEDULER_CREDIT2, + + .init_domain = csched_dom_init, + .destroy_domain = csched_dom_destroy, + + .init_vcpu = csched_vcpu_init, + .destroy_vcpu = csched_vcpu_destroy, + + .sleep = csched_vcpu_sleep, + .wake = csched_vcpu_wake, + + .adjust = csched_dom_cntl, + + .pick_cpu = 
csched_cpu_pick, + .do_schedule = csched_schedule, + .context_saved = csched_context_saved, + + .dump_cpu_state = csched_dump_pcpu, + .dump_settings = csched_dump, + .init = csched_init, +}; diff -r 68636d5fb3df -r 1cdbec67f224 xen/common/schedule.c --- a/xen/common/schedule.c Wed Apr 14 11:16:58 2010 +0100 +++ b/xen/common/schedule.c Wed Apr 14 11:16:58 2010 +0100 @@ -56,9 +56,11 @@ extern const struct scheduler sched_sedf_def; extern const struct scheduler sched_credit_def; +extern const struct scheduler sched_credit2_def; static const struct scheduler *__initdata schedulers[] = { &sched_sedf_def, &sched_credit_def, + &sched_credit2_def, NULL }; diff -r 68636d5fb3df -r 1cdbec67f224 xen/include/public/domctl.h --- a/xen/include/public/domctl.h Wed Apr 14 11:16:58 2010 +0100 +++ b/xen/include/public/domctl.h Wed Apr 14 11:16:58 2010 +0100 @@ -303,6 +303,7 @@ /* Scheduler types. */ #define XEN_SCHEDULER_SEDF 4 #define XEN_SCHEDULER_CREDIT 5 +#define XEN_SCHEDULER_CREDIT2 6 /* Set or get info? */ #define XEN_DOMCTL_SCHEDOP_putinfo 0 #define XEN_DOMCTL_SCHEDOP_getinfo 1 @@ -321,6 +322,9 @@ uint16_t weight; uint16_t cap; } credit; + struct xen_domctl_sched_credit2 { + uint16_t weight; + } credit2; } u; }; typedef struct xen_domctl_scheduler_op xen_domctl_scheduler_op_t; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
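The timeslice policy in csched_runtime() above — run the chosen vcpu until its credit would drop to the credit of the next runnable vcpu, clamped between a minimum and maximum timer — can be sketched in isolation. This is a simplified model, not the hypervisor code: the timer constants are illustrative placeholders, and the real implementation converts credit to time through c2t() with weight scaling, which is omitted here.

```python
# Simplified model of credit2's csched_runtime() timeslice decision.
# Timer bounds are illustrative placeholders, not the hypervisor's values.
MIN_TIMER = 500      # e.g. microseconds
MAX_TIMER = 10000

def runtime(snext_credit, next_credit=None):
    """Timeslice for the vcpu chosen to run, given its credit and the
    credit of the next vcpu on the runqueue (None if the runq is empty).
    Credit is treated as directly convertible to time; the real code
    scales by weight via c2t()."""
    time = snext_credit
    if next_credit is not None:
        # Run only until our credit would fall to the next guy's credit,
        # mirroring the "snext->credit - svc->credit" computation above.
        time = min(time, snext_credit - next_credit)
    # Clamp to [MIN_TIMER, MAX_TIMER], as in the "Check limits" step.
    return max(MIN_TIMER, min(time, MAX_TIMER))
```

The interesting property is the middle case: a vcpu with only slightly more credit than its runqueue neighbour gets the minimum slice rather than a proportionally tiny one.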
George Dunlap
2010-Apr-14 10:26 UTC
[Xen-devel] [PATCH 5 of 5] credit2: Add toolstack options to control credit2 scheduler parameters
11 files changed, 277 insertions(+), 2 deletions(-) tools/libxc/Makefile | 1 tools/libxc/xc_csched2.c | 50 +++++++++++++++++ tools/libxc/xenctrl.h | 8 ++ tools/python/xen/lowlevel/xc/xc.c | 58 ++++++++++++++++++++ tools/python/xen/xend/XendAPI.py | 3 - tools/python/xen/xend/XendDomain.py | 54 +++++++++++++++++++ tools/python/xen/xend/XendDomainInfo.py | 4 + tools/python/xen/xend/XendNode.py | 4 + tools/python/xen/xend/XendVMMetrics.py | 1 tools/python/xen/xend/server/SrvDomain.py | 14 ++++ tools/python/xen/xm/main.py | 82 +++++++++++++++++++++++++++++ Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> diff -r 1cdbec67f224 -r 149e4fb24e95 tools/libxc/Makefile --- a/tools/libxc/Makefile Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/libxc/Makefile Wed Apr 14 11:25:17 2010 +0100 @@ -17,6 +17,7 @@ CTRL_SRCS-y += xc_private.c CTRL_SRCS-y += xc_sedf.c CTRL_SRCS-y += xc_csched.c +CTRL_SRCS-y += xc_csched2.c CTRL_SRCS-y += xc_tbuf.c CTRL_SRCS-y += xc_pm.c CTRL_SRCS-y += xc_cpu_hotplug.c diff -r 1cdbec67f224 -r 149e4fb24e95 tools/libxc/xc_csched2.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/libxc/xc_csched2.c Wed Apr 14 11:25:17 2010 +0100 @@ -0,0 +1,50 @@ +/**************************************************************************** + * (C) 2006 - Emmanuel Ackaouy - XenSource Inc. 
+ **************************************************************************** + * + * File: xc_csched.c + * Author: Emmanuel Ackaouy + * + * Description: XC Interface to the credit scheduler + * + */ +#include "xc_private.h" + + +int +xc_sched_credit2_domain_set( + int xc_handle, + uint32_t domid, + struct xen_domctl_sched_credit2 *sdom) +{ + DECLARE_DOMCTL; + + domctl.cmd = XEN_DOMCTL_scheduler_op; + domctl.domain = (domid_t) domid; + domctl.u.scheduler_op.sched_id = XEN_SCHEDULER_CREDIT2; + domctl.u.scheduler_op.cmd = XEN_DOMCTL_SCHEDOP_putinfo; + domctl.u.scheduler_op.u.credit2 = *sdom; + + return do_domctl(xc_handle, &domctl); +} + +int +xc_sched_credit2_domain_get( + int xc_handle, + uint32_t domid, + struct xen_domctl_sched_credit2 *sdom) +{ + DECLARE_DOMCTL; + int err; + + domctl.cmd = XEN_DOMCTL_scheduler_op; + domctl.domain = (domid_t) domid; + domctl.u.scheduler_op.sched_id = XEN_SCHEDULER_CREDIT2; + domctl.u.scheduler_op.cmd = XEN_DOMCTL_SCHEDOP_getinfo; + + err = do_domctl(xc_handle, &domctl); + if ( err == 0 ) + *sdom = domctl.u.scheduler_op.u.credit2; + + return err; +} diff -r 1cdbec67f224 -r 149e4fb24e95 tools/libxc/xenctrl.h --- a/tools/libxc/xenctrl.h Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/libxc/xenctrl.h Wed Apr 14 11:25:17 2010 +0100 @@ -475,6 +475,14 @@ uint32_t domid, struct xen_domctl_sched_credit *sdom); +int xc_sched_credit2_domain_set(int xc_handle, + uint32_t domid, + struct xen_domctl_sched_credit2 *sdom); + +int xc_sched_credit2_domain_get(int xc_handle, + uint32_t domid, + struct xen_domctl_sched_credit2 *sdom); + /** * This function sends a trigger to a domain. 
* diff -r 1cdbec67f224 -r 149e4fb24e95 tools/python/xen/lowlevel/xc/xc.c --- a/tools/python/xen/lowlevel/xc/xc.c Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/python/xen/lowlevel/xc/xc.c Wed Apr 14 11:25:17 2010 +0100 @@ -1558,6 +1558,45 @@ "cap", sdom.cap); } +static PyObject *pyxc_sched_credit2_domain_set(XcObject *self, + PyObject *args, + PyObject *kwds) +{ + uint32_t domid; + uint16_t weight; + static char *kwd_list[] = { "domid", "weight", NULL }; + static char kwd_type[] = "I|H"; + struct xen_domctl_sched_credit2 sdom; + + weight = 0; + if( !PyArg_ParseTupleAndKeywords(args, kwds, kwd_type, kwd_list, + &domid, &weight) ) + return NULL; + + sdom.weight = weight; + + if ( xc_sched_credit2_domain_set(self->xc_handle, domid, &sdom) != 0 ) + return pyxc_error_to_exception(); + + Py_INCREF(zero); + return zero; +} + +static PyObject *pyxc_sched_credit2_domain_get(XcObject *self, PyObject *args) +{ + uint32_t domid; + struct xen_domctl_sched_credit2 sdom; + + if( !PyArg_ParseTuple(args, "I", &domid) ) + return NULL; + + if ( xc_sched_credit2_domain_get(self->xc_handle, domid, &sdom) != 0 ) + return pyxc_error_to_exception(); + + return Py_BuildValue("{s:H}", + "weight", sdom.weight); +} + static PyObject *pyxc_domain_setmaxmem(XcObject *self, PyObject *args) { uint32_t dom; @@ -2113,6 +2152,24 @@ "Returns: [dict]\n" " weight [short]: domain''s scheduling weight\n"}, + { "sched_credit2_domain_set", + (PyCFunction)pyxc_sched_credit2_domain_set, + METH_KEYWORDS, "\n" + "Set the scheduling parameters for a domain when running with the\n" + "SMP credit2 scheduler.\n" + " domid [int]: domain id to set\n" + " weight [short]: domain''s scheduling weight\n" + "Returns: [int] 0 on success; -1 on error.\n" }, + + { "sched_credit2_domain_get", + (PyCFunction)pyxc_sched_credit2_domain_get, + METH_VARARGS, "\n" + "Get the scheduling parameters for a domain when running with the\n" + "SMP credit2 scheduler.\n" + " domid [int]: domain id to get\n" + "Returns: [dict]\n" + " weight 
[short]: domain''s scheduling weight\n"}, + { "evtchn_alloc_unbound", (PyCFunction)pyxc_evtchn_alloc_unbound, METH_VARARGS | METH_KEYWORDS, "\n" @@ -2495,6 +2552,7 @@ /* Expose some libxc constants to Python */ PyModule_AddIntConstant(m, "XEN_SCHEDULER_SEDF", XEN_SCHEDULER_SEDF); PyModule_AddIntConstant(m, "XEN_SCHEDULER_CREDIT", XEN_SCHEDULER_CREDIT); + PyModule_AddIntConstant(m, "XEN_SCHEDULER_CREDIT2", XEN_SCHEDULER_CREDIT2); } diff -r 1cdbec67f224 -r 149e4fb24e95 tools/python/xen/xend/XendAPI.py --- a/tools/python/xen/xend/XendAPI.py Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/python/xen/xend/XendAPI.py Wed Apr 14 11:25:17 2010 +0100 @@ -1626,8 +1626,7 @@ if ''weight'' in xeninfo.info[''vcpus_params''] \ and ''cap'' in xeninfo.info[''vcpus_params'']: weight = xeninfo.info[''vcpus_params''][''weight''] - cap = xeninfo.info[''vcpus_params''][''cap''] - xendom.domain_sched_credit_set(xeninfo.getDomid(), weight, cap) + xendom.domain_sched_credit2_set(xeninfo.getDomid(), weight) def VM_set_VCPUs_number_live(self, _, vm_ref, num): dom = XendDomain.instance().get_vm_by_uuid(vm_ref) diff -r 1cdbec67f224 -r 149e4fb24e95 tools/python/xen/xend/XendDomain.py --- a/tools/python/xen/xend/XendDomain.py Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/python/xen/xend/XendDomain.py Wed Apr 14 11:25:17 2010 +0100 @@ -1757,6 +1757,60 @@ log.exception(ex) raise XendError(str(ex)) + def domain_sched_credit2_get(self, domid): + """Get credit2 scheduler parameters for a domain. + + @param domid: Domain ID or Name + @type domid: int or string. 
+ @rtype: dict with keys ''weight'' + @return: credit2 scheduler parameters + """ + dominfo = self.domain_lookup_nr(domid) + if not dominfo: + raise XendInvalidDomain(str(domid)) + + if dominfo._stateGet() in (DOM_STATE_RUNNING, DOM_STATE_PAUSED): + try: + return xc.sched_credit2_domain_get(dominfo.getDomid()) + except Exception, ex: + raise XendError(str(ex)) + else: + return {''weight'' : dominfo.getWeight()} + + def domain_sched_credit2_set(self, domid, weight = None): + """Set credit2 scheduler parameters for a domain. + + @param domid: Domain ID or Name + @type domid: int or string. + @type weight: int + @rtype: 0 + """ + set_weight = False + dominfo = self.domain_lookup_nr(domid) + if not dominfo: + raise XendInvalidDomain(str(domid)) + try: + if weight is None: + weight = int(0) + elif weight < 1 or weight > 65535: + raise XendError("weight is out of range") + else: + set_weight = True + + assert type(weight) == int + + rc = 0 + if dominfo._stateGet() in (DOM_STATE_RUNNING, DOM_STATE_PAUSED): + rc = xc.sched_credit2_domain_set(dominfo.getDomid(), weight) + if rc == 0: + if set_weight: + dominfo.setWeight(weight) + self.managed_config_save(dominfo) + return rc + except Exception, ex: + log.exception(ex) + raise XendError(str(ex)) + def domain_maxmem_set(self, domid, mem): """Set the memory limit for a domain. 
diff -r 1cdbec67f224 -r 149e4fb24e95 tools/python/xen/xend/XendDomainInfo.py --- a/tools/python/xen/xend/XendDomainInfo.py Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/python/xen/xend/XendDomainInfo.py Wed Apr 14 11:25:17 2010 +0100 @@ -2811,6 +2811,10 @@ XendDomain.instance().domain_sched_credit_set(self.getDomid(), self.getWeight(), self.getCap()) + elif XendNode.instance().xenschedinfo() == ''credit2'': + from xen.xend import XendDomain + XendDomain.instance().domain_sched_credit2_set(self.getDomid(), + self.getWeight()) def _initDomain(self): log.debug(''XendDomainInfo.initDomain: %s %s'', diff -r 1cdbec67f224 -r 149e4fb24e95 tools/python/xen/xend/XendNode.py --- a/tools/python/xen/xend/XendNode.py Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/python/xen/xend/XendNode.py Wed Apr 14 11:25:17 2010 +0100 @@ -779,6 +779,8 @@ return ''sedf'' elif sched_id == xen.lowlevel.xc.XEN_SCHEDULER_CREDIT: return ''credit'' + elif sched_id == xen.lowlevel.xc.XEN_SCHEDULER_CREDIT2: + return ''credit2'' else: return ''unknown'' @@ -988,6 +990,8 @@ return ''sedf'' elif sched_id == xen.lowlevel.xc.XEN_SCHEDULER_CREDIT: return ''credit'' + elif sched_id == xen.lowlevel.xc.XEN_SCHEDULER_CREDIT2: + return ''credit2'' else: return ''unknown'' diff -r 1cdbec67f224 -r 149e4fb24e95 tools/python/xen/xend/XendVMMetrics.py --- a/tools/python/xen/xend/XendVMMetrics.py Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/python/xen/xend/XendVMMetrics.py Wed Apr 14 11:25:17 2010 +0100 @@ -129,6 +129,7 @@ params_live[''cpumap%i'' % i] = \ ",".join(map(str, info[''cpumap''])) + # FIXME: credit2?? 
params_live.update(xc.sched_credit_domain_get(domid)) return params_live diff -r 1cdbec67f224 -r 149e4fb24e95 tools/python/xen/xend/server/SrvDomain.py --- a/tools/python/xen/xend/server/SrvDomain.py Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/python/xen/xend/server/SrvDomain.py Wed Apr 14 11:25:17 2010 +0100 @@ -163,6 +163,20 @@ val = fn(req.args, {''dom'': self.dom.getName()}) return val + def op_domain_sched_credit2_get(self, _, req): + fn = FormFn(self.xd.domain_sched_credit2_get, + [[''dom'', ''str'']]) + val = fn(req.args, {''dom'': self.dom.getName()}) + return val + + + def op_domain_sched_credit2_set(self, _, req): + fn = FormFn(self.xd.domain_sched_credit2_set, + [[''dom'', ''str''], + [''weight'', ''int'']]) + val = fn(req.args, {''dom'': self.dom.getName()}) + return val + def op_maxmem_set(self, _, req): return self.call(self.dom.setMemoryMaximum, [[''memory'', ''int'']], diff -r 1cdbec67f224 -r 149e4fb24e95 tools/python/xen/xm/main.py --- a/tools/python/xen/xm/main.py Wed Apr 14 11:16:58 2010 +0100 +++ b/tools/python/xen/xm/main.py Wed Apr 14 11:25:17 2010 +0100 @@ -151,6 +151,8 @@ ''sched-sedf'' : (''<Domain> [options]'', ''Get/set EDF parameters.''), ''sched-credit'': (''[-d <Domain> [-w[=WEIGHT]|-c[=CAP]]]'', ''Get/set credit scheduler parameters.''), + ''sched-credit2'': (''[-d <Domain> [-w[=WEIGHT]]'', + ''Get/set credit2 scheduler parameters.''), ''sysrq'' : (''<Domain> <letter>'', ''Send a sysrq to a domain.''), ''debug-keys'' : (''<Keys>'', ''Send debug keys to Xen.''), ''trigger'' : (''<Domain> <nmi|reset|init|s3resume|power> [<VCPU>]'', @@ -277,6 +279,10 @@ (''-w WEIGHT'', ''--weight=WEIGHT'', ''Weight (int)''), (''-c CAP'', ''--cap=CAP'', ''Cap (int)''), ), + ''sched-credit2'': ( + (''-d DOMAIN'', ''--domain=DOMAIN'', ''Domain to modify''), + (''-w WEIGHT'', ''--weight=WEIGHT'', ''Weight (int)''), + ), ''list'': ( (''-l'', ''--long'', ''Output all VM details in SXP''), ('''', ''--label'', ''Include security labels''), @@ -418,6 +424,7 @@ ] 
scheduler_commands = [ + "sched-credit2", "sched-credit", "sched-sedf", ] @@ -1740,6 +1747,80 @@ if result != 0: err(str(result)) +def xm_sched_credit2(args): + """Get/Set options for Credit2 Scheduler.""" + + check_sched_type(''credit2'') + + try: + opts, params = getopt.getopt(args, "d:w:", + ["domain=", "weight="]) + except getopt.GetoptError, opterr: + err(opterr) + usage(''sched-credit2'') + + domid = None + weight = None + + for o, a in opts: + if o in ["-d", "--domain"]: + domid = a + elif o in ["-w", "--weight"]: + weight = int(a) + + doms = filter(lambda x : domid_match(domid, x), + [parse_doms_info(dom) + for dom in getDomains(None, ''all'')]) + + if weight is None: + if domid is not None and doms == []: + err("Domain ''%s'' does not exist." % domid) + usage(''sched-credit2'') + # print header if we aren''t setting any parameters + print ''%-33s %4s %6s'' % (''Name'',''ID'',''Weight'') + + for d in doms: + try: + if serverType == SERVER_XEN_API: + info = server.xenapi.VM_metrics.get_VCPUs_params( + server.xenapi.VM.get_metrics( + get_single_vm(d[''name'']))) + else: + info = server.xend.domain.sched_credit2_get(d[''name'']) + except xmlrpclib.Fault: + pass + + if ''weight'' not in info: + # domain does not support sched-credit2? 
+ info = {''weight'': -1} + + info[''weight''] = int(info[''weight'']) + + info[''name''] = d[''name''] + info[''domid''] = str(d[''domid'']) + print( ("%(name)-32s %(domid)5s %(weight)6d") % info) + else: + if domid is None: + # place holder for system-wide scheduler parameters + err("No domain given.") + usage(''sched-credit2'') + + if serverType == SERVER_XEN_API: + if doms[0][''domid'']: + server.xenapi.VM.add_to_VCPUs_params_live( + get_single_vm(domid), + "weight", + weight) + else: + server.xenapi.VM.add_to_VCPUs_params( + get_single_vm(domid), + "weight", + weight) + else: + result = server.xend.domain.sched_credit2_set(domid, weight) + if result != 0: + err(str(result)) + def xm_info(args): arg_check(args, "info", 0, 1) @@ -3490,6 +3571,7 @@ # scheduler "sched-sedf": xm_sched_sedf, "sched-credit": xm_sched_credit, + "sched-credit2": xm_sched_credit2, # block "block-attach": xm_block_attach, "block-detach": xm_block_detach, _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
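The parameter handling in domain_sched_credit2_set above follows a simple rule: a missing weight means "leave unchanged", and settable weights must fit the hypervisor's uint16_t field, i.e. 1-65535. A standalone sketch of that check (the helper name is ours, not part of the patch):

```python
def validate_credit2_weight(weight):
    """Mirrors the range check in XendDomain.domain_sched_credit2_set:
    None means 'leave the weight unchanged'; settable values must fit
    the hypervisor's uint16_t field, i.e. 1..65535.
    (Helper name is illustrative, not part of the patch.)"""
    if weight is None:
        return 0, False            # nothing to set
    if weight < 1 or weight > 65535:
        raise ValueError("weight is out of range")
    return weight, True
```

Note that an explicit weight of 0 is rejected rather than treated as "unchanged", matching the Xend code's `weight < 1` test.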
George Dunlap
2010-Apr-14 14:29 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
Keir has checked the patches in, so if you wait a bit, they should show up on the public repository. The tool patch is only necessary for adjusting the weight; if you're OK using the default weight, just adding "sched=credit2" on the xen command-line should be fine. Don't forget that this isn't meant to perform well on multiple sockets yet. :-) -George Dan Magenheimer wrote:> Hi George -- > > I'm seeing some problems applying the patches (such as "malformed > patch"). If you could send me a monolithic patch in an attachment > and tell me what cset in http://xenbits.xensource.com/xen-unstable.hg > that it successfully applies against, I will try to give my > workload a test against it to see if it has the same > symptoms. > > Also, do I need to apply the tools patch if I don't intend > to specify any parameters, or is the xen patch + "sched=credit2" > in a boot param sufficient? > > Thanks, > Dan > > >> -----Original Message----- >> From: George Dunlap [mailto:george.dunlap@eu.citrix.com] >> Sent: Wednesday, April 14, 2010 4:26 AM >> To: xen-devel@lists.xensource.com >> Cc: george.dunlap@eu.citrix.com >> Subject: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler >> (EXPERIMENTAL) >> >> This patch series introduces the credit2 scheduler. The first two >> patches >> introduce changes necessary to allow the credit2 shared runqueue >> functionality >> to work properly; the last two implement the functionality itself. >> >> The scheduler is still in the experimental phase. There's lots of >> opportunity to contribute with independent lines of development; email >> George Dunlap <george.dunlap@eu.citrix.com> or check out the wiki page >> http://wiki.xensource.com/xenwiki/Credit2_Scheduler_Development for >> ideas >> and status updates. 
Keir Fraser
2010-Apr-14 14:52 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
The patches are already available from the staging tree. They will get automatically pushed to the main tree when they pass the regression tests. K. On 14/04/2010 15:29, "George Dunlap" <george.dunlap@eu.citrix.com> wrote:> Keir has checked the patches in, so if you wait a bit, they should show > up on the public repository. > > The tool patch is only necessary for adjusting the weight; if you''re OK > using the default weight, just adding "sched=credit2" on the xen > command-line should be fine. > > Don''t forget that this isn''t meant to perform well on multiple sockets > yet. :-) > > -George > > Dan Magenheimer wrote: >> Hi George -- >> >> I''m seeing some problems applying the patches (such as "malformed >> patch"). If you could send me a monolithic patch in an attachment >> and tell me what cset in http://xenbits.xensource.com/xen-unstable.hg >> that it successfully applies against, I will try to give my >> workload a test against it to see if it has the same >> symptoms. >> >> Also, do I need to apply the tools patch if I don''t intend >> to specify any parameters, or is the xen patch + "sched=credit2" >> in a boot param sufficient? >> >> Thanks, >> Dan >> >> >>> -----Original Message----- >>> From: George Dunlap [mailto:george.dunlap@eu.citrix.com] >>> Sent: Wednesday, April 14, 2010 4:26 AM >>> To: xen-devel@lists.xensource.com >>> Cc: george.dunlap@eu.citrix.com >>> Subject: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler >>> (EXPERIMENTAL) >>> >>> This patch series introduces the credit2 scheduler. The first two >>> patches >>> introduce changes necessary to allow the credit2 shared runqueue >>> functionality >>> to work properly; the last two implement the functionality itself. >>> >>> The scheduler is still in the experimental phase. 
Dan Magenheimer
2010-Apr-14 15:59 UTC
RE: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
Thanks. Unfortunately, after updating both hypervisor and tools to cs21173 (from staging), xend seems to start fine, but attempting to launch a domain yields: Xend has probably crashed! Invalid or missing HTTP status code. An immediate "xm list" shows that xend has not crashed (or perhaps silently and successfully restarted), but re-attempting to launch a domain yields the same message. (George, no need to point out that this is probably unrelated to the credit2 scheduler... but that would imply that xen-unstable-not-staging is also broken.) Keir, if staging passes your regression tests successfully without the problem I am seeing, please let me know. (And I''ll try rolling back to 4.0 for now.)> -----Original Message----- > From: Keir Fraser [mailto:keir.fraser@eu.citrix.com] > Sent: Wednesday, April 14, 2010 8:53 AM > To: George Dunlap; Dan Magenheimer > Cc: xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler > (EXPERIMENTAL) > > The patches are already available from the staging tree. They will get > automatically pushed to the main tree when they pass the regression > tests. > > K. > > > On 14/04/2010 15:29, "George Dunlap" <george.dunlap@eu.citrix.com> > wrote: > > > Keir has checked the patches in, so if you wait a bit, they should > show > > up on the public repository. > > > > The tool patch is only necessary for adjusting the weight; if you''re > OK > > using the default weight, just adding "sched=credit2" on the xen > > command-line should be fine. > > > > Don''t forget that this isn''t meant to perform well on multiple > sockets > > yet. :-) > > > > -George > > > > Dan Magenheimer wrote: > >> Hi George -- > >> > >> I''m seeing some problems applying the patches (such as "malformed > >> patch"). 
If you could send me a monolithic patch in an attachment > >> and tell me what cset in http://xenbits.xensource.com/xen- > unstable.hg > >> that it successfully applies against, I will try to give my > >> workload a test against it to see if it has the same > >> symptoms. > >> > >> Also, do I need to apply the tools patch if I don''t intend > >> to specify any parameters, or is the xen patch + "sched=credit2" > >> in a boot param sufficient? > >> > >> Thanks, > >> Dan > >> > >> > >>> -----Original Message----- > >>> From: George Dunlap [mailto:george.dunlap@eu.citrix.com] > >>> Sent: Wednesday, April 14, 2010 4:26 AM > >>> To: xen-devel@lists.xensource.com > >>> Cc: george.dunlap@eu.citrix.com > >>> Subject: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler > >>> (EXPERIMENTAL) > >>> > >>> This patch series introduces the credit2 scheduler. The first two > >>> patches > >>> introduce changes necessary to allow the credit2 shared runqueue > >>> functionality > >>> to work properly; the last two implement the functionality itself. > >>> > >>> The scheduler is still in the experimental phase. There''s lots of > >>> opportunity to contribute with independent lines of development; > email > >>> George Dunlap <george.dunlap@eu.citrix.com> or check out the wiki > page > >>> http://wiki.xensource.com/xenwiki/Credit2_Scheduler_Development for > >>> ideas > >>> and status updates. 
Keir Fraser
2010-Apr-14 16:23 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
On 14/04/2010 16:59, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> (George, no need to point out that this is probably
> unrelated to the credit2 scheduler... but that would
> imply that xen-unstable-not-staging is also broken.)
>
> Keir, if staging passes your regression tests successfully
> without the problem I am seeing, please let me know.
> (And I'll try rolling back to 4.0 for now.)

No, it crashes for me too. I think it's related to the recent NUMA patches.

 -- Keir
Dulloor
2010-Apr-14 16:31 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
> No, it crashes for me too. I think it's related to the recent NUMA patches.

Keir, which NUMA patches?

-dulloor

On Wed, Apr 14, 2010 at 12:23 PM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:
> On 14/04/2010 16:59, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
>
>> (George, no need to point out that this is probably
>> unrelated to the credit2 scheduler... but that would
>> imply that xen-unstable-not-staging is also broken.)
>>
>> Keir, if staging passes your regression tests successfully
>> without the problem I am seeing, please let me know.
>> (And I'll try rolling back to 4.0 for now.)
>
> No, it crashes for me too. I think it's related to the recent NUMA patches.
>
> -- Keir
Keir Fraser
2010-Apr-14 16:36 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
On 14/04/2010 17:31, "Dulloor" <dulloor@gmail.com> wrote:

>> No, it crashes for me too. I think it's related to the recent NUMA patches.
> Keir, which NUMA patches?

Nitin's interface-changing patch, and the ensuing patch to remove
sockets_per_node. Not that I'm certain, but it was around then that the
crashes began, and that's what touched the Xc Python extension package
(and C extensions to Python code are usually what make Python programs
crash).

 -- Keir
Dan Magenheimer
2010-Apr-14 16:46 UTC
RE: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
> > (George, no need to point out that this is probably
> > unrelated to the credit2 scheduler... but that would
> > imply that xen-unstable-not-staging is also broken.)
> >
> > Keir, if staging passes your regression tests successfully
> > without the problem I am seeing, please let me know.
> > (And I'll try rolling back to 4.0 for now.)
>
> No, it crashes for me too. I think it's related to the recent NUMA
> patches.

OK. George, I've applied your credit2 patch to 4.0-testing tip and it
seems to be starting up my test workload. I'll let you know what I see
(but it takes a few hours).
Dan Magenheimer
2010-Apr-14 17:04 UTC
RE: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
While someone is fixing up the new NUMA code, it would be nice if the
output of the NUMA info in "xm info" conformed to the rest of the
"xm info" output (e.g. all on the same line, perhaps "topology: cpu=X
node=Y etc"). I often write scripts that parse various Xen output, and
I suspect many Xen-based products do also, so format consistency is
always good.

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Wednesday, April 14, 2010 10:36 AM
> To: Dulloor
> Cc: George Dunlap; Dan Magenheimer; xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler
> (EXPERIMENTAL)
>
> On 14/04/2010 17:31, "Dulloor" <dulloor@gmail.com> wrote:
>
>> No, it crashes for me too. I think it's related to the recent NUMA patches.
> Keir, which NUMA patches?
>
> Nitin's interface-changing patch, and the ensuing patch to remove
> sockets_per_node. Not that I'm certain, but it was around then that the
> crashes began, and that's what touched the Xc Python extension package
> (and C extensions to Python code are usually what make Python programs
> crash).
>
> -- Keir
George Dunlap
2010-Apr-15 13:53 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
I have not measured cache / TLB misses with this workload yet. In the
past I've instrumented the scheduler trace records in Xen to include
performance counters such as instructions executed and cache / TLB
misses, and then used xenalyze
(http://xenbits.xensource.com/ext/xenalyze.hg) to analyze them. But the
functionality for both capture and analysis was never standardized or
added to mainline.

I'd be happy to help point you in the right direction if you're
interested in investing in that approach. :-)

 -George

Naresh Rapolu wrote:
> Hello George,
>
> How did you measure cache / TLB misses etc. while using/profiling this
> new scheduler? Any tool that you've used which works with Xen?
>
> Thanks,
> Naresh Rapolu.
> PhD Student, Computer Science,
> Purdue University.
>
> George Dunlap wrote:
>> This patch series introduces the credit2 scheduler. The first two
>> patches introduce changes necessary to allow the credit2 shared
>> runqueue functionality to work properly; the last two implement the
>> functionality itself.
>>
>> [...]
George Dunlap
2010-Apr-15 14:17 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
Dulloor wrote:
> As we talked before, I am interested in improving the multiple-socket
> scenario and adding the load balancing functionality, which could
> provide an acceptable alternative to pinning vcpus to sockets (for my
> NUMA work). I am going over your patch right now, but what are your
> thoughts?

That would be great -- my focus for the next several months will be
setting up a testing infrastructure to automatically test the
performance of different workload mixes, so I can hone the algorithm
and test for regressions.

My idea with load balancing was to do this:
* One runqueue per L2 cache.
* Add code to calculate the load of a runqueue. Load would be the
  average (~integral) of (vcpus running + vcpus on the runqueue). I was
  planning on doing accurate load calculation, rather than
  sample-based, and falling back to sample-based if accurate turned out
  to be too slow.
* Calculate the load contributed by each vcpu.
* At regular intervals, determine whether some kind of balancing needs
  to be done by looking at the overall runqueue load, and place vcpus
  based on the "contributory" load of each one.

Does that make sense? Thoughts?

I have some old patches that calculated accurate load; I could dig them
up if you wanted something to start with. (I don't think they'll apply
cleanly at the moment.)

Thanks,
 -George

> [...]
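The "accurate" (event-driven, rather than sample-based) load calculation George describes can be pictured as a running time-integral of (vcpus running + vcpus on the runqueue), updated at every event that changes either count. The following is only an illustrative sketch of that idea, not code from this patch series; all names here (`struct runq_load`, `load_account()`, etc.) are invented, and real scheduler code would also need locking and a periodically reset or decayed averaging window.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s_time_t;       /* a time value, as in Xen */

struct runq_load {
    s_time_t last_update;       /* time the count last changed          */
    s_time_t load_integral;     /* sum of count * dt over the window    */
    s_time_t window_start;      /* start of the averaging window        */
    int count;                  /* vcpus running + vcpus on the runqueue */
};

/* Fold the interval since the last event into the integral. */
static void load_account(struct runq_load *l, s_time_t now)
{
    l->load_integral += (s_time_t)l->count * (now - l->last_update);
    l->last_update = now;
}

static void vcpu_added(struct runq_load *l, s_time_t now)
{
    load_account(l, now);
    l->count++;
}

static void vcpu_removed(struct runq_load *l, s_time_t now)
{
    load_account(l, now);
    l->count--;
}

/* Average load over the window, scaled by 1000 to stay in integers. */
static s_time_t load_avg_x1000(struct runq_load *l, s_time_t now)
{
    s_time_t span = now - l->window_start;

    load_account(l, now);
    return span ? (l->load_integral * 1000) / span : 0;
}
```

Per-vcpu "contributory" load could be accumulated the same way, keyed by vcpu, so the periodic balancer can compare per-runqueue averages and migrate the vcpus whose contribution best evens them out.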
Naresh Rapolu
2010-Apr-15 16:46 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
Hello George,

I am trying to get the Linux "perf" tool to work with Xen (virtualize
the PMU to measure hardware events from inside guests). I have the
following options:

1. allowing the guest kernel to see the PMU hardware features via
   cpuid, and then doing whatever is necessary to make them work as
   expected (by instruction emulation, etc.), or
2. keeping them hidden, but adding a new Xen interface and the
   appropriate Linux-side code to detect that interface and use it

Does xenalyze have any code relevant to this? Can you think of any
directions in this regard?

Thanks,
Naresh Rapolu.

George Dunlap wrote:
> I have not measured cache / TLB misses with this workload yet. In the
> past I've instrumented the scheduler trace records in Xen to include
> performance counters such as instructions executed and cache / TLB
> misses, and then used xenalyze
> (http://xenbits.xensource.com/ext/xenalyze.hg) to analyze them. But
> the functionality for both capture and analysis was never
> standardized or added to mainline.
>
> [...]
Dulloor
2010-Apr-15 17:33 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
Naresh,

If you are interested only in profiling, you could use xenoprof too.
I had ported xenoprof to pvops (attaching a patch that applies cleanly
to linux pvops). I have used this with passive profiling and for
profiling Xen/dom0. This patch also includes an obvious fix (over the
oprofile branch in Jeremy's repo) for active profiling, although I
didn't get a chance to test it.

Please let me know if you try this and whether you face any issues.

thanks
dulloor

On Thu, Apr 15, 2010 at 12:46 PM, Naresh Rapolu <nrapolu@purdue.edu> wrote:
> Hello George,
>
> I am trying to get the Linux "perf" tool to work with Xen (virtualize
> the PMU to measure hardware events from inside guests).
>
> [...]
Naresh Rapolu
2010-Apr-15 18:57 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
Hello Dulloor,

Thank you so much for sharing this patch!

Jeremy and I feel that "perf" support is needed in Xen for thorough
(per-process, per-vcpu) profiling, not just the statistical profiling
provided by OProfile. Now that I have this latest xenoprof patch, I
will try to stick to its design and see how the "perf" Linux subsystem
can use the xenoprof hypercall interfaces. I will keep you updated on
this regularly.

Thanks,
Naresh Rapolu,
PhD student, Computer Science,
Purdue University.

Dulloor wrote:
> Naresh,
>
> If you are interested only in profiling, you could use xenoprof too.
> I had ported xenoprof to pvops (attaching a patch that applies
> cleanly to linux pvops). I have used this with passive profiling and
> for profiling Xen/dom0. This patch also includes an obvious fix (over
> the oprofile branch in Jeremy's repo) for active profiling, although
> I didn't get a chance to test it.
>
> [...]
Dan Magenheimer
2010-Apr-15 20:11 UTC
RE: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
Well, sadly, credit2 doesn't seem to solve the problem... and even more
sadly, it causes worse performance on my overcommitted workload.

elapsed is wallclock seconds from the time the first VM launches the
first "make clean" until the fourth VM finishes its second make.
sumvcpu is the sum of the vcpu seconds (including dom0) reported by
"xm list" after all VMs have finished the workload and been
force-crashed.
dom0 is the vcpu seconds reported by "xm list" for dom0.

credit: 5 test runs
elapsed=(9447,9388,9578,9576,9412)
sumvcpu=(13665,13671,13693,13589,13598)
dom0=(559,556,555,467,483)

sedf: 6 test runs
elapsed=(10022,9418,9637,12129,13599,11875)
sumvcpu=(13539,13514,13510,14270,14447,14237)
dom0=(473,468,460,482,537,475)

credit2: 6 test runs
elapsed=(11007,9931,10051,10090,11647,10070)
sumvcpu=(14878,14615,14610,14641,14886,14594)
dom0=(510,470,471,482,536,463)

P.S. The physical machine is a single-socket dual core.

> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@eu.citrix.com]
> Sent: Wednesday, April 14, 2010 8:30 AM
> To: Dan Magenheimer
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler
> (EXPERIMENTAL)
>
> Keir has checked the patches in, so if you wait a bit, they should
> show up on the public repository.
>
> The tool patch is only necessary for adjusting the weight; if you're
> OK using the default weight, just adding "sched=credit2" on the xen
> command-line should be fine.
>
> Don't forget that this isn't meant to perform well on multiple
> sockets yet. :-)
>
> -George
>
> Dan Magenheimer wrote:
> > Hi George --
> >
> > I'm seeing some problems applying the patches (such as "malformed
> > patch"). If you could send me a monolithic patch in an attachment
> > and tell me what cset in http://xenbits.xensource.com/xen-unstable.hg
> > it successfully applies against, I will try to give my workload a
> > test against it to see if it has the same symptoms.
> >
> > Also, do I need to apply the tools patch if I don't intend
> > to specify any parameters, or is the xen patch + "sched=credit2"
> > as a boot parameter sufficient?
> >
> > Thanks,
> > Dan
> >
> > [...]
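As a side note (this calculation is the editor's, not part of the original thread), averaging the elapsed-time samples above makes the regression concrete: roughly 9480 s mean for credit, 11113 s for sedf, and 10466 s for credit2, i.e. credit2 runs this workload about 10% slower than credit. A small self-contained check of that arithmetic:

```c
#include <assert.h>

/* Elapsed wallclock seconds per run, copied from Dan's report above. */
static const int credit_elapsed[]  = { 9447, 9388, 9578, 9576, 9412 };
static const int sedf_elapsed[]    = { 10022, 9418, 9637, 12129, 13599, 11875 };
static const int credit2_elapsed[] = { 11007, 9931, 10051, 10090, 11647, 10070 };

/* Arithmetic mean of n integer samples. */
static double mean(const int *v, int n)
{
    long sum = 0;
    int i;

    for (i = 0; i < n; i++)
        sum += v[i];
    return (double)sum / n;
}

/* Percentage slowdown of credit2 relative to credit, by mean elapsed time. */
static double credit2_slowdown_pct(void)
{
    double c  = mean(credit_elapsed, 5);   /* 9480.2 s  */
    double c2 = mean(credit2_elapsed, 6);  /* 10466.0 s */

    return 100.0 * (c2 - c) / c;
}
```

The means come out to about 9480 s (credit), 11113 s (sedf), and 10466 s (credit2), a credit2-vs-credit slowdown of roughly 10%, consistent with Dan's conclusion.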
Dulloor
2010-Apr-17 20:29 UTC
Re: [Xen-devel] [PATCH 0 of 5] Add credit2 scheduler (EXPERIMENTAL)
On Thu, Apr 15, 2010 at 10:17 AM, George Dunlap
<george.dunlap@eu.citrix.com> wrote:
> Dulloor wrote:
>> As we talked before, I am interested in improving the multiple-socket
>> scenario and adding the load balancing functionality, which could
>> provide an acceptable alternative to pinning vcpus to sockets (for my
>> NUMA work). I am going over your patch right now, but what are your
>> thoughts?
>
> That would be great -- my focus for the next several months will be
> setting up a testing infrastructure to automatically test the
> performance of different workload mixes, so I can hone the algorithm
> and test for regressions.
>
> My idea with load balancing was to do this:
> * One runqueue per L2 cache.
> * Add code to calculate the load of a runqueue. Load would be the
>   average (~integral) of (vcpus running + vcpus on the runqueue). I
>   was planning on doing accurate load calculation, rather than
>   sample-based, and falling back to sample-based if accurate turned
>   out to be too slow.
> * Calculate the load contributed by each vcpu.
> * At regular intervals, determine whether some kind of balancing
>   needs to be done by looking at the overall runqueue load, and place
>   vcpus based on the "contributory" load of each one.
>
> Does that make sense? Thoughts?

Sounds good. I can see that the runq_map entries for all cpus point to
the same run-queue (in make_runq_map). I will start there.

> I have some old patches that calculated accurate load; I could dig
> them up if you wanted something to start with. (I don't think they'll
> apply cleanly at the moment.)
>
> Thanks,
> -George
>
> [...]