George, I''m getting puzzled by the second c2t() invocation in csched_runtime(): Why is the difference of credits being passed here? Doesn''t that (unless svc->credit is non-positive, i.e. in all but unusual cases) guarantee time > ntime, and particularly allow for negative ntime? Thanks, Jan
On 24/01/13 07:40, Jan Beulich wrote:> George, > > I''m getting puzzled by the second c2t() invocation in > csched_runtime(): Why is the difference of credits being passed > here? Doesn''t that (unless svc->credit is non-positive, i.e. in all > but unusual cases) guarantee time > ntime, and particularly > allow for negative ntime?Ah, right -- yes, if the other guys'' credit is positive, "ntime" is guaranteed to be lower. Since c2t() involves integer division, it would definiteyl be good to get rid of the extra call if we can. My general principle is to make the code clear and easily readable first, and then do optimization afterwards -- in this case I just never came back and did the optimization step. Were you intending to submit a patch for this, or shall I? -George
On 24/01/13 09:49, George Dunlap wrote:> On 24/01/13 07:40, Jan Beulich wrote: >> George, >> >> I''m getting puzzled by the second c2t() invocation in >> csched_runtime(): Why is the difference of credits being passed >> here? Doesn''t that (unless svc->credit is non-positive, i.e. in all >> but unusual cases) guarantee time > ntime, and particularly >> allow for negative ntime? > > Ah, right -- yes, if the other guys'' credit is positive, "ntime" is > guaranteed to be lower. Since c2t() involves integer division, it > would definiteyl be good to get rid of the extra call if we can. > > My general principle is to make the code clear and easily readable > first, and then do optimization afterwards -- in this case I just > never came back and did the optimization step. > > Were you intending to submit a patch for this, or shall I?If we''re doing optimization, we can probably get away with doing all of the calculation in "credit" (including the MIN/MAX checking), and then only do the conversion if we''re not at one of those set points. That should get rid of any integer division entirely in a large number of cases. -George
>>> On 24.01.13 at 10:49, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 24/01/13 07:40, Jan Beulich wrote: >> George, >> >> I''m getting puzzled by the second c2t() invocation in >> csched_runtime(): Why is the difference of credits being passed >> here? Doesn''t that (unless svc->credit is non-positive, i.e. in all >> but unusual cases) guarantee time > ntime, and particularly >> allow for negative ntime? > > Ah, right -- yes, if the other guys'' credit is positive, "ntime" is > guaranteed to be lower. Since c2t() involves integer division, it would > definiteyl be good to get rid of the extra call if we can. > > My general principle is to make the code clear and easily readable > first, and then do optimization afterwards -- in this case I just never > came back and did the optimization step.Oh, I wasn''t thinking of just the optimization. It seemed wrong to me to do the subtraction there in the first place: "time" is being calculated from a plain value, so why would "ntime" be calculated from a delta?> Were you intending to submit a patch for this, or shall I?I surely can do so, but first need to understand the intentions of that code. Jan
On 24/01/13 10:06, Jan Beulich wrote:>>>> On 24.01.13 at 10:49, George Dunlap <george.dunlap@eu.citrix.com> wrote: >> On 24/01/13 07:40, Jan Beulich wrote: >>> George, >>> >>> I''m getting puzzled by the second c2t() invocation in >>> csched_runtime(): Why is the difference of credits being passed >>> here? Doesn''t that (unless svc->credit is non-positive, i.e. in all >>> but unusual cases) guarantee time > ntime, and particularly >>> allow for negative ntime? >> Ah, right -- yes, if the other guys'' credit is positive, "ntime" is >> guaranteed to be lower. Since c2t() involves integer division, it would >> definiteyl be good to get rid of the extra call if we can. >> >> My general principle is to make the code clear and easily readable >> first, and then do optimization afterwards -- in this case I just never >> came back and did the optimization step. > Oh, I wasn''t thinking of just the optimization. It seemed wrong to > me to do the subtraction there in the first place: "time" is being > calculated from a plain value, so why would "ntime" be calculated > from a delta?Ah, right -- so the idea here was to run until snext->credit was equal to svc->credit. That''s why the delta. The whole algorithm was supposed to be: 1. At a basic level, run until your credit is 0. 2. But if there''s someone else waiting to run, run until your credit ~= their credit. 3. But never run shorter than MIN_TIMER or longer than MAX_TIMER. #2 is one of the "experimental" / "research-y" ideas I was trying out. One of the goals was to reward vcpus that yielded by making sure that they would get better latency. -George
>>> On 24.01.13 at 11:07, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 24/01/13 10:06, Jan Beulich wrote: >>>>> On 24.01.13 at 10:49, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>> On 24/01/13 07:40, Jan Beulich wrote: >>>> George, >>>> >>>> I''m getting puzzled by the second c2t() invocation in >>>> csched_runtime(): Why is the difference of credits being passed >>>> here? Doesn''t that (unless svc->credit is non-positive, i.e. in all >>>> but unusual cases) guarantee time > ntime, and particularly >>>> allow for negative ntime? >>> Ah, right -- yes, if the other guys'' credit is positive, "ntime" is >>> guaranteed to be lower. Since c2t() involves integer division, it would >>> definiteyl be good to get rid of the extra call if we can. >>> >>> My general principle is to make the code clear and easily readable >>> first, and then do optimization afterwards -- in this case I just never >>> came back and did the optimization step. >> Oh, I wasn''t thinking of just the optimization. It seemed wrong to >> me to do the subtraction there in the first place: "time" is being >> calculated from a plain value, so why would "ntime" be calculated >> from a delta? > > Ah, right -- so the idea here was to run until snext->credit was equal > to svc->credit. That''s why the delta.Which then means that under normal circumstances you would always only run each vCPU for CSCHED_MIN_TIMER, which seems quite odd. Wouldn''t it be more fair to do e.g. if ( time > ntime ) time = (time + ntime) / 2; since otherwise at the expiry of the time the two vCPU-s have equal credit, whereas you would generally expect a vCPU that just finished running to have lower credit than the next one to run? But as you validly said earlier, avoiding the c2t() in cases where we can tell up front that "time" would end up below CSCHED_MIN_TIMER (particularly zero or negative) would be desirable. I''d prefer to leave doing that to you though.> The whole algorithm was supposed to be: > 1. At a basic level, run until your credit is 0. > 2. But if there''s someone else waiting to run, run until your credit ~= > their credit. > 3. But never run shorter than MIN_TIMER or longer than MAX_TIMER. > > #2 is one of the "experimental" / "research-y" ideas I was trying out. > One of the goals was to reward vcpus that yielded by making sure that > they would get better latency.Which all makes sense. Jan
On 24/01/13 10:42, Jan Beulich wrote:>>>> On 24.01.13 at 11:07, George Dunlap <george.dunlap@eu.citrix.com> wrote: >> On 24/01/13 10:06, Jan Beulich wrote: >>>>>> On 24.01.13 at 10:49, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>>> On 24/01/13 07:40, Jan Beulich wrote: >>>>> George, >>>>> >>>>> I''m getting puzzled by the second c2t() invocation in >>>>> csched_runtime(): Why is the difference of credits being passed >>>>> here? Doesn''t that (unless svc->credit is non-positive, i.e. in all >>>>> but unusual cases) guarantee time > ntime, and particularly >>>>> allow for negative ntime? >>>> Ah, right -- yes, if the other guys'' credit is positive, "ntime" is >>>> guaranteed to be lower. Since c2t() involves integer division, it would >>>> definiteyl be good to get rid of the extra call if we can. >>>> >>>> My general principle is to make the code clear and easily readable >>>> first, and then do optimization afterwards -- in this case I just never >>>> came back and did the optimization step. >>> Oh, I wasn''t thinking of just the optimization. It seemed wrong to >>> me to do the subtraction there in the first place: "time" is being >>> calculated from a plain value, so why would "ntime" be calculated >>> from a delta? >> Ah, right -- so the idea here was to run until snext->credit was equal >> to svc->credit. That''s why the delta. > Which then means that under normal circumstances you would > always only run each vCPU for CSCHED_MIN_TIMER, which > seems quite odd.Only when both domains are burning cpu -- which in my tests, even for very busy domains, was very rarely the case. This is particularly in the case of HVM domains, where there are regular MMIO accesses that throw everything into a bit of a kilter. :-)> Wouldn''t it be more fair to do e.g. > > if ( time > ntime ) > time = (time + ntime) / 2; > > since otherwise at the expiry of the time the two vCPU-s have > equal credit, whereas you would generally expect a vCPU that > just finished running to have lower credit than the next one to > run?The divide by two thing would either get just variations of runtimes, or always MAX_TIMER. Suppose we have "burner" vcpus v1 and v2 (and we didn''t have the ''max''): Just after the "reset", everyone''s credit is around 10ms; so this would cause v1 to run for (10+(10-10))/2 == 5ms, then v2 to run for (10+(10-5))/2=7.5ms (!), then v1 to run for (5+(5-2.5))/2 = 3.75ms, &c. We could do the average after applying MAX, but then you just get a "steady state" for burners of (MIN+MAX)/2. In any case, scheduling is like economics: apparently simple minor changes can have really big effects, so you need to be very careful about changing things without doing experiments first; and I don''t really have time at the moment. The current one is known to be at least "not awful", so I think we should just stick with it until someone can take a proper look at potential alternatives. :-)> But as you validly said earlier, avoiding the c2t() in cases where we > can tell up front that "time" would end up below CSCHED_MIN_TIMER > (particularly zero or negative) would be desirable. I''d prefer to > leave doing that to you though.I''ve already started. :-) -George
On 24/01/13 10:42, Jan Beulich wrote:>>>> On 24.01.13 at 11:07, George Dunlap <george.dunlap@eu.citrix.com> wrote: >> On 24/01/13 10:06, Jan Beulich wrote: >>>>>> On 24.01.13 at 10:49, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>>> On 24/01/13 07:40, Jan Beulich wrote: >>>>> George, >>>>> >>>>> I''m getting puzzled by the second c2t() invocation in >>>>> csched_runtime(): Why is the difference of credits being passed >>>>> here? Doesn''t that (unless svc->credit is non-positive, i.e. in all >>>>> but unusual cases) guarantee time > ntime, and particularly >>>>> allow for negative ntime? >>>> Ah, right -- yes, if the other guys'' credit is positive, "ntime" is >>>> guaranteed to be lower. Since c2t() involves integer division, it would >>>> definiteyl be good to get rid of the extra call if we can. >>>> >>>> My general principle is to make the code clear and easily readable >>>> first, and then do optimization afterwards -- in this case I just never >>>> came back and did the optimization step. >>> Oh, I wasn''t thinking of just the optimization. It seemed wrong to >>> me to do the subtraction there in the first place: "time" is being >>> calculated from a plain value, so why would "ntime" be calculated >>> from a delta? >> Ah, right -- so the idea here was to run until snext->credit was equal >> to svc->credit. That''s why the delta. > Which then means that under normal circumstances you would > always only run each vCPU for CSCHED_MIN_TIMER, which > seems quite odd. Wouldn''t it be more fair to do e.g. > > if ( time > ntime ) > time = (time + ntime) / 2; > > since otherwise at the expiry of the time the two vCPU-s have > equal credit, whereas you would generally expect a vCPU that > just finished running to have lower credit than the next one to > run? > > But as you validly said earlier, avoiding the c2t() in cases where we > can tell up front that "time" would end up below CSCHED_MIN_TIMER > (particularly zero or negative) would be desirable. I''d prefer to > leave doing that to you though.Hmm, actually doing the stuff with MIN and MAX timer is more tricky; I forgot that vcpus with different weights burn credit at different rates, so we''d have to pre-calculate t2c(MIN) and t2c(MAX) for each vcpu. That sounds like a bit more code than I''m up for ATM (and more complexity than I''d like to add unless there''s a measurable benefit, which I don''t have time to measure ATM); but I can certainly get rid of the second c2t(). -George