Hi, all

I want to know whether xen scheduler supports preemption? If it supports, in the source code, which part decides it has the preemption function?

_______________________________________________
Xen-users mailing list
Xen-users@lists.xen.org
http://lists.xen.org/xen-users
On Tue, 2013-06-25 at 22:29 +0800, 张伟 wrote:
> Hi, all
>
Hi,

> I want to know whether xen scheduler supports preemption?
>
Yes, it definitely does.

> If it supports, in the source code, which part decides it has the
> preemption function?
>
All the important bits about Xen scheduling can be found, in the source code, in the following files:

xen/common/schedule.c
xen/common/sched_credit.c
xen/common/sched_credit2.c
xen/common/sched_sedf.c
xen/common/sched_arinc653.c

Look particularly carefully at schedule.c, which hosts the generic scheduling framework common to all the scheduling algorithms we support, and at sched_credit.c, which is where the scheduling algorithm that is used by default is implemented.

Regarding what you're saying about the "preemption function", I'm sorry, but I cannot parse that part of the sentence... What do you mean by "which part decides it has the preemption function"?

Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
Thank you very much for your reply! See below.

At 2013-06-26 07:27:26, "Dario Faggioli" <dario.faggioli@citrix.com> wrote:
>Regarding what you're saying about the "preemption function", I'm sorry,
>but I cannot parse that part of the sentence... What do you mean by
>"which part decides it has the preemption function"?

My meaning is: which code decides that the Xen scheduler has the preemption ability? In the sched_credit.c file, csched_vcpu_wake() calls __runq_tickle(), and at the end of __runq_tickle() there is the following code:

    if ( !cpumask_empty(&mask) )
        cpumask_raise_softirq(&mask, SCHEDULE_SOFTIRQ);

It raises the SCHEDULE_SOFTIRQ interrupt. Is this the place that decides it has the preemption ability, or is it some other part?

If it raises a SCHEDULE_SOFTIRQ interrupt, when is this software interrupt dealt with? In time, or only when the current vcpu gives up the physical cpu?
On Wed, 2013-06-26 at 07:37 +0800, 张伟 wrote:
> At 2013-06-26 07:27:26, "Dario Faggioli" <dario.faggioli@citrix.com> wrote:
> >Regarding what you're saying about the "preemption function", I'm sorry,
> >but I cannot parse that part of the sentence... What do you mean by
> >"which part decides it has the preemption function"?
> My meaning is that which code decides the xen scheduler has the preemption ability.
>
Well, the point is that it is hard to restrict something like what you call the "preemption ability" to a single function (or anything like that). I mean, when you design a scheduler you either design it to be preemptible or not, and this design choice is reflected in many places in the code... Anyways...

> In the sched_credit.c file, there is a function, csched_vcpu_wake()->__runq_tickle(),
> in the function, __runq_tickle(), at the end, there is the following code:
>
... Yes, that is at least most of it. In fact, when a vcpu wakes up, it is added to a specific runq, and the 'tickling' mechanism is there precisely to ensure that the said vcpu starts to run as soon as possible, either because there are idle pcpus or because the running vcpus have lower priority, the latter case being the definition of preemption.

> if ( !cpumask_empty(&mask) )
>     cpumask_raise_softirq(&mask, SCHEDULE_SOFTIRQ);
> It will raise SCHEDULE_SOFTIRQ interrupt, whether here decides it has the
> preemption ability, or other parts?
>
Again, this is probably the most important part of it. The scheduler runs every time the SCHEDULE_SOFTIRQ interrupt is raised (for a given pcpu), and the fact that this happens as a consequence of a vcpu waking up is what makes this particular path a (possible) 'preemption point'.

If you, for instance, avoid raising the SCHEDULE_SOFTIRQ for busy pcpus (I would still tickle the idle ones, or you'll get funny results! :-O), you definitely are making the (credit) scheduler less preemptible.

Of course, wake-ups are not the only cause of SCHEDULE_SOFTIRQ being raised. E.g., it fires periodically at the scheduling time slice boundaries. If you want to avoid vcpus being interrupted by others with higher priority in this case too, you probably have more paths to tweak than just the csched_vcpu_wake() function.

> If it raise a SCHEDULE_SOFTIRQ interrupt, when will deal with this software
> interrupt? In time or the current vcpu gives up the physical cpu?
>
And here I'm failing to understand what you mean again... When a SCHEDULE_SOFTIRQ is raised for a given pcpu, that pcpu will deal with it, well, ASAP (look at how softirqs & tasklets work in the hypervisor source code). What do you mean by "give up the physical cpu"?

Regards,
Dario
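The wake-up behaviour described in this reply (run on an idle pcpu if there is one, otherwise preempt a lower-priority vcpu) can be modelled with a small stand-alone C sketch. This is only an illustration under stated assumptions, not the code of __runq_tickle(): the function name tickle_mask(), the enum prio levels and the fixed NR_PCPUS are all hypothetical, and the real scheduler takes more factors into account.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical priority levels, for illustration only (not Xen's names). */
enum prio { PRIO_IDLE = 0, PRIO_OVER = 1, PRIO_UNDER = 2, PRIO_BOOST = 3 };

#define NR_PCPUS 4

/*
 * Return a bitmask of the pcpus that should be "tickled" (i.e. asked to
 * re-run the scheduler) when a vcpu of priority `woken` wakes up on the
 * runqueue of pcpu `home`. running[i] is the priority of whatever is
 * currently running on pcpu i (PRIO_IDLE if nothing is).
 */
uint32_t tickle_mask(enum prio woken, int home,
                     const enum prio running[NR_PCPUS])
{
    uint32_t mask = 0;

    assert(home >= 0 && home < NR_PCPUS);

    /* Idle pcpus can simply pick the woken vcpu up: no preemption. */
    for (int i = 0; i < NR_PCPUS; i++)
        if (running[i] == PRIO_IDLE)
            mask |= (uint32_t)1 << i;

    /* No idle pcpu: preempt the home pcpu only if it runs lower priority. */
    if (mask == 0 && running[home] < woken)
        mask |= (uint32_t)1 << home;

    /* An empty mask means the woken vcpu waits for its turn in the runq. */
    return mask;
}
```

In this model, dropping the second `if` removes the preemption point while keeping idle pcpus busy, whereas dropping the loop as well would leave idle pcpus idle even though a vcpu is waiting.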
Hello,

Please, I would like to continue on this topic!

How does the Xen scheduler ensure that the cap value of a VM (when not null) is respected? First in the case of a single-vCPU VM, and then in the case of a multi-vCPU VM?
On Wed, 2013-06-26 at 10:09 +0200, Christy Business wrote:
> Hello
>
Hi,

> Please, i want to continue in this topic!
>
I'd be happy to help, but unfortunately I never got to look deeply into how capping works. :-(

> How does the xen schedule ensures that the cap(not null) value of a
> VM is respected? Firstly in a case of a single VCPU VM and in the case
> of the multiple VPCU VM?
>
What I know and, based on that, what I suggest is:

 - credit1 has the capping capability; credit2 does not have anything
   like that yet. SEDF has something similar, although I wouldn't call
   it a cap. Therefore, to learn how it works, concentrate on
   sched_credit.c;
 - the mechanism was designed, and got most of its serious testing, for
   single-vCPU VMs. It is not that it does not work with SMP guests:
   actually, it behaved just fine every time I had the chance to try it,
   even in that scenario. But bear this in mind when you investigate
   the algorithm;
 - it is all based on keeping track of how much credit a VM can consume
   before needing to be 'parked', to avoid overrunning the cap itself.
   Look for what happens to the variables called 'cap', 'credit_cap'
   and, of course, 'credit'. Most of the math that makes it work seems
   to reside in the csched_acct() function (in sched_credit.c, of
   course).

And this is all I can say about it, I'm afraid.

Regards,
Dario
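The idea in the last bullet (track credit, and park a vcpu before it overruns the cap) can be illustrated with a tiny stand-alone model. This is a sketch under assumptions, not the arithmetic of csched_acct(): the constant CREDITS_PER_ACCT, its value, and the helper names cap_to_credits() and should_park() are hypothetical, and the per-vCPU handling for SMP guests is left out entirely.

```c
#include <assert.h>

/*
 * Assumed for illustration: the number of credits that corresponds to one
 * fully used pcpu over one accounting period. The value is arbitrary.
 */
#define CREDITS_PER_ACCT 300

/*
 * Translate a cap, expressed as a percentage of one pcpu (100 = one full
 * pcpu, 0 = uncapped), into a credit allowance, rounding up.
 */
int cap_to_credits(unsigned cap_percent)
{
    assert(cap_percent <= 10000); /* keeps the multiplication in range */
    return (int)((cap_percent * CREDITS_PER_ACCT + 99) / 100);
}

/*
 * Decide whether a vcpu should be parked: only capped domains are ever
 * parked, and only once their credit has gone further below zero than
 * the allowance derived from the cap.
 */
int should_park(unsigned cap_percent, int credit)
{
    if (cap_percent == 0)
        return 0; /* uncapped: never parked because of the cap */
    return credit < -cap_to_credits(cap_percent);
}
```

With these assumed numbers, a domain capped at 50 gets an allowance of 150 credits, so should_park(50, -151) is true while should_park(50, -150) is not.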
Thank you very much for your detailed explanation! See below.

At 2013-06-26 09:49:52, "Dario Faggioli" <dario.faggioli@citrix.com> wrote:
>... Yes, that is at least most of it. In fact, when a vcpu wakes up, it
>is added to a specific runq, and the 'tickling' mechanism is there right
>to ensure that the said vcpu starts to run as soon as possible, either
>if there are idle pcpus, or the running vcpus have lower priority, the
>latter case being the definition of preemption.

When a vcpu wakes up, it is added to a specific runq. Is this specific runq the runnable queue?

"Either if there are idle pcpus, or the running vcpus have lower priority": I do not understand your meaning. Do you mean that if there are idle pcpus, the woken-up vcpu will be scheduled to run on one of them, and if not, it will preempt the currently running vcpu if the woken-up vcpu has a higher priority than the current one? Is my understanding right?

>If you, for instance, avoid raising the SCHEDULE_SOFTIRQ for busy pcpus
>(I would still tickle the idle ones, or you'll get funny results! :-O),
>you definitely are making the (credit) scheduler less preemptible.

I cannot understand this. "Still tickle the idle ones, or you'll get funny results!" What does it mean?

>Of course, wake-ups is not the only cause of SCHEDULE_SOFTIRQ being
>raised. E.g., it fires periodically at the scheduling time slice
>boundaries. If you want to avoid vcpus being interrupted by others with
>higher priority for this case too, you probably have more paths to tweak
>than just the csched_vcpu_wake() function.

Yes. I cannot remember exactly how many places raise the SCHEDULE_SOFTIRQ interrupt. A long time ago I checked them; there are about seven.

>And here I'm failing at understanding what you mean again... When a
>SCHEDULE_SOFTIRQ is raised for a given pcpu, that pcpu will deal with
>it, well, ASAP (look at how softirqs & tasklets work in the hypervisor
>source code). What do you mean by "give up the physical cpu"?

I mean: after the SCHEDULE_SOFTIRQ interrupt is raised, does the handler function schedule() execute in time, or does it need to wait until the current vcpu is scheduled out? Which part decides the priority among them?

Regarding how softirqs & tasklets work in the hypervisor source code: can you give me some guidance on where the code for softirqs & tasklets is?

Another question: in the schedule() function of the schedule.c file, the flag tasklet_work_scheduled is first set according to whether there is tasklet work. What is the tasklet work? In csched_schedule() in the sched_credit.c file, the idle vcpu is given boost priority if tasklet_work_scheduled is set. I have some difficulty understanding this part. Maybe my confusion comes from not knowing what the tasklet work is. Can you explain why it is designed like this?
So, first of all... Can you use plain text instead of HTML for e-mails?

On Wed, 2013-06-26 at 21:16 +0800, 张伟 wrote:
> Thank you very much for your detail explanation! See below.
>
You're welcome. Although, at this point, I'm curious about why you're interested in this... What is it that you want to achieve?

> >... Yes, that is at least most of it. In fact, when a vcpu wakes up, it
> >is added to a specific runq, and the 'tickling' mechanism is there right
> >to ensure that the said vcpu starts to run as soon as possible, either
> >if there are idle pcpus, or the running vcpus have lower priority, the
> >latter case being the definition of preemption.
> When a vcpu wakes up, it is added to a specific runq. Whether the specific
> runq is the runnable queue?
>
Well, the vcpu wakes up, so yes, it is the runnable queue of a specific pCPU. Which 'specific pCPU' depends, and I suggest you look more deeply into the scheduler code. Off the top of my head, I'd say it is the runqueue of the pCPU where the vCPU was when it went to sleep.

> either if there are idle pcpus, or the running vcpus have lower priority?
>
In credit1, it works like this:
 - you (the vCPU) wake up and I (the Xen scheduler) queue you on the runq
   of the pCPU where you were before going to sleep;
 - if that pCPU is busy, I poke other pCPUs to see if you can run there
   (that's the meaning of 'tickling');
 - if the above is not possible, I check if preemption is required. If
   yes, I preempt the vCPU running on that pCPU; if not, you have to wait
   for your turn (or for some other pCPU becoming idle and picking you
   up) in the runq.

Does that make sense?

> I do not understand your meaning. You mean that if there are idle pcpus,
> the waked up vcpu will be scheduled on the idle pcpus to run.
>
For sure, the scheduler will try as hard as it can to achieve this, yes.

> If not, it will preempted the current running vcpus if the waked up vcpu
> has the higher priority compared to the the current vcpu. Whether my
> understanding is right?
>
I believe it is. Actually, I believe this is either the definition or, in any case, the only sensible thing that a reasonable enough preemptible scheduler should do. :-)

For the deep technicalities of how this is implemented in credit1, please refer to my hopefully accurate explanation above, or, even better, to sched_credit.c.

> > If you, for instance, avoid raising the SCHEDULE_SOFTIRQ for busy
> > pcpus
> > (I would still tickle the idle ones, or you'll get funny results! :-O),
> > you definitely are making the (credit) scheduler less preemptible.
> I can not understand here. still tickle the idle ones, or you'll get funny
> results! What's the meaning?
>
The meaning is that, given the explanation above, inhibiting preemption by, for instance, not tickling the busy pCPUs might actually work. On the other hand, if you have idle pCPUs, having them run the woken-up vCPU is not a preemption, right? Well, if you do not tickle those pCPUs you won't get there: you will not only get rid of preemption on busy pCPUs, you will also have idle pCPUs that remain idle even if there are vCPUs waiting to be executed.

This means you're killing not only preemption, but also work conserving-ness, and that might not be among your original goals (or was it?).

> >Of course, wake-ups is not the only cause of SCHEDULE_SOFTIRQ being
> >raised. E.g., it fires periodically at the scheduling time slice
> >boundaries. If you want to avoid vcpus being interrupted by others with
> >higher priority for this case too, you probably have more paths to tweak
> >than just the csched_vcpu_wake() function.
> >
> Yes, I can not remember the number of raising SCHEDULE_SOFTIRQ interrupt.
> Long time ago, I check the places of raising SCHEDULE_SOFTIRQ interrupt.
> It is about seven places.
>
Fine. Then, to be sure, I'd check all of them and see what they end up doing. I know they're all calling csched_schedule(); what I mean is I'd check the conditions and the parameters, to verify which ones of these 7 possible situations could lead to preemption.

What you can be quite sure of is that there's not going to be a preemption without a call to csched_schedule() being involved, so you may even try to instrument the code at that level... It really all depends on your final purpose.

> >And here I'm failing at understanding what you mean again... When a
> >SCHEDULE_SOFTIRQ is raised for a given pcpu, that pcpu will deal with
> >it, well, ASAP (look at how softirqs & tasklets work in the hypervisor
> >source code). What do you mean by "give up the physical cpu"?
> I mean after raising the SCHEDULE_SOFTIRQ interrupt, the handler function
> schedule() will execute in time or need to wait the current vcpu scheduled
> out. Which part decides the priority among them?
>
Mmm... I spot some confusion here. Why should the scheduling out of a vcpu be involved in all this? I mean, raising a SCHEDULE_SOFTIRQ and, most importantly, handling it, happens in Xen code. That means there is a pCPU executing hypervisor code, independently of which vCPU is or was running on that same pCPU. Well, this same hypervisor code will get to execute, at some point, csched_schedule(), make the scheduling decision and, if that is the case, deschedule the running vCPU and schedule another one (and there you have a preemption).

Actually, we really can't wait for a vCPU to be descheduled to execute the Xen scheduler, since it's the Xen scheduler itself that deschedules vCPUs! :-O

Perhaps, with "scheduled out" you mean something like block, i.e., you want to know if Xen is able to interrupt the vCPUs or if it always runs them to completion or blocking. In which case, the former: we interrupt the vCPUs, just like a (preemptible) OS scheduler interrupts the OS's tasks. Whether or not that will result in a preemption depends both on the scheduler and on the circumstances.

Sounds better now?

> Can you give me some guidance, where is the code for softirqs & tasklets.
>
Well, grep and find are usually good friends, when the question is where is the code! :-P

Both

$ grep -r tasklet xen.git/xen/

and

$ grep -r softirq xen.git/xen/

produce a lot of output here. Also, I'd try something like this... You know, programmers usually have rather little imagination:

$ find ./xen.git/xen/ -iname 'tasklet*'
./xen/include/xen/tasklet.h
./xen/common/tasklet.c

$ find ./xen.git/xen/ -iname 'softirq*'
./xen/include/asm-x86/softirq.h
./xen/include/xen/softirq.h
./xen/include/asm-arm/softirq.h
./xen/common/softirq.c

> Another question:
> In the schedule() function of schedule.c file, at first, it will set the
> flag tasklet_work_scheduled according to whether has the tasklet_work.
> What is the tasklet work?
>
After having inspected at least some of the sources above, look for the do_tasklet() function, and review what it does. If it's the concept of tasklet and softirq that you're unfamiliar with, well, very quickly, it's just one way of deferring work in an OS (or, in our case, a hypervisor, but still).

Linux makes use of this kind of thing pretty heavily (although the names, the implementation, and the number of different variants change with kernel versions). I trust/hope you can find enough documentation about that online. :-)

> In the csched_schedule() of sched_credit.c file, it will give the idle
> vcpu boost priority if the tasklet_work_scheduled is set.
> I have some difficult for understanding this part. Maybe my confusion is
> not knowing the tasklet work. Can you give some explanation why designing
> like this?
>
Again, a tasklet is deferred work. That means there is this pretty function you want to call, but you can't call it right now. A typical example is when you have interrupts disabled and the pretty function in question wants interrupts enabled, or when it is you who don't want to keep interrupts disabled for too long, or any other reason.

Ok, what you do is make a note about calling that function later, and that's exactly what a tasklet does. The reason why we execute them in the idle domain's context is, well, because we have to execute them somewhere! :-)

Seriously, our scheduler schedules vCPUs, not 'functions', so you either call a function from where you are (and we already said you can't) or, when you're done, the scheduler will pick a vCPU and get on with it, and your function will never be called. What we hence do is make sure it is one of the idle domain's vCPUs that is scheduled, as well as make sure that such a vCPU will call your function as part of 'its workload'.

Check out the idle_loop() function, it's in xen/arch/x86/domain.c.

Regards,
Dario
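The "make a note and call it later" idea behind tasklets can be sketched in a few lines of stand-alone C. This is a toy model under stated assumptions, not the code in xen/common/tasklet.c: the names tasklet_defer(), tasklet_work_pending() and run_pending_tasklets() are hypothetical, the queue is a fixed-size array, and there is no locking or per-pcpu state. The comments only indicate roughly where the pieces mentioned in the thread (tasklet_work_scheduled, the idle loop) would fit.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKLETS 8

typedef void (*tasklet_fn)(void *data);

/* One deferred call: the function to run later and its argument. */
struct tasklet {
    tasklet_fn fn;
    void *data;
};

static struct tasklet pending[MAX_TASKLETS];
static size_t n_pending;

/*
 * Make a note to call fn(data) later, e.g. because we cannot call it in
 * the current context. Returns 0 on success, -1 if the queue is full.
 */
int tasklet_defer(tasklet_fn fn, void *data)
{
    assert(fn != NULL);
    if (n_pending == MAX_TASKLETS)
        return -1;
    pending[n_pending].fn = fn;
    pending[n_pending].data = data;
    n_pending++;
    return 0;
}

/*
 * What a scheduler would check to decide that the idle vcpu must be
 * picked, so that the deferred work gets a context to run in.
 */
int tasklet_work_pending(void)
{
    return n_pending != 0;
}

/*
 * Called from the (model) idle loop: run everything that was queued, in
 * FIFO order, and return how many deferred calls were made.
 */
size_t run_pending_tasklets(void)
{
    size_t ran = 0;

    for (size_t i = 0; i < n_pending; i++) {
        pending[i].fn(pending[i].data);
        ran++;
    }
    n_pending = 0;
    return ran;
}

/* Example deferred function: increments the int that data points to. */
void add_one(void *data)
{
    ++*(int *)data;
}
```

Nothing runs at the time tasklet_defer() is called; the work happens only when the model idle loop calls run_pending_tasklets(), which is why the scheduler has to make sure the idle vcpu actually gets picked while work is pending.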
Thank you very much for your guidance! At 2013-06-27 18:30:05,"Dario Faggioli" <dario.faggioli@citrix.com> wrote:>So, first of all... Can you use plain text instead of HTML for e-mails? > >On mer, 2013-06-26 at 21:16 +0800, 张伟 wrote: >> Thank you very much for your detail explanation! See below. >> >You're welcome. Although, at this point, I'm curious about why you're >interested in this... What is it that you want to achieve?At first, I have a wrong understanding for xen scheduler preemption. I thought it did not support preemption. Last week, my advisor corrects my thought. So I want to know if a system supports preemption, the code which key part need to do the modification. At first, I add something in xen scheduler(only simple). My modification will bring some virtual machines starvation. Now I want to decrease the starvation. I need to add some other things. I meet a serious problem, in the schedule() or csched_schedule() function, if access the variable csched_dom structure, the system will automatically reboot. Eg, if add the printk("The domain weight is %d", sdom->weight); in the csched_schedule() or schedule(), the system will automatically reboot and can not enter the system. Do you know why? It is very strange. In these two functions, I can successfully access the variable of csched_vcpu structure and csched_private.> >> >... Yes, that is at least most of it. In fact, when a vcpu wakes up, it >> >is added to a specific runq, and the 'tickling' mechanism is there right >> >to ensure that the said vcpu starts to run as soon as possible, either >> >if there are idle pcpus, or the running vcpus have lower priority, the >> >latter case being the definition of preemption. >> When a vcpu wakes up, it is added to a specific runq. Whether the specific runq is the runnable queue? >> >Well, the vcpu wakes-up, so yes, it is the runnable queue of a specific >pCPU. Which 'specific pCPU' depends, and I suggest you looking more >deeply in the scheduler code. 
From the top of my head, I'd say it is the >runqueue of the pCPU where the vCPU was when it went to sleep. > >> either if there are idle pcpus, or the running vcpus have lower priority? >> >In credit1, it works like this: > - you (the vCPU) wake-up and I (Xen scheduler) queue you on the runq > of the pCPU when you where before going to sleep; > - if that pCPU is busy, I poke other pCPUs to see if you can run there > (that's the meaning of 'tickling'); > - if the above is not possible, I check if preemption is required. If > yes, I preempt the vCPU running on the runq, if not, you have to wait > for your turn (or for some other pCPU becoming idle and picking you > up) in the runq. > >Does that make sense? >Yes, now it make sense. Thank you very much for trying to let me understand what you said.>> I do not understand your meaning. You mean that if there are idle pcpus, the waked up vcpu will be scheduled on the idle pcpus to run. >> >For sure, the scheduler will try as hard as he can to achieve this, yes. > >> If not, it will preempted the current running vcpus if the waked up vcpu has the higher priority compared to the the current vcpu. Whether my understanding is right? >> >I believe it is. Actually, I believe this is either the definition or, >in any case, the only sensible thing that a reasonable enough >preemptible scheduler should do. :-) > >For the deep technicalities of how this is implemented in credit1, >please refer to my hopefully accurate explanation above, or, even >better, to sched_credit.c. > >> > If you, for instance, avoid raising the SCHEDULE_SOFTIRQ for busy >> > pcpus >> > (I would still tickle the idle ones, or you'll get funny results! :-O), >> > you definitely are making the (credit) scheduler less preemptible. >> I can not understand here. still tickle the idle ones, or you'll get funny results! What's the meaning? 
>> >The meaning is that, given the explanation above, inhibiting preemption >by, for instance, not tickling the busy pCPUs might actually work. On >the other hand, if you have idle pCPUs, having them running the woken-up >task is not a preemption, right? Well, if you do not tickle those pCPUs >you won't get there, and you not only will get rid of peemption on busy >pCPUs, you will also have idle pCPUs that remains idle, even if there >are vCPUs waiting to be executed. > >This means you're killing not only preemption, but also work >conserving-ness, and that might not be among your original goals (or was >it?). > >> >Of course, wake-ups is not the only cause of SCHEDULE_SOFTIRQ being >> >raised. E.g., it fires periodically at the scheduling time slice >> >boundaries. If you want to avoid vcpus being interrupted by others with >> >higher priority for this case too, you probably have more paths to tweak >> >than just the csched_vcpu_wake() function. >> > >> Yes, I can not remember the number of raising SCHEDULE_SOFTIRQ interrupt. Long time ago, I check the places of raising SCHEDULE_SOFTIRQ interrupt. It is about seven places. >> >Fine. Then, to be sure, I'd check all of them and see what they end up >doing. I know they're all calling csched_schedule(), what I mean is I'd >check the conditions and the parameters, to verify which ones of these 7 >possible situations could lead to preemption. > >What you can be quite sure of, is ha there's not going to be a >preemption without a call to csched_schedule() being involved, so you >may even try to instrument the code at that level.. It really all >depends on your final purpose. > >> >And here I'm failing at understanding what you mean again... When a >> >SCHEDULE_SOFTIRQ is raised for a given pcpu, that pcpu will deal with >> >it, well, ASAP (look at how softirqs & tasklets work in the hypervisor >> >source code). What do you mean by "give up the physical cpu"? 
>> I mean, after the SCHEDULE_SOFTIRQ is raised, will the handler function
>> schedule() execute right away, or does it need to wait until the
>> current vCPU is scheduled out? Which part decides the priority among
>> them?
>>
>Mmm... I spot some confusion here. Why should the scheduling out of a
>vCPU be involved in all this? I mean, raising a SCHEDULE_SOFTIRQ and,
>most importantly, handling it, happens in Xen code. That means there is
>a pCPU executing hypervisor code, independently of which vCPU is or was
>running on that same pCPU. Well, this same hypervisor code will get to
>execute, at some point, csched_schedule(), make the scheduling decision
>and, if that is the case, deschedule the running vCPU and schedule
>another one (and there you have a preemption).
>
>Actually, we really can't wait for a vCPU to be descheduled to execute
>the Xen scheduler, since it's the Xen scheduler itself that deschedules
>vCPUs! :-O
>
>Perhaps, with "scheduled out" you mean something like block, i.e., you
>want to know if Xen is able to interrupt the vCPUs or if it always runs
>them to completion or blocking. In which case, it's the former: we
>interrupt the vCPUs, just like a (preemptible) OS scheduler interrupts
>the OS's tasks. Whether or not that results in a preemption depends
>both on the scheduler and on the circumstances.
>
>Sounds better now?
>
>> Can you give me some guidance on where the code for softirqs and
>> tasklets is?
>>
>Well, grep and find are usually good friends, when the question is
>where is the code! :-P
>
>Both
>
>$ grep -r tasklet xen.git/xen/
>
>and
>
>$ grep -r softirq xen.git/xen/
>
>produce a lot of output here. Also, I'd try something like the
>following... You know, programmers usually have rather little
>imagination when naming files:
>
>$ find ./xen.git/xen/ -iname 'tasklet*'
>./xen/include/xen/tasklet.h
>./xen/common/tasklet.c
>
>$ find ./xen.git/xen/ -iname 'softirq*'
>./xen/include/asm-x86/softirq.h
>./xen/include/xen/softirq.h
>./xen/include/asm-arm/softirq.h
>./xen/common/softirq.c
>
>> Another question:
>> In the schedule() function in schedule.c, the first thing it does is
>> set the flag tasklet_work_scheduled according to whether there is
>> tasklet work. What is the tasklet work?
>>
>After having inspected at least some of the sources above, look for the
>do_tasklet() function, and review what it does. If it's the concept of
>tasklet and softirq that you're unfamiliar with, well, very quickly,
>it's just one way of deferring work in an OS (or, in our case, a
>hypervisor, but still).
>
>Linux makes use of this kind of thing pretty heavily (although the
>names, the implementation, and the number of different variants change
>with kernel versions). I trust/hope you can find enough documentation
>about that online. :-)
>
>> In csched_schedule() in sched_credit.c, the idle vCPU is given boost
>> priority if tasklet_work_scheduled is set.
>> I have some difficulty understanding this part. Maybe my confusion
>> comes from not knowing what tasklet work is. Can you explain why it
>> is designed like this?
>>
>Again, a tasklet is deferred work. That means there is this pretty
>function you want to call, but you can't call it right now. A typical
>example is that you have interrupts disabled and the pretty function in
>question wants interrupts enabled, or it is you who doesn't want to
>keep interrupts disabled for too long, or any other reason.
>
>Ok, what you do is make a note about calling that function later, and
>that's exactly what a tasklet does. The reason why we execute them in
>the idle domain's context is, well, because we have to execute them
>somewhere! :-)
>
>Seriously, our scheduler schedules vCPUs, not 'functions', so you
>either call a function from where you are (and we already said you
>can't) or, when you're done, the scheduler will pick a vCPU and get on
>with it, and your function will never be called. What we hence do is
>make sure it is one of the idle domain's vCPUs that is scheduled, as
>well as make sure that such a vCPU will call your function as part of
>'its workload'.
>
>Check out the idle_loop() function, it's in xen/arch/x86/domain.c.

Thank you very much once again for your detailed explanation!

>
>Regards,
>Dario
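The "make a note and run it later from the idle vCPU" idea described above can be sketched as a small, self-contained C model. This is only an illustration under stated assumptions: every name below (toy_tasklet, toy_cpu, toy_pick_next and so on) is hypothetical, and none of it is taken from Xen's tasklet.c or sched_credit.c. In particular, draining the whole queue in one go is a simplification chosen for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not Xen's. */

typedef void (*toy_work_fn)(void *arg);

/* One piece of deferred work: a function plus its argument. */
struct toy_tasklet {
    toy_work_fn fn;
    void *arg;
    struct toy_tasklet *next;
};

/* Per-pCPU FIFO of pending deferred work. */
struct toy_cpu {
    struct toy_tasklet *head;
    struct toy_tasklet *tail;
};

enum toy_vcpu { TOY_VCPU_GUEST, TOY_VCPU_IDLE };

/* "Make a note about calling that function later." */
static void toy_tasklet_schedule(struct toy_cpu *cpu, struct toy_tasklet *t,
                                 toy_work_fn fn, void *arg)
{
    t->fn = fn;
    t->arg = arg;
    t->next = NULL;
    if (cpu->tail != NULL)
        cpu->tail->next = t;
    else
        cpu->head = t;
    cpu->tail = t;
}

/* Plays the role of a "tasklet work is scheduled" flag. */
static bool toy_tasklet_work_pending(const struct toy_cpu *cpu)
{
    return cpu->head != NULL;
}

/*
 * Scheduling decision: if deferred work is pending, the idle vCPU wins
 * even when a guest vCPU is runnable, because only the idle vCPU will
 * actually run the deferred functions.
 */
static enum toy_vcpu toy_pick_next(const struct toy_cpu *cpu,
                                   bool guest_runnable)
{
    if (toy_tasklet_work_pending(cpu))
        return TOY_VCPU_IDLE;
    return guest_runnable ? TOY_VCPU_GUEST : TOY_VCPU_IDLE;
}

/* What the idle vCPU does as part of "its workload": run queued work. */
static int toy_idle_run_tasklets(struct toy_cpu *cpu)
{
    int ran = 0;

    while (cpu->head != NULL) {
        struct toy_tasklet *t = cpu->head;

        cpu->head = t->next;
        if (cpu->head == NULL)
            cpu->tail = NULL;
        t->fn(t->arg);
        ran++;
    }
    return ran;
}

/* Example deferred function: increments the int it is given. */
static void toy_add_one(void *arg)
{
    ++*(int *)arg;
}
```

In this model, queueing work with toy_tasklet_schedule() makes toy_pick_next() return the idle vCPU even though a guest is runnable; once toy_idle_run_tasklets() has emptied the queue, the guest is picked again. That is the whole point of the boost: the scheduler only knows how to run vCPUs, so the deferred function gets a ride on one.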
Ok, you're still failing at using plain text... Could you please try
harder? :-)

On dom, 2013-06-30 at 23:00 +0800, 张伟 wrote:
> At first, I had a wrong understanding of Xen scheduler preemption. I
> thought it did not support preemption. Last week, my advisor corrected
> me.
>
Ok, so you're "just" studying scheduling in OSes/hypervisors, with
particular focus on how preemption works and is implemented... Or are
you also aiming at envisioning/introducing some new functionality,
perhaps, for research/coursework (since you mentioned an 'advisor')?

> So I want to know, if a system supports preemption, which key parts of
> the code need to be modified. At first, I added something (only
> something simple) to the Xen scheduler.
>
Ok, so you added something... What for? What was the final purpose?
"Just" understanding, or are you modifying the behavior?

You know, the reason I'm asking is because, whatever you're doing, we
might be able to help you better if we know what you are aiming at, not
to mention that what you're trying to achieve might be interesting
and/or beneficial for other Xen users as well! :-)

> My modification causes starvation for some virtual machines. Now I
> want to reduce that starvation.
>
Oh... And the purpose was to actually introduce starvation? EhEh, I
don't think so! :-P If not, what was it?

> I need to add some other things. I have hit a serious problem: in the
> schedule() or csched_schedule() function, if I access a field of the
> csched_dom structure, the system automatically reboots. E.g., if I add
> printk("The domain weight is %d", sdom->weight); to csched_schedule()
> or schedule(), the system reboots automatically and never finishes
> booting. Do you know why? It is very strange. In these two functions,
> I can successfully access the fields of the csched_vcpu and
> csched_private structures.
>
I cannot comment on the specific 'access the variable' without seeing
the code. However, the behavior you're reporting is a clear symptom of
the hypervisor crashing. To investigate the reason, you need to be able
to see what it says _before_ rebooting. You'll most likely need the
'noreboot' boot parameter and a serial console... Do you already have
those?

You can find some tips on how to put yourself in a more
debugging-friendly situation here:

http://wiki.debian.org/Xen#dom0_automatic_reboots
http://wiki.xen.org/wiki/Xen_Serial_Console
http://wiki.xen.org/wiki/Debugging_Xen

Regards,
Dario
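Since the modified code was not posted, the actual cause of the reboot is unknown. One possibility worth ruling out, offered here as an assumption and not as a diagnosis, is that the per-domain pointer is NULL for some vCPUs (for example, idle vCPUs that carry no per-domain scheduler data), so that sdom->weight dereferences NULL inside the hypervisor. The self-contained C sketch below shows the defensive pattern with hypothetical stand-in types (toy_dom, toy_vcpu); it is not Xen's csched_dom or csched_vcpu, and snprintf stands in for printk.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for scheduler bookkeeping. */
struct toy_dom {
    int weight;
};

struct toy_vcpu {
    struct toy_dom *sdom;  /* assumed to be NULL for vCPUs without domain data */
    int pri;
};

/*
 * Format a debug line for a vCPU. The pointer is checked before it is
 * dereferenced, so a vCPU without per-domain data cannot fault here.
 * Returns what snprintf returns.
 */
static int toy_describe(const struct toy_vcpu *svc, char *buf, size_t len)
{
    if (svc->sdom == NULL)
        return snprintf(buf, len, "pri=%d (no domain data)", svc->pri);

    return snprintf(buf, len, "pri=%d weight=%d", svc->pri,
                    svc->sdom->weight);
}
```

If the crash does come from something like this, the serial console output obtained with 'noreboot' should show a page fault at a very low address, which would be a strong hint; if it shows something else, this guess simply does not apply.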