Shan, Haitao
2008-Sep-09 08:59 UTC
[Xen-devel] [PATCH 1/4] CPU online/offline support in Xen
This patch implements the CPU offline feature.

Best Regards
Haitao Shan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Sep-10 10:43 UTC
[Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
I feel this is more complicated than it needs to be.

How about clearing VCPUs from the offlined CPU's runqueue at the very end of __cpu_disable()? At that point all other CPUs are safely in softirq context with IRQs disabled, and we are running on the correct CPU (the one being offlined). We could have a hook into the scheduler subsystem at that point to break affinities, assign VCPUs to different runqueues, etc. We would just need to be careful not to attempt an IPI. :-) This approach would not need a cpu_schedule_map (which really increases code fragility imo, by creating possible extra confusion about which cpumask is the right one to use in a given situation).

My feeling, unless I've missed something, is that this would make the patch quite a bit smaller, with a smaller spread of code changes.

 -- Keir

On 9/9/08 09:59, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> This patch implements the CPU offline feature.
Keir Fraser
2008-Sep-10 10:59 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On 10/9/08 11:43, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

> How about clearing VCPUs from the offlined CPU's runqueue at the very end
> of __cpu_disable()? [...] We would just need to be careful not to attempt
> an IPI. :-)

Another thought: we may (appear to) need an IPI after VCPUs have been migrated to other runqueues. That is actually safe, because smp_send_event_check_cpu() is non-blocking (it does not wait for the remote CPU to run the IPI handler). So it *is* safe to issue non-blocking IPIs from stop_machine_run() context.

 -- Keir
Haitao Shan
2008-Sep-10 12:59 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
Agreed. Placing the migration in stop_machine context will definitely make our job easier. I will start on a new patch tomorrow. :)

I placed the migration code outside the stop_machine_run context partly because I was not sure how long migrating all the VCPUs away would take. If it takes too long, all useful work is blocked, since all CPUs are held in the stop_machine context. I also borrowed the idea from the Linux kernel, which influenced the decision.

2008/9/10 Keir Fraser <keir.fraser@eu.citrix.com>:

> How about clearing VCPUs from the offlined CPU's runqueue from the very end
> of __cpu_disable()? [...] My feeling, unless I've missed something, is that
> this would make the patch quite a bit smaller and with a smaller spread of
> code changes.
Frank van der Linden
2008-Sep-10 16:05 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
Haitao Shan wrote:

> Agree. Placing migration in stop_machine context will definitely make
> our jobs easier. I will start making a new patch tomorrow. :)

This would also address some problems I saw with the patch: race conditions around migration of VCPUs, because other CPUs may call runq_tickle, or a hypercall may come in and change a VCPU's affinity, since things are done in two stages.

The changes I have are more complicated, because I was working off 3.1.4, which is our current Xen version; it doesn't have things like stop_machine_run. But if the patch is simplified in this manner, it is easier for us to use, and we can just backport things like stop_machine_run for the time being.
The other issue I was seeing was that cpu_up sometimes did not succeed in actually getting a CPU to boot. But there have been a few fixes to smpboot.c, so I'll have to see whether that always works now.

- Frank
Keir Fraser
2008-Sep-11 07:36 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On 10/9/08 17:05, "Frank van der Linden" <Frank.Vanderlinden@Sun.COM> wrote:

> The other issue I was seeing was that cpu_up sometimes did not succeed
> in actually getting a CPU to boot. But there have been a few fixes to
> smpboot.c, so I'll have to see if that always works now.

Thanks, that would be interesting. Perhaps we need to tweak some delay parameters in smpboot.c.

 -- Keir
Shan, Haitao
2008-Sep-11 08:02 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
Hi Keir,

Attached is the updated patch, using the method you described in your other mail. What do you think of this one?

Signed-off-by: Shan Haitao <haitao.shan@intel.com>

Best Regards
Haitao Shan

Haitao Shan wrote:

> Agree. Placing migration in stop_machine context will definitely make
> our jobs easier. I will start making a new patch tomorrow. :)
Keir Fraser
2008-Sep-11 11:12 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
It looks much better. I'll read through, maybe tweak, and most likely then check it in.

Thanks,
Keir

On 11/9/08 09:02, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> Attached is the updated patch, using the method you described in your
> other mail. What do you think of this one?
Shan, Haitao
2008-Sep-11 11:33 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
Thanks!

Concerning CPU online/offline development, I have a small question. Since cpu_online_map is so important, code in many different subsystems uses it extensively. If such code was not designed with CPU online/offline in mind, it may introduce race conditions, just like the one fixed in the CPU calibration rendezvous. Currently we solve these in a find-and-fix manner. Do you have any idea that could solve the problem in a cleaner way? Thanks in advance.

Shan Haitao

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
Sent: 11 September 2008 19:13
To: Shan, Haitao; Haitao Shan
Cc: xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen

It looks much better. I'll read through, maybe tweak, and most likely then check it in.

Thanks,
Keir
Keir Fraser
2008-Sep-11 12:42 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On 11/9/08 12:33, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> If such code is not designed with cpu online/offline in mind, it may
> introduce race conditions, just like the one fixed in cpu calibration
> rendezvous. Currently, we solve it in a find-and-fix manner. Do you have
> any idea that can solve the problem in a cleaner way?

Mostly these will be races against CPUs coming online, since stop_machine_run() is a strong barrier in the offline case. We could use stop_machine_run() for onlining too (even if just to set a bit in cpu_online_map). Or indeed we just carry on fixing the bugs one by one as we find them. I actually doubt there are that many, so I think the current approach is fine.

 -- Keir
Keir Fraser
2008-Sep-11 14:15 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
I applied the patch with the following changes:

* I rewrote your changes to fixup_irqs(). We should force lazy EOIs *after* we have serviced any straggling interrupts. Also, we should actually clear the EOI stack so that it is empty the next time the CPU comes online.
* I simplified your changes to schedule.c in light of the fact that we run in stop_machine context. Hence we can be quite relaxed about locking, for example.
* I removed your change to __csched_vcpu_is_migrateable() and instead put a similar check in csched_load_balance(). I think this is clearer and also cheaper.

I note that the VCPU currently running on the offlined CPU continues to run there even after __cpu_disable(), until that CPU does a final run through the scheduler soon after. I hope it does not matter that there is one VCPU with v->processor == offlined_cpu for a short while (e.g., what if another CPU does vcpu_sleep_nosync(v) -> cpu_raise_softirq(v->processor, ...)). I *think* it's actually okay, but I'm not totally certain.

Really, I guess this patch needs some stress testing (lots of online/offline cycles while pausing/unpausing domains, etc.). Perhaps we could plumb through a Xen sysctl and make a small dom0 utility for this purpose?

 -- Keir

On 11/9/08 12:33, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> Concerning cpu online/offline development, I have a small question here.
> [...] Currently, we solve it in a find-and-fix manner. Do you have any
> idea that can solve the problem in a cleaner way?
Christoph Egger
2008-Sep-11 14:23 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On Thursday 11 September 2008 16:15:14 Keir Fraser wrote:

> I note that the VCPU currently running on the offlined CPU continues to run
> there even after __cpu_disable(), and until that CPU does a final run
> through the scheduler soon after. I hope it does not matter there is one
> vcpu with v->processor == offlined_cpu for a short while

This is not acceptable with regard to machine check: when Dom0 offlines a defective CPU, nothing may continue to run on it, or silent data corruption occurs.

Christoph

--
AMD Saxony, Dresden, Germany
Operating System Research Center
Keir Fraser
2008-Sep-11 14:32 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On 11/9/08 15:23, "Christoph Egger" <Christoph.Egger@amd.com> wrote:

> This is not acceptable regarding to machine check. When Dom0 offlines a
> defect cpu, nothing may continue on it or silent data corruption occurs.

It doesn't run for unbounded time. The offlined CPU will immediately run softirq work as the very next thing it does, causing a run through the scheduler, where it will 100% definitely pick up the idle VCPU and hence play dead in the idle loop.

This won't be guaranteed synchronous with an offline request via a hypercall, though. By which I mean that the hypercall may return, but the last VCPU may not stop running on the CPU until some tiny amount of time later. If this is unacceptable, we would have to add a hook into sched_credit.c to synchronously and forcibly deschedule the currently running VCPU from within stop_machine context. That is probably moderately tricky, so I hope we can avoid it.

 -- Keir
Keir Fraser
2008-Sep-11 14:47 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On 11/9/08 15:32, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

> This won't be guaranteed synchronous with a offline request via a hypercall
> though. By which I mean, that hypercall may return but the last vcpu may not
> stop running on the cpu until some tiny amount of time later.

Actually I'm wrong on this. __cpu_die() ensures the CPU really is totally offline before returning. So calls to cpu_down() really are synchronous already.

 -- Keir
Shan, Haitao
2008-Sep-11 16:00 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
Hi, Keir, Concerning the last running vcpu on the dying cpu, I have some thought. Yes, there would be a short time after the stop_machine_run when this vcpu v->processor == dying_cpu. But anyhow, we set fie __VPF_migrating flag for that vcpu and issued a schedule_softirq on the dying cpu. This softirq should run immediately after stop_machine context, am I right? If so, by the time the schedule softirq is executed, this last vcpu is migrated away from this dying cpu. But saving of its context will be delayed to play_dead->sync_lazy_context. If another cpu issues the schedule request to this dying cpu (vcpu_sleep_nosync->cpu_raise_softirq(vc->processor....)) during this time, the request will be serviced by the above code sequence. So it is safe in such cases. Am I missing something important? I am not quite confident on the statements, though. Shan Haitao -----Original Message----- From: Keir Fraser [mailto:keir.fraser@eu.citrix.com] Sent: 2008年9月11日 22:15 To: Shan, Haitao; Haitao Shan; Tian, Kevin Cc: xen-devel@lists.xensource.com Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen I applied the patch with the following changes: * I rewrote your changes to fixup_irqs(). We should force lazy EOIs *after* we have serviced any straggling interrupts. Also we should actually clear the EOI stack so it is empty next time the CPU comes online. * I simplified your changes to schedule.c in light of the fact we run in stop_machine context. Hence we can be quite relaxed about locking, for example. * I removed your change to __csched_vcpu_is_migrateable() and instead put a similar check in csched_load_balance(). I think this is clearer and also cheaper. I note that the VCPU currently running on the offlined CPU continues to run there even after __cpu_disable(), and until that CPU does a final run through the scheduler soon after. 
I hope it does not matter there is one vcpu with v->processor == offlined_cpu for a short while (e.g., what if another CPU does vcpu_sleep_nosync(v) -> cpu_raise_softirq(v->processor, ...)). I *think* it''s actually okay, but I''m not totally certain. Really I guess this patch needs some stress testing (lots of online/offline cycles while pausing/unpausing domains, etc). Perhaps we could plumb through a Xen sysctl and make a small dom0 utility for this purpose? -- Keir On 11/9/08 12:33, "Shan, Haitao" <haitao.shan@intel.com> wrote:> Thanks! > Concerning cpu online/offline development, I have a small question here. > Since cpu_online_map is very important, code in different subsystems may use > it extensively. If such code is not designed with cpu online/offline in mind, > it may introduce race conditions, just like the one fixed in cpu calibration > rendezvous. > Currently, we solve it in a find-and-fix manner. Do you have any idea that can > solve the problem in a cleaner way? > Thanks in advance. > > Shan Haitao > > -----Original Message----- > From: Keir Fraser [mailto:keir.fraser@eu.citrix.com] > Sent: 2008年9月11日 19:13 > To: Shan, Haitao; Haitao Shan > Cc: xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen > > It looks much better. I''ll read through, maybe tweak, and most likely then > check it in. > > Thanks, > Keir > > On 11/9/08 09:02, "Shan, Haitao" <haitao.shan@intel.com> wrote: > >> Hi, Keir, >> >> Attached is the updated patch using the methods as you described in >> another mail. >> What do you think of the one? >> >> Signed-off-by: Shan Haitao <haitao.shan@intel.com> >> >> Best Regards >> Haitao Shan >> >> Haitao Shan wrote: >>> Agree. Placing migration in stop_machine context will definitely make >>> our jobs easier. I will start making a new patch tomorrow. 
:)
>>> I place the migration code outside the stop_machine_run context, partly
>>> because I am not quite sure how long it will take to migrate all the
>>> vcpus away. If it takes too much time, all useful work is blocked,
>>> since all cpus are in the stop_machine context. Of course, I borrowed
>>> the idea from the kernel, which also led me to make that decision.
>>>
>>> 2008/9/10 Keir Fraser <keir.fraser@eu.citrix.com>:
>>>> I feel this is more complicated than it needs to be.
>>>>
>>>> How about clearing VCPUs from the offlined CPU's runqueue from the
>>>> very end of __cpu_disable()? At that point all other CPUs are safely
>>>> in softirq context with IRQs disabled, and we are running on the
>>>> correct CPU (being offlined). We could have a hook into the
>>>> scheduler subsystem at that point to break affinities, assign to
>>>> different runqueues, etc. We would just need to be careful not to
>>>> try an IPI. :-) This approach would not need a cpu_schedule_map
>>>> (which is really increasing code fragility imo, by creating possible
>>>> extra confusion about which cpumask is the right one to use in a
>>>> given situation).
>>>>
>>>> My feeling, unless I've missed something, is that this would make
>>>> the patch quite a bit smaller and with a smaller spread of code
>>>> changes.
>>>>
>>>> -- Keir
>>>>
>>>> On 9/9/08 09:59, "Shan, Haitao" <haitao.shan@intel.com> wrote:
>>>>
>>>>> This patch implements the cpu offline feature.
>>>>>
>>>>> Best Regards
>>>>> Haitao Shan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Sep-11 16:52 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On 11/9/08 17:00, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> Hi, Keir,
>
> Concerning the last running vcpu on the dying cpu, I have some thoughts.
> Yes, there would be a short time after stop_machine_run when this vcpu has
> v->processor == dying_cpu. But anyhow, we set the __VPF_migrating flag for
> that vcpu and issued a schedule softirq on the dying cpu.
> This softirq should run immediately after the stop_machine context, am I
> right? If so, by the time the schedule softirq is executed, this last vcpu
> has been migrated away from the dying cpu. But saving of its context will
> be delayed to play_dead->sync_lazy_context.
> If another cpu issues a schedule request to the dying cpu
> (vcpu_sleep_nosync->cpu_raise_softirq(vc->processor....)) during this time,
> the request will be serviced by the above code sequence. So it is safe in
> such cases.
> Am I missing something important? I am not entirely confident in these
> statements, though.

I agree it looks safe.

By the way, have you considered using this hotplug functionality for power management? If, instead of for(;;) halt();, we hooked into Cx management and tried to get into as deep a sleep as possible (possibly even supporting the really deep sleeps that power off a whole socket and mean you *have* to come back via real mode), then this would give a nice coarse-time-scale power management mechanism controllable from dom0.

I consider this might be a nice win for possibly less effort than is being expended in trying to make idle residency times (and hence Cx residency times) as long as possible.

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Shan, Haitao
2008-Sep-11 23:30 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
Agreed, and I have discussed this interesting idea with Kevin before, so I think he can say more on this topic. :)

Shan Haitao

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
Sent: September 12, 2008 0:53
To: Shan, Haitao; Haitao Shan; Tian, Kevin
Cc: xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen

On 11/9/08 17:00, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> Hi, Keir,
>
> Concerning the last running vcpu on the dying cpu, I have some thoughts.
> Yes, there would be a short time after stop_machine_run when this vcpu has
> v->processor == dying_cpu. But anyhow, we set the __VPF_migrating flag for
> that vcpu and issued a schedule softirq on the dying cpu.
> This softirq should run immediately after the stop_machine context, am I
> right? If so, by the time the schedule softirq is executed, this last vcpu
> has been migrated away from the dying cpu. But saving of its context will
> be delayed to play_dead->sync_lazy_context.
> If another cpu issues a schedule request to the dying cpu
> (vcpu_sleep_nosync->cpu_raise_softirq(vc->processor....)) during this time,
> the request will be serviced by the above code sequence. So it is safe in
> such cases.
> Am I missing something important? I am not entirely confident in these
> statements, though.

I agree it looks safe.

By the way, have you considered using this hotplug functionality for power management? If, instead of for(;;) halt();, we hooked into Cx management and tried to get into as deep a sleep as possible (possibly even supporting the really deep sleeps that power off a whole socket and mean you *have* to come back via real mode), then this would give a nice coarse-time-scale power management mechanism controllable from dom0.

I consider this might be a nice win for possibly less effort than is being expended in trying to make idle residency times (and hence Cx residency times) as long as possible.
-- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tian, Kevin
2008-Sep-12 02:22 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On Friday, September 12, 2008 12:53 AM, Keir Fraser wrote:

> On 11/9/08 17:00, "Shan, Haitao" <haitao.shan@intel.com> wrote:
>
>> Hi, Keir,
>>
>> Concerning the last running vcpu on the dying cpu, I have some thoughts.
>> Yes, there would be a short time after stop_machine_run when this vcpu has
>> v->processor == dying_cpu. But anyhow, we set the __VPF_migrating flag for
>> that vcpu and issued a schedule softirq on the dying cpu.
>> This softirq should run immediately after the stop_machine context, am I
>> right? If so, by the time the schedule softirq is executed, this last vcpu
>> has been migrated away from the dying cpu. But saving of its context will
>> be delayed to play_dead->sync_lazy_context. If another cpu issues a
>> schedule request to the dying cpu
>> (vcpu_sleep_nosync->cpu_raise_softirq(vc->processor....))
>> during this time, the request will be serviced by the above code sequence.
>> So it is safe in such cases. Am I missing something important? I am not
>> entirely confident in these statements, though.
>
> I agree it looks safe.
>
> By the way, have you considered using this hotplug functionality for power
> management? If, instead of for(;;) halt();, we hooked into Cx
> management and tried to get into as deep a sleep as possible (possibly even
> supporting the really deep sleeps that power off a whole socket and mean you
> *have* to come back via real mode), then this would give a nice
> coarse-time-scale power management mechanism controllable from dom0.

Yes, that's a good suggestion, and we can add deep sleep to the offline path.

> I consider this might be a nice win for possibly less effort than is being
> expended in trying to make idle residency times (and hence Cx residency
> times) as long as possible.

These two don't conflict. Cpu online/offline can't be used at small intervals, due to its long latency and the overhead it adds to the whole system, but it makes sense when an administrator sees low cpu utilization over a relatively long period, like hours. The current idle governor instead operates at a fine-grained level to cover the other cases.

Thanks,
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Sep-12 06:02 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On 12/9/08 03:22, "Tian, Kevin" <kevin.tian@intel.com> wrote:

>> I consider this might be a nice win for possibly less effort than is
>> being expended in trying to make idle residency times (and hence Cx
>> residency times) as long as possible.
>
> These two don't conflict. Cpu online/offline can't be used at small
> intervals, due to its long latency and the overhead it adds to the whole
> system, but it makes sense when an administrator sees low cpu utilization
> over a relatively long period, like hours. The current idle governor
> instead operates at a fine-grained level to cover the other cases.

I certainly agree with that. Just pointing out that, with the fine-grained approach, beyond a certain point you'll be investing effort for smaller and smaller further gains.

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Tian, Kevin
2008-Sep-12 06:04 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
>From: Keir Fraser
>Sent: September 12, 2008 14:02
>To: Tian, Kevin; Shan, Haitao; Haitao Shan
>Cc: xen-devel@lists.xensource.com; Wei, Gang
>Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
>
>On 12/9/08 03:22, "Tian, Kevin" <kevin.tian@intel.com> wrote:
>
>>> I consider this might be a nice win for possibly less effort than is
>>> being expended in trying to make idle residency times (and hence Cx
>>> residency times) as long as possible.
>>
>> These two don't conflict. Cpu online/offline can't be used at small
>> intervals, due to its long latency and the overhead it adds to the whole
>> system, but it makes sense when an administrator sees low cpu utilization
>> over a relatively long period, like hours. The current idle governor
>> instead operates at a fine-grained level to cover the other cases.
>
>I certainly agree with that. Just pointing out that, with the fine-grained
>approach, beyond a certain point you'll be investing effort for smaller and
>smaller further gains.
>

Sure, I agree, and beyond that point a weakly designed governor may even slow the system down with no actual gain.

Thanks,
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Gavin Maltby
2008-Sep-17 04:17 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
Christoph Egger wrote:

> On Thursday 11 September 2008 16:15:14 Keir Fraser wrote:
>> I applied the patch with the following changes:
>> * I rewrote your changes to fixup_irqs(). We should force lazy EOIs
>> *after* we have serviced any straggling interrupts. Also we should actually
>> clear the EOI stack so it is empty next time the CPU comes online.
>> * I simplified your changes to schedule.c in light of the fact we run in
>> stop_machine context. Hence we can be quite relaxed about locking, for
>> example.
>> * I removed your change to __csched_vcpu_is_migrateable() and instead put
>> a similar check in csched_load_balance(). I think this is clearer and also
>> cheaper.
>>
>> I note that the VCPU currently running on the offlined CPU continues to run
>> there even after __cpu_disable(), and until that CPU does a final run
>> through the scheduler soon after. I hope it does not matter there is one
>> vcpu with v->processor == offlined_cpu for a short while
>
> This is not acceptable with regard to machine check. When Dom0 offlines a
> defective cpu, nothing may continue on it, or silent data corruption occurs.

I don't see this as a problem for machine check correctness.

If dom0 asks to offline a cpu (because it believes the cpu is busted and a threat to uptime), that decision is fundamentally asynchronous to the actual error handling that occurred at machine check exception time:

 - running in whatever context
 - MCE occurs
 - trap to hypervisor MCE handler
   . this decides on hypervisor panic, or other appropriate immediate (in handler) response
   . telemetry forwarded to dom0 for logging and analysis
 - assume no hypervisor panic
 - eons pass during which any unconstrained bad data remaining after initial handling may go anywhere
 - dom0 gets telemetry and, let's say, diagnoses a fault and decides to call back into the hypervisor to offline the offending cpu

Note the "eons pass" bit; tonnes of instructions may run on the bad cpu in this time, and a few more for some offline delay won't hurt.

Gavin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jan Beulich
2008-Sep-17 07:05 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
>>> Gavin Maltby <Gavin.Maltby@Sun.COM> 17.09.08 06:17 >>>
>I don't see this as a problem for machine check correctness.
>
>If dom0 asks to offline a cpu (because it believes the cpu is busted and
>a threat to uptime), that decision is fundamentally asynchronous
>to the actual error handling that occurred at machine check exception
>time:
>
> - running in whatever context
> - MCE occurs
> - trap to hypervisor MCE handler
>   . this decides on hypervisor panic, or other appropriate
>     immediate (in handler) response
>   . telemetry forwarded to dom0 for logging and analysis
> - assume no hypervisor panic
> - eons pass during which any unconstrained bad data remaining
>   after initial handling may go anywhere
> - dom0 gets telemetry and, let's say, diagnoses a fault and
>   decides to call back into the hypervisor to offline the
>   offending cpu
>
>Note the "eons pass" bit; tonnes of instructions may run on the
>bad cpu in this time, and a few more for some offline delay won't
>hurt.

Shouldn't this possibly be handled the other way around: if a recoverable MCE happened, immediately stop scheduling anything on the affected CPU(s), until Dom0 tells you otherwise (and of course as long as there remains at least one CPU to run on)?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jiang, Yunhong
2008-Sep-17 09:20 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
>-----Original Message-----
>From: xen-devel-bounces@lists.xensource.com
>[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Jan Beulich
>Sent: September 17, 2008 15:06
>To: Christoph Egger; Gavin Maltby
>Cc: Haitao Shan; Tian, Kevin; xen-devel@lists.xensource.com;
>Shan, Haitao; Keir Fraser
>Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
>
>>>> Gavin Maltby <Gavin.Maltby@Sun.COM> 17.09.08 06:17 >>>
>>I don't see this as a problem for machine check correctness.
>>
>>If dom0 asks to offline a cpu (because it believes the cpu is busted and
>>a threat to uptime), that decision is fundamentally asynchronous
>>to the actual error handling that occurred at machine check exception
>>time:
>>
>> - running in whatever context
>> - MCE occurs
>> - trap to hypervisor MCE handler
>>   . this decides on hypervisor panic, or other appropriate
>>     immediate (in handler) response
>>   . telemetry forwarded to dom0 for logging and analysis
>> - assume no hypervisor panic
>> - eons pass during which any unconstrained bad data remaining
>>   after initial handling may go anywhere
>> - dom0 gets telemetry and, let's say, diagnoses a fault and
>>   decides to call back into the hypervisor to offline the
>>   offending cpu
>>
>>Note the "eons pass" bit; tonnes of instructions may run on the
>>bad cpu in this time, and a few more for some offline delay won't
>>hurt.
>
>Shouldn't this possibly be handled the other way around: if a recoverable
>MCE happened, immediately stop scheduling anything on the affected
>CPU(s), until Dom0 tells you otherwise (and of course as long as there
>remains at least one CPU to run on)?

Current MCE handling in Xen has no mechanism to achieve this. I agree that some initial containment in Xen is needed to reduce the possibility of a second MCE (would program locality cause such a situation?).
What we are thinking is that, when the MCA handler runs, all domains' vcpus except dom0's vcpu0 need to be brought into Xen's execution context.

>
>Jan
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Christoph Egger
2008-Sep-17 09:43 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On Wednesday 17 September 2008 11:20:57 Jiang, Yunhong wrote:

> >-----Original Message-----
> >From: xen-devel-bounces@lists.xensource.com
> >[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Jan Beulich
> >Sent: September 17, 2008 15:06
> >To: Christoph Egger; Gavin Maltby
> >Cc: Haitao Shan; Tian, Kevin; xen-devel@lists.xensource.com;
> >Shan, Haitao; Keir Fraser
> >Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
> >
> >>>> Gavin Maltby <Gavin.Maltby@Sun.COM> 17.09.08 06:17 >>>
> >>
> >>I don't see this as a problem for machine check correctness.
> >>
> >>If dom0 asks to offline a cpu (because it believes the cpu is busted and
> >>a threat to uptime), that decision is fundamentally asynchronous
> >>to the actual error handling that occurred at machine check exception
> >>time:
> >>
> >> - running in whatever context
> >> - MCE occurs
> >> - trap to hypervisor MCE handler
> >>   . this decides on hypervisor panic, or other appropriate
> >>     immediate (in handler) response
> >>   . telemetry forwarded to dom0 for logging and analysis
> >> - assume no hypervisor panic
> >> - eons pass during which any unconstrained bad data remaining
> >>   after initial handling may go anywhere
> >> - dom0 gets telemetry and, let's say, diagnoses a fault and
> >>   decides to call back into the hypervisor to offline the
> >>   offending cpu
> >>
> >>Note the "eons pass" bit; tonnes of instructions may run on the
> >>bad cpu in this time, and a few more for some offline delay won't
> >>hurt.
> >
> >Shouldn't this possibly be handled the other way around: if a recoverable
> >MCE happened, immediately stop scheduling anything on the affected
> >CPU(s), until Dom0 tells you otherwise (and of course as long as there
> >remains at least one CPU to run on)?
>
> Current MCE handling in Xen has no mechanism to achieve this.

It has since c/s 17968.
Christoph -- AMD Saxony, Dresden, Germany Operating System Research Center Legal Information: AMD Saxony Limited Liability Company & Co. KG Sitz (Geschäftsanschrift): Wilschdorfer Landstr. 101, 01109 Dresden, Deutschland Registergericht Dresden: HRA 4896 vertretungsberechtigter Komplementär: AMD Saxony LLC (Sitz Wilmington, Delaware, USA) Geschäftsführer der AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ke, Liping
2008-Sep-17 13:14 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
Hi, Egger,

Thanks a lot for your answer. I have just looked through your patch and would like some help:

1. When an MCE happens, will all of the cores (even in different sockets) also be brought into the MCE handler on AMD platforms (such as K8)? So every core will enter and execute the k8_machine_check handler, and when an MCE happens, this handler will be entered N (number of cores) times?

2. When doing send_guest_trap, if dom0->vcpu0->processor == 0 while in nmi_mce_softirq cur_cpu == 1, then when setting affinity, should we bind dom0->vcpu0 to cur_cpu 1 instead of its original binding [cpu_set(cpu, affinity) vs cpu_set(st->processor, affinity)]? Otherwise the affinity has no changes and need not be restored? Not sure about this.

3. If several vcpus are running (belonging to dom0 or other domains) when an MCA happens, and we don't pause all vcpus except dom0.vcpu0 and let the others go idle, maybe we can't make sure that those vcpus will still be scheduled in an unstable environment? At the same time, other pcpus might be in the k8_machine_check handler concurrently?

Thanks a lot for your answer!
Regards,
Criping

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Christoph Egger
Sent: September 17, 2008 17:44
To: Jiang, Yunhong
Cc: Tian, Kevin; xen-devel@lists.xensource.com; Shan, Haitao; Gavin Maltby; Keir Fraser; Haitao Shan
Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen

On Wednesday 17 September 2008 11:20:57 Jiang, Yunhong wrote:

> >-----Original Message-----
> >From: xen-devel-bounces@lists.xensource.com
> >[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Jan Beulich
> >Sent: September 17, 2008 15:06
> >To: Christoph Egger; Gavin Maltby
> >Cc: Haitao Shan; Tian, Kevin; xen-devel@lists.xensource.com;
> >Shan, Haitao; Keir Fraser
> >Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
> >
> >>>> Gavin Maltby <Gavin.Maltby@Sun.COM> 17.09.08 06:17 >>>
> >>
> >>I don't see this as a problem for machine check correctness.
> >>
> >>If dom0 asks to offline a cpu (because it believes the cpu is busted and
> >>a threat to uptime), that decision is fundamentally asynchronous
> >>to the actual error handling that occurred at machine check exception
> >>time:
> >>
> >> - running in whatever context
> >> - MCE occurs
> >> - trap to hypervisor MCE handler
> >>   . this decides on hypervisor panic, or other appropriate
> >>     immediate (in handler) response
> >>   . telemetry forwarded to dom0 for logging and analysis
> >> - assume no hypervisor panic
> >> - eons pass during which any unconstrained bad data remaining
> >>   after initial handling may go anywhere
> >> - dom0 gets telemetry and, let's say, diagnoses a fault and
> >>   decides to call back into the hypervisor to offline the
> >>   offending cpu
> >>
> >>Note the "eons pass" bit; tonnes of instructions may run on the
> >>bad cpu in this time, and a few more for some offline delay won't
> >>hurt.
> >
> >Shouldn't this possibly be handled the other way around: if a recoverable
> >MCE happened, immediately stop scheduling anything on the affected
> >CPU(s), until Dom0 tells you otherwise (and of course as long as there
> >remains at least one CPU to run on)?
>
> Current MCE handling in Xen has no mechanism to achieve this.

It has since c/s 17968.

Christoph

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jiang, Yunhong
2008-Sep-18 03:56 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
>> >
>> >Shouldn't this possibly be handled the other way around: if a recoverable
>> >MCE happened, immediately stop scheduling anything on the affected
>> >CPU(s), until Dom0 tells you otherwise (and of course as long as there
>> >remains at least one CPU to run on)?
>>
>> Current MCE handling in Xen has no mechanism to achieve this.
>
>It has since c/s 17968.

Hmm, I think the current NMI_MCE_SOFTIRQ can't make sure other guests will not be scheduled. Considering there may be a schedule softirq already pending on the pCPU, another guest may run before the impacted guest. Did I miss anything?

Thanks
Yunhong Jiang

>
>Christoph
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Sep-18 07:20 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On 18/9/08 04:56, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

>> It has since c/s 17968.
>
> Hmm, I think the current NMI_MCE_SOFTIRQ can't make sure other guests will
> not be scheduled. Considering there may be a schedule softirq already
> pending on the pCPU, another guest may run before the impacted guest. Did I
> miss anything?

There are races here in any case. What if a #MC happens halfway through the scheduler, just before set_current(new)?

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jiang, Yunhong
2008-Sep-18 08:13 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: September 18, 2008 15:21
>To: Jiang, Yunhong; Christoph Egger
>Cc: Jan Beulich; Gavin Maltby; Haitao Shan; Tian, Kevin;
>xen-devel@lists.xensource.com; Shan, Haitao
>Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
>
>On 18/9/08 04:56, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>>> It has since c/s 17968.
>>
>> Hmm, I think the current NMI_MCE_SOFTIRQ can't make sure other guests
>> will not be scheduled. Considering there may be a schedule softirq
>> already pending on the pCPU, another guest may run before the impacted
>> guest. Did I miss anything?
>
>There are races here in any case. What if a #MC happens halfway through the
>scheduler, just before set_current(new)?

If the MCE handler will not cause a schedule and does not change current, will any issue happen?

>
> -- Keir
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Sep-18 09:11 UTC
Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
On 18/9/08 09:13, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

>>> Hmm, I think the current NMI_MCE_SOFTIRQ can't make sure other guests
>>> will not be scheduled. Considering there may be a schedule softirq
>>> already pending on the pCPU, another guest may run before the impacted
>>> guest. Did I miss anything?
>>
>> There are races here in any case. What if a #MC happens halfway through
>> the scheduler, just before set_current(new)?
>
> If the MCE handler will not cause a schedule and does not change current,
> will any issue happen?

I'm not sure exactly what you mean. What *I* meant was that there are certain points during execution where, if a #MC occurs, it may not be possible to determine which single vCPU was running on the pCPU. I guess though that if you ever get unrecoverable errors reported while running inside the hypervisor, you probably can't recover anyway.

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jiang, Yunhong
2008-Sep-18 15:17 UTC
RE: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
Keir Fraser <mailto:keir.fraser@eu.citrix.com> wrote:

> On 18/9/08 09:13, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>>>> Hmm, I think the current NMI_MCE_SOFTIRQ can't make sure other guests
>>>> will not be scheduled. Considering there may be a schedule softirq
>>>> already pending on the pCPU, another guest may run before the impacted
>>>> guest. Did I miss anything?
>>>
>>> There are races here in any case. What if a #MC happens halfway through
>>> the scheduler, just before set_current(new)?
>>
>> If the MCE handler will not cause a schedule and does not change current,
>> will any issue happen?
>
> I'm not sure exactly what you mean. What *I* meant was that there are
> certain points during execution where, if a #MC occurs, it may not be
> possible to determine which single vCPU was running on the

The current implementation of k8_machine_check determines xen_impacted by checking whether current is the idle domain, and it determines which domain is impacted via current. I am not familiar with AMD's machine check mechanism, but when considering support on Intel platforms it may be a bit different. For example, Xen is impacted if an MCE caused by a synchronous event happens in Xen's context, even if current is not the idle domain. Also, the impacted domain may not always be determined by current; memory ownership may help to decide the impacted domain.

Another difference we are considering is that we assume domU's MCA handler is not trusted. So, firstly, we may always need dom0's MCE handler support; secondly, after the domU MCE handler runs, some guard may be needed to make sure no error is triggered again.

> pCPU. I guess
> though that if you ever get unrecoverable errors reported while running
> inside the hypervisor, you probably can't recover anyway.

I think this may depend on the error type. If the error is an asynchronous event, it may be OK to continue after some containment.

For example, if EIPV=0, RIPV=1 and ADDRV=1, and the error happens in Xen's execution context, it may be caused by some asynchronous event on the memory side. In that situation, we can kill the owner of the page (if that page is owned exclusively by one guest) and continue running. However, if it is a synchronous event, like EIPV=1, then we have to reset the system.

> -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel