Yu, Ke
2009-Jul-19 06:46 UTC
[Xen-devel] [PATCH][pvops_dom0][4/4] use physical acpi_id in acpi processor parsing logic
use physical acpi_id in acpi processor parsing logic xen domain0 is seeing vCPU, while xen need the acpi info of physical CPU, so this patch hack the acpi processor parsing logic to use physical acpi_id instead vcpu id From: Yu Ke <ke.yu@intel.com> Signed-off-by: Yu Ke <ke.yu@intel.com> --- arch/x86/kernel/acpi/processor.c | 4 ++++ drivers/acpi/processor_core.c | 11 ++++++++--- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/acpi/processor.c b/arch/x86/kernel/acpi/processor.c index 9f0b296..d4d57a9 100644 --- a/arch/x86/kernel/acpi/processor.c +++ b/arch/x86/kernel/acpi/processor.c @@ -11,6 +11,7 @@ #include <acpi/processor.h> #include <asm/acpi.h> +#include <asm/xen/hypervisor.h> static void init_intel_pdc(struct acpi_processor *pr, struct cpuinfo_x86 *c) { @@ -82,6 +83,9 @@ void arch_acpi_processor_init_pdc(struct acpi_processor *pr) { struct cpuinfo_x86 *c = &cpu_data(pr->id); + if (xen_pv_domain()) + c = &cpu_data(0); + pr->pdc = NULL; if (c->x86_vendor == X86_VENDOR_INTEL) init_intel_pdc(pr, c); diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c index 0b6facc..a124c4a 100644 --- a/drivers/acpi/processor_core.c +++ b/drivers/acpi/processor_core.c @@ -638,6 +638,9 @@ static int acpi_processor_get_info(struct acpi_device *device) pr->id = cpu_index; + if (xen_pv_domain()) + pr->id = pr->acpi_id; + /* * Extra Processor objects may be enumerated on MP systems with * less than the max # of CPUs. They should be ignored _iff @@ -703,14 +706,16 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device) return 0; } - BUG_ON((pr->id >= nr_cpu_ids) || (pr->id < 0)); + if (!xen_pv_domain()) + BUG_ON((pr->id >= nr_cpu_ids) || (pr->id < 0)); /* * Buggy BIOS check * ACPI id of processors can be reported wrongly by the BIOS. * Don''t trust it blindly */ - if (per_cpu(processor_device_array, pr->id) != NULL && + if (!xen_pv_domain() && + per_cpu(processor_device_array, pr->id) != NULL && per_cpu(processor_device_array, pr->id) != device) { printk(KERN_WARNING "BIOS reported wrong ACPI id " "for the processor\n"); @@ -725,7 +730,7 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device) goto end; sysdev = get_cpu_sysdev(pr->id); - if (sysfs_create_link(&device->dev.kobj, &sysdev->kobj, "sysdev")) + if (sysdev != NULL && sysfs_create_link(&device->dev.kobj, &sysdev->kobj, "sysdev")) return -EFAULT; /* _PDC call should be done before doing anything else (if reqd.). */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Jul-20 20:36 UTC
[Xen-devel] Re: [PATCH][pvops_dom0][4/4] use physical acpi_id in acpi processor parsing logic
On 07/18/09 23:46, Yu, Ke wrote:> use physical acpi_id in acpi processor parsing logic > > xen domain0 is seeing vCPU, while xen need the acpi info of physical CPU, so this patch hack the acpi processor parsing logic to use physical acpi_id instead vcpu id >Would there be a problem with making this always use the acpi id? That would get rid of the Xen-specific stuff, and makes logical sense (at least to me) since this code is dealing with APIC processor objects. I guess this goes back to my earlier comment; could this code be defined in terms of ACPI processors rather than general CPUs, so that the Xen (or any virtual) case can distinguish the two notions, while they can be mapped 1:1 for the native case? (Or perhaps there are reasons you might want to have ACPI control processor power states which are logically offline to the kernel?) J> From: Yu Ke <ke.yu@intel.com> > > Signed-off-by: Yu Ke <ke.yu@intel.com> > --- > > arch/x86/kernel/acpi/processor.c | 4 ++++ > drivers/acpi/processor_core.c | 11 ++++++++--- > 2 files changed, 12 insertions(+), 3 deletions(-) > > > diff --git a/arch/x86/kernel/acpi/processor.c b/arch/x86/kernel/acpi/processor.c > index 9f0b296..d4d57a9 100644 > --- a/arch/x86/kernel/acpi/processor.c > +++ b/arch/x86/kernel/acpi/processor.c > @@ -11,6 +11,7 @@ > > #include <acpi/processor.h> > #include <asm/acpi.h> > +#include <asm/xen/hypervisor.h> > > static void init_intel_pdc(struct acpi_processor *pr, struct cpuinfo_x86 *c) > { > @@ -82,6 +83,9 @@ void arch_acpi_processor_init_pdc(struct acpi_processor *pr) > { > struct cpuinfo_x86 *c = &cpu_data(pr->id); > > + if (xen_pv_domain()) > + c = &cpu_data(0); > + > pr->pdc = NULL; > if (c->x86_vendor == X86_VENDOR_INTEL) > init_intel_pdc(pr, c); > diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c > index 0b6facc..a124c4a 100644 > --- a/drivers/acpi/processor_core.c > +++ b/drivers/acpi/processor_core.c > @@ -638,6 +638,9 @@ static int acpi_processor_get_info(struct acpi_device *device) > > pr->id = cpu_index; > > + if (xen_pv_domain()) > + pr->id = pr->acpi_id; > + > /* > * Extra Processor objects may be enumerated on MP systems with > * less than the max # of CPUs. They should be ignored _iff > @@ -703,14 +706,16 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device) > return 0; > } > > - BUG_ON((pr->id >= nr_cpu_ids) || (pr->id < 0)); > + if (!xen_pv_domain()) > + BUG_ON((pr->id >= nr_cpu_ids) || (pr->id < 0)); > > /* > * Buggy BIOS check > * ACPI id of processors can be reported wrongly by the BIOS. > * Don''t trust it blindly > */ > - if (per_cpu(processor_device_array, pr->id) != NULL && > + if (!xen_pv_domain() && > + per_cpu(processor_device_array, pr->id) != NULL && > per_cpu(processor_device_array, pr->id) != device) { > printk(KERN_WARNING "BIOS reported wrong ACPI id " > "for the processor\n"); > @@ -725,7 +730,7 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device) > goto end; > > sysdev = get_cpu_sysdev(pr->id); > - if (sysfs_create_link(&device->dev.kobj, &sysdev->kobj, "sysdev")) > + if (sysdev != NULL && sysfs_create_link(&device->dev.kobj, &sysdev->kobj, "sysdev")) > return -EFAULT; > > /* _PDC call should be done before doing anything else (if reqd.). */ > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Yu, Ke
2009-Jul-21 08:07 UTC
[Xen-devel] RE: [PATCH][pvops_dom0][4/4] use physical acpi_id in acpi processor parsing logic
>From: Jeremy Fitzhardinge [mailto:jeremy@goop.org] >Sent: Tuesday, July 21, 2009 4:36 AM >To: Yu, Ke >Cc: xen-devel@lists.xensource.com; Tian, Kevin >Subject: Re: [PATCH][pvops_dom0][4/4] use physical acpi_id in acpi processor >parsing logic > >On 07/18/09 23:46, Yu, Ke wrote: >> use physical acpi_id in acpi processor parsing logic >> >> xen domain0 is seeing vCPU, while xen need the acpi info of physical CPU, >so this patch hack the acpi processor parsing logic to use physical acpi_id >instead vcpu id >> > >Would there be a problem with making this always use the acpi id? That >would get rid of the Xen-specific stuff, and makes logical sense (at >least to me) since this code is dealing with APIC processor objects.> >I guess this goes back to my earlier comment; could this code be defined >in terms of ACPI processors rather than general CPUs, so that the Xen >(or any virtual) case can distinguish the two notions, while they can be >mapped 1:1 for the native case? (Or perhaps there are reasons you might >want to have ACPI control processor power states which are logically >offline to the kernel?) > > J >To use acpi id in native, I can see there are at least two kind of conflicts need to be resolved: 1. kernel assume it only cares about the present CPU. For non present CPU, it will simply stop going further and return, or trigger BUG(). When switch to acpi id, the acpi processor object may refer to a non present cpu, so the code need to be able to handle the non-present CPU situation. 2. native kernel use per_cpu data extensively, which is indexed by general cpu id. when switch to acpi id, these per_cpu data should be changed to the array indexed by acpi id. Take the acpi processor core code (driver/acpi/ processor_core.c) as example, the condition check " BUG_ON((pr->id >= nr_cpu_ids) || (pr->id < 0)); " need change. the per-cpu data processor_device_array, processors need change. And the cpu_sys_devices in get_cpu_sysdev need more thoughts before changing, since it is globally used by other component. Another example is the cpufreq case. if we want to use acpi id in cpufreq case, we also need to resolve the above two conflicts. For example, in drivers/cpufreq/cpufreq.c, its core data struct " cpufreq_policy " is per-CPU, thus need many changes in every place it is used. and the condition checking, like " if (cpu >= nr_cpu_ids) goto err_out;" also need change. Compared with the change in the driver/acpi/ processor_core.c, the change in cpufreq is more intrusive. Since the acpi processor core code already has the Px info parsing functionality, it may be better not changing cpufreq. Best Regards Ke _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Jul-28 17:53 UTC
[Xen-devel] Re: [PATCH][pvops_dom0][4/4] use physical acpi_id in acpi processor parsing logic
On 07/21/09 01:07, Yu, Ke wrote:> To use acpi id in native, I can see there are at least two kind of conflicts need to be resolved: > 1. kernel assume it only cares about the present CPU. For non present CPU, it will simply stop going further and return, or trigger BUG(). When switch to acpi id, the acpi processor object may refer to a non present cpu, so the code need to be able to handle the non-present CPU situation. >The percpu subsystem should be able to deal with accesses to percpu data of non-present cpus (though it might need some advance preparation to make sure the memory is allocated). In general the percpu subsystem is concerned with making sure that the amount of memory allocated is "reasonable" - ie, for cpus which are actually present or could be present, rather than cpus which can never exist on this system (like running a kernel compiled for 1024 processors on a dual-core laptop). I assume that ACPI processor IDs are always going to be in the realm of sensible for the hardware: ie, either CPUs which actually exist, or which have sockets which could potentially be hotplugged. In that case I don''t see a problem with making sure they have percpu data allocated. (Of course in the Xen case this needs a bit more care, since the domain VCPU count has nothing to do with the host PCPUs, but we can do things like manipulate the possible CPU set if that helps.)> 2. native kernel use per_cpu data extensively, which is indexed by general cpu id. when switch to acpi id, these per_cpu data should be changed to the array indexed by acpi id. >How is the acpi id derived? Is the the same as the local apic id? Is it typically the same as the kernel''s smp_processor_id, or does it tend to be different? If they''re different, is the mapping fixed or can it vary?> Take the acpi processor core code (driver/acpi/ processor_core.c) as example, the condition check " BUG_ON((pr->id >= nr_cpu_ids) || (pr->id < 0)); " need change. the per-cpu data processor_device_array, processors need change. And the cpu_sys_devices in get_cpu_sysdev need more thoughts before changing, since it is globally used by other component. > > Another example is the cpufreq case. if we want to use acpi id in cpufreq case, we also need to resolve the above two conflicts. For example, in drivers/cpufreq/cpufreq.c, its core data struct " cpufreq_policy " is per-CPU, thus need many changes in every place it is used. and the condition checking, like " if (cpu >= nr_cpu_ids) goto err_out;" also need change. Compared with the change in the driver/acpi/ processor_core.c, the change in cpufreq is more intrusive. Since the acpi processor core code already has the Px info parsing functionality, it may be better not changing cpufreq. >OK, to summarize: The cpufreq subsystem provides two services to the rest of the kernel: * the ability to set the overall power management policy (performance, powersave, etc) * the mechanism and drivers to implement that policy In this case we still want a way to set the policy, but Xen itself will implement the mechanism internally without dom0''s further involvement (aside from some info culled from the ACPI tables), right? But even then, cpufreq is oriented towards controlling the kernel-visible CPUs, and is ill-suited to controlling the policy of the host CPUs from the context of one particular domain. Therefore we need to have new interfaces which: 1. insert ACPI info that dom0 extracts from various tables into Xen (assuming its impractical for Xen to do this itself) 2. set the overall power-management policy 3. Xen implements that policy without further interaction with dom0 (And what''s missing from this is some way for each individual domain to set the "importance" of the work being done on each VCPU to allow Xen to determine what''s the appropriate operating point for each PCPU from timeslice to timeslice.) Is that accurate? Thanks, J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Yu, Ke
2009-Jul-29 02:35 UTC
[Xen-devel] RE: [PATCH][pvops_dom0][4/4] use physical acpi_id in acpi processor parsing logic
>-----Original Message----- >From: Jeremy Fitzhardinge [mailto:jeremy@goop.org] >Sent: Wednesday, July 29, 2009 1:54 AM >To: Yu, Ke >Cc: xen-devel@lists.xensource.com; Tian, Kevin >Subject: Re: [PATCH][pvops_dom0][4/4] use physical acpi_id in acpi processor >parsing logic > >On 07/21/09 01:07, Yu, Ke wrote: >> To use acpi id in native, I can see there are at least two kind of conflicts need to >be resolved: >> 1. kernel assume it only cares about the present CPU. For non present CPU, it >will simply stop going further and return, or trigger BUG(). When switch to acpi id, >the acpi processor object may refer to a non present cpu, so the code need to be >able to handle the non-present CPU situation. >> > >The percpu subsystem should be able to deal with accesses to percpu data >of non-present cpus (though it might need some advance preparation to >make sure the memory is allocated). In general the percpu subsystem is >concerned with making sure that the amount of memory allocated is >"reasonable" - ie, for cpus which are actually present or could be >present, rather than cpus which can never exist on this system (like >running a kernel compiled for 1024 processors on a dual-core laptop). > >I assume that ACPI processor IDs are always going to be in the realm of >sensible for the hardware: ie, either CPUs which actually exist, or >which have sockets which could potentially be hotplugged. In that case >I don''t see a problem with making sure they have percpu data allocated.This looks not true unfortunately. When I set dom0 vcpu number to 1, I observe the per-CPU data only allocate memory for 1 CPU. probably we need to manipulate the possible CPU set as you suggested.> >(Of course in the Xen case this needs a bit more care, since the domain >VCPU count has nothing to do with the host PCPUs, but we can do things >like manipulate the possible CPU set if that helps.) > >> 2. native kernel use per_cpu data extensively, which is indexed by general cpu id. >when switch to acpi id, these per_cpu data should be changed to the array >indexed by acpi id. >> > >How is the acpi id derived? Is the the same as the local apic id? Is >it typically the same as the kernel''s smp_processor_id, or does it tend >to be different? If they''re different, is the mapping fixed or can it vary?acpi id is derived from the acpi DSDT table acpi processor object. For example, the following sniper declare 4 processors in socket 0. And the second field in processor declaration is the acpi id. Scope (\_SB) { Device (\_SB.SCK0) { Processor (CPU0, 0x00, 0x00000410, 0x06) { ... } Processor (CPU1, 0x01, 0x00000410, 0x06) { ... } Processor (CPU2, 0x02, 0x00000410, 0x06) { ... } Processor (CPU3, 0x03, 0x00000410, 0x06) { ... } } } acpi id is different from local apic id which is derived from ACPI MADT table, and also different from kernel''s smp_processor_id. Their mapping is fixed. note that in dom0 case, their mapping may be incomplete, since some processor may not presented to dom0.> >> Take the acpi processor core code (driver/acpi/ processor_core.c) as example, >the condition check " BUG_ON((pr->id >= nr_cpu_ids) || (pr->id < 0)); " need >change. the per-cpu data processor_device_array, processors need change. And >the cpu_sys_devices in get_cpu_sysdev need more thoughts before changing, >since it is globally used by other component. >> >> Another example is the cpufreq case. if we want to use acpi id in cpufreq case, >we also need to resolve the above two conflicts. For example, in >drivers/cpufreq/cpufreq.c, its core data struct " cpufreq_policy " is per-CPU, thus >need many changes in every place it is used. and the condition checking, like " if >(cpu >= nr_cpu_ids) goto err_out;" also need change. Compared with the change >in the driver/acpi/ processor_core.c, the change in cpufreq is more intrusive. >Since the acpi processor core code already has the Px info parsing functionality, it >may be better not changing cpufreq. >> > >OK, to summarize: > >The cpufreq subsystem provides two services to the rest of the kernel: > > * the ability to set the overall power management policy > (performance, powersave, etc) > * the mechanism and drivers to implement that policy > >In this case we still want a way to set the policy, but Xen itself will >implement the mechanism internally without dom0''s further involvement >(aside from some info culled from the ACPI tables), right?Right. both policy and mechanism are done in hypervisor. and xen already provide the platform_hypercall for management tool to set the overall power management policy. Currently we have one user space tool xenpm (xen-unstable/tools/misc/xenpm.c) to set the PM policy.> >But even then, cpufreq is oriented towards controlling the >kernel-visible CPUs, and is ill-suited to controlling the policy of the >host CPUs from the context of one particular domain. > >Therefore we need to have new interfaces which: > > 1. insert ACPI info that dom0 extracts from various tables into Xen > (assuming its impractical for Xen to do this itself) > 2. set the overall power-management policy > 3. Xen implements that policy without further interaction with dom0 > >(And what''s missing from this is some way for each individual domain to >set the "importance" of the work being done on each VCPU to allow Xen to >determine what''s the appropriate operating point for each PCPU from >timeslice to timeslice.) > >Is that accurate?In current design, cpufreq will not start in dom0 or domU. And there is no way for VM to set the "importance" of its work. but administrator can manually set the policy by the management tool. In summary, Xen already implement the cpufreq policy and mechanism in hypervisor, and also provide hypercall for management tool to set the overall policy, so the only thing need dom0 involvement is acpi info parsing. The change to dom0 cpufreq code is avoidable. Best Regards Ke _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel