Konrad Rzeszutek Wilk
2011-Mar-22 14:50 UTC
[Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
On Tue, Mar 22, 2011 at 06:03:28PM +0530, Trinabh Gupta wrote:> This patch implements a default cpuidle driver for xen. > Earlier pm_idle was flipped to default_idle. Maybe there > is a better way to ensure default_idle is called > without using this cpuidle driver.Please also CC the Xen devel mailing list (I did this for you) I couldn''t find it in the description, but I was wondering what is that you are trying to fix? What is the problem description? Two, you mention in your description that it was tested on x86 systems. did you test this with Xen? If so, what version.> > Signed-off-by: Trinabh Gupta <trinabh@linux.vnet.ibm.com> > --- > > arch/x86/xen/setup.c | 42 +++++++++++++++++++++++++++++++++++++++++- > 1 files changed, 41 insertions(+), 1 deletions(-) > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > index a8a66a5..4fce4cd 100644 > --- a/arch/x86/xen/setup.c > +++ b/arch/x86/xen/setup.c > @@ -9,6 +9,8 @@ > #include <linux/mm.h> > #include <linux/pm.h> > #include <linux/memblock.h> > +#include <linux/cpuidle.h> > +#include <linux/slab.h> > > #include <asm/elf.h> > #include <asm/vdso.h> > @@ -321,6 +323,44 @@ void __cpuinit xen_enable_syscall(void) > #endif /* CONFIG_X86_64 */ > } > > +static struct cpuidle_driver xen_idle_driver = { > + .name = "xen_idle", > + .owner = THIS_MODULE, > + .priority = 1, > +}; > + > +extern struct cpuidle_state state_default_state; > + > +static int setup_cpuidle(int cpu) > +{ > + struct cpuidle_device *dev = kzalloc(sizeof(struct cpuidle_device), > + GFP_KERNEL);No checking to see if dev == NULL?> + int count = CPUIDLE_DRIVER_STATE_START; > + dev->cpu = cpu; > + dev->drv = &xen_idle_driver; > + > + dev->states[count] = state_default_idle; > + count++; > + > + dev->state_count = count; > + > + if (cpuidle_register_device(dev)) > + return -EIO;No cleanup of the ''dev'' so that we don''t leak memory?> + return 0; > +} > + > +static int xen_idle_init(void) > +{ > + int retval, i; > + retval = cpuidle_register_driver(&xen_idle_driver); > + > + for_each_online_cpu(i) { > + setup_cpuidle(i); > + } > + > + return 0; > +} > + > void __init xen_arch_setup(void) > { > xen_panic_handler_init(); > @@ -354,7 +394,7 @@ void __init xen_arch_setup(void) > #ifdef CONFIG_X86_32 > boot_cpu_data.hlt_works_ok = 1; > #endif > - pm_idle = default_idle; > + xen_idle_init(); > boot_option_idle_override = IDLE_HALT; > > fiddle_vdso(); > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Trinabh Gupta
2011-Mar-23 09:57 UTC
[Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
On 03/22/2011 08:20 PM, Konrad Rzeszutek Wilk wrote:> On Tue, Mar 22, 2011 at 06:03:28PM +0530, Trinabh Gupta wrote: >> This patch implements a default cpuidle driver for xen. >> Earlier pm_idle was flipped to default_idle. Maybe there >> is a better way to ensure default_idle is called >> without using this cpuidle driver. >Hi Konrad,> Please also CC the Xen devel mailing list (I did this for you)Yes, I will. Thanks> > I couldn''t find it in the description, but I was wondering > what is that you are trying to fix? What is the problem description?We are trying to remove the exported function pointer pm_idle which is set to the desired idle routine to be used. For example, xen sets it to default_idle. Having exported function pointer is not very secure. The first problem is that this exported pointer can be modified/flipped by any subsystem. There is no tracking or notification mechanism. Secondly and more importantly, various subsystems save the value of this pointer, flip it and later restore to the saved value. There is no guarantee that the saved value is still valid (see http://lkml.org/lkml/2009/8/28/43 and http://lkml.org/lkml/2009/8/28/50) The first patch of the series removed pm_idle and now we directly call into CPUIdle subsystem. As a result for all the subsystems using pm_idle, we have to implement a driver that can be registered to cpuidle.> > Two, you mention in your description that it was tested on x86 systems. did > you test this with Xen? If so, what version.The patches are still in RFC stage and haven''t been tested with Xen. Once we are clear on a particular solution, we will carefully test the patches. Thanks for the code review. Yes, I have missed a couple of things. I will look at how to maintain per cpu dev pointers and free the memory. Thanks, -Trinabh> >> >> Signed-off-by: Trinabh Gupta<trinabh@linux.vnet.ibm.com> >> --- >> >> arch/x86/xen/setup.c | 42 +++++++++++++++++++++++++++++++++++++++++- >> 1 files changed, 41 insertions(+), 1 deletions(-) >> >> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c >> index a8a66a5..4fce4cd 100644 >> --- a/arch/x86/xen/setup.c >> +++ b/arch/x86/xen/setup.c >> @@ -9,6 +9,8 @@ >> #include<linux/mm.h> >> #include<linux/pm.h> >> #include<linux/memblock.h> >> +#include<linux/cpuidle.h> >> +#include<linux/slab.h> >> >> #include<asm/elf.h> >> #include<asm/vdso.h> >> @@ -321,6 +323,44 @@ void __cpuinit xen_enable_syscall(void) >> #endif /* CONFIG_X86_64 */ >> } >> >> +static struct cpuidle_driver xen_idle_driver = { >> + .name = "xen_idle", >> + .owner = THIS_MODULE, >> + .priority = 1, >> +}; >> + >> +extern struct cpuidle_state state_default_state; >> + >> +static int setup_cpuidle(int cpu) >> +{ >> + struct cpuidle_device *dev = kzalloc(sizeof(struct cpuidle_device), >> + GFP_KERNEL); > > No checking to see if dev == NULL? >> + int count = CPUIDLE_DRIVER_STATE_START; >> + dev->cpu = cpu; >> + dev->drv =&xen_idle_driver; >> + >> + dev->states[count] = state_default_idle; >> + count++; >> + >> + dev->state_count = count; >> + >> + if (cpuidle_register_device(dev)) >> + return -EIO; > No cleanup of the ''dev'' so that we don''t leak memory? > >> + return 0; >> +} >> + >> +static int xen_idle_init(void) >> +{ >> + int retval, i; >> + retval = cpuidle_register_driver(&xen_idle_driver); >> + >> + for_each_online_cpu(i) { >> + setup_cpuidle(i); >> + } >> + >> + return 0; >> +} >> + >> void __init xen_arch_setup(void) >> { >> xen_panic_handler_init(); >> @@ -354,7 +394,7 @@ void __init xen_arch_setup(void) >> #ifdef CONFIG_X86_32 >> boot_cpu_data.hlt_works_ok = 1; >> #endif >> - pm_idle = default_idle; >> + xen_idle_init(); >> boot_option_idle_override = IDLE_HALT; >> >> fiddle_vdso(); >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> Please read the FAQ at http://www.tux.org/lkml/ > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Mar-24 07:18 UTC
[Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
Is a CONFIG_XEN kernel supposed to use just HLT in idle? xen_arch_setup() does this: pm_idle = default_idle; boot_option_idle_override = IDLE_HALT; which has that effect. I guess this makes sense b/c the CONFIG_XEN kernel is Dom0 and the real C-sates are done by the hypervisor? Would the same CONFIG_XEN kernel binary ever not run xen_arch_setup(), run on raw hardware, and want to use idle states other than HLT? thanks, -Len Brown, Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Mar-24 12:05 UTC
[Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
On Thu, Mar 24, 2011 at 03:18:14AM -0400, Len Brown wrote:> Is a CONFIG_XEN kernel supposed to use just HLT in idle?For right now..> > xen_arch_setup() does this: > > pm_idle = default_idle; > boot_option_idle_override = IDLE_HALT; > > which has that effect. I guess this makes sense b/c the > CONFIG_XEN kernel is Dom0 and the real C-sates are done > by the hypervisor?Correct. There are some patches that make the C-states be visible in the Linux kernel, but that hasn''t been ported over yet.> > Would the same CONFIG_XEN kernel binary ever not > run xen_arch_setup(), run on raw hardware, and wantever not? I am not sure of the question, so let me state: The Linux kernel if compiled with CONFIG_XEN, and if run on native hardware, would _never_ run ''xen_arch_setup()''*. It would run the normal, native type setup.> to use idle states other than HLT? >*: It could if you really really wanted. You would need to change the GRUB2 to inject some extra data in the ''sub_hardware'' flag to be the Xen specific. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Mar-25 07:19 UTC
[Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
> On Thu, Mar 24, 2011 at 03:18:14AM -0400, Len Brown wrote: > > Is a CONFIG_XEN kernel supposed to use just HLT in idle? > > For right now.. > > > > xen_arch_setup() does this: > > > > pm_idle = default_idle; > > boot_option_idle_override = IDLE_HALT; > > > > which has that effect. I guess this makes sense b/c the > > CONFIG_XEN kernel is Dom0 and the real C-sates are done > > by the hypervisor? > > Correct. There are some patches that make the C-states > be visible in the Linux kernel, but that hasn''t been ported > over yet.The Xen Dom0 kernel will trap into the hypervisor whenever it does a HLT or an MWAIT, yes? What is the benefit of having Dom0 decided between C-states that it can''t actually enter? What is the mechanism by which those C-states are made visible to Dom0, and how are those states related to the states that are supported on the bare iron?> > Would the same CONFIG_XEN kernel binary ever not > > run xen_arch_setup(), run on raw hardware, and want > > to use idle states other than HLT?> ever not? I am not sure of the question, so let me state: > The Linux kernel if compiled with CONFIG_XEN, and if run on > native hardware, would _never_ run ''xen_arch_setup()''*. It would > run the normal, native type setup.Thanks, that clarifies. -Len Brown, Intel Open Source Technolgy Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Mar-25 14:38 UTC
Re: [Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
On 03/24/2011 12:05 PM, Konrad Rzeszutek Wilk wrote:> On Thu, Mar 24, 2011 at 03:18:14AM -0400, Len Brown wrote: >> Is a CONFIG_XEN kernel supposed to use just HLT in idle? > For right now..For always, I should think.>> xen_arch_setup() does this: >> >> pm_idle = default_idle; >> boot_option_idle_override = IDLE_HALT; >> >> which has that effect. I guess this makes sense b/c the >> CONFIG_XEN kernel is Dom0 and the real C-sates are done >> by the hypervisor? > Correct. There are some patches that make the C-states > be visible in the Linux kernel, but that hasn''t been ported > over yet.All we need is for the idle CPU to block in the hypervisor; a plain "hlt" is always going to be sufficient (which is overridden as a pvop into a sched_idle hypercall). Xen will choose an appropriate power state for the physical cpus depending on the overall busyness of the system (which any individual virtual machine can''t determine). J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Mar-25 14:43 UTC
Re: [Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
On 03/25/2011 07:19 AM, Len Brown wrote:> The Xen Dom0 kernel will trap into the hypervisor > whenever it does a HLT or an MWAIT, yes?Yes, on hlt.> What is the benefit of having Dom0 decided between > C-states that it can''t actually enter?There might be a slight benefit to allow a domain to tell Xen what its overall utilisation is (ie, "I''d like this VCPU to run, but it isn''t very important so you can take that into account when choosing scheduling priority and/or PCPU performance"). But there''s nothing like that at present.> What is the mechanism by which those C-states are > made visible to Dom0, and how are those states > related to the states that are supported on > the bare iron?Because dom0 is the official ACPI owner (ie, it has the AML interpreter), we need dom0 to handle complex interactions with ACPI (the hypervisor can do simple table parsing). At present the mechanism for power states is that dom0 gets them out of ACPI, and then passes them to Xen which actually uses them. But no guest kernel has any runtime use of power states. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Mar-31 02:02 UTC
Re: [Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
> >> Is a CONFIG_XEN kernel supposed to use just HLT in idle? > > For right now.. > > For always, I should think.Yay!> >> xen_arch_setup() does this: > >> > >> pm_idle = default_idle; > >> boot_option_idle_override = IDLE_HALT; > >> > >> which has that effect. I guess this makes sense b/c the > >> CONFIG_XEN kernel is Dom0 and the real C-sates are done > >> by the hypervisor? > > Correct. There are some patches that make the C-states > > be visible in the Linux kernel, but that hasn''t been ported > > over yet. > > All we need is for the idle CPU to block in the hypervisor; a plain > "hlt" is always going to be sufficient (which is overridden as a pvop > into a sched_idle hypercall). > > Xen will choose an appropriate power state for the physical cpus > depending on the overall busyness of the system (which any individual > virtual machine can''t determine).Okay, knowing that the Dom0 kernel 1. can boot in non-xen mode on bare hardware and run cpuidle 2. needs just HLT when booted in xen mode will help us keep things simple. thanks, Len Brown, Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Mar-31 21:26 UTC
Re: [Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
> > >> xen_arch_setup() does this: > > >> > > >> pm_idle = default_idle; > > >> boot_option_idle_override = IDLE_HALT;What happens on a Xen kernel if these lines are not there? Does Xen export the C-states tables to Dom0 kernel, and the Dom0 kernel has an acpi processor driver, and thus it would try to use all the C-states? If yes, must Xen show those tables to Dom? If it did not, then the lines above would not be necessary, as in the absence of any C-states, the kernel should use halt by default. thanks, -Len Brown, Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Mar-31 22:36 UTC
Re: [Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
On 03/31/2011 02:26 PM, Len Brown wrote:>>>>> xen_arch_setup() does this: >>>>> >>>>> pm_idle = default_idle; >>>>> boot_option_idle_override = IDLE_HALT; > What happens on a Xen kernel if these lines are not there? > Does Xen export the C-states tables to Dom0 kernel, and the Dom0 > kernel has an acpi processor driver, and thus it would try to > use all the C-states?If they''re no there it tries to use the Intel cpuidle driver, which fails (just hangs forever in idle, I think).> If yes, must Xen show those tables to Dom? > If it did not, then the lines above would not be necessary, > as in the absence of any C-states, the kernel should > use halt by default.The dom0 kernel gets all the ACPI state so it can get all the juicy goodness from it. It does extract the C state info, but passes it back to Xen rather than use it itself. We don''t generally try to filter the ACPI state before letting dom0 see it (DMAR being the exception, since dom0 really has no business knowing about that). (We have this basic problem that neither Xen nor dom0 are the ideal owners of ACPI. In principle Xen should own ACPI as the most privileged "OS", but it really only cares about things like power states, interrupt routing, system topology, busses, etc. But dom0 cares about lid switches, magic keyboard keys, volume controls, video output switching, etc, etc. At the moment it seems to work best if dom0 do all ACPI processing then pass Xen the parts it needs, which are generally fixed-at-startup config info items.) J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Apr-01 03:03 UTC
Re: [Xen-devel] Re: [RFC PATCH V4 4/5] cpuidle: driver for xen
> >>>>> xen_arch_setup() does this: > >>>>> > >>>>> pm_idle = default_idle; > >>>>> boot_option_idle_override = IDLE_HALT; > > What happens on a Xen kernel if these lines are not there? > > Does Xen export the C-states tables to Dom0 kernel, and the Dom0 > > kernel has an acpi processor driver, and thus it would try to > > use all the C-states? > > If they''re no there it tries to use the Intel cpuidle driver, which > fails (just hangs forever in idle, I think). > > > If yes, must Xen show those tables to Dom? > > If it did not, then the lines above would not be necessary, > > as in the absence of any C-states, the kernel should > > use halt by default. > > The dom0 kernel gets all the ACPI state so it can get all the juicy > goodness from it. It does extract the C state info, but passes it back > to Xen rather than use it itself. We don''t generally try to filter the > ACPI state before letting dom0 see it (DMAR being the exception, since > dom0 really has no business knowing about that). > > (We have this basic problem that neither Xen nor dom0 are the ideal > owners of ACPI. In principle Xen should own ACPI as the most privileged > "OS", but it really only cares about things like power states, interrupt > routing, system topology, busses, etc. But dom0 cares about lid > switches, magic keyboard keys, volume controls, video output switching, > etc, etc. At the moment it seems to work best if dom0 do all ACPI > processing then pass Xen the parts it needs, which are generally > fixed-at-startup config info items.)Okay. Since a Xen kernel can also boot on bare iron, and thus includes cpuidle, acpi_idle, intel_idle; and when in Dom0 mode it simply wants to run HLT in idle... I think what we want to do is simply disable cpuidle when booted in Dom0 mode. That will allow it to fall back to default_idle w/o xen_setup needing to tinker with installing an idle driver. I''ll send a patch in the next series. If you can try it out, that would be great. thanks, -Len _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Maybe Matching Threads
- 2.6.37 dom0 under Xen 4.1 clocksource not working
- Re: [RFC PATCH V4 3/5] cpuidle: default idle driver for x86
- [RFC PATCH] Exporting ACPI Pxx/Cxx states to other kernel subsystems (v1).
- Kernel crash with acpi_processor, cpu_idle and intel_idle =y
- Kernel crash with acpi_processor, cpu_idle and intel_idle =y