Trinabh Gupta
2011-Mar-23 09:22 UTC
[Xen-devel] Re: [RFC PATCH V4 2/5] cpuidle: list based cpuidle driver registration and selection
Hi Len, The goal of the patch series is to remove exported pm_idle function pointer (see http://lkml.org/lkml/2009/8/28/43 and http://lkml.org/lkml/2009/8/28/50 for problems related to pm_idle). The first patch in the series removes pm_idle for x86 and we now directly call cpuidle_idle_call as suggested by Arjan (https://lkml.org/lkml/2010/10/19/453). But we also have to replace the functionality provided by pm_idle, i.e. call default_idle for platforms where no better idle routine exists, call mwait for pre-nehalem platforms, use intel_idle or acpi_idle for nehalem architectures etc. To manage all this we need a registration mechanism which is conveniently provided by cpuidle. In theory I agree that we can maybe do without list based registration i.e probe and pick the best for the platform, but things may become less predictable and difficult to manage as we have more and more platforms and drivers. By directly calling into cpuidle, we already have arch default other than intel_idle and acpi_idle. Then APM and xen (though it uses default_idle) also have their own idle routines. List based management and selection based on priority would provide a cleaner solution. Thanks, -Trinabh On 03/23/2011 08:29 AM, Len Brown wrote:> the original cpuidle prototype supported multiple driver registration, > but no production use for it could be imagined, and so it was deleted. > > Subsequently on x86, we added intel_idle to replace acpi_idle > and a typical kernel will have them both built in. > We still don''t allow mutliple registrations, we just arrange > affairs such that the preferred intel_idle probes before > the backup, acpi_idle. If intel_idle recognizes the platform, > its probe succeeds, else acpi_idle gets a go. > If there is a problem with intel_idle, or a comparison needs to be made, > a bootparam is available to tell intel_idle not to probe. > > This mechanism takes approximately 10 lines of code -- the bootparam > to disable the preferred driver. > > What is the benefit of all the code to support the feature of run-time > multiple driver registration and switching? > > thanks, > Len Brown, Intel Open Source Technology Center > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Mar-23 20:51 UTC
[Xen-devel] Re: [RFC PATCH V4 2/5] cpuidle: list based cpuidle driver registration and selection
> The goal of the patch series is to remove exported pm_idle function > pointer (see http://lkml.org/lkml/2009/8/28/43 and > http://lkml.org/lkml/2009/8/28/50 for problems related to pm_idle). > The first patch in the series removes pm_idle for x86 and we > now directly call cpuidle_idle_call as suggested by Arjan > (https://lkml.org/lkml/2010/10/19/453).So the problem statement with "pm_idle" is that it is visible to modules and thus potentially racey and unsafe? Any reason we can''t delete his line today to address most of the concern? EXPORT_SYMBOL(pm_idle);> But we also have to replace the functionality provided by pm_idle, > i.e. call default_idle for platforms where no better idle routine > exists, call mwait for pre-nehalem platforms, use intel_idle or > acpi_idle for nehalem architectures etc. To manage all this > we need a registration mechanism which is conveniently provided > by cpuidle.It isn''t immediately clear to me that all of these options need to be preserved. Are we suggesting that x86 must always build with cpuidle? I''m sure that somebody someplace will object to that. OTOH, if cpuidle is included, I''d like to see the non-cpuidle code excluded, since nobody will run it...> In theory I agree that we can maybe do without list based > registration i.e probe and pick the best for the platform, but things > may become less predictable and difficult to manage as > we have more and more platforms and drivers. > By directly calling into cpuidle, we already have arch default > other than intel_idle and acpi_idle. Then APM and xen (though > it uses default_idle) also have their own idle routines. > List based management and selection based on priority would provideDoes anybody actually use the latest kernel in APM mode? I''m not even sure the last version of Windows that would talk to APM, it was whatever was before Windows-95, I think. But don''t get me wrong, I agree that pm_idle should go. I agree that cpuidle should have a default other than the polling loop it currently uses. I just don''t think we should spend a lot of code and time preserving every conceivable option and feature. We should first do some spring cleaning to see if we can simplify the problem. thanks, -Len Brown, Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Mar-24 04:41 UTC
[Xen-devel] Re: [RFC PATCH V4 2/5] cpuidle: list based cpuidle driver registration and selection
> Does anybody actually use the latest kernel in APM mode?Google tells me that Windows 2000 still supported APM, and it was still present in Windows XP APM was not present in Windows Vista, aka Windows 2006. MS dropped support in their latest OS 5 years ago. When will the latest Linux drop APM support? -Len _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Trinabh Gupta
2011-Mar-24 14:13 UTC
[Xen-devel] Re: [RFC PATCH V4 2/5] cpuidle: list based cpuidle driver registration and selection
On 03/24/2011 02:21 AM, Len Brown wrote:>> The goal of the patch series is to remove exported pm_idle function >> pointer (see http://lkml.org/lkml/2009/8/28/43 and >> http://lkml.org/lkml/2009/8/28/50 for problems related to pm_idle). >> The first patch in the series removes pm_idle for x86 and we >> now directly call cpuidle_idle_call as suggested by Arjan >> (https://lkml.org/lkml/2010/10/19/453). > > So the problem statement with "pm_idle" is that it is visible to modules > and thus potentially racey and unsafe? > > Any reason we can''t delete his line today to address most of the concern?Hi Len, I think there are other problems too, related to saving and restoring of pm_idle pointer. For example, cpuidle itself saves current value of pm_idle, flips it and then restores the saved value. There is no guarantee that the saved function still exists. APM does exact same thing (though it may not be used these days). The problem also is that a number of architectures have copied the same design based on pm_idle; so its spreading.> > EXPORT_SYMBOL(pm_idle); > >> But we also have to replace the functionality provided by pm_idle, >> i.e. call default_idle for platforms where no better idle routine >> exists, call mwait for pre-nehalem platforms, use intel_idle or >> acpi_idle for nehalem architectures etc. To manage all this >> we need a registration mechanism which is conveniently provided >> by cpuidle. > > It isn''t immediately clear to me that all of these options > need to be preserved.So what do you suggest can be removed?> > Are we suggesting that x86 must always build with cpuidle? > I''m sure that somebody someplace will object to that.Arjan argued that since almost everyone today runs cpuidle it may be best to include it in the kernel (https://lkml.org/lkml/2010/10/20/243). But yes, we agreed that we would have to make cpuidle lighter incrementally. Making ladder governor optional could be one way for example.> > OTOH, if cpuidle is included, I''d like to see the > non-cpuidle code excluded, since nobody will run it... > >> In theory I agree that we can maybe do without list based >> registration i.e probe and pick the best for the platform, but things >> may become less predictable and difficult to manage as >> we have more and more platforms and drivers. >> By directly calling into cpuidle, we already have arch default >> other than intel_idle and acpi_idle. Then APM and xen (though >> it uses default_idle) also have their own idle routines. >> List based management and selection based on priority would provide > > Does anybody actually use the latest kernel in APM mode? > I''m not even sure the last version of Windows that would talk to APM, > it was whatever was before Windows-95, I think. > > But don''t get me wrong, I agree that pm_idle should go. > I agree that cpuidle should have a default other than > the polling loop it currently uses.Sure, thanks -Trinabh _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Vaidyanathan Srinivasan
2011-Mar-24 16:52 UTC
[Xen-devel] Re: [RFC PATCH V4 2/5] cpuidle: list based cpuidle driver registration and selection
* Trinabh Gupta <trinabh@linux.vnet.ibm.com> [2011-03-24 19:43:43]: [snip]> >>But we also have to replace the functionality provided by pm_idle, > >>i.e. call default_idle for platforms where no better idle routine > >>exists, call mwait for pre-nehalem platforms, use intel_idle or > >>acpi_idle for nehalem architectures etc. To manage all this > >>we need a registration mechanism which is conveniently provided > >>by cpuidle. > > > >It isn''t immediately clear to me that all of these options > >need to be preserved. > > So what do you suggest can be removed?Can we use safe_halt() until intel_idle/acpi_idle take over? But what if they do not take over? If safe_halt() is not very bad compared to the variants like mwait_idle and c1e_idle, then we can remove the old code and no need to move them to default driver.> >Are we suggesting that x86 must always build with cpuidle? > >I''m sure that somebody someplace will object to that. > > Arjan argued that since almost everyone today runs cpuidle > it may be best to include it in the kernel > (https://lkml.org/lkml/2010/10/20/243). But yes, we agreed > that we would have to make cpuidle lighter incrementally. > Making ladder governor optional could be one way for example. > > > >OTOH, if cpuidle is included, I''d like to see the > >non-cpuidle code excluded, since nobody will run it...The non-cpuidle code will be the select_idle_routine() and related function that cam move to default_driver that register to cpuidle. We can load on-demand as module if better routines fail to register. Maybe we don''t need this at all as discussed in the above point? --Vaidy [snip] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Mar-25 07:05 UTC
[Xen-devel] Re: [RFC PATCH V4 2/5] cpuidle: list based cpuidle driver registration and selection
> I think there are other problems too, related to saving and restoring > of pm_idle pointer. For example, cpuidle itself saves current value > of pm_idle, flips it and then restores the saved value. There is > no guarantee that the saved function still exists. APM does exact > same thing (though it may not be used these days). > > The problem also is that a number of architectures have copied the > same design based on pm_idle; so its spreading.pm_idle is a primitive design yes, but I think the issue with pm_idle is a theoretical one, at least on x86; as there isn''t any other code scribbling on pm_idle in practice. So this is clean-up, rather than bug-fix work...> > It isn''t immediately clear to me that all of these options > > need to be preserved. > > So what do you suggest can be removed?I sent a series of small patches yesterday to get the ball rolling... https://lkml.org/lkml/2011/3/24/54 I think the xen thing can go away. I proposed that APM be removed entirely, but that will take a few releases to conclude....> > Are we suggesting that x86 must always build with cpuidle? > > I''m sure that somebody someplace will object to that. > > Arjan argued that since almost everyone today runs cpuidle > it may be best to include it in the kernel > (https://lkml.org/lkml/2010/10/20/243). But yes, we agreed > that we would have to make cpuidle lighter incrementally. > Making ladder governor optional could be one way for example.ladder is already optional. cheers, -Len Brown, Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Mar-25 07:13 UTC
[Xen-devel] Re: [RFC PATCH V4 2/5] cpuidle: list based cpuidle driver registration and selection
> > So what do you suggest can be removed? > > Can we use safe_halt() until intel_idle/acpi_idle take over? But what > if they do not take over? If safe_halt() is not very bad compared to > the variants like mwait_idle and c1e_idle, then we can remove the old > code and no need to move them to default driver.One reason I''d like a default cpuidle driver is that today there is a race. cpuidle registers, but until its driver registers it will use polling. go ahead and look: grep . /sys/devices/system/cpu/cpu*/cpuidle/state0/usage that should be 0, but it isn''t...> > >Are we suggesting that x86 must always build with cpuidle? > > >I''m sure that somebody someplace will object to that. > > > > Arjan argued that since almost everyone today runs cpuidle > > it may be best to include it in the kernel > > (https://lkml.org/lkml/2010/10/20/243). But yes, we agreed > > that we would have to make cpuidle lighter incrementally. > > Making ladder governor optional could be one way for example. > > > > > >OTOH, if cpuidle is included, I''d like to see the > > >non-cpuidle code excluded, since nobody will run it... > > The non-cpuidle code will be the select_idle_routine() and related > function that cam move to default_driver that register to cpuidle. > We can load on-demand as module if better routines fail to register. > Maybe we don''t need this at all as discussed in the above point?Right, though I don''t share your fascination with modules. cheers, Len Brown, Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Mar-25 15:35 UTC
Re: [Xen-devel] Re: [RFC PATCH V4 2/5] cpuidle: list based cpuidle driver registration and selection
On Fri, Mar 25, 2011 at 03:05:36AM -0400, Len Brown wrote:> > I think there are other problems too, related to saving and restoring > > of pm_idle pointer. For example, cpuidle itself saves current value > > of pm_idle, flips it and then restores the saved value. There is > > no guarantee that the saved function still exists. APM does exact > > same thing (though it may not be used these days). > > > > The problem also is that a number of architectures have copied the > > same design based on pm_idle; so its spreading. > > pm_idle is a primitive design yes, but I think the issue > with pm_idle is a theoretical one, at least on x86; > as there isn''t any other code scribbling on pm_idle > in practice. So this is clean-up, rather than bug-fix work... > > > > It isn''t immediately clear to me that all of these options > > > need to be preserved. > > > > So what do you suggest can be removed? > > I sent a series of small patches yesterday to get the ball rolling... > https://lkml.org/lkml/2011/3/24/54 > > I think the xen thing can go away.The xen thing being the setting of cpuidle to halt or the proposed patch? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Len Brown
2011-Mar-31 02:25 UTC
Re: [Xen-devel] Re: [RFC PATCH V4 2/5] cpuidle: list based cpuidle driver registration and selection
thanks, Len Brown, Intel Open Source Technology Center On Fri, 25 Mar 2011, Konrad Rzeszutek Wilk wrote:> On Fri, Mar 25, 2011 at 03:05:36AM -0400, Len Brown wrote: > > > I think there are other problems too, related to saving and restoring > > > of pm_idle pointer. For example, cpuidle itself saves current value > > > of pm_idle, flips it and then restores the saved value. There is > > > no guarantee that the saved function still exists. APM does exact > > > same thing (though it may not be used these days). > > > > > > The problem also is that a number of architectures have copied the > > > same design based on pm_idle; so its spreading. > > > > pm_idle is a primitive design yes, but I think the issue > > with pm_idle is a theoretical one, at least on x86; > > as there isn''t any other code scribbling on pm_idle > > in practice. So this is clean-up, rather than bug-fix work... > > > > > > It isn''t immediately clear to me that all of these options > > > > need to be preserved. > > > > > > So what do you suggest can be removed? > > > > I sent a series of small patches yesterday to get the ball rolling... > > https://lkml.org/lkml/2011/3/24/54 > > > > I think the xen thing can go away. > > The xen thing being the setting of cpuidle to halt or the proposed > patch?I don''t think Xen needs a cpuidle driver. Xen just needs to tell the kernel to call HALT, and I think we''ll keep that around for the non-cpuidle case, and for the idle periods before cpuidle initializes. cheers, Len Brown, Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel