it seems getting lost, and thus resend. Thanks, Kevin -----Original Message----- From: Tian, Kevin Sent: 2008年7月24日 8:39 To: ''Jan Beulich'' Cc: Liu, Jinsong; Keir Fraser; xen-devel@lists.xensource.com; mark.langsdorf@amd.com Subject: RE: [Xen-devel] cpufreq info propagation>From: Jan Beulich >Sent: 2008年7月23日 18:13 >> >>startup info is viable. But how fragile is the cmdline option? > >If the user passes an option like this manually, depending on the >particular kernel version''s logic of parsing options, this may >result in >conflicts and hence misbehavior. Using the startup info avoids such >collisions. Likewise, this method doesn''t scale, since the command line >cannot grow arbitrarily large.Agree then. We''ll send out a fix.> >>>In any case there is an open question for me as to how current status >>>is being reported to the user (at least in Dom0) on the current CPU >>>frequency in use when CPU_FREQ is configured off (and I have to >> >>If CPU_FREQ is off, error is returned when user tries to query >>cpufreq information from Xen. > >You mean when (in Xen) control is set to FREQCTL_dom0_kernel. But >my question was how this works when CPU_FREQ is configured off >in the kernel, but Xen is (told to [? - see below]) >control(ling) cpufreq.My answer was to xen controlled cpufreq. As I said below, xen cmd line option and dom0 CONFIG_CPU_FREQ are two conditions to enable xen controlled cpufreq. When FREQCTL_xen is set, cpufreq driver/governor within xen is enabled, which however doesn''t start working until someone (dom0) notifies xen about processor freq info. When dom0 CPU_FREQ is enabled, extcnl logic then attempts to parse ACPI px info and then notify Xen by hypercall. After that, Xen cpufreq begins working. I think we may enable CPU_FREQ in dom0''s config by default, to eliminate such disconnected logic.> >>>admit that while I didn''t try it, I suspect it doesn''t even work when >>>CPU_FREQ is on - this is because the apparently only reporting >>>function, processor_extcntl_get_performance(), is being called >>>exclusively in the context of acpi_processor_start()). With that I >>>wonder how I would be able to determine whether the code works >>>at all. >> >>Not sure about your concern. Currently xen-controlled cpufreq isn''t >>enabled by default. User needs pass xen cmdline and also enable >>dom0 CPU_FREQ. Once that''s done, everything should work. > >At least as enabling in Xen is concerned, I can''t see this is >the case: the >respective initializer functions are being called out of the >XENPF_set_processor_pminfo handler unconditonally. I.e. I''m missing >where this logic would be suppressed (kind of a > > ret = -ENOSYS; > if ( cpufreq_controller != FREQCTL_xen ) > break; > >missing at the top of the XEN_PM_PX sub-case).You''re right. This is a bug to gracefully handle such corner case.> >Further, tying the functionality to CPU_FREQ being enabled in Dom0 >seems odd to me, too: Who is controlling what then? I thought >it''s solely >in the hypervisor''s hands, and in that case involving the whole cpufreq >machinery (including the governors) in the kernel appears pointless.Processor freq info is contained in ACPI Px methods in AML style, and thus there''s no way for Xen to retrieve them without dom0''s help. However existing ACPI parse logic is only compiled when CPU_FREQ is configured. It''d be mess to copy those code out of CPU_FREQ. Also enable CPU_FREQ in dom0 doesn''t mean the control goes to dom0. Dom0 only tempts to control when cpufreq driver is loaded, and by far the driver is hacked from loading when extcnl is requesting. This is a bit hacky, and the cleaner way should move such check to generic cpufreq layer since there''re so many drivers. Note that similar story will be repeated in the future, if any feature is tightly coupled with ACPI, e.g. RAS.> >Additionally, there''s nothing being done to prevent the collision with >other low-level drivers. acpi-cpufreq and powernow-k8 have got an >early-out clause into their startup inserted, but all other >drivers have >been left alone. And with Xen being the preferred cpufreq controller, >in my opinion there ought to be an option to enable that in the kernel >which would automatically suppress all low-level drivers.Yes, we need handle it better. A patch will be made before 3.3. As a summary, (if I have answered your concerns), both xen-controlled and dom0-controlled options are working in normal case, with some corner case conflicting not handled cleanly. They both require CPU_FREQ being enabled to work.> >>>As a minor observation, I also find the naming vs. meaning of the >>>FREQCTL_none enumerator misleading - shouldn''t this rather be >>>FREQCTL_xen? >> >>If user wants to disable cpufreq at all, a FREQCTL_none option is >>required. > >Right, but I can''t see where it''s used (i.e. where a >distinction is made >between no and Xen-based cpufreq handling). >Sorry that I didn''t read code carefully, as I thought there should be a FREQCTL_xen already. It''s a silly one, and should be fixed. Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
OK, another try. Jinsong is working on fixes as pointed by Jan, however it blocks as latest tree doesn''t work for xen controlled cpufreq yet. It may be due to the cleanup patch yesterday, or some earlier changeset. Anyway, since Jan already sends out one fix (-ENOSYS), Jinsong will cover the rest after making it working again. Thanks, Kevin -----Original Message----- From: Tian, Kevin Sent: 2008年7月24日 12:12 To: ''xen-devel@lists.xensource.com''; ''Jan Beulich''; Keir Fraser; Liu, Jinsong; mark.langsdorf@amd.com Subject: FW: [Xen-devel] cpufreq info propagation it seems getting lost, and thus resend. Thanks, Kevin -----Original Message----- From: Tian, Kevin Sent: 2008年7月24日 8:39 To: ''Jan Beulich'' Cc: Liu, Jinsong; Keir Fraser; xen-devel@lists.xensource.com; mark.langsdorf@amd.com Subject: RE: [Xen-devel] cpufreq info propagation>From: Jan Beulich >Sent: 2008年7月23日 18:13 >> >>startup info is viable. But how fragile is the cmdline option? > >If the user passes an option like this manually, depending on the >particular kernel version''s logic of parsing options, this may >result in >conflicts and hence misbehavior. Using the startup info avoids such >collisions. Likewise, this method doesn''t scale, since the command line >cannot grow arbitrarily large.Agree then. We''ll send out a fix.> >>>In any case there is an open question for me as to how current status >>>is being reported to the user (at least in Dom0) on the current CPU >>>frequency in use when CPU_FREQ is configured off (and I have to >> >>If CPU_FREQ is off, error is returned when user tries to query >>cpufreq information from Xen. > >You mean when (in Xen) control is set to FREQCTL_dom0_kernel. But >my question was how this works when CPU_FREQ is configured off >in the kernel, but Xen is (told to [? - see below]) >control(ling) cpufreq.My answer was to xen controlled cpufreq. As I said below, xen cmd line option and dom0 CONFIG_CPU_FREQ are two conditions to enable xen controlled cpufreq. When FREQCTL_xen is set, cpufreq driver/governor within xen is enabled, which however doesn''t start working until someone (dom0) notifies xen about processor freq info. When dom0 CPU_FREQ is enabled, extcnl logic then attempts to parse ACPI px info and then notify Xen by hypercall. After that, Xen cpufreq begins working. I think we may enable CPU_FREQ in dom0''s config by default, to eliminate such disconnected logic.> >>>admit that while I didn''t try it, I suspect it doesn''t even work when >>>CPU_FREQ is on - this is because the apparently only reporting >>>function, processor_extcntl_get_performance(), is being called >>>exclusively in the context of acpi_processor_start()). With that I >>>wonder how I would be able to determine whether the code works >>>at all. >> >>Not sure about your concern. Currently xen-controlled cpufreq isn''t >>enabled by default. User needs pass xen cmdline and also enable >>dom0 CPU_FREQ. Once that''s done, everything should work. > >At least as enabling in Xen is concerned, I can''t see this is >the case: the >respective initializer functions are being called out of the >XENPF_set_processor_pminfo handler unconditonally. I.e. I''m missing >where this logic would be suppressed (kind of a > > ret = -ENOSYS; > if ( cpufreq_controller != FREQCTL_xen ) > break; > >missing at the top of the XEN_PM_PX sub-case).You''re right. This is a bug to gracefully handle such corner case.> >Further, tying the functionality to CPU_FREQ being enabled in Dom0 >seems odd to me, too: Who is controlling what then? I thought >it''s solely >in the hypervisor''s hands, and in that case involving the whole cpufreq >machinery (including the governors) in the kernel appears pointless.Processor freq info is contained in ACPI Px methods in AML style, and thus there''s no way for Xen to retrieve them without dom0''s help. However existing ACPI parse logic is only compiled when CPU_FREQ is configured. It''d be mess to copy those code out of CPU_FREQ. Also enable CPU_FREQ in dom0 doesn''t mean the control goes to dom0. Dom0 only tempts to control when cpufreq driver is loaded, and by far the driver is hacked from loading when extcnl is requesting. This is a bit hacky, and the cleaner way should move such check to generic cpufreq layer since there''re so many drivers. Note that similar story will be repeated in the future, if any feature is tightly coupled with ACPI, e.g. RAS.> >Additionally, there''s nothing being done to prevent the collision with >other low-level drivers. acpi-cpufreq and powernow-k8 have got an >early-out clause into their startup inserted, but all other >drivers have >been left alone. And with Xen being the preferred cpufreq controller, >in my opinion there ought to be an option to enable that in the kernel >which would automatically suppress all low-level drivers.Yes, we need handle it better. A patch will be made before 3.3. As a summary, (if I have answered your concerns), both xen-controlled and dom0-controlled options are working in normal case, with some corner case conflicting not handled cleanly. They both require CPU_FREQ being enabled to work.> >>>As a minor observation, I also find the naming vs. meaning of the >>>FREQCTL_none enumerator misleading - shouldn''t this rather be >>>FREQCTL_xen? >> >>If user wants to disable cpufreq at all, a FREQCTL_none option is >>required. > >Right, but I can''t see where it''s used (i.e. where a >distinction is made >between no and Xen-based cpufreq handling). >Sorry that I didn''t read code carefully, as I thought there should be a FREQCTL_xen already. It''s a silly one, and should be fixed. Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>Processor freq info is contained in ACPI Px methods in AML style, >and thus there''s no way for Xen to retrieve them without dom0''s help. >However existing ACPI parse logic is only compiled when CPU_FREQ >is configured. It''d be mess to copy those code out of CPU_FREQ.I understand that - an alternative idea was to enable CONFIG_CPU_FREQ by a (conditional) #define in the relevant core ACPI files. But that should probably be considered only if getting things to work by tweaking the core cpufreq code turns out to be no less intrusive.>Also enable CPU_FREQ in dom0 doesn''t mean the control goes to >dom0. Dom0 only tempts to control when cpufreq driver is loaded, >and by far the driver is hacked from loading when extcnl is requesting. >This is a bit hacky, and the cleaner way should move such check >to generic cpufreq layer since there''re so many drivers.Indeed, if that was done e.g. by making cpufreq_register_driver() fail in this case, I would feel much safer allowing the CPU_FREQ options again. But that would imply that all the drivers behave properly when that registration fails, and while powernow-k8 appears to do so, acpi-cpufreq doesn''t seem to - it would (in .26) at least leak per-CPU data allocated in acpi_cpufreq_early_init(). What I''m still unclear about is the role of the kernel governors when Xen controls cpufreq. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>From: Jan Beulich [mailto:jbeulich@novell.com] >Sent: 2008年7月24日 22:18 > >>Processor freq info is contained in ACPI Px methods in AML style, >>and thus there''s no way for Xen to retrieve them without dom0''s help. >>However existing ACPI parse logic is only compiled when CPU_FREQ >>is configured. It''d be mess to copy those code out of CPU_FREQ. > >I understand that - an alternative idea was to enable CONFIG_CPU_FREQ >by a (conditional) #define in the relevant core ACPI files. But that >should probably be considered only if getting things to work >by tweaking >the core cpufreq code turns out to be no less intrusive.Yes, we thought that alternative in the start, but then give up since it makes code a bit messed. Actually this is similar dependency as on ACPI PROCESSOR module, as you changed it back to ''m'' yesterday.> >>Also enable CPU_FREQ in dom0 doesn''t mean the control goes to >>dom0. Dom0 only tempts to control when cpufreq driver is loaded, >>and by far the driver is hacked from loading when extcnl is >requesting. >>This is a bit hacky, and the cleaner way should move such check >>to generic cpufreq layer since there''re so many drivers. > >Indeed, if that was done e.g. by making cpufreq_register_driver() >fail in this case, I would feel much safer allowing the CPU_FREQ >options again. But that would imply that all the drivers behaveYes, that''s the option we tempt to do.>properly when that registration fails, and while powernow-k8 >appears to do so, acpi-cpufreq doesn''t seem to - it would (in .26) >at least leak per-CPU data allocated in acpi_cpufreq_early_init().But there''s no way to make it elegant, as cpufreq drivers themselves are initialized independently except the registration hooked to core cpufreq layer. If we really want to make absolutely clean, case by case check and hack has to be done per each cpufreq driver. But I guess that''s not worthy as even some data leaked which won''t be accumulative.> >What I''m still unclear about is the role of the kernel governors when >Xen controls cpufreq. >As you already perceived, the purpose is just to use some ACPI Px parse logic. But to avoid intrusive change to existing linux code, we choose a coarse-level option to enable whole CPU_FREQ. When kernel image size is enlarged a bit, no runtime overhead is added as kernel governors are not activated at all when driver is not registered, IIRC. :-) Thanks, Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel