Niraj Tolia
2008-Oct-22 07:57 UTC
[Xen-devel] Problems with enabling hypervisor C and P-state control
Hi,

Is there any documentation on enabling hypervisor support for both C and P-state control?

On xen-unstable and linux-2.6.18-xen.hg, if I enable cpuidle=1 on the xen command line and then run xenpm, I get output for C-states (shown below), but it complains that "Xen cpufreq is not enabled!"

cpu id : 0
total C-states : 2
idle time(ms) : 73264
C0 : transition [00000000000000000000]
     residency [00000000000000000000 ms]
C1 : transition [00000000000000025505]
     residency [00000000000000000000 ms]

(repeats for all cores)

However, CPU_FREQ depends on PROCESSOR_EXTERNAL_CONTROL being set to 'n'. There doesn't seem to be a way to enable/disable CONFIG_PROCESSOR_EXTERNAL_CONTROL from menuconfig. I therefore manually twiddled those bits in .config and enabled CPU_FREQ and CPU_FREQ_TABLE (plus the performance governor). If I don't specify the cpufreq option on the dom0 kernel command line, xenpm no longer gives me C-state information (output below) and repeats the same warning about cpufreq not being enabled.

cpu id : 0
total C-states : 0
idle time(ms) : 0

(repeats for all cores)

If I then add cpufreq=xen to the kernel command line, xenpm's output does not change. Any ideas on what I might be doing wrong? This is on a quad-core Xeon (E7330).

Cheers,
Niraj

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Yu, Ke
2008-Oct-22 08:16 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
You can simply add the Xen grub option "cpufreq=xen cpuidle" to enable C and P states, like the following:

kernel /boot/xen.gz dom0_mem=256M console=com1 cpufreq=xen cpuidle

For the xen-unstable and linux-2.6.18-xen.hg compilation, you don't need to change anything: C and P support is compiled in by default, and there is no conflict between them.

If you still cannot see P-state info in xenpm, could you please check your BIOS to see whether Px is enabled? Or can you add loglvl=info to the xen command line and send the xen boot log, so that we can further analyze how the Px BIOS info is parsed by Xen?

Best Regards
Ke
Niraj Tolia
2008-Oct-23 23:49 UTC
Re: [Xen-devel] Problems with enabling hypervisor C and P-state control
On Wed, Oct 22, 2008 at 1:16 AM, Yu, Ke <ke.yu@intel.com> wrote:
> You can simply add xen grub option "cpufreq=xen cpuidle" to enable C and P state, like the following:
> kernel /boot/xen.gz dom0_mem=256M console=com1 cpufreq=xen cpuidle.
>

Hi Ke,

This seems to "work", but xenpm doesn't seem to be giving me the right data. On a completely idle system, the residency time of state C1 is 0ms. xentrace does report a large number of transitions, though, and they seem to occur in rapid succession. Is this correct?

CPU4 320429115477 (+24006897) cpu_idle_exit [ C1 -> C0 ]
CPU4 320429118141 (+ 2664) cpu_idle_entry [ C0 -> C1 ]
CPU4 320437500021 (+ 8381880) cpu_idle_exit [ C1 -> C0 ]
CPU4 320437505493 (+ 5472) cpu_idle_entry [ C0 -> C1 ]
CPU4 320453123229 (+15617736) cpu_idle_exit [ C1 -> C0 ]
CPU4 320453125992 (+ 2763) cpu_idle_entry [ C0 -> C1 ]
CPU4 320477132295 (+24006303) cpu_idle_exit [ C1 -> C0 ]
CPU4 320477134968 (+ 2673) cpu_idle_entry [ C0 -> C1 ]

Second, when I look at the P-state output (shown below), xenpm shows that the lowest P-state is only set on the first socket (this is a quad-core, quad-socket system). However, I have a feeling that this might be a problem with displaying the data rather than with the underlying logic. Any ideas?

# xenpm | grep '*'
*P3 : freq [1599 MHz]
*P3 : freq [1599 MHz]
*P3 : freq [1599 MHz]
*P3 : freq [1599 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]
*P0 : freq [2398 MHz]

Cheers,
Niraj
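
[Editorial sketch] The question of how much time CPU4 really spends in C1 can be estimated from the trace itself. The following is a minimal Python sketch, not part of xenpm or xentrace: the function name c1_cycles and the regular expression are assumptions made for illustration, and the timestamps are assumed to be in a single monotonic unit (presumably TSC cycles). It pairs each cpu_idle_entry with the next cpu_idle_exit on the same CPU and sums the differences of the absolute timestamps.

```python
import re
from collections import defaultdict

# Matches the idle records in the xentrace_format output quoted above.
# The layout is inferred from that excerpt only.
IDLE_RE = re.compile(
    r"CPU(\d+)\s+(\d+)\s+\(\+\s*\d+\)\s+(cpu_idle_entry|cpu_idle_exit)"
)

def c1_cycles(trace_text):
    """Sum, per CPU, the time between cpu_idle_entry and the next cpu_idle_exit."""
    entered = {}              # cpu -> timestamp of the pending idle entry
    total = defaultdict(int)  # cpu -> accumulated idle time
    for line in trace_text.splitlines():
        m = IDLE_RE.search(line)
        if not m:
            continue
        cpu, stamp, event = int(m.group(1)), int(m.group(2)), m.group(3)
        if event == "cpu_idle_entry":
            entered[cpu] = stamp
        elif cpu in entered:  # an exit without a recorded entry is skipped
            total[cpu] += stamp - entered.pop(cpu)
    return dict(total)

SAMPLE = """\
CPU4 320429115477 (+24006897) cpu_idle_exit [ C1 -> C0 ]
CPU4 320429118141 (+ 2664) cpu_idle_entry [ C0 -> C1 ]
CPU4 320437500021 (+ 8381880) cpu_idle_exit [ C1 -> C0 ]
CPU4 320437505493 (+ 5472) cpu_idle_entry [ C0 -> C1 ]
CPU4 320453123229 (+15617736) cpu_idle_exit [ C1 -> C0 ]
CPU4 320453125992 (+ 2763) cpu_idle_entry [ C0 -> C1 ]
CPU4 320477132295 (+24006303) cpu_idle_exit [ C1 -> C0 ]
CPU4 320477134968 (+ 2673) cpu_idle_entry [ C0 -> C1 ]
"""

idle = c1_cycles(SAMPLE)
span = 320477134968 - 320429115477  # first to last record in the excerpt
print(idle[4], span, round(100 * idle[4] / span, 2))
```

On this excerpt the idle intervals add up to 48005919 of the 48019491 units covered, roughly 99.97%, so the trace suggests the CPU is almost always in C1 and the 0 ms residency is a reporting problem rather than missing idle time.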
Tian, Kevin
2008-Oct-24 02:43 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
Niraj, any update on Ke's suggestion?

Thanks,
Kevin
Yu, Ke
2008-Oct-24 03:14 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
Niraj Tolia wrote:
>
> Hi Ke,
>
> This seems to "work" but xenpm doesn't seem to be giving me the right
> data. On a completely idle system, the residency time of state C1 is
> 0ms. xentrace does report a large number of transitions though but
> they seem to be in rapid succession. Is this correct?
>
> CPU4 320429115477 (+24006897) cpu_idle_exit [ C1 -> C0 ]
> CPU4 320429118141 (+ 2664) cpu_idle_entry [ C0 -> C1 ]
> CPU4 320437500021 (+ 8381880) cpu_idle_exit [ C1 -> C0 ]
> CPU4 320437505493 (+ 5472) cpu_idle_entry [ C0 -> C1 ]
> CPU4 320453123229 (+15617736) cpu_idle_exit [ C1 -> C0 ]
> CPU4 320453125992 (+ 2763) cpu_idle_entry [ C0 -> C1 ]
> CPU4 320477132295 (+24006303) cpu_idle_exit [ C1 -> C0 ]
> CPU4 320477134968 (+ 2673) cpu_idle_entry [ C0 -> C1 ]

Glad to see it "works" :) and thanks for your feedback; it is valuable to us in terms of improving the Xen PM functionality.

You are right, the output is not correct. The root cause is that the Xen PM statistics logic does not record the C1 time. The "hlt" C1 time cannot be calculated accurately because of interrupts, so we originally hesitated to record an inaccurate value. Since the Linux kernel now has "near to accurate" C1 time accounting, we are planning to port that patch to Xen. After that, xenpm will show correct info.

> Second, when I look at the P-state output (shown below), xenpm shows
> that the lowest P-state is only set on the first socket (this is a
> quad-core, quad-socket system). However, I have a feeling that this
> might be a problem with displaying the data rather than the underlying
> logic. Any ideas?
>
> # xenpm | grep '*'
> *P3 : freq [1599 MHz]
> *P3 : freq [1599 MHz]
> *P3 : freq [1599 MHz]
> *P3 : freq [1599 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]
> *P0 : freq [2398 MHz]

This seems to be a bug. From the above info, I cannot decide whether it is a xenpm issue or a Xen cpufreq issue. Could you please provide more info, e.g.
- the xen boot log (with loglvl=info), so that we can see whether the cpufreq driver is initialized on all cpus
- xentrace data on Px states, so that we can see whether the Px transitions really happened.

Best Regards
Ke
Tian, Kevin
2008-Oct-24 04:13 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
>From: Yu, Ke
>Sent: Friday, October 24, 2008 11:15 AM
>
>This seems a bug. From the above info, I cannot decide if it
>is xenpm issue or xen cpufreq issue. could you please provide
>more info, e.g.
>- xen boot log (with loglvl=info), so that we can see if
>cpufreq driver is initialized in all cpus
>- xentrace data on Px state, so that we can see if the Px
>transition really happened.
>

BTW, did you create any domains and run any workloads?

Thanks,
Kevin
Niraj Tolia
2008-Oct-24 04:45 UTC
Re: [Xen-devel] Problems with enabling hypervisor C and P-state control
On Thu, Oct 23, 2008 at 8:14 PM, Yu, Ke <ke.yu@intel.com> wrote:

[snip]

> This seems a bug. From the above info, I cannot decide if it is xenpm issue or xen cpufreq issue. could you please provide more info, e.g.
> - xen boot log (with loglvl=info), so that we can see if cpufreq driver is initialized in all cpus

The output of 'xm dmesg' is attached, but it is truncated because the buffer overflowed. However, there should be enough information to show that the cpufreq drivers are initialized for most cpus.

> - xentrace data on Px state, so that we can see if the Px transition really happened.
>

The output below was captured while I started and stopped CPU-intensive tasks on a system where xenpm initially listed the first four cores as being in P3 and the rest in P0.

cat tmp.out | xentrace_format /home/ntolia/src/xen-unstable.hg/tools/xentrace/formats | grep -i freq

CPU3 635322726837 (+ 1898604) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU0 637123663233 (+ 1816488) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU0 637365505221 (+ 1826208) cpu_freq_change [ 2398MHz -> 1599MHz ]
CPU1 637423814277 (+ 1799946) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU2 637449731748 (+ 1748628) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU2 637691500107 (+ 1816884) cpu_freq_change [ 2398MHz -> 1599MHz ]
CPU0 637847441253 (+ 1932336) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU1 637905760983 (+ 1819935) cpu_freq_change [ 2398MHz -> 1599MHz ]
CPU2 637933222818 (+ 1707120) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU1 638147682630 (+ 1906677) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU0 639049514238 (+ 1859544) cpu_freq_change [ 2398MHz -> 2132MHz ]
CPU0 639531387018 (+ 1844451) cpu_freq_change [ 2132MHz -> 2398MHz ]
CPU1 639589748679 (+ 1772361) cpu_freq_change [ 2398MHz -> 1599MHz ]
CPU1 639831560877 (+ 1797507) cpu_freq_change [ 1599MHz -> 2398MHz ]

I would be happy to help test patches on this system.

Cheers,
Niraj

--
Niraj Tolia, Researcher, HP Labs
http://www.hpl.hp.com/personal/Niraj_Tolia/
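
[Editorial sketch] To check a trace like the one above against what xenpm displays, the cpu_freq_change records can be replayed to find the last frequency each CPU was switched to. This is a rough Python sketch under the assumption that the record layout is exactly as in the excerpt; final_frequencies and FREQ_RE are names invented here, not part of any Xen tool.

```python
import re

# Layout inferred from the cpu_freq_change lines quoted above.
FREQ_RE = re.compile(
    r"CPU(\d+)\s+\d+\s+\(\+\s*\d+\)\s+cpu_freq_change\s+"
    r"\[\s*(\d+)MHz\s*->\s*(\d+)MHz\s*\]"
)

def final_frequencies(trace_text):
    """Return {cpu: MHz} holding the target of the last change seen per CPU."""
    last = {}
    for line in trace_text.splitlines():
        m = FREQ_RE.search(line)
        if m:
            last[int(m.group(1))] = int(m.group(3))
    return last

FREQ_SAMPLE = """\
CPU3 635322726837 (+ 1898604) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU0 637123663233 (+ 1816488) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU0 637365505221 (+ 1826208) cpu_freq_change [ 2398MHz -> 1599MHz ]
CPU1 637423814277 (+ 1799946) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU2 637449731748 (+ 1748628) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU2 637691500107 (+ 1816884) cpu_freq_change [ 2398MHz -> 1599MHz ]
CPU0 637847441253 (+ 1932336) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU1 637905760983 (+ 1819935) cpu_freq_change [ 2398MHz -> 1599MHz ]
CPU2 637933222818 (+ 1707120) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU1 638147682630 (+ 1906677) cpu_freq_change [ 1599MHz -> 2398MHz ]
CPU0 639049514238 (+ 1859544) cpu_freq_change [ 2398MHz -> 2132MHz ]
CPU0 639531387018 (+ 1844451) cpu_freq_change [ 2132MHz -> 2398MHz ]
CPU1 639589748679 (+ 1772361) cpu_freq_change [ 2398MHz -> 1599MHz ]
CPU1 639831560877 (+ 1797507) cpu_freq_change [ 1599MHz -> 2398MHz ]
"""

for cpu, mhz in sorted(final_frequencies(FREQ_SAMPLE).items()):
    print(f"CPU{cpu}: {mhz} MHz")
```

Only CPU0 through CPU3 appear in this excerpt, which is worth keeping in mind when comparing it with the sixteen lines that xenpm prints.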
Niraj Tolia
2008-Oct-24 04:47 UTC
Re: [Xen-devel] Problems with enabling hypervisor C and P-state control
On Thu, Oct 23, 2008 at 9:13 PM, Tian, Kevin <kevin.tian@intel.com> wrote:
>
> BTW, did you create any domains and run any workloads?

No, not yet. Only dom0 was active. Running workloads in different domains is the next step once I have verified that basic Px transitions are performing as expected.

Cheers,
Niraj

--
Niraj Tolia, Researcher, HP Labs
http://www.hpl.hp.com/personal/Niraj_Tolia/
Yu, Ke
2008-Oct-24 05:59 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
After discussing with Jinsong, we found the root cause. You are right, this is an issue in the Xen PM statistics logic. When the coordination type is SW_ANY, we only record the cpufreq change for the first CPU; the other 3 cores within the same dependency domain are ignored, so you see only one core change in each dependency domain.

The attached patch fixes this issue. Could you please give it a try? If it works on your platform, we will send it out for inclusion upstream.

Best Regards
Ke
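
[Editorial sketch] The statistics problem described in this message can be pictured with a toy model. The Python below is not the Xen code and not the attached patch; record_change, the whole_domain flag, and the CPU numbers in DOMAIN are all hypothetical. It only contrasts bookkeeping that updates the first CPU of a dependency domain with bookkeeping that updates every CPU in it.

```python
def record_change(current_pstate, domain, new_pstate, whole_domain):
    """Update per-CPU P-state bookkeeping after one frequency change.

    domain[0] stands in for the CPU on which the change was issued.
    whole_domain=False models recording only that first CPU;
    whole_domain=True models recording every CPU that shares the domain.
    """
    targets = domain if whole_domain else domain[:1]
    for cpu in targets:
        current_pstate[cpu] = new_pstate

DOMAIN = [0, 4, 8, 12]  # hypothetical CPUs in one dependency domain

first_only = {cpu: "P0" for cpu in DOMAIN}
record_change(first_only, DOMAIN, "P3", whole_domain=False)

all_cpus = {cpu: "P0" for cpu in DOMAIN}
record_change(all_cpus, DOMAIN, "P3", whole_domain=True)

print(first_only)
print(all_cpus)
```

In the first dictionary only one CPU shows the new state even though, under SW_ANY coordination, the whole domain changes frequency together; that mismatch is the kind of display-only discrepancy the message describes.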
Tian, Kevin
2008-Oct-24 07:56 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
Niraj, one more piece of information. From your log, the cpus on your platform are actually indexed as follows:

package 1 (0, 4, 8, 12)
package 2 (1, 5, 9, 13)
package 3 (2, 6, 10, 14)
package 4 (3, 7, 11, 15)

Thus cpu0/1/2/3 are actually the 1st core in each package, rather than the 4 cores in the 1st package. :-)

Thanks,
Kevin
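
[Editorial sketch] The indexing listed in this message follows a simple round-robin pattern, which can be written down as a small Python sketch. The functions package_of and core_of are illustrative names; the package numbering comes from the list in the message, while treating the position within each list as a core index is an assumption.

```python
# Package membership exactly as listed in the message.
PACKAGES = {
    1: (0, 4, 8, 12),
    2: (1, 5, 9, 13),
    3: (2, 6, 10, 14),
    4: (3, 7, 11, 15),
}

def package_of(cpu, n_packages=4):
    """1-based package number for a flat cpu id on this particular platform."""
    return cpu % n_packages + 1

def core_of(cpu, n_packages=4):
    """0-based position inside the package (assumed to be the core index)."""
    return cpu // n_packages

# Cross-check the formulas against the listed table.
for pkg, cpus in PACKAGES.items():
    for position, cpu in enumerate(cpus):
        assert package_of(cpu) == pkg and core_of(cpu) == position

print([package_of(cpu) for cpu in range(4)])
```

The first four flat ids land in four different packages, so four consecutive lines of per-CPU output do not describe one socket on this machine.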
Ian Pratt
2008-Oct-24 08:31 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
> Niraj, one more info here. From your log, cpus on your platform
> are actually indexed by:
> package 1 (0, 4, 8, 12)
> package 2 (1, 5, 9, 13)
> package 3 (2, 6, 10, 14)
> package 4 (3, 7, 11, 15)
>
> thus cpu0/1/2/3 actually indicates 1st core in each package,
> instead of 4 cores in 1st package. :-)

We really ought to have code in Xen to make the enumeration consistent regardless of the BIOS order. I don't really care whether it's sockets/cores/threads or threads/cores/sockets, but we really ought to be consistent.

Ian
Tian, Kevin
2008-Oct-25 03:01 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
>From: Ian Pratt [mailto:Ian.Pratt@eu.citrix.com]
>Sent: Friday, October 24, 2008 4:31 PM
>
>We really ought to have code in Xen to make the enumeration consistent
>regardless of the BIOS order. I don't really care whether its
>sockets/cores/threads or threads/cores/sockets, but we really ought to
>be consistent.

If Xen wants to be consistent with one policy, as you suggested, it needs the core/thread info to be known before booting the APs, so that it can be used in alloc_cpu_id. However, core/thread info can only be acquired on an AP via CPUID, and the sibling/core map can only be constructed after all APs are booted. You would then need to allocate a temporary cpu id and switch it to the real one later. This looks a bit messy for some arrays[NR_CPUS]. Is it worth doing it that way, or should we just expose the mapping between the xen cpu id and sockets/cores/threads?

Thanks,
Kevin
Keir Fraser
2008-Oct-25 14:50 UTC
Re: [Xen-devel] Problems with enabling hypervisor C and P-state control
On 25/10/08 04:01, "Tian, Kevin" <kevin.tian@intel.com> wrote:

>> We really ought to have code in Xen to make the enumeration consistent
>> regardless of the BIOS order. I don't really care whether its
>> sockets/cores/threads or threads/cores/sockets, but we really ought to
>> be consistent.
>
> If Xen wants to be consistent with one policy, as you suggested, it
> requires core/thread info known before booting APs, and then take
> that info in alloc_cpu_id. However core/thread info can be only
> acquired on AP by CPUID and sibling/core map can be constructed
> after all APs are booted.

Sort by LAPIC ID. The LAPIC ID is defined to be hierarchical, so this would automatically get us sorting by thread, then core, then socket. This could be as simple as arranging for the calls to mp_register_lapic() to happen in the correct order.

> Then you may need a temporary cpu id
> allocated and then switch it to a real one later. This looks a bit
> messed on some arrays[NR_CPUS]. Is it worthy of doing that way,
> or just expose the mapping between xen cpu id and sockets/cores/
> threads?

The mapping between the flat identifier and socket/core/thread should be made available, and/or the tools should be modified to accept a hierarchical cpu identifier in addition to the old-style flat identifier.

-- Keir
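
[Editorial sketch] The "sort by LAPIC ID" idea can be illustrated in a few lines of Python. Everything specific here is an assumption made for the example: the field widths (2 core bits, 0 thread bits), the APIC ID values, and the BIOS order are hypothetical, and real hardware reports its field widths through CPUID. The names decode_apic_id and enumerate_sorted are invented for this sketch and are not Xen functions.

```python
from typing import NamedTuple

class Topology(NamedTuple):
    socket: int
    core: int
    thread: int

def decode_apic_id(apic_id, core_bits, thread_bits):
    """Split a hierarchical APIC ID into (socket, core, thread) fields."""
    thread = apic_id & ((1 << thread_bits) - 1)
    core = (apic_id >> thread_bits) & ((1 << core_bits) - 1)
    socket = apic_id >> (thread_bits + core_bits)
    return Topology(socket, core, thread)

def enumerate_sorted(apic_ids_in_bios_order):
    """Assign flat cpu numbers in ascending APIC ID order: {apic_id: cpu}."""
    return {apic: cpu for cpu, apic in enumerate(sorted(apic_ids_in_bios_order))}

# Hypothetical BIOS order that lists the first core of every socket first.
BIOS_ORDER = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]

cpu_of = enumerate_sorted(BIOS_ORDER)
by_cpu = sorted(cpu_of, key=cpu_of.get)  # APIC IDs listed by new cpu number
print([decode_apic_id(a, core_bits=2, thread_bits=0).socket for a in by_cpu])
```

With ascending APIC IDs, consecutive cpu numbers share a socket before moving on to the next one, whatever order the firmware tables happened to use.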
Tian, Kevin
2008-Oct-27 07:56 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Saturday, October 25, 2008 10:50 PM
>
>Sort by LAPIC ID. The LAPIC ID is defined to be hierarchical,
>so this would
>automatically get us sorting by thread then core then socket.
>This could be
>as simple as arranging for the calls to mp_register_lapic() to
>happen in the
>correct order.

Calls to mp_register_lapic are initiated from acpi_parse_lapic, which is a callback invoked whenever an ACPI LAPIC entry is found. To make the result sorted, we would either have to cache all the LAPIC entries found and delay the calls, or re-order within mp_register_lapic.

Also, how about the BSP? If the BSP is not thread0/core0/package0, then all the rest of the sorting effort is meaningless. :-(

>The mapping between flat identifier and socket/core/thread
>should be made
>available, and/or modify tools to accept a hierarchical cpu
>identifier in
>addition to the old-style flat identifier.
>

This is necessary information. Is there any existing interface that reports similar info? If not, could you suggest which interface should carry it? Tools like xenpm may need to query such info to construct a better summary. If it's not there, we can consider adding it.

Thanks,
Kevin
Keir Fraser
2008-Oct-27 11:24 UTC
Re: [Xen-devel] Problems with enabling hypervisor C and P-state control
On 27/10/08 07:56, "Tian, Kevin" <kevin.tian@intel.com> wrote:

> Also how about BSP? If BSP is not thread0/core0/package0,
> then all the rest sorting effort is just meaningless. :-(

True. Another possibility would be to provide a remapping between the CPU identifiers used within Xen and those exposed to tools via sysctl/domctl. The latter could be sorted, with no ordering constraints placed on the former.

> This is necessary information. Is there any existing interface reporting
> similar info? If not, could you suggest which interface to carry? Tools
> like xenpm may need to query such infro to construct a better summary.
> If it's not there, we can consider to add.

The info is not available right now. It could be exposed via sysctl, either as a new sysctl or by extending the existing XEN_SYSCTL_getcpuinfo.

-- keir
Tian, Kevin
2008-Oct-27 12:19 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Monday, October 27, 2008 7:25 PM
>
>On 27/10/08 07:56, "Tian, Kevin" <kevin.tian@intel.com> wrote:
>
>> Also how about BSP? If BSP is not thread0/core0/package0,
>> then all the rest sorting effort is just meaningless. :-(
>
>True. Another possibility would be to provide a remapping between CPU
>identifiers used within Xen and those exposed to tools via sysctl/domctl.
>The latter could be sorted; and no ordering constraints placed on the
>former.
>

This looks better, but the number of places to change would be large. Hypercall handlers are easy. However, care has to be taken with some implicit paths, like xentrace and serial output, where smp_processor_id() is invoked naturally. Such info can be inconsistent with the identifiers exposed via sysctl/domctl when it is consumed by a user or developer. Another level of translation between the two would then have to be exposed again. If we are going to add a new path to expose the mapping between Xen identifier and physical topology anyway, the above requirement does not seem that strong. :-)

Thanks,
Kevin
Ian Pratt
2008-Oct-27 15:02 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
> > Also how about BSP? If BSP is not thread0/core0/package0,
> > then all the rest sorting effort is just meaningless. :-(
>
> True. Another possibility would be to provide a remapping between CPU
> identifiers used within Xen and those exposed to tools via
> sysctl/domctl.
> The latter could be sorted; and no ordering constraints placed on the
> former.

That's likely to get very confusing when it comes to debugging...

Have we seen any systems where the boot CPU isn't thread 0, core 0, socket 0?

Thanks,
Ian

> > This is necessary information. Is there any existing interface reporting
> > similar info? If not, could you suggest which interface to carry? Tools
> > like xenpm may need to query such infro to construct a better summary.
> > If it's not there, we can consider to add.
>
> The info is not available right now. It could be exposed via sysctl. Either
> a new sysctl or by extending existing XEN_SYSCTL_getcpuinfo.
>
> -- keir
Niraj Tolia
2008-Oct-27 18:01 UTC
Re: [Xen-devel] Problems with enabling hypervisor C and P-state control
On Thu, Oct 23, 2008 at 10:59 PM, Yu, Ke <ke.yu@intel.com> wrote:
> After discussing with Jinsong, we got the root cause. You are right,
> this is xen pm statistics logic issue. when the coordination type is
> SW_ANY, we only record the first CPU cpufreq change, the other 3 cores
> within the same dependency domain is ignored, so you only see one core
> changes every dependency domain.
>
> The attached patch fix this issue. could you please have a try? If it
> works in your platform, we will send out for applying in upstream.

I just applied the patch and while xenpm might be doing the right thing, I am not completely sure. For example, if I launch a single VCPU VM, pin it to a core, and launch a CPU intensive task on it, ALL four cores on the socket are reported to switch into P0. However, from what I understand about this processor (Xeon E7330), only two of them should. Like vanilla Linux, the other two should be able to operate at independent voltage/frequency settings. Once again, I am not sure if this is xenpm's fault or if the underlying frequency control code isn't able to determine what CPUs need to switch frequency at the same time.

Cheers,
Niraj

--
Niraj Tolia, Researcher, HP Labs
http://www.hpl.hp.com/personal/Niraj_Tolia/
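The statistics problem described in the quoted explanation can be modelled in a few lines. The actual patch is not reproduced in this thread, so the following is only a hypothetical Python sketch of the behaviour as described: `record_transition`, the `all_cpus` switch and the 4-CPU domain are assumptions for illustration, not Xen code.

```python
from collections import defaultdict


def record_transition(stats, domain_cpus, new_state, all_cpus=True):
    """Count one P-state transition for a dependency domain.

    domain_cpus: CPUs that share the domain and change frequency together.
    all_cpus=False mimics the reported bug: only the first CPU is updated.
    """
    targets = domain_cpus if all_cpus else domain_cpus[:1]
    for cpu in targets:
        stats[cpu][new_state] += 1


domain = [0, 1, 2, 3]  # assumed: one 4-CPU dependency domain

# Behaviour as described before the patch: only the first CPU is counted.
buggy = defaultdict(lambda: defaultdict(int))
record_transition(buggy, domain, "P0", all_cpus=False)

# Behaviour one would expect after the fix: every CPU in the domain counts.
fixed = defaultdict(lambda: defaultdict(int))
record_transition(fixed, domain, "P0")
```

With the buggy variant a tool reading the counters sees a change on CPU 0 only, which matches the "only one core changes per dependency domain" symptom. With the fixed variant all four CPUs report the switch to P0, matching what was observed after applying the patch.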
Tian, Kevin
2008-Oct-28 01:04 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
>From: Niraj Tolia [mailto:ntolia@gmail.com]
>Sent: Tuesday, October 28, 2008 2:01 AM
>
>On Thu, Oct 23, 2008 at 10:59 PM, Yu, Ke <ke.yu@intel.com> wrote:
>> After discussing with Jinsong, we got the root cause. You are right,
>> this is xen pm statistics logic issue. when the coordination type is
>> SW_ANY, we only record the first CPU cpufreq change, the other 3 cores
>> within the same dependency domain is ignored, so you only see one core
>> changes every dependency domain.
>>
>> The attached patch fix this issue. could you please have a try? If it
>> works in your platform, we will send out for applying in upstream.
>
>I just applied the patch and while xenpm might be doing the right
>thing, I am not completely sure. For example, if I launch a single
>VCPU VM, pin it to a core, and launch a CPU intensive task on it, ALL
>four cores on the socket are reported to switch into P0. However, from
>what I understand about this processor (Xeon E7330), only two of them
>should. Like vanilla Linux, the other two should be able to operate at
>independent voltage/frequency settings. Once again, I am not sure if
>this is xenpm's fault or if the underlying frequency control code
>isn't able to determine what CPUs need to switch frequency at the
>same time.
>

Did you change any BIOS setting when comparing native Linux and Xen? From the xen dmesg you posted last time:

...
(XEN) _PSD: num_entries=5 rev=0 domain=1 coord_type=253 num_processors=4
...
(XEN) _PSD: num_entries=5 rev=0 domain=2 coord_type=253 num_processors=4
...
(XEN) _PSD: num_entries=5 rev=0 domain=3 coord_type=253 num_processors=4
...

You can see that the BIOS reports 4 processors in each dependency domain, with a SW_ANY coordination type. It means that when any CPU within a given dependency domain changes frequency, the other 3 CPUs change too.

Thanks,
Kevin
Niraj Tolia
2008-Oct-28 02:19 UTC
Re: [Xen-devel] Problems with enabling hypervisor C and P-state control
On Mon, Oct 27, 2008 at 6:04 PM, Tian, Kevin <kevin.tian@intel.com> wrote:
> Do you change any BIOS setting when comparing native Linux and
> Xen? From the xen dmesg you posted last time:

No, I did not change anything in the BIOS. However, when I run vanilla Linux w/ cpufreqd, cpufreq-info will only list two cores being tied together. This is with the 2.6.24-21 kernel provided with Ubuntu 8.04.1.

# cpufreq-info
cpufrequtils 002: cpufreq-info (C) Dominik Brodowski 2004-2006
Report errors and bugs to linux@brodo.de, please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which need to switch frequency at the same time: 0 4
  hardware limits: 1.60 GHz - 2.40 GHz
  available frequency steps: 2.40 GHz, 2.13 GHz, 1.87 GHz, 1.60 GHz
  available cpufreq governors: powersave, conservative, ondemand, userspace, performance
  current policy: frequency should be within 1.60 GHz and 1.60 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 1.60 GHz.

...

Cheers,
Niraj

> ...
> (XEN) _PSD: num_entries=5 rev=0 domain=1 coord_type=253 num_processors=4
> ...
> (XEN) _PSD: num_entries=5 rev=0 domain=2 coord_type=253 num_processors=4
> ...
> (XEN) _PSD: num_entries=5 rev=0 domain=3 coord_type=253 num_processors=4
> ...
> You can see that BIOS reports 4 processors in a dependent domain
> with a SW_ANY coordination type. It means that any cpu within
> given dependent domain changes freq, all the rest 3 cpus change too.
>
> Thanks,
> Kevin

--
Niraj Tolia, Researcher, HP Labs
http://www.hpl.hp.com/personal/Niraj_Tolia/
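To make the `coord_type=253` value in the quoted dmesg lines easier to read, here is a small Python sketch that parses such lines and names the coordination type. The numeric values 0xFC, 0xFD and 0xFE for SW_ALL, SW_ANY and HW_ALL are the ones defined for _PSD in the ACPI specification. The `parse_psd` helper and its regular expression are written for this illustration only and assume the exact log format shown in this thread.

```python
import re

# ACPI _PSD coordination types (values from the ACPI specification).
COORD_TYPES = {0xFC: "SW_ALL", 0xFD: "SW_ANY", 0xFE: "HW_ALL"}

# Matches the "(XEN) _PSD: ..." lines in the format quoted in this thread.
PSD_RE = re.compile(
    r"_PSD: num_entries=(\d+) rev=(\d+) domain=(\d+) "
    r"coord_type=(\d+) num_processors=(\d+)"
)


def parse_psd(log_text):
    """Return one dict per _PSD line found in the log text."""
    entries = []
    for m in PSD_RE.finditer(log_text):
        _num_entries, _rev, domain, coord, nproc = map(int, m.groups())
        entries.append({
            "domain": domain,
            "coord_type": COORD_TYPES.get(coord, "unknown(%d)" % coord),
            "num_processors": nproc,
        })
    return entries


log = """\
(XEN) _PSD: num_entries=5 rev=0 domain=1 coord_type=253 num_processors=4
(XEN) _PSD: num_entries=5 rev=0 domain=2 coord_type=253 num_processors=4
(XEN) _PSD: num_entries=5 rev=0 domain=3 coord_type=253 num_processors=4
"""
entries = parse_psd(log)
for entry in entries:
    print(entry)
```

Since 253 is 0xFD, each of the three lines decodes to SW_ANY with four processors per domain, which agrees with the reading given in the quoted reply.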
Niraj Tolia
2008-Oct-28 02:23 UTC
Re: [Xen-devel] Problems with enabling hypervisor C and P-state control
On Mon, Oct 27, 2008 at 7:19 PM, Niraj Tolia <ntolia@gmail.com> wrote:
> No, I did not change anything in the BIOS. However, when I run vanilla
> Linux w/ cpufreqd, cpufreq-info will only list two cores being tied
> together. This is with the 2.6.24-21 kernel provided with Ubuntu
> 8.04.1.
>
> # cpufreq-info
> cpufrequtils 002: cpufreq-info (C) Dominik Brodowski 2004-2006
> Report errors and bugs to linux@brodo.de, please.
> analyzing CPU 0:
>   driver: acpi-cpufreq
>   CPUs which need to switch frequency at the same time: 0 4
>   hardware limits: 1.60 GHz - 2.40 GHz
>   available frequency steps: 2.40 GHz, 2.13 GHz, 1.87 GHz, 1.60 GHz
>   available cpufreq governors: powersave, conservative, ondemand,
>   userspace, performance
>   current policy: frequency should be within 1.60 GHz and 1.60 GHz.
>   The governor "powersave" may decide which speed to use
>   within this range.
>   current CPU frequency is 1.60 GHz.
>
> ...

I just noticed that cpufreq-info only lists 8 CPUs. It turns out that Ubuntu's kernels come with NR_CPUS = 8. So, you might be right. I will try to recompile a vanilla kernel tomorrow to see what happens.

Cheers,
Niraj

--
Niraj Tolia, Researcher, HP Labs
http://www.hpl.hp.com/personal/Niraj_Tolia/
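A small model of why a kernel built with NR_CPUS = 8 could report only "0 4" on this machine: if the firmware's dependency domain really has four members but two of them have ids of 8 or above, such a kernel never brings them up and so never lists them. The domain membership `[0, 4, 8, 12]` below is an assumption chosen to match the output above and is not confirmed anywhere in this thread. `visible_members` is a hypothetical helper.

```python
def visible_members(domain_members, nr_cpus):
    """CPUs of a dependency domain that a kernel limited to nr_cpus can see."""
    return [cpu for cpu in domain_members if cpu < nr_cpus]


assumed_domain = [0, 4, 8, 12]  # hypothetical membership on a 16-CPU machine

seen_by_ubuntu = visible_members(assumed_domain, nr_cpus=8)
seen_by_full = visible_members(assumed_domain, nr_cpus=16)
```

Under that assumption the 8-CPU kernel would show two tied CPUs while Xen, which sees all 16, would show four, so the two reports need not contradict each other.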
Liu, Jinsong
2008-Oct-28 02:29 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
Niraj Tolia wrote:
> I just noticed that cpufreq-info only lists 8 CPUs. Turns out that
> Ubuntu's kernels come with NR_CPUS = 8. So, you might be right. I will
> try and recompile a vanilla kernel tomorrow to see what happens.

Yes, from the xm dmesg info you sent us several days ago, your machine has 4 sockets, each with 4 cores using SW_ANY coordination.
Liu, Jinsong
2008-Oct-29 09:02 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
Niraj,

Any update about xen cpufreq on your platform? Does the patch work?

Thanks,
Jinsong
Niraj Tolia
2008-Oct-29 15:32 UTC
Re: [Xen-devel] Problems with enabling hypervisor C and P-state control
On Wed, Oct 29, 2008 at 2:02 AM, Liu, Jinsong <jinsong.liu@intel.com> wrote:
> Niraj,
>
> Any update about xen cpufreq at your platform? Does the patch work?

Hi Jinsong,

Yup, it does seem to work. I will let you know if I run into any other issues.

Cheers,
Niraj

--
Niraj Tolia, Researcher, HP Labs
http://www.hpl.hp.com/personal/Niraj_Tolia/
Liu, Jinsong
2008-Oct-29 16:00 UTC
RE: [Xen-devel] Problems with enabling hypervisor C and P-state control
Niraj Tolia wrote:
> Hi Jinsong,
>
> Yup, it does seem to work. I will let you know if I run into any
> other issues.
>
> Cheers,
> Niraj

Glad to see it works on your platform. Two valuable results are:
- SW_ANY coordination has been tested on your platform, which uncovered a Px statistics bug (I don't have this kind of platform);
- The correctness of the cpufreq I/O control method has been tested on your platform (again, all my test platforms use MSR control).

Thank you very much!
Jinsong