thr3ads.net - freebsd stable - powerd / cpufreq question [Apr 2011]

Daniel Geržo

2011-Apr-08 11:28 UTC

powerd / cpufreq question

Hello guys,

I have a new machine with a Xeon(R) X5650 CPU (2666.77 MHz) and I would
like to use powerd(8) on it. However, when I run `powerd -v -r90' I see
something like this:

load  64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load  62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load  82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz

even though the machine is ~90% idle according to top(1). So I realized
that powerd might take the load as the sum of the loads of all 12 cores,
and I tried to tweak the powerd arguments like this:

`powerd -v -r 1000 -i 600'

but that errors for me with:

root@[s1-a ~]# powerd -v -r 1000 -i 600
powerd: 1000 is not a valid percent

Well, that makes sense, but why does powerd itself report loads > 100%
yet not allow me to specify one? Is this a bug? I suppose not, if it
works for other people...

My other question is why powerd wants to set the frequency to 5336 MHz
when that level is not available at all (it would be nice to have, heh):

dev.cpu.0.freq_levels: 2668/109000 2533/81000 2400/69000 2267/58000 
2133/48000 2000/40000 1867/32000 1733/26000 1600/20000 1400/17500 
1200/15000 1000/12500

The symptoms suggest that there is a bug in the code that calculates the
CPU load. Any ideas what may be wrong?

Example of two consecutive kern.cp_times sysctl outputs:

kern.cp_times: 4182996 0 306925 85623 13563403 3164971 0 201479 93110 
14679313 3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650 
2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715 2408492 
0 175131 104377 15450873 2003803 0 131790 75753 15927527 2456736 0 
178894 36963 15466280 1607095 0 117396 4197 16410185 2127878 0 147639 
30804 15832552 1406621 0 92686 1058 16638508

kern.cp_times: 4183013 0 306927 85626 13563469 3164980 0 201482 93110 
14679390 3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735 
2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798 2408499 
0 175132 104377 15450954 2003804 0 131791 75753 15927614 2456744 0 
178897 36963 15466358 1607098 0 117398 4197 16410269 2127880 0 147640 
30804 15832638 1406621 0 92686 1058 16638597

Thanks!

-- 
S pozdravom / Best regards
   Daniel Gerzo

Alexander Motin

2011-Apr-08 12:14 UTC

powerd / cpufreq question

Hi.

On 08.04.2011 14:12, Daniel Geržo wrote:
> I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like
> to utilize powerd(8) on it however, when I run `powerd -v -r90' I see
> something like this:
>
> load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
>
> even though the machine is according to top(1) ~90% idle; So I realized,
> that powerd might take the load as the sum of loads of all the cores
> (12), so I tried to tweak powerd arguments like this:
>
> `powerd -v -r 1000 -i 600'
>
> but that errors for me with:
>
> root@[s1-a ~]# powerd -v -r 1000 -i 600
> powerd: 1000 is not a valid percent
>
> Well, that makes sense, but why powerd itself knows about load > 100%
> but doesn't allow me to specify it? Is this bug? I suppose not if it
> works for other people...
It is a reasonable limitation. powerd can't know how the load is
distributed among multiple cores over time. If all cores are equally
busy at, let's say, 10% (which gives 120% total) and the cores are never
waiting for each other, then obviously the frequency could be reduced.
But if the same 120% means 100%+20%, or if the load is equally spread
but processes on different cores are waiting for each other, then
reducing the frequency will reduce performance. powerd can't know that,
and so it stays on the safe side.
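
For illustration, here is a rough sh sketch of how a summed load can be
derived from the two kern.cp_times samples in the original post. It is
not powerd's code: the function name summed_load and the integer
per-core formula (100 minus the idle share of the ticks) are assumptions
made for this example. Only the counter values come from Daniel's mail.

```sh
#!/bin/sh
# Daniel's two kern.cp_times samples, values only (as `sysctl -n` prints them).
# Five counters per CPU: user, nice, sys, intr, idle.
t0="4182996 0 306925 85623 13563403 3164971 0 201479 93110 14679313
3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650
2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715
2408492 0 175131 104377 15450873 2003803 0 131790 75753 15927527
2456736 0 178894 36963 15466280 1607095 0 117396 4197 16410185
2127878 0 147639 30804 15832552 1406621 0 92686 1058 16638508"
t1="4183013 0 306927 85626 13563469 3164980 0 201482 93110 14679390
3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735
2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798
2408499 0 175132 104377 15450954 2003804 0 131791 75753 15927614
2456744 0 178897 36963 15466358 1607098 0 117398 4197 16410269
2127880 0 147640 30804 15832638 1406621 0 92686 1058 16638597"

# summed_load T0 T1 (hypothetical helper)
# prints: <sum of per-core loads %> <average per core %> <cores counted>
summed_load() {
	# unquoted echo flattens each sample onto a single line
	a=`echo $1`; b=`echo $2`
	awk -v a="$a" -v b="$b" 'BEGIN {
		n = split(a, t0, " "); split(b, t1, " ")
		for (c = 0; c < int(n / 5); c++) {
			total = 0
			for (s = 1; s <= 5; s++)
				total += t1[c * 5 + s] - t0[c * 5 + s]
			if (total <= 0)
				continue	# no ticks recorded, e.g. an absent CPU
			idle = t1[c * 5 + 5] - t0[c * 5 + 5]
			# assumed per-core load: 100 minus the idle percentage
			load += 100 - int(idle * 100 / total)
			ncpu++
		}
		if (ncpu > 0)
			print load, int(load / ncpu), ncpu
	}'
}

summed_load "$t0" "$t1"
```

With these samples the script prints "100 8 12": about 8% load per core
on average, which matches the ~90% idle that top(1) shows, but 100% once
the twelve cores are added together.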
> Other question would be why powerd wants to set freq 5336, when it is
> not available at all (would be nice to have it heh.):
As you can see there, it is a "wanted" frequency, not the real one. :)
It is an internal implementation detail. This is how powerd keeps the
full frequency for some time after the load has dropped. It's not a bug.
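
A toy model may make the hold effect easier to picture. It is NOT taken
from powerd's source: the helper name hold_intervals and the 7/8 decay
per idle polling interval are assumptions chosen only for illustration.
The idea is that a "wanted" value above the highest real level has to
decay for a few intervals before the real frequency can start dropping.

```sh
#!/bin/sh
# hold_intervals WANTED MAX (hypothetical helper)
# Counts how many decay steps it takes for WANTED to fall below MAX,
# the highest frequency the CPU really offers.
hold_intervals() {
	wanted=$1 max=$2 n=0
	[ "$max" -gt 0 ] || return 1		# avoid looping forever on 0
	while [ "$wanted" -ge "$max" ]; do
		wanted=$((wanted * 7 / 8))	# assumed decay per idle interval
		n=$((n + 1))
	done
	echo "$n"
}

hold_intervals 5336 2668
```

Under these assumptions, a wanted frequency of 5336 MHz stays at or
above the real 2668 MHz maximum for 6 intervals.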

On multi-core systems like this, power management is better done on a
per-core basis. powerd can't control frequencies on a per-core basis
(also because that would require non-trivial cooperation with the
scheduler). But if your ACPI BIOS allows it, you can try to put unused
cores into deeper C-states, which may give better power savings, with
TurboBoost on the busy cores as a bonus. It works better on 9-CURRENT,
but some of the benefit can still be had on 8-STABLE.
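
As a sketch of what that can look like in /etc/rc.conf: the value "C2"
below is only a placeholder assumption, and what this machine's ACPI
BIOS actually exposes has to be checked first (dev.cpu.0.cx_supported).

```sh
# /etc/rc.conf fragment (sketch; "C2" is a placeholder, not a recommendation)
# Check first what the firmware offers:   sysctl dev.cpu.0.cx_supported
# Check the current setting:              sysctl dev.cpu.0.cx_lowest
performance_cx_lowest="C2"	# lowest C-state allowed on AC power
economy_cx_lowest="C2"		# lowest C-state allowed on battery
```

Whether a deeper state helps depends on the hardware and on the FreeBSD
version, so compare power draw and latency before and after.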

You may want to look here:
http://wiki.freebsd.org/TuningPowerConsumption

-- 
Alexander Motin

Ian Smith

2011-Apr-11 04:09 UTC

powerd / cpufreq question

On Fri, 8 Apr 2011, Daniel Geržo wrote:

 > Hello guys,
 > 
 > I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like to
 > utilize powerd(8) on it however, when I run `powerd -v -r90' I see something
 > like this:
 > 
 > load  64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
 > load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
 > load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
 > load  62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
 > load  82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
 > load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
 > 
 > even though the machine is according to top(1) ~90% idle; So I realized, that
 > powerd might take the load as the sum of loads of all the cores (12), so I
 > tried to tweak powerd arguments like this:

Hi Daniel, Alexander, all.

I hope to engage more on this interesting topic later, but first:

[..]

 > Example of two consecutive cp_times sysctl output:
 > 
 > kern.cp_times: 4182996 0 306925 85623 13563403 3164971 0 201479 93110
 > 14679313 3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650
 > 2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715 2408492 0
 > 175131 104377 15450873 2003803 0 131790 75753 15927527 2456736 0 178894 36963
 > 15466280 1607095 0 117396 4197 16410185 2127878 0 147639 30804 15832552
 > 1406621 0 92686 1058 16638508
 > 
 > kern.cp_times: 4183013 0 306927 85626 13563469 3164980 0 201482 93110
 > 14679390 3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735
 > 2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798 2408499 0
 > 175132 104377 15450954 2003804 0 131791 75753 15927614 2456744 0 178897 36963
 > 15466358 1607098 0 117398 4197 16410269 2127880 0 147640 30804 15832638
 > 1406621 0 92686 1058 16638597

I wrote the script included below to try to make some sense of these.
It defaults to using your values above, resulting in:

smithi on sola% sh cptimes.sh
              cp_user    cp_nice     cp_sys    cp_intr    cp_idle
cpu: 0 @t0    4182996          0     306925      85623   13563403
cpu: 0 @t1    4183013          0     306927      85626   13563469
                   17          0          2          3         66
cpu: 1 @t0    3164971          0     201479      93110   14679313
cpu: 1 @t1    3164980          0     201482      93110   14679390
                    9          0          3          0         77
cpu: 2 @t0    3450792          0     258166      80198   14349717
cpu: 2 @t1    3450796          0     258167      80199   14349800
                    4          0          1          1         83
cpu: 3 @t0    2795270          0     180252      76701   15086650
cpu: 3 @t1    2795274          0     180252      76701   15086735
                    4          0          0          0         85
cpu: 4 @t0    2952777          0     217156     119627   14849313
cpu: 4 @t1    2952780          0     217157     119629   14849396
                    3          0          1          2         83
cpu: 5 @t0    2418067          0     158594      73497   15488715
cpu: 5 @t1    2418070          0     158597      73497   15488798
                    3          0          3          0         83
cpu: 6 @t0    2408492          0     175131     104377   15450873
cpu: 6 @t1    2408499          0     175132     104377   15450954
                    7          0          1          0         81
cpu: 7 @t0    2003803          0     131790      75753   15927527
cpu: 7 @t1    2003804          0     131791      75753   15927614
                    1          0          1          0         87
cpu: 8 @t0    2456736          0     178894      36963   15466280
cpu: 8 @t1    2456744          0     178897      36963   15466358
                    8          0          3          0         78
cpu: 9 @t0    1607095          0     117396       4197   16410185
cpu: 9 @t1    1607098          0     117398       4197   16410269
                    3          0          2          0         84
cpu:10 @t0    2127878          0     147639      30804   15832552
cpu:10 @t1    2127880          0     147640      30804   15832638
                    2          0          1          0         86
cpu:11 @t0    1406621          0      92686       1058   16638508
cpu:11 @t1    1406621          0      92686       1058   16638597
                    0          0          0          0         89

As you can see, the total of the differences for each CPU is about 89
ticks here, but I have no idea of the interval between your two
readings, or of your value of HZ.

Are those kern.cp_times values as they came, or did you remove trailing
zeroes?  The reason I ask is that on my Thinkpad T23 (single core,
1133/733 MHz), sysctl kern.cp_time shows the usual 5 values, but
kern.cp_times has the same 5 values for cpu0 followed by 5 zeroes for
each of cpu1 through cpu31, on 8.2-PRE from about early January.  I need
to update the script to remove the surplus data for non-existent CPUs,
but I wonder whether the extra data also appeared on your 12-core box?

I also found that on both my old 5.5 system and on 8.2-PRE, the tick
values are stathz ticks (here 128 Hz) rather than hardclock ticks (here
200 Hz and 1000 Hz respectively).

smithi on sola% sh cptimes.sh "`sysctl kern.cp_time`" "`sleep 10; sysctl kern.cp_time`"
              cp_user    cp_nice     cp_sys    cp_intr    cp_idle
cpu: 0 @t0   73734119     307963   40043714    7694011  581512591
cpu: 0 @t1   73734218     307964   40043765    7694013  581513729
                   99          1         51          2       1138

Total diffs 1291, /10 = 129 per second, near enough with sleep delay.
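
The same arithmetic as a small sh sketch, for anyone who wants to repeat
the measurement. The function name tick_rate is made up for this
example, and it expects the bare counter values (as from
`sysctl -n kern.cp_time`) rather than the "kern.cp_time:" prefixed form.
The configured rates can be compared against `sysctl kern.clockrate`.

```sh
#!/bin/sh
# tick_rate T0 T1 SECONDS (hypothetical helper)
# T0 and T1 are the five kern.cp_time counters (user nice sys intr idle),
# sampled SECONDS apart.  Prints total ticks per second, rounded down.
tick_rate() {
	secs=$3
	set -- $1 $2		# ten positional parameters: t0 then t1
	total=0
	while [ "$6" ]; do	# same pairing trick as printdiffs in cptimes.sh
		total=$((total + $6 - $1))
		shift
	done
	echo $((total / secs))
}

# the two samples from the run above, taken 10 seconds apart
tick_rate "73734119 307963 40043714 7694011 581512591" \
	"73734218 307964 40043765 7694013 581513729" 10
```

This prints 129, i.e. the stathz rate of 128 plus a little for the time
the commands themselves take.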

Some of this will likely change with mav@'s new eventtimers on 9.x ..  
BTW Alexander, eventtimers(4) and hpet(4) have lately turned up, eg: 
http://www.freebsd.org/cgi/man.cgi?query=eventtimers&apropos=0&sektion=0&manpath=FreeBSD+9-current&format=html
but attimer(4) and atrtc(4) still have not.  Bug simon@ at BSDCan? :)

cheers, Ian

=======
#!/bin/sh
# cptimes.sh v0.2 smithi 9/4/11
# see /sys/sys/resource.h (@5.5-STABLE) .. cp_* units are stathz ticks

# initial values pasted from danger@'s email 

time0="kern.cp_times: 4182996 0 306925 85623 13563403 3164971 0 201479
93110 14679313 3450792 0 258166 80198
14349717 2795270 0 180252 76701 15086650 2952777 0 217156 119627 14849313
2418067 0 158594 73497 15488715
2408492 0 175131 104377 15450873 2003803 0 131790 75753 15927527 2456736 0
178894 36963 15466280 1607095 0
117396 4197 16410185 2127878 0 147639 30804 15832552 1406621 0 92686 1058
16638508"

time1="kern.cp_times: 4183013 0 306927 85626 13563469 3164980 0 201482
93110 14679390 3450796 0 258167 80199
14349800 2795274 0 180252 76701 15086735 2952780 0 217157 119629 14849396
2418070 0 158597 73497 15488798
2408499 0 175132 104377 15450954 2003804 0 131791 75753 15927614 2456744 0
178897 36963 15466358 1607098 0
117398 4197 16410269 2127880 0 147640 30804 15832638 1406621 0 92686 1058
16638597"

# eg: sh cptimes.sh "`sysctl kern.cp_time`" "`sleep 10; sysctl kern.cp_time`"
[ "$2" ] && time0="$1" && time1="$2"

for i in 0 1; do
	eval var=\$time$i
	# [ "${var%% *}" != kern.cp_times: ] && echo oops! && exit 1
	# also allow data without name (sysctl -n) and older kern.cp_time
	first=${var%% *}; [ "${first%time*:}" != kern.cp_ ] && var=" $var"
	set -- `echo ${var#* }`
	cpu=0
	while [ "$1" ]; do
		line=''
		for j in 1 2 3 4 5; do
			[ "$line" ] && line="$line $1" || line=$1
			shift
		done
		eval line_${cpu}_${i}=\"$line\"		# ersatz 2D arrays
		cpu=$((cpu+1))
	done
done

printdiffs() {
	echo -n "          "
	while [ "$6" ]; do
		echo -n "`printf ' % 10i' $(($6 - $1))`"
		shift
	done
	echo
}

echo "              cp_user    cp_nice     cp_sys    cp_intr    cp_idle"
for cpu in `jot $cpu 0`; do
	for i in 0 1; do
		eval row_${i}=\$line_${cpu}_${i}
		eval line="\$row_${i}"
		echo "`printf 'cpu:%2u @t%1u %10u %10u %10u %10u %10u' \
			 $cpu $i $line`"
	done
	printdiffs $row_0 $row_1
done
=======

Alexander Motin

2011-Apr-12 18:57 UTC

powerd / cpufreq question

Ian Smith wrote:
> On Tue, 12 Apr 2011, Daniel Gerzo wrote:
>  > On 11.4.2011 6:08, Ian Smith wrote:
>  > > 
>  > > As you see, total of differences for each cpu is here 89 ticks, but I've
>  > > no idea of the interval between your two readings, or your value of HZ?
>  > 
>  > the interval may have been around 1-2 seconds.
>  > My value of HZ is default, 1000.
> 
> Ok, seems it depends on stathz, not HZ, so 89'd be less than 1 second if
> your stathz is 128 .. I gather that may be changed with the 9.x timers?
9-CURRENT tries to set stathz to 127, or at least somewhere close to it.
The main difference there is that the clocks really tick only while the
CPU is running; during idle periods the ticks are emulated, so that
C-states can do their job.
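
To put a number on the 89 ticks discussed above, here is a quick sh
sketch (the helper name ticks_to_ms is made up for this example) that
converts a per-CPU tick count into elapsed milliseconds for a given
stathz:

```sh
#!/bin/sh
# ticks_to_ms TICKS STATHZ (hypothetical helper)
# Elapsed time in milliseconds, rounded down.
ticks_to_ms() {
	echo $(($1 * 1000 / $2))
}

ticks_to_ms 89 128	# stathz as seen on 5.5 and 8.2-PRE
ticks_to_ms 89 127	# stathz that 9-CURRENT aims for
```

This prints 695 and 700, so the two samples were roughly 0.7 seconds
apart with either value.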

-- 
Alexander Motin
