Hello guys,

I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like to
utilize powerd(8) on it. However, when I run `powerd -v -r90' I see
something like this:

load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz

even though the machine is, according to top(1), ~90% idle. So I figured
that powerd might take the load as the sum of the loads of all the cores
(12), and I tried to tweak the powerd arguments like this:

`powerd -v -r 1000 -i 600'

but that fails with:

root@[s1-a ~]# powerd -v -r 1000 -i 600
powerd: 1000 is not a valid percent

Well, that makes sense, but why does powerd itself know about load > 100%
and yet not allow me to specify it? Is this a bug? I suppose not if it
works for other people...

Another question would be why powerd wants to set freq 5336 when it is
not available at all (would be nice to have it, heh):

dev.cpu.0.freq_levels: 2668/109000 2533/81000 2400/69000 2267/58000
2133/48000 2000/40000 1867/32000 1733/26000 1600/20000 1400/17500
1200/15000 1000/12500

The symptoms seem to show that there's a bug in the code calculating the
CPU load. Any ideas what may be wrong?

Example of two consecutive cp_times sysctl outputs:

kern.cp_times: 4182996 0 306925 85623 13563403 3164971 0 201479 93110
14679313 3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650
2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715 2408492 0
175131 104377 15450873 2003803 0 131790 75753 15927527 2456736 0 178894 36963
15466280 1607095 0 117396 4197 16410185 2127878 0 147639 30804 15832552
1406621 0 92686 1058 16638508

kern.cp_times: 4183013 0 306927 85626 13563469 3164980 0 201482 93110
14679390 3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735
2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798 2408499 0
175132 104377 15450954 2003804 0 131791 75753 15927614 2456744 0 178897 36963
15466358 1607098 0 117398 4197 16410269 2127880 0 147640 30804 15832638
1406621 0 92686 1058 16638597

Thanks!

--
S pozdravom / Best regards
Daniel Gerzo
Hi.

On 08.04.2011 14:12, Daniel Gerzo wrote:
> I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like
> to utilize powerd(8) on it however, when I run `powerd -v -r90' I see
> something like this:
>
> load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
>
> even though the machine is according to top(1) ~90% idle; So I realized,
> that powerd might take the load as the sum of loads of all the cores
> (12), so I tried to tweak powerd arguments like this:
>
> `powerd -v -r 1000 -i 600'
>
> but that errors for me with:
>
> root@[s1-a ~]# powerd -v -r 1000 -i 600
> powerd: 1000 is not a valid percent
>
> Well, that makes sense, but why powerd itself knows about load > 100%
> but doesn't allow me to specify it? Is this bug? I suppose not if it
> works for other people...

It is a reasonable limitation. powerd can't know how the load is
distributed among multiple cores over time. If all cores are equally busy
at, let's say, 10% (that gives 120% total) and the cores are never waiting
for each other, then obviously the frequency could be reduced. But if the
same 120% means 100%+20%, or if the load is equally spread but processes on
different cores are waiting for each other, then reducing the frequency
will reduce performance. powerd can't know that and so stays on the safe
side.

> Other question would be why powerd wants to set freq 5336, when it is
> not available at all (would be nice to have it heh.):

As you can see there, it is a "wanted" frequency, not the real one. :) It
is an internal implementation detail: this is how powerd keeps the full
frequency for some time after the load has dropped. It's not a bug.

On multi-core systems like this, power management is better done on a
per-core basis. powerd can't control frequencies on a per-core basis (also
because that would require non-trivial interoperation with the scheduler).
But if your ACPI BIOS allows it, you can try to put unused cores into
deeper C-states, which may give better power savings and TurboBoost on busy
cores as a bonus. It works better on 9-CURRENT, but on 8-STABLE some gains
can still be achieved. You may want to look here:
http://wiki.freebsd.org/TuningPowerConsumption

--
Alexander Motin
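
As a rough illustration of the C-state suggestion, here is a minimal sh sketch. It assumes that dev.cpu.0.cx_supported reports something like "C1/1 C2/64 C3/96" (state name, then latency), which may differ between releases; deepest_cx is a made-up helper name, not a FreeBSD tool. Whether the deepest state is actually usable depends on the BIOS and timer setup, so the wiki page linked above remains the reference.

```shell
#!/bin/sh
# deepest_cx: hypothetical helper, not part of FreeBSD. Given the value of
# dev.cpu.0.cx_supported (assumed to look like "C1/1 C2/64 C3/96"), print
# the name of the last, i.e. deepest, C-state listed.
deepest_cx() {
    last=''
    for state in $1; do
        last=${state%%/*}       # keep the "C3" part, drop the latency
    done
    echo "$last"
}

# Intended use on the FreeBSD box itself (needs root, so left commented out):
#   cx=$(deepest_cx "$(sysctl -n dev.cpu.0.cx_supported)")
#   sysctl hw.acpi.cpu.cx_lowest="$cx"
# To keep the setting across reboots, /etc/rc.conf could carry, for example:
#   performance_cx_lowest="C3"
#   economy_cx_lowest="C3"
```

The helper only picks the state name; applying it is a separate, deliberate step, because a state that is listed is not necessarily one that works well on a given machine.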
On Fri, 8 Apr 2011, Daniel Gerzo wrote:
> Hello guys,
>
> I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like to
> utilize powerd(8) on it however, when I run `powerd -v -r90' I see something
> like this:
>
> load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
>
> even though the machine is according to top(1) ~90% idle; So I realized, that
> powerd might take the load as the sum of loads of all the cores (12), so I
> tried to tweak powerd arguments like this:
Hi Daniel, Alexander, all.
I hope to engage more on this interesting topic later, but first:
[..]
> Example of two consecutive cp_times sysctl output:
>
> kern.cp_times: 4182996 0 306925 85623 13563403 3164971 0 201479 93110
> 14679313 3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650
> 2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715 2408492 0
> 175131 104377 15450873 2003803 0 131790 75753 15927527 2456736 0 178894 36963
> 15466280 1607095 0 117396 4197 16410185 2127878 0 147639 30804 15832552
> 1406621 0 92686 1058 16638508
>
> kern.cp_times: 4183013 0 306927 85626 13563469 3164980 0 201482 93110
> 14679390 3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735
> 2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798 2408499 0
> 175132 104377 15450954 2003804 0 131791 75753 15927614 2456744 0 178897 36963
> 15466358 1607098 0 117398 4197 16410269 2127880 0 147640 30804 15832638
> 1406621 0 92686 1058 16638597
I wrote the script included below to try to make some sense of these.
It defaults to using your values above, resulting in:
smithi on sola% sh cptimes.sh
              cp_user    cp_nice     cp_sys    cp_intr    cp_idle
cpu: 0 @t0    4182996          0     306925      85623   13563403
cpu: 0 @t1    4183013          0     306927      85626   13563469
                   17          0          2          3         66
cpu: 1 @t0    3164971          0     201479      93110   14679313
cpu: 1 @t1    3164980          0     201482      93110   14679390
                    9          0          3          0         77
cpu: 2 @t0    3450792          0     258166      80198   14349717
cpu: 2 @t1    3450796          0     258167      80199   14349800
                    4          0          1          1         83
cpu: 3 @t0    2795270          0     180252      76701   15086650
cpu: 3 @t1    2795274          0     180252      76701   15086735
                    4          0          0          0         85
cpu: 4 @t0    2952777          0     217156     119627   14849313
cpu: 4 @t1    2952780          0     217157     119629   14849396
                    3          0          1          2         83
cpu: 5 @t0    2418067          0     158594      73497   15488715
cpu: 5 @t1    2418070          0     158597      73497   15488798
                    3          0          3          0         83
cpu: 6 @t0    2408492          0     175131     104377   15450873
cpu: 6 @t1    2408499          0     175132     104377   15450954
                    7          0          1          0         81
cpu: 7 @t0    2003803          0     131790      75753   15927527
cpu: 7 @t1    2003804          0     131791      75753   15927614
                    1          0          1          0         87
cpu: 8 @t0    2456736          0     178894      36963   15466280
cpu: 8 @t1    2456744          0     178897      36963   15466358
                    8          0          3          0         78
cpu: 9 @t0    1607095          0     117396       4197   16410185
cpu: 9 @t1    1607098          0     117398       4197   16410269
                    3          0          2          0         84
cpu:10 @t0    2127878          0     147639      30804   15832552
cpu:10 @t1    2127880          0     147640      30804   15832638
                    2          0          1          0         86
cpu:11 @t0    1406621          0      92686       1058   16638508
cpu:11 @t1    1406621          0      92686       1058   16638597
                    0          0          0          0         89
As you can see, the total of the differences for each cpu is 89 ticks here
(88 for cpu0), but I've no idea of the interval between your two readings,
or of your value of HZ.
Are those kern.cp_times values as they came, or did you remove trailing
zeroes? Reason I ask is that on my Thinkpad T23, single-core 1133/733
MHz, sysctl kern.cp_time shows the usual 5 values, but kern.cp_times has
the same 5 values for cpu0, but then 5 zeroes for each of cpu1 through
cpu31, on 8.2-PRE about early January. I need to update the script to
remove surplus data for non-existing cpus, but wonder if the extra data
also appeared on your 12 core box?
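
Both points above, the per-CPU differences and the all-zero slots for CPUs that do not exist, can be handled in one pass. The following is only a sketch, not powerd's code: the function name summed_load is made up for this illustration, it expects the bare values as printed by `sysctl -n kern.cp_times` (one line per sample), and it assumes that powerd adds up per-CPU busy percentages in the way Alexander describes (12 cores at 10% giving 120%).

```shell
#!/bin/sh
# summed_load SAMPLE0 SAMPLE1 (hypothetical helper)
# Each sample is one line of kern.cp_times values, 5 counters per CPU in the
# order user nice sys intr idle. Prints "<sum of per-CPU busy %> <CPUs seen>".
summed_load() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 1; i <= NF; i++) t0[i] = $i; next }
        {
            sum = 0; ncpu = 0
            for (c = 0; c * 5 < NF; c++) {
                total = 0
                for (s = 1; s <= 5; s++) {
                    d[s] = $(c * 5 + s) - t0[c * 5 + s]
                    total += d[s]
                }
                if (total == 0)
                    continue            # all-zero slot: CPU not present
                # busy share = everything except idle (5th counter)
                sum += int((total - d[5]) * 100 / total)
                ncpu++
            }
            printf "%d %d\n", sum, ncpu
        }'
}
```

Fed with Daniel's two samples this prints "90 12": a summed load of about 90% over 12 CPUs, or roughly 7.5% per CPU, which agrees with top(1) showing the box ~90% idle.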
I also found that on both my old 5.5 system and at 8.2-PRE, tick values
are stathz (here 128Hz) rather than hardclock ticks (here 200Hz and
1000Hz respectively).
smithi on sola% sh cptimes.sh "`sysctl kern.cp_time`" "`sleep 10; sysctl kern.cp_time`"
              cp_user    cp_nice     cp_sys    cp_intr    cp_idle
cpu: 0 @t0   73734119     307963   40043714    7694011  581512591
cpu: 0 @t1   73734218     307964   40043765    7694013  581513729
                   99          1         51          2       1138
Total diffs 1291, /10 = 129 per second, near enough with sleep delay.
Some of this will likely change with mav@'s new eventtimers on 9.x ..
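
The tick arithmetic above is simple enough to check in sh. This is a small sketch with made-up helper names (ticks_per_second, elapsed_ms); the value of 128 for stathz is taken from the machine described above and is only an assumption for Daniel's box.

```shell
#!/bin/sh
# Both helpers are hypothetical names used only for this illustration.

# ticks_per_second TOTAL_TICKS SECONDS: apparent tick rate over an interval
ticks_per_second() {
    echo $(( $1 / $2 ))
}

# elapsed_ms TICKS STATHZ: time covered by TICKS, in milliseconds
elapsed_ms() {
    echo $(( $1 * 1000 / $2 ))
}

ticks_per_second 1291 10    # the 10 second kern.cp_time run above
elapsed_ms 89 128           # Daniel's 89 ticks, assuming stathz is 128
```

This prints 129 and 695, i.e. about 129 ticks per second for the 10 second run, and roughly 0.7 seconds between Daniel's two samples if his stathz is also 128. On FreeBSD the actual rates can be read with `sysctl kern.clockrate`, which reports hz, profhz and stathz.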
BTW Alexander, eventtimers(4) and hpet(4) have lately turned up, eg:
http://www.freebsd.org/cgi/man.cgi?query=eventtimers&apropos=0&sektion=0&manpath=FreeBSD+9-current&format=html
but attimer(4) and atrtc(4) still have not. Bug simon@ at BSDCan? :)
cheers, Ian
======
#!/bin/sh
# cptimes.sh v0.2 smithi 9/4/11
# see /sys/sys/resource.h (@5.5-STABLE) .. cp_* units are stathz ticks
# initial values pasted from danger@'s email
time0="kern.cp_times: 4182996 0 306925 85623 13563403 3164971 0 201479 93110 14679313
3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650
2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715
2408492 0 175131 104377 15450873 2003803 0 131790 75753 15927527
2456736 0 178894 36963 15466280 1607095 0 117396 4197 16410185
2127878 0 147639 30804 15832552 1406621 0 92686 1058 16638508"
time1="kern.cp_times: 4183013 0 306927 85626 13563469 3164980 0 201482 93110 14679390
3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735
2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798
2408499 0 175132 104377 15450954 2003804 0 131791 75753 15927614
2456744 0 178897 36963 15466358 1607098 0 117398 4197 16410269
2127880 0 147640 30804 15832638 1406621 0 92686 1058 16638597"
# eg: sh cptimes.sh "`sysctl kern.cp_time`" "`sleep 10; sysctl kern.cp_time`"
[ "$2" ] && time0="$1" && time1="$2"

# split each sample into one "user nice sys intr idle" row per cpu
for i in 0 1; do
    eval var=\$time$i
    # [ "${var%% *}" != kern.cp_times: ] && echo oops! && exit 1
    # also allow data without name (sysctl -n) and older kern.cp_time
    first=${var%% *}; [ "${first%time*:}" != kern.cp_ ] && var=" $var"
    set -- `echo ${var#* }`
    cpu=0
    while [ "$1" ]; do
        line=''
        for j in 1 2 3 4 5; do
            [ "$line" ] && line="$line $1" || line=$1
            shift
        done
        eval line_${cpu}_${i}=\"$line\"     # ersatz 2D arrays
        cpu=$((cpu+1))
    done
done

# args: 5 values at t0 then 5 values at t1; prints t1 - t0 for each column
printdiffs() {
    printf '%10s' ''                        # pad to align under the rows
    while [ "$6" ]; do
        printf ' % 10i' $(($6 - $1))
        shift
    done
    echo
}

echo "              cp_user    cp_nice     cp_sys    cp_intr    cp_idle"
for cpu in `jot $cpu 0`; do
    for i in 0 1; do
        eval row_${i}=\$line_${cpu}_${i}
        eval line="\$row_${i}"
        printf 'cpu:%2u @t%1u %10u %10u %10u %10u %10u\n' \
            $cpu $i $line
    done
    printdiffs $row_0 $row_1
done
=======
Ian Smith wrote:
> On Tue, 12 Apr 2011, Daniel Gerzo wrote:
> > On 11.4.2011 6:08, Ian Smith wrote:
> > >
> > > As you see, total of differences for each cpu is here 89 ticks, but I've
> > > no idea of the interval between your two readings, or your value of HZ?
> >
> > the interval may have been around 1-2 seconds.
> > My value of HZ is default, 1000.
>
> Ok, seems it depends on stathz, not HZ, so 89'd be less than 1 second if
> your stathz is 128 .. I gather that may be changed with the 9.x timers?

9-CURRENT tries to set stathz to 127, or at least somewhere around that.
The main difference there is that the clocks really tick only when the CPU
is running, and are emulated during idle periods to allow C-states to do
their job.

--
Alexander Motin