A couple days ago I updated my system and was excited to see cpufreq and powerd in 5-stable. Since then however I noticed that my laptop temperature is about 5?C higher than with est and estctrl. I found that cpufreq when setting 200MHz for example set the absolute frequency to 1600MHz (max for this laptop) and the relative frequency (p4tcc) to 12.5% instead of using a more power conserving setting like 800MHz/25%. The problem is that cpufreq_expand_set() (sys/kern/kern_cpu.c) traverses freq levels from high to low when adding relative levels and skips duplicates. When it wants to add 800MHz/25% it sees this setting as a duplicate of 1600MHz/12.5% it has found before. This can be fixed by letting cpufreq_expand_set() traverse freq levels in reverse order (and still skipping duplicates). Then each frequency level has the lowest possible absolute setting. This is a one line change in sys/kern/kern_cpu.c (line 653). --- sys/kern/kern_cpu.c.orig Mon Aug 1 14:42:26 2005 +++ sys/kern/kern_cpu.c Mon Aug 1 15:17:55 2005 @@ -650,7 +650,7 @@ CF_MTX_ASSERT(&sc->lock); - TAILQ_FOREACH(search, &sc->all_levels, link) { + TAILQ_FOREACH_REVERSE(search, &sc->all_levels, cf_level_lst, link) { /* Skip this level if we've already modified it. */ for (i = 0; i < search->rel_count; i++) { if (search->rel_set[i].dev == set_arr->sets[0].dev) With this patch temperature is almost as low as with est again (only 1?C hotter). However, there are still such levels like 1400/12.5 (175MHz) which are lower than let's say 600/37.5 (225MHz), but consume a lot more power. On my laptop this problem doesn't really occur because of the way powerd works, only the absolute levels 1600, 800 and 600 are ever used. I can imagine somebody with a 1700MHz cpu not being so lucky though. So, I've worked out a patch (attached) that makes sure that a lower frequency level has at most the same absolute setting (preferably less of course). This eliminates quite a few levels so somebody with a better knowledge of cpufreq should check if this patch really does something good. This is the first time I've taken a look at FreeBSD source code by the way. Also, somewhat related, the p4tcc driver doesn't recognise acpi_throttle, which means that when you load the cpufreq module after booting, the freq levels are messed up. I'm not sure what the best solution for this is. Let p4tcc detect acpi_throttle and don't attach if it's present (like acpi_throttle does now if it finds p4tcc) or detach it before attaching? Or maybe p4tcc and acpi_throttle should be merged into one driver? Finally, is the kernel config option CPU_ENABLE_TCC still relevant? Because it's still listed in NOTES.
Hmm, the server must have dropped the attachment. Another try... -------------- next part -------------- --- sys/kern/kern_cpu.c.orig Mon Aug 1 14:42:26 2005 +++ sys/kern/kern_cpu.c Mon Aug 1 23:52:06 2005 @@ -650,19 +650,7 @@ CF_MTX_ASSERT(&sc->lock); - TAILQ_FOREACH(search, &sc->all_levels, link) { - /* Skip this level if we've already modified it. */ - for (i = 0; i < search->rel_count; i++) { - if (search->rel_set[i].dev == set_arr->sets[0].dev) - break; - } - if (i != search->rel_count) { - CF_DEBUG("skipping modified level, freq %d (dev %s)\n", - search->total_set.freq, - device_get_nameunit(search->rel_set[i].dev)); - continue; - } - + TAILQ_FOREACH_REVERSE(search, &sc->all_levels, cf_level_lst, link) { /* Add each setting to the level, duplicating if necessary. */ for (i = 0; i < set_arr->count; i++) { set = &set_arr->sets[i]; @@ -677,10 +665,11 @@ /* * The new level was a duplicate of an existing level - * so we freed it. Go to the next setting. + * or the relative setting was too low so we freed it. + * No need to try lower settings of this driver. */ if (fill == NULL) - continue; + break; /* Add this setting to the existing or new level. */ KASSERT(fill->rel_count < MAX_SETTINGS, @@ -756,24 +745,28 @@ */ list = &sc->all_levels; if (TAILQ_EMPTY(list)) { + /* Does this ever happen? */ CF_DEBUG("dup done, inserted %d at head\n", fill_set->freq); TAILQ_INSERT_HEAD(list, fill, link); } else { TAILQ_FOREACH_REVERSE(itr, list, cf_level_lst, link) { itr_set = &itr->total_set; - if (CPUFREQ_CMP(fill_set->freq, itr_set->freq)) { - CF_DEBUG( - "dup done, freeing new level %d, matches %d\n", - fill_set->freq, itr_set->freq); - free(fill, M_TEMP); - fill = NULL; - break; - } else if (fill_set->freq < itr_set->freq) { + if (fill_set->freq > itr_set->freq) { + continue; + } else if (fill_set->freq < itr_set->freq && + fill->abs_set.freq <= itr->abs_set.freq) { CF_DEBUG( "dup done, inserting new level %d after %d\n", fill_set->freq, itr_set->freq); TAILQ_INSERT_AFTER(list, itr, fill, link); sc->all_count++; + break; + } else { + CF_DEBUG( + "dup done, freeing new level %d, worse than existing %d\n", + fill_set->freq, itr_set->freq); + free(fill, M_TEMP); + fill = NULL; break; } }
On Tue, Aug 02, 2005 at 12:22:02AM +0200, Tijl Coosemans wrote:> A couple days ago I updated my system and was excited to see cpufreq > and powerd in 5-stable. Since then however I noticed that my laptop > temperature is about 5?C higher than with est and estctrl. I found that > cpufreq when setting 200MHz for example set the absolute frequency to > 1600MHz (max for this laptop) and the relative frequency (p4tcc) to > 12.5% instead of using a more power conserving setting like 800MHz/25%. > > The problem is that cpufreq_expand_set() (sys/kern/kern_cpu.c) > traverses freq levels from high to low when adding relative levels and > skips duplicates. When it wants to add 800MHz/25% it sees this setting > as a duplicate of 1600MHz/12.5% it has found before. This can be fixed > by letting cpufreq_expand_set() traverse freq levels in reverse order > (and still skipping duplicates). Then each frequency level has the > lowest possible absolute setting. This is a one line change in > sys/kern/kern_cpu.c (line 653).It's a well known bug. Someday I think I will have enough time to fix that one if Nate don't bite me.> With this patch temperature is almost as low as with est again (only > 1?C hotter). However, there are still such levels like 1400/12.5 > (175MHz) which are lower than let's say 600/37.5 (225MHz), but consume > a lot more power. On my laptop this problem doesn't really occur > because of the way powerd works, only the absolute levels 1600, 800 and > 600 are ever used. I can imagine somebody with a 1700MHz cpu not being > so lucky though. So, I've worked out a patch (attached) that makes sure > that a lower frequency level has at most the same absolute setting > (preferably less of course). This eliminates quite a few levels so > somebody with a better knowledge of cpufreq should check if this patch > really does something good. This is the first time I've taken a look at > FreeBSD source code by the way.It's in my todo list in a so long time that I must admit I must be blamed to have not fixed that already.> Also, somewhat related, the p4tcc driver doesn't recognise > acpi_throttle, which means that when you load the cpufreq module after > booting, the freq levels are messed up. I'm not sure what the best > solution for this is. Let p4tcc detect acpi_throttle and don't attach > if it's present (like acpi_throttle does now if it finds p4tcc) or > detach it before attaching? Or maybe p4tcc and acpi_throttle should be > merged into one driver? > > Finally, is the kernel config option CPU_ENABLE_TCC still relevant? > Because it's still listed in NOTES.Right. I forgot to kill that option. -- Bruno Ducrot -- Which is worse: ignorance or apathy? -- Don't know. Don't care.
Tijl Coosemans wrote:> A couple days ago I updated my system and was excited to see cpufreq > and powerd in 5-stable. Since then however I noticed that my laptop > temperature is about 5?C higher than with est and estctrl. I found that > cpufreq when setting 200MHz for example set the absolute frequency to > 1600MHz (max for this laptop) and the relative frequency (p4tcc) to > 12.5% instead of using a more power conserving setting like 800MHz/25%. > > The problem is that cpufreq_expand_set() (sys/kern/kern_cpu.c) > traverses freq levels from high to low when adding relative levels and > skips duplicates. When it wants to add 800MHz/25% it sees this setting > as a duplicate of 1600MHz/12.5% it has found before. This can be fixed > by letting cpufreq_expand_set() traverse freq levels in reverse order > (and still skipping duplicates). Then each frequency level has the > lowest possible absolute setting. This is a one line change in > sys/kern/kern_cpu.c (line 653).You have some valid issues but I need some time to analyze your patch to be sure it doesn't introduce new problems. There may be some issues with traversing the list backwards.> Also, somewhat related, the p4tcc driver doesn't recognise > acpi_throttle, which means that when you load the cpufreq module after > booting, the freq levels are messed up. I'm not sure what the best > solution for this is. Let p4tcc detect acpi_throttle and don't attach > if it's present (like acpi_throttle does now if it finds p4tcc) or > detach it before attaching? Or maybe p4tcc and acpi_throttle should be > merged into one driver?cpufreq is not set up for loading after boot currently. It must be loaded at boot. There are architectural issues that need to be solved to make this happen, namely real arbitration between drivers loaded that support the same feature through different mechanisms. p4tcc and acpi_throttle on some architectures is such a combo that needs special attention.> Finally, is the kernel config option CPU_ENABLE_TCC still relevant? > Because it's still listed in NOTES.The old option should be removed. I'll try to review this patch and commit it sometime soon. -- Nate
Tijl Coosemans wrote:> A couple days ago I updated my system and was excited to see cpufreq > and powerd in 5-stable. Since then however I noticed that my laptop > temperature is about 5?C higher than with est and estctrl. I found that > cpufreq when setting 200MHz for example set the absolute frequency to > 1600MHz (max for this laptop) and the relative frequency (p4tcc) to > 12.5% instead of using a more power conserving setting like 800MHz/25%.A variant of your patch has been committed and will be MFCd. Thanks!> So, I've worked out a patch (attached) that makes sure > that a lower frequency level has at most the same absolute setting > (preferably less of course). This eliminates quite a few levels so > somebody with a better knowledge of cpufreq should check if this patch > really does something good. This is the first time I've taken a look at > FreeBSD source code by the way.I added back the check for CPUFREQ_CMP since you don't want duplicate levels. This is not currently a problem with est/p4tcc but other combinations of settings could have produced duplicates with the patch's approach.> Also, somewhat related, the p4tcc driver doesn't recognise > acpi_throttle, which means that when you load the cpufreq module after > booting, the freq levels are messed up. I'm not sure what the best > solution for this is. Let p4tcc detect acpi_throttle and don't attach > if it's present (like acpi_throttle does now if it finds p4tcc) or > detach it before attaching? Or maybe p4tcc and acpi_throttle should be > merged into one driver?acpi_throttle is only the same as p4tcc on x86 platforms. We need a better negotiation strategy in general between the different drivers. The logic for these two is already p4tcc > acpi_throttle but we need to support reprobing when cpufreq.ko is loaded after boot by detaching both and then allowing p4tcc to win the probe. -- Nate