Pasi Kärkkäinen
2012-Feb-03 18:09 UTC
Stop the continuous flood of (XEN) traps.c:2432:d0 Domain attempted WRMSR ..
Hello,

IIRC there was some discussion earlier about these messages in Xen's dmesg:

(XEN) traps.c:2432:d0 Domain attempted WRMSR 00000000000001ac from 0x0000000000c800c8 to 0x0000000080c880c8.
(XEN) traps.c:2432:d0 Domain attempted WRMSR 00000000000001ac from 0x0000000000c800c8 to 0x0000000080c880c8.
(XEN) traps.c:2432:d0 Domain attempted WRMSR 00000000000001ac from 0x0000000000c800c8 to 0x0000000080c880c8.
(XEN) traps.c:2432:d0 Domain attempted WRMSR 00000000000001ac from 0x0000000000c800c8 to 0x0000000080c880c8.

At least on my systems there's a continuous flood of those messages, so they fill up the Xen dmesg log buffer, and "xm dmesg" or "xl dmesg" won't show any valuable information, just those messages.

I seem to be getting those messages even when there's only dom0 running.
Is the plan to drop those messages? What's causing them?

Hmm, according to this bugzilla, https://bugzilla.redhat.com/show_bug.cgi?id=470035, they're related to the dom0 kernel's acpi-cpufreq?

Also it seems there was discussion about the subject in 2011/08:
http://old-list-archives.xen.org/archives/html/xen-devel/2011-08/msg00561.html

Xen hypervisor 4.1.2.
dom0 Linux kernel 3.2.2.

Thanks,

-- Pasi
Konrad Rzeszutek Wilk
2012-Feb-03 18:55 UTC
Re: Stop the continuous flood of (XEN) traps.c:2432:d0 Domain attempted WRMSR ..
On Fri, Feb 03, 2012 at 08:09:52PM +0200, Pasi Kärkkäinen wrote:
> (XEN) traps.c:2432:d0 Domain attempted WRMSR 00000000000001ac from 0x0000000000c800c8 to 0x0000000080c880c8.
>
> At least on my systems there's continuous flood of those messages, so they will fill up the
> Xen dmesg log buffer and "xm dmesg" or "xl dmesg" won't show any valuable information, just those messages.

Is it always that MSR? That looks to be TURBO_POWER_CURRENT_LIMIT, which is the intel_ips driver's doing.

> I seem to be getting those messages even when there's only dom0 running.
> Is the plan to drop those messages? What's causing them?

Looks to be intel_ips. If you rename it, does the issue disappear?
Pasi Kärkkäinen
2012-Feb-05 19:44 UTC
Re: Stop the continuous flood of (XEN) traps.c:2432:d0 Domain attempted WRMSR ..
On Fri, Feb 03, 2012 at 01:55:27PM -0500, Konrad Rzeszutek Wilk wrote:
> Is it always that MSR? That looks to be TURBO_POWER_CURRENT_LIMIT
> which is the intel_ips driver doing.

Yeah, it's always the same.

> Looks to be the intel-ips. If you rename it does the issue disappear?

I just did "rmmod intel_ips" and the flood stopped.

Btw on baremetal I get this in dmesg:

[ 745.033645] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 745.033652] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 745.034676] CPU1: Core temperature/speed normal
[ 745.034678] CPU3: Core temperature/speed normal
[ 849.678508] intel ips 0000:00:1f.6: MCP limit exceeded: Avg temp 9682, limit 9000
[ 899.614074] intel ips 0000:00:1f.6: MCP limit exceeded: Avg temp 9896, limit 9000
[ 899.722881] [Hardware Error]: Machine check events logged
[ 1172.675987] CPU3: Core temperature above threshold, cpu clock throttled (total events = 78)
[ 1172.675990] CPU1: Core temperature above threshold, cpu clock throttled (total events = 78)
[ 1172.677038] CPU1: Core temperature/speed normal
[ 1172.677042] CPU3: Core temperature/speed normal
[ 1174.260050] intel ips 0000:00:1f.6: MCP limit exceeded: Avg temp 9676, limit 9000
[ 1199.339634] [Hardware Error]: Machine check events logged

-- Pasi
Konrad Rzeszutek Wilk
2012-Feb-09 21:21 UTC
Re: [Xen-devel] Stop the continuous flood of (XEN) traps.c:2432:d0 Domain attempted WRMSR ..
On Sun, Feb 05, 2012 at 09:44:13PM +0200, Pasi Kärkkäinen wrote:
> > Looks to be the intel-ips. If you rename it does the issue disappear?
>
> I just did "rmmod intel_ips" and the flood stopped.
>
> Btw on baremetal I get this in dmesg:
>
> [ 849.678508] intel ips 0000:00:1f.6: MCP limit exceeded: Avg temp 9682, limit 9000
> [ 899.614074] intel ips 0000:00:1f.6: MCP limit exceeded: Avg temp 9896, limit 9000
> [ 899.722881] [Hardware Error]: Machine check events logged

Jesse and Matthew,

Is there a way to make the intel_ips.c driver be in a "low-power" state?

My first thought about fixing this was that we could have the hypervisor allow those RDMSR, but the Linux kernel has no power to actually influence the power management (as the hypervisor is in charge of that) - so would the driver be capable of just sitting back and not influencing the CPU?
Jesse Barnes
2012-Feb-09 21:27 UTC
Re: [Xen-devel] Stop the continuous flood of (XEN) traps.c:2432:d0 Domain attempted WRMSR ..
On Thu, 9 Feb 2012 17:21:47 -0400 Konrad Rzeszutek Wilk <konrad@darnok.org> wrote:
> Jesse, and Matthew,
>
> Is there a way to make the intel_ips.c driver be in a "low-power" state?
>
> My first thought about fixing this was that we could allow the
> hypervisor to allow those RDMSR but the Linux kernel has no power to
> actually influence the power management (as the hypervisor is in charge
> of that) - so would the driver be capable of just sitting back and
> not influencing the CPU?

Yeah it's easy enough to turn off or disable. But it doesn't currently export any knobs for controlling behavior. I don't have any issue with exposing some though...

-- 
Jesse Barnes, Intel Open Source Technology Center
Konrad Rzeszutek Wilk
2012-Mar-28 20:29 UTC
Re: [Xen-devel] Stop the continuous flood of (XEN) traps.c:2432:d0 Domain attempted WRMSR ..
On Thu, Feb 09, 2012 at 01:27:15PM -0800, Jesse Barnes wrote:
> > Is there a way to make the intel_ips.c driver be in a "low-power" state?
> >
> > My first thought about fixing this was that we could allow the
> > hypervisor to allow those RDMSR but the Linux kernel has no power to
> > actually influence the power management (as the hypervisor is in charge
> > of that) - so would the driver be capable of just sitting back and
> > not influencing the CPU?
>
> Yeah it's easy enough to turn off or disable. But it doesn't currently
> export any knobs for controlling behavior. I don't have any issue with
> exposing some though...

Pasi,

Could you test the two patches independently of each other? Meaning test the Linux one without the Xen one, and vice versa.
Pasi Kärkkäinen
2012-Mar-28 21:19 UTC
Re: Stop the continuous flood of (XEN) traps.c:2432:d0 Domain attempted WRMSR ..
On Wed, Mar 28, 2012 at 04:29:07PM -0400, Konrad Rzeszutek Wilk wrote:
> > Yeah it's easy enough to turn off or disable. But it doesn't currently
> > export any knobs for controlling behavior. I don't have any issue with
> > exposing some though...
>
> Pasi,
>
> Could you test the two patches independently of each other? Meaning
> test the Linux one without the Xen one, and vice-versa.

Sure, I'll give it a try during the weekend when I'm able to access the box in question.

Thanks!

-- Pasi

> diff --git a/drivers/platform/x86/intel_ips.c b/drivers/platform/x86/intel_ips.c
> index 88a98cf..7276831 100644
> --- a/drivers/platform/x86/intel_ips.c
> +++ b/drivers/platform/x86/intel_ips.c
> @@ -1407,6 +1407,10 @@ static struct ips_mcp_limits *ips_detect_cpu(struct ips_driver *ips)
>  	}
>  
>  	rdmsrl(TURBO_POWER_CURRENT_LIMIT, turbo_power);
> +	if (turbo_power == 0) {
> +		ips->turbo_toggle_allowed = false;
> +		return NULL;
> +	}
>  	tdp = turbo_power & TURBO_TDP_MASK;
>  
>  	/* Sanity check TDP against CPU */

> diff -r 8e2690dbec49 xen/arch/x86/traps.c
> --- a/xen/arch/x86/traps.c Sat Mar 24 13:13:49 2012 -0400
> +++ b/xen/arch/x86/traps.c Wed Mar 28 16:27:31 2012 -0400
> @@ -1746,7 +1746,8 @@ void (*pv_post_outb_hook)(unsigned int p
>  static inline uint64_t guest_misc_enable(uint64_t val)
>  {
>      val &= ~(MSR_IA32_MISC_ENABLE_PERF_AVAIL |
> -             MSR_IA32_MISC_ENABLE_MONITOR_ENABLE);
> +             MSR_IA32_MISC_ENABLE_MONITOR_ENABLE |
> +             MSR_IA32_MISC_ENABLE_TURBO);
>      val |= MSR_IA32_MISC_ENABLE_BTS_UNAVAIL |
>             MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL |
>             MSR_IA32_MISC_ENABLE_XTPR_DISABLE;
> diff -r 8e2690dbec49 xen/include/asm-x86/msr-index.h
> --- a/xen/include/asm-x86/msr-index.h Sat Mar 24 13:13:49 2012 -0400
> +++ b/xen/include/asm-x86/msr-index.h Wed Mar 28 16:27:31 2012 -0400
> @@ -327,6 +327,7 @@
>  #define MSR_IA32_MISC_ENABLE_MONITOR_ENABLE (1<<18)
>  #define MSR_IA32_MISC_ENABLE_LIMIT_CPUID (1<<22)
>  #define MSR_IA32_MISC_ENABLE_XTPR_DISABLE (1<<23)
> +#define MSR_IA32_MISC_ENABLE_TURBO (1<<38)
>  
>  #define MSR_IA32_TSC_DEADLINE 0x000006E0
>  #define MSR_IA32_ENERGY_PERF_BIAS 0x000001b0
Pasi Kärkkäinen
2012-Jul-08 20:47 UTC
Re: Stop the continuous flood of (XEN) traps.c:2432:d0 Domain attempted WRMSR ..
On Wed, Mar 28, 2012 at 04:29:07PM -0400, Konrad Rzeszutek Wilk wrote:
> Pasi,
>
> Could you test the two patches independently of each other? Meaning
> test the Linux one without the Xen one, and vice-versa.

Sorry for the really long delay.
I tested these patches now.

> diff --git a/drivers/platform/x86/intel_ips.c b/drivers/platform/x86/intel_ips.c
> index 88a98cf..7276831 100644
> --- a/drivers/platform/x86/intel_ips.c
> +++ b/drivers/platform/x86/intel_ips.c
> @@ -1407,6 +1407,10 @@ static struct ips_mcp_limits *ips_detect_cpu(struct ips_driver *ips)
>  	}
>  
>  	rdmsrl(TURBO_POWER_CURRENT_LIMIT, turbo_power);
> +	if (turbo_power == 0) {
> +		ips->turbo_toggle_allowed = false;
> +		return NULL;
> +	}
>  	tdp = turbo_power & TURBO_TDP_MASK;
>  
>  	/* Sanity check TDP against CPU */

This Linux patch, applied to a Linux 3.4.4 dom0 kernel with no patches to the hypervisor, didn't change anything. The hypervisor log is still flooded with:

(XEN) traps.c:2432:d0 Domain attempted WRMSR 00000000000001ac from 0x0000000000c800c8 to 0x0000000080c880c8.
(XEN) traps.c:2432:d0 Domain attempted WRMSR 00000000000001ac from 0x0000000000c800c8 to 0x0000000080c880c8.

And then the Xen patch:

> diff -r 8e2690dbec49 xen/arch/x86/traps.c
> --- a/xen/arch/x86/traps.c Sat Mar 24 13:13:49 2012 -0400
> +++ b/xen/arch/x86/traps.c Wed Mar 28 16:27:31 2012 -0400
> @@ -1746,7 +1746,8 @@ void (*pv_post_outb_hook)(unsigned int p
>  static inline uint64_t guest_misc_enable(uint64_t val)
>  {
>      val &= ~(MSR_IA32_MISC_ENABLE_PERF_AVAIL |
> -             MSR_IA32_MISC_ENABLE_MONITOR_ENABLE);
> +             MSR_IA32_MISC_ENABLE_MONITOR_ENABLE |
> +             MSR_IA32_MISC_ENABLE_TURBO);
>      val |= MSR_IA32_MISC_ENABLE_BTS_UNAVAIL |
>             MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL |
>             MSR_IA32_MISC_ENABLE_XTPR_DISABLE;
> diff -r 8e2690dbec49 xen/include/asm-x86/msr-index.h
> --- a/xen/include/asm-x86/msr-index.h Sat Mar 24 13:13:49 2012 -0400
> +++ b/xen/include/asm-x86/msr-index.h Wed Mar 28 16:27:31 2012 -0400
> @@ -327,6 +327,7 @@
>  #define MSR_IA32_MISC_ENABLE_MONITOR_ENABLE (1<<18)
>  #define MSR_IA32_MISC_ENABLE_LIMIT_CPUID (1<<22)
>  #define MSR_IA32_MISC_ENABLE_XTPR_DISABLE (1<<23)
> +#define MSR_IA32_MISC_ENABLE_TURBO (1<<38)
>  
>  #define MSR_IA32_TSC_DEADLINE 0x000006E0
>  #define MSR_IA32_ENERGY_PERF_BIAS 0x000001b0

It seems this Xen patch breaks compilation, at least with Fedora 16 gcc (Xen 4.1.3-rc2):

traps.c: In function 'guest_misc_enable':
traps.c:1780:14: error: left shift count >= width of type [-Werror]
cc1: all warnings being treated as errors
make[4]: *** [traps.o] Error 1

So I had to make a trivial change to msr-index.h:

#define MSR_IA32_MISC_ENABLE_TURBO (1L<<38)

Which seems to fix the compilation.

I tested the patched hypervisor with the stock Fedora 16 Linux 3.4.2-1.fc16.x86_64 dom0 kernel, and I still get the hypervisor log entries:

(XEN) traps.c:2489:d0 Domain attempted WRMSR 00000000000001ac from 0x0000000000c800c8 to 0x0000000080c880c8.
(XEN) traps.c:2489:d0 Domain attempted WRMSR 00000000000001ac from 0x0000000000c800c8 to 0x0000000080c880c8.

So unfortunately it looks like the patches didn't help reduce the spam in the hypervisor logs.

Thanks,

-- Pasi