I recently realized that the original way of dealing with the DebugCtl
MSR on VMX failed to make use of the dedicated guest VMCS field. This
is being fixed with this patch.
What is puzzling me to a certain degree is that while there is a guest
VMCS field for this MSR, there''s no equivalent host load field, but
there''s also no indication that the MSR would be cleared during a
vmexit. Can someone at Intel perhaps give a statement on this?
I would really be possible to avoid the intercept on DebugCtl reads
altogether, but that would require new support functions to
individually disable read and write intercepts (which currently are
always handled as a pair). I''m not sure such a change is warranted
for a debugging only feature.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Index: 2008-03-05/xen/arch/x86/hvm/vmx/vmcs.c
==================================================================---
2008-03-05.orig/xen/arch/x86/hvm/vmx/vmcs.c	2008-02-18 10:07:45.000000000 +0100
+++ 2008-03-05/xen/arch/x86/hvm/vmx/vmcs.c	2008-03-14 14:08:11.000000000 +0100
@@ -870,7 +870,7 @@ void vmcs_dump_vcpu(struct vcpu *v)
     x  = (unsigned long long)vmr(TSC_OFFSET_HIGH) << 32;
     x |= (uint32_t)vmr(TSC_OFFSET);
     printk("TSC Offset = %016llx\n", x);
-    x  = (unsigned long long)vmr(GUEST_IA32_DEBUGCTL) << 32;
+    x  = (unsigned long long)vmr(GUEST_IA32_DEBUGCTL_HIGH) << 32;
     x |= (uint32_t)vmr(GUEST_IA32_DEBUGCTL);
     printk("DebugCtl=%016llx DebugExceptions=%016llx\n", x,
            (unsigned long long)vmr(GUEST_PENDING_DBG_EXCEPTIONS));
Index: 2008-03-05/xen/arch/x86/hvm/vmx/vmx.c
==================================================================---
2008-03-05.orig/xen/arch/x86/hvm/vmx/vmx.c	2008-02-26 10:43:52.000000000 +0100
+++ 2008-03-05/xen/arch/x86/hvm/vmx/vmx.c	2008-03-14 14:14:40.000000000 +0100
@@ -1512,8 +1512,10 @@ static int vmx_msr_read_intercept(struct
         msr_content = var_range_base[index];
         break;
     case MSR_IA32_DEBUGCTLMSR:
-        if ( vmx_read_guest_msr(v, ecx, &msr_content) != 0 )
-            msr_content = 0;
+        msr_content = __vmread(GUEST_IA32_DEBUGCTL);
+#ifdef __i386__
+        msr_content |= (u64)__vmread(GUEST_IA32_DEBUGCTL_HIGH) << 32;
+#endif
         break;
     case MSR_IA32_VMX_BASIC...MSR_IA32_VMX_PROCBASED_CTLS2:
         goto gp_fault;
@@ -1732,11 +1734,15 @@ static int vmx_msr_write_intercept(struc
         }
 
         if ( (rc < 0) ||
-             (vmx_add_guest_msr(v, ecx) < 0) ||
              (vmx_add_host_load_msr(v, ecx) < 0) )
             vmx_inject_hw_exception(v, TRAP_machine_check, 0);
         else
-            vmx_write_guest_msr(v, ecx, msr_content);
+        {
+            __vmwrite(GUEST_IA32_DEBUGCTL, msr_content);
+#ifdef __i386__
+            __vmwrite(GUEST_IA32_DEBUGCTL_HIGH, msr_content >> 32);
+#endif
+        }
 
         break;
     }
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Hi, Jan, For IA32_DEBUGCTRL: You can use VMX''s generic MSR save/load mechanism for loading host value for this MSR. Otherwise, the value in this MSR remains unchanged across VMX transitions between root and non-root mode. I would rather think that this dedicate MSR field is an optimization in addition to the generic save/load mechanism. There may not be needs for a dedicate host field to make them in pairs. Best Regards Haitao Shan Jan Beulich wrote:> I recently realized that the original way of dealing with the DebugCtl > MSR on VMX failed to make use of the dedicated guest VMCS field. This > is being fixed with this patch. > > What is puzzling me to a certain degree is that while there is a guest > VMCS field for this MSR, there''s no equivalent host load field, but > there''s also no indication that the MSR would be cleared during a > vmexit. Can someone at Intel perhaps give a statement on this? > > I would really be possible to avoid the intercept on DebugCtl reads > altogether, but that would require new support functions to > individually disable read and write intercepts (which currently are > always handled as a pair). I''m not sure such a change is warranted > for a debugging only feature. > > Signed-off-by: Jan Beulich <jbeulich@novell.com> > > Index: 2008-03-05/xen/arch/x86/hvm/vmx/vmcs.c > ==================================================================> --- 2008-03-05.orig/xen/arch/x86/hvm/vmx/vmcs.c 2008-02-18 > 10:07:45.000000000 +0100 +++ > 2008-03-05/xen/arch/x86/hvm/vmx/vmcs.c 2008-03-1414:08:11.000000000> +0100 @@ -870,7 +870,7 @@ void vmcs_dump_vcpu(struct vcpu *v) x > = (unsigned long long)vmr(TSC_OFFSET_HIGH) << 32; x |> (uint32_t)vmr(TSC_OFFSET); printk("TSC Offset = %016llx\n", x); > - x = (unsigned long long)vmr(GUEST_IA32_DEBUGCTL) << 32; > + x = (unsigned long long)vmr(GUEST_IA32_DEBUGCTL_HIGH) << 32; > x |= (uint32_t)vmr(GUEST_IA32_DEBUGCTL); > printk("DebugCtl=%016llx DebugExceptions=%016llx\n", x, > (unsigned long long)vmr(GUEST_PENDING_DBG_EXCEPTIONS)); > Index: 2008-03-05/xen/arch/x86/hvm/vmx/vmx.c > ==================================================================> --- 2008-03-05.orig/xen/arch/x86/hvm/vmx/vmx.c 2008-02-26 > 10:43:52.000000000 +0100 +++ > 2008-03-05/xen/arch/x86/hvm/vmx/vmx.c 2008-03-14 14:14:40.000000000 > +0100 @@ -1512,8 +1512,10 @@ static int > vmx_msr_read_intercept(struct msr_content > var_range_base[index]; break; case MSR_IA32_DEBUGCTLMSR: > - if ( vmx_read_guest_msr(v, ecx, &msr_content) != 0 ) > - msr_content = 0; > + msr_content = __vmread(GUEST_IA32_DEBUGCTL); > +#ifdef __i386__ > + msr_content |= (u64)__vmread(GUEST_IA32_DEBUGCTL_HIGH) << 32; > +#endif > break; > case MSR_IA32_VMX_BASIC...MSR_IA32_VMX_PROCBASED_CTLS2: > goto gp_fault; > @@ -1732,11 +1734,15 @@ static int vmx_msr_write_intercept(struc > } > > if ( (rc < 0) || > - (vmx_add_guest_msr(v, ecx) < 0) || > (vmx_add_host_load_msr(v, ecx) < 0) ) > vmx_inject_hw_exception(v, TRAP_machine_check, 0); > else > - vmx_write_guest_msr(v, ecx, msr_content); > + { > + __vmwrite(GUEST_IA32_DEBUGCTL, msr_content); > +#ifdef __i386__ > + __vmwrite(GUEST_IA32_DEBUGCTL_HIGH, msr_content >> 32); > +#endif > + } > > break; > } > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> "Shan, Haitao" <haitao.shan@intel.com> 17.03.08 07:29 >>> >Hi, Jan, > >For IA32_DEBUGCTRL: >You can use VMX''s generic MSR save/load mechanism for loading host value >for this MSR. Otherwise, the value in this MSR remains unchanged across >VMX transitions between root and non-root mode. >I would rather think that this dedicate MSR field is an optimization in >addition to the generic save/load mechanism. There may not be needs for >a dedicate host field to make them in pairs.The point I''m trying to make is that the hypervisor has no way of allowing the guest to modify e.g. the DS area control bits in this register without adding a host load MSR; even for the BTF and LBR bits it seems suspicious to expect the hypervisor to tolerate the guest register settings. Basically, whenever the guest is permitted to set the MSR to a non-zero value the host will need to load the MSR during vmexit, and hence requiring explicit adding of the MSR to the set that is loaded through the generic mechanism seems inconsistent. But nevertheless thanks for clarifying things work the way they appear to based solely on the VMCS documentation, and namely that there''s no way to avoid the explicit MSR load. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> "Haitao Shan" <maillists.shan@gmail.com> 17.03.08 14:38 >>> >Yes, I agree. Once allowing guest to modify this MSR, you have to implement >some save/load functions. >I am wondering what you are trying to enable? As far as I know, DS areaYes, I''m considering adding support for DS, since Linux 2.6.25-rc briefly had support for this (it''s now disabled again), so it''s likely that pretty soon the functionality in the kernel will be there. Hence the hypervisor ought to support it.>(both PEBS and BTS) is not working properly now. You should assume that any >events which will trigger DS write might happen right at the moment that CPU >is in VMExit/VMEntry. Only using HW to switch related MSRs is not enough.That would be very bad - it would mean that you cannot reliably virtualize this. But - are you sure here? For PEBS, I could see this to be true (although it should be properly dealt with in hardware), but BTS shouldn''t have any activity during vmexit/vmentry as there are no branches during that time. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Oh, yes. Sorry for my bad memory. The spec does say that there is no branch trace activities during VMExit/VMEntry. It should work. But anyhow, I think it is better to try and see whether it works.:) For PEBS, this actually can happen during VMExit/VMEntry. I have been debugging this for a long time when I made the patch to enable PMU usage in HVM guests. If you programme the PMCs and enable PEBS and DS in guest, while in host these features are disabled, events happened during VMExit/VMEntry will be lost. And it is hard to emulate the lost one in VMM. So I disabled these features through IA32_MISC_ENABLE. You can see in current Xen''s code. Best Regards Haitao Shan Jan Beulich wrote:>>>> "Haitao Shan" <maillists.shan@gmail.com> 17.03.08 14:38 >>> >> Yes, I agree. Once allowing guest to modify this MSR, you have to >> implement some save/load functions. I am wondering what you are >> trying to enable? As far as I know, DS area > > Yes, I''m considering adding support for DS, since Linux 2.6.25-rc > briefly had support for this (it''s now disabled again), so it''s > likely that pretty soon the functionality in the kernel will be > there. Hence the hypervisor ought to support it. > >> (both PEBS and BTS) is not working properly now. You should assume >> that any events which will trigger DS write might happen right at >> the moment that CPU is in VMExit/VMEntry. Only using HW to switch >> related MSRs is not enough. > > That would be very bad - it would mean that you cannot reliably > virtualize this. But - are you sure here? For PEBS, I could see this > to be true (although it should be properly dealt with in hardware), > but > BTS shouldn''t have any activity during vmexit/vmentry as there are > no branches during that time. > > Jan > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Reasonably Related Threads
- [PATCH 2/3] Implement tsc adjust feature
- [PATCH v4 0/4] x86/HVM: miscellaneous improvements
- [PATCH]x86:x2apic: Disable x2apic on x86-32 permanently
- Bug#748052: [Xen-devel] dom0 USB failing with "ehci-pci: probe of 0000:00:1d.0 faile
- [Patch] Add NMI Injection and Pending Support in VMX