c/s 86d60e855fe118df0dbdf67b67b1a0ec8fdb9f0d does not cause a wbinvd() in
certain cases, such as writing the hypercall page, which performs:

    hypercall_page = __map_domain_page(page);
    hypercall_page_initialise(d, hypercall_page);
    unmap_domain_page(hypercall_page);

and bypasses the hooks in __hvm_{copy,clear}() looking for hypervisor
modification of domain memory.

Move the hook into unmap_domain_page() so it catches more domain memory
modification points.

Furthermore, the currently unconditional wbinvd() in vmx_ctxt_switch_to()
should be deferred until vmentry.  This way, a ctxt_switch_to() followed by
an iteration of a hypercall continuation won't hit Xen with two wbinvd()s.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Liu Jinsong <jinsong.liu@intel.com>
---
This patch is somewhat RFC, partly because I have only compile-tested it, and
because of a few further questions.

1) Are there any places other than unmap_domain_page() where we might want to
put hooks?  Any well-behaved access should use {map,unmap}_domain_page() as
far as I am aware.

2) What is expected to happen with Qemu mappings from dom0?  As far as I am
aware, they can't retroactively have their pagetable caching type changed.
---
 xen/arch/x86/domain_page.c |    4 ++++
 xen/arch/x86/hvm/hvm.c     |    6 ------
 xen/arch/x86/hvm/vmx/vmx.c |    2 +-
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index bc18263..18385c6 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -171,6 +171,10 @@ void unmap_domain_page(const void *ptr)
     unsigned long va = (unsigned long)ptr, mfn, flags;
     struct vcpu_maphash_entry *hashent;
 
+    if ( unlikely(is_hvm_vcpu(current) &&
+                  current->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
+        current->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
+
     if ( va >= DIRECTMAP_VIRT_START )
         return;
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 36699b8..73d1cf0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2477,9 +2477,6 @@ static enum hvm_copy_result __hvm_copy(
         return HVMCOPY_unhandleable;
 #endif
 
-    if ( unlikely(curr->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
-        curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
-
     while ( todo > 0 )
     {
         count = min_t(int, PAGE_SIZE - (addr & ~PAGE_MASK), todo);
@@ -2591,9 +2588,6 @@ static enum hvm_copy_result __hvm_clear(paddr_t addr, int size)
         return HVMCOPY_unhandleable;
 #endif
 
-    if ( unlikely(curr->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
-        curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
-
     while ( todo > 0 )
     {
         count = min_t(int, PAGE_SIZE - (addr & ~PAGE_MASK), todo);
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index a0ad37a..6515e7a 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -644,7 +644,7 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
 
     /* For guest cr0.cd setting, do not use potentially polluted cache */
     if ( unlikely(v->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
-        wbinvd();
+        v->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
 
     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);
--
1.7.10.4
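
For context, deferring the flush only works because the referenced changeset
already consumes hypervisor_access_uc_hvm_memory on the vmentry path.  A
minimal sketch of that consumer is below; only the flag name and the wbinvd()
come from the code above, and the exact function it lives in (shown here as
vmx_vmenter_helper()) is an assumption:

    /* Sketch of the vmentry-path consumer this patch relies on.  The
     * function name is an assumption; the flag and wbinvd() are from the
     * referenced changeset. */
    static void vmx_vmenter_helper(void)
    {
        struct vcpu *curr = current;

        /* ... existing vmentry preparation ... */

        /* If Xen touched this (no-fill/UC) guest's memory since the last
         * vmentry, flush the cache once, just before resuming the guest. */
        if ( unlikely(curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory) )
        {
            curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 0;
            wbinvd();
        }
    }

With this in place, setting the flag from unmap_domain_page() and from
vmx_ctxt_switch_to() collapses any number of hypervisor writes between two
vmentries into a single flush.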
>>> On 07.11.13 at 11:56, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> This patch is somewhat RFC, partly because I have only compile-tested it, and
> because of a few further questions.
>
> 1) Are there any places other than unmap_domain_page() where we might want to
> put hooks?  Any well-behaved access should use {map,unmap}_domain_page() as
> far as I am aware.

Any writes done through map_domain_page_global() mappings.

> 2) What is expected to happen with Qemu mappings from dom0?  As far as I am
> aware, they can't retroactively have their pagetable caching type changed.

Right, so the flag should also get set when completing ioreq-s.

Jan
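
A sketch of what that ioreq-completion hook might look like, using only the
field names from the patch; the idea of calling it from the ioreq completion
path (e.g. hvm_io_assist()) is an assumption, not existing code:

    /* Sketch only: once an ioreq completes, qemu may have written guest
     * memory through its own (cacheable) dom0 mappings, so arrange for a
     * flush before the next vmentry.  Where exactly this gets called from
     * is an assumption. */
    static void hvm_note_ioreq_completion(struct vcpu *curr)
    {
        if ( unlikely(curr->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
            curr->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
    }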
On 07/11/13 11:09, Jan Beulich wrote:
>>>> On 07.11.13 at 11:56, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> This patch is somewhat RFC, partly because I have only compile-tested it, and
>> because of a few further questions.
>>
>> 1) Are there any places other than unmap_domain_page() where we might want to
>> put hooks?  Any well-behaved access should use {map,unmap}_domain_page() as
>> far as I am aware.
> Any writes done through map_domain_page_global() mappings.

Ooh yes.  I will hook unmap_domain_page_global() as well.

>
>> 2) What is expected to happen with Qemu mappings from dom0?  As far as I am
>> aware, they can't retroactively have their pagetable caching type changed.
> Right, so the flag should also get set when completing ioreq-s.
>
> Jan
>

Ok, but that still doesn't help coherency of emulated device state which
changes without an ioreq.  Qemu scraping the framebuffer for VNC comes to
mind.

How bad would it be to have a clean cacheline in RAM because of Qemu, which
is accessed as UC by the domain?  I suspect the answer is "there be dragons",
and thus best avoided.

~Andrew
On 07/11/13 11:09, Jan Beulich wrote:
>>>> On 07.11.13 at 11:56, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> This patch is somewhat RFC, partly because I have only compile-tested it, and
>> because of a few further questions.
>>
>> 1) Are there any places other than unmap_domain_page() where we might want to
>> put hooks?  Any well-behaved access should use {map,unmap}_domain_page() as
>> far as I am aware.
> Any writes done through map_domain_page_global() mappings.
>
>> 2) What is expected to happen with Qemu mappings from dom0?  As far as I am
>> aware, they can't retroactively have their pagetable caching type changed.
> Right, so the flag should also get set when completing ioreq-s.
>
> Jan
>

And further questions:

* Can Intel confirm which platforms are incapable of using VT-d snooping?
* Can Intel confirm which platforms are missing vmx_pat?

It would be nice to have an idea of which systems are going to have
implicitly changed behaviour and/or a performance cliff as a result of these
changes.

~Andrew
>>> On 07.11.13 at 12:15, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 07/11/13 11:09, Jan Beulich wrote:
>>>>> On 07.11.13 at 11:56, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> This patch is somewhat RFC, partly because I have only compile-tested it, and
>>> because of a few further questions.
>>>
>>> 1) Are there any places other than unmap_domain_page() where we might want to
>>> put hooks?  Any well-behaved access should use {map,unmap}_domain_page() as
>>> far as I am aware.
>> Any writes done through map_domain_page_global() mappings.
>
> Ooh yes.  I will hook unmap_domain_page_global() as well.

That won't suffice (and is really the wrong place) - global mappings
are usually used in order to be able to write to the same area over
an extended period of time.  I.e. here you'd need to hunt down the
individual writes (hopefully not too many, since quite a few of the
mappings involved are r/o).

Jan
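
One way to keep that hunt manageable would be a small helper that each
identified write site calls.  This is purely hypothetical (nothing like it
exists in the tree), reusing only the names already introduced by the patch:

    /* Hypothetical helper: note that Xen wrote HVM guest memory while the
     * guest has caching disabled, so that the next vmentry does a
     * wbinvd(). */
    static inline void hvm_note_uc_write(struct vcpu *v)
    {
        if ( unlikely(is_hvm_vcpu(v) &&
                      v->arch.hvm_vcpu.cache_mode == NO_FILL_CACHE_MODE) )
            v->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
    }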
>>> On 07.11.13 at 12:44, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>> On 07.11.13 at 12:15, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 07/11/13 11:09, Jan Beulich wrote:
>>>>>> On 07.11.13 at 11:56, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>> This patch is somewhat RFC, partly because I have only compile-tested it, and
>>>> because of a few further questions.
>>>>
>>>> 1) Are there any places other than unmap_domain_page() where we might want to
>>>> put hooks?  Any well-behaved access should use {map,unmap}_domain_page() as
>>>> far as I am aware.
>>> Any writes done through map_domain_page_global() mappings.
>>
>> Ooh yes.  I will hook unmap_domain_page_global() as well.
>
> That won't suffice (and is really the wrong place) - global mappings
> are usually used in order to be able to write to the same area over
> an extended period of time.  I.e. here you'd need to hunt down the
> individual writes (hopefully not too many, since quite a few of the
> mappings involved are r/o).

Which would get pretty ugly:
- guest_walk_tables() (accumulating set_ad_bits() results)
- all writes through vcpu_info()
- event_fifo.c would need this scattered around
- hvm_load_segment_selector()'s setting of the accessed bit
- hvm_task_switch()'s handling of the busy bit
- all the nested HVM code's writes through ->nv_vvmcx

Jan
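
For illustration only, annotating one of the listed sites (a vcpu_info()
write) with the hypothetical helper from the previous sketch might look
roughly like this; the surrounding code is abbreviated rather than a faithful
copy of the tree:

    void vcpu_mark_events_pending(struct vcpu *v)
    {
        int already_pending = test_and_set_bit(
            0, (unsigned long *)&vcpu_info(v, evtchn_upcall_pending));

        hvm_note_uc_write(v);   /* hypothetical annotation */

        if ( already_pending )
            return;

        /* ... existing event-delivery logic ... */
    }

The same one-line annotation would have to be repeated at each of the sites
in the list above, which is what makes this approach ugly.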