Hi,

in our BS2000 guest running as HVM with EPT on x86_64 I have a problem which
seems to be related to stale TLB entries. I'm pretty sure I have invalidated
the TLB correctly after a change of the page tables, so I've searched for
possible problems in the hypervisor.

Xen is version 4.0 from SLES 11 SP1.

If I have read the sources correctly, neither INVLPG nor a reload of CR3 is
handled by the hypervisor. And I didn't find an explicit clearing of the TLB
when a vcpu is switching physical cpus. So I think the following scenario is
possible:

- a vcpu is running on physical cpu A, creating a TLB entry
- the vcpu is scheduled on physical cpu B, while physical cpu A is left idle
- on physical cpu B the TLB entry is cleared by INVLPG or load CR3
- the vcpu is scheduled on physical cpu A again (no other vcpu was active
  there in between), CR3 is same as when vcpu left cpu A
- the old TLB entry from the vcpu is still valid there!

Am I missing something?

Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
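The scenario in this message can be replayed in a few lines of C. What follows is a toy model of the concern only, not of what Xen or the hardware actually does: it assumes per-pcpu TLBs whose entries carry no tag and that nothing flushes them when a vcpu migrates. Every name in it (`tlb`, `translate`, `invlpg`, `replay_scenario`, `guest_pte_pa`) is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PCPUS  2   /* pcpu 0 plays "A", pcpu 1 plays "B" */
#define TLB_SLOTS 4

/* One cached guest-virtual to guest-physical translation. */
struct tlb_entry {
    bool valid;
    unsigned long va, pa;
};

/* Hypothetical per-pcpu TLBs that nothing flushes on a vcpu migration. */
static struct tlb_entry tlb[NR_PCPUS][TLB_SLOTS];

/* The guest's only page-table entry: the frame that va maps to right now. */
static unsigned long guest_pte_pa;

static void tlb_reset(void)
{
    for (int c = 0; c < NR_PCPUS; c++)
        for (int i = 0; i < TLB_SLOTS; i++)
            tlb[c][i].valid = false;
}

/* Translate on one pcpu: return a cached entry if present, else walk and cache. */
static unsigned long translate(int cpu, unsigned long va)
{
    struct tlb_entry *free_slot = NULL;

    assert(cpu >= 0 && cpu < NR_PCPUS);
    for (int i = 0; i < TLB_SLOTS; i++) {
        struct tlb_entry *e = &tlb[cpu][i];
        if (e->valid && e->va == va)
            return e->pa;               /* TLB hit, possibly stale */
        if (!e->valid && free_slot == NULL)
            free_slot = e;
    }
    if (free_slot != NULL) {
        free_slot->valid = true;
        free_slot->va = va;
        free_slot->pa = guest_pte_pa;
    }
    return guest_pte_pa;                /* TLB miss: page-table walk */
}

/* Guest INVLPG: drops the entry only on the pcpu that executes it. */
static void invlpg(int cpu, unsigned long va)
{
    assert(cpu >= 0 && cpu < NR_PCPUS);
    for (int i = 0; i < TLB_SLOTS; i++)
        if (tlb[cpu][i].valid && tlb[cpu][i].va == va)
            tlb[cpu][i].valid = false;
}

/* Replay the scenario; returns the frame the vcpu sees once it is back on A. */
static unsigned long replay_scenario(void)
{
    const unsigned long va = 0x1000;

    tlb_reset();
    guest_pte_pa = 0xA000;
    (void)translate(0, va);   /* on A: caches va -> 0xA000 */
    /* vcpu moves to B; A sits idle, its TLB is untouched */
    guest_pte_pa = 0xB000;    /* guest rewrites the PTE while on B ... */
    invlpg(1, va);            /* ... and invalidates, which only reaches B */
    (void)translate(1, va);   /* B walks the tables and sees 0xB000 */
    return translate(0, va);  /* back on A: hits the old entry */
}
```

Under these assumptions `replay_scenario()` returns the old frame 0xA000 although the guest's page table already points at 0xB000, which is exactly the stale entry described in the last step of the list.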
If you're talking about just TLB stuff (not changes to the EPT
tables), that should happen as a result of the context switch code
(nothing to do with EPT). The code in question is here:

xen/arch/x86/domain.c:context_switch()
    if ( unlikely(!cpu_isset(cpu, dirty_mask) && !cpus_empty(dirty_mask)) )
    {
        /* Other cpus call __sync_local_execstate from flush ipi handler. */
        flush_tlb_mask(&dirty_mask);
    }

"Dirty mask" means "where this vcpu has run"; since the vcpu in
question will have run on another pcpu, this should happen before the
vcpu is allowed to run on cpu 0 again.

 -George
At 13:00 +0000 on 24 Jan (1295874058), Juergen Gross wrote:
> [...]
> - the vcpu is scheduled on physical cpu A again (no other vcpu was active
>   there in between), CR3 is same as when vcpu left cpu A
> - the old TLB entry from the vcpu is still valid there!

vmx_do_resume() calls hvm_asid_flush_vcpu() if the VCPU is migrating
onto this CPU, so the VCPU should get a fresh ASID when it comes back to
CPU A. Processors with no ASID support flush their TLBs on every
VMENTER and VMEXIT, so I don't see where we could leak TLB entries.

If there is a leak it should be fairly easy to repro with a toy kernel
and an idle host.

Cheers,

Tim

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
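Why a fresh ASID is enough can be shown with a small tagged-TLB model. This is a minimal sketch under loud assumptions, not Xen's ASID code: TLB entries carry the tag that created them, a lookup only hits on the vcpu's current tag, and a vcpu arriving from another pcpu is handed a new tag at once. The names `toy_pcpu`, `toy_vcpu`, `toy_resume`, `toy_translate`, `toy_invlpg` and `replay_with_asids` are hypothetical, and tag wrap-around is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PCPUS  2   /* pcpu 0 plays "A", pcpu 1 plays "B" */
#define TLB_SLOTS 4

/* A TLB entry tagged with the ASID that created it. */
struct tagged_entry {
    bool valid;
    unsigned asid;
    unsigned long va, pa;
};

struct toy_pcpu {
    struct tagged_entry tlb[TLB_SLOTS];
    unsigned next_asid;              /* next unused tag on this pcpu */
};

struct toy_vcpu {
    int last_pcpu;                   /* -1 until the vcpu has run somewhere */
    unsigned asid;                   /* tag in use on last_pcpu */
};

static struct toy_pcpu pcpus[NR_PCPUS];
static unsigned long guest_pte_pa;   /* frame the guest's single PTE points at */

static void toy_init(struct toy_vcpu *v)
{
    for (int c = 0; c < NR_PCPUS; c++) {
        pcpus[c].next_asid = 1;
        for (int i = 0; i < TLB_SLOTS; i++)
            pcpus[c].tlb[i].valid = false;
    }
    v->last_pcpu = -1;
    v->asid = 0;
}

/* Stand-in for the resume path: arriving from another pcpu means a new tag. */
static void toy_resume(struct toy_vcpu *v, int cpu)
{
    assert(cpu >= 0 && cpu < NR_PCPUS);
    if (v->last_pcpu != cpu) {
        v->asid = pcpus[cpu].next_asid++;   /* wrap-around is ignored here */
        v->last_pcpu = cpu;
    }
}

/* Translate on the current pcpu; only entries with the current tag can hit. */
static unsigned long toy_translate(const struct toy_vcpu *v, unsigned long va)
{
    struct tagged_entry *free_slot = NULL;

    assert(v->last_pcpu >= 0);
    for (int i = 0; i < TLB_SLOTS; i++) {
        struct tagged_entry *e = &pcpus[v->last_pcpu].tlb[i];
        if (e->valid && e->asid == v->asid && e->va == va)
            return e->pa;
        if (!e->valid && free_slot == NULL)
            free_slot = e;
    }
    if (free_slot != NULL) {
        free_slot->valid = true;
        free_slot->asid = v->asid;
        free_slot->va = va;
        free_slot->pa = guest_pte_pa;
    }
    return guest_pte_pa;             /* miss: page-table walk */
}

/* Guest INVLPG: drops entries for the current tag on the current pcpu only. */
static void toy_invlpg(const struct toy_vcpu *v, unsigned long va)
{
    assert(v->last_pcpu >= 0);
    for (int i = 0; i < TLB_SLOTS; i++) {
        struct tagged_entry *e = &pcpus[v->last_pcpu].tlb[i];
        if (e->valid && e->asid == v->asid && e->va == va)
            e->valid = false;
    }
}

/* Same scenario as in the original report, now with tagged entries. */
static unsigned long replay_with_asids(void)
{
    const unsigned long va = 0x1000;
    struct toy_vcpu v;

    toy_init(&v);
    guest_pte_pa = 0xA000;
    toy_resume(&v, 0);
    (void)toy_translate(&v, va);   /* A caches va -> 0xA000 under tag 1 */
    toy_resume(&v, 1);             /* migrate to B */
    guest_pte_pa = 0xB000;         /* guest rewrites the PTE ... */
    toy_invlpg(&v, va);            /* ... and invalidates on B */
    toy_resume(&v, 0);             /* migrate back to A: new tag 2 */
    return toy_translate(&v, va);  /* tag-1 entry cannot match: walk */
}
```

In this model `replay_with_asids()` returns 0xB000: the old entry is still sitting in A's TLB, but it carries tag 1 and the returning vcpu looks up under tag 2, so it can no longer be hit.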
At 13:11 +0000 on 24 Jan (1295874671), George Dunlap wrote:
> [...]
> xen/arch/x86/domain.c:context_switch()
>     if ( unlikely(!cpu_isset(cpu, dirty_mask) && !cpus_empty(dirty_mask)) )
>     {
>         /* Other cpus call __sync_local_execstate from flush ipi handler. */
>         flush_tlb_mask(&dirty_mask);
>     }
> [...]

Actually this code flushes the _other_ CPU's TLB, but I think the code
in vmx_do_resume() should be enough.

Cheers,

Tim.
On 01/24/11 14:11, George Dunlap wrote:
> [...]
> "Dirty mask" means "where this vcpu has run"; since the vcpu in
> question will have run on another pcpu, this should happen before the
> vcpu is allowed to run on cpu 0 again.

Really? I think you are referring to this code in __context_switch():

    /*
     * Mark this CPU in next domain's dirty cpumasks before calling
     * ctxt_switch_to(). This avoids a race on things like EPT flushing,
     * which is synchronised on that function.
     */
    if ( p->domain != n->domain )
        cpu_set(cpu, n->domain->domain_dirty_cpumask);
    cpu_set(cpu, n->vcpu_dirty_cpumask);

This should set the dirty bit for the physical cpu on which the vcpu is just
about to be started. But the dirty bit of the previous vcpu is cleared a
little bit later:

    if ( p->domain != n->domain )
        cpu_clear(cpu, p->domain->domain_dirty_cpumask);
    cpu_clear(cpu, p->vcpu_dirty_cpumask);

Couldn't this leave the dirty mask empty again?

Juergen
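The two excerpts quoted in this message can be tried out on plain bitmasks. The sketch below mirrors only the four mask updates, in the order quoted; it is not the real __context_switch(), which does its state switching between the two steps. `toy_cpumask`, the `toy_cpu_*` helpers, `sketch_domain`, `sketch_vcpu` and `sketch_switch_masks` are hypothetical stand-ins, and the model assumes that `p` and `n` are different vcpus.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per pcpu; a hypothetical stand-in for Xen's cpumask_t. */
typedef uint64_t toy_cpumask;

struct sketch_domain {
    toy_cpumask domain_dirty_cpumask;
};

struct sketch_vcpu {
    struct sketch_domain *domain;
    toy_cpumask vcpu_dirty_cpumask;
};

static void toy_cpu_set(unsigned cpu, toy_cpumask *mask)
{
    assert(cpu < 64);
    *mask |= (toy_cpumask)1 << cpu;
}

static void toy_cpu_clear(unsigned cpu, toy_cpumask *mask)
{
    assert(cpu < 64);
    *mask &= ~((toy_cpumask)1 << cpu);
}

static int toy_cpu_isset(unsigned cpu, toy_cpumask mask)
{
    assert(cpu < 64);
    return (int)((mask >> cpu) & 1);
}

/*
 * Mask bookkeeping for a switch on `cpu` from p (previous) to n (next).
 * Assumes p != n; with p == n the clear would undo the set.
 */
static void sketch_switch_masks(unsigned cpu, struct sketch_vcpu *p,
                                struct sketch_vcpu *n)
{
    /* First excerpt: mark this cpu dirty for the incoming vcpu. */
    if (p->domain != n->domain)
        toy_cpu_set(cpu, &n->domain->domain_dirty_cpumask);
    toy_cpu_set(cpu, &n->vcpu_dirty_cpumask);

    /* Second excerpt: drop this cpu from the outgoing vcpu's masks. */
    if (p->domain != n->domain)
        toy_cpu_clear(cpu, &p->domain->domain_dirty_cpumask);
    toy_cpu_clear(cpu, &p->vcpu_dirty_cpumask);
}
```

Read this way, the set and the clear act on different vcpus: after a switch on a given cpu, the mask of `n` has that bit set, while the mask of `p` has lost it and can indeed end up empty. Whether that matches the intended meaning of "dirty mask" in the surrounding Xen code is not something the excerpts alone settle.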
On 01/24/11 14:13, Tim Deegan wrote:
> [...]
> vmx_do_resume() calls hvm_asid_flush_vcpu() if the VCPU is migrating
> onto this CPU, so the VCPU should get a fresh ASID when it comes back to
> CPU A. Processors with no ASID support flush their TLBs on every
> VMENTER and VMEXIT, so I don't see where we could leak TLB entries.

Ah, this was the missing information I needed!

Thanks, I'll keep on searching...

Juergen