Sengul Thomas observed that Xen was updating the domheap mappings without a lock and that this caused problems on SMP systems. The underlying problem here is that the domheap mappings are supposed to be per-PCPU but were not; instead all PCPUs currently share the same pagetables and therefore use the same mappings.

Implement per-PCPU page tables and give each PCPU its own domheap mapping pages.

Sengul, thanks for your bug report. I wonder if you could test this series?

WRT the freeze, this change represents a bug fix, not a feature...

Changes in V3:
 - Addressed comments from Tim & Stefano.

Changes in V2:
 - Dropped patch "arm: remove incorrect dcache flush at start of day",
   which was itself incorrect.
 - Addressed Stefano's review on the meaty patch.
 - Added a patch to BUILD_BUG_ON various address space alignment
   constraints.
Ian Campbell
2013-Apr-24 10:53 UTC
[PATCH V3 1/4] xen: arm: rename xen_pgtable to boot_pgtable
The intention is that in a subsequent patch each PCPU will have its own
pagetables and that xen_pgtable will become a per-cpu variable. The boot
pagetables will become the boot CPU's pagetables.

For now leave a #define in place for those places which semantically do
mean xen_pgtable and not boot_pgtable.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
---
 xen/arch/arm/arm32/head.S |    2 +-
 xen/arch/arm/arm64/head.S |    4 ++--
 xen/arch/arm/mm.c         |   18 +++++++++++-------
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index f2f581d..0b4cfde 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -206,7 +206,7 @@ skip_bss:
         mcr   CP32(r0, HSCTLR)

         /* Write Xen's PT's paddr into the HTTBR */
-        ldr   r4, =xen_pgtable
+        ldr   r4, =boot_pgtable
         add   r4, r4, r10            /* r4 := paddr (xen_pagetable) */
         mov   r5, #0                 /* r4:r5 is paddr (xen_pagetable) */
         mcrr  CP64(r4, r5, HTTBR)
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index c18ef2b..f0d9066 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -190,14 +190,14 @@ skip_bss:
         msr   SCTLR_EL2, x0

         /* Write Xen's PT's paddr into the HTTBR */
-        ldr   x4, =xen_pgtable
+        ldr   x4, =boot_pgtable
         add   x4, x4, x20            /* x4 := paddr (xen_pagetable) */
         msr   TTBR0_EL2, x4

         /* Non-boot CPUs don't need to rebuild the pagetable */
         cbnz  x22, pt_ready

-        ldr   x1, =xen_first
+        ldr   x1, =boot_first
         add   x1, x1, x20            /* x1 := paddr (xen_first) */
         mov   x3, #PT_PT             /* x2 := table map of xen_first */
         orr   x2, x1, x3             /*       (+ rights for linear PT) */
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index ba3140d..3cb852b 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -40,10 +40,10 @@ struct domain *dom_xen, *dom_io, *dom_cow;

 /* Static start-of-day pagetables that we use before the allocators are up */
-/* xen_pgtable == root of the trie (zeroeth level on 64-bit, first on 32-bit) */
-lpae_t xen_pgtable[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
+/* boot_pgtable == root of the trie (zeroeth level on 64-bit, first on 32-bit) */
+lpae_t boot_pgtable[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 #ifdef CONFIG_ARM_64
-lpae_t xen_first[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
+lpae_t boot_first[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 #endif
 /* N.B. The second-level table is 4 contiguous pages long, and covers
  * all addresses from 0 to 0xffffffff. Offsets into it are calculated
@@ -52,6 +52,10 @@ lpae_t xen_second[LPAE_ENTRIES*4] __attribute__((__aligned__(4096*4)));
 lpae_t xen_fixmap[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 static lpae_t xen_xenmap[LPAE_ENTRIES] __attribute__((__aligned__(4096)));

+/* boot_pgtable becomes the boot processors pagetable, eventually this will
+ * become a per-cpu variable */
+#define xen_pgtable boot_pgtable
+
 /* Non-boot CPUs use this to find the correct pagetables. */
 uint64_t boot_ttbr;

@@ -284,11 +288,11 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
     /* Beware!  Any state we modify between now and the PT switch may be
      * discarded when we switch over to the copy. */

-    /* Update the copy of xen_pgtable to use the new paddrs */
-    p = (void *) xen_pgtable + dest_va - (unsigned long) _start;
+    /* Update the copy of boot_pgtable to use the new paddrs */
+    p = (void *) boot_pgtable + dest_va - (unsigned long) _start;
 #ifdef CONFIG_ARM_64
     p[0].pt.base += (phys_offset - boot_phys_offset) >> PAGE_SHIFT;
-    p = (void *) xen_first + dest_va - (unsigned long) _start;
+    p = (void *) boot_first + dest_va - (unsigned long) _start;
 #endif
     for ( i = 0; i < 4; i++)
         p[i].pt.base += (phys_offset - boot_phys_offset) >> PAGE_SHIFT;
@@ -305,7 +309,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
         p[i].pt.base += (phys_offset - boot_phys_offset) >> PAGE_SHIFT;

     /* Change pagetables to the copy in the relocated Xen */
-    boot_ttbr = (uintptr_t) xen_pgtable + phys_offset;
+    boot_ttbr = (uintptr_t) boot_pgtable + phys_offset;
     flush_xen_dcache(boot_ttbr);
     flush_xen_dcache_va_range((void*)dest_va, _end - _start);
     flush_xen_text_tlb();
--
1.7.2.5
Ian Campbell
2013-Apr-24 10:53 UTC
[PATCH V3 2/4] arm: parenthesise argument to *_linear_offset macros
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
---
 xen/include/asm-arm/page.h |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index 14e63eb..a6a312f 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -323,9 +323,9 @@ static inline int gva_to_ipa(vaddr_t va, paddr_t *paddr)
 #define FIRST_MASK (~(FIRST_SIZE - 1))

 /* Calculate the offsets into the pagetables for a given VA */
-#define first_linear_offset(va) (va >> FIRST_SHIFT)
-#define second_linear_offset(va) (va >> SECOND_SHIFT)
-#define third_linear_offset(va) (va >> THIRD_SHIFT)
+#define first_linear_offset(va) ((va) >> FIRST_SHIFT)
+#define second_linear_offset(va) ((va) >> SECOND_SHIFT)
+#define third_linear_offset(va) ((va) >> THIRD_SHIFT)

 #define TABLE_OFFSET(offs) ((unsigned int)(offs) & LPAE_ENTRY_MASK)
 #define first_table_offset(va)  TABLE_OFFSET(first_linear_offset(va))
--
1.7.2.5
Ian Campbell
2013-Apr-24 10:54 UTC
[PATCH V3 3/4] arm: add build time asserts for various virtual address alignment constraints
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
---
Not totally sold on the need for this, but I guess it is harmless
enough...
---
 xen/arch/arm/mm.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 3cb852b..656aa82 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -74,6 +74,17 @@ unsigned long total_pages;

 extern char __init_begin[], __init_end[];

+/* Checking VA memory layout alignment. */
+static inline void check_memory_layout_alignment_constraints(void) {
+    /* 2MB aligned regions */
+    BUILD_BUG_ON(XEN_VIRT_START & ~SECOND_MASK);
+    BUILD_BUG_ON(FIXMAP_ADDR(0) & ~SECOND_MASK);
+    BUILD_BUG_ON(BOOT_MISC_VIRT_START & ~SECOND_MASK);
+    /* 1GB aligned regions */
+    BUILD_BUG_ON(XENHEAP_VIRT_START & ~FIRST_MASK);
+    BUILD_BUG_ON(DOMHEAP_VIRT_START & ~FIRST_MASK);
+}
+
 void dump_pt_walk(lpae_t *first, paddr_t addr)
 {
     lpae_t *second = NULL, *third = NULL;
--
1.7.2.5
Ian Campbell
2013-Apr-24 10:54 UTC
[PATCH V3 4/4] arm: allocate per-PCPU domheap pagetable pages
The domheap mappings are supposed to be per-PCPU. Therefore xen_pgtable
becomes a per-PCPU variable and we allocate and setup the page tables
for each secondary PCPU just before we tell it to come up. Each
secondary PCPU starts out on the boot page table but switches to its
own page tables ASAP. The boot PCPU uses the boot pagetables as its own.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Various typos
    Reinstate comment about using second_linear_offset() on xen_second[]
    Refactor TTBR0 write into a macro.
    Flush first level PT on ARM64.

    Unaddressed here:
     - Some of the flushing and/or barriers surrounding the TTBR0 write
       are not well understood and may be redundant.
     - reliance on the compiler not spilling to the stack between
       relocation and the TTBR0 write.

v2: Coding style fixes.
    Use DOMHEAP_SECOND_PAGES instead of "2"
    Flush the secondary CPU's pagetables on CPU0 when we create/write
    them not on the secondary CPU.
---
 xen/arch/arm/mm.c            |  152 ++++++++++++++++++++++++++++++++++------
 xen/arch/arm/smpboot.c       |    6 ++
 xen/include/asm-arm/config.h |    4 +
 xen/include/asm-arm/mm.h     |    4 +-
 4 files changed, 146 insertions(+), 20 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 656aa82..03492df 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -39,22 +39,48 @@

 struct domain *dom_xen, *dom_io, *dom_cow;

-/* Static start-of-day pagetables that we use before the allocators are up */
-/* boot_pgtable == root of the trie (zeroeth level on 64-bit, first on 32-bit) */
+/* Static start-of-day pagetables that we use before the
+ * allocators are up. These go on to become the boot CPU's real pagetables.
+ */
 lpae_t boot_pgtable[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 #ifdef CONFIG_ARM_64
 lpae_t boot_first[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 #endif
-/* N.B. The second-level table is 4 contiguous pages long, and covers
- * all addresses from 0 to 0xffffffff. Offsets into it are calculated
- * with second_linear_offset(), not second_table_offset(). */
+
+/*
+ * xen_pgtable and xen_dommap are per-PCPU and are allocated before
+ * bringing up each CPU. On 64-bit a first level table is also allocated.
+ *
+ * xen_second, xen_fixmap and xen_xenmap are shared between all PCPUs.
+ */
+
+/* Per-CPU pagetable pages */
+/* xen_pgtable == root of the trie (zeroeth level on 64-bit, first on 32-bit) */
+static DEFINE_PER_CPU(lpae_t *, xen_pgtable);
+/* xen_dommap == pages used by map_domain_page, these pages contain
+ * the second level pagetables which mapp the domheap region
+ * DOMHEAP_VIRT_START...DOMHEAP_VIRT_END in 2MB chunks. */
+static DEFINE_PER_CPU(lpae_t *, xen_dommap);
+
+/* Common pagetable leaves */
+/* Second level page tables.
+ *
+ * The second-level table is 2 contiguous pages long, and covers all
+ * addresses from 0 to 0x7fffffff. Offsets into it are calculated
+ * with second_linear_offset(), not second_table_offset().
+ *
+ * Addresses 0x80000000 to 0xffffffff are covered by the per-cpu
+ * xen_domheap mappings described above. However we allocate 4 pages
+ * here for use in the boot page tables and the second two pages
+ * become the boot CPUs xen_dommap pages.
+ */
 lpae_t xen_second[LPAE_ENTRIES*4] __attribute__((__aligned__(4096*4)));
+/* First level page table used for fixmap */
 lpae_t xen_fixmap[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
+/* First level page table used to map Xen itself with the XN bit set
+ * as appropriate. */
 static lpae_t xen_xenmap[LPAE_ENTRIES] __attribute__((__aligned__(4096)));

-/* boot_pgtable becomes the boot processors pagetable, eventually this will
- * become a per-cpu variable */
-#define xen_pgtable boot_pgtable

 /* Non-boot CPUs use this to find the correct pagetables. */
 uint64_t boot_ttbr;

@@ -118,12 +144,17 @@ done:
 void dump_hyp_walk(vaddr_t addr)
 {
     uint64_t ttbr = READ_SYSREG64(TTBR0_EL2);
+    lpae_t *pgtable = this_cpu(xen_pgtable);

-    printk("Walking Hypervisor VA 0x%"PRIvaddr" via TTBR 0x%016"PRIx64"\n",
-           addr, ttbr);
+    printk("Walking Hypervisor VA 0x%"PRIvaddr" "
+           "on CPU%d via TTBR 0x%016"PRIx64"\n",
+           addr, smp_processor_id(), ttbr);

-    BUG_ON( (lpae_t *)(unsigned long)(ttbr - phys_offset) != xen_pgtable );
-    dump_pt_walk(xen_pgtable, addr);
+    if ( smp_processor_id() == 0 )
+        BUG_ON( (lpae_t *)(unsigned long)(ttbr - phys_offset) != pgtable );
+    else
+        BUG_ON( virt_to_maddr(pgtable) != ttbr );
+    dump_pt_walk(pgtable, addr);
 }

 /* Map a 4k page in a fixmap entry */
@@ -149,7 +180,7 @@ void clear_fixmap(unsigned map)
 void *map_domain_page(unsigned long mfn)
 {
     unsigned long flags;
-    lpae_t *map = xen_second + second_linear_offset(DOMHEAP_VIRT_START);
+    lpae_t *map = this_cpu(xen_dommap);
     unsigned long slot_mfn = mfn & ~LPAE_ENTRY_MASK;
     vaddr_t va;
     lpae_t pte;
@@ -215,7 +246,7 @@ void *map_domain_page(unsigned long mfn)
 void unmap_domain_page(const void *va)
 {
     unsigned long flags;
-    lpae_t *map = xen_second + second_linear_offset(DOMHEAP_VIRT_START);
+    lpae_t *map = this_cpu(xen_dommap);
     int slot = ((unsigned long) va - DOMHEAP_VIRT_START) >> SECOND_SHIFT;

     local_irq_save(flags);
@@ -230,7 +261,7 @@ void unmap_domain_page(const void *va)

 unsigned long domain_page_map_to_mfn(const void *va)
 {
-    lpae_t *map = xen_second + second_linear_offset(DOMHEAP_VIRT_START);
+    lpae_t *map = this_cpu(xen_dommap);
     int slot = ((unsigned long) va - DOMHEAP_VIRT_START) >> SECOND_SHIFT;
     unsigned long offset = ((unsigned long)va>>THIRD_SHIFT) & LPAE_ENTRY_MASK;

@@ -276,6 +307,16 @@ void __cpuinit setup_virt_paging(void)
     WRITE_SYSREG32(0x80002558, VTCR_EL2); isb();
 }

+/* This needs to be a macro to stop the compiler spilling to the stack
+ * which will change when we change pagetables */
+#define WRITE_TTBR(ttbr) \
+    flush_xen_text_tlb(); \
+    WRITE_SYSREG64(ttbr, TTBR0_EL2); \
+    dsb(); /* ensure memory accesses do not cross over the TTBR0 write */ \
+    /* flush_xen_text_tlb contains an initial isb which ensures the \
+     * write to TTBR0 has completed. */ \
+    flush_xen_text_tlb()
+
 /* Boot-time pagetable setup.
  * Changes here may need matching changes in head.S */
 void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
@@ -323,11 +364,8 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
     boot_ttbr = (uintptr_t) boot_pgtable + phys_offset;
     flush_xen_dcache(boot_ttbr);
     flush_xen_dcache_va_range((void*)dest_va, _end - _start);
-    flush_xen_text_tlb();

-    WRITE_SYSREG64(boot_ttbr, TTBR0_EL2);
-    dsb();                         /* Ensure visibility of HTTBR update */
-    flush_xen_text_tlb();
+    WRITE_TTBR(boot_ttbr);

     /* Undo the temporary map */
     pte.bits = 0;
@@ -373,11 +411,87 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
     WRITE_SYSREG32(READ_SYSREG32(SCTLR_EL2) | SCTLR_WXN, SCTLR_EL2);
     /* Flush everything after setting WXN bit. */
     flush_xen_text_tlb();
+
+    per_cpu(xen_pgtable, 0) = boot_pgtable;
+    per_cpu(xen_dommap, 0) = xen_second +
+        second_linear_offset(DOMHEAP_VIRT_START);
+
+    /* Some of these slots may have been used during start of day and/or
+     * relocation. Make sure they are clear now. */
+    memset(this_cpu(xen_dommap), 0, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
+    flush_xen_dcache_va_range(this_cpu(xen_dommap),
+                              DOMHEAP_SECOND_PAGES*PAGE_SIZE);
+}
+
+int init_secondary_pagetables(int cpu)
+{
+    lpae_t *root, *first, *domheap, pte;
+    int i;
+
+    root = alloc_xenheap_page();
+#ifdef CONFIG_ARM_64
+    first = alloc_xenheap_page();
+#else
+    first = root; /* root == first level on 32-bit 3-level trie */
+#endif
+    domheap = alloc_xenheap_pages(get_order_from_pages(DOMHEAP_SECOND_PAGES), 0);
+
+    if ( root == NULL || domheap == NULL || first == NULL )
+    {
+        printk("Not enough free memory for secondary CPU%d pagetables\n", cpu);
+        free_xenheap_pages(domheap, get_order_from_pages(DOMHEAP_SECOND_PAGES));
+#ifdef CONFIG_ARM_64
+        free_xenheap_page(first);
+#endif
+        free_xenheap_page(root);
+        return -ENOMEM;
+    }
+
+    /* Initialise root pagetable from root of boot tables */
+    memcpy(root, boot_pgtable, PAGE_SIZE);
+
+#ifdef CONFIG_ARM_64
+    /* Initialise first pagetable from first level of boot tables, and
+     * hook into the new root. */
+    memcpy(first, boot_first, PAGE_SIZE);
+    pte = mfn_to_xen_entry(virt_to_mfn(first));
+    pte.pt.table = 1;
+    write_pte(root, pte);
+#endif
+
+    /* Ensure the domheap has no stray mappings */
+    memset(domheap, 0, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
+
+    /* Update the first level mapping to reference the local CPUs
+     * domheap mapping pages. */
+    for ( i = 0; i < DOMHEAP_SECOND_PAGES; i++ )
+    {
+        pte = mfn_to_xen_entry(virt_to_mfn(domheap+i*LPAE_ENTRIES));
+        pte.pt.table = 1;
+        write_pte(&first[first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)], pte);
+    }
+
+    flush_xen_dcache_va_range(root, PAGE_SIZE);
+#ifdef CONFIG_ARM_64
+    flush_xen_dcache_va_range(first, PAGE_SIZE);
+#endif
+    flush_xen_dcache_va_range(domheap, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
+
+    per_cpu(xen_pgtable, cpu) = root;
+    per_cpu(xen_dommap, cpu) = domheap;
+
+    return 0;
 }

 /* MMU setup for secondary CPUS (which already have paging enabled) */
 void __cpuinit mmu_init_secondary_cpu(void)
 {
+    uint64_t ttbr;
+
+    /* Change to this CPU's pagetables */
+    ttbr = (uintptr_t)virt_to_maddr(this_cpu(xen_pgtable));
+    WRITE_TTBR(ttbr);
+
     /* From now on, no mapping may be both writable and executable. */
     WRITE_SYSREG32(READ_SYSREG32(SCTLR_EL2) | SCTLR_WXN, SCTLR_EL2);
     flush_xen_text_tlb();
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index bd353c8..8011987 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -220,6 +220,12 @@ void stop_cpu(void)
 /* Bring up a remote CPU */
 int __cpu_up(unsigned int cpu)
 {
+    int rc;
+
+    rc = init_secondary_pagetables(cpu);
+    if ( rc < 0 )
+        return rc;
+
     /* Tell the remote CPU which stack to boot on. */
     init_stack = idle_vcpu[cpu]->arch.stack;
diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
index 8be8563..e49aac1 100644
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -98,12 +98,16 @@
 #define EARLY_VMAP_VIRT_START  mk_unsigned_long(0x10000000)
 #define XENHEAP_VIRT_START     mk_unsigned_long(0x40000000)
 #define DOMHEAP_VIRT_START     mk_unsigned_long(0x80000000)
+#define DOMHEAP_VIRT_END       mk_unsigned_long(0xffffffff)

 #define EARLY_VMAP_VIRT_END    XENHEAP_VIRT_START
 #define HYPERVISOR_VIRT_START  XEN_VIRT_START

 #define DOMHEAP_ENTRIES        1024  /* 1024 2MB mapping slots */

+/* Number of domheap pagetable pages required at the second level (2MB mappings) */
+#define DOMHEAP_SECOND_PAGES ((DOMHEAP_VIRT_END - DOMHEAP_VIRT_START + 1) >> FIRST_SHIFT)
+
 /* Fixmap slots */
 #define FIXMAP_CONSOLE  0  /* The primary UART */
 #define FIXMAP_PT       1  /* Temporary mappings of pagetable pages */
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 4be383e..26c271e 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -138,7 +138,9 @@ extern unsigned long total_pages;

 /* Boot-time pagetable setup */
 extern void setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr);
-/* MMU setup for secondary CPUS (which already have paging enabled) */
+/* Allocate and initialise pagetables for a secondary CPU */
+extern int __cpuinit init_secondary_pagetables(int cpu);
+/* Switch secondary CPUS to its own pagetables and finalise MMU setup */
 extern void __cpuinit mmu_init_secondary_cpu(void);
 /* Second stage paging setup, to be called on all CPUs */
 extern void __cpuinit setup_virt_paging(void);
--
1.7.2.5
Tim Deegan
2013-Apr-24 12:49 UTC
Re: [PATCH V3 4/4] arm: allocate per-PCPU domheap pagetable pages
At 11:54 +0100 on 24 Apr (1366804441), Ian Campbell wrote:
> +    /* Some of these slots may have been used during start of day and/or
> +     * relocation. Make sure they are clear now. */
> +    memset(this_cpu(xen_dommap), 0, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> +    flush_xen_dcache_va_range(this_cpu(xen_dommap),
> +                              DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> +}

This is a dcache flush -- do we need flush_xen_data_tlb_range_va()
instead (or possibly as well)?  Also: do we care about dirty cachelines
from the earlier operations?  And in that case should we flush the cache
before the PTs are cleared (and again afterwards to guard against
prefetches)?

Tim (still not convinced my mental model of ARM memory is right...)
Ian Campbell
2013-Apr-24 13:04 UTC
Re: [PATCH V3 4/4] arm: allocate per-PCPU domheap pagetable pages
On Wed, 2013-04-24 at 13:49 +0100, Tim Deegan wrote:
> At 11:54 +0100 on 24 Apr (1366804441), Ian Campbell wrote:
> > +    /* Some of these slots may have been used during start of day and/or
> > +     * relocation. Make sure they are clear now. */
> > +    memset(this_cpu(xen_dommap), 0, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> > +    flush_xen_dcache_va_range(this_cpu(xen_dommap),
> > +                              DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> > +}
>
> This is a dcache flush -- do we need flush_xen_data_tlb_range_va()
> instead (or possibly as well)?

The reason for this zeroing is really for the benefit of the code in
map_domain_page which is looking at the present bits to find a free
slot. I had a crash due to it stumbling over the uninitialised memory in
the freshly allocated secondary CPU's dommap and figured I should zero
the boot CPU's one as well.

It probably doesn't actually matter that much if the MMU sees stale
entries here (in the case of the boot CPU they will be valid entries,
not invalid gunk like on the secondaries). At the point that
map_domain_page actually puts something useful in here it will do all
the right flushes (I hope!).

In other words I have a feeling even the flush which is there is not
strictly necessary. However it is consistent with write_pte, which is
effectively what the memset is doing.

> Also: do we care about dirty cachelines
> from the earlier operations?  And in that case should we flush the cache
> before the PTs are cleared (and again afterwards to guard against
> prefetches)?

I sure hope those were all flushed as part of the relocation etc. I'm
pretty sure they must have been.

> Tim (still not convinced my mental model of ARM memory is right...)
Tim Deegan
2013-Apr-24 13:11 UTC
Re: [PATCH V3 4/4] arm: allocate per-PCPU domheap pagetable pages
At 14:04 +0100 on 24 Apr (1366812242), Ian Campbell wrote:
> On Wed, 2013-04-24 at 13:49 +0100, Tim Deegan wrote:
> > At 11:54 +0100 on 24 Apr (1366804441), Ian Campbell wrote:
> > > +    /* Some of these slots may have been used during start of day and/or
> > > +     * relocation. Make sure they are clear now. */
> > > +    memset(this_cpu(xen_dommap), 0, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> > > +    flush_xen_dcache_va_range(this_cpu(xen_dommap),
> > > +                              DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> > > +}
> >
> > This is a dcache flush -- do we need flush_xen_data_tlb_range_va()
> > instead (or possibly as well)?
>
> The reason for this zeroing is really for the benefit of the code in
> map_domain_page which is looking at the present bits to find a free
> slot. I had a crash due to it stumbling over the uninitialised memory in
> the freshly allocated secondary CPU's dommap and figured I should zero
> the boot CPU's one as well.

Ah, right - this is flushing the dommap PTEs. For some reason I read it
as a flush of the domheap VAs.

> It probably doesn't actually matter that much if the MMU sees stale
> entries here (in the case of the boot CPU they will be valid entries,
> not invalid gunk like on the secondaries). At the point that
> map_domain_page actually puts something useful in here it will do all
> the right flushes (I hope!).

Yep.

> In other words I have a feeling even the flush which is there is not
> strictly necessary. However it is consistent with write_pte, which is
> effectively what the memset is doing.

Understood. And with that,

Acked-by: Tim Deegan <tim@xen.org>

Cheers,

Tim.