Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [RFC PATCH v1] Consider void entries in the P2M as 1-1 mapping.
Please see attached an RFC of the first set of patches that augments how the Xen MMU deals with PFNs that point to physical devices.

Short summary: no need to trawl through code to add VM_IO on mmap paths anymore.

Long summary: under the Xen MMU we used to distinguish two different types of PFNs in the P2M tree: a real MFN, or INVALID_P2M_ENTRY (a missing PFN, used for ballooning). If there was a device whose PCI BAR was within the P2M, we would look at the flags, and if _PAGE_IOMAP was passed we would just return the PFN without consulting the P2M. We have a patch (and some auxiliary ones for other subsystems) that sets this:

  x86: define arch_vm_get_page_prot to set _PAGE_IOMAP on VM_IO vmas

This patchset proposes a different way of doing this, with which the patch above and the other auxiliary ones will not be necessary. This approach is the one that H. Peter Anvin and Jeremy Fitzhardinge suggested.

The mechanism is to think of the void (unfilled) entries in the P2M tree structure as identity (1-1) mappings. In the past we used to think of those regions as "missing" and under the ownership of the balloon code. But the balloon code only operates on a specific region: the area past the last E820 RAM page (basically, any region past nr_pages is considered a balloon-type page). Gaps in the E820 (which are usually PCI BAR spaces) would end up as void entries and point to the "missing" pages.

This patchset instead considers the void entries as "identity", and for balloon pages you have to set the PFNs to "missing" explicitly. Since the void entries are now considered 1-1, a lookup of a PFN that falls in a large gap of the P2M space will return that same PFN. Since the E820 gaps can cross boundaries in the P2M regions (keep in mind that the P2M structure is a 3-level tree), we go through the E820 gaps and reserved E820 regions and set those to be identity. For large regions we just hook up the top (or middle) pointer to shared "identity" pages.
For smaller regions we set the MFN such that pfn_to_mfn(pfn) == pfn. For the balloon pages, the setting of the "missing" pages is mostly already done; the initial case of carving out the last E820 region for balloon ownership is augmented to also set those PFNs to missing.

This patchset is also available under git:

  git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git devel/p2m-identity

Further work: the xen/mmu.c code where it deals with _PAGE_IOMAP can be removed, but to guard against regressions or bugs let's take it one patchset at a time. We can also filter out _PAGE_IOMAP on entries that are System RAM (which is wrong); with this P2M lookup we can make that determination easily.

P.S. You might wonder why the IDENTITY_P2M_ENTRY value is not used when writing to the P2M, but only used as a flag. That is because the toolstack does not consider that value to be correct, so instead we use INVALID_P2M_ENTRY.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 01/10] xen: Make all reserved pages for the balloon be INVALID_P2M_ENTRY.
This patch prepares ourselves for the case where void entries in the P2M
tree structure do not necessarily imply that the pages are missing.
With this, we diligently set regions that will be used by the
balloon driver to be INVALID_P2M_ENTRY and under the ownership
of the balloon driver.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/setup.c  |    8 ++++++++
 drivers/xen/balloon.c |    1 +
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index b5a7f92..d984d36 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -52,6 +52,8 @@ phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
 
 static __init void xen_add_extra_mem(unsigned long pages)
 {
+        unsigned long pfn;
+
         u64 size = (u64)pages * PAGE_SIZE;
         u64 extra_start = xen_extra_mem_start + xen_extra_mem_size;
 
@@ -66,6 +68,11 @@ static __init void xen_add_extra_mem(unsigned long pages)
         xen_extra_mem_size += size;
 
         xen_max_p2m_pfn = PFN_DOWN(extra_start + size);
+
+        for (pfn = PFN_DOWN(extra_start); pfn <= xen_max_p2m_pfn; pfn++) {
+                BUG_ON(!set_phys_to_machine(pfn, INVALID_P2M_ENTRY));
+                BUG_ON(get_phys_to_machine(pfn) != INVALID_P2M_ENTRY);
+        }
 }
 
 static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
@@ -105,6 +112,7 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
                      start, end, ret);
                 if (ret == 1) {
                         set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
+                        BUG_ON(get_phys_to_machine(pfn) != INVALID_P2M_ENTRY);
                         len++;
                 }
         }
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 43f9f02..f82bb48 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -297,6 +297,7 @@ static int decrease_reservation(unsigned long nr_pages)
         for (i = 0; i < nr_pages; i++) {
                 pfn = mfn_to_pfn(frame_list[i]);
                 set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
+                BUG_ON(get_phys_to_machine(pfn) != INVALID_P2M_ENTRY);
                 balloon_append(pfn_to_page(pfn));
         }
-- 
1.7.1
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 02/10] xen/p2m: change p2m_missing_* to p2m_identity_*
From: Konrad Rzeszutek Wilk <konrad@kernel.org>

We are going to alter how we think about the P2M. Most of the P2M
contains MFNs, and areas that are not populated are considered to be
"missing". Missing means that the PFN is either not set for this guest
(the guest does not have that much memory allocated) or is under the
balloon driver's ownership. We are instead now going to think of those
non-populated areas as "identity", meaning that for a PFN for which we
would find p2m_identity, we will provide the PFN value back instead of
P2M_MISSING. Essentially we treat those regions as PFN==MFN.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/mmu.c |   59 ++++++++++++++++++++++++++-------------------------
 1 files changed, 30 insertions(+), 29 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 44924e5..d6d0276 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -209,9 +209,9 @@ unsigned long xen_max_p2m_pfn __read_mostly;
 #define MAX_P2M_PFN (P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE)
 
 /* Placeholders for holes in the address space */
-static RESERVE_BRK_ARRAY(unsigned long, p2m_missing, P2M_PER_PAGE);
-static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_missing, P2M_MID_PER_PAGE);
-static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE);
+static RESERVE_BRK_ARRAY(unsigned long, p2m_identity, P2M_PER_PAGE);
+static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_identity, P2M_MID_PER_PAGE);
+static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_identity_mfn, P2M_MID_PER_PAGE);
 
 static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE);
 static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE);
@@ -241,7 +241,7 @@ static void p2m_top_init(unsigned long ***top)
         unsigned i;
 
         for (i = 0; i < P2M_TOP_PER_PAGE; i++)
-                top[i] = p2m_mid_missing;
+                top[i] = p2m_mid_identity;
 }
 
 static void p2m_top_mfn_init(unsigned long *top)
@@ -249,7 +249,7 @@ static void p2m_top_mfn_init(unsigned long *top)
         unsigned i;
 
         for (i = 0; i < P2M_TOP_PER_PAGE; i++)
-                top[i] = virt_to_mfn(p2m_mid_missing_mfn);
+                top[i] = virt_to_mfn(p2m_mid_identity_mfn);
 }
 
 static void p2m_top_mfn_p_init(unsigned long **top)
@@ -257,7 +257,7 @@ static void p2m_top_mfn_p_init(unsigned long **top)
         unsigned i;
 
         for (i = 0; i < P2M_TOP_PER_PAGE; i++)
-                top[i] = p2m_mid_missing_mfn;
+                top[i] = p2m_mid_identity_mfn;
 }
 
 static void p2m_mid_init(unsigned long **mid)
@@ -265,7 +265,7 @@ static void p2m_mid_init(unsigned long **mid)
         unsigned i;
 
         for (i = 0; i < P2M_MID_PER_PAGE; i++)
-                mid[i] = p2m_missing;
+                mid[i] = p2m_identity;
 }
 
 static void p2m_mid_mfn_init(unsigned long *mid)
@@ -273,7 +273,7 @@ static void p2m_mid_mfn_init(unsigned long *mid)
         unsigned i;
 
         for (i = 0; i < P2M_MID_PER_PAGE; i++)
-                mid[i] = virt_to_mfn(p2m_missing);
+                mid[i] = virt_to_mfn(p2m_identity);
 }
 
 static void p2m_init(unsigned long *p2m)
@@ -300,8 +300,8 @@ void xen_build_mfn_list_list(void)
 
         /* Pre-initialize p2m_top_mfn to be completely missing */
         if (p2m_top_mfn == NULL) {
-                p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
-                p2m_mid_mfn_init(p2m_mid_missing_mfn);
+                p2m_mid_identity_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
+                p2m_mid_mfn_init(p2m_mid_identity_mfn);
 
                 p2m_top_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE);
                 p2m_top_mfn_p_init(p2m_top_mfn_p);
@@ -310,7 +310,7 @@ void xen_build_mfn_list_list(void)
                 p2m_top_mfn_init(p2m_top_mfn);
         } else {
                 /* Reinitialise, mfn's all change after migration */
-                p2m_mid_mfn_init(p2m_mid_missing_mfn);
+                p2m_mid_mfn_init(p2m_mid_identity_mfn);
         }
 
         for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) {
@@ -326,15 +326,15 @@ void xen_build_mfn_list_list(void)
                  * they're just missing, just update the stored mfn,
                  * since all could have changed over a migrate.
                  */
-                if (mid == p2m_mid_missing) {
+                if (mid == p2m_mid_identity) {
                         BUG_ON(mididx);
-                        BUG_ON(mid_mfn_p != p2m_mid_missing_mfn);
-                        p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing_mfn);
+                        BUG_ON(mid_mfn_p != p2m_mid_identity_mfn);
+                        p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_identity_mfn);
                         pfn += (P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE;
                         continue;
                 }
 
-                if (mid_mfn_p == p2m_mid_missing_mfn) {
+                if (mid_mfn_p == p2m_mid_identity_mfn) {
                         /*
                          * XXX boot-time only!  We should never find
                          * missing parts of the mfn tree after
@@ -370,11 +370,11 @@ void __init xen_build_dynamic_phys_to_machine(void)
 
         xen_max_p2m_pfn = max_pfn;
 
-        p2m_missing = extend_brk(PAGE_SIZE, PAGE_SIZE);
-        p2m_init(p2m_missing);
+        p2m_identity = extend_brk(PAGE_SIZE, PAGE_SIZE);
+        p2m_init(p2m_identity);
 
-        p2m_mid_missing = extend_brk(PAGE_SIZE, PAGE_SIZE);
-        p2m_mid_init(p2m_mid_missing);
+        p2m_mid_identity = extend_brk(PAGE_SIZE, PAGE_SIZE);
+        p2m_mid_init(p2m_mid_identity);
 
         p2m_top = extend_brk(PAGE_SIZE, PAGE_SIZE);
         p2m_top_init(p2m_top);
@@ -388,7 +388,7 @@ void __init xen_build_dynamic_phys_to_machine(void)
                 unsigned topidx = p2m_top_index(pfn);
                 unsigned mididx = p2m_mid_index(pfn);
 
-                if (p2m_top[topidx] == p2m_mid_missing) {
+                if (p2m_top[topidx] == p2m_mid_identity) {
                         unsigned long **mid = extend_brk(PAGE_SIZE, PAGE_SIZE);
                         p2m_mid_init(mid);
@@ -443,7 +443,7 @@ static bool alloc_p2m(unsigned long pfn)
         top_p = &p2m_top[topidx];
         mid = *top_p;
 
-        if (mid == p2m_mid_missing) {
+        if (mid == p2m_mid_identity) {
                 /* Mid level is missing, allocate a new one */
                 mid = alloc_p2m_page();
                 if (!mid)
@@ -451,7 +451,7 @@ static bool alloc_p2m(unsigned long pfn)
 
                 p2m_mid_init(mid);
 
-                if (cmpxchg(top_p, p2m_mid_missing, mid) != p2m_mid_missing)
+                if (cmpxchg(top_p, p2m_mid_identity, mid) != p2m_mid_identity)
                         free_p2m_page(mid);
         }
 
@@ -460,9 +460,9 @@ static bool alloc_p2m(unsigned long pfn)
 
         BUG_ON(virt_to_mfn(mid_mfn) != *top_mfn_p);
 
-        if (mid_mfn == p2m_mid_missing_mfn) {
+        if (mid_mfn == p2m_mid_identity_mfn) {
                 /* Separately check the mid mfn level */
-                unsigned long missing_mfn;
+                unsigned long identity_mfn;
                 unsigned long mid_mfn_mfn;
 
                 mid_mfn = alloc_p2m_page();
@@ -471,15 +471,16 @@ static bool alloc_p2m(unsigned long pfn)
 
                 p2m_mid_mfn_init(mid_mfn);
 
-                missing_mfn = virt_to_mfn(p2m_mid_missing_mfn);
+                identity_mfn = virt_to_mfn(p2m_mid_identity_mfn);
                 mid_mfn_mfn = virt_to_mfn(mid_mfn);
-                if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn)
+                if (cmpxchg(top_mfn_p, identity_mfn, mid_mfn_mfn) !=
+                        identity_mfn)
                         free_p2m_page(mid_mfn);
                 else
                         p2m_top_mfn_p[topidx] = mid_mfn;
         }
 
-        if (p2m_top[topidx][mididx] == p2m_missing) {
+        if (p2m_top[topidx][mididx] == p2m_identity) {
                 /* p2m leaf page is missing */
                 unsigned long *p2m;
 
@@ -489,7 +490,7 @@ static bool alloc_p2m(unsigned long pfn)
 
                 p2m_init(p2m);
 
-                if (cmpxchg(&mid[mididx], p2m_missing, p2m) != p2m_missing)
+                if (cmpxchg(&mid[mididx], p2m_identity, p2m) != p2m_identity)
                         free_p2m_page(p2m);
                 else
                         mid_mfn[mididx] = virt_to_mfn(p2m);
@@ -512,7 +513,7 @@ bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn)
         mididx = p2m_mid_index(pfn);
         idx = p2m_index(pfn);
 
-        if (p2m_top[topidx][mididx] == p2m_missing)
+        if (p2m_top[topidx][mididx] == p2m_identity)
                 return mfn == INVALID_P2M_ENTRY;
 
         p2m_top[topidx][mididx][idx] = mfn;
-- 
1.7.1
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 03/10] xen/mmu: Add the notion of IDENTITY_P2M_ENTRY.
Our P2M structure is a three-level tree. In the leaf nodes we set the
Machine Frame Number (MFN) of the PFN. What this means is that
pfn_to_mfn(pfn), which is used when creating PTE entries, gives you the
real MFN of the hardware. When Xen sets up a guest, it initially
populates an array with descending MFN values, like so:

  idx: 0, 1, 2
  [0x290F, 0x290E, 0x290D, ..]

so pfn_to_mfn(2) == 0x290D. If you start and restart many guests, that
list starts looking quite random. Anyhow, we graft this structure onto
our P2M tree and stick those MFNs in the leaves. But for all other leaf
entries, or for the top (root) or middle ones, for which there is no
corresponding MFN, we assume that the MFN is the PFN. In other words,
for example: pfn_to_mfn(0xc0000) == 0xc0000.

Note that this is a departure from how the P2M previously worked. In the
past it would give you INVALID_P2M_ENTRY, i.e.
pfn_to_mfn(0xc0000) == INVALID_P2M_ENTRY.

The benefit of this is that for non-RAM regions (think PCI BARs or ACPI
spaces) we can create mappings easily, because the PFN value matches the
MFN.

However, ballooning must be considered. Ballooning requires that we keep
track of INVALID_P2M_ENTRY, or even set leaf entries to this value.
Better yet, there might be huge regions (say, 8MB at the end of the E820
RAM region) for which we need to set INVALID_P2M_ENTRY. For that we
introduce two new pages: p2m_missing and p2m_mid_missing. All entries in
p2m_missing are of INVALID_P2M_ENTRY type, and all entries in
p2m_mid_missing point to p2m_missing. Whenever we detect that we need to
set INVALID_P2M_ENTRY for large areas, we swap in those p2m_missing
and/or p2m_mid_missing pages.

P.S. We cannot set IDENTITY_P2M_ENTRY in the P2M tree. This is because
the toolstack considers that value invalid and would abort during
migration of the PV guest.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/include/asm/xen/page.h |    1 +
 arch/x86/xen/mmu.c              |   61 +++++++++++++++++++++++++++++++++++----
 2 files changed, 56 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index 8760cc6..4b0ee16 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -29,6 +29,7 @@ typedef struct xpaddr {
 
 /**** MACHINE <-> PHYSICAL CONVERSION MACROS ****/
 #define INVALID_P2M_ENTRY (~0UL)
+#define IDENTITY_P2M_ENTRY (0UL)
 #define FOREIGN_FRAME_BIT (1UL<<31)
 #define FOREIGN_FRAME(m) ((m) | FOREIGN_FRAME_BIT)
 
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index d6d0276..4ba7e4e 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -217,6 +217,9 @@ static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE);
 static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE);
 static RESERVE_BRK_ARRAY(unsigned long *, p2m_top_mfn_p, P2M_TOP_PER_PAGE);
 
+static RESERVE_BRK_ARRAY(unsigned long, p2m_missing, P2M_PER_PAGE);
+static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_missing, P2M_MID_PER_PAGE);
+
 RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
 RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
 
@@ -260,12 +263,12 @@ static void p2m_top_mfn_p_init(unsigned long **top)
                 top[i] = p2m_mid_identity_mfn;
 }
 
-static void p2m_mid_init(unsigned long **mid)
+static void p2m_mid_init(unsigned long **mid, unsigned long *ptr)
 {
         unsigned i;
 
         for (i = 0; i < P2M_MID_PER_PAGE; i++)
-                mid[i] = p2m_identity;
+                mid[i] = ptr;
 }
 
 static void p2m_mid_mfn_init(unsigned long *mid)
@@ -374,11 +377,16 @@ void __init xen_build_dynamic_phys_to_machine(void)
         p2m_init(p2m_identity);
 
         p2m_mid_identity = extend_brk(PAGE_SIZE, PAGE_SIZE);
-        p2m_mid_init(p2m_mid_identity);
+        p2m_mid_init(p2m_mid_identity, p2m_identity);
 
         p2m_top = extend_brk(PAGE_SIZE, PAGE_SIZE);
         p2m_top_init(p2m_top);
 
+        p2m_missing = extend_brk(PAGE_SIZE, PAGE_SIZE);
+        p2m_init(p2m_missing);
+
+        p2m_mid_missing = extend_brk(PAGE_SIZE, PAGE_SIZE);
+        p2m_mid_init(p2m_mid_missing, p2m_missing);
+
         /*
          * The domain builder gives us a pre-constructed p2m array in
          * mfn_list for all the pages initially given to us, so we just
@@ -390,7 +398,7 @@ void __init xen_build_dynamic_phys_to_machine(void)
                 unsigned topidx = p2m_top_index(pfn);
                 unsigned mididx = p2m_mid_index(pfn);
 
                 if (p2m_top[topidx] == p2m_mid_identity) {
                         unsigned long **mid = extend_brk(PAGE_SIZE, PAGE_SIZE);
-                        p2m_mid_init(mid);
+                        p2m_mid_init(mid, p2m_identity);
 
                         p2m_top[topidx] = mid;
                 }
@@ -410,6 +418,28 @@ unsigned long get_phys_to_machine(unsigned long pfn)
         mididx = p2m_mid_index(pfn);
         idx = p2m_index(pfn);
 
+        /*
+         * The INVALID_P2M_ENTRY is filled in both p2m_*identity
+         * and in p2m_*missing, so returning the INVALID_P2M_ENTRY
+         * would be wrong.
+         */
+        if (p2m_top[topidx] == p2m_mid_identity)
+                return pfn;
+
+        if (p2m_top[topidx][mididx] == p2m_identity)
+                return pfn;
+
+#if 0
+        /*
+         * These are superfluous. The p2m_missing and p2m_mid_missing
+         * both contain INVALID_P2M_ENTRY values. But this is correct
+         * and can help in understanding this code.
+         */
+        if (p2m_top[topidx] == p2m_mid_missing)
+                return INVALID_P2M_ENTRY;
+
+        if (p2m_top[topidx][mididx] == p2m_missing)
+                return INVALID_P2M_ENTRY;
+#endif
         return p2m_top[topidx][mididx][idx];
 }
 EXPORT_SYMBOL_GPL(get_phys_to_machine);
@@ -449,7 +479,7 @@ static bool alloc_p2m(unsigned long pfn)
                 if (!mid)
                         return false;
 
-                p2m_mid_init(mid);
+                p2m_mid_init(mid, p2m_identity);
 
                 if (cmpxchg(top_p, p2m_mid_identity, mid) != p2m_mid_identity)
                         free_p2m_page(mid);
@@ -513,9 +543,28 @@ bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn)
         mididx = p2m_mid_index(pfn);
         idx = p2m_index(pfn);
 
-        if (p2m_top[topidx][mididx] == p2m_identity)
+        if (mfn == INVALID_P2M_ENTRY) {
+                /* If it is INVALID, swap over.. */
+                if (p2m_top[topidx] == p2m_mid_identity) {
+                        p2m_top[topidx] = p2m_mid_missing;
+                        return 1;
+                }
+                if (p2m_top[topidx][mididx] == p2m_identity) {
+                        p2m_top[topidx][mididx] = p2m_missing;
+                        return 1;
+                }
+        }
+
+        /* And the result of the above swap over.. */
+        if (p2m_top[topidx][mididx] == p2m_missing)
                 return mfn == INVALID_P2M_ENTRY;
 
+        /* For sparse holes where the p2m leaf has real PFNs along with
+         * PCI holes, stick in the PFN as the MFN value; do not pass
+         * in the IDENTITY_P2M_ENTRY state - that value cannot be saved.
+         */
+        BUG_ON(mfn == IDENTITY_P2M_ENTRY);
+
         p2m_top[topidx][mididx][idx] = mfn;
 
         return true;
-- 
1.7.1
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 04/10] xen/mmu: For 1-1 mapping, automatically set _PAGE_IOMAP.
This means that for PFNs (specifically: those in any E820 gaps or
non-RAM E820 regions) that have a 1-1 mapping, we set the _PAGE_IOMAP
flag.

Later on we could remove the _PAGE_IOMAP code handling, but for right
now let's keep this in so as not to introduce any bisection failures
across this patchset.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/mmu.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 4ba7e4e..bd02e7d 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -832,6 +832,9 @@ static pteval_t pte_pfn_to_mfn(pteval_t val)
                 pteval_t flags = val & PTE_FLAGS_MASK;
                 unsigned long mfn = pfn_to_mfn(pfn);
 
+                if (mfn == pfn)
+                        flags |= _PAGE_IOMAP;
+
                 /*
                  * If there's no mfn for the pfn, then just create an
                  * empty non-present pte.  Unfortunately this loses
-- 
1.7.1
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 05/10] xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.
For all regions that are not considered RAM, we do not necessarily have
to set the P2M, as any P2M top branch (covering 134217728 pages, on
64-bit) or middle branch (covering 262144 pages, on 64-bit) is assumed
to be identity, meaning pfn_to_mfn(pfn)==pfn. However, not all E820 gaps
are that large, so for smaller gaps and for boundary conditions we fill
out the P2M mapping with the identity mapping.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/setup.c |   34 ++++++++++++++++++++++++++++++++++
 1 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index d984d36..752c865 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -146,6 +146,34 @@ static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
         return released;
 }
 
+static unsigned long __init xen_set_identity(const struct e820map *e820)
+{
+        phys_addr_t last = 0;
+        int i;
+        unsigned long identity = 0;
+        unsigned long pfn;
+
+        for (i = 0; i < e820->nr_map; i++) {
+                phys_addr_t start = e820->map[i].addr;
+                phys_addr_t end = start + e820->map[i].size;
+
+                if (end < start)
+                        continue;
+
+                if (e820->map[i].type != E820_RAM) {
+                        for (pfn = PFN_UP(start); pfn < PFN_DOWN(end); pfn++)
+                                set_phys_to_machine(pfn, pfn);
+                        identity += pfn - PFN_UP(start);
+                }
+                if (start > last && ((start - last) > 0)) {
+                        for (pfn = PFN_UP(last); pfn < PFN_DOWN(start); pfn++)
+                                set_phys_to_machine(pfn, pfn);
+                        identity += pfn - PFN_UP(last);
+                }
+                last = end;
+        }
+        return identity;
+}
 /**
  * machine_specific_memory_setup - Hook for machine specific memory setup.
  **/
@@ -254,6 +282,12 @@ char * __init xen_memory_setup(void)
 
         xen_add_extra_mem(extra_pages);
 
+        /*
+         * Set P2M for all non-RAM pages and E820 gaps to be identity
+         * type PFNs.
+         */
+        printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n",
+                xen_set_identity(&e820));
 
         return "Xen";
 }
-- 
1.7.1
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 06/10] xen/setup: Only set identity mapping in E820 regions when privileged.
We do not want to set the identity mapping on E820 reserved regions when
running as a PV guest. This is because the 0->ISA_END_ADDRESS region
would be considered identity, and we would try to read DMI information
and fail (since pfn_to_mfn(pfn)==pfn) under PV guests.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/setup.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 752c865..34fb906 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -160,7 +160,7 @@ static unsigned long __init xen_set_identity(const struct e820map *e820)
                 if (end < start)
                         continue;
 
-                if (e820->map[i].type != E820_RAM) {
+                if (xen_initial_domain() && e820->map[i].type != E820_RAM) {
                         for (pfn = PFN_UP(start); pfn < PFN_DOWN(end); pfn++)
                                 set_phys_to_machine(pfn, pfn);
                         identity += pfn - PFN_UP(start);
-- 
1.7.1
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 07/10] xen/mmu: Work with 1-1 mappings when allocating new top/middle entries.
We were not properly taking the 1-1 mappings into account when a large
area of memory was ballooned out.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/mmu.c |   13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index bd02e7d..92f4fec 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -464,7 +464,7 @@ static void free_p2m_page(void *p)
 static bool alloc_p2m(unsigned long pfn)
 {
         unsigned topidx, mididx;
-        unsigned long ***top_p, **mid;
+        unsigned long ***top_p, **mid, **mid_orig;
         unsigned long *top_mfn_p, *mid_mfn;
 
         topidx = p2m_top_index(pfn);
@@ -473,15 +473,16 @@ static bool alloc_p2m(unsigned long pfn)
         top_p = &p2m_top[topidx];
         mid = *top_p;
 
-        if (mid == p2m_mid_identity) {
+        if (mid == p2m_mid_identity || mid == p2m_mid_missing) {
                 /* Mid level is missing, allocate a new one */
+                mid_orig = mid;
                 mid = alloc_p2m_page();
                 if (!mid)
                         return false;
 
                 p2m_mid_init(mid, p2m_identity);
 
-                if (cmpxchg(top_p, p2m_mid_identity, mid) != p2m_mid_identity)
+                if (cmpxchg(top_p, mid_orig, mid) != mid_orig)
                         free_p2m_page(mid);
         }
 
@@ -510,9 +511,11 @@ static bool alloc_p2m(unsigned long pfn)
                 p2m_top_mfn_p[topidx] = mid_mfn;
         }
 
-        if (p2m_top[topidx][mididx] == p2m_identity) {
+        if (p2m_top[topidx][mididx] == p2m_identity ||
+            p2m_top[topidx][mididx] == p2m_missing) {
                 /* p2m leaf page is missing */
                 unsigned long *p2m;
+                unsigned long *p2m_orig = p2m_top[topidx][mididx];
 
                 p2m = alloc_p2m_page();
                 if (!p2m)
@@ -520,7 +523,7 @@ static bool alloc_p2m(unsigned long pfn)
 
                 p2m_init(p2m);
 
-                if (cmpxchg(&mid[mididx], p2m_identity, p2m) != p2m_identity)
+                if (cmpxchg(&mid[mididx], p2m_orig, p2m) != p2m_orig)
                         free_p2m_page(p2m);
                 else
                         mid_mfn[mididx] = virt_to_mfn(p2m);
-- 
1.7.1
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 08/10] xen/mmu: Bugfix. Fill the top entry page with appropriate middle layer pointers.
If we swapped over from p2m_mid_identity to p2m_mid_missing (in an
earlier call to set_phys_to_machine) and then started going through the
PFNs in descending order to program a new MFN (the balloon worker), we
would end up in this code path. At that point we would set up a new page
filled with pointers to p2m_identity instead of p2m_missing. This had
the disastrous effect that get_phys_to_machine on that PFN would return
a 1-1 mapping instead of INVALID_P2M_ENTRY, resulting in hitting a BUG
check in the balloon driver.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/mmu.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 92f4fec..a917439 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -480,7 +480,10 @@ static bool alloc_p2m(unsigned long pfn)
                 if (!mid)
                         return false;
 
-                p2m_mid_init(mid, p2m_identity);
+                if (mid_orig == p2m_mid_identity)
+                        p2m_mid_init(mid, p2m_identity);
+                else
+                        p2m_mid_init(mid, p2m_missing);
 
                 if (cmpxchg(top_p, mid_orig, mid) != mid_orig)
                         free_p2m_page(mid);
-- 
1.7.1
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 09/10] xen/mmu: Be aware of p2m_[mid_|]missing when saving/restore.
We did not consider the 1-1 mappings, and during restore we would not
properly deal with them.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/mmu.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index a917439..b2b8733 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -329,14 +329,13 @@ void xen_build_mfn_list_list(void)
                  * they're just missing, just update the stored mfn,
                  * since all could have changed over a migrate.
                  */
-                if (mid == p2m_mid_identity) {
+                if (mid == p2m_mid_identity || mid == p2m_mid_missing) {
                         BUG_ON(mididx);
                         BUG_ON(mid_mfn_p != p2m_mid_identity_mfn);
                         p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_identity_mfn);
                         pfn += (P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE;
                         continue;
                 }
-
                 if (mid_mfn_p == p2m_mid_identity_mfn) {
                         /*
                          * XXX boot-time only!  We should never find
-- 
1.7.1
Konrad Rzeszutek Wilk
2010-Dec-21 21:37 UTC
[Xen-devel] [PATCH 10/10] xen/mmu: Warn against races.
The initial bootup code uses set_phys_to_machine quite a lot; after
bootup it is mostly the balloon driver that does. The balloon driver
does hold a mutex, so this should not be necessary - but just in case,
add a warning if we do hit this scenario.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/mmu.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index b2b8733..86d3bd3 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -549,13 +549,15 @@ bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn)
         idx = p2m_index(pfn);
 
         if (mfn == INVALID_P2M_ENTRY) {
-                /* If it is INVALID, swap over.. */
+                /* Balloon lock should inhibit racing, but just in case. */
                 if (p2m_top[topidx] == p2m_mid_identity) {
-                        p2m_top[topidx] = p2m_mid_missing;
+                        WARN_ON(cmpxchg(&p2m_top[topidx], p2m_mid_identity,
+                                p2m_mid_missing) != p2m_mid_identity);
                         return 1;
                 }
                 if (p2m_top[topidx][mididx] == p2m_identity) {
-                        p2m_top[topidx][mididx] = p2m_missing;
+                        WARN_ON(cmpxchg(&p2m_top[topidx][mididx], p2m_identity,
+                                p2m_missing) != p2m_identity);
                         return 1;
                 }
         }
-- 
1.7.1
Jeremy Fitzhardinge
2010-Dec-21 22:19 UTC
[Xen-devel] Re: [PATCH 01/10] xen: Make all reserved pages for the balloon be INVALID_P2M_ENTRY.
On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
> This patch prepares ourselves for the case where void entries in the P2M
> tree structure do not necessarily imply that the pages are missing.
> With this, we diligently set regions that will be used by the
> balloon driver to be INVALID_P2M_ENTRY and under the ownership
> of the balloon driver.
>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  arch/x86/xen/setup.c  |    8 ++++++++
>  drivers/xen/balloon.c |    1 +
>  2 files changed, 9 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index b5a7f92..d984d36 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -52,6 +52,8 @@ phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
>
>  static __init void xen_add_extra_mem(unsigned long pages)
>  {
> +        unsigned long pfn;
> +
>          u64 size = (u64)pages * PAGE_SIZE;
>          u64 extra_start = xen_extra_mem_start + xen_extra_mem_size;
>
> @@ -66,6 +68,11 @@ static __init void xen_add_extra_mem(unsigned long pages)
>          xen_extra_mem_size += size;
>
>          xen_max_p2m_pfn = PFN_DOWN(extra_start + size);
> +
> +        for (pfn = PFN_DOWN(extra_start); pfn <= xen_max_p2m_pfn; pfn++) {
> +                BUG_ON(!set_phys_to_machine(pfn, INVALID_P2M_ENTRY));

Use __set_phys_to_machine where you don't expect (or can't allow) any
allocation. Also, I'm not a fan of hiding real side-effectful code in a
BUG_ON predicate.

> +                BUG_ON(get_phys_to_machine(pfn) != INVALID_P2M_ENTRY);
> +        }
>  }
>
>  static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
> @@ -105,6 +112,7 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
>                       start, end, ret);
>          if (ret == 1) {
>                  set_phys_to_machine(pfn, INVALID_P2M_ENTRY);

Ditto on this one (I guess I forgot, or it predates the existence of
__set_phys_to_machine).

> +                BUG_ON(get_phys_to_machine(pfn) != INVALID_P2M_ENTRY);
>                  len++;
>          }
>  }
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 43f9f02..f82bb48 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -297,6 +297,7 @@ static int decrease_reservation(unsigned long nr_pages)
>          for (i = 0; i < nr_pages; i++) {
>                  pfn = mfn_to_pfn(frame_list[i]);
>                  set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
> +                BUG_ON(get_phys_to_machine(pfn) != INVALID_P2M_ENTRY);
>                  balloon_append(pfn_to_page(pfn));
>          }
Jeremy Fitzhardinge
2010-Dec-21 22:29 UTC
[Xen-devel] Re: [PATCH 04/10] xen/mmu: For 1-1 mapping, automatically set _PAGE_IOMAP.
On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:> This means that for PFNs (specifically: those in any E820 gaps > or non-RAM E820 regions) that have 1-1 mapping we set the > _PAGE_IOMAP flag. > > Later on we could remove the _PAGE_IOMAP code handling, but > for right now lets keep this in to not introduce any bisection > failures across this patchset. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/mmu.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index 4ba7e4e..bd02e7d 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -832,6 +832,9 @@ static pteval_t pte_pfn_to_mfn(pteval_t val) > pteval_t flags = val & PTE_FLAGS_MASK; > unsigned long mfn = pfn_to_mfn(pfn); > > + if (mfn == pfn) > + flags |= _PAGE_IOMAP;Why? Does it really make sense to set _PAGE_IOMAP if they just happen to be the same value? J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Dec-21 22:34 UTC
[Xen-devel] Re: [PATCH 05/10] xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.
On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:> For all regions that are not considered RAM, we do not > necessarily have to set the P2M, as it assumes that any > P2M top branch (so covering 134217728 pages, on 64-bit) or > middle branch (so covering 262144 pages, on 64-bit) are identity. > Meaning pfn_to_mfn(pfn)==pfn. However, not all E820 gaps are > that large, so for smaller and for boundary conditions we fill > out the P2M mapping with the identity mapping. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/setup.c | 34 ++++++++++++++++++++++++++++++++++ > 1 files changed, 34 insertions(+), 0 deletions(-) > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > index d984d36..752c865 100644 > --- a/arch/x86/xen/setup.c > +++ b/arch/x86/xen/setup.c > @@ -146,6 +146,34 @@ static unsigned long __init xen_return_unused_memory(unsigned long max_pfn, > return released; > } > > +static unsigned long __init xen_set_identity(const struct e820map *e820) > +{ > + phys_addr_t last = 0; > + int i; > + unsigned long identity = 0; > + unsigned long pfn; > + > + for (i = 0; i < e820->nr_map; i++) { > + phys_addr_t start = e820->map[i].addr; > + phys_addr_t end = start + e820->map[i].size; > + > + if (end < start) > + continue; > + > + if (e820->map[i].type != E820_RAM) { > + for (pfn = PFN_UP(start); pfn < PFN_DOWN(end); pfn++) > + set_phys_to_machine(pfn, pfn); > + identity += pfn - PFN_UP(start); > + } > + if (start > last && ((start - last) > 0)) { > + for (pfn = PFN_UP(last); pfn < PFN_DOWN(start); pfn++) > + set_phys_to_machine(pfn, pfn); > + identity += pfn - PFN_UP(last); > + }Couldn''t you just do something like: if (e820->map[i].type != E820_RAM) continue; for (pfn = PFN_UP(last); pfn < PFN_DOWN(start); pfn++) set_phys_to_machine(pfn, pfn); identity += pfn - PFN_UP(last); last = end; ie, handle the hole and non-RAM cases together? Also, what happens with the p2m tree mid layers in this? 
If you''re doing page-by-page set_phys_to_machine, won''t it end up allocating them all? How can you optimise the "large chunks of address space are identity" case? It would probably be cleanest to have a set_ident_phys_to_machine(start, end) function which can do all that. J> + last = end; > + } > + return identity; > +} > /** > * machine_specific_memory_setup - Hook for machine specific memory setup. > **/ > @@ -254,6 +282,12 @@ char * __init xen_memory_setup(void) > > xen_add_extra_mem(extra_pages); > > + /* > + * Set P2M for all non-RAM pages and E820 gaps to be identity > + * type PFNs. > + */ > + printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", > + xen_set_identity(&e820)); > return "Xen"; > } >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Dec-21 22:37 UTC
[Xen-devel] Re: [PATCH 06/10] xen/setup: Only set identity mapping in E820 regions when privileged.
On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:> We do not want to set the identity mapping on E820 reserved > regions when running as PV. This is b/c the 0->ISA_END_ADDRESS region > would be considered identity and we would try to read DMI information > and fail (since the pfn_to_mfn(mfn)==pfn) under PV guests. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/setup.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > index 752c865..34fb906 100644 > --- a/arch/x86/xen/setup.c > +++ b/arch/x86/xen/setup.c > @@ -160,7 +160,7 @@ static unsigned long __init xen_set_identity(const struct e820map *e820) > if (end < start) > continue; > > - if (e820->map[i].type != E820_RAM) { > + if (xen_initial_domain() && e820->map[i].type != E820_RAM) { > for (pfn = PFN_UP(start); pfn < PFN_DOWN(end); pfn++) > set_phys_to_machine(pfn, pfn); > identity += pfn - PFN_UP(start);So you''re relying on the fact that the ISA area is the only non-RAM e820 entry in domU? I think it would be better to do a specific exclusion for the ISA area rather than this. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Dec-21 22:37 UTC
[Xen-devel] Re: [PATCH 07/10] xen/mmu: Work with 1-1 mappings when allocating new top/middle entries.
On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:> We were not properly taking under advisement the 1-1 mappings > when a large area of memory was ballooned out.Could you expand on this? What does it mean? J> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/mmu.c | 13 ++++++++----- > 1 files changed, 8 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index bd02e7d..92f4fec 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -464,7 +464,7 @@ static void free_p2m_page(void *p) > static bool alloc_p2m(unsigned long pfn) > { > unsigned topidx, mididx; > - unsigned long ***top_p, **mid; > + unsigned long ***top_p, **mid, **mid_orig; > unsigned long *top_mfn_p, *mid_mfn; > > topidx = p2m_top_index(pfn); > @@ -473,15 +473,16 @@ static bool alloc_p2m(unsigned long pfn) > top_p = &p2m_top[topidx]; > mid = *top_p; > > - if (mid == p2m_mid_identity) { > + if (mid == p2m_mid_identity || mid == p2m_mid_missing) { > /* Mid level is missing, allocate a new one */ > + mid_orig = mid; > mid = alloc_p2m_page(); > if (!mid) > return false; > > p2m_mid_init(mid, p2m_identity); > > - if (cmpxchg(top_p, p2m_mid_identity, mid) != p2m_mid_identity) > + if (cmpxchg(top_p, mid_orig, mid) != mid_orig) > free_p2m_page(mid); > } > > @@ -510,9 +511,11 @@ static bool alloc_p2m(unsigned long pfn) > p2m_top_mfn_p[topidx] = mid_mfn; > } > > - if (p2m_top[topidx][mididx] == p2m_identity) { > + if (p2m_top[topidx][mididx] == p2m_identity || > + p2m_top[topidx][mididx] == p2m_missing) { > /* p2m leaf page is missing */ > unsigned long *p2m; > + unsigned long *p2m_orig = p2m_top[topidx][mididx]; > > p2m = alloc_p2m_page(); > if (!p2m) > @@ -520,7 +523,7 @@ static bool alloc_p2m(unsigned long pfn) > > p2m_init(p2m); > > - if (cmpxchg(&mid[mididx], p2m_identity, p2m) != p2m_identity) > + if (cmpxchg(&mid[mididx], p2m_orig, p2m) != p2m_orig) > free_p2m_page(p2m); > else > mid_mfn[mididx] = 
virt_to_mfn(p2m);
Jeremy Fitzhardinge
2010-Dec-21 22:38 UTC
[Xen-devel] Re: [PATCH 08/10] xen/mmu: Bugfix. Fill the top entry page with appropriate middle layer pointers.
On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:> If we swapped over from using an p2m_mid_identical to p2m_mid_missing > (earlier call to set_phys_to_machine) and then started going through the > PFNs in descending order to program a new MFN (balloon worker), we would > end up in this code path. At that point we would set up new page filled with > pointers to p2m_identity instead of p2m_missing. This had the disastrous > effect that get_phys_to_machine on that PFN would return an 1-1 mapping > instead of INVALID_P2M_ENTRY resulting in hitting a BUG check in balloon driver. >Are you going to fold this into the appropriate patch later?> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/mmu.c | 5 ++++- > 1 files changed, 4 insertions(+), 1 deletions(-) > > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index 92f4fec..a917439 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -480,7 +480,10 @@ static bool alloc_p2m(unsigned long pfn) > if (!mid) > return false; > > - p2m_mid_init(mid, p2m_identity); > + if (mid == p2m_mid_identity) > + p2m_mid_init(mid, p2m_identity); > + else > + p2m_mid_init(mid, p2m_missing); > > if (cmpxchg(top_p, mid_orig, mid) != mid_orig) > free_p2m_page(mid);_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Dec-21 22:41 UTC
[Xen-devel] Re: [PATCH 02/10] xen/p2m: change p2m_missing_* to p2m_identity_*
On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:> From: Konrad Rzeszutek Wilk <konrad@kernel.org> > > We are going to alter how we think about P2M. Most of the > P2M contains MFN, and areas that are not populated are > considered to be "missing". Missing means that the PFN > is either not set for this guest (not have that much memory > allocated) or is under the balloon driver ownership. > > We are instead now going to think of those not populated > areas as "identity." Meaning that that the PFN for which > we would get the p2m_identity we will provide the the PFN > value back instead of P2M_MISSING. Essentially treating > those regions as PFN==MFN. >This renames missing -> identity, but does it actually change the functionality? Doesn''t it just leave it being misnamed? It would probably be better to fold in the actual identity implementation as well.> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/mmu.c | 59 ++++++++++++++++++++++++++------------------------- > 1 files changed, 30 insertions(+), 29 deletions(-) > > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index 44924e5..d6d0276 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -209,9 +209,9 @@ unsigned long xen_max_p2m_pfn __read_mostly; > #define MAX_P2M_PFN (P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE) > > /* Placeholders for holes in the address space */ > -static RESERVE_BRK_ARRAY(unsigned long, p2m_missing, P2M_PER_PAGE); > -static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_missing, P2M_MID_PER_PAGE); > -static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE); > +static RESERVE_BRK_ARRAY(unsigned long, p2m_identity, P2M_PER_PAGE); > +static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_identity, P2M_MID_PER_PAGE); > +static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_identity_mfn, P2M_MID_PER_PAGE); > > static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE); > static RESERVE_BRK_ARRAY(unsigned long, 
p2m_top_mfn, P2M_TOP_PER_PAGE); > @@ -241,7 +241,7 @@ static void p2m_top_init(unsigned long ***top) > unsigned i; > > for (i = 0; i < P2M_TOP_PER_PAGE; i++) > - top[i] = p2m_mid_missing; > + top[i] = p2m_mid_identity; > } > > static void p2m_top_mfn_init(unsigned long *top) > @@ -249,7 +249,7 @@ static void p2m_top_mfn_init(unsigned long *top) > unsigned i; > > for (i = 0; i < P2M_TOP_PER_PAGE; i++) > - top[i] = virt_to_mfn(p2m_mid_missing_mfn); > + top[i] = virt_to_mfn(p2m_mid_identity_mfn); > } > > static void p2m_top_mfn_p_init(unsigned long **top) > @@ -257,7 +257,7 @@ static void p2m_top_mfn_p_init(unsigned long **top) > unsigned i; > > for (i = 0; i < P2M_TOP_PER_PAGE; i++) > - top[i] = p2m_mid_missing_mfn; > + top[i] = p2m_mid_identity_mfn; > } > > static void p2m_mid_init(unsigned long **mid) > @@ -265,7 +265,7 @@ static void p2m_mid_init(unsigned long **mid) > unsigned i; > > for (i = 0; i < P2M_MID_PER_PAGE; i++) > - mid[i] = p2m_missing; > + mid[i] = p2m_identity; > } > > static void p2m_mid_mfn_init(unsigned long *mid) > @@ -273,7 +273,7 @@ static void p2m_mid_mfn_init(unsigned long *mid) > unsigned i; > > for (i = 0; i < P2M_MID_PER_PAGE; i++) > - mid[i] = virt_to_mfn(p2m_missing); > + mid[i] = virt_to_mfn(p2m_identity); > } > > static void p2m_init(unsigned long *p2m) > @@ -300,8 +300,8 @@ void xen_build_mfn_list_list(void) > > /* Pre-initialize p2m_top_mfn to be completely missing */ > if (p2m_top_mfn == NULL) { > - p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); > - p2m_mid_mfn_init(p2m_mid_missing_mfn); > + p2m_mid_identity_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); > + p2m_mid_mfn_init(p2m_mid_identity_mfn); > > p2m_top_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE); > p2m_top_mfn_p_init(p2m_top_mfn_p); > @@ -310,7 +310,7 @@ void xen_build_mfn_list_list(void) > p2m_top_mfn_init(p2m_top_mfn); > } else { > /* Reinitialise, mfn''s all change after migration */ > - p2m_mid_mfn_init(p2m_mid_missing_mfn); > + p2m_mid_mfn_init(p2m_mid_identity_mfn); > } 
> > for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) { > @@ -326,15 +326,15 @@ void xen_build_mfn_list_list(void) > * they''re just missing, just update the stored mfn, > * since all could have changed over a migrate. > */ > - if (mid == p2m_mid_missing) { > + if (mid == p2m_mid_identity) { > BUG_ON(mididx); > - BUG_ON(mid_mfn_p != p2m_mid_missing_mfn); > - p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing_mfn); > + BUG_ON(mid_mfn_p != p2m_mid_identity_mfn); > + p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_identity_mfn); > pfn += (P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE; > continue; > } > > - if (mid_mfn_p == p2m_mid_missing_mfn) { > + if (mid_mfn_p == p2m_mid_identity_mfn) { > /* > * XXX boot-time only! We should never find > * missing parts of the mfn tree after > @@ -370,11 +370,11 @@ void __init xen_build_dynamic_phys_to_machine(void) > > xen_max_p2m_pfn = max_pfn; > > - p2m_missing = extend_brk(PAGE_SIZE, PAGE_SIZE); > - p2m_init(p2m_missing); > + p2m_identity = extend_brk(PAGE_SIZE, PAGE_SIZE); > + p2m_init(p2m_identity); > > - p2m_mid_missing = extend_brk(PAGE_SIZE, PAGE_SIZE); > - p2m_mid_init(p2m_mid_missing); > + p2m_mid_identity = extend_brk(PAGE_SIZE, PAGE_SIZE); > + p2m_mid_init(p2m_mid_identity); > > p2m_top = extend_brk(PAGE_SIZE, PAGE_SIZE); > p2m_top_init(p2m_top); > @@ -388,7 +388,7 @@ void __init xen_build_dynamic_phys_to_machine(void) > unsigned topidx = p2m_top_index(pfn); > unsigned mididx = p2m_mid_index(pfn); > > - if (p2m_top[topidx] == p2m_mid_missing) { > + if (p2m_top[topidx] == p2m_mid_identity) { > unsigned long **mid = extend_brk(PAGE_SIZE, PAGE_SIZE); > p2m_mid_init(mid); > > @@ -443,7 +443,7 @@ static bool alloc_p2m(unsigned long pfn) > top_p = &p2m_top[topidx]; > mid = *top_p; > > - if (mid == p2m_mid_missing) { > + if (mid == p2m_mid_identity) { > /* Mid level is missing, allocate a new one */ > mid = alloc_p2m_page(); > if (!mid) > @@ -451,7 +451,7 @@ static bool alloc_p2m(unsigned long pfn) > > p2m_mid_init(mid); > > - if 
(cmpxchg(top_p, p2m_mid_missing, mid) != p2m_mid_missing) > + if (cmpxchg(top_p, p2m_mid_identity, mid) != p2m_mid_identity) > free_p2m_page(mid); > } > > @@ -460,9 +460,9 @@ static bool alloc_p2m(unsigned long pfn) > > BUG_ON(virt_to_mfn(mid_mfn) != *top_mfn_p); > > - if (mid_mfn == p2m_mid_missing_mfn) { > + if (mid_mfn == p2m_mid_identity_mfn) { > /* Separately check the mid mfn level */ > - unsigned long missing_mfn; > + unsigned long identity_mfn; > unsigned long mid_mfn_mfn; > > mid_mfn = alloc_p2m_page(); > @@ -471,15 +471,16 @@ static bool alloc_p2m(unsigned long pfn) > > p2m_mid_mfn_init(mid_mfn); > > - missing_mfn = virt_to_mfn(p2m_mid_missing_mfn); > + identity_mfn = virt_to_mfn(p2m_mid_identity_mfn); > mid_mfn_mfn = virt_to_mfn(mid_mfn); > - if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn) > + if (cmpxchg(top_mfn_p, identity_mfn, mid_mfn_mfn) !> + identity_mfn)Don''t wrap this.> free_p2m_page(mid_mfn); > else > p2m_top_mfn_p[topidx] = mid_mfn; > } > > - if (p2m_top[topidx][mididx] == p2m_missing) { > + if (p2m_top[topidx][mididx] == p2m_identity) { > /* p2m leaf page is missing */ > unsigned long *p2m; > > @@ -489,7 +490,7 @@ static bool alloc_p2m(unsigned long pfn) > > p2m_init(p2m); > > - if (cmpxchg(&mid[mididx], p2m_missing, p2m) != p2m_missing) > + if (cmpxchg(&mid[mididx], p2m_identity, p2m) != p2m_identity) > free_p2m_page(p2m); > else > mid_mfn[mididx] = virt_to_mfn(p2m); > @@ -512,7 +513,7 @@ bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn) > mididx = p2m_mid_index(pfn); > idx = p2m_index(pfn); > > - if (p2m_top[topidx][mididx] == p2m_missing) > + if (p2m_top[topidx][mididx] == p2m_identity) > return mfn == INVALID_P2M_ENTRY; > > p2m_top[topidx][mididx][idx] = mfn;Thanks, J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
H. Peter Anvin
2010-Dec-21 23:22 UTC
[Xen-devel] Re: [PATCH 01/10] xen: Make all reserved pages for the balloon be INVALID_P2M_ENTRY.
On 12/21/2010 02:19 PM, Jeremy Fitzhardinge wrote:
> Also, I'm not a fan of hiding real side-effectful code in a BUG_ON
> predicate.

That's an understatement. There are two schools of thought about what that should mean if BUG_ON was suppressed at compile time.

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
Ian Campbell
2010-Dec-22 08:36 UTC
Re: [Xen-devel] [RFC PATCH v1] Consider void entries in the P2M as 1-1 mapping.
On Tue, 2010-12-21 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> In the past we used to think of those regions as "missing" and under
> the ownership of the balloon code. But the balloon code only operates
> on a specific region. This region is in the last E820 RAM page (basically
> any region past nr_pages is considered a balloon type page).

That is true at start of day, but once the system is up and running the balloon driver can make a hole for anything which can be returned by alloc_page. The following descriptions seem to consider this correctly, but I just wanted to clarify.

I don't think it's necessarily the last E820 RAM page either; that's just what the tools today happen to build. In principle the tools could push down a holey e820 (e.g. with PCI holes prepunched etc.) and boot the domain ballooned down such that the N-2, N-3 e820 RAM regions are above nr_pages too.

> This patchset considers the void entries as "identity" and for balloon
> pages you have to set the PFNs to be "missing". This means that the
> void entries are now considered 1-1, so PFNs which exist in large
> gaps of the P2M space will return the same PFN.

I would naively have expected that a missing entry indicated an invalid/missing entry rather than an identity region; it just seems like the safer default since we are (maybe) more likely to catch an INVALID_P2M_ENTRY before handing it to the hypervisor and getting ourselves shot. In that case the identity regions would need to be explicitly registered; is that harder to do?

I guess we could register any hole or explicit non-RAM region in the e820 as identity, but do we sometimes see I/O memory above the top of the e820, or is there some other problem I'm not thinking of?

> The xen/mmu.c code where it deals with _PAGE_IOMAP can be removed, but
> to guard against regressions or bugs lets take it one patchset at a
> time.

Could we have a WARN_ON(_PAGE_IOMAP && !PAGE_IDENTITY) (or whatever the predicates really are) in some relevant places in mmu.c?
Ian.
Ian Campbell
2010-Dec-22 08:44 UTC
Re: [Xen-devel] [PATCH 03/10] xen/mmu: Add the notion of IDENTITY_P2M_ENTRY.
On Tue, 2010-12-21 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> P.S.
> We cannot set the IDENTITY_P2M_ENTRY in the P2M tree. This is b/c
> the toolstack considers that invalid and would abort during
> migration of the PV guest.

What do we do instead, just write the actual identity MFN value? Does using 1UL<<30 (or 1UL<<62 on 64-bit; note that Jeremy recently fixed FOREIGN_FRAME_BIT too) as IDENTITY_P2M_ENTRY keep the toolstack happy while still allowing us to distinguish identity from happens-to-be-the-same as necessary?

To some extent 1-1 I/O regions could be considered foreign mappings (of "memory" owned by DOMIO), so perhaps it's possible that setting the FOREIGN_FRAME_BIT is both sufficient and correct?

Ian.
Ian Campbell
2010-Dec-22 08:47 UTC
Re: [Xen-devel] Re: [PATCH 01/10] xen: Make all reserved pages for the balloon be INVALID_P2M_ENTRY.
On Tue, 2010-12-21 at 22:19 +0000, Jeremy Fitzhardinge wrote:
> On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
> > + BUG_ON(!set_phys_to_machine(pfn, INVALID_P2M_ENTRY));
[...]
> I'm not a fan of hiding real side-effectful code in a BUG_ON
> predicate.
>
> > + BUG_ON(get_phys_to_machine(pfn) != INVALID_P2M_ENTRY);

I'm also not a fan of this "is the function I just called buggy" BUG_ON. If we aren't confident that set_phys_to_machine() is doing the right thing, then adding a post-condition check to that function would be the preferable option.

Ian.
Ian Campbell
2010-Dec-22 08:49 UTC
Re: [Xen-devel] [PATCH 05/10] xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.
On Tue, 2010-12-21 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> @@ -254,6 +282,12 @@ char * __init xen_memory_setup(void)
>
> xen_add_extra_mem(extra_pages);
>
> + /*
> + * Set P2M for all non-RAM pages and E820 gaps to be identity
> + * type PFNs.
> + */
> + printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n",
> + xen_set_identity(&e820));

I had to look at this patch twice to find the caller of the xen_set_identity function here. I think spending a variable here would be worth the improved clarity.

Ian.
Ian Campbell
2010-Dec-22 08:54 UTC
Re: [Xen-devel] [PATCH 07/10] xen/mmu: Work with 1-1 mappings when allocating new top/middle entries.
On Tue, 2010-12-21 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:> We were not properly taking under advisement the 1-1 mappings > when a large area of memory was ballooned out.Are we lazily allocating the p2m tree nodes for regions initially covered by the balloon? (perhaps we have always done this and it isn''t new with this series) Would it be simpley to always populate enough tree nodes to cover the ballooned area as well as nr_pages at start of day and therefore avoid worrying about it later on (except for memory hotplug which is special in this way already)? Ian.> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/mmu.c | 13 ++++++++----- > 1 files changed, 8 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index bd02e7d..92f4fec 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -464,7 +464,7 @@ static void free_p2m_page(void *p) > static bool alloc_p2m(unsigned long pfn) > { > unsigned topidx, mididx; > - unsigned long ***top_p, **mid; > + unsigned long ***top_p, **mid, **mid_orig; > unsigned long *top_mfn_p, *mid_mfn; > > topidx = p2m_top_index(pfn); > @@ -473,15 +473,16 @@ static bool alloc_p2m(unsigned long pfn) > top_p = &p2m_top[topidx]; > mid = *top_p; > > - if (mid == p2m_mid_identity) { > + if (mid == p2m_mid_identity || mid == p2m_mid_missing) { > /* Mid level is missing, allocate a new one */ > + mid_orig = mid; > mid = alloc_p2m_page(); > if (!mid) > return false; > > p2m_mid_init(mid, p2m_identity); > > - if (cmpxchg(top_p, p2m_mid_identity, mid) != p2m_mid_identity) > + if (cmpxchg(top_p, mid_orig, mid) != mid_orig) > free_p2m_page(mid); > } > > @@ -510,9 +511,11 @@ static bool alloc_p2m(unsigned long pfn) > p2m_top_mfn_p[topidx] = mid_mfn; > } > > - if (p2m_top[topidx][mididx] == p2m_identity) { > + if (p2m_top[topidx][mididx] == p2m_identity || > + p2m_top[topidx][mididx] == p2m_missing) { > /* p2m leaf page is missing */ > unsigned long *p2m; > + unsigned 
long *p2m_orig = p2m_top[topidx][mididx]; > > p2m = alloc_p2m_page(); > if (!p2m) > @@ -520,7 +523,7 @@ static bool alloc_p2m(unsigned long pfn) > > p2m_init(p2m); > > - if (cmpxchg(&mid[mididx], p2m_identity, p2m) != p2m_identity) > + if (cmpxchg(&mid[mididx], p2m_orig, p2m) != p2m_orig) > free_p2m_page(p2m); > else > mid_mfn[mididx] = virt_to_mfn(p2m);_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Dec-22 14:53 UTC
[Xen-devel] Re: [PATCH 01/10] xen: Make all reserved pages for the balloon be INVALID_P2M_ENTRY.
On Tue, Dec 21, 2010 at 02:19:40PM -0800, Jeremy Fitzhardinge wrote:> On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote: > > This patch prepares ourselves for the case where void entries in the P2M > > tree structure do not necessarily imply that the pages are missing. > > With this, we diligently set regions that will be used by the > > balloon driver to be INVALID_P2M_ENTRY and under the ownership > > of the balloon driver. > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > arch/x86/xen/setup.c | 8 ++++++++ > > drivers/xen/balloon.c | 1 + > > 2 files changed, 9 insertions(+), 0 deletions(-) > > > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > > index b5a7f92..d984d36 100644 > > --- a/arch/x86/xen/setup.c > > +++ b/arch/x86/xen/setup.c > > @@ -52,6 +52,8 @@ phys_addr_t xen_extra_mem_start, xen_extra_mem_size; > > > > static __init void xen_add_extra_mem(unsigned long pages) > > { > > + unsigned long pfn; > > + > > u64 size = (u64)pages * PAGE_SIZE; > > u64 extra_start = xen_extra_mem_start + xen_extra_mem_size; > > > > @@ -66,6 +68,11 @@ static __init void xen_add_extra_mem(unsigned long pages) > > xen_extra_mem_size += size; > > > > xen_max_p2m_pfn = PFN_DOWN(extra_start + size); > > + > > + for (pfn = PFN_DOWN(extra_start); pfn <= xen_max_p2m_pfn; pfn++) { > > + BUG_ON(!set_phys_to_machine(pfn, INVALID_P2M_ENTRY)); > > Use __set_phys_to_machine where you don''t expect (or can''t allow) any > allocation.Are you OK with me moving then this check: if (unlikely(xen_feature(XENFEAT_auto_translated_physmap))) { BUG_ON(pfn != mfn && mfn != INVALID_P2M_ENTRY); return true; } from set_phys_to_machine to __set_phys_to_machine?> > Also, I''m not a fan of hiding real side-effectful code in a BUG_ON > predicate.Ok. Will expunge these. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Dec-22 14:59 UTC
[Xen-devel] [SPAM] Re: [PATCH 02/10] xen/p2m: change p2m_missing_* to p2m_identity_*
On Tue, Dec 21, 2010 at 02:41:23PM -0800, Jeremy Fitzhardinge wrote:
> On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
> > From: Konrad Rzeszutek Wilk <konrad@kernel.org>
> >
> > We are going to alter how we think about P2M. Most of the
> > P2M contains MFNs, and areas that are not populated are
> > considered to be "missing". Missing means that the PFN
> > is either not set for this guest (it does not have that much memory
> > allocated) or is under the balloon driver's ownership.
> >
> > We are instead now going to think of those not-populated
> > areas as "identity", meaning that for the PFNs for which
> > we would get the p2m_identity we will provide the PFN
> > value back instead of P2M_MISSING. Essentially treating
> > those regions as PFN==MFN.
>
> This renames missing -> identity, but does it actually change the
> functionality? Doesn't it just leave it being misnamed? It would
> probably be better to fold in the actual identity implementation as well.

You sure? It would be a lot of changes in one patch. This patch is a nop, so there are no functional changes except the name change. Let me annotate the git tree to mention this.

> > mid_mfn_mfn = virt_to_mfn(mid_mfn);
> > - if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn)
> > + if (cmpxchg(top_mfn_p, identity_mfn, mid_mfn_mfn) !=
> > + identity_mfn)
>
> Don't wrap this.

Checkpatch.pl was unhappy without it. I can ignore this.
Konrad Rzeszutek Wilk
2010-Dec-22 15:02 UTC
[Xen-devel] Re: [PATCH 04/10] xen/mmu: For 1-1 mapping, automatically set _PAGE_IOMAP.
On Tue, Dec 21, 2010 at 02:29:31PM -0800, Jeremy Fitzhardinge wrote:
> On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
> > This means that for PFNs (specifically: those in any E820 gaps
> > or non-RAM E820 regions) that have a 1-1 mapping we set the
> > _PAGE_IOMAP flag.
> >
> > Later on we could remove the _PAGE_IOMAP code handling, but
> > for right now let's keep this in to not introduce any bisection
> > failures across this patchset.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > ---
> > arch/x86/xen/mmu.c | 3 +++
> > 1 files changed, 3 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> > index 4ba7e4e..bd02e7d 100644
> > --- a/arch/x86/xen/mmu.c
> > +++ b/arch/x86/xen/mmu.c
> > @@ -832,6 +832,9 @@ static pteval_t pte_pfn_to_mfn(pteval_t val)
> > pteval_t flags = val & PTE_FLAGS_MASK;
> > unsigned long mfn = pfn_to_mfn(pfn);
> >
> > + if (mfn == pfn)
> > + flags |= _PAGE_IOMAP;
>
> Why? Does it really make sense to set _PAGE_IOMAP if they just happen
> to be the same value?

Yes. Without this, user applications such as 'dmidecode' cannot get data. But I think by ditching a bunch of the _PAGE_IOMAP handling in xen/mmu.c we can remove this. I would rather keep this patch as temporary scaffolding and, when the other set of patches for _PAGE_IOMAP is ready, ditch this one.

> J
Konrad Rzeszutek Wilk
2010-Dec-22 15:04 UTC
[Xen-devel] Re: [PATCH 05/10] xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.
> Couldn't you just do something like:
>
>     if (e820->map[i].type != E820_RAM)

I am going to assume you meant '==' here.

>         continue;
>
>     for (pfn = PFN_UP(last); pfn < PFN_DOWN(start); pfn++)
>         set_phys_to_machine(pfn, pfn);
>     identity += pfn - PFN_UP(last);
>
>     last = end;
>
> ie, handle the hole and non-RAM cases together?

A derivation of this does work:

	last = ISA_END_ADDRESS;
	for (i = 0; i < e820->nr_map; i++) {
		phys_addr_t start = e820->map[i].addr;
		phys_addr_t end = start + e820->map[i].size;

		if (end < start)
			continue;

		/* Skip over the 1MB region. */
		if (last > end)
			continue;

		if (e820->map[i].type == E820_RAM) {
			/* Without saving 'last' we would end up gobbling RAM regions. */
			last = end;
			continue;
		}

		for (pfn = PFN_UP(last); pfn < PFN_DOWN(end); pfn++)
			set_phys_to_machine(pfn, pfn);
		identity += pfn - PFN_UP(last);
		last = end;
	}

> Also, what happens with the p2m tree mid layers in this? If you're
> doing page-by-page set_phys_to_machine, won't it end up allocating them
> all? How can you optimise the "large chunks of address space are
> identity" case?

The issue here is that when this code path is called (xen_memory_setup), the p2m_top[topidx][mididx] has been set to start_info->mfn_list[] (by xen_build_dynamic_phys_to_machine). Granted, some of these entries have been evicted (by xen_return_unused_memory), so the regions in the mfn_list are pock-marked with INVALID_P2M_ENTRY. For those regions we set the PFN in p2m_top[topidx][mididx][idx]. We do not allocate anything during this pass.

In the dom0_mem=max:X case (or X,max:Y, where Y>X), which I neglected to test, this would actually try to allocate (whoops). Let me roll up a patch for this.

> It would probably be cleanest to have a set_ident_phys_to_machine(start,
> end) function which can do all that.

Not sure if it is truly needed.
Konrad Rzeszutek Wilk
2010-Dec-22 15:06 UTC
Re: [Xen-devel] [RFC PATCH v1] Consider void entries in the P2M as 1-1 mapping.
On Wed, Dec 22, 2010 at 08:36:55AM +0000, Ian Campbell wrote:
> On Tue, 2010-12-21 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> > In the past we used to think of those regions as "missing" and under
> > the ownership of the balloon code. But the balloon code only operates
> > on a specific region. This region is in the last E820 RAM page (basically
> > any region past nr_pages is considered a balloon type page).
>
> That is true at start of day but once the system is up and running the
> balloon driver can make a hole for anything which can be returned by
> alloc_page.

<nods>

> The following descriptions seem to consider this correctly but I just
> wanted to clarify.

Yes. Thank you for thinking this one through.

> I don't think it's necessarily the last E820 RAM page either, that's
> just what the tools today happen to build. In principle the tools could
> push down a holey e820 (e.g. with PCI holes prepunched etc) and boot the
> domain ballooned down such that the N-2, N-3 e820 RAM regions are above
> nr_pages too.

OK, but they would be marked as E820 RAM regions, right?

> > This patchset considers the void entries as "identity" and for balloon
> > pages you have to set the PFNs to be "missing". This means that the
> > void entries are now considered 1-1, so PFNs which exist in large
> > gaps of the P2M space will return the same PFN.
>
> I would naively have expected that a missing entry indicated an
> invalid/missing entry rather than an identity region, it just seems like

It has. For regions that are small, or already allocated, it would stuff
INVALID_P2M_ENTRY in them. For larger areas (so more than 1MB or so), if
there has not been a top entry allocated for it, it will attach
p2m_mid_missing to it, which has pointers to p2m_missing, which in turn
is filled with INVALID_P2M_ENTRY.

> the safer default since we are (maybe) more likely to catch an
> INVALID_P2M_ENTRY before handing it to the hypervisor and getting
> ourselves shot.

When I think entry, I think the lowest level of the tree, not the top or
middle, which are the ones that are by default now considered
"identity".

FYI, the p2m_identity page is stuffed with INVALID_P2M_ENTRY so if
somebody does get a hold of the value there somehow without first trying
to set it, we would catch it and do this (xen/mmu.c, pte_pfn_to_mfn
function):

	/*
	 * If there's no mfn for the pfn, then just create an
	 * empty non-present pte.  Unfortunately this loses
	 * information about the original pfn, so
	 * pte_mfn_to_pfn is asymmetric.
	 */
	if (unlikely(mfn == INVALID_P2M_ENTRY)) {
		mfn = 0;
		flags = 0;
	}

> In that case the identity regions would need to be explicitly
> registered, is that harder to do?

It might not be... but it would end up in the same logic path (in the
pte_pfn_to_mfn function).

> I guess we could register any hole or explicit non-RAM region in the
> e820 as identity but do we sometimes see I/O memory above the top of the
> e820 or is there some other problem I'm not thinking of?

Hot-plug memory is one. There are also some PCI BARs that are above that
region (but I can't remember the details). Jeremy mentioned something
about Fujitsu machines.

> > The xen/mmu.c code where it deals with _PAGE_IOMAP can be removed, but
> > to guard against regressions or bugs let's take it one patchset at a
> > time.
>
> Could we have a WARN_ON(_PAGE_IOMAP && !PAGE_IDENTITY) (or whatever the
> predicates really are) in some relevant places in mmu.c?

The PAGE_IDENTITY (or IDENTITY_P2M_ENTRY) is never set anywhere. We
could do this:

	WARN_ON(pfn_to_mfn(pfn) == pfn && (flag & _PAGE_IOMAP))

but that would be printed all the time. Unless I saved some extra flag
(as you were alluding to earlier) and did that along with the MFN and
for identity mappings just returned that flag unconditionally.
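[Editorial note] The INVALID_P2M_ENTRY fallback quoted above can be illustrated with a toy model. This is a sketch, not the kernel's code: `make_pte` and the simplified pte encoding below are hypothetical, showing only the "invalid mfn becomes an empty non-present pte" behavior.

```c
#include <assert.h>
#include <stdint.h>

#define INVALID_P2M_ENTRY (~0UL)
#define PAGE_SHIFT 12

/* Build a toy pte from an mfn and flag bits.  When the p2m lookup
 * yields the invalid marker, the pte is built empty (mfn 0, no
 * flags) rather than pointing at a bogus frame -- losing the
 * original pfn, which is why the conversion is asymmetric. */
static unsigned long make_pte(unsigned long mfn, unsigned long flags)
{
	if (mfn == INVALID_P2M_ENTRY) {
		mfn = 0;
		flags = 0;
	}
	return (mfn << PAGE_SHIFT) | flags;
}
```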
Konrad Rzeszutek Wilk
2010-Dec-22 15:07 UTC
[Xen-devel] Re: [PATCH 06/10] xen/setup: Only set identity mapping in E820 regions when privileged.
On Tue, Dec 21, 2010 at 02:37:01PM -0800, Jeremy Fitzhardinge wrote:
> On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
> > We do not want to set the identity mapping on E820 reserved
> > regions when running as PV. This is b/c the 0->ISA_END_ADDRESS region
> > would be considered identity and we would try to read DMI information
> > and fail (since pfn_to_mfn(pfn) == pfn) under PV guests.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > ---
> >  arch/x86/xen/setup.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> > index 752c865..34fb906 100644
> > --- a/arch/x86/xen/setup.c
> > +++ b/arch/x86/xen/setup.c
> > @@ -160,7 +160,7 @@ static unsigned long __init xen_set_identity(const struct e820map *e820)
> >  		if (end < start)
> >  			continue;
> >
> > -		if (e820->map[i].type != E820_RAM) {
> > +		if (xen_initial_domain() && e820->map[i].type != E820_RAM) {
> >  			for (pfn = PFN_UP(start); pfn < PFN_DOWN(end); pfn++)
> >  				set_phys_to_machine(pfn, pfn);
> >  			identity += pfn - PFN_UP(start);
>
> So you're relying on the fact that the ISA area is the only non-RAM e820
> entry in domU?  I think it would be better to do a specific exclusion
> for the ISA area rather than this.

OK. I've rolled something along the lines of what xen_return_unused_memory
does in the function.
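[Editorial note] The one-line change in the hunk above boils down to a predicate. This is a hypothetical sketch of that decision (`should_identity_map` is not a kernel function): only the privileged initial domain identity-maps non-RAM e820 regions, since an unprivileged PV guest does not own the machine frames behind e.g. the DMI area.

```c
#include <assert.h>
#include <stdbool.h>

enum { E820_RAM = 1, E820_RESERVED = 2 };

/* Decide whether a region of the given e820 type should be
 * identity-mapped.  RAM is never identity; non-RAM is identity
 * only for the initial (privileged) domain. */
static bool should_identity_map(bool initial_domain, int e820_type)
{
	if (e820_type == E820_RAM)
		return false;
	return initial_domain;
}
```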
Konrad Rzeszutek Wilk
2010-Dec-22 15:10 UTC
[Xen-devel] Re: [PATCH 07/10] xen/mmu: Work with 1-1 mappings when allocating new top/middle entries.
On Tue, Dec 21, 2010 at 02:37:46PM -0800, Jeremy Fitzhardinge wrote:
> On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
> > We were not properly taking under advisement the 1-1 mappings
> > when a large area of memory was ballooned out.
>
> Could you expand on this?  What does it mean?

The balloon code was going from the end of its region down, and those
regions were in the p2m_missing (and p2m_mid_missing) zone. Which is
correct, except that alloc_p2m did not know how to handle this. So it
never actually allocated the middle or entry level pages which would
contain the newly added PFNs.

I've rolled this patch into the "xen/mmu: Add the notion of identity
(1-1) mapping." patch, soon to be posted.
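[Editorial note] The bug described above — writes landing in a shared placeholder leaf instead of a freshly allocated one — can be shown with a miniature two-level model. Everything here is illustrative: the geometry (`P2M_PER_PAGE` of 8) and the function names are a sketch, not the kernel's real p2m.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define P2M_PER_PAGE 8
#define INVALID_P2M_ENTRY (~0UL)

static unsigned long p2m_missing[P2M_PER_PAGE];	/* shared placeholder leaf */
static unsigned long *p2m_mid[P2M_PER_PAGE];	/* one mid level */

static void p2m_init(void)
{
	int i;
	for (i = 0; i < P2M_PER_PAGE; i++) {
		p2m_missing[i] = INVALID_P2M_ENTRY;
		p2m_mid[i] = p2m_missing;	/* everything starts "missing" */
	}
}

/* The fixed behavior: on first write into a "missing" slot, replace
 * the shared placeholder with a private leaf.  Writing into the shared
 * page directly (the bug) would corrupt every missing region at once. */
static void set_p2m(unsigned long pfn, unsigned long mfn)
{
	unsigned long idx = pfn / P2M_PER_PAGE;

	if (p2m_mid[idx] == p2m_missing) {
		p2m_mid[idx] = malloc(sizeof(p2m_missing));
		memcpy(p2m_mid[idx], p2m_missing, sizeof(p2m_missing));
	}
	p2m_mid[idx][pfn % P2M_PER_PAGE] = mfn;
}

static unsigned long get_p2m(unsigned long pfn)
{
	return p2m_mid[pfn / P2M_PER_PAGE][pfn % P2M_PER_PAGE];
}
```

After one write, only that pfn changes; neighbouring pfns and untouched mid slots still read back INVALID_P2M_ENTRY, and the shared placeholder page itself is never modified.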
Konrad Rzeszutek Wilk
2010-Dec-22 15:11 UTC
[Xen-devel] Re: [PATCH 08/10] xen/mmu: Bugfix. Fill the top entry page with appropriate middle layer pointers.
On Tue, Dec 21, 2010 at 02:38:30PM -0800, Jeremy Fitzhardinge wrote:
> On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
> > If we swapped over from using a p2m_mid_identical to p2m_mid_missing
> > (earlier call to set_phys_to_machine) and then started going through the
> > PFNs in descending order to program a new MFN (balloon worker), we would
> > end up in this code path. At that point we would set up a new page filled with
> > pointers to p2m_identity instead of p2m_missing. This had the disastrous
> > effect that get_phys_to_machine on that PFN would return a 1-1 mapping
> > instead of INVALID_P2M_ENTRY, resulting in hitting a BUG check in the
> > balloon driver.
>
> Are you going to fold this into the appropriate patch later?

Yes. Rolled it into "xen/mmu: Add the notion of identity (1-1) mapping."
Jeremy Fitzhardinge
2010-Dec-22 15:46 UTC
[Xen-devel] Re: [PATCH 01/10] xen: Make all reserved pages for the balloon be INVALID_P2M_ENTRY.
On 12/22/2010 06:53 AM, Konrad Rzeszutek Wilk wrote:
> On Tue, Dec 21, 2010 at 02:19:40PM -0800, Jeremy Fitzhardinge wrote:
>> On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
>>> This patch prepares ourselves for the case where void entries in the P2M
>>> tree structure do not necessarily imply that the pages are missing.
>>> With this, we diligently set regions that will be used by the
>>> balloon driver to be INVALID_P2M_ENTRY and under the ownership
>>> of the balloon driver.
>>>
>>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> ---
>>>  arch/x86/xen/setup.c  |    8 ++++++++
>>>  drivers/xen/balloon.c |    1 +
>>>  2 files changed, 9 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
>>> index b5a7f92..d984d36 100644
>>> --- a/arch/x86/xen/setup.c
>>> +++ b/arch/x86/xen/setup.c
>>> @@ -52,6 +52,8 @@ phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
>>>
>>>  static __init void xen_add_extra_mem(unsigned long pages)
>>>  {
>>> +	unsigned long pfn;
>>> +
>>>  	u64 size = (u64)pages * PAGE_SIZE;
>>>  	u64 extra_start = xen_extra_mem_start + xen_extra_mem_size;
>>>
>>> @@ -66,6 +68,11 @@ static __init void xen_add_extra_mem(unsigned long pages)
>>>  	xen_extra_mem_size += size;
>>>
>>>  	xen_max_p2m_pfn = PFN_DOWN(extra_start + size);
>>> +
>>> +	for (pfn = PFN_DOWN(extra_start); pfn <= xen_max_p2m_pfn; pfn++) {
>>> +		BUG_ON(!set_phys_to_machine(pfn, INVALID_P2M_ENTRY));
>>
>> Use __set_phys_to_machine where you don't expect (or can't allow) any
>> allocation.
>
> Are you OK with me moving this check:
>
>	if (unlikely(xen_feature(XENFEAT_auto_translated_physmap))) {
>		BUG_ON(pfn != mfn && mfn != INVALID_P2M_ENTRY);
>		return true;
>	}
>
> from set_phys_to_machine to __set_phys_to_machine?

Yep - not that we'll see that taken on any current or near-future Xen, I
suspect.

    J
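[Editorial note] The check being moved in the exchange above can be modeled in isolation. A sketch under stated assumptions: on an auto-translated guest the p2m is a fixed 1-1 view, so the only legal "set" is a no-op (pfn == mfn) or an invalidation; the kernel BUG()s on anything else, while this toy version just reports rejection.

```c
#include <assert.h>
#include <stdbool.h>

#define INVALID_P2M_ENTRY (~0UL)

/* Stand-in for xen_feature(XENFEAT_auto_translated_physmap). */
static bool auto_translated = true;

/* Returns true if the update is legal; a real implementation
 * would BUG() on the illegal case and fall through to the actual
 * p2m update otherwise. */
static bool set_p2m_checked(unsigned long pfn, unsigned long mfn)
{
	if (auto_translated)
		return pfn == mfn || mfn == INVALID_P2M_ENTRY;
	/* ... non-auto-translated p2m update would go here ... */
	return true;
}
```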
Ian Campbell
2010-Dec-22 16:26 UTC
Re: [Xen-devel] [RFC PATCH v1] Consider void entries in the P2M as 1-1 mapping.
On Wed, 2010-12-22 at 15:06 +0000, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 22, 2010 at 08:36:55AM +0000, Ian Campbell wrote:
> > On Tue, 2010-12-21 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> > > In the past we used to think of those regions as "missing" and under
> > > the ownership of the balloon code. But the balloon code only operates
> > > on a specific region. This region is in the last E820 RAM page (basically
> > > any region past nr_pages is considered a balloon type page).
> >
> > That is true at start of day but once the system is up and running the
> > balloon driver can make a hole for anything which can be returned by
> > alloc_page.
>
> <nods>
>
> > The following descriptions seem to consider this correctly but I just
> > wanted to clarify.
>
> Yes. Thank you for thinking this one through.
>
> > I don't think it's necessarily the last E820 RAM page either, that's
> > just what the tools today happen to build. In principle the tools could
> > push down a holey e820 (e.g. with PCI holes prepunched etc) and boot the
> > domain ballooned down such that the N-2, N-3 e820 RAM regions are above
> > nr_pages too.
>
> OK, but they would be marked as E820 RAM regions, right?

Yes. There's no special E820 type for ballooned out RAM.

> > > This patchset considers the void entries as "identity" and for balloon
> > > pages you have to set the PFNs to be "missing". This means that the
> > > void entries are now considered 1-1, so PFNs which exist in large
> > > gaps of the P2M space will return the same PFN.
> >
> > I would naively have expected that a missing entry indicated an
> > invalid/missing entry rather than an identity region, it just seems like
>
> It has. For regions that are small, or already allocated, it would
> stuff INVALID_P2M_ENTRY in them. For larger areas (so more than 1MB or
> so), if there has not been a top entry allocated for it, it will attach
> p2m_mid_missing to it, which has pointers to p2m_missing, which in turn
> is filled with INVALID_P2M_ENTRY.

Hrm, I think I'm probably just confused by the missing vs. invalid vs.
void terminology and conflating it all with IDENTITY/INVALID_P2M_ENTRY
and getting in a mess.

> > the safer default since we are (maybe) more likely to catch an
> > INVALID_P2M_ENTRY before handing it to the hypervisor and getting
> > ourselves shot.
>
> When I think entry, I think the lowest level of the tree, not the
> top or middle, which are the ones that are by default now considered
> "identity".

"now" before this series or "now" after?

I think the default value for a lookup of an uninitialised entry should
be the same regardless of whether the mid levels of the tree happen to
be filled in (or pointing to common placeholder entries) or not. Is that
the case?

> FYI, the p2m_identity page is stuffed with INVALID_P2M_ENTRY
> so if somebody does get a hold of the value there somehow without
> first trying to set it, we would catch it and do this:

p2m_identity is filled with INVALID_P2M_ENTRY? No wonder I'm confused by
the names ;-) Why isn't it either called p2m_invalid or filled with
IDENTITY_P2M_ENTRY?

> It might not be... but it would end up in the same logic path (in
> the pte_pfn_to_mfn function).

Sure.

My concern is not about this bit but rather about what accesses to
unknown entries return. Currently I think they return INVALID_P2M_ENTRY
but you are proposing that they return identity instead, which seems
wrong for anything which isn't explicitly initialised as I/O (or
identity for any other reason).

> > > The xen/mmu.c code where it deals with _PAGE_IOMAP can be removed, but
> > > to guard against regressions or bugs let's take it one patchset at a
> > > time.
> >
> > Could we have a WARN_ON(_PAGE_IOMAP && !PAGE_IDENTITY) (or whatever the
> > predicates really are) in some relevant places in mmu.c?
>
> The PAGE_IDENTITY (or IDENTITY_P2M_ENTRY) is never set anywhere.

So how is it used? I don't see it apart from in a single BUG_ON and some
comments. Do you just rely on IDENTITY_P2M_ENTRY == 0 and things being
filled with 0 by default?

> We could do this:
>
>	WARN_ON(pfn_to_mfn(pfn) == pfn && (flag & _PAGE_IOMAP))
>
> but that would be printed all the time.
>
> Unless I saved some extra flag (as you were alluding to earlier) and did
> that along with the MFN and for identity mappings just returned that
> flag unconditionally.
Ian Campbell
2010-Dec-22 16:27 UTC
Re: [Xen-devel] Re: [PATCH 04/10] xen/mmu: For 1-1 mapping, automatically set _PAGE_IOMAP.
On Wed, 2010-12-22 at 15:02 +0000, Konrad Rzeszutek Wilk wrote:
> On Tue, Dec 21, 2010 at 02:29:31PM -0800, Jeremy Fitzhardinge wrote:
> > On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
> > > @@ -832,6 +832,9 @@ static pteval_t pte_pfn_to_mfn(pteval_t val)
> > >  		pteval_t flags = val & PTE_FLAGS_MASK;
> > >  		unsigned long mfn = pfn_to_mfn(pfn);
> > >
> > > +		if (mfn == pfn)
> > > +			flags |= _PAGE_IOMAP;
> >
> > Why?  Does it really make sense to set _PAGE_IOMAP if they just happen
> > to be the same value?
>
> Yes. Without this, user applications, such as 'dmidecode', cannot get
> data.

I think Jeremy's point is that the "mfn == pfn" test here has false
positives, since it is possible that a normal RAM page has identical mfn
and pfn by pure chance.

Ian.
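[Editorial note] Ian's false-positive point can be made concrete. A RAM page whose mfn happens to equal its pfn is indistinguishable from a true identity mapping if "mfn == pfn" is the only test; tagging identity entries with a spare high bit (the direction the code eventually took with an IDENTITY_FRAME-style bit) makes the test exact. The bit position and helper names below are illustrative, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/* Spare bit used to tag p2m entries that were explicitly set as
 * identity; position chosen for illustration only. */
#define IDENTITY_BIT (1UL << 30)

static unsigned long identity_frame(unsigned long pfn)
{
	return pfn | IDENTITY_BIT;
}

static bool is_identity(unsigned long p2m_entry)
{
	return (p2m_entry & IDENTITY_BIT) != 0;
}

static unsigned long entry_to_mfn(unsigned long p2m_entry)
{
	return p2m_entry & ~IDENTITY_BIT;
}
```

A plain RAM entry that is coincidentally equal to its pfn fails `is_identity`, so only deliberately registered identity mappings get `_PAGE_IOMAP` treatment.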
Konrad Rzeszutek Wilk
2010-Dec-22 17:47 UTC
Re: [Xen-devel] [PATCH 07/10] xen/mmu: Work with 1-1 mappings when allocating new top/middle entries.
On Wed, Dec 22, 2010 at 08:54:31AM +0000, Ian Campbell wrote:
> On Tue, 2010-12-21 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> > We were not properly taking under advisement the 1-1 mappings
> > when a large area of memory was ballooned out.
>
> Are we lazily allocating the p2m tree nodes for regions initially
> covered by the balloon? (perhaps we have always done this and it isn't
> new with this series)

Before: Yes. Now: Yes.

> Would it be simpler to always populate enough tree nodes to cover the
> ballooned area as well as nr_pages at start of day and therefore avoid
> worrying about it later on (except for memory hotplug which is special
> in this way already)?

Tried that, ran out of reserved_brk space :-)

But I am not sure what we would gain from this - it is not always
guaranteed that we will populate up to the 'maxmem' region.
Konrad Rzeszutek Wilk
2010-Dec-22 18:01 UTC
Re: [Xen-devel] [RFC PATCH v1] Consider void entries in the P2M as 1-1 mapping.
> > OK, but they would be marked as E820 RAM regions, right?
>
> Yes. There's no special E820 type for ballooned out RAM.

Wheew, good.

> > It has. For regions that are small, or already allocated, it would
> > stuff INVALID_P2M_ENTRY in them. For larger areas (so more than 1MB or
> > so), if there has not been a top entry allocated for it, it will attach
> > p2m_mid_missing to it, which has pointers to p2m_missing, which in turn
> > is filled with INVALID_P2M_ENTRY.
>
> Hrm, I think I'm probably just confused by the missing vs. invalid vs.
> void terminology and conflating it all with IDENTITY/INVALID_P2M_ENTRY
> and getting in a mess.

I should do a better job explaining this. Will attach some pictures next
time.

> > > the safer default since we are (maybe) more likely to catch an
> > > INVALID_P2M_ENTRY before handing it to the hypervisor and getting
> > > ourselves shot.
> >
> > When I think entry, I think the lowest level of the tree, not the
> > top or middle, which are the ones that are by default now considered
> > "identity".
>
> "now" before this series or "now" after?

After.

> I think the default value for a lookup of an uninitialised entry should
> be the same regardless of whether the mid levels of the tree happen to
> be filled in (or pointing to common placeholder entries) or not. Is that
> the case?

Yes. But there are no uninitialised entries. All of them are either
INVALID_P2M_ENTRY or have a PFN value (with some potential flags
attached to them). Nothing else is allowed.

> > FYI, the p2m_identity page is stuffed with INVALID_P2M_ENTRY
> > so if somebody does get a hold of the value there somehow without
> > first trying to set it, we would catch it and do this:
>
> p2m_identity is filled with INVALID_P2M_ENTRY? No wonder I'm confused by
> the names ;-) Why isn't it either called p2m_invalid or filled with

I am using both 'p2m_missing' and 'p2m_identity' pointers as a way to
figure out if the entries are considered missing (so up for balloon
grabs) or identity PFNs. If it is neither p2m_missing nor p2m_identity,
it means the page has been allocated (probably via alloc_p2m) and
contains PFNs (which might be INVALID_P2M_ENTRY if the balloon plucks
that page out, a PFN, or a 1-1 mapping if an E820 gap or reserved region
falls within that entry). The contents of both pages (p2m_missing and
p2m_identity) is INVALID_P2M_ENTRY.

> IDENTITY_P2M_ENTRY?

The value 0 would make the toolstack throw a fit during migration.

> > It might not be... but it would end up in the same logic path (in
> > the pte_pfn_to_mfn function).
>
> Sure.
>
> My concern is not about this bit but rather about what accesses to
> unknown entries return. Currently I think they return INVALID_P2M_ENTRY
> but you are proposing that they return identity instead, which seems
> wrong for

Correct.

> anything which isn't explicitly initialised as I/O (or identity for any
> other reason).

Aha! And this is what we are fixing. You see, a lot of drivers don't
explicitly initialize their vmaps as I/O (or do, as VM_IO, but actually
use real RAM). This makes it possible to work with those guys.

I think what you are saying is to be more conservative and only set
those implicit 1-1 mappings on E820 gaps and on non-RAM E820 regions.
Everything else should be considered missing, so that we would return
pfn_to_mfn(MAX_P2M_PFN) == INVALID_P2M_ENTRY instead of MAX_P2M_PFN?

> > > > The xen/mmu.c code where it deals with _PAGE_IOMAP can be removed, but
> > > > to guard against regressions or bugs let's take it one patchset at a
> > > > time.
> > >
> > > Could we have a WARN_ON(_PAGE_IOMAP && !PAGE_IDENTITY) (or whatever the
> > > predicates really are) in some relevant places in mmu.c?
> >
> > The PAGE_IDENTITY or (IDENTITY_P2M_ENTRY) is never set anywhere.
>
> So how is it used? I don't see it apart from in a single BUG_ON and some
> comments. Do you just rely on IDENTITY_P2M_ENTRY == 0 and things being
> filled with 0 by default?

No. INVALID_P2M_ENTRY. Now that I think of it, I am not sure why I even
introduced the IDENTITY_P2M_ENTRY. It sure is confusing.
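[Editorial note] The conservative semantics being argued for above — identity only for explicitly registered ranges (e820 gaps and non-RAM regions), INVALID_P2M_ENTRY for everything else — can be sketched as a lookup. The fixed-size range table is an illustration, not the kernel's data structure.

```c
#include <assert.h>

#define INVALID_P2M_ENTRY (~0UL)
#define MAX_RANGES 4

/* Half-open [start, end) pfn ranges registered as identity. */
static struct { unsigned long start, end; } identity_ranges[MAX_RANGES];
static int nr_ranges;

static void register_identity(unsigned long start_pfn, unsigned long end_pfn)
{
	identity_ranges[nr_ranges].start = start_pfn;
	identity_ranges[nr_ranges].end = end_pfn;
	nr_ranges++;
}

/* Only registered ranges return pfn == mfn; any unregistered pfn
 * defaults to missing, never silently to identity. */
static unsigned long lookup(unsigned long pfn)
{
	int i;

	for (i = 0; i < nr_ranges; i++)
		if (pfn >= identity_ranges[i].start &&
		    pfn < identity_ranges[i].end)
			return pfn;
	return INVALID_P2M_ENTRY;
}
```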
Jeremy Fitzhardinge
2010-Dec-22 20:36 UTC
[Xen-devel] [SPAM] Re: [PATCH 02/10] xen/p2m: change p2m_missing_* to p2m_identity_*
On 12/22/2010 06:59 AM, Konrad Rzeszutek Wilk wrote:
> On Tue, Dec 21, 2010 at 02:41:23PM -0800, Jeremy Fitzhardinge wrote:
>> On 12/21/2010 01:37 PM, Konrad Rzeszutek Wilk wrote:
>>> From: Konrad Rzeszutek Wilk <konrad@kernel.org>
>>>
>>> We are going to alter how we think about the P2M. Most of the
>>> P2M contains MFNs, and areas that are not populated are
>>> considered to be "missing". Missing means that the PFN
>>> is either not set for this guest (it does not have that much memory
>>> allocated) or is under the balloon driver's ownership.
>>>
>>> We are instead now going to think of those unpopulated
>>> areas as "identity". Meaning that for a PFN for which
>>> we would get the p2m_identity, we will provide the PFN
>>> value back instead of P2M_MISSING. Essentially treating
>>> those regions as PFN == MFN.
>>
>> This renames missing -> identity, but does it actually change the
>> functionality?  Doesn't it just leave it being misnamed?  It would
>> probably be better to fold in the actual identity implementation as well.
>
> You sure? It would be a lot of changes in one patch. This patch is
> a nop - so no functional changes except the name change.
>
> Let me annotate the git tree to mention this.

Yeah, I'm in two minds. I like small single-purpose patches, but the
rename really does leave things v. misnamed. I guess it doesn't really
matter for one commit, so long as it's still bisectable (and the commit
comment makes it clear that the name is misleading).

>>>  		mid_mfn_mfn = virt_to_mfn(mid_mfn);
>>> -		if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn)
>>> +		if (cmpxchg(top_mfn_p, identity_mfn, mid_mfn_mfn) !=
>>> +			identity_mfn)
>>
>> Don't wrap this.
>
> Checkpatch.pl was unhappy without it. I can ignore this.

Checkpatch is generally wrong on the subject of long lines.

    J
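[Editorial note] The cmpxchg idiom in the hunk above can be illustrated with C11 atomics. A sketch, not the kernel's code: a freshly built mid-level page is installed only if the top slot still holds the shared placeholder; if a concurrent installer got there first, ours is discarded. Plain values stand in for mfns.

```c
#include <assert.h>
#include <stdatomic.h>

static _Atomic unsigned long top_slot;	/* one slot of the top level */

/* Returns 1 if our mid page was installed, 0 if another CPU beat us
 * to it (in which case the caller would free its copy and reuse the
 * winner's). */
static int install_mid(unsigned long placeholder_mfn, unsigned long mid_mfn)
{
	unsigned long expected = placeholder_mfn;

	return atomic_compare_exchange_strong(&top_slot, &expected, mid_mfn);
}
```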