David Vrabel
2011-Sep-15 12:29 UTC
[Xen-devel] xen: memory initialization/balloon fixes (#3)
This set of patches fixes some bugs in the memory initialization under Xen and in Xen's memory balloon driver. They can make 100s of MB of additional RAM available (depending on the system/configuration).

Patch 1 is already applied.

Patch 2 fixes a bug in patch 1 and should be queued for 3.1 (and, along with patch 1, considered for 3.0 stable).

Patch 3 is a bug fix and should be queued for 3.1 and possibly queued for the 3.0 stable tree.

Patches 5 & 6 increase the amount of low memory in 32 bit domains started with < 1 GiB of RAM. Please queue for 3.2.

Patch 7 releases all pages in the initial allocation with PFNs that lie within a 1-1 mapping. This seems correct to me as I think that once the 1-1 mapping is set the MFN of the original page is lost, so it's no longer accessible by the kernel (and it cannot be used by another domain).

Changes since #2:

- New patch: xen: avoid adding non-existant memory if the reservation is unlimited
- Avoid using a hypercall to get the current number of pages in the balloon driver. Apparently the hypercall won't return the right value if paging is used.
- Addresses Konrad's review comments.

David

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
David Vrabel
2011-Sep-15 12:29 UTC
[Xen-devel] [PATCH 1/7] xen: use maximum reservation to limit amount of usable RAM
From: David Vrabel <david.vrabel@citrix.com>

Use the domain's maximum reservation to limit the amount of extra RAM
for the memory balloon. This reduces the size of the page tables and
the amount of reserved low memory (which defaults to about 1/32 of the
total RAM).

On a system with 8 GiB of RAM with the domain limited to 1 GiB the
kernel reports:

Before: Memory: 627792k/4472000k available
After:  Memory: 549740k/11132224k available

An increase of about 76 MiB (~1.5% of the unused 7 GiB). The reserved
low memory is also reduced from 253 MiB to 32 MiB. The total
additional usable RAM is 329 MiB.

For dom0, this requires a patch to Xen ('x86: use 'dom0_mem' to limit
the number of pages for dom0')[1].

[1] http://lists.xensource.com/archives/html/xen-devel/2011-08/msg00567.html

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/setup.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index df118a8..c3b8d44 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -184,6 +184,19 @@ static unsigned long __init xen_set_identity(const struct e820entry *list,
 			PFN_UP(start_pci), PFN_DOWN(last));
 	return identity;
 }
+
+static unsigned long __init xen_get_max_pages(void)
+{
+	unsigned long max_pages = MAX_DOMAIN_PAGES;
+	domid_t domid = DOMID_SELF;
+	int ret;
+
+	ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid);
+	if (ret > 0)
+		max_pages = ret;
+	return min(max_pages, MAX_DOMAIN_PAGES);
+}
+
 /**
  * machine_specific_memory_setup - Hook for machine specific memory setup.
  **/
@@ -292,6 +305,12 @@ char * __init xen_memory_setup(void)
 
 	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
 
+	extra_limit = xen_get_max_pages();
+	if (extra_limit >= max_pfn)
+		extra_pages = extra_limit - max_pfn;
+	else
+		extra_pages = 0;
+
 	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
 
 	/*
-- 
1.7.2.5
David Vrabel
2011-Sep-15 12:29 UTC
[Xen-devel] [PATCH 2/7] xen: avoid adding non-existant memory if the reservation is unlimited
From: David Vrabel <david.vrabel@citrix.com>

If the domain's reservation is unlimited, too many pages are added to
the balloon memory region. Correctly check the limit so the number of
extra pages is not increased in this case.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/setup.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index c3b8d44..46d6d21 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -306,10 +306,12 @@ char * __init xen_memory_setup(void)
 	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
 
 	extra_limit = xen_get_max_pages();
-	if (extra_limit >= max_pfn)
-		extra_pages = extra_limit - max_pfn;
-	else
-		extra_pages = 0;
+	if (max_pfn + extra_pages > extra_limit) {
+		if (extra_limit > max_pfn)
+			extra_pages = extra_limit - max_pfn;
+		else
+			extra_pages = 0;
+	}
 
 	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
 
-- 
1.7.2.5
David Vrabel
2011-Sep-15 12:29 UTC
[Xen-devel] [PATCH 3/7] xen/balloon: account for pages released during memory setup
From: David Vrabel <david.vrabel@citrix.com>

In xen_memory_setup() pages that occur in gaps in the memory map are
released back to Xen. This reduces the domain's current page count in
the hypervisor. The Xen balloon driver does not correctly decrease
its initial current_pages count to reflect this. If 'delta' pages are
released and the target is adjusted, the resulting reservation is
always 'delta' less than the requested target.

This affects dom0 if the initial allocation of pages overlaps the PCI
memory region but won't affect most domU guests that have been setup
with pseudo-physical memory maps that don't have gaps.

Fix this by accounting for the released pages when starting the
balloon driver.

If the domain's targets are managed by xapi, the domain may eventually
run out of memory and die because xapi currently gets its target
calculations wrong and whenever it is restarted it always reduces the
target by 'delta'.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/setup.c  |    7 ++++++-
 drivers/xen/balloon.c |    4 +++-
 include/xen/page.h    |    2 ++
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 46d6d21..c983717 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -39,6 +39,9 @@ extern void xen_syscall32_target(void);
 /* Amount of extra memory space we add to the e820 ranges */
 phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
 
+/* Number of pages released from the initial allocation. */
+unsigned long xen_released_pages;
+
 /*
  * The maximum amount of extra memory compared to the base size. The
  * main scaling factor is the size of struct page. At extreme ratios
@@ -313,7 +316,9 @@ char * __init xen_memory_setup(void)
 			extra_pages = 0;
 	}
 
-	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
+	xen_released_pages = xen_return_unused_memory(xen_start_info->nr_pages,
+						      &e820);
+	extra_pages += xen_released_pages;
 
 	/*
 	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 5dfd8f8..4f59fb3 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -565,7 +565,9 @@ static int __init balloon_init(void)
 
 	pr_info("xen/balloon: Initialising balloon driver.\n");
 
-	balloon_stats.current_pages = xen_pv_domain() ? min(xen_start_info->nr_pages, max_pfn) : max_pfn;
+	balloon_stats.current_pages = xen_pv_domain()
+		? min(xen_start_info->nr_pages - xen_released_pages, max_pfn)
+		: max_pfn;
 	balloon_stats.target_pages  = balloon_stats.current_pages;
 	balloon_stats.balloon_low   = 0;
 	balloon_stats.balloon_high  = 0;
diff --git a/include/xen/page.h b/include/xen/page.h
index 0be36b9..92b61f8 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -5,4 +5,6 @@
 
 extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
 
+extern unsigned long xen_released_pages;
+
 #endif /* _XEN_PAGE_H */
-- 
1.7.2.5
David Vrabel
2011-Sep-15 12:29 UTC
[Xen-devel] [PATCH 4/7] xen/balloon: simplify test for the end of usable RAM
From: David Vrabel <david.vrabel@citrix.com>

When initializing the balloon only max_pfn needs to be checked
(max_pfn will always be <= e820_end_of_ram_pfn()). Also improve the
confusing comment.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/balloon.c |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 4f59fb3..9efb993 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -586,16 +586,16 @@ static int __init balloon_init(void)
 #endif
 
 	/*
-	 * Initialise the balloon with excess memory space. We need
-	 * to make sure we don't add memory which doesn't exist or
-	 * logically exist. The E820 map can be trimmed to be smaller
-	 * than the amount of physical memory due to the mem= command
-	 * line parameter. And if this is a 32-bit non-HIGHMEM kernel
-	 * on a system with memory which requires highmem to access,
-	 * don't try to use it.
+	 * Initialize the balloon with pages from the extra memory
+	 * region (see arch/x86/xen/setup.c).
+	 *
+	 * If the amount of usable memory has been limited (e.g., with
+	 * the 'mem' command line parameter), don't add pages beyond
+	 * this limit.
 	 */
-	extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()),
-			    (unsigned long)PFN_DOWN(xen_extra_mem_start + xen_extra_mem_size));
+	extra_pfn_end = min(max_pfn,
+			    (unsigned long)PFN_DOWN(xen_extra_mem_start
+						    + xen_extra_mem_size));
 	for (pfn = PFN_UP(xen_extra_mem_start);
 	     pfn < extra_pfn_end;
 	     pfn++) {
-- 
1.7.2.5
David Vrabel
2011-Sep-15 12:29 UTC
[Xen-devel] [PATCH 5/7] xen: allow balloon driver to use more than one memory region
From: David Vrabel <david.vrabel@citrix.com>

Allow the xen balloon driver to populate its list of extra pages from
more than one region of memory. This will allow platforms to provide
(for example) a region of low memory and a region of high memory.

The maximum possible number of extra regions is 128 (== E820MAX) which
is quite large, so xen_extra_mem is placed in __initdata. This is safe
as both xen_memory_setup() and balloon_init() are in __init.

The balloon regions themselves are not altered (i.e., there is still
only the one region).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/setup.c  |   20 ++++++++++----------
 drivers/xen/balloon.c |   44 +++++++++++++++++++++++++++-----------------
 include/xen/page.h    |   10 +++++++++-
 3 files changed, 46 insertions(+), 28 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index c983717..0c8e974 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -37,7 +37,7 @@ extern void xen_syscall_target(void);
 extern void xen_syscall32_target(void);
 
 /* Amount of extra memory space we add to the e820 ranges */
-phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
+struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
 
 /* Number of pages released from the initial allocation. */
 unsigned long xen_released_pages;
@@ -59,7 +59,7 @@ static void __init xen_add_extra_mem(unsigned long pages)
 	unsigned long pfn;
 
 	u64 size = (u64)pages * PAGE_SIZE;
-	u64 extra_start = xen_extra_mem_start + xen_extra_mem_size;
+	u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size;
 
 	if (!pages)
 		return;
@@ -69,7 +69,7 @@ static void __init xen_add_extra_mem(unsigned long pages)
 
 	memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA");
 
-	xen_extra_mem_size += size;
+	xen_extra_mem[0].size += size;
 
 	xen_max_p2m_pfn = PFN_DOWN(extra_start + size);
 
@@ -242,7 +242,7 @@ char * __init xen_memory_setup(void)
 
 	memcpy(map_raw, map, sizeof(map));
 	e820.nr_map = 0;
-	xen_extra_mem_start = mem_end;
+	xen_extra_mem[0].start = mem_end;
 	for (i = 0; i < memmap.nr_entries; i++) {
 		unsigned long long end;
 
@@ -270,8 +270,8 @@ char * __init xen_memory_setup(void)
 			e820_add_region(end, delta, E820_UNUSABLE);
 		}
 
-		if (map[i].size > 0 && end > xen_extra_mem_start)
-			xen_extra_mem_start = end;
+		if (map[i].size > 0 && end > xen_extra_mem[0].start)
+			xen_extra_mem[0].start = end;
 
 		/* Add region if any remains */
 		if (map[i].size > 0)
@@ -279,10 +279,10 @@ char * __init xen_memory_setup(void)
 	}
 	/* Align the balloon area so that max_low_pfn does not get set
 	 * to be at the _end_ of the PCI gap at the far end (fee01000).
-	 * Note that xen_extra_mem_start gets set in the loop above to be
-	 * past the last E820 region. */
-	if (xen_initial_domain() && (xen_extra_mem_start < (1ULL<<32)))
-		xen_extra_mem_start = (1ULL<<32);
+	 * Note that the start of balloon area gets set in the loop above
+	 * to be past the last E820 region. */
+	if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32)))
+		xen_extra_mem[0].start = (1ULL<<32);
 
 	/*
 	 * In domU, the ISA region is normal, usable memory, but we
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 9efb993..fc43b53 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -555,11 +555,32 @@ void free_xenballooned_pages(int nr_pages, struct page** pages)
 }
 EXPORT_SYMBOL(free_xenballooned_pages);
 
-static int __init balloon_init(void)
+static void __init balloon_add_region(unsigned long start_pfn,
+				      unsigned long pages)
 {
 	unsigned long pfn, extra_pfn_end;
 	struct page *page;
 
+	/*
+	 * If the amount of usable memory has been limited (e.g., with
+	 * the 'mem' command line parameter), don't add pages beyond
+	 * this limit.
+	 */
+	extra_pfn_end = min(max_pfn, start_pfn + pages);
+
+	for (pfn = start_pfn; pfn < extra_pfn_end; pfn++) {
+		page = pfn_to_page(pfn);
+		/* totalram_pages and totalhigh_pages do not
+		   include the boot-time balloon extension, so
+		   don't subtract from it. */
+		__balloon_append(page);
+	}
+}
+
+static int __init balloon_init(void)
+{
+	int i;
+
 	if (!xen_domain())
 		return -ENODEV;
 
@@ -587,23 +608,12 @@ static int __init balloon_init(void)
 
 	/*
 	 * Initialize the balloon with pages from the extra memory
-	 * region (see arch/x86/xen/setup.c).
-	 *
-	 * If the amount of usable memory has been limited (e.g., with
-	 * the 'mem' command line parameter), don't add pages beyond
-	 * this limit.
+	 * regions (see arch/x86/xen/setup.c).
 	 */
-	extra_pfn_end = min(max_pfn,
-			    (unsigned long)PFN_DOWN(xen_extra_mem_start
-						    + xen_extra_mem_size));
-	for (pfn = PFN_UP(xen_extra_mem_start);
-	     pfn < extra_pfn_end;
-	     pfn++) {
-		page = pfn_to_page(pfn);
-		/* totalram_pages and totalhigh_pages do not include the boot-time
-		   balloon extension, so don't subtract from it. */
-		__balloon_append(page);
-	}
+	for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++)
+		if (xen_extra_mem[i].size)
+			balloon_add_region(PFN_UP(xen_extra_mem[i].start),
+					   PFN_DOWN(xen_extra_mem[i].size));
 
 	return 0;
 }
diff --git a/include/xen/page.h b/include/xen/page.h
index 92b61f8..12765b6 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -3,7 +3,15 @@
 
 #include <asm/xen/page.h>
 
-extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
+struct xen_memory_region {
+	phys_addr_t start;
+	phys_addr_t size;
+};
+
+#define XEN_EXTRA_MEM_MAX_REGIONS 128 /* == E820MAX */
+
+extern __initdata
+struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
 
 extern unsigned long xen_released_pages;
 
-- 
1.7.2.5
David Vrabel
2011-Sep-15 12:29 UTC
[Xen-devel] [PATCH 6/7] xen: allow extra memory to be in multiple regions
From: David Vrabel <david.vrabel@citrix.com>

Allow the extra memory (used by the balloon driver) to be in multiple
regions (typically two regions, one for low memory and one for high
memory). This allows the balloon driver to increase the number of
available low pages (if the initial number of pages is small).

As a side effect, the algorithm for building the e820 memory map is
simpler and more obviously correct as the map supplied by the
hypervisor is (almost) used as is (in particular, all reserved regions
and gaps are preserved). Only RAM regions are altered and RAM regions
above max_pfn + extra_pages are marked as unused (the region is split
in two if necessary).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/setup.c |  173 ++++++++++++++++++++++----------------------------
 1 files changed, 77 insertions(+), 96 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 0c8e974..6433371 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -54,26 +54,32 @@ unsigned long xen_released_pages;
  */
 #define EXTRA_MEM_RATIO		(10)
 
-static void __init xen_add_extra_mem(unsigned long pages)
+static void __init xen_add_extra_mem(u64 start, u64 size)
 {
 	unsigned long pfn;
+	int i;
 
-	u64 size = (u64)pages * PAGE_SIZE;
-	u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size;
-
-	if (!pages)
-		return;
-
-	e820_add_region(extra_start, size, E820_RAM);
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
-
-	memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA");
+	for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++) {
+		/* Add new region. */
+		if (xen_extra_mem[i].size == 0) {
+			xen_extra_mem[i].start = start;
+			xen_extra_mem[i].size = size;
+			break;
+		}
+		/* Append to existing region. */
+		if (xen_extra_mem[i].start + xen_extra_mem[i].size == start) {
+			xen_extra_mem[i].size += size;
+			break;
+		}
+	}
+	if (i == XEN_EXTRA_MEM_MAX_REGIONS)
+		printk(KERN_WARNING "Warning: not enough extra memory regions\n");
 
-	xen_extra_mem[0].size += size;
+	memblock_x86_reserve_range(start, start + size, "XEN EXTRA");
 
-	xen_max_p2m_pfn = PFN_DOWN(extra_start + size);
+	xen_max_p2m_pfn = PFN_DOWN(start + size);
 
-	for (pfn = PFN_DOWN(extra_start); pfn <= xen_max_p2m_pfn; pfn++)
+	for (pfn = PFN_DOWN(start); pfn <= xen_max_p2m_pfn; pfn++)
 		__set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
 }
 
@@ -120,8 +126,8 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
 	return len;
 }
 
-static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
-						     const struct e820map *e820)
+static unsigned long __init xen_return_unused_memory(
+	unsigned long max_pfn, const struct e820entry *map, int nr_map)
 {
 	phys_addr_t max_addr = PFN_PHYS(max_pfn);
 	phys_addr_t last_end = ISA_END_ADDRESS;
@@ -129,13 +135,13 @@ static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
 	int i;
 
 	/* Free any unused memory above the low 1Mbyte. */
-	for (i = 0; i < e820->nr_map && last_end < max_addr; i++) {
-		phys_addr_t end = e820->map[i].addr;
+	for (i = 0; i < nr_map && last_end < max_addr; i++) {
+		phys_addr_t end = map[i].addr;
 		end = min(max_addr, end);
 
 		if (last_end < end)
 			released += xen_release_chunk(last_end, end);
-		last_end = max(last_end, e820->map[i].addr + e820->map[i].size);
+		last_end = max(last_end, map[i].addr + map[i].size);
 	}
 
 	if (last_end < max_addr)
@@ -206,14 +212,13 @@ static unsigned long __init xen_get_max_pages(void)
 char * __init xen_memory_setup(void)
 {
 	static struct e820entry map[E820MAX] __initdata;
-	static struct e820entry map_raw[E820MAX] __initdata;
 
 	unsigned long max_pfn = xen_start_info->nr_pages;
 	unsigned long long mem_end;
 	int rc;
 	struct xen_memory_map memmap;
+	unsigned long max_pages;
 	unsigned long extra_pages = 0;
-	unsigned long extra_limit;
 	unsigned long identity_pages = 0;
 	int i;
 	int op;
@@ -240,49 +245,59 @@ char * __init xen_memory_setup(void)
 	}
 	BUG_ON(rc);
 
-	memcpy(map_raw, map, sizeof(map));
-	e820.nr_map = 0;
-	xen_extra_mem[0].start = mem_end;
-	for (i = 0; i < memmap.nr_entries; i++) {
-		unsigned long long end;
-
-		/* Guard against non-page aligned E820 entries. */
-		if (map[i].type == E820_RAM)
-			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
-
-		end = map[i].addr + map[i].size;
-		if (map[i].type == E820_RAM && end > mem_end) {
-			/* RAM off the end - may be partially included */
-			u64 delta = min(map[i].size, end - mem_end);
-
-			map[i].size -= delta;
-			end -= delta;
-
-			extra_pages += PFN_DOWN(delta);
-			/*
-			 * Set RAM below 4GB that is not for us to be unusable.
-			 * This prevents "System RAM" address space from being
-			 * used as potential resource for I/O address (happens
-			 * when 'allocate_resource' is called).
-			 */
-			if (delta &&
-				(xen_initial_domain() && end < 0x100000000ULL))
-				e820_add_region(end, delta, E820_UNUSABLE);
+	/* Make sure the Xen-supplied memory map is well-ordered. */
+	sanitize_e820_map(map, memmap.nr_entries, &memmap.nr_entries);
+
+	max_pages = xen_get_max_pages();
+	if (max_pages > max_pfn)
+		extra_pages += max_pages - max_pfn;
+
+	xen_released_pages = xen_return_unused_memory(max_pfn, map,
+						      memmap.nr_entries);
+	extra_pages += xen_released_pages;
+
+	/*
+	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
+	 * factor the base size.  On non-highmem systems, the base
+	 * size is the full initial memory allocation; on highmem it
+	 * is limited to the max size of lowmem, so that it doesn't
+	 * get completely filled.
+	 *
+	 * In principle there could be a problem in lowmem systems if
+	 * the initial memory is also very large with respect to
+	 * lowmem, but we won't try to deal with that here.
+	 */
+	extra_pages = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
+			  extra_pages);
+
+	i = 0;
+	while (i < memmap.nr_entries) {
+		u64 addr = map[i].addr;
+		u64 size = map[i].size;
+		u32 type = map[i].type;
+
+		if (type == E820_RAM) {
+			/* RAM regions must be page aligned. */
+			size -= (addr + size) % PAGE_SIZE;
+			addr = PAGE_ALIGN(addr);
+
+			if (addr < mem_end) {
+				size = min(size, mem_end - addr);
+			} else if (extra_pages) {
+				size = min(size, (u64)extra_pages * PAGE_SIZE);
+				extra_pages -= size / PAGE_SIZE;
+				xen_add_extra_mem(addr, size);
+			} else
+				type = E820_UNUSABLE;
 		}
 
-		if (map[i].size > 0 && end > xen_extra_mem[0].start)
-			xen_extra_mem[0].start = end;
+		e820_add_region(addr, size, type);
 
-		/* Add region if any remains */
-		if (map[i].size > 0)
-			e820_add_region(map[i].addr, map[i].size, map[i].type);
+		map[i].addr += size;
+		map[i].size -= size;
+		if (map[i].size == 0)
+			i++;
 	}
-	/* Align the balloon area so that max_low_pfn does not get set
-	 * to be at the _end_ of the PCI gap at the far end (fee01000).
-	 * Note that the start of balloon area gets set in the loop above
-	 * to be past the last E820 region. */
-	if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32)))
-		xen_extra_mem[0].start = (1ULL<<32);
 
 	/*
 	 * In domU, the ISA region is normal, usable memory, but we
@@ -308,45 +323,11 @@ char * __init xen_memory_setup(void)
 
 	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
 
-	extra_limit = xen_get_max_pages();
-	if (max_pfn + extra_pages > extra_limit) {
-		if (extra_limit > max_pfn)
-			extra_pages = extra_limit - max_pfn;
-		else
-			extra_pages = 0;
-	}
-
-	xen_released_pages = xen_return_unused_memory(xen_start_info->nr_pages,
-						      &e820);
-	extra_pages += xen_released_pages;
-
-	/*
-	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
-	 * factor the base size.  On non-highmem systems, the base
-	 * size is the full initial memory allocation; on highmem it
-	 * is limited to the max size of lowmem, so that it doesn't
-	 * get completely filled.
-	 *
-	 * In principle there could be a problem in lowmem systems if
-	 * the initial memory is also very large with respect to
-	 * lowmem, but we won't try to deal with that here.
-	 */
-	extra_limit = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
-			  max_pfn + extra_pages);
-
-	if (extra_limit >= max_pfn)
-		extra_pages = extra_limit - max_pfn;
-	else
-		extra_pages = 0;
-
-	xen_add_extra_mem(extra_pages);
-
 	/*
 	 * Set P2M for all non-RAM pages and E820 gaps to be identity
-	 * type PFNs. We supply it with the non-sanitized version
-	 * of the E820.
+	 * type PFNs.
 	 */
-	identity_pages = xen_set_identity(map_raw, memmap.nr_entries);
+	identity_pages = xen_set_identity(e820.map, e820.nr_map);
 	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity_pages);
 	return "Xen";
 }
-- 
1.7.2.5
David Vrabel
2011-Sep-15 12:29 UTC
[Xen-devel] [PATCH 7/7] xen: release all pages within 1-1 p2m mappings
From: David Vrabel <david.vrabel@citrix.com>

In xen_memory_setup() all reserved regions and gaps are set to an
identity (1-1) p2m mapping. If an available page has a PFN within one
of these 1-1 mappings it will become inaccessible (as its MFN is lost)
so release them before setting up the mapping.

This can make an additional 256 MiB or more of RAM available
(depending on the size of the reserved regions in the memory map) if
the initial pages overlap with reserved regions.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/setup.c |  100 ++++++++++++++++---------------------------------
 1 files changed, 33 insertions(+), 67 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 6433371..986661b 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -126,72 +126,44 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
 	return len;
 }
 
-static unsigned long __init xen_return_unused_memory(
-	unsigned long max_pfn, const struct e820entry *map, int nr_map)
+static unsigned long __init xen_set_identity_and_release(
+	const struct e820entry *list, size_t map_size, unsigned long nr_pages)
 {
-	phys_addr_t max_addr = PFN_PHYS(max_pfn);
-	phys_addr_t last_end = ISA_END_ADDRESS;
+	phys_addr_t avail_end = PFN_PHYS(nr_pages);
+	phys_addr_t last_end = 0;
 	unsigned long released = 0;
-	int i;
-
-	/* Free any unused memory above the low 1Mbyte. */
-	for (i = 0; i < nr_map && last_end < max_addr; i++) {
-		phys_addr_t end = map[i].addr;
-		end = min(max_addr, end);
-
-		if (last_end < end)
-			released += xen_release_chunk(last_end, end);
-		last_end = max(last_end, map[i].addr + map[i].size);
-	}
-
-	if (last_end < max_addr)
-		released += xen_release_chunk(last_end, max_addr);
-
-	printk(KERN_INFO "released %lu pages of unused memory\n", released);
-	return released;
-}
-
-static unsigned long __init xen_set_identity(const struct e820entry *list,
-					     ssize_t map_size)
-{
-	phys_addr_t last = xen_initial_domain() ? 0 : ISA_END_ADDRESS;
-	phys_addr_t start_pci = last;
-	const struct e820entry *entry;
 	unsigned long identity = 0;
+	const struct e820entry *entry;
 	int i;
 
+	/*
+	 * For each memory region consider whether to release and map
+	 * the region and the preceeding gap (if any).  If the region
+	 * is RAM, only the gap is released and mapped.
+	 */
 	for (i = 0, entry = list; i < map_size; i++, entry++) {
-		phys_addr_t start = entry->addr;
-		phys_addr_t end = start + entry->size;
+		phys_addr_t begin = last_end;
+		phys_addr_t end = entry->addr + entry->size;
 
-		if (start < last)
-			start = last;
+		last_end = end;
 
-		if (end <= start)
-			continue;
-
-		/* Skip over the 1MB region. */
-		if (last > end)
-			continue;
+		if (entry->type == E820_RAM || entry->type == E820_UNUSABLE)
+			end = entry->addr;
 
-		if ((entry->type == E820_RAM) || (entry->type == E820_UNUSABLE)) {
-			if (start > start_pci)
-				identity += set_phys_range_identity(
-						PFN_UP(start_pci), PFN_DOWN(start));
+		if (begin < end) {
+			if (begin < avail_end)
+				released += xen_release_chunk(
+					begin, min(end, avail_end));
 
-			/* Without saving 'last' we would gooble RAM too
-			 * at the end of the loop. */
-			last = end;
-			start_pci = end;
-			continue;
+			identity += set_phys_range_identity(
+				PFN_UP(begin), PFN_DOWN(end));
 		}
-		start_pci = min(start, start_pci);
-		last = end;
 	}
-	if (last > start_pci)
-		identity += set_phys_range_identity(
-			PFN_UP(start_pci), PFN_DOWN(last));
-	return identity;
+
+	printk(KERN_INFO "Released %lu pages of unused memory\n", released);
+	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping\n", identity);
+
+	return released;
 }
 
 static unsigned long __init xen_get_max_pages(void)
@@ -219,7 +191,6 @@ char * __init xen_memory_setup(void)
 	struct xen_memory_map memmap;
 	unsigned long max_pages;
 	unsigned long extra_pages = 0;
-	unsigned long identity_pages = 0;
 	int i;
 	int op;
 
@@ -252,8 +223,13 @@ char * __init xen_memory_setup(void)
 	if (max_pages > max_pfn)
 		extra_pages += max_pages - max_pfn;
 
-	xen_released_pages = xen_return_unused_memory(max_pfn, map,
-						      memmap.nr_entries);
+	/*
+	 * Set P2M for all non-RAM pages and E820 gaps to be identity
+	 * type PFNs.  Any RAM pages that would be made inaccesible by
+	 * this are first released.
+	 */
+	xen_released_pages = xen_set_identity_and_release(
+		map, memmap.nr_entries, max_pfn);
 	extra_pages += xen_released_pages;
 
 	/*
@@ -303,10 +279,6 @@ char * __init xen_memory_setup(void)
 	 * In domU, the ISA region is normal, usable memory, but we
 	 * reserve ISA memory anyway because too many things poke
 	 * about in there.
-	 *
-	 * In Dom0, the host E820 information can leave gaps in the
-	 * ISA range, which would cause us to release those pages.  To
-	 * avoid this, we unconditionally reserve them here.
 	 */
 	e820_add_region(ISA_START_ADDRESS, ISA_END_ADDRESS - ISA_START_ADDRESS,
 			E820_RESERVED);
@@ -323,12 +295,6 @@ char * __init xen_memory_setup(void)
 
 	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
 
-	/*
-	 * Set P2M for all non-RAM pages and E820 gaps to be identity
-	 * type PFNs.
-	 */
-	identity_pages = xen_set_identity(e820.map, e820.nr_map);
-	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity_pages);
 	return "Xen";
 }
-- 
1.7.2.5
Dan Magenheimer
2011-Sep-20 16:57 UTC
RE: [Xen-devel] xen: memory initialization/balloon fixes (#3)
> From: David Vrabel [mailto:david.vrabel@citrix.com]
> Sent: Thursday, September 15, 2011 6:29 AM
> To: xen-devel@lists.xensource.com
> Cc: Konrad Rzeszutek Wilk
> Subject: [Xen-devel] xen: memory initialization/balloon fixes (#3)
>
> This set of patches fixes some bugs in the memory initialization under
> Xen and in Xen's memory balloon driver. They can make 100s of MB of
> additional RAM available (depending on the system/configuration).
>
> Patch 1 is already applied.
>
> Patch 2 fixes a bug in patch 1 and should be queued for 3.1 (and along
> with patch 1 considered for 3.0 stable).
>
> Patch 3 is a bug fix and should be queued for 3.1 and possibly
> queued for the 3.0 stable tree.
>
> Patches 5 & 6 increase the amount of low memory in 32 bit domains
> started with < 1 GiB of RAM. Please queue for 3.2
>
> Patch 7 releases all pages in the initial allocation with PFNs that
> lie within a 1-1 mapping. This seems correct to me as I think that
> once the 1-1 mapping is set the MFN of the original page is lost so
> it's no longer accessible by the kernel (and it cannot be used by
> another domain

Hi David --

Thanks for your patches! I am looking at a memory capacity/ballooning
weirdness that I hoped your patchset might fix, but so far it has not.
I'm wondering if there was an earlier fix that you are building upon
and that I am missing.

My problem occurs in a PV domU with an upstream-variant kernel based
on 3.0.5. The problem is that the total amount of memory as seen from
inside the guest is always substantially less than the amount of
memory seen from outside the guest. The difference seems to be fixed
within a given boot, but assigning a different vm.cfg mem= changes the
amount. (For example, the difference D is about 18MB on a mem=128 boot
and about 36MB on a mem=1024 boot.)
Part B of the problem (and the one most important to me) is that
setting /sys/devices/system/xen_memory/xen_memory0/target_kb to X
results in a MemTotal inside the domU (as observed by "head -1
/proc/meminfo") of X-D. This can be particularly painful when X is
aggressively small, as X-D may result in OOMs. To use kernel
function/variable names (and I observed this with some debugging
code), when balloon_set_new_target(X) is called, totalram_pages gets
driven to X-D.

I am using xm, but I don't think this is a toolchain problem because
the problem can be provoked and observed entirely within the guest...
though I suppose it is possible that the initial "mem=" is the origin
of the problem and the balloon driver just perpetuates the initial
difference. (I tried xl... same problem... my Xen/toolset version is
4.1.2-rc1-pre cset 23102)

The descriptions in your patchset sound exactly as if you are
attacking the same problem, but I'm not seeing any improved result.

Any thoughts or ideas?

Thanks,
Dan
Konrad Rzeszutek Wilk
2011-Sep-21 15:05 UTC
Re: [Xen-devel] [PATCH 3/7] xen/balloon: account for pages released during memory setup
On Thu, Sep 15, 2011 at 01:29:24PM +0100, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> In xen_memory_setup() pages that occur in gaps in the memory map are
> released back to Xen. This reduces the domain's current page count in
> the hypervisor. The Xen balloon driver does not correctly decrease
> its initial current_pages count to reflect this. If 'delta' pages are
> released and the target is adjusted, the resulting reservation is
> always 'delta' less than the requested target.
>
> This affects dom0 if the initial allocation of pages overlaps the PCI
> memory region but won't affect most domU guests that have been set up
> with pseudo-physical memory maps that don't have gaps.
>
> Fix this by accounting for the released pages when starting the balloon
> driver.

Does this make the behaviour of the pvops guest similar to the
old-style XenOLinux? If so, perhaps we should include that in the git
description for usability purposes (i.e., when somebody searches the
git log for what has happened in v3.2 Linux).

>
> If the domain's targets are managed by xapi, the domain may eventually
> run out of memory and die because xapi currently gets its target
> calculations wrong and whenever it is restarted it always reduces the
> target by 'delta'.
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  arch/x86/xen/setup.c  |    7 ++++++-
>  drivers/xen/balloon.c |    4 +++-
>  include/xen/page.h    |    2 ++
>  3 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index 46d6d21..c983717 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -39,6 +39,9 @@ extern void xen_syscall32_target(void);
>  /* Amount of extra memory space we add to the e820 ranges */
>  phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
>
> +/* Number of pages released from the initial allocation. */
> +unsigned long xen_released_pages;
> +
>  /*
>   * The maximum amount of extra memory compared to the base size. The
>   * main scaling factor is the size of struct page. At extreme ratios
> @@ -313,7 +316,9 @@ char * __init xen_memory_setup(void)
>  		extra_pages = 0;
>  	}
>
> -	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
> +	xen_released_pages = xen_return_unused_memory(xen_start_info->nr_pages,
> +						      &e820);
> +	extra_pages += xen_released_pages;
>
>  	/*
>  	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 5dfd8f8..4f59fb3 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -565,7 +565,9 @@ static int __init balloon_init(void)
>
>  	pr_info("xen/balloon: Initialising balloon driver.\n");
>
> -	balloon_stats.current_pages = xen_pv_domain() ? min(xen_start_info->nr_pages, max_pfn) : max_pfn;
> +	balloon_stats.current_pages = xen_pv_domain()
> +		? min(xen_start_info->nr_pages - xen_released_pages, max_pfn)
> +		: max_pfn;
>  	balloon_stats.target_pages  = balloon_stats.current_pages;
>  	balloon_stats.balloon_low   = 0;
>  	balloon_stats.balloon_high  = 0;
> diff --git a/include/xen/page.h b/include/xen/page.h
> index 0be36b9..92b61f8 100644
> --- a/include/xen/page.h
> +++ b/include/xen/page.h
> @@ -5,4 +5,6 @@
>
>  extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
>
> +extern unsigned long xen_released_pages;
> +
>  #endif /* _XEN_PAGE_H */
> --
> 1.7.2.5
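[Editor's note: the off-by-'delta' accounting the patch fixes can be modelled outside the kernel. This is a toy sketch with hypothetical names and numbers, not kernel code: the domain actually holds nr_pages - released pages, so if current_pages is initialised to plain nr_pages, ballooning to a target T releases (current_pages - T) pages and the domain ends up at T - released, always 'delta' short of the request.]

```c
#include <assert.h>

/* Toy model of the balloon accounting bug; names/numbers are illustrative. */
#define NR_PAGES  262144UL   /* initial allocation: 1 GiB of 4 KiB pages */
#define RELEASED    4096UL   /* pages returned to Xen for e820 gaps */

/* Pages the domain really holds after xen_memory_setup(). */
static unsigned long actual_pages(void)
{
	return NR_PAGES - RELEASED;
}

/* Buggy init: ignores the released pages. */
static unsigned long current_pages_buggy(void)
{
	return NR_PAGES;
}

/* Fixed init (this patch): subtract what was given back to Xen. */
static unsigned long current_pages_fixed(void)
{
	return NR_PAGES - RELEASED;
}

/*
 * Ballooning down to 'target': the driver releases
 * (current_pages - target) pages from what the domain really holds.
 */
static unsigned long balloon_to(unsigned long current_pages,
				unsigned long target)
{
	return actual_pages() - (current_pages - target);
}
```

With the buggy initialisation every request lands RELEASED pages short; with the fixed one the reservation hits the target exactly, which matches the behaviour described in the commit message.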
Konrad Rzeszutek
2011-Sep-21 15:08 UTC
[Xen-devel] Re: [PATCH 2/7] xen: avoid adding non-existant memory if the reservation is unlimited
On Thu, Sep 15, 2011 at 01:29:23PM +0100, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> If the domain's reservation is unlimited, too many pages are added to
> the balloon memory region. Correctly check the limit so the number of
> extra pages is not increased in this case.

And this one is in 3.1 too, albeit with a more verbose description.
Look in git commit e3b73c4a25e9a5705b4ef28b91676caf01f9bc9f.

>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  arch/x86/xen/setup.c |   10 ++++++----
>  1 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index c3b8d44..46d6d21 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -306,10 +306,12 @@ char * __init xen_memory_setup(void)
>  	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
>
>  	extra_limit = xen_get_max_pages();
> -	if (extra_limit >= max_pfn)
> -		extra_pages = extra_limit - max_pfn;
> -	else
> -		extra_pages = 0;
> +	if (max_pfn + extra_pages > extra_limit) {
> +		if (extra_limit > max_pfn)
> +			extra_pages = extra_limit - max_pfn;
> +		else
> +			extra_pages = 0;
> +	}
>
>  	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
>
> --
> 1.7.2.5
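[Editor's note: the corrected check can be exercised in isolation. This sketch lifts the patch's clamp into a standalone function with hypothetical page counts; the invariant is that extra_pages never pushes max_pfn past the domain's maximum reservation, and is left alone when it already fits.]

```c
#include <assert.h>

/* Standalone model of the patch-2 clamp (illustrative, not kernel code). */
static unsigned long clamp_extra(unsigned long max_pfn,
				 unsigned long extra_pages,
				 unsigned long extra_limit)
{
	/* Only clamp when the current extra would exceed the limit. */
	if (max_pfn + extra_pages > extra_limit) {
		if (extra_limit > max_pfn)
			extra_pages = extra_limit - max_pfn;
		else
			extra_pages = 0;
	}
	return extra_pages;
}
```

The old code unconditionally recomputed extra_pages from the limit, so an unlimited reservation (a huge extra_limit) inflated extra_pages; the new form leaves a within-limit value untouched.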
Konrad Rzeszutek Wilk
2011-Sep-21 15:09 UTC
[Xen-devel] Re: [PATCH 1/7] xen: use maximum reservation to limit amount of usable RAM
On Thu, Sep 15, 2011 at 01:29:22PM +0100, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> Use the domain's maximum reservation to limit the amount of extra RAM
> for the memory balloon. This reduces the size of the page tables and
> the amount of reserved low memory (which defaults to about 1/32 of the
> total RAM).
>
> On a system with 8 GiB of RAM with the domain limited to 1 GiB the
> kernel reports:
>
> Before:
>
>   Memory: 627792k/4472000k available
>
> After:
>
>   Memory: 549740k/11132224k available
>
> An increase of about 76 MiB (~1.5% of the unused 7 GiB). The reserved
> low memory is also reduced from 253 MiB to 32 MiB. The total
> additional usable RAM is 329 MiB.
>
> For dom0, this requires a patch to Xen ('x86: use 'dom0_mem' to limit
> the number of pages for dom0')[1].

Not going to pick this one up as it already is in 3.1.

>
> [1] http://lists.xensource.com/archives/html/xen-devel/2011-08/msg00567.html
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  arch/x86/xen/setup.c |   19 +++++++++++++++++++
>  1 files changed, 19 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index df118a8..c3b8d44 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -184,6 +184,19 @@ static unsigned long __init xen_set_identity(const struct e820entry *list,
>  			PFN_UP(start_pci), PFN_DOWN(last));
>  	return identity;
>  }
> +
> +static unsigned long __init xen_get_max_pages(void)
> +{
> +	unsigned long max_pages = MAX_DOMAIN_PAGES;
> +	domid_t domid = DOMID_SELF;
> +	int ret;
> +
> +	ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid);
> +	if (ret > 0)
> +		max_pages = ret;
> +	return min(max_pages, MAX_DOMAIN_PAGES);
> +}
> +
>  /**
>   * machine_specific_memory_setup - Hook for machine specific memory setup.
>   **/
> @@ -292,6 +305,12 @@ char * __init xen_memory_setup(void)
>
>  	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
>
> +	extra_limit = xen_get_max_pages();
> +	if (extra_limit >= max_pfn)
> +		extra_pages = extra_limit - max_pfn;
> +	else
> +		extra_pages = 0;
> +
>  	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
>
>  	/*
> --
> 1.7.2.5
Konrad Rzeszutek Wilk
2011-Sep-21 17:11 UTC
[Xen-devel] Re: xen: memory initialization/balloon fixes (#3)
On Thu, Sep 15, 2011 at 01:29:21PM +0100, David Vrabel wrote:
> This set of patches fixes some bugs in the memory initialization under
> Xen and in Xen's memory balloon driver. They can make 100s of MB of
> additional RAM available (depending on the system/configuration).
>
> Patch 1 is already applied.
>
> Patch 2 fixes a bug in patch 1 and should be queued for 3.1 (and along
> with patch 1 considered for 3.0 stable).
>
> Patch 3 is a bug fix and should be queued for 3.1 and possibly
> queued for the 3.0 stable tree.
>
> Patches 5 & 6 increase the amount of low memory in 32 bit domains
> started with < 1 GiB of RAM. Please queue for 3.2.

I've queued them up and am going to test them this week to make sure
there are no regressions.

>
> Patch 7 releases all pages in the initial allocation with PFNs that
> lie within a 1-1 mapping. This seems correct to me as I think that

The only thing I remember about this was with dmidecode doing something
fishy... (As in, it wouldn't work when the pages under 1MB were
released.) (But I can't remember the details about it, so I might be
completely wrong.)

Could you please test that as well?

> once the 1-1 mapping is set the MFN of the original page is lost so
> it's no longer accessible by the kernel (and it cannot be used by
> another domain
>
> Changes since #2:
>
> - New patch: xen: avoid adding non-existant memory if the reservation
>   is unlimited
> - Avoid using a hypercall to get the current number of pages in the
>   balloon driver. Apparently the hypercall won't return the right
>   value if paging is used.
> - Addresses Konrad's review comments.
>
> David
Dan Magenheimer
2011-Sep-21 22:29 UTC
RE: [Xen-devel] xen: memory initialization/balloon fixes (#3)
> From: Dan Magenheimer
> Sent: Tuesday, September 20, 2011 10:58 AM
> To: David Vrabel; xen-devel@lists.xensource.com
> Cc: Konrad Wilk
> Subject: RE: [Xen-devel] xen: memory initialization/balloon fixes (#3)
>
> > From: David Vrabel [mailto:david.vrabel@citrix.com]
> > Sent: Thursday, September 15, 2011 6:29 AM
> > To: xen-devel@lists.xensource.com
> > Cc: Konrad Rzeszutek Wilk
> > Subject: [Xen-devel] xen: memory initialization/balloon fixes (#3)
> >
> > This set of patches fixes some bugs in the memory initialization under
> > Xen and in Xen's memory balloon driver. They can make 100s of MB of
> > additional RAM available (depending on the system/configuration).
>
> Hi David --
>
> Thanks for your patches! I am looking at a memory capacity/ballooning
> weirdness that I hoped your patchset might fix, but so far it has not.
> I'm wondering if there was an earlier fix that you are building upon
> and that I am missing.
>
> My problem occurs in a PV domU with an upstream-variant kernel based
> on 3.0.5. The problem is that the total amount of memory as seen
> from inside the guest is always substantially less than the amount
> of memory seen from outside the guest. The difference seems to
> be fixed within a given boot, but assigning a different vm.cfg mem=
> changes the amount. (For example, the difference D is about 18MB on
> a mem=128 boot and about 36MB on a mem=1024 boot.)
>
> Part B of the problem (and the one most important to me) is that
> setting /sys/devices/system/xen_memory/xen_memory0/target_kb
> to X results in a MemTotal inside the domU (as observed by
> "head -1 /proc/meminfo") of X-D. This can be particularly painful
> when X is aggressively small as X-D may result in OOMs.
> To use kernel function/variable names (and I observed this with
> some debugging code), when balloon_set_new_target(X) is called
> totalram_pages gets driven to X-D.
>
> I am using xm, but I don't think this is a toolchain problem because
> the problem can be provoked and observed entirely within the guest...
> though I suppose it is possible that the initial "mem=" is the
> origin of the problem and the balloon driver just perpetuates
> the initial difference. (I tried xl... same problem... my
> Xen/toolset version is 4.1.2-rc1-pre cset 23102)
>
> The descriptions in your patchset sound exactly as if you are
> attacking the same problem, but I'm not seeing any improved
> result. Any thoughts or ideas?

Hi David (and Konrad) --

Don't know if you are looking at this or not (or if your patchset was
intended to fix this problem or not).

Looking into Part B of the problem, it appears that in balloon_init()
the initial value of balloon_stats.current_pages may be set incorrectly.
I'm finding that (for a PV domain) both nr_pages and max_pfn match
mem=, but totalram_pages is substantially less. current_pages should
never be higher than the actual number of pages of RAM seen by the
kernel, should it?

By changing max_pfn to totalram_pages in the initialization of
balloon_stats.current_pages in balloon_init(), my problem goes away...
almost. With that fix, setting the balloon target to N kB results in
totalram_pages (as seen from "head -1 /proc/meminfo") going to
(N+6092) kB. The value 6092 appears to be fixed regardless of mem=, and
the fact that the result is off by 6092 kB is annoying but, since it is
higher rather than lower and it is fixed, it is not nearly as dangerous
IMHO. Since I'm not as sure of the RAM-ifications (pun intended) of
this change, I'd appreciate any comments you might have.

Also, this doesn't fix the large difference between MEM(K) reported by
the domain in xentop (which matches mem=) and totalram_pages but,
though also annoying, that's not such a big problem IMHO.

I'm guessing this may be space taken for PV pagetables or something
like that, though the amount of RAM that "disappears" on a small-RAM
guest (e.g. mem=128) is very high (e.g. ~18MB). But for my purposes
(selfballooning), this doesn't matter (much) so I don't plan to pursue
this right now.

Thanks for any feedback!
Dan

P.S. I also haven't looked at the HVM code in balloon_init.
David Vrabel
2011-Sep-22 12:32 UTC
Re: [Xen-devel] xen: memory initialization/balloon fixes (#3)
On 20/09/11 17:57, Dan Magenheimer wrote:
>
> Thanks for your patches! I am looking at a memory capacity/ballooning
> weirdness that I hoped your patchset might fix, but so far it has not.
> I'm wondering if there was an earlier fix that you are building upon
> and that I am missing.
>
> My problem occurs in a PV domU with an upstream-variant kernel based
> on 3.0.5. The problem is that the total amount of memory as seen
> from inside the guest is always substantially less than the amount
> of memory seen from outside the guest. The difference seems to
> be fixed within a given boot, but assigning a different vm.cfg mem=
> changes the amount. (For example, the difference D is about 18MB on
> a mem=128 boot and about 36MB on a mem=1024 boot.)

I don't see the problem? The MemTotal value in /proc/meminfo doesn't
include some pages reserved by the kernel, which is why it's less than
the maximum reservation of the domain.

> Part B of the problem (and the one most important to me) is that
> setting /sys/devices/system/xen_memory/xen_memory0/target_kb
> to X results in a MemTotal inside the domU (as observed by
> "head -1 /proc/meminfo") of X-D. This can be particularly painful
> when X is aggressively small as X-D may result in OOMs.
> To use kernel function/variable names (and I observed this with
> some debugging code), when balloon_set_new_target(X) is called
> totalram_pages gets driven to X-D.

Again, this looks like the correct behavior to me.

David
David Vrabel
2011-Sep-22 13:08 UTC
[Xen-devel] Re: xen: memory initialization/balloon fixes (#3)
On 21/09/11 18:11, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 15, 2011 at 01:29:21PM +0100, David Vrabel wrote:
>>
>> Patch 7 releases all pages in the initial allocation with PFNs that
>> lie within a 1-1 mapping. This seems correct to me as I think that
>
> The only thing I remember about this was with dmidecode doing something
> fishy... (As in, it wouldn't work when the pages under 1MB were
> released.) (But I can't remember the details about it, so I might be
> completely wrong.)
>
> Could you please test that as well?

dmidecode works on the two test boxes I had to hand.

>> once the 1-1 mapping is set the MFN of the original page is lost so
>> it's no longer accessible by the kernel (and it cannot be used by
>> another domain

David
Dan Magenheimer
2011-Sep-22 17:06 UTC
RE: [Xen-devel] xen: memory initialization/balloon fixes (#3)
> From: David Vrabel [mailto:david.vrabel@citrix.com]
> Sent: Thursday, September 22, 2011 6:32 AM
> To: Dan Magenheimer
> Cc: xen-devel@lists.xensource.com; Konrad Wilk
> Subject: Re: [Xen-devel] xen: memory initialization/balloon fixes (#3)
>
> On 20/09/11 17:57, Dan Magenheimer wrote:
> >
> > Thanks for your patches! I am looking at a memory capacity/ballooning
> > weirdness that I hoped your patchset might fix, but so far it has not.
> > I'm wondering if there was an earlier fix that you are building upon
> > and that I am missing.
> >
> > My problem occurs in a PV domU with an upstream-variant kernel based
> > on 3.0.5. The problem is that the total amount of memory as seen
> > from inside the guest is always substantially less than the amount
> > of memory seen from outside the guest. The difference seems to
> > be fixed within a given boot, but assigning a different vm.cfg mem=
> > changes the amount. (For example, the difference D is about 18MB on
> > a mem=128 boot and about 36MB on a mem=1024 boot.)
>
> I don't see the problem?

Hi David --

Sorry, just to clarify: are you saying you are seeing the same behavior
and don't consider it a problem, or that you are not seeing the same
difference?

> The MemTotal value /proc/meminfo doesn't include some pages reserved by
> the kernel which is why it's less than the maximum reservation of the
> domain.

I'm aware of that... "some" has been a fixed size of a few megabytes
in Xen for a long time. I am seeing 30-60MB or more. If you are never
seeing a difference of more than a few MB, maybe I've misapplied your
patches (as they didn't apply cleanly and I had to do some manual
patching).

> > Part B of the problem (and the one most important to me) is that
> > setting /sys/devices/system/xen_memory/xen_memory0/target_kb
> > to X results in a MemTotal inside the domU (as observed by
> > "head -1 /proc/meminfo") of X-D. This can be particularly painful
> > when X is aggressively small as X-D may result in OOMs.
> > To use kernel function/variable names (and I observed this with
> > some debugging code), when balloon_set_new_target(X) is called
> > totalram_pages gets driven to X-D.
>
> Again, this looks like the correct behavior to me.

Hmmm... so if a user (or automated tool) uses the Xen-defined API (i.e.
/sys/devices/system/xen_memory/xen_memory0/target_kb) to use the Xen
balloon driver to attempt to reduce memory usage to 100MB, and the Xen
balloon driver instead reduces it to some random number somewhere
between 40MB and 90MB, which may or may not cause OOMs, you consider
this correct behavior?

(Cc'ing a couple of ballooning old-timers to ensure I am not
misunderstanding the intended API.)

Thanks,
Dan
Dan Magenheimer
2011-Sep-22 22:34 UTC
RE: [Xen-devel] xen: memory initialization/balloon fixes (#3)
> From: Dan Magenheimer
> Subject: RE: [Xen-devel] xen: memory initialization/balloon fixes (#3)
>
> > From: David Vrabel [mailto:david.vrabel@citrix.com]
> > Subject: Re: [Xen-devel] xen: memory initialization/balloon fixes (#3)
> >
> > On 20/09/11 17:57, Dan Magenheimer wrote:
> > >
> > > My problem occurs in a PV domU with an upstream-variant kernel based
> > > on 3.0.5. The problem is that the total amount of memory as seen
> > > from inside the guest is always substantially less than the amount
> > > of memory seen from outside the guest. The difference seems to
> > > be fixed within a given boot, but assigning a different vm.cfg mem=
> > > changes the amount. (For example, the difference D is about 18MB on
> > > a mem=128 boot and about 36MB on a mem=1024 boot.)
> >
> > I don't see the problem?
>
> Hi David --
>
> Sorry, just to clarify, are you saying you are seeing the same
> behavior and don't consider it a problem, or that you are not
> seeing the same difference?
>
> > The MemTotal value /proc/meminfo doesn't include some pages reserved by
> > the kernel which is why it's less than the maximum reservation of the
> > domain.
>
> I'm aware of that... "some" has been a fixed size of a few megabytes
> in Xen for a long time. I am seeing 30-60MB or more.

Never mind on this part. After further debugging, I can see that this
difference is due to normal uses of memory by the kernel for XEN
PAGETABLES and RAMDISK etc. It's unfortunate that the difference is so
large, but I guess that's in part due to the desire to use the same
kernel binary for native and virtualized. I don't remember it being
nearly so high for older PV kernels, but I guess it's progress! :-}

> > > Part B of the problem (and the one most important to me) is that
> > > setting /sys/devices/system/xen_memory/xen_memory0/target_kb
> > > to X results in a MemTotal inside the domU (as observed by
> > > "head -1 /proc/meminfo") of X-D. This can be particularly painful
> > > when X is aggressively small as X-D may result in OOMs.
> > > To use kernel function/variable names (and I observed this with
> > > some debugging code), when balloon_set_new_target(X) is called
> > > totalram_pages gets driven to X-D.
> >
> > Again, this looks like the correct behavior to me.
>
> Hmmm... so if a user (or automated tool) uses the Xen-defined
> API (i.e. /sys/devices/system/xen_memory/xen_memory0/target_kb)
> to use the Xen balloon driver to attempt to reduce memory usage
> to 100MB, and the Xen balloon driver instead reduces it to
> some random number somewhere between 40MB and 90MB, which
> may or may not cause OOMs, you consider this correct behavior?

I still think this is a bug but apparently orthogonal to your patchset.
So sorry to bother you.

Dan
Jeremy Fitzhardinge
2011-Sep-22 22:51 UTC
Re: [Xen-devel] xen: memory initialization/balloon fixes (#3)
On 09/22/2011 03:34 PM, Dan Magenheimer wrote:
>> I'm aware of that... "some" has been a fixed size of a few megabytes
>> in Xen for a long time. I am seeing 30-60MB or more.
> Never mind on this part. After further debugging, I can see
> that this difference is due to normal uses of memory by the
> kernel for XEN PAGETABLES and RAMDISK etc. It's unfortunate
> that the difference is so large, but I guess that's in part due
> to the desire to use the same kernel binary for native and
> virtualized. I don't remember it being nearly so high for
> older PV kernels, but I guess it's progress! :-}

I don't think the Xen parts allocate/reserve lots of memory
unnecessarily, so it shouldn't be too different from the 2.6.18-xen
kernels. They do reserve various chunks of memory, but for things like
RAMDISK I think they get released again (and anyway, I don't think
that's going to be anywhere near 30MB, let alone 60). I'm not very
confident in those /proc/meminfo numbers - they may count memory as
"reserved" if it's in a reserved region even if the pages themselves
have been released to the kernel pool.

>>>> Part B of the problem (and the one most important to me) is that
>>>> setting /sys/devices/system/xen_memory/xen_memory0/target_kb
>>>> to X results in a MemTotal inside the domU (as observed by
>>>> "head -1 /proc/meminfo") of X-D. This can be particularly painful
>>>> when X is aggressively small as X-D may result in OOMs.
>>>> To use kernel function/variable names (and I observed this with
>>>> some debugging code), when balloon_set_new_target(X) is called
>>>> totalram_pages gets driven to X-D.
>>> Again, this looks like the correct behavior to me.
>> Hmmm... so if a user (or automated tool) uses the Xen-defined
>> API (i.e. /sys/devices/system/xen_memory/xen_memory0/target_kb)
>> to use the Xen balloon driver to attempt to reduce memory usage
>> to 100MB, and the Xen balloon driver instead reduces it to
>> some random number somewhere between 40MB and 90MB, which
>> may or may not cause OOMs, you consider this correct behavior?
> I still think this is a bug but apparently orthogonal to
> your patchset. So sorry to bother you.

If you ask for 100MB, it should never try to make the domain smaller
than that; if it does, it suggests the number is being misparsed or
something.

J
Dan Magenheimer
2011-Sep-22 23:46 UTC
RE: [Xen-devel] xen: memory initialization/balloon fixes (#3)
> From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
>
> On 09/22/2011 03:34 PM, Dan Magenheimer wrote:
> >> I'm aware of that... "some" has been a fixed size of a few megabytes
> >> in Xen for a long time. I am seeing 30-60MB or more.
> > Never mind on this part. After further debugging, I can see
> > that this difference is due to normal uses of memory by the
> > kernel for XEN PAGETABLES and RAMDISK etc. It's unfortunate
> > that the difference is so large, but I guess that's in part due
> > to the desire to use the same kernel binary for native and
> > virtualized. I don't remember it being nearly so high for
> > older PV kernels, but I guess it's progress! :-}
>
> I don't think the Xen parts allocate/reserve lots of memory
> unnecessarily, so it shouldn't be too different from the 2.6.18-xen
> kernels. They do reserve various chunks of memory, but for things like
> RAMDISK I think they get released again (and anyway, I don't think
> that's going to be anywhere near 30MB, let alone 60). I'm not very
> confident in those /proc/meminfo numbers - they may count memory as
> "reserved" if it's in a reserved region even if the pages themselves
> have been released to the kernel pool.

No, the first line of /proc/meminfo is precisely "totalram_pages".

> >>>> Part B of the problem (and the one most important to me) is that
> >>>> setting /sys/devices/system/xen_memory/xen_memory0/target_kb
> >>>> to X results in a MemTotal inside the domU (as observed by
> >>>> "head -1 /proc/meminfo") of X-D. This can be particularly painful
> >>>> when X is aggressively small as X-D may result in OOMs.
> >>>> To use kernel function/variable names (and I observed this with
> >>>> some debugging code), when balloon_set_new_target(X) is called
> >>>> totalram_pages gets driven to X-D.
> >>> Again, this looks like the correct behavior to me.
> >> Hmmm... so if a user (or automated tool) uses the Xen-defined
> >> API (i.e. /sys/devices/system/xen_memory/xen_memory0/target_kb)
> >> to use the Xen balloon driver to attempt to reduce memory usage
> >> to 100MB, and the Xen balloon driver instead reduces it to
> >> some random number somewhere between 40MB and 90MB, which
> >> may or may not cause OOMs, you consider this correct behavior?
> > I still think this is a bug but apparently orthogonal to
> > your patchset. So sorry to bother you.
>
> If you ask for 100MB, it should never try to make the domain smaller
> than that; if it does, it suggests the number is being misparsed or
> something.

OK, then balloon_stats.current_pages can never be larger than
totalram_pages. Which means that balloon_stats.current_pages must
always grow and shrink when totalram_pages does (which is true now
only in the balloon driver code). Which means, I think:

balloon_stats.current_pages is just plain wrong! It doesn't need to
exist! If we replace every instance in balloon.c with totalram_pages,
I think everything just works. Will run some tests tomorrow.

Dan

P.S. Not sure about Daniel's hotplug stuff though....
David Vrabel
2011-Sep-23 10:45 UTC
Re: [Xen-devel] xen: memory initialization/balloon fixes (#3)
On 23/09/11 00:46, Dan Magenheimer wrote:
>> From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
>>
>> On 09/22/2011 03:34 PM, Dan Magenheimer wrote:
>>>> I'm aware of that... "some" has been a fixed size of a few megabytes
>>>> in Xen for a long time. I am seeing 30-60MB or more.
>>> Never mind on this part. After further debugging, I can see
>>> that this difference is due to normal uses of memory by the
>>> kernel for XEN PAGETABLES and RAMDISK etc. It's unfortunate
>>> that the difference is so large, but I guess that's in part due
>>> to the desire to use the same kernel binary for native and
>>> virtualized. I don't remember it being nearly so high for
>>> older PV kernels, but I guess it's progress! :-}
>>
>> I don't think the Xen parts allocate/reserve lots of memory
>> unnecessarily, so it shouldn't be too different from the 2.6.18-xen
>> kernels. They do reserve various chunks of memory, but for things like
>> RAMDISK I think they get released again (and anyway, I don't think
>> that's going to be anywhere near 30MB, let alone 60). I'm not very
>> confident in those /proc/meminfo numbers - they may count memory as
>> "reserved" if it's in a reserved region even if the pages themselves
>> have been released to the kernel pool.
>
> No, the first line of /proc/meminfo is precisely "totalram_pages".

I think most of the increase in reserved memory compared to classic Xen
kernels is the change to using the generic SWIOTLB. This is up to
64 MiB.

>>>>>> Part B of the problem (and the one most important to me) is that
>>>>>> setting /sys/devices/system/xen_memory/xen_memory0/target_kb
>>>>>> to X results in a MemTotal inside the domU (as observed by
>>>>>> "head -1 /proc/meminfo") of X-D. This can be particularly painful
>>>>>> when X is aggressively small as X-D may result in OOMs.
>>>>>> To use kernel function/variable names (and I observed this with
>>>>>> some debugging code), when balloon_set_new_target(X) is called
>>>>>> totalram_pages gets driven to X-D.
>>>>> Again, this looks like the correct behavior to me.
>>>> Hmmm... so if a user (or automated tool) uses the Xen-defined
>>>> API (i.e. /sys/devices/system/xen_memory/xen_memory0/target_kb)
>>>> to use the Xen balloon driver to attempt to reduce memory usage
>>>> to 100MB, and the Xen balloon driver instead reduces it to
>>>> some random number somewhere between 40MB and 90MB, which
>>>> may or may not cause OOMs, you consider this correct behavior?
>>> I still think this is a bug but apparently orthogonal to
>>> your patchset. So sorry to bother you.
>>
>> If you ask for 100MB, it should never try to make the domain smaller
>> than that; if it does, it suggests the number is being misparsed or
>> something.
>
> OK then balloon_stats.current_pages can never be larger than totalram_pages.
> Which means that balloon_stats.current_pages must always grow
> and shrink when totalram_pages does (which is true now only in
> the balloon driver code). Which means, I think:
>
> balloon_stats.current_pages is just plain wrong! It doesn't need to
> exist! If we replace every instance in balloon.c with totalram_pages,
> I think everything just works. Will run some tests tomorrow.

No. balloon_stats.current_pages is the number of pages used by the
domain from Xen's point of view (and must be equal to the amount
reported by xl top). It is not what the guest kernel thinks is the
number of usable pages. Because totalram_pages doesn't include some
reserved pages, balloon_stats.current_pages will necessarily always be
greater.

If you're attempting to make the domain self-balloon, I don't see why
you're even interested in the total number of pages. Surely it's the
number of free pages that's useful? E.g., a basic self-ballooning
algorithm would be something like:

  delta = free_pages - emergency_reserve - spare
  reservation_target -= delta

Where:

  free_pages is the current number of free pages.

  emergency_reserve is the amount of pages the kernel reserves for
  satisfying important allocations when memory is low. This is
  approximately (initial_maximum_reservation / 32).

  spare is some extra number of pages to provide a buffer when the
  memory usage increases.

David
Konrad Rzeszutek Wilk
2011-Sep-23 13:28 UTC
Re: [Xen-devel] xen: memory initialization/balloon fixes (#3)
> > No, the first line of /proc/meminfo is precisely "totalram_pages".
>
> I think most of the increase in reserved memory compared to classic Xen
> kernels is the change to using the generic SWIOTLB. This is up to 64 MiB.

Which should be disabled by default on domU... Well, unless your guest
has more than 4GB, in which case it gets enabled.
Dan Magenheimer
2011-Sep-23 19:04 UTC
RE: [Xen-devel] xen: memory initialization/balloon fixes (#3)
> From: David Vrabel [mailto:david.vrabel@citrix.com]
> Subject: Re: [Xen-devel] xen: memory initialization/balloon fixes (#3)
>
> On 23/09/11 00:46, Dan Magenheimer wrote:
> >> From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
> >>
> >> I don't think the Xen parts allocate/reserve lots of memory
> >> unnecessarily, so it shouldn't be too different from the 2.6.18-xen
> >> kernels. They do reserve various chunks of memory, but for things like
> >> RAMDISK I think they get released again (and anyway, I don't think
> >> that's going to be anywhere near 30MB, let alone 60). I'm not very
> >> confident in those /proc/meminfo numbers - they may count memory as
> >> "reserved" if it's in a reserved region even if the pages themselves have
> >> been released to the kernel pool.
> >
> > No, the first line of /proc/meminfo is precisely "totalram_pages".
>
> I think most of the increase in reserved memory compared to classic Xen
> kernels is the change to using the generic SWIOTLB. This is up to 64 MiB.

Hi David --

My data agrees with Konrad's reply. I don't see the SWIOTLB but am only
testing with smaller guests (<2GB).

> >>>>> Again, this looks like the correct behavior to me.
> >>>> Hmmm... so if a user (or automated tool) uses the Xen-defined
> >>>> API (i.e. /sys/devices/system/xen_memory/xen_memory0/target_kb)
> >>>> to use the Xen balloon driver to attempt to reduce memory usage
> >>>> to 100MB, and the Xen balloon driver instead reduces it to
> >>>> some random number somewhere between 40MB and 90MB, which
> >>>> may or may not cause OOMs, you consider this correct behavior?
> >>> I still think this is a bug but apparently orthogonal to
> >>> your patchset. So sorry to bother you.
> >>
> >> If you ask for 100MB, it should never try to make the domain smaller
> >> than that; if it does, it suggests the number is being misparsed or
> >> something.

(Jeremy -- No, it's not getting misparsed... in the existing balloon
code the try-to-balloon-to-target-N code is instead ballooning to N-D,
where D is "reserved_pages+absent_pages"... and as you pointed out,
this sum gets modified after the balloon is initialized.)

> > OK then balloon_stats.current_pages can never be larger than totalram_pages.
> > Which means that balloon_stats.current_pages must always grow
> > and shrink when totalram_pages does (which is true now only in
> > the balloon driver code). Which means, I think:
> >
> > balloon_stats.current_pages is just plain wrong! It doesn't need to
> > exist! If we replace every instance in balloon.c with totalram_pages,
> > I think everything just works. Will run some tests tomorrow.
>
> No. balloon_stats.current_pages is the number of pages used by the
> domain from Xen's point of view (and must be equal to the amount
> reported by xl top). It is not what the guest kernel thinks is the
> number of usable pages.

OK. It still appears to be not always accurate to me. Are you using
balloon_stats.current_pages via sysfs? AFAICT, the interface used to
get the amount of domain memory (e.g. for xl top) does not use
balloon_stats.current_pages, so the only way it would be visible outside
of the balloon driver is via sysfs. I suppose that is part of the guest
ABI now, even if it contains an unreliable value.

In any case, I can leave balloon_stats.current_pages alone in my
patch-under-development. It is "just plain wrong" for my purposes and
for setting a balloon target, and I think still wrong for the purpose
you state, but I think I can leave unmodified the code that updates it
so as not to affect code that uses it via sysfs.

> Because totalram_pages doesn't include some reserved pages
> balloon_stats.current_pages will necessarily always be greater.

Yep.

> If you're attempting to make the domain self-balloon I don't see why
> you're even interested in the total number of pages. Surely it's the
> number of free pages that's useful?

No, after the kernel has been busy awhile, the number of free pages can
be quite small, because most of RAM gets used in the page cache.
Self-ballooning (as used with tmem) is much more aggressive than that,
because part of the purpose of tmem is to act as an all-domain page
cache. If you're not familiar with tmem, see
http://oss.oracle.com/projects/tmem.

Thanks again for the feedback. I'll cc you on my upcoming patch.

Dan
Konrad Rzeszutek Wilk
2011-Sep-24 02:08 UTC
[Xen-devel] Re: xen: memory initialization/balloon fixes (#3)
On Thu, Sep 15, 2011 at 01:29:21PM +0100, David Vrabel wrote:
> This set of patches fixes some bugs in the memory initialization under
> Xen and in Xen's memory balloon driver. They can make 100s of MB of
> additional RAM available (depending on the system/configuration).
>
> Patch 1 is already applied.
>
> Patch 2 fixes a bug in patch 1 and should be queued for 3.1 (and along
> with patch 1 considered for 3.0 stable).
>
> Patch 3 is a bug fix and should be queued for 3.1 and possibly
> queued for the 3.0 stable tree.
>
> Patches 5 & 6 increase the amount of low memory in 32 bit domains
> started with < 1 GiB of RAM. Please queue for 3.2.
>
> Patch 7 releases all pages in the initial allocation with PFNs that
> lie within a 1-1 mapping. This seems correct to me as I think that
> once the 1-1 mapping is set the MFN of the original page is lost so
> it's no longer accessible by the kernel (and it cannot be used by
> another domain).
>
> Changes since #2:
>
> - New patch: xen: avoid adding non-existent memory if the reservation
>   is unlimited
> - Avoid using a hypercall to get the current number of pages in the
>   balloon driver. Apparently the hypercall won't return the right
>   value if paging is used.
> - Addresses Konrad's review comments.

They don't work on AMD boxes:

XELINUX 3.82 2009-06-09 Copyright (C) 1994-2009 H. Peter Anvin et al
Loading xen.gz... ok
Loading vmlinuz... ok
Loading initramfs.cpio.gz... ok
[ASCII-art "Xen 4.1.110930" banner]
(XEN) Xen version 4.1-110923 (konrad@dumpdata.com) (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC) ) Fri Sep 23 19:38:15 EDT 2011
(XEN) Latest ChangeSet: unavailable
(XEN) Console output is synchronous.
(XEN) Bootloader: unknown
(XEN) Command line: com1=115200,8n1 console=com1,vga guest_loglvl=all sync_console apic=debug
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: none; EDID transfer time: 2 seconds
(XEN)  EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 0 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 000000000009d800 (usable)
(XEN)  000000000009d800 - 00000000000a0000 (reserved)
(XEN)  00000000000cc000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 000000007fff0000 (usable)
(XEN)  000000007fff0000 - 0000000080000000 (reserved)
(XEN)  0000000080000000 - 00000000cfef0000 (usable)
(XEN)  00000000cfef0000 - 00000000cfef5000 (ACPI data)
(XEN)  00000000cfef5000 - 00000000cff7f000 (ACPI NVS)
(XEN)  00000000cff80000 - 00000000d0000000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fec00000 - 00000000fec10000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000fff80000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000130000000 (usable)
(XEN) ACPI: RSDP 000F79B0, 0024 (r2 PTLTD )
(XEN) ACPI: XSDT CFEF0753, 009C (r1 DELL PE_SC3 6040000 DELL 0)
(XEN) DELL PE_SC3 6040000 MSFT 100000E)
(XEN) ACPI: FACS CFEF5FC0, 0040
(XEN) ACPI: TCPA CFEF3D53, 0032 (r1 Phoenix 6040000 TL 0)
(XEN) ACPI: SLIC CFEF3D85, 0024 (r1 DELL PE_SC3 6040000 PTL 1)
(XEN) ACPI: SPCR CFEF3DA9, 0050 (r1 DELL PE_SC3 6040000 PTL 1)
(XEN) ACPI: EINJ CFEF3DF9, 01B0 (r1 PTL WHEAPTL 6040000 PTL 1)
(XEN) ACPI: HEST CFEF3FA9, 00A8 (r1 PTL WHEAPTL 6040000 PTL 1)
(XEN) ACPI: BERT CFEF4051, 0030 (r1 PTL WHEAPTL 6040000 PTL 1)
(XEN) ACPI: SSDT CFEF4081, 00E1 (r1 wheaos wheaosc 6040000 INTL 20050624)
(XEN) ACPI: ERST CFEF4162, 0270 (r1 PTL WHEAPTL 6040000 PTL 1)
(XEN) ACPI: SRAT CFEF43D2, 00E8 (r1 AMD HAMMER 6040000 AMD 1)
(XEN) ACPI: SSDT CFEF44BA, 0A30 (r1 AMD POWERNOW 6040000 AMD 1)
(XEN) ACPI: MCFG CFEF4EEA, 003C (r1 PTLTD MCFG 6040000 LTP 0)
(XEN) ACPI: HPET CFEF4F26, 0038 (r1 PTLTD HPETTBL 6040000 LTP 1)
(XEN) ACPI: APIC CFEF4F5E, 007A (r1 PTLTD APIC 6040000 LTP 0)
(XEN) ACPI: BOOT CFEF4FD8, 0028 (r1 PTLTD $SBFTBL$ 6040000 LTP 1)
(XEN) System RAM: 4094MB (4192756kB)
(XEN) Domain heap initialised
(XEN) Processor #0 0:2 APIC version 16
(XEN) Processor #1 0:2 APIC version 16
(XEN) Processor #2 0:2 APIC version 16
(XEN) Processor #3 0:2 APIC version 16
(XEN) IOAPIC[0]: apic_id 4, version 17, address 0xfec00000, GSI 0-23
(XEN) Enabling APIC mode: Flat. Using 1 I/O APICs
(XEN) ERST table is invalid
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2109.743 MHz processor.
(XEN) Initing memory sharing.
(XEN) AMD-Vi: IOMMU not found!
(XEN) I/O virtualisation disabled
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using new ACK method
(XEN) Platform timer is 25.000MHz HPET
(XEN) Allocated console ring of 16 KiB.
(XEN) HVM: ASIDs enabled.
(XEN) SVM: Support
(XEN) Brought up 4 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Xen kernel: 64-bit, lsb, compat32
(XEN) Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x202f000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.: 0000000118000000->000000011c000000 (926244 pages to be allocated)
(XEN)  Init. ramdisk: 00000001216cc000->000000012ffffc00
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff8202f000
(XEN)  Init. ramdisk: ffffffff8202f000->ffffffff90962c00
(XEN)  Phys-Mach map: ffffffff90963000->ffffffff91108ac0
(XEN)  Start info: ffffffff91109000->ffffffff911094b4
(XEN)  Page tables: ffffffff9110a000->ffffffff91197000
(XEN)  Boot stack: ffffffff91197000->ffffffff91198000
(XEN)  TOTAL: ffffffff80000000->ffffffff91400000
(XEN)  ENTRY ADDRESS: ffffffff81aeb200
(XEN) Dom0 has maximum 4 VCPUs
(XEN) Scrubbing Free RAM: .done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: All
(XEN) **********************************************
(XEN) ******* WARNING: CONSOLE OUTPUT IS SYNCHRONOUS
(XEN) ******* This option is intended to aid debugging of Xen by ensuring
(XEN) ******* that all output is synchronously delivered on the serial line.
(XEN) ******* However it can introduce SIGNIFICANT latencies and affect
(XEN) ******* timekeeping. It is NOT recommended for production use!
(XEN) **********************************************
(XEN) 3... 2... 1...
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 228kB init memory.
mapping kernel into physical memory
Xen: setup ISA identity maps
about to get started...

and then it is just stuck.. This is v3.1-rc7 + your patches
Konrad Rzeszutek Wilk
2011-Sep-26 14:20 UTC
Re: [Xen-devel] Re: xen: memory initialization/balloon fixes (#3)
> They don't work on AMD boxes:
>
> mapping kernel into physical memory
> Xen: setup ISA identity maps
> about to get started...
>
> and then it is just stuck.. This is v3.1-rc7 + your patches

This is a Dell T105 w/ 4GB of RAM. The config file is attached.

earlyprintk=xen shows nothing, .. interestingly enough Ctrl-A works - it
is just that I can't see anything from the Linux kernel. It probably is
stuck in a loop...
David Vrabel
2011-Sep-27 14:09 UTC
[Xen-devel] Re: xen: memory initialization/balloon fixes (#3)
On 24/09/11 03:08, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 15, 2011 at 01:29:21PM +0100, David Vrabel wrote:
>> This set of patches fixes some bugs in the memory initialization under
>> Xen and in Xen's memory balloon driver. They can make 100s of MB of
>> additional RAM available (depending on the system/configuration).
>>
>> Patch 1 is already applied.
>>
>> Patch 2 fixes a bug in patch 1 and should be queued for 3.1 (and along
>> with patch 1 considered for 3.0 stable).
>>
>> Patch 3 is a bug fix and should be queued for 3.1 and possibly
>> queued for the 3.0 stable tree.
>>
>> Patches 5 & 6 increase the amount of low memory in 32 bit domains
>> started with < 1 GiB of RAM. Please queue for 3.2.
>>
>> Patch 7 releases all pages in the initial allocation with PFNs that
>> lie within a 1-1 mapping. This seems correct to me as I think that
>> once the 1-1 mapping is set the MFN of the original page is lost so
>> it's no longer accessible by the kernel (and it cannot be used by
>> another domain).
>>
>> Changes since #2:
>>
>> - New patch: xen: avoid adding non-existent memory if the reservation
>>   is unlimited
>> - Avoid using a hypercall to get the current number of pages in the
>>   balloon driver. Apparently the hypercall won't return the right
>>   value if paging is used.
>> - Addresses Konrad's review comments.
>
> They don't work on AMD boxes:

It's not specific to AMD boxes.

> (XEN) Xen-e820 RAM map:
> (XEN)  0000000000000000 - 000000000009d800 (usable)

It's because it's not correctly handling the half-page of RAM at the end
of this region.

I don't have access to any test boxes with a dodgy BIOS like this so can
you test this patch? If it works I'll fold it in and post an updated
series.

Can you remember why this page alignment was required? I'd like to
update the comment with the reason because the bare-metal x86 memory
init code doesn't appear to fix up the memory map in this way.

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 986661b..e473c4c 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -178,6 +178,19 @@ static unsigned long __init xen_get_max_pages(void)
 	return min(max_pages, MAX_DOMAIN_PAGES);
 }
 
+static void xen_e820_add_region(u64 start, u64 size, int type)
+{
+	u64 end = start + size;
+
+	/* Align RAM regions to page boundaries. */
+	if (type == E820_RAM || type == E820_UNUSABLE) {
+		start = PAGE_ALIGN(start);
+		end &= ~((u64)PAGE_SIZE - 1);
+	}
+
+	e820_add_region(start, end - start, type);
+}
+
 /**
  * machine_specific_memory_setup - Hook for machine specific memory setup.
  **/
@@ -253,10 +266,6 @@ char * __init xen_memory_setup(void)
 		u32 type = map[i].type;
 
 		if (type == E820_RAM) {
-			/* RAM regions must be page aligned. */
-			size -= (addr + size) % PAGE_SIZE;
-			addr = PAGE_ALIGN(addr);
-
 			if (addr < mem_end) {
 				size = min(size, mem_end - addr);
 			} else if (extra_pages) {
@@ -267,7 +276,7 @@ char * __init xen_memory_setup(void)
 			type = E820_UNUSABLE;
 		}
 
-		e820_add_region(addr, size, type);
+		xen_e820_add_region(addr, size, type);
 
 		map[i].addr += size;
 		map[i].size -= size;

David
Konrad Rzeszutek Wilk
2011-Sep-27 23:10 UTC
Re: [Xen-devel] Re: xen: memory initialization/balloon fixes (#3)
> > (XEN) Xen-e820 RAM map:
> > (XEN)  0000000000000000 - 000000000009d800 (usable)
>
> It's because it's not correctly handling the half-page of RAM at the end
> of this region.
>
> I don't have access to any test boxes with a dodgy BIOS like this so can
> you test this patch? If it works I'll fold it in and post an updated
> series.

It works. Albeit I think we are going to hit a problem with dmidecode
if the DMI data is right in the reserved region
(http://lists.xensource.com/archives/html/xen-devel/2011-09/msg01299.html)

As in, if it starts at 9D800 - we consider 0->9d as RAM PFNs, and
9e->100 as the 1-1 mapping.

I am thinking that perhaps the call to xen_set_phys_identity, where
we call PFN_UP(x), should be replaced with PFN_DOWN(x). That way
we would consider 0->9c as RAM PFNs and 9d->100 as the 1-1 mapping.

That would imply a new patch to your series naturally.

> Can you remember why this page alignment was required? I'd like to

The e820_* calls define how the memory subsystem will use it.
It ended up at some point assuming that the full page was RAM even
though it was only half RAM, and tried to use it and blew the machine
up.

The fix was to make the calls to e820_* with sizes and regions
that were page-aligned.

Anyhow, here is what the bootup looks like now:

[    0.000000] Freeing 9e-a0 pfn range: 2 pages freed
[    0.000000] 1-1 mapping on 9e->a0
[    0.000000] Freeing a0-100 pfn range: 96 pages freed
[    0.000000] 1-1 mapping on a0->100
[    0.000000] Freeing 7fff0-80000 pfn range: 16 pages freed
[    0.000000] 1-1 mapping on 7fff0->80000
[    0.000000] Freeing cfef0-cfef5 pfn range: 5 pages freed
[    0.000000] 1-1 mapping on cfef0->cfef5
[    0.000000] Freeing cfef5-cff7f pfn range: 138 pages freed
[    0.000000] 1-1 mapping on cfef5->cff7f
[    0.000000] Freeing cff7f-d0000 pfn range: 129 pages freed
[    0.000000] 1-1 mapping on cff7f->d0000
[    0.000000] Freeing d0000-f0000 pfn range: 131072 pages freed
[    0.000000] 1-1 mapping on d0000->f0000
[    0.000000] Freeing f0000-f4b58 pfn range: 19288 pages freed
[    0.000000] 1-1 mapping on f0000->fec10
[    0.000000] 1-1 mapping on fec10->fee01
[    0.000000] 1-1 mapping on fee01->100000
[    0.000000] Released 150746 pages of unused memory
[    0.000000] Set 196994 page(s) to 1-1 mapping
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  Xen: 0000000000000000 - 000000000009d000 (usable)
[    0.000000]  Xen: 000000000009d800 - 0000000000100000 (reserved)
[    0.000000]  Xen: 0000000000100000 - 000000007fff0000 (usable)
[    0.000000]  Xen: 000000007fff0000 - 0000000080000000 (reserved)

> update the comment with the reason because the bare-metal x86 memory
> init code doesn't appear to fix up the memory map in this way.
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index 986661b..e473c4c 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -178,6 +178,19 @@ static unsigned long __init xen_get_max_pages(void)
>  	return min(max_pages, MAX_DOMAIN_PAGES);
>  }
>
> +static void xen_e820_add_region(u64 start, u64 size, int type)
> +{
> +	u64 end = start + size;
> +
> +	/* Align RAM regions to page boundaries. */
> +	if (type == E820_RAM || type == E820_UNUSABLE) {

Hm, do we care about E820_UNUSABLE being page-aligned?
If so, please comment why.

> +		start = PAGE_ALIGN(start);

Is that actually safe? Say it starts at 9ffff? We would
end up using 9f000, which is not right.
David Vrabel
2011-Sep-28 10:45 UTC
Re: [Xen-devel] Re: xen: memory initialization/balloon fixes (#3)
On 28/09/11 00:10, Konrad Rzeszutek Wilk wrote:
>>> (XEN) Xen-e820 RAM map:
>>> (XEN)  0000000000000000 - 000000000009d800 (usable)
>>
>> It's because it's not correctly handling the half-page of RAM at the end
>> of this region.
>>
>> I don't have access to any test boxes with a dodgy BIOS like this so can
>> you test this patch? If it works I'll fold it in and post an updated
>> series.
>
> It works. Albeit I think we are going to hit a problem with dmidecode
> if the DMI data is right in the reserved region
> (http://lists.xensource.com/archives/html/xen-devel/2011-09/msg01299.html)
>
> As in, if it starts at 9D800 - we consider 0->9d as RAM PFNs, and
> 9e->100 as the 1-1 mapping.
>
> I am thinking that perhaps the call to xen_set_phys_identity, where
> we call PFN_UP(x), should be replaced with PFN_DOWN(x). That way
> we would consider 0->9c as RAM PFNs and 9d->100 as the 1-1 mapping.

I almost did an equivalent change (see below) but discarded it as it
would have resulted in overlapping regions and attempting to
release/map some pages twice.

I think we will have to move the release/map until after the final e820
map has been sanitized so there are no overlapping regions.

I'll prepare another patch for this.

> That would imply a new patch to your series naturally.
>>
>> Can you remember why this page alignment was required? I'd like to
>
> The e820_* calls define how the memory subsystem will use it.
> It ended up at some point assuming that the full page was RAM even
> though it was only half RAM, and tried to use it and blew the machine
> up.
>
> The fix was to make the calls to e820_* with sizes and regions
> that were page-aligned.
>
> Anyhow, here is what the bootup looks like now:
>
> [    0.000000] Freeing 9e-a0 pfn range: 2 pages freed
> [    0.000000] 1-1 mapping on 9e->a0
> [    0.000000] Freeing a0-100 pfn range: 96 pages freed
> [    0.000000] 1-1 mapping on a0->100
> [    0.000000] Freeing 7fff0-80000 pfn range: 16 pages freed
> [    0.000000] 1-1 mapping on 7fff0->80000
> [    0.000000] Freeing cfef0-cfef5 pfn range: 5 pages freed
> [    0.000000] 1-1 mapping on cfef0->cfef5
> [    0.000000] Freeing cfef5-cff7f pfn range: 138 pages freed
> [    0.000000] 1-1 mapping on cfef5->cff7f
> [    0.000000] Freeing cff7f-d0000 pfn range: 129 pages freed
> [    0.000000] 1-1 mapping on cff7f->d0000
> [    0.000000] Freeing d0000-f0000 pfn range: 131072 pages freed
> [    0.000000] 1-1 mapping on d0000->f0000
> [    0.000000] Freeing f0000-f4b58 pfn range: 19288 pages freed
> [    0.000000] 1-1 mapping on f0000->fec10
> [    0.000000] 1-1 mapping on fec10->fee01
> [    0.000000] 1-1 mapping on fee01->100000
> [    0.000000] Released 150746 pages of unused memory
> [    0.000000] Set 196994 page(s) to 1-1 mapping
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000]  Xen: 0000000000000000 - 000000000009d000 (usable)
> [    0.000000]  Xen: 000000000009d800 - 0000000000100000 (reserved)
> [    0.000000]  Xen: 0000000000100000 - 000000007fff0000 (usable)
> [    0.000000]  Xen: 000000007fff0000 - 0000000080000000 (reserved)
>
>> update the comment with the reason because the bare-metal x86 memory
>> init code doesn't appear to fix up the memory map in this way.
>>
>> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
>> index 986661b..e473c4c 100644
>> --- a/arch/x86/xen/setup.c
>> +++ b/arch/x86/xen/setup.c
>> @@ -178,6 +178,19 @@ static unsigned long __init xen_get_max_pages(void)
>>  	return min(max_pages, MAX_DOMAIN_PAGES);
>>  }
>>
>> +static void xen_e820_add_region(u64 start, u64 size, int type)
>> +{
>> +	u64 end = start + size;
>> +
>> +	/* Align RAM regions to page boundaries. */
>> +	if (type == E820_RAM || type == E820_UNUSABLE) {
>
> Hm, do we care about E820_UNUSABLE being page-aligned?
> If so, please comment why.

Er. We don't really, but I think this if needs to be:

	/*
	 * Page align regions.
	 *
	 * Reduce RAM regions and expand other (reserved) regions.
	 */
	if (type == E820_RAM || type == E820_UNUSABLE) {
		start = PAGE_ALIGN(start);
		end &= ~((u64)PAGE_SIZE - 1);
	} else {
		start &= ~((u64)PAGE_SIZE - 1);
		end = PAGE_ALIGN(end);
	}

So reserved regions also become page-aligned (which is part of the fix
for the dmidecode bug).

>> +		start = PAGE_ALIGN(start);
>
> Is that actually safe? Say it starts at 9ffff? We would
> end up using 9f000, which is not right.

PAGE_ALIGN() (and ALIGN()) round upwards.

David
Konrad Rzeszutek Wilk
2011-Sep-28 13:25 UTC
Re: [Xen-devel] Re: xen: memory initialization/balloon fixes (#3)
On Wed, Sep 28, 2011 at 11:45:02AM +0100, David Vrabel wrote:
> On 28/09/11 00:10, Konrad Rzeszutek Wilk wrote:
> >>> (XEN) Xen-e820 RAM map:
> >>> (XEN)  0000000000000000 - 000000000009d800 (usable)
> >>
> >> It's because it's not correctly handling the half-page of RAM at the end
> >> of this region.
> >>
> >> I don't have access to any test boxes with a dodgy BIOS like this so can
> >> you test this patch? If it works I'll fold it in and post an updated
> >> series.
> >
> > It works. Albeit I think we are going to hit a problem with dmidecode
> > if the DMI data is right in the reserved region
> > (http://lists.xensource.com/archives/html/xen-devel/2011-09/msg01299.html)
> >
> > As in, if it starts at 9D800 - we consider 0->9d as RAM PFNs, and
> > 9e->100 as the 1-1 mapping.
> >
> > I am thinking that perhaps the call to xen_set_phys_identity, where
> > we call PFN_UP(x), should be replaced with PFN_DOWN(x). That way
> > we would consider 0->9c as RAM PFNs and 9d->100 as the 1-1 mapping.
>
> I almost did an equivalent change (see below) but discarded it as it
> would have resulted in overlapping regions and attempting to
> release/map some pages twice.
>
> I think we will have to move the release/map until after the final e820
> map has been sanitized so there are no overlapping regions.

<nods> Fortunately for us, the overlap does not happen - they are just
next to each other. BTW, I think the Xen hypervisor does the E820
sanitisation, so there shouldn't be any funny entries.

> I'll prepare another patch for this.

OK.

> > That would imply a new patch to your series naturally.
> >>
> >> Can you remember why this page alignment was required? I'd like to
> >
> > The e820_* calls define how the memory subsystem will use it.
> > It ended up at some point assuming that the full page was RAM even
> > though it was only half RAM, and tried to use it and blew the machine
> > up.
> >
> > The fix was to make the calls to e820_* with sizes and regions
> > that were page-aligned.
> >
> > Anyhow, here is what the bootup looks like now:
> >
> > [    0.000000] Freeing 9e-a0 pfn range: 2 pages freed
> > [    0.000000] 1-1 mapping on 9e->a0
> > [    0.000000] Freeing a0-100 pfn range: 96 pages freed
> > [    0.000000] 1-1 mapping on a0->100
> > [    0.000000] Freeing 7fff0-80000 pfn range: 16 pages freed
> > [    0.000000] 1-1 mapping on 7fff0->80000
> > [    0.000000] Freeing cfef0-cfef5 pfn range: 5 pages freed
> > [    0.000000] 1-1 mapping on cfef0->cfef5
> > [    0.000000] Freeing cfef5-cff7f pfn range: 138 pages freed
> > [    0.000000] 1-1 mapping on cfef5->cff7f
> > [    0.000000] Freeing cff7f-d0000 pfn range: 129 pages freed
> > [    0.000000] 1-1 mapping on cff7f->d0000
> > [    0.000000] Freeing d0000-f0000 pfn range: 131072 pages freed
> > [    0.000000] 1-1 mapping on d0000->f0000
> > [    0.000000] Freeing f0000-f4b58 pfn range: 19288 pages freed
> > [    0.000000] 1-1 mapping on f0000->fec10
> > [    0.000000] 1-1 mapping on fec10->fee01
> > [    0.000000] 1-1 mapping on fee01->100000
> > [    0.000000] Released 150746 pages of unused memory
> > [    0.000000] Set 196994 page(s) to 1-1 mapping
> > [    0.000000] BIOS-provided physical RAM map:
> > [    0.000000]  Xen: 0000000000000000 - 000000000009d000 (usable)
> > [    0.000000]  Xen: 000000000009d800 - 0000000000100000 (reserved)
> > [    0.000000]  Xen: 0000000000100000 - 000000007fff0000 (usable)
> > [    0.000000]  Xen: 000000007fff0000 - 0000000080000000 (reserved)
> >
> >> update the comment with the reason because the bare-metal x86 memory
> >> init code doesn't appear to fix up the memory map in this way.
> >>
> >> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> >> index 986661b..e473c4c 100644
> >> --- a/arch/x86/xen/setup.c
> >> +++ b/arch/x86/xen/setup.c
> >> @@ -178,6 +178,19 @@ static unsigned long __init xen_get_max_pages(void)
> >>  	return min(max_pages, MAX_DOMAIN_PAGES);
> >>  }
> >>
> >> +static void xen_e820_add_region(u64 start, u64 size, int type)

Might as well call this function "xen_align_and_add_e820_region".

> >> +{
> >> +	u64 end = start + size;
> >> +
> >> +	/* Align RAM regions to page boundaries. */
> >> +	if (type == E820_RAM || type == E820_UNUSABLE) {
> >
> > Hm, do we care about E820_UNUSABLE being page-aligned?
> > If so, please comment why.
>
> Er. We don't really, but I think this if needs to be:
>
> 	/*
> 	 * Page align regions.
> 	 *
> 	 * Reduce RAM regions and expand other (reserved) regions.
> 	 */
> 	if (type == E820_RAM || type == E820_UNUSABLE) {
> 		start = PAGE_ALIGN(start);
> 		end &= ~((u64)PAGE_SIZE - 1);
> 	} else {
> 		start &= ~((u64)PAGE_SIZE - 1);
> 		end = PAGE_ALIGN(end);
> 	}
>
> So reserved regions also become page-aligned (which is part of the fix
> for the dmidecode bug).

<nods> That should be part of a separate patch (well, the dmidecode
patch), instead of the "infinite loop, won't boot on Konrad's machines
with non-standard E820" one.

> >> +		start = PAGE_ALIGN(start);
> >
> > Is that actually safe? Say it starts at 9ffff? We would
> > end up using 9f000, which is not right.
>
> PAGE_ALIGN() (and ALIGN()) round upwards.

<smacks his head> Right.
Konrad Rzeszutek Wilk
2011-Sep-28 13:47 UTC
Re: [Xen-devel] Re: xen: memory initialization/balloon fixes (#3)
> Er. We don't really, but I think this if needs to be:
>
> 	/*
> 	 * Page align regions.
> 	 *
> 	 * Reduce RAM regions and expand other (reserved) regions.
> 	 */
> 	if (type == E820_RAM || type == E820_UNUSABLE) {
> 		start = PAGE_ALIGN(start);
> 		end &= ~((u64)PAGE_SIZE - 1);
> 	} else {
> 		start &= ~((u64)PAGE_SIZE - 1);
> 		end = PAGE_ALIGN(end);
> 	}
>
> So reserved regions also become page-aligned (which is part of the fix
> for the dmidecode bug).

I am not sure that is actually required for the e820_* calls. Those
are used by 'ioremap' and the memory buddy system, and they only care
about the RAM regions. Everybody else assumes that the "gaps" and
"anything-but-RAM" regions are OK - as long as they don't touch the RAM
regions.

It certainly is required for the set_phys_to_identity(..) call. I am
trying here to be sure we don't mess it up - and I don't know what the
right answer is for the e820_* calls. Well, I do know what the right
answer is for RAM regions - they must be page-aligned. But for
reserved/non-RAM/ACPI/ACPI-NVS... ?

BIOS-e820: 00000000dfe8ac00 - 00000000dfe8cc00 (ACPI NVS)
BIOS-e820: 00000000dfe8cc00 - 00000000dfe8ec00 (ACPI data)
.. snip..
reserve RAM buffer: 00000000dfe8ac00 - 00000000dfffffff

or say:

BIOS-e820: 00000000bff66f00 - 00000000bff76300 (ACPI data)
..
[    1.026860] reserve RAM buffer: 00000000bff66f00 - 00000000bfffffff

They all seem to work OK without being page-aligned. I think we can
drop the page-alignment on non-RAM regions when we give them to e820_*.
We want to diverge as little as possible from what bare metal does.