David Vrabel
2011-Aug-16 10:00 UTC
[Xen-devel] [RFC] limit dom0 memory using Xen''s dom0_mem command line option
This set of patches limits the amount of memory dom0 can use by using Xen''s dom0_mem=max:NNN command line option. This can make extra memory available because less memory is wasted on page tables that are never used and fewer pages are reserved for low memory situations. In addition, the extra pages can be in two regions which allows more low memory to be available if dom0 intially starts with less than 1 GiB. Xen requires the patch "x86: use ''dom0_mem'' to limit the number of pages for dom0" to provide the correct information in the XENMEM_maximum_reservation memory op. Instead of this approach would it be better to limit dom0 memory to the initial number pages and use the recently added memory hotplug support in the balloon driver for all reservation increases? David _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-16 10:00 UTC
[Xen-devel] [PATCH 1/3] xen: allow balloon driver to use more than one memory region
Allow the xen balloon driver to populate its list of extra pages from more than one region of memory. This will allow platforms to provide (for example) a region of low memory and a region of high memory. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- Is there a better way of passing the memory information to the balloon driver? --- arch/x86/xen/setup.c | 20 ++++++++++---------- drivers/xen/balloon.c | 48 ++++++++++++++++++++++++++++-------------------- include/xen/page.h | 9 ++++++++- 3 files changed, 46 insertions(+), 31 deletions(-) diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index df118a8..30d0015 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -37,7 +37,7 @@ extern void xen_syscall_target(void); extern void xen_syscall32_target(void); /* Amount of extra memory space we add to the e820 ranges */ -phys_addr_t xen_extra_mem_start, xen_extra_mem_size; +struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; /* * The maximum amount of extra memory compared to the base size. The @@ -56,7 +56,7 @@ static void __init xen_add_extra_mem(unsigned long pages) unsigned long pfn; u64 size = (u64)pages * PAGE_SIZE; - u64 extra_start = xen_extra_mem_start + xen_extra_mem_size; + u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size; if (!pages) return; @@ -66,7 +66,7 @@ static void __init xen_add_extra_mem(unsigned long pages) memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA"); - xen_extra_mem_size += size; + xen_extra_mem[0].size += size; xen_max_p2m_pfn = PFN_DOWN(extra_start + size); @@ -226,7 +226,7 @@ char * __init xen_memory_setup(void) memcpy(map_raw, map, sizeof(map)); e820.nr_map = 0; - xen_extra_mem_start = mem_end; + xen_extra_mem[0].start = mem_end; for (i = 0; i < memmap.nr_entries; i++) { unsigned long long end; @@ -254,8 +254,8 @@ char * __init xen_memory_setup(void) e820_add_region(end, delta, E820_UNUSABLE); } - if (map[i].size > 0 && end > xen_extra_mem_start) - xen_extra_mem_start = end; + if (map[i].size > 0 && end > xen_extra_mem[0].start) + xen_extra_mem[0].start = end; /* Add region if any remains */ if (map[i].size > 0) @@ -263,10 +263,10 @@ char * __init xen_memory_setup(void) } /* Align the balloon area so that max_low_pfn does not get set * to be at the _end_ of the PCI gap at the far end (fee01000). - * Note that xen_extra_mem_start gets set in the loop above to be - * past the last E820 region. */ - if (xen_initial_domain() && (xen_extra_mem_start < (1ULL<<32))) - xen_extra_mem_start = (1ULL<<32); + * Note that the start of balloon area gets set in the loop above + * to be past the last E820 region. */ + if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32))) + xen_extra_mem[0].start = (1ULL<<32); /* * In domU, the ISA region is normal, usable memory, but we diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 5dfd8f8..de7278a 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -555,11 +555,35 @@ void free_xenballooned_pages(int nr_pages, struct page** pages) } EXPORT_SYMBOL(free_xenballooned_pages); -static int __init balloon_init(void) +static void __init balloon_add_memory_region(unsigned long start_pfn, unsigned long pages) { unsigned long pfn, extra_pfn_end; struct page *page; + /* + * Initialise the balloon with excess memory space. We need + * to make sure we don''t add memory which doesn''t exist or + * logically exist. The E820 map can be trimmed to be smaller + * than the amount of physical memory due to the mem= command + * line parameter. And if this is a 32-bit non-HIGHMEM kernel + * on a system with memory which requires highmem to access, + * don''t try to use it. + */ + extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()), + start_pfn + pages); + for (pfn = start_pfn; pfn < extra_pfn_end; pfn++) { + page = pfn_to_page(pfn); + /* totalram_pages and totalhigh_pages do not + include the boot-time balloon extension, so + don''t subtract from it. */ + __balloon_append(page); + } +} + +static int __init balloon_init(void) +{ + int r; + if (!xen_domain()) return -ENODEV; @@ -583,25 +607,9 @@ static int __init balloon_init(void) register_memory_notifier(&xen_memory_nb); #endif - /* - * Initialise the balloon with excess memory space. We need - * to make sure we don''t add memory which doesn''t exist or - * logically exist. The E820 map can be trimmed to be smaller - * than the amount of physical memory due to the mem= command - * line parameter. And if this is a 32-bit non-HIGHMEM kernel - * on a system with memory which requires highmem to access, - * don''t try to use it. - */ - extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()), - (unsigned long)PFN_DOWN(xen_extra_mem_start + xen_extra_mem_size)); - for (pfn = PFN_UP(xen_extra_mem_start); - pfn < extra_pfn_end; - pfn++) { - page = pfn_to_page(pfn); - /* totalram_pages and totalhigh_pages do not include the boot-time - balloon extension, so don''t subtract from it. */ - __balloon_append(page); - } + for (r = 0; r < XEN_EXTRA_MEM_MAX_REGIONS; r++) + balloon_add_memory_region(PFN_UP(xen_extra_mem[r].start), + PFN_DOWN(xen_extra_mem[r].size)); return 0; } diff --git a/include/xen/page.h b/include/xen/page.h index 0be36b9..b01f6ac 100644 --- a/include/xen/page.h +++ b/include/xen/page.h @@ -3,6 +3,13 @@ #include <asm/xen/page.h> -extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size; +struct xen_memory_region { + phys_addr_t start; + phys_addr_t size; +}; + +#define XEN_EXTRA_MEM_MAX_REGIONS 1 + +extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; #endif /* _XEN_PAGE_H */ -- 1.7.4.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-16 10:00 UTC
[Xen-devel] [PATCH 2/3] xen: allow extra memory to be two regions
Allow the extra memory (used by the balloon driver) to be in two regions (typically low and high memory). This allows the balloon driver to increase the number of available low pages (if the initial number if pages is small). As a side effect, the algorithm for building the e820 memory map is simpler and more obviously correct as the map supplied by the hypervisor is (almost) used as is. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- arch/x86/xen/setup.c | 167 +++++++++++++++++++++++++------------------------- include/xen/page.h | 2 +- 2 files changed, 84 insertions(+), 85 deletions(-) diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index 30d0015..3421c9e 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -51,23 +51,29 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; */ #define EXTRA_MEM_RATIO (10) -static void __init xen_add_extra_mem(unsigned long pages) +static void __init xen_add_extra_mem(u64 extra_start, u64 size) { unsigned long pfn; + int i; - u64 size = (u64)pages * PAGE_SIZE; - u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size; - - if (!pages) - return; - - e820_add_region(extra_start, size, E820_RAM); - sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map); + for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++) { + /* Add new region. */ + if (xen_extra_mem[i].size == 0) { + xen_extra_mem[i].start = extra_start; + xen_extra_mem[i].size = size; + break; + } + /* Append to existing region. */ + if (xen_extra_mem[i].start + xen_extra_mem[i].size == extra_start) { + xen_extra_mem[i].size += size; + break; + } + } + if (i == XEN_EXTRA_MEM_MAX_REGIONS) + printk(KERN_WARNING "Warning: not enough extra memory regions\n"); memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA"); - xen_extra_mem[0].size += size; - xen_max_p2m_pfn = PFN_DOWN(extra_start + size); for (pfn = PFN_DOWN(extra_start); pfn <= xen_max_p2m_pfn; pfn++) @@ -118,7 +124,8 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr, } static unsigned long __init xen_return_unused_memory(unsigned long max_pfn, - const struct e820map *e820) + const struct e820entry *map, + int nr_map) { phys_addr_t max_addr = PFN_PHYS(max_pfn); phys_addr_t last_end = ISA_END_ADDRESS; @@ -126,13 +133,13 @@ static unsigned long __init xen_return_unused_memory(unsigned long max_pfn, int i; /* Free any unused memory above the low 1Mbyte. */ - for (i = 0; i < e820->nr_map && last_end < max_addr; i++) { - phys_addr_t end = e820->map[i].addr; + for (i = 0; i < nr_map && last_end < max_addr; i++) { + phys_addr_t end = map[i].addr; end = min(max_addr, end); if (last_end < end) released += xen_release_chunk(last_end, end); - last_end = max(last_end, e820->map[i].addr + e820->map[i].size); + last_end = max(last_end, map[i].addr + map[i].size); } if (last_end < max_addr) @@ -184,20 +191,31 @@ static unsigned long __init xen_set_identity(const struct e820entry *list, PFN_UP(start_pci), PFN_DOWN(last)); return identity; } + +static unsigned long __init xen_get_max_pages(void) +{ + unsigned long max_pages = MAX_DOMAIN_PAGES; /* Limited by memory map. */ + + if (xen_initial_domain()) { + /* FIXME: ask hypervisor for max pages. */ + } + + return min(max_pages, MAX_DOMAIN_PAGES); +} + /** * machine_specific_memory_setup - Hook for machine specific memory setup. **/ char * __init xen_memory_setup(void) { static struct e820entry map[E820MAX] __initdata; - static struct e820entry map_raw[E820MAX] __initdata; unsigned long max_pfn = xen_start_info->nr_pages; unsigned long long mem_end; int rc; struct xen_memory_map memmap; + unsigned long max_pages; unsigned long extra_pages = 0; - unsigned long extra_limit; unsigned long identity_pages = 0; int i; int op; @@ -224,49 +242,54 @@ char * __init xen_memory_setup(void) } BUG_ON(rc); - memcpy(map_raw, map, sizeof(map)); - e820.nr_map = 0; - xen_extra_mem[0].start = mem_end; - for (i = 0; i < memmap.nr_entries; i++) { - unsigned long long end; - - /* Guard against non-page aligned E820 entries. */ - if (map[i].type == E820_RAM) - map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE; - - end = map[i].addr + map[i].size; - if (map[i].type == E820_RAM && end > mem_end) { - /* RAM off the end - may be partially included */ - u64 delta = min(map[i].size, end - mem_end); - - map[i].size -= delta; - end -= delta; - - extra_pages += PFN_DOWN(delta); - /* - * Set RAM below 4GB that is not for us to be unusable. - * This prevents "System RAM" address space from being - * used as potential resource for I/O address (happens - * when ''allocate_resource'' is called). - */ - if (delta && - (xen_initial_domain() && end < 0x100000000ULL)) - e820_add_region(end, delta, E820_UNUSABLE); - } + /* Make sure the Xen-supplied memory map is well-ordered. */ + /* FIXME: is this necessary? Does Xen provide any guarantees + on this? */ + sanitize_e820_map(map, memmap.nr_entries, &memmap.nr_entries); - if (map[i].size > 0 && end > xen_extra_mem[0].start) - xen_extra_mem[0].start = end; + max_pages = xen_get_max_pages(); + if (max_pages > max_pfn) + extra_pages += max_pages - max_pfn; - /* Add region if any remains */ - if (map[i].size > 0) - e820_add_region(map[i].addr, map[i].size, map[i].type); - } - /* Align the balloon area so that max_low_pfn does not get set - * to be at the _end_ of the PCI gap at the far end (fee01000). - * Note that the start of balloon area gets set in the loop above - * to be past the last E820 region. */ - if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32))) - xen_extra_mem[0].start = (1ULL<<32); + extra_pages += xen_return_unused_memory(max_pfn, map, memmap.nr_entries); + + /* + * Clamp the amount of extra memory to a EXTRA_MEM_RATIO + * factor the base size. On non-highmem systems, the base + * size is the full initial memory allocation; on highmem it + * is limited to the max size of lowmem, so that it doesn''t + * get completely filled. + * + * In principle there could be a problem in lowmem systems if + * the initial memory is also very large with respect to + * lowmem, but we won''t try to deal with that here. + */ + extra_pages = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)), extra_pages); + + i = 0; + while (i < memmap.nr_entries) { + u64 addr = map[i].addr; + u64 size = map[i].size; + u32 type = map[i].type; + + if (type == E820_RAM) { + if (addr < mem_end) { + size = min(size, mem_end - addr); + } else if (extra_pages) { + size = min(size, (u64)extra_pages * PAGE_SIZE); + extra_pages -= size / PAGE_SIZE; + xen_add_extra_mem(addr, size); + } else + type = E820_UNUSABLE; + } + + e820_add_region(addr, size, type); + + map[i].addr += size; + map[i].size -= size; + if (map[i].size == 0) + i++; + } /* * In domU, the ISA region is normal, usable memory, but we @@ -292,35 +315,11 @@ char * __init xen_memory_setup(void) sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map); - extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820); - - /* - * Clamp the amount of extra memory to a EXTRA_MEM_RATIO - * factor the base size. On non-highmem systems, the base - * size is the full initial memory allocation; on highmem it - * is limited to the max size of lowmem, so that it doesn''t - * get completely filled. - * - * In principle there could be a problem in lowmem systems if - * the initial memory is also very large with respect to - * lowmem, but we won''t try to deal with that here. - */ - extra_limit = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)), - max_pfn + extra_pages); - - if (extra_limit >= max_pfn) - extra_pages = extra_limit - max_pfn; - else - extra_pages = 0; - - xen_add_extra_mem(extra_pages); - /* * Set P2M for all non-RAM pages and E820 gaps to be identity - * type PFNs. We supply it with the non-sanitized version - * of the E820. + * type PFNs. */ - identity_pages = xen_set_identity(map_raw, memmap.nr_entries); + identity_pages = xen_set_identity(e820.map, e820.nr_map); printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity_pages); return "Xen"; } diff --git a/include/xen/page.h b/include/xen/page.h index b01f6ac..5a13bb3 100644 --- a/include/xen/page.h +++ b/include/xen/page.h @@ -8,7 +8,7 @@ struct xen_memory_region { phys_addr_t size; }; -#define XEN_EXTRA_MEM_MAX_REGIONS 1 +#define XEN_EXTRA_MEM_MAX_REGIONS 2 extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; -- 1.7.4.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-16 10:00 UTC
[Xen-devel] [PATCH 3/3] xen: use maximum reservation to limit dom0 memory
Use the maximum reservation hypercall to set limit the amount of usable dom0 memory. This reduces the size of pages tables etc. if dom0 is to use less memory than the maximum available. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- Note this requires a patched Xen that sets max_pages when creating dom0. --- arch/x86/xen/setup.c | 7 ++++++- 1 files changed, 6 insertions(+), 1 deletions(-) diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index 3421c9e..584e7dc 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -197,7 +197,12 @@ static unsigned long __init xen_get_max_pages(void) unsigned long max_pages = MAX_DOMAIN_PAGES; /* Limited by memory map. */ if (xen_initial_domain()) { - /* FIXME: ask hypervisor for max pages. */ + domid_t domid = DOMID_SELF; + int ret; + + ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid); + if (ret > 0) + max_pages = ret; } return min(max_pages, MAX_DOMAIN_PAGES); -- 1.7.4.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 13:33 UTC
[Xen-devel] Re: [RFC] limit dom0 memory using Xen''s dom0_mem command line option
On Tue, Aug 16, 2011 at 11:00:35AM +0100, David Vrabel wrote:> This set of patches limits the amount of memory dom0 can use by using > Xen''s dom0_mem=max:NNN command line option. This can make extra > memory available because less memory is wasted on page tables that are > never used and fewer pages are reserved for low memory situations. > > In addition, the extra pages can be in two regions which allows more > low memory to be available if dom0 intially starts with less than 1 > GiB. > > Xen requires the patch "x86: use ''dom0_mem'' to limit the number of > pages for dom0" to provide the correct information in the > XENMEM_maximum_reservation memory op. > > Instead of this approach would it be better to limit dom0 memory to > the initial number pages and use the recently added memory hotplug > support in the balloon driver for all reservation increases?I was thinking about it, but one interesting thing that Stefano and Jeremy pointed out is that we need to have some amount of "reserved" memory at the start of the machine. We need that "reserved" memory so that the Linux guest can allocate ''struct page'' for non-existent memory - and then those ''struct page'' we can use for sharing pages across domains. So if you use the memory hotplug, we might not have that reserved memory during bootup - and we need it. Perhaps there is a way to compromise on this? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 13:38 UTC
[Xen-devel] Re: [PATCH 1/3] xen: allow balloon driver to use more than one memory region
On Tue, Aug 16, 2011 at 11:00:36AM +0100, David Vrabel wrote:> Allow the xen balloon driver to populate its list of extra pages from > more than one region of memory. This will allow platforms to provide > (for example) a region of low memory and a region of high memory.What does this solve? Is this a requirement for another patch? If so please specify the name of it.> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > Is there a better way of passing the memory information to the balloon > driver?I think the way you have it is OK.> --- > arch/x86/xen/setup.c | 20 ++++++++++---------- > drivers/xen/balloon.c | 48 ++++++++++++++++++++++++++++-------------------- > include/xen/page.h | 9 ++++++++- > 3 files changed, 46 insertions(+), 31 deletions(-) > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > index df118a8..30d0015 100644 > --- a/arch/x86/xen/setup.c > +++ b/arch/x86/xen/setup.c > @@ -37,7 +37,7 @@ extern void xen_syscall_target(void); > extern void xen_syscall32_target(void); > > /* Amount of extra memory space we add to the e820 ranges */ > -phys_addr_t xen_extra_mem_start, xen_extra_mem_size; > +struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; > > /* > * The maximum amount of extra memory compared to the base size. The > @@ -56,7 +56,7 @@ static void __init xen_add_extra_mem(unsigned long pages) > unsigned long pfn; > > u64 size = (u64)pages * PAGE_SIZE; > - u64 extra_start = xen_extra_mem_start + xen_extra_mem_size; > + u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size;Wouldn''t this be for [1]?> > if (!pages) > return; > @@ -66,7 +66,7 @@ static void __init xen_add_extra_mem(unsigned long pages) > > memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA"); > > - xen_extra_mem_size += size; > + xen_extra_mem[0].size += size; > > xen_max_p2m_pfn = PFN_DOWN(extra_start + size); > > @@ -226,7 +226,7 @@ char * __init xen_memory_setup(void) > > memcpy(map_raw, map, sizeof(map)); > e820.nr_map = 0; > - xen_extra_mem_start = mem_end; > + xen_extra_mem[0].start = mem_end; > for (i = 0; i < memmap.nr_entries; i++) { > unsigned long long end; > > @@ -254,8 +254,8 @@ char * __init xen_memory_setup(void) > e820_add_region(end, delta, E820_UNUSABLE); > } > > - if (map[i].size > 0 && end > xen_extra_mem_start) > - xen_extra_mem_start = end; > + if (map[i].size > 0 && end > xen_extra_mem[0].start) > + xen_extra_mem[0].start = end; > > /* Add region if any remains */ > if (map[i].size > 0) > @@ -263,10 +263,10 @@ char * __init xen_memory_setup(void) > } > /* Align the balloon area so that max_low_pfn does not get set > * to be at the _end_ of the PCI gap at the far end (fee01000). > - * Note that xen_extra_mem_start gets set in the loop above to be > - * past the last E820 region. */ > - if (xen_initial_domain() && (xen_extra_mem_start < (1ULL<<32))) > - xen_extra_mem_start = (1ULL<<32); > + * Note that the start of balloon area gets set in the loop above > + * to be past the last E820 region. */ > + if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32))) > + xen_extra_mem[0].start = (1ULL<<32);So what about the highmem memory? Should there a be a check to move the lowmem to highmem count?> > /* > * In domU, the ISA region is normal, usable memory, but we > diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c > index 5dfd8f8..de7278a 100644 > --- a/drivers/xen/balloon.c > +++ b/drivers/xen/balloon.c > @@ -555,11 +555,35 @@ void free_xenballooned_pages(int nr_pages, struct page** pages) > } > EXPORT_SYMBOL(free_xenballooned_pages); > > -static int __init balloon_init(void) > +static void __init balloon_add_memory_region(unsigned long start_pfn, unsigned long pages) > { > unsigned long pfn, extra_pfn_end; > struct page *page; > > + /* > + * Initialise the balloon with excess memory space. We need > + * to make sure we don''t add memory which doesn''t exist or > + * logically exist. The E820 map can be trimmed to be smaller > + * than the amount of physical memory due to the mem= command > + * line parameter. And if this is a 32-bit non-HIGHMEM kernel > + * on a system with memory which requires highmem to access, > + * don''t try to use it. > + */ > + extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()), > + start_pfn + pages); > + for (pfn = start_pfn; pfn < extra_pfn_end; pfn++) { > + page = pfn_to_page(pfn); > + /* totalram_pages and totalhigh_pages do not > + include the boot-time balloon extension, so > + don''t subtract from it. */ > + __balloon_append(page); > + } > +} > + > +static int __init balloon_init(void) > +{ > + int r; > + > if (!xen_domain()) > return -ENODEV; > > @@ -583,25 +607,9 @@ static int __init balloon_init(void) > register_memory_notifier(&xen_memory_nb); > #endif > > - /* > - * Initialise the balloon with excess memory space. We need > - * to make sure we don''t add memory which doesn''t exist or > - * logically exist. The E820 map can be trimmed to be smaller > - * than the amount of physical memory due to the mem= command > - * line parameter. And if this is a 32-bit non-HIGHMEM kernel > - * on a system with memory which requires highmem to access, > - * don''t try to use it. > - */ > - extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()), > - (unsigned long)PFN_DOWN(xen_extra_mem_start + xen_extra_mem_size)); > - for (pfn = PFN_UP(xen_extra_mem_start); > - pfn < extra_pfn_end; > - pfn++) { > - page = pfn_to_page(pfn); > - /* totalram_pages and totalhigh_pages do not include the boot-time > - balloon extension, so don''t subtract from it. */ > - __balloon_append(page); > - } > + for (r = 0; r < XEN_EXTRA_MEM_MAX_REGIONS; r++)You probably should also check to make sure that the values are actually valid. Like if (!xedn_extra_mem[r].start) continue;> + balloon_add_memory_region(PFN_UP(xen_extra_mem[r].start), > + PFN_DOWN(xen_extra_mem[r].size)); > > return 0; > } > diff --git a/include/xen/page.h b/include/xen/page.h > index 0be36b9..b01f6ac 100644 > --- a/include/xen/page.h > +++ b/include/xen/page.h > @@ -3,6 +3,13 @@ > > #include <asm/xen/page.h> > > -extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size; > +struct xen_memory_region { > + phys_addr_t start; > + phys_addr_t size; > +}; > + > +#define XEN_EXTRA_MEM_MAX_REGIONS 1 > + > +extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; > > #endif /* _XEN_PAGE_H */ > -- > 1.7.4.1_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 13:48 UTC
Re: [Xen-devel] [PATCH 2/3] xen: allow extra memory to be two regions
On Tue, Aug 16, 2011 at 11:00:37AM +0100, David Vrabel wrote:> Allow the extra memory (used by the balloon driver) to be in two > regions (typically low and high memory). This allows the balloon > driver to increase the number of available low pages (if the initial > number if pages is small). > > As a side effect, the algorithm for building the e820 memory map is > simpler and more obviously correct as the map supplied by the > hypervisor is (almost) used as is.Hm, which is not always good. The setting of ''E820_RESERVED'' and ''E820_UNUSABLE'', and realigning of start of balloon space at 4GB (if necessary) changes need to be preserved. You can look up the why if you run ''git annotate'' and look at those lines - we had lots of time getting those right. You also need to provide the virgin copy of the E820 to the xen_set_identity and not use the same version that is modified - which with the above setting won''t work. I am curious - with the patch to the hypervisor - and with just a newly implemented xen_get_max_pages() code path added to query the new truncated amount of how many pages we need - wont that solve the problem?> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > arch/x86/xen/setup.c | 167 +++++++++++++++++++++++++------------------------- > include/xen/page.h | 2 +- > 2 files changed, 84 insertions(+), 85 deletions(-) > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > index 30d0015..3421c9e 100644 > --- a/arch/x86/xen/setup.c > +++ b/arch/x86/xen/setup.c > @@ -51,23 +51,29 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; > */ > #define EXTRA_MEM_RATIO (10) > > -static void __init xen_add_extra_mem(unsigned long pages) > +static void __init xen_add_extra_mem(u64 extra_start, u64 size) > { > unsigned long pfn; > + int i; > > - u64 size = (u64)pages * PAGE_SIZE; > - u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size; > - > - if (!pages) > - return; > - > - e820_add_region(extra_start, size, E820_RAM); > - sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map); > + for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++) { > + /* Add new region. */ > + if (xen_extra_mem[i].size == 0) { > + xen_extra_mem[i].start = extra_start; > + xen_extra_mem[i].size = size; > + break; > + } > + /* Append to existing region. */ > + if (xen_extra_mem[i].start + xen_extra_mem[i].size == extra_start) { > + xen_extra_mem[i].size += size; > + break; > + } > + } > + if (i == XEN_EXTRA_MEM_MAX_REGIONS) > + printk(KERN_WARNING "Warning: not enough extra memory regions\n"); > > memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA"); > > - xen_extra_mem[0].size += size; > - > xen_max_p2m_pfn = PFN_DOWN(extra_start + size); > > for (pfn = PFN_DOWN(extra_start); pfn <= xen_max_p2m_pfn; pfn++) > @@ -118,7 +124,8 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr, > } > > static unsigned long __init xen_return_unused_memory(unsigned long max_pfn, > - const struct e820map *e820) > + const struct e820entry *map, > + int nr_map) > { > phys_addr_t max_addr = PFN_PHYS(max_pfn); > phys_addr_t last_end = ISA_END_ADDRESS; > @@ -126,13 +133,13 @@ static unsigned long __init xen_return_unused_memory(unsigned long max_pfn, > int i; > > /* Free any unused memory above the low 1Mbyte. */ > - for (i = 0; i < e820->nr_map && last_end < max_addr; i++) { > - phys_addr_t end = e820->map[i].addr; > + for (i = 0; i < nr_map && last_end < max_addr; i++) { > + phys_addr_t end = map[i].addr; > end = min(max_addr, end); > > if (last_end < end) > released += xen_release_chunk(last_end, end); > - last_end = max(last_end, e820->map[i].addr + e820->map[i].size); > + last_end = max(last_end, map[i].addr + map[i].size); > } > > if (last_end < max_addr) > @@ -184,20 +191,31 @@ static unsigned long __init xen_set_identity(const struct e820entry *list, > PFN_UP(start_pci), PFN_DOWN(last)); > return identity; > } > + > +static unsigned long __init xen_get_max_pages(void) > +{ > + unsigned long max_pages = MAX_DOMAIN_PAGES; /* Limited by memory map. */ > + > + if (xen_initial_domain()) { > + /* FIXME: ask hypervisor for max pages. */ > + } > + > + return min(max_pages, MAX_DOMAIN_PAGES); > +} > + > /** > * machine_specific_memory_setup - Hook for machine specific memory setup. > **/ > char * __init xen_memory_setup(void) > { > static struct e820entry map[E820MAX] __initdata; > - static struct e820entry map_raw[E820MAX] __initdata; > > unsigned long max_pfn = xen_start_info->nr_pages; > unsigned long long mem_end; > int rc; > struct xen_memory_map memmap; > + unsigned long max_pages; > unsigned long extra_pages = 0; > - unsigned long extra_limit; > unsigned long identity_pages = 0; > int i; > int op; > @@ -224,49 +242,54 @@ char * __init xen_memory_setup(void) > } > BUG_ON(rc); > > - memcpy(map_raw, map, sizeof(map)); > - e820.nr_map = 0; > - xen_extra_mem[0].start = mem_end; > - for (i = 0; i < memmap.nr_entries; i++) { > - unsigned long long end; > - > - /* Guard against non-page aligned E820 entries. */ > - if (map[i].type == E820_RAM) > - map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE; > - > - end = map[i].addr + map[i].size; > - if (map[i].type == E820_RAM && end > mem_end) { > - /* RAM off the end - may be partially included */ > - u64 delta = min(map[i].size, end - mem_end); > - > - map[i].size -= delta; > - end -= delta; > - > - extra_pages += PFN_DOWN(delta); > - /* > - * Set RAM below 4GB that is not for us to be unusable. > - * This prevents "System RAM" address space from being > - * used as potential resource for I/O address (happens > - * when ''allocate_resource'' is called). > - */ > - if (delta && > - (xen_initial_domain() && end < 0x100000000ULL)) > - e820_add_region(end, delta, E820_UNUSABLE); > - } > + /* Make sure the Xen-supplied memory map is well-ordered. */ > + /* FIXME: is this necessary? Does Xen provide any guarantees > + on this? */ > + sanitize_e820_map(map, memmap.nr_entries, &memmap.nr_entries); > > - if (map[i].size > 0 && end > xen_extra_mem[0].start) > - xen_extra_mem[0].start = end; > + max_pages = xen_get_max_pages(); > + if (max_pages > max_pfn) > + extra_pages += max_pages - max_pfn; > > - /* Add region if any remains */ > - if (map[i].size > 0) > - e820_add_region(map[i].addr, map[i].size, map[i].type); > - } > - /* Align the balloon area so that max_low_pfn does not get set > - * to be at the _end_ of the PCI gap at the far end (fee01000). > - * Note that the start of balloon area gets set in the loop above > - * to be past the last E820 region. */ > - if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32))) > - xen_extra_mem[0].start = (1ULL<<32); > + extra_pages += xen_return_unused_memory(max_pfn, map, memmap.nr_entries); > + > + /* > + * Clamp the amount of extra memory to a EXTRA_MEM_RATIO > + * factor the base size. On non-highmem systems, the base > + * size is the full initial memory allocation; on highmem it > + * is limited to the max size of lowmem, so that it doesn''t > + * get completely filled. > + * > + * In principle there could be a problem in lowmem systems if > + * the initial memory is also very large with respect to > + * lowmem, but we won''t try to deal with that here. > + */ > + extra_pages = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)), extra_pages); > + > + i = 0; > + while (i < memmap.nr_entries) { > + u64 addr = map[i].addr; > + u64 size = map[i].size; > + u32 type = map[i].type; > + > + if (type == E820_RAM) { > + if (addr < mem_end) { > + size = min(size, mem_end - addr); > + } else if (extra_pages) { > + size = min(size, (u64)extra_pages * PAGE_SIZE); > + extra_pages -= size / PAGE_SIZE; > + xen_add_extra_mem(addr, size); > + } else > + type = E820_UNUSABLE; > + } > + > + e820_add_region(addr, size, type); > + > + map[i].addr += size; > + map[i].size -= size; > + if (map[i].size == 0) > + i++; > + } > > /* > * In domU, the ISA region is normal, usable memory, but we > @@ -292,35 +315,11 @@ char * __init xen_memory_setup(void) > > sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map); > > - extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820); > - > - /* > - * Clamp the amount of extra memory to a EXTRA_MEM_RATIO > - * factor the base size. On non-highmem systems, the base > - * size is the full initial memory allocation; on highmem it > - * is limited to the max size of lowmem, so that it doesn''t > - * get completely filled. > - * > - * In principle there could be a problem in lowmem systems if > - * the initial memory is also very large with respect to > - * lowmem, but we won''t try to deal with that here. > - */ > - extra_limit = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)), > - max_pfn + extra_pages); > - > - if (extra_limit >= max_pfn) > - extra_pages = extra_limit - max_pfn; > - else > - extra_pages = 0; > - > - xen_add_extra_mem(extra_pages); > - > /* > * Set P2M for all non-RAM pages and E820 gaps to be identity > - * type PFNs. We supply it with the non-sanitized version > - * of the E820. > + * type PFNs. > */ > - identity_pages = xen_set_identity(map_raw, memmap.nr_entries); > + identity_pages = xen_set_identity(e820.map, e820.nr_map); > printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity_pages); > return "Xen"; > } > diff --git a/include/xen/page.h b/include/xen/page.h > index b01f6ac..5a13bb3 100644 > --- a/include/xen/page.h > +++ b/include/xen/page.h > @@ -8,7 +8,7 @@ struct xen_memory_region { > phys_addr_t size; > }; > > -#define XEN_EXTRA_MEM_MAX_REGIONS 1 > +#define XEN_EXTRA_MEM_MAX_REGIONS 2 > > extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; > > -- > 1.7.4.1 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 13:53 UTC
[Xen-devel] Re: [PATCH 3/3] xen: use maximum reservation to limit dom0 memory
On Tue, Aug 16, 2011 at 11:00:38AM +0100, David Vrabel wrote:> Use the maximum reservation hypercall to set limit the amount of > usable dom0 memory. This reduces the size of pages tables etc. if > dom0 is to use less memory than the maximum available.Ok, so it sounds like this patch by itself can fix the "more page tables than we need" issue. If so, I would prefer that you stick the tiny piece of code that calls the xen_get_max_pages() from the setup in this patch. This way we can backport this particular patch to stable tree without including the other patchsets you have posted. And it is a nicely contained one-patch-fixes-the-problem.> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > --- > Note this requires a patched Xen that sets max_pages when creating dom0.Please mention in the description the c/s and the name of the patch.> --- > arch/x86/xen/setup.c | 7 ++++++- > 1 files changed, 6 insertions(+), 1 deletions(-) > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > index 3421c9e..584e7dc 100644 > --- a/arch/x86/xen/setup.c > +++ b/arch/x86/xen/setup.c > @@ -197,7 +197,12 @@ static unsigned long __init xen_get_max_pages(void) > unsigned long max_pages = MAX_DOMAIN_PAGES; /* Limited by memory map. */ > > if (xen_initial_domain()) { > - /* FIXME: ask hypervisor for max pages. */ > + domid_t domid = DOMID_SELF; > + int ret; > + > + ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid); > + if (ret > 0) > + max_pages = ret;Don''t you want to clamp it? Say MAX_DOMAIN_PAGES is set to 1GB, and you set it to 2GB here - that will blow the P2M out. Perhaps max_pages = min(ret, max_pages); ?> } > > return min(max_pages, MAX_DOMAIN_PAGES); > -- > 1.7.4.1_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-16 14:21 UTC
[Xen-devel] Re: [PATCH 1/3] xen: allow balloon driver to use more than one memory region
On 16/08/11 14:38, Konrad Rzeszutek Wilk wrote:> On Tue, Aug 16, 2011 at 11:00:36AM +0100, David Vrabel wrote: >> Allow the xen balloon driver to populate its list of extra pages from >> more than one region of memory. This will allow platforms to provide >> (for example) a region of low memory and a region of high memory. > > What does this solve? Is this a requirement for another patch? If so > please specify the name of it. >> >> Signed-off-by: David Vrabel <david.vrabel@citrix.com> >> --- >> Is there a better way of passing the memory information to the balloon >> driver? > > I think the way you have it is OK. > >> --- >> arch/x86/xen/setup.c | 20 ++++++++++---------- >> drivers/xen/balloon.c | 48 ++++++++++++++++++++++++++++-------------------- >> include/xen/page.h | 9 ++++++++- >> 3 files changed, 46 insertions(+), 31 deletions(-) >> >> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c >> index df118a8..30d0015 100644 >> --- a/arch/x86/xen/setup.c >> +++ b/arch/x86/xen/setup.c >> @@ -37,7 +37,7 @@ extern void xen_syscall_target(void); >> extern void xen_syscall32_target(void); >> >> /* Amount of extra memory space we add to the e820 ranges */ >> -phys_addr_t xen_extra_mem_start, xen_extra_mem_size; >> +struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; >> >> /* >> * The maximum amount of extra memory compared to the base size. The >> @@ -56,7 +56,7 @@ static void __init xen_add_extra_mem(unsigned long pages) >> unsigned long pfn; >> >> u64 size = (u64)pages * PAGE_SIZE; >> - u64 extra_start = xen_extra_mem_start + xen_extra_mem_size; >> + u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size; > > Wouldn''t this be for [1]?No. I probably should have made it clear in the description but this patch doesn''t change the number of regions. It only changes the pair of variables to a single element array of a structure. See XEN_EXTRA_MEM_MAX_REGIONS in include/xen/page.h: --- a/include/xen/page.h +++ b/include/xen/page.h @@ -3,6 +3,13 @@ #include <asm/xen/page.h> -extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size; +struct xen_memory_region { + phys_addr_t start; + phys_addr_t size; +}; + +#define XEN_EXTRA_MEM_MAX_REGIONS 1 + +extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; #endif /* _XEN_PAGE_H */>> @@ -263,10 +263,10 @@ char * __init xen_memory_setup(void) >> } >> /* Align the balloon area so that max_low_pfn does not get set >> * to be at the _end_ of the PCI gap at the far end (fee01000). >> - * Note that xen_extra_mem_start gets set in the loop above to be >> - * past the last E820 region. */ >> - if (xen_initial_domain() && (xen_extra_mem_start < (1ULL<<32))) >> - xen_extra_mem_start = (1ULL<<32); >> + * Note that the start of balloon area gets set in the loop above >> + * to be past the last E820 region. */ >> + if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32))) >> + xen_extra_mem[0].start = (1ULL<<32); > > So what about the highmem memory? Should there a be a check to move > the lowmem to highmem count?Again, the patch isn''t adding any additional regions.>> + for (r = 0; r < XEN_EXTRA_MEM_MAX_REGIONS; r++) > > You probably should also check to make sure that the values are actually valid. > Like > if (!xedn_extra_mem[r].start) > continue;balloon_add_memory_region() is a nop if size == 0. But I can an explicit check (of size) here if that is preferred.>> + balloon_add_memory_region(PFN_UP(xen_extra_mem[r].start), >> + PFN_DOWN(xen_extra_mem[r].size)); >> >> return 0; >> }David _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-16 14:33 UTC
Re: [Xen-devel] [PATCH 2/3] xen: allow extra memory to be two regions
On 16/08/11 14:48, Konrad Rzeszutek Wilk wrote:> On Tue, Aug 16, 2011 at 11:00:37AM +0100, David Vrabel wrote: >> Allow the extra memory (used by the balloon driver) to be in two >> regions (typically low and high memory). This allows the balloon >> driver to increase the number of available low pages (if the initial >> number if pages is small). >> >> As a side effect, the algorithm for building the e820 memory map is >> simpler and more obviously correct as the map supplied by the >> hypervisor is (almost) used as is. > > Hm, which is not always good. The setting of ''E820_RESERVED'' and ''E820_UNUSABLE'', > and realigning of start of balloon space at 4GB (if necessary) changes > need to be preserved. You can look up the why if you run ''git annotate'' > and look at those lines - we had lots of time getting those right.My understanding of the history is that the problems were caused by not paying attention to the reserved regions reported in the machine memory map. This proposed algorithm is careful to only alter RAM regions -- all reserved regions and gaps are preserved as-is. I should add some comments explaining this. For example, should a BIOS reserve memory above 4 GiB then the current code will place the balloon memory over the top of it but the proposed code will not.> You also need to provide the virgin copy of the E820 to the xen_set_identity > and not use the same version that is modified - which with the above setting > won''t work.Because we don''t alter any reserved regions the resulting map is fine for this.> I am curious - with the patch to the hypervisor - and with just a newly > implemented xen_get_max_pages() code path added to query the new > truncated amount of how many pages we need - wont that solve the problem?I''ll address this in another email. David _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-16 14:41 UTC
[Xen-devel] Re: [PATCH 3/3] xen: use maximum reservation to limit dom0 memory
On 16/08/11 14:53, Konrad Rzeszutek Wilk wrote:> On Tue, Aug 16, 2011 at 11:00:38AM +0100, David Vrabel wrote: >> Use the maximum reservation hypercall to set limit the amount of >> usable dom0 memory. This reduces the size of pages tables etc. if >> dom0 is to use less memory than the maximum available. > > Ok, so it sounds like this patch by itself can fix the "more page tables > than we need" issue.This patch with the Xen patch does, yes.> If so, I would prefer that you stick the tiny piece of code that > calls the xen_get_max_pages() from the setup in this patch. This way > we can backport this particular patch to stable tree without including > the other patchsets you have posted. And it is a nicely contained > one-patch-fixes-the-problem.Does this problem need to be fixed in stable? It has a simple workaround (the ''mem'' kernel command line option) and requires an updated Xen. I do think that patches #1 and #2 are useful because they allow 32-bit guests to have more low memory, rather than making all balloon memory high memory. I could rearrange the order. Make #3 first so it can also be applied to 3.0.n and 3.1 and then #1 and #2 could be queued for 3.2.>> Note this requires a patched Xen that sets max_pages when creating dom0. > > Please mention in the description the c/s and the name of the patch.Ok.>> --- a/arch/x86/xen/setup.c >> +++ b/arch/x86/xen/setup.c >> @@ -197,7 +197,12 @@ static unsigned long __init xen_get_max_pages(void) >> unsigned long max_pages = MAX_DOMAIN_PAGES; /* Limited by memory map. */ >> >> if (xen_initial_domain()) { >> - /* FIXME: ask hypervisor for max pages. */ >> + domid_t domid = DOMID_SELF; >> + int ret; >> + >> + ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid); >> + if (ret > 0) >> + max_pages = ret; > Don''t you want to clamp it? Say MAX_DOMAIN_PAGES is set to 1GB, and you > set it to 2GB here - that will blow the P2M out. PerhapsIt is...> max_pages = min(ret, max_pages); ? > >> } >> >> return min(max_pages, MAX_DOMAIN_PAGES);... here. David _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 14:48 UTC
Re: [Xen-devel] [PATCH 2/3] xen: allow extra memory to be two regions
On Tue, Aug 16, 2011 at 03:33:19PM +0100, David Vrabel wrote:> On 16/08/11 14:48, Konrad Rzeszutek Wilk wrote: > > On Tue, Aug 16, 2011 at 11:00:37AM +0100, David Vrabel wrote: > >> Allow the extra memory (used by the balloon driver) to be in two > >> regions (typically low and high memory). This allows the balloon > >> driver to increase the number of available low pages (if the initial > >> number if pages is small). > >> > >> As a side effect, the algorithm for building the e820 memory map is > >> simpler and more obviously correct as the map supplied by the > >> hypervisor is (almost) used as is. > > > > Hm, which is not always good. The setting of ''E820_RESERVED'' and ''E820_UNUSABLE'', > > and realigning of start of balloon space at 4GB (if necessary) changes > > need to be preserved. You can look up the why if you run ''git annotate'' > > and look at those lines - we had lots of time getting those right. > > My understanding of the history is that the problems were caused by not > paying attention to the reserved regions reported in the machine memoryThat might have been a problem too, but this is specific to RAM regions.> map. This proposed algorithm is careful to only alter RAM regions -- > all reserved regions and gaps are preserved as-is. I should add some > comments explaining this.We cut RAM regions down and the Linux code thought that they were "gap" spaces and used it as PCI I/O space. Hence we marked them as unusable. We need that behavior.> > For example, should a BIOS reserve memory above 4 GiB then the current > code will place the balloon memory over the top of it but the proposed > code will not.The problem with the 4GB was that the Local IOAPIC was not enumerated in the E820 but was in the ACPI boot table. This meant that E820 had a "gap" right before the 4GB limit and we thought it was the start of RAM and marked it to be used as balloon space.> > > You also need to provide the virgin copy of the E820 to the xen_set_identity > > and not use the same version that is modified - which with the above setting > > won''t work. > > Because we don''t alter any reserved regions the resulting map is fine > for this. > > > I am curious - with the patch to the hypervisor - and with just a newly > > implemented xen_get_max_pages() code path added to query the new > > truncated amount of how many pages we need - wont that solve the problem? > > I''ll address this in another email. > > David_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 14:48 UTC
Re: [Xen-devel] Re: [PATCH 1/3] xen: allow balloon driver to use more than one memory region
On Tue, Aug 16, 2011 at 03:21:03PM +0100, David Vrabel wrote:> On 16/08/11 14:38, Konrad Rzeszutek Wilk wrote: > > On Tue, Aug 16, 2011 at 11:00:36AM +0100, David Vrabel wrote: > >> Allow the xen balloon driver to populate its list of extra pages from > >> more than one region of memory. This will allow platforms to provide > >> (for example) a region of low memory and a region of high memory. > > > > What does this solve? Is this a requirement for another patch? If so > > please specify the name of it.^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ?> >> > >> Signed-off-by: David Vrabel <david.vrabel@citrix.com> > >> --- > >> Is there a better way of passing the memory information to the balloon > >> driver? > > > > I think the way you have it is OK. > > > >> --- > >> arch/x86/xen/setup.c | 20 ++++++++++---------- > >> drivers/xen/balloon.c | 48 ++++++++++++++++++++++++++++-------------------- > >> include/xen/page.h | 9 ++++++++- > >> 3 files changed, 46 insertions(+), 31 deletions(-) > >> > >> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > >> index df118a8..30d0015 100644 > >> --- a/arch/x86/xen/setup.c > >> +++ b/arch/x86/xen/setup.c > >> @@ -37,7 +37,7 @@ extern void xen_syscall_target(void); > >> extern void xen_syscall32_target(void); > >> > >> /* Amount of extra memory space we add to the e820 ranges */ > >> -phys_addr_t xen_extra_mem_start, xen_extra_mem_size; > >> +struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; > >> > >> /* > >> * The maximum amount of extra memory compared to the base size. The > >> @@ -56,7 +56,7 @@ static void __init xen_add_extra_mem(unsigned long pages) > >> unsigned long pfn; > >> > >> u64 size = (u64)pages * PAGE_SIZE; > >> - u64 extra_start = xen_extra_mem_start + xen_extra_mem_size; > >> + u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size; > > > > Wouldn''t this be for [1]? > > No. I probably should have made it clear in the description but this > patch doesn''t change the number of regions. It only changes the pair of > variables to a single element array of a structure.Ok, I ask again - what does this patch solve?> > See XEN_EXTRA_MEM_MAX_REGIONS in include/xen/page.h: > > --- a/include/xen/page.h > +++ b/include/xen/page.h > @@ -3,6 +3,13 @@ > > #include <asm/xen/page.h> > > -extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size; > +struct xen_memory_region { > + phys_addr_t start; > + phys_addr_t size; > +}; > + > +#define XEN_EXTRA_MEM_MAX_REGIONS 1 > + > +extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; > > #endif /* _XEN_PAGE_H */ > > >> @@ -263,10 +263,10 @@ char * __init xen_memory_setup(void) > >> } > >> /* Align the balloon area so that max_low_pfn does not get set > >> * to be at the _end_ of the PCI gap at the far end (fee01000). > >> - * Note that xen_extra_mem_start gets set in the loop above to be > >> - * past the last E820 region. */ > >> - if (xen_initial_domain() && (xen_extra_mem_start < (1ULL<<32))) > >> - xen_extra_mem_start = (1ULL<<32); > >> + * Note that the start of balloon area gets set in the loop above > >> + * to be past the last E820 region. */ > >> + if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32))) > >> + xen_extra_mem[0].start = (1ULL<<32); > > > > So what about the highmem memory? Should there a be a check to move > > the lowmem to highmem count? > > Again, the patch isn''t adding any additional regions. > > > >> + for (r = 0; r < XEN_EXTRA_MEM_MAX_REGIONS; r++) > > > > You probably should also check to make sure that the values are actually valid. > > Like > > if (!xedn_extra_mem[r].start) > > continue; > > balloon_add_memory_region() is a nop if size == 0. But I can an > explicit check (of size) here if that is preferred. > > >> + balloon_add_memory_region(PFN_UP(xen_extra_mem[r].start), > >> + PFN_DOWN(xen_extra_mem[r].size)); > >> > >> return 0; > >> } > > David > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 14:50 UTC
[Xen-devel] Re: [PATCH 3/3] xen: use maximum reservation to limit dom0 memory
On Tue, Aug 16, 2011 at 09:53:52AM -0400, Konrad Rzeszutek Wilk wrote:> On Tue, Aug 16, 2011 at 11:00:38AM +0100, David Vrabel wrote: > > Use the maximum reservation hypercall to set limit the amount of > > usable dom0 memory. This reduces the size of pages tables etc. if > > dom0 is to use less memory than the maximum available. > > Ok, so it sounds like this patch by itself can fix the "more page tables > than we need" issue. > > If so, I would prefer that you stick the tiny piece of code that > calls the xen_get_max_pages() from the setup in this patch. This way > we can backport this particular patch to stable tree without including > the other patchsets you have posted. And it is a nicely contained > one-patch-fixes-the-problem. > > > > > Signed-off-by: David Vrabel <david.vrabel@citrix.com> > > --- > > Note this requires a patched Xen that sets max_pages when creating dom0. > > Please mention in the description the c/s and the name of the patch. > > > --- > > arch/x86/xen/setup.c | 7 ++++++- > > 1 files changed, 6 insertions(+), 1 deletions(-) > > > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > > index 3421c9e..584e7dc 100644 > > --- a/arch/x86/xen/setup.c > > +++ b/arch/x86/xen/setup.c > > @@ -197,7 +197,12 @@ static unsigned long __init xen_get_max_pages(void) > > unsigned long max_pages = MAX_DOMAIN_PAGES; /* Limited by memory map. */ > > > > if (xen_initial_domain()) {This actually is not neccessary? You could remove this and it would work fine under DomU cases too I think. The only issue would be when running this under older hypervisors as dom0 and getting ~0 as the max_pages - but the ''min'' clamping should solve that?> > - /* FIXME: ask hypervisor for max pages. */ > > + domid_t domid = DOMID_SELF; > > + int ret; > > + > > + ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid); > > + if (ret > 0) > > + max_pages = ret; > Don''t you want to clamp it? Say MAX_DOMAIN_PAGES is set to 1GB, and you > set it to 2GB here - that will blow the P2M out. Perhaps > max_pages = min(ret, max_pages); ? > > > } > > > > return min(max_pages, MAX_DOMAIN_PAGES); > > -- > > 1.7.4.1_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 14:54 UTC
[Xen-devel] Re: [PATCH 3/3] xen: use maximum reservation to limit dom0 memory
On Tue, Aug 16, 2011 at 03:41:03PM +0100, David Vrabel wrote:> On 16/08/11 14:53, Konrad Rzeszutek Wilk wrote: > > On Tue, Aug 16, 2011 at 11:00:38AM +0100, David Vrabel wrote: > >> Use the maximum reservation hypercall to set limit the amount of > >> usable dom0 memory. This reduces the size of pages tables etc. if > >> dom0 is to use less memory than the maximum available. > > > > Ok, so it sounds like this patch by itself can fix the "more page tables > > than we need" issue. > > This patch with the Xen patch does, yes. > > > If so, I would prefer that you stick the tiny piece of code that > > calls the xen_get_max_pages() from the setup in this patch. This way > > we can backport this particular patch to stable tree without including > > the other patchsets you have posted. And it is a nicely contained > > one-patch-fixes-the-problem. > > Does this problem need to be fixed in stable? It has a simple > workaround (the ''mem'' kernel command line option) and requires an > updated Xen.I would like it be ported to 3.0.3 so folks who are using it can stop using the work-around.> > I do think that patches #1 and #2 are useful because they allow 32-bit > guests to have more low memory, rather than making all balloon memory > high memory.Sure, but they are a different subset of the problem.> > I could rearrange the order. Make #3 first so it can also be applied to > 3.0.n and 3.1 and then #1 and #2 could be queued for 3.2.OK. I was thinking to cherry-pick this specific patch for 3.1 anyhow.> > >> Note this requires a patched Xen that sets max_pages when creating dom0. > > > > Please mention in the description the c/s and the name of the patch. > > Ok. > > >> --- a/arch/x86/xen/setup.c > >> +++ b/arch/x86/xen/setup.c > >> @@ -197,7 +197,12 @@ static unsigned long __init xen_get_max_pages(void) > >> unsigned long max_pages = MAX_DOMAIN_PAGES; /* Limited by memory map. */ > >> > >> if (xen_initial_domain()) { > >> - /* FIXME: ask hypervisor for max pages. */ > >> + domid_t domid = DOMID_SELF; > >> + int ret; > >> + > >> + ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid); > >> + if (ret > 0) > >> + max_pages = ret; > > Don''t you want to clamp it? Say MAX_DOMAIN_PAGES is set to 1GB, and you > > set it to 2GB here - that will blow the P2M out. Perhaps > > It is... > > > max_pages = min(ret, max_pages); ? > > > >> } > >> > >> return min(max_pages, MAX_DOMAIN_PAGES); > > ... here.Duh!> > David_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-16 15:03 UTC
Re: [Xen-devel] [PATCH 2/3] xen: allow extra memory to be two regions
On 16/08/11 15:48, Konrad Rzeszutek Wilk wrote:> On Tue, Aug 16, 2011 at 03:33:19PM +0100, David Vrabel wrote: >> On 16/08/11 14:48, Konrad Rzeszutek Wilk wrote: >>> On Tue, Aug 16, 2011 at 11:00:37AM +0100, David Vrabel wrote: >>>> Allow the extra memory (used by the balloon driver) to be in two >>>> regions (typically low and high memory). This allows the balloon >>>> driver to increase the number of available low pages (if the initial >>>> number if pages is small). >>>> >>>> As a side effect, the algorithm for building the e820 memory map is >>>> simpler and more obviously correct as the map supplied by the >>>> hypervisor is (almost) used as is. >>> >>> Hm, which is not always good. The setting of ''E820_RESERVED'' and ''E820_UNUSABLE'', >>> and realigning of start of balloon space at 4GB (if necessary) changes >>> need to be preserved. You can look up the why if you run ''git annotate'' >>> and look at those lines - we had lots of time getting those right. >> >> My understanding of the history is that the problems were caused by not >> paying attention to the reserved regions reported in the machine memory > > That might have been a problem too, but this is specific to RAM regions. >> map. This proposed algorithm is careful to only alter RAM regions -- >> all reserved regions and gaps are preserved as-is. I should add some >> comments explaining this. > > We cut RAM regions down and the Linux code thought that they were "gap" spaces > and used it as PCI I/O space. Hence we marked them as unusable. We need > that behavior.Okay. This behaviour is kept as well. For example, on my test box with 8 GiB RAM with dom0 started with 752 MiB of initial pages and limited to 4 GiB. [ 0.000000] Xen: 0000000000000000 - 000000000009e000 (usable) [ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved) [ 0.000000] Xen: 0000000000100000 - 00000000bf699000 (usable) [ 0.000000] Xen: 00000000bf699000 - 00000000bf6af000 (reserved) [ 0.000000] Xen: 00000000bf6af000 - 00000000bf6ce000 (ACPI data) [ 0.000000] Xen: 00000000bf6ce000 - 00000000c0000000 (reserved) [ 0.000000] Xen: 00000000e0000000 - 00000000f0000000 (reserved) [ 0.000000] Xen: 00000000fe000000 - 0000000100000000 (reserved) [ 0.000000] Xen: 0000000100000000 - 0000000140967000 (usable) [ 0.000000] Xen: 0000000140967000 - 0000000240000000 (unusable) David _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-16 15:03 UTC
Re: [Xen-devel] Re: [PATCH 1/3] xen: allow balloon driver to use more than one memory region
On 16/08/11 15:48, Konrad Rzeszutek Wilk wrote:> On Tue, Aug 16, 2011 at 03:21:03PM +0100, David Vrabel wrote: >> On 16/08/11 14:38, Konrad Rzeszutek Wilk wrote: >>> On Tue, Aug 16, 2011 at 11:00:36AM +0100, David Vrabel wrote: >>>> Allow the xen balloon driver to populate its list of extra pages from >>>> more than one region of memory. This will allow platforms to provide >>>> (for example) a region of low memory and a region of high memory. >>> >>> What does this solve? Is this a requirement for another patch? If so >>> please specify the name of it. > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > ?Sorry. This is required for patch 2/3 in this series. "xen: allow extra memory to be [in] two regions" I thought it sensible to break this out into its own patch. David _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-16 15:36 UTC
Re: [Xen-devel] [PATCH 2/3] xen: allow extra memory to be two regions
On Tue, Aug 16, 2011 at 04:03:01PM +0100, David Vrabel wrote:> On 16/08/11 15:48, Konrad Rzeszutek Wilk wrote: > > On Tue, Aug 16, 2011 at 03:33:19PM +0100, David Vrabel wrote: > >> On 16/08/11 14:48, Konrad Rzeszutek Wilk wrote: > >>> On Tue, Aug 16, 2011 at 11:00:37AM +0100, David Vrabel wrote: > >>>> Allow the extra memory (used by the balloon driver) to be in two > >>>> regions (typically low and high memory). This allows the balloon > >>>> driver to increase the number of available low pages (if the initial > >>>> number if pages is small). > >>>> > >>>> As a side effect, the algorithm for building the e820 memory map is > >>>> simpler and more obviously correct as the map supplied by the > >>>> hypervisor is (almost) used as is. > >>> > >>> Hm, which is not always good. The setting of ''E820_RESERVED'' and ''E820_UNUSABLE'', > >>> and realigning of start of balloon space at 4GB (if necessary) changes > >>> need to be preserved. You can look up the why if you run ''git annotate'' > >>> and look at those lines - we had lots of time getting those right. > >> > >> My understanding of the history is that the problems were caused by not > >> paying attention to the reserved regions reported in the machine memory > > > > That might have been a problem too, but this is specific to RAM regions. > >> map. This proposed algorithm is careful to only alter RAM regions -- > >> all reserved regions and gaps are preserved as-is. I should add some > >> comments explaining this. > > > > We cut RAM regions down and the Linux code thought that they were "gap" spaces > > and used it as PCI I/O space. Hence we marked them as unusable. We need > > that behavior. > > Okay. This behaviour is kept as well. > > For example, on my test box with 8 GiB RAM with dom0 started with 752 > MiB of initial pages and limited to 4 GiB.So dom0_mem=max:4GB,725MB ?> > [ 0.000000] Xen: 0000000000000000 - 000000000009e000 (usable) > [ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved) > [ 0.000000] Xen: 0000000000100000 - 00000000bf699000 (usable) > [ 0.000000] Xen: 00000000bf699000 - 00000000bf6af000 (reserved) > [ 0.000000] Xen: 00000000bf6af000 - 00000000bf6ce000 (ACPI data) > [ 0.000000] Xen: 00000000bf6ce000 - 00000000c0000000 (reserved) > [ 0.000000] Xen: 00000000e0000000 - 00000000f0000000 (reserved) > [ 0.000000] Xen: 00000000fe000000 - 0000000100000000 (reserved) > [ 0.000000] Xen: 0000000100000000 - 0000000140967000 (usable) > [ 0.000000] Xen: 0000000140967000 - 0000000240000000 (unusable) >What did it look before? What does "Memory:" look before and after?> David_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-22 13:55 UTC
Re: [Xen-devel] [PATCH 2/3] xen: allow extra memory to be two regions
> > Okay. This behaviour is kept as well. > > > > For example, on my test box with 8 GiB RAM with dom0 started with 752 > > MiB of initial pages and limited to 4 GiB. > > So dom0_mem=max:4GB,725MB ? > > > > > [ 0.000000] Xen: 0000000000000000 - 000000000009e000 (usable) > > [ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved) > > [ 0.000000] Xen: 0000000000100000 - 00000000bf699000 (usable) > > [ 0.000000] Xen: 00000000bf699000 - 00000000bf6af000 (reserved) > > [ 0.000000] Xen: 00000000bf6af000 - 00000000bf6ce000 (ACPI data) > > [ 0.000000] Xen: 00000000bf6ce000 - 00000000c0000000 (reserved) > > [ 0.000000] Xen: 00000000e0000000 - 00000000f0000000 (reserved) > > [ 0.000000] Xen: 00000000fe000000 - 0000000100000000 (reserved) > > [ 0.000000] Xen: 0000000100000000 - 0000000140967000 (usable) > > [ 0.000000] Xen: 0000000140967000 - 0000000240000000 (unusable) > > > What did it look before? > > What does "Memory:" look before and after?ping? Did I miss some patches from you? Are you waiting for Keir to merge the Xen hypervisor patches - if so I think we can just work on this in parallel?> > > David > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-22 14:01 UTC
Re: [Xen-devel] [PATCH 2/3] xen: allow extra memory to be two regions
On 22/08/11 14:55, Konrad Rzeszutek Wilk wrote:> > ping? Did I miss some patches from you? Are you waiting for Keir to merge > the Xen hypervisor patches - if so I think we can just work on this in parallel?I sent a new series with you Cc''d. Did you not receive them? http://lists.xensource.com/archives/html/xen-devel/2011-08/msg00810.html David _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel