Samuel Thibault
2008-Jan-23 15:51 UTC
[Xen-devel] [PATCH] ioemu: directly project all memory on x86_64
ioemu: directly project all memory on x86_64

On x86_64, we have enough virtual addressing space to just directly
project all the guest's physical memory. We still need to keep the map
cache, however, for some random addressing uses like ACPI. For this, we
need to know the amount of memory, so let's have the domain builder
always pass it, not only for ia64 (it's harmless anyway).

Signed-off-by: Samuel Thibault <samuel.thibault@eu.citrix.com>

diff -r c364f80eb4b5 tools/ioemu/hw/xen_machine_fv.c
--- a/tools/ioemu/hw/xen_machine_fv.c	Wed Jan 23 13:27:21 2008 +0000
+++ b/tools/ioemu/hw/xen_machine_fv.c	Wed Jan 23 15:37:19 2008 +0000
@@ -26,6 +26,52 @@
 #include "vl.h"
 #include <xen/hvm/params.h>
 #include <sys/mman.h>
+
+#if defined(__x86_64__) || defined(__ia64__)
+static void phys_ram_base_reinit(void)
+{
+    unsigned long nr_pages;
+    xen_pfn_t *page_array;
+    int i;
+
+    if (phys_ram_base)
+        munmap(phys_ram_base, ram_size);
+
+    nr_pages = ram_size / XC_PAGE_SIZE;
+
+    page_array = (xen_pfn_t *)malloc(nr_pages * sizeof(xen_pfn_t));
+    if (page_array == NULL) {
+        fprintf(logfile, "malloc returned error %d\n", errno);
+        exit(-1);
+    }
+
+    for (i = 0; i < nr_pages; i++)
+        page_array[i] = i;
+
+#if defined(__ia64__)
+    /* VTI will not use memory between 3G~4G, so we just pass a legal pfn
+       to make QEMU map continuous virtual memory space */
+    if (ram_size > MMIO_START) {
+        for (i = 0 ; i < (MEM_G >> XC_PAGE_SHIFT); i++)
+            page_array[(MMIO_START >> XC_PAGE_SHIFT) + i] =
+                (STORE_PAGE_START >> XC_PAGE_SHIFT);
+    }
+#endif
+
+    phys_ram_base = xc_map_foreign_batch(xc_handle, domid,
+                                         PROT_READ|PROT_WRITE,
+                                         page_array, nr_pages);
+    if (phys_ram_base == 0) {
+        fprintf(logfile, "xc_map_foreign_batch returned error %d\n", errno);
+        exit(-1);
+    }
+    free(page_array);
+
+    fprintf(logfile, "%ldMB direct physical ram projection\n", ram_size >> 20);
+}
+#else
+#define phys_ram_base_reinit() ((void)0)
+#endif
 
 #if defined(MAPCACHE)
 
@@ -174,6 +220,8 @@ void qemu_invalidate_map_cache(void)
     last_address_vaddr = NULL;
 
     mapcache_unlock();
+
+    phys_ram_base_reinit();
 }
 
 #endif /* defined(MAPCACHE) */
@@ -191,14 +239,10 @@ static void xen_init_fv(uint64_t ram_siz
     extern void *shared_page;
     extern void *buffered_io_page;
 #ifdef __ia64__
-    unsigned long nr_pages;
-    xen_pfn_t *page_array;
     extern void *buffered_pio_page;
-    int i;
 #endif
 
 #if defined(__i386__) || defined(__x86_64__)
-
     if (qemu_map_cache_init()) {
         fprintf(logfile, "qemu_map_cache_init returned: error %d\n", errno);
         exit(-1);
@@ -232,35 +276,9 @@ static void xen_init_fv(uint64_t ram_siz
         fprintf(logfile, "map buffered PIO page returned error %d\n", errno);
         exit(-1);
     }
+#endif
 
-    nr_pages = ram_size / XC_PAGE_SIZE;
-
-    page_array = (xen_pfn_t *)malloc(nr_pages * sizeof(xen_pfn_t));
-    if (page_array == NULL) {
-        fprintf(logfile, "malloc returned error %d\n", errno);
-        exit(-1);
-    }
-
-    for (i = 0; i < nr_pages; i++)
-        page_array[i] = i;
-
-    /* VTI will not use memory between 3G~4G, so we just pass a legal pfn
-       to make QEMU map continuous virtual memory space */
-    if (ram_size > MMIO_START) {
-        for (i = 0 ; i < (MEM_G >> XC_PAGE_SHIFT); i++)
-            page_array[(MMIO_START >> XC_PAGE_SHIFT) + i] =
-                (STORE_PAGE_START >> XC_PAGE_SHIFT);
-    }
-
-    phys_ram_base = xc_map_foreign_batch(xc_handle, domid,
-                                         PROT_READ|PROT_WRITE,
-                                         page_array, nr_pages);
-    if (phys_ram_base == 0) {
-        fprintf(logfile, "xc_map_foreign_batch returned error %d\n", errno);
-        exit(-1);
-    }
-    free(page_array);
-#endif
+    phys_ram_base_reinit();
 
     timeoffset_get();
 
diff -r c364f80eb4b5 tools/ioemu/target-i386-dm/exec-dm.c
--- a/tools/ioemu/target-i386-dm/exec-dm.c	Wed Jan 23 13:27:21 2008 +0000
+++ b/tools/ioemu/target-i386-dm/exec-dm.c	Wed Jan 23 15:37:19 2008 +0000
@@ -411,10 +411,12 @@ int iomem_index(target_phys_addr_t addr)
         return 0;
 }
 
-#if defined(__i386__) || defined(__x86_64__)
+#if defined(__i386__)
 #define phys_ram_addr(x) (qemu_map_cache(x))
+#elif defined(__x86_64__)
+#define phys_ram_addr(x) (((x) < ram_size) ? (phys_ram_base + (x)) : qemu_map_cache(x))
 #elif defined(__ia64__)
-#define phys_ram_addr(x) ((addr < ram_size) ? (phys_ram_base + (x)) : NULL)
+#define phys_ram_addr(x) (((x) < ram_size) ? (phys_ram_base + (x)) : NULL)
 #endif
 
 extern unsigned long *logdirty_bitmap;

diff -r c364f80eb4b5 tools/python/xen/xend/image.py
--- a/tools/python/xen/xend/image.py	Wed Jan 23 13:27:21 2008 +0000
+++ b/tools/python/xen/xend/image.py	Wed Jan 23 15:37:19 2008 +0000
@@ -186,6 +186,9 @@ class ImageHandler:
     # xm config file
     def parseDeviceModelArgs(self, vmConfig):
         ret = ["-domain-name", str(self.vm.info['name_label'])]
+
+        ret.append("-m")
+        ret.append("%s" % (self.getRequiredInitialReservation() / 1024))
 
         # Find RFB console device, and if it exists, make QEMU enable
         # the VNC console.
@@ -565,12 +568,6 @@ class IA64_HVM_ImageHandler(HVMImageHand
         # Explicit shadow memory is not a concept
         return 0
 
-    def getDeviceModelArgs(self, restore = False):
-        args = HVMImageHandler.getDeviceModelArgs(self, restore)
-        args = args + ([ "-m", "%s" %
-                         (self.getRequiredInitialReservation() / 1024) ])
-        return args
-
 
 class IA64_Linux_ImageHandler(LinuxImageHandler):
Keir Fraser
2008-Jan-23 16:30 UTC
Re: [Xen-devel] [PATCH] ioemu: directly project all memory on x86_64
I don't really want the memory-size parameter back. What if we support
memory hotplug in future (e.g., we could do it now if the guest decides
to balloon in some memory higher up in its memory map)?

Also, I strongly suspect this breaks memory ballooning when the guest
reduces its memory reservation, as qemu will crash as soon as it tries to
access a non-existent guest page. Not only might this occur if the guest
goes mad, or if it is doing some type of memory probing and hence touches
completely unused gfns, but I know that it also happens sometimes
legitimately after ballooning memory down. Unfortunately I can't now
remember what the scenario is, but I do remember that I implemented
changeset 14773 for a good reason! Without that changeset, qemu would
always crash soon after a large balloon-down operation, and at the time we
tracked it down and 14773 was deemed the right approach to fix it. I'm now
racking my brains to remember what the cause was...

Anyway, I don't see what getting rid of the mapcache fixes. If you want to
avoid capacity misses then turn it into a direct-map cache (1:1 hash
function) for x86_64.

 -- Keir

On 23/1/08 15:51, "Samuel Thibault" <samuel.thibault@eu.citrix.com> wrote:

> ioemu: directly project all memory on x86_64
>
> On x86_64, we have enough virtual addressing space to just directly
> project all the guest's physical memory. We still need to keep the map
> cache, however, for some random addressing uses like ACPI. For this, we
> need to know the amount of memory, so let's have the domain builder
> always pass it, not only for ia64 (it's harmless anyway).
>
> Signed-off-by: Samuel Thibault <samuel.thibault@eu.citrix.com>
>
> [...]
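For readers following the thread: a "direct-map cache (1:1 hash function)"
in Keir's sense keeps the mapcache machinery but makes the bucket index
injective in the guest address, so no two addresses ever compete for a
slot and capacity/conflict misses disappear. Below is a minimal sketch of
the idea -- all names, sizes, and the map_guest_chunk() stub are
illustrative assumptions, not the actual qemu-dm mapcache code:

#include <stdint.h>
#include <stdlib.h>

#define MCACHE_BUCKET_SHIFT 16           /* 64kB per bucket, illustrative */

struct map_entry {
    uint8_t *vaddr;                      /* NULL => chunk not mapped yet */
};

static struct map_entry *entries;
static unsigned long nr_entries;

/* Stand-in for mapping one guest chunk (real code would use something
   like xc_map_foreign_batch); calloc keeps the sketch self-contained. */
static uint8_t *map_guest_chunk(unsigned long chunk_index)
{
    (void)chunk_index;
    return calloc(1, 1UL << MCACHE_BUCKET_SHIFT);
}

/* On x86_64 virtual address space is cheap, so we can afford one entry
   per possible chunk: the "hash" is the identity and nothing is ever
   evicted -- only cold misses remain. */
int map_cache_init(uint64_t max_guest_ram)
{
    nr_entries = max_guest_ram >> MCACHE_BUCKET_SHIFT;
    entries = calloc(nr_entries, sizeof(*entries));
    return entries ? 0 : -1;
}

uint8_t *map_cache(unsigned long paddr)
{
    unsigned long idx = paddr >> MCACHE_BUCKET_SHIFT;   /* 1:1, no modulo */

    if (idx >= nr_entries)
        return NULL;
    if (entries[idx].vaddr == NULL)
        entries[idx].vaddr = map_guest_chunk(idx);      /* map on demand */
    if (entries[idx].vaddr == NULL)
        return NULL;
    return entries[idx].vaddr + (paddr & ((1UL << MCACHE_BUCKET_SHIFT) - 1));
}

Because each bucket has a fixed home slot, ballooning or hotplug only
requires invalidating entries, never rehashing or evicting live ones.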
Samuel Thibault
2008-Jan-23 16:38 UTC
Re: [Xen-devel] [PATCH] ioemu: directly project all memory on x86_64
Keir Fraser, on Wed 23 Jan 2008 16:30:27 +0000, wrote:
> I don't really want the memory-size parameter back. What if we support
> memory hotplug in future (e.g., we could do it now if the guest decides
> to balloon in some memory higher up in its memory map)?

Well, the question holds for ia64 too, which already projects all
memory.

> Anyway, I don't see what getting rid of the mapcache fixes.

Well, that was mostly to speed up memory lookup in the usual case; I
don't really need it.

Samuel
Samuel Thibault
2008-Jan-23 17:15 UTC
[Xen-devel] [PATCH] ioemu: fix phys_ram_addr parameter usage
ioemu: fix phys_ram_addr parameter usage

Signed-off-by: Samuel Thibault <samuel.thibault@eu.citrix.com>

diff -r c364f80eb4b5 tools/ioemu/target-i386-dm/exec-dm.c
--- a/tools/ioemu/target-i386-dm/exec-dm.c	Wed Jan 23 13:27:21 2008 +0000
+++ b/tools/ioemu/target-i386-dm/exec-dm.c	Wed Jan 23 17:11:15 2008 +0000
@@ -414,7 +414,7 @@ int iomem_index(target_phys_addr_t addr)
 #if defined(__i386__) || defined(__x86_64__)
 #define phys_ram_addr(x) (qemu_map_cache(x))
 #elif defined(__ia64__)
-#define phys_ram_addr(x) ((addr < ram_size) ? (phys_ram_base + (x)) : NULL)
+#define phys_ram_addr(x) (((x) < ram_size) ? (phys_ram_base + (x)) : NULL)
 #endif
 
 extern unsigned long *logdirty_bitmap;
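The bug being fixed here is a macro-hygiene one: the ia64 variant of
phys_ram_addr() tested a variable named addr from the caller's scope
instead of its own parameter x, so any call site whose index variable
wasn't named addr silently bounds-checked the wrong value. A minimal
standalone illustration of the pitfall (fake_ram and the sizes are made
up for the demo, not ioemu code):

#include <stdio.h>

static char fake_ram[16];
#define RAM_SIZE sizeof(fake_ram)

/* Buggy: bounds-checks the caller's 'addr', not the macro argument 'x'. */
#define phys_ram_addr_buggy(x) ((addr < RAM_SIZE) ? (fake_ram + (x)) : NULL)

/* Fixed: the parenthesised parameter itself is checked, with no hidden
   dependency on names at the call site. */
#define phys_ram_addr_fixed(x) (((x) < RAM_SIZE) ? (fake_ram + (x)) : NULL)

int main(void)
{
    unsigned long addr = 0;        /* happens to be in scope... */
    unsigned long offset = 1000;   /* ...but the lookup uses 'offset' */

    /* The buggy macro checks 'addr' (0 < 16, true) yet indexes with
       'offset', yielding a bogus out-of-range pointer instead of NULL. */
    printf("buggy: %s\n", phys_ram_addr_buggy(offset) ? "bogus non-NULL" : "NULL");
    printf("fixed: %s\n", phys_ram_addr_fixed(offset) ? "non-NULL" : "NULL");
    return 0;
}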
Keir Fraser
2008-Jan-23 17:54 UTC
Re: [Xen-devel] [PATCH] ioemu: directly project all memory on x86_64
On 23/1/08 16:38, "Samuel Thibault" <samuel.thibault@eu.citrix.com> wrote:

> Keir Fraser, on Wed 23 Jan 2008 16:30:27 +0000, wrote:
>> I don't really want the memory-size parameter back. What if we support
>> memory hotplug in future (e.g., we could do it now if the guest decides
>> to balloon in some memory higher up in its memory map)?
>
> Well, the question holds for ia64 too, which already projects all
> memory.

Sure. I expect ia64 lags behind x86 in this respect.

I remember the crash caused by not tracking unmapped pages, by the way. At
the time, Xen was not notifying qemu on increase_reservation, so qemu was
not refreshing its guest memory map and would crash when newly-allocated
pages were the source or destination of I/O operations. Looks like it's
doubly fixed now -- Xen is invalidating on both increase and decrease
reservation, and qemu-dm is able to lazily fault in new guest mappings.

>> Anyway, I don't see what getting rid of the mapcache fixes.
>
> Well, that was mostly to speed up memory lookup in the usual case; I
> don't really need it.

It's a good aim, I just think the optimisation should be done within the
context of the mapcache subsystem.

 -- Keir
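To make the "doubly fixed" behaviour concrete: the safety comes from
pairing an invalidate-everything hook (driven by Xen when the guest's
reservation changes) with on-demand refill in the lookup path. A hedged
sketch, continuing the hypothetical entries/nr_entries structures from
the earlier example (this is not the real qemu_invalidate_map_cache()):

/* Called when Xen notifies qemu-dm of a reservation change (balloon up
   or down); drops every cached mapping so stale translations are never
   reused. */
void invalidate_all_mappings(void)
{
    unsigned long i;

    for (i = 0; i < nr_entries; i++) {
        if (entries[i].vaddr) {
            free(entries[i].vaddr);  /* real code: munmap() the foreign map */
            entries[i].vaddr = NULL;
        }
    }
    /* The next map_cache() call lazily re-maps each chunk, so pages the
       guest ballooned in are picked up on first touch, and gfns that went
       away simply fail the re-map instead of crashing qemu. */
}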
Isaku Yamahata
2008-Jan-24 02:40 UTC
Re: [Xen-devel] [PATCH] ioemu: directly project all memory on x86_64
On Wed, Jan 23, 2008 at 05:54:40PM +0000, Keir Fraser wrote:
> On 23/1/08 16:38, "Samuel Thibault" <samuel.thibault@eu.citrix.com> wrote:
>
>> Keir Fraser, on Wed 23 Jan 2008 16:30:27 +0000, wrote:
>>> I don't really want the memory-size parameter back. What if we support
>>> memory hotplug in future (e.g., we could do it now if the guest decides
>>> to balloon in some memory higher up in its memory map)?
>>
>> Well, the question holds for ia64 too, which already projects all
>> memory.
>
> Sure. I expect ia64 lags behind x86 in this respect.

IA64 doesn't support HVM domain ballooning yet. Presumably the first step
for ia64 support is to switch from direct mapping to the mapcache (with a
1:1 hash?) and see whether the performance degradation is acceptable, but
no one has tried it yet.

Anyway, on ia64 memory might not be populated contiguously, so some kind
of tracking of allocated/unallocated memory in qemu-dm will be necessary
in the long term.

-- 
yamahata
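A sketch of the kind of tracking Isaku describes: a per-gfn "populated"
bitmap that the lookup path consults, so holes in a non-contiguous guest
memory map return NULL instead of handing out pointers to unbacked
memory. All names here are hypothetical, not existing qemu-dm or libxc
interfaces:

#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT 12
#define BITS_PER_WORD (8 * sizeof(unsigned long))

static unsigned long *populated;   /* one bit per guest page frame */
static unsigned long max_gpfn;

int populated_map_init(unsigned long nr_gpfns)
{
    max_gpfn = nr_gpfns;
    populated = calloc((nr_gpfns + BITS_PER_WORD - 1) / BITS_PER_WORD,
                       sizeof(unsigned long));
    return populated ? 0 : -1;
}

/* Update on populate/depopulate events (e.g. rescanned after a mapcache
   invalidation, when qemu-dm re-maps the guest and learns which gfns
   actually have backing memory). */
void mark_gfn(unsigned long gfn, int is_populated)
{
    unsigned long *word = &populated[gfn / BITS_PER_WORD];
    unsigned long bit = 1UL << (gfn % BITS_PER_WORD);

    if (is_populated)
        *word |= bit;
    else
        *word &= ~bit;
}

/* phys_ram_addr-style lookup that tolerates holes in the guest map. */
uint8_t *guest_ram_ptr(uint8_t *ram_base, unsigned long paddr)
{
    unsigned long gfn = paddr >> PAGE_SHIFT;

    if (gfn >= max_gpfn ||
        !((populated[gfn / BITS_PER_WORD] >> (gfn % BITS_PER_WORD)) & 1))
        return NULL;               /* hole: caller must handle it */
    return ram_base + paddr;
}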
Keir Fraser
2008-Jan-24 07:51 UTC
Re: [Xen-devel] [PATCH] ioemu: directly project all memory on x86_64
On 24/1/08 02:40, "Isaku Yamahata" <yamahata@valinux.co.jp> wrote:

>> Sure. I expect ia64 lags behind x86 in this respect.
>
> IA64 doesn't support HVM domain ballooning yet. Presumably the first
> step for ia64 support is to switch from direct mapping to the mapcache
> (with a 1:1 hash?) and see whether the performance degradation is
> acceptable, but no one has tried it yet.
>
> Anyway, on ia64 memory might not be populated contiguously, so some kind
> of tracking of allocated/unallocated memory in qemu-dm will be necessary
> in the long term.

You won't have any degradation if you make the mapcache a 1:1 map for ia64
(and then we can also steal that code for x86_64!). The mapcache already
has the code for tracking missing pages.

 -- Keir