Hi all, this is the fourth version of the Xen save/restore patch series. We have been discussing this issue for quite a while on #qemu and qemu-devel: http://marc.info/?l=qemu-devel&m=132346828427314&w=2 http://marc.info/?l=qemu-devel&m=132377734605464&w=2 A few different approaches were proposed to achieve the goal of a working save/restore with upstream Qemu on Xen, however after prototyping some of them I came up with yet another solution, that I think leads to the best results with the less amount of code duplications and ugliness. Far from saying that this patch series is an example of elegance and simplicity, but it is closer to acceptable anything else I have seen so far. What''s new is that Qemu is going to keep track of its own physmap on xenstore, so that Xen can be fully aware of the changes Qemu makes to the guest''s memory map at any time. This is all handled by Xen or Xen support in Qemu internally and can be used to solve our save/restore framebuffer problem.>From the Qemu common code POV, we still need to avoid saving the guest''sram when running on Xen, and we need to avoid resetting the videoram on restore (that is a benefit to the generic Qemu case too, because it saves few cpu cycles). Changes in v4: - keep a record of the MemoryRegion''s name on xenstore; - print a message when avoiding a memory allocation on restore. This is the list of patches with a diffstat: Anthony PERARD (4): vl.c: do not save the RAM state when Xen is enabled xen mapcache: check if memory region has moved. cirrus_vga: do not reset videoram on resume xen: change memory access behavior during migration. Stefano Stabellini (2): Set runstate to INMIGRATE earlier xen: record physmap changes to xenstore hw/cirrus_vga.c | 9 +++- vl.c | 8 ++- xen-all.c | 112 ++++++++++++++++++++++++++++++++++++++++++++++++++++++- xen-mapcache.c | 22 +++++++++- xen-mapcache.h | 9 +++- 5 files changed, 147 insertions(+), 13 deletions(-) git://xenbits.xen.org/people/sstabellini/qemu-dm.git saverestore-4 Cheers, Stefano
Stefano Stabellini
2012-Jan-20 17:21 UTC
[PATCH v4 1/6] vl.c: do not save the RAM state when Xen is enabled
From: Anthony PERARD <anthony.perard@citrix.com> In the Xen case, the guest RAM is not handle by QEMU, and it is saved by Xen tools. So, we just avoid to register the RAM save state handler. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- vl.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/vl.c b/vl.c index ba55b35..6f0435b 100644 --- a/vl.c +++ b/vl.c @@ -3270,8 +3270,10 @@ int main(int argc, char **argv, char **envp) default_drive(default_sdcard, snapshot, machine->use_scsi, IF_SD, 0, SD_OPTS); - register_savevm_live(NULL, "ram", 0, 4, NULL, ram_save_live, NULL, - ram_load, NULL); + if (!xen_enabled()) { + register_savevm_live(NULL, "ram", 0, 4, NULL, ram_save_live, NULL, + ram_load, NULL); + } if (nb_numa_nodes > 0) { int i; -- 1.7.2.5
Stefano Stabellini
2012-Jan-20 17:21 UTC
[PATCH v4 2/6] xen mapcache: check if memory region has moved.
From: Anthony PERARD <anthony.perard@citrix.com> This patch changes the xen_map_cache behavior. Before trying to map a guest addr, mapcache will look into the list of range of address that have been moved (physmap/set_memory). There is currently one memory space like this, the vram, "moved" from were it''s allocated to were the guest will look into. This help to have a succefull migration. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen-all.c | 18 +++++++++++++++++- xen-mapcache.c | 22 +++++++++++++++++++--- xen-mapcache.h | 9 +++++++-- 3 files changed, 43 insertions(+), 6 deletions(-) diff --git a/xen-all.c b/xen-all.c index c86ebf4..507d93d 100644 --- a/xen-all.c +++ b/xen-all.c @@ -225,6 +225,22 @@ static XenPhysmap *get_physmapping(XenIOState *state, return NULL; } +static target_phys_addr_t xen_phys_offset_to_gaddr(target_phys_addr_t start_addr, + ram_addr_t size, void *opaque) +{ + target_phys_addr_t addr = start_addr & TARGET_PAGE_MASK; + XenIOState *xen_io_state = opaque; + XenPhysmap *physmap = NULL; + + QLIST_FOREACH(physmap, &xen_io_state->physmap, list) { + if (range_covers_byte(physmap->phys_offset, physmap->size, addr)) { + return physmap->start_addr; + } + } + + return start_addr; +} + #if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 340 static int xen_add_to_physmap(XenIOState *state, target_phys_addr_t start_addr, @@ -964,7 +980,7 @@ int xen_hvm_init(void) } /* Init RAM management */ - xen_map_cache_init(); + xen_map_cache_init(xen_phys_offset_to_gaddr, state); xen_ram_init(ram_size); qemu_add_vm_change_state_handler(xen_hvm_change_state_handler, state); diff --git a/xen-mapcache.c b/xen-mapcache.c index 9fecc64..d9c995b 100644 --- a/xen-mapcache.c +++ b/xen-mapcache.c @@ -76,6 +76,9 @@ typedef struct MapCache { uint8_t *last_address_vaddr; unsigned long max_mcache_size; unsigned int mcache_bucket_shift; + + phys_offset_to_gaddr_t phys_offset_to_gaddr; + void *opaque; } MapCache; static MapCache *mapcache; @@ -89,13 +92,16 @@ static inline int test_bits(int nr, int size, const unsigned long *addr) return 0; } -void xen_map_cache_init(void) +void xen_map_cache_init(phys_offset_to_gaddr_t f, void *opaque) { unsigned long size; struct rlimit rlimit_as; mapcache = g_malloc0(sizeof (MapCache)); + mapcache->phys_offset_to_gaddr = f; + mapcache->opaque = opaque; + QTAILQ_INIT(&mapcache->locked_entries); mapcache->last_address_index = -1; @@ -191,9 +197,14 @@ uint8_t *xen_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock) { MapCacheEntry *entry, *pentry = NULL; - target_phys_addr_t address_index = phys_addr >> MCACHE_BUCKET_SHIFT; - target_phys_addr_t address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1); + target_phys_addr_t address_index; + target_phys_addr_t address_offset; target_phys_addr_t __size = size; + bool translated = false; + +tryagain: + address_index = phys_addr >> MCACHE_BUCKET_SHIFT; + address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1); trace_xen_map_cache(phys_addr); @@ -235,6 +246,11 @@ uint8_t *xen_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, if(!test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT, entry->valid_mapping)) { mapcache->last_address_index = -1; + if (!translated && mapcache->phys_offset_to_gaddr) { + phys_addr = mapcache->phys_offset_to_gaddr(phys_addr, size, mapcache->opaque); + translated = true; + goto tryagain; + } trace_xen_map_cache_return(NULL); return NULL; } diff --git a/xen-mapcache.h b/xen-mapcache.h index da874ca..70301a5 100644 --- a/xen-mapcache.h +++ b/xen-mapcache.h @@ -11,9 +11,13 @@ #include <stdlib.h> +typedef target_phys_addr_t (*phys_offset_to_gaddr_t)(target_phys_addr_t start_addr, + ram_addr_t size, + void *opaque); #ifdef CONFIG_XEN -void xen_map_cache_init(void); +void xen_map_cache_init(phys_offset_to_gaddr_t f, + void *opaque); uint8_t *xen_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock); ram_addr_t xen_ram_addr_from_mapcache(void *ptr); @@ -22,7 +26,8 @@ void xen_invalidate_map_cache(void); #else -static inline void xen_map_cache_init(void) +static inline void xen_map_cache_init(phys_offset_to_gaddr_t f, + void *opaque) { } -- 1.7.2.5
Set runstate to RUN_STATE_INMIGRATE as soon as we can on resume. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- vl.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/vl.c b/vl.c index 6f0435b..bb0139f 100644 --- a/vl.c +++ b/vl.c @@ -2972,6 +2972,7 @@ int main(int argc, char **argv, char **envp) break; case QEMU_OPTION_incoming: incoming = optarg; + runstate_set(RUN_STATE_INMIGRATE); break; case QEMU_OPTION_nodefaults: default_serial = 0; @@ -3468,7 +3469,6 @@ int main(int argc, char **argv, char **envp) } if (incoming) { - runstate_set(RUN_STATE_INMIGRATE); int ret = qemu_start_incoming_migration(incoming); if (ret < 0) { fprintf(stderr, "Migration failed. Exit code %s(%d), exiting.\n", -- 1.7.2.5
Stefano Stabellini
2012-Jan-20 17:21 UTC
[PATCH v4 4/6] cirrus_vga: do not reset videoram on resume
From: Anthony PERARD <anthony.perard@citrix.com> When resuming we shouldn''t set the videoram to 0xff considering that we are about to read it from the savefile. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- hw/cirrus_vga.c | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c index f7b1d3d..eec2fc0 100644 --- a/hw/cirrus_vga.c +++ b/hw/cirrus_vga.c @@ -32,6 +32,7 @@ #include "console.h" #include "vga_int.h" #include "loader.h" +#include "sysemu.h" /* * TODO: @@ -2760,9 +2761,11 @@ static void cirrus_reset(void *opaque) } s->vga.cr[0x27] = s->device_id; - /* Win2K seems to assume that the pattern buffer is at 0xff - initially ! */ - memset(s->vga.vram_ptr, 0xff, s->real_vram_size); + if (!runstate_check(RUN_STATE_INMIGRATE)) { + /* Win2K seems to assume that the pattern buffer is at 0xff + initially ! */ + memset(s->vga.vram_ptr, 0xff, s->real_vram_size); + } s->cirrus_hidden_dac_lockindex = 5; s->cirrus_hidden_dac_data = 0; -- 1.7.2.5
Stefano Stabellini
2012-Jan-20 17:21 UTC
[PATCH v4 5/6] xen: record physmap changes to xenstore
Write to xenstore any physmap changes so that the hypervisor can be aware of them. Read physmap changes from xenstore on boot. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen-all.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 77 insertions(+), 1 deletions(-) diff --git a/xen-all.c b/xen-all.c index 507d93d..bb66c82 100644 --- a/xen-all.c +++ b/xen-all.c @@ -63,7 +63,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu) typedef struct XenPhysmap { target_phys_addr_t start_addr; ram_addr_t size; - MemoryRegion *mr; + char *name; target_phys_addr_t phys_offset; QLIST_ENTRY(XenPhysmap) list; @@ -253,6 +253,7 @@ static int xen_add_to_physmap(XenIOState *state, XenPhysmap *physmap = NULL; target_phys_addr_t pfn, start_gpfn; target_phys_addr_t phys_offset = memory_region_get_ram_addr(mr); + char path[80], value[17]; if (get_physmapping(state, start_addr, size)) { return 0; @@ -291,6 +292,7 @@ go_physmap: physmap->start_addr = start_addr; physmap->size = size; + physmap->name = (char *)mr->name; physmap->phys_offset = phys_offset; QLIST_INSERT_HEAD(&state->physmap, physmap, list); @@ -299,6 +301,30 @@ go_physmap: start_addr >> TARGET_PAGE_BITS, (start_addr + size) >> TARGET_PAGE_BITS, XEN_DOMCTL_MEM_CACHEATTR_WB); + + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%"PRIx64"/start_addr", + xen_domid, (uint64_t)phys_offset); + snprintf(value, sizeof(value), "%"PRIx64, (uint64_t)start_addr); + if (!xs_write(state->xenstore, 0, path, value, strlen(value))) { + return -1; + } + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%"PRIx64"/size", + xen_domid, (uint64_t)phys_offset); + snprintf(value, sizeof(value), "%"PRIx64, (uint64_t)size); + if (!xs_write(state->xenstore, 0, path, value, strlen(value))) { + return -1; + } + if (mr->name) { + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%"PRIx64"/name", + xen_domid, (uint64_t)phys_offset); + if (!xs_write(state->xenstore, 0, path, mr->name, strlen(mr->name))) { + return -1; + } + } + return 0; } @@ -926,6 +952,55 @@ int xen_init(void) return 0; } +static void xen_read_physmap(XenIOState *state) +{ + XenPhysmap *physmap = NULL; + unsigned int len, num, i; + char path[80], *value = NULL; + char **entries = NULL; + + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap", xen_domid); + entries = xs_directory(state->xenstore, 0, path, &num); + if (entries == NULL) + return; + + for (i = 0; i < num; i++) { + physmap = g_malloc(sizeof (XenPhysmap)); + physmap->phys_offset = strtoull(entries[i], NULL, 16); + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%s/start_addr", + xen_domid, entries[i]); + value = xs_read(state->xenstore, 0, path, &len); + if (value == NULL) { + free(physmap); + continue; + } + physmap->start_addr = strtoull(value, NULL, 16); + free(value); + + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%s/size", + xen_domid, entries[i]); + value = xs_read(state->xenstore, 0, path, &len); + if (value == NULL) { + free(physmap); + continue; + } + physmap->size = strtoull(value, NULL, 16); + free(value); + + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%s/name", + xen_domid, entries[i]); + physmap->name = xs_read(state->xenstore, 0, path, &len); + + QLIST_INSERT_HEAD(&state->physmap, physmap, list); + } + free(entries); + return; +} + int xen_hvm_init(void) { int i, rc; @@ -998,6 +1073,7 @@ int xen_hvm_init(void) xen_be_register("console", &xen_console_ops); xen_be_register("vkbd", &xen_kbdmouse_ops); xen_be_register("qdisk", &xen_blkdev_ops); + xen_read_physmap(state); return 0; } -- 1.7.2.5
Stefano Stabellini
2012-Jan-20 17:21 UTC
[PATCH v4 6/6] xen: change memory access behavior during migration.
From: Anthony PERARD <anthony.perard@citrix.com> Do not allocate RAM during INMIGRATE runstate. Do not actually "do" set_memory during migration. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen-all.c | 16 ++++++++++++++++ 1 files changed, 16 insertions(+), 0 deletions(-) diff --git a/xen-all.c b/xen-all.c index bb66c82..9a80c30 100644 --- a/xen-all.c +++ b/xen-all.c @@ -190,6 +190,14 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr) xen_pfn_t *pfn_list; int i; + if (runstate_check(RUN_STATE_INMIGRATE)) { + /* RAM already populated in Xen */ + fprintf(stderr, "%s: do not alloc "RAM_ADDR_FMT + " bytes of ram at "RAM_ADDR_FMT" when runstate is INMIGRATE\n", + __func__, size, ram_addr); + return; + } + if (mr == &ram_memory) { return; } @@ -255,6 +263,14 @@ static int xen_add_to_physmap(XenIOState *state, target_phys_addr_t phys_offset = memory_region_get_ram_addr(mr); char path[80], value[17]; + if (runstate_check(RUN_STATE_INMIGRATE)) { + fprintf(stderr, "%s: do not add "RAM_ADDR_FMT + " bytes of ram to physmap at "RAM_ADDR_FMT + " when runstate is INMIGRATE\n", + __func__, size, start_addr); + return 0; + } + if (get_physmapping(state, start_addr, size)) { return 0; } -- 1.7.2.5
On 2012-01-20 18:20, Stefano Stabellini wrote:> Hi all, > this is the fourth version of the Xen save/restore patch series. > We have been discussing this issue for quite a while on #qemu and > qemu-devel: > > > http://marc.info/?l=qemu-devel&m=132346828427314&w=2 > http://marc.info/?l=qemu-devel&m=132377734605464&w=2 > > > A few different approaches were proposed to achieve the goal > of a working save/restore with upstream Qemu on Xen, however after > prototyping some of them I came up with yet another solution, that I > think leads to the best results with the less amount of code > duplications and ugliness. > Far from saying that this patch series is an example of elegance and > simplicity, but it is closer to acceptable anything else I have seen so > far. > > What''s new is that Qemu is going to keep track of its own physmap on > xenstore, so that Xen can be fully aware of the changes Qemu makes to > the guest''s memory map at any time. > This is all handled by Xen or Xen support in Qemu internally and can be > used to solve our save/restore framebuffer problem. > >>From the Qemu common code POV, we still need to avoid saving the guest''s > ram when running on Xen, and we need to avoid resetting the videoram on > restore (that is a benefit to the generic Qemu case too, because it > saves few cpu cycles).For my understanding: Refraining from the memset is required as the already restored vram would then be overwritten? Or what is the ordering of init, RAM restore, and initial device reset now? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
On Fri, 20 Jan 2012, Jan Kiszka wrote:> On 2012-01-20 18:20, Stefano Stabellini wrote: > > Hi all, > > this is the fourth version of the Xen save/restore patch series. > > We have been discussing this issue for quite a while on #qemu and > > qemu-devel: > > > > > > http://marc.info/?l=qemu-devel&m=132346828427314&w=2 > > http://marc.info/?l=qemu-devel&m=132377734605464&w=2 > > > > > > A few different approaches were proposed to achieve the goal > > of a working save/restore with upstream Qemu on Xen, however after > > prototyping some of them I came up with yet another solution, that I > > think leads to the best results with the less amount of code > > duplications and ugliness. > > Far from saying that this patch series is an example of elegance and > > simplicity, but it is closer to acceptable anything else I have seen so > > far. > > > > What''s new is that Qemu is going to keep track of its own physmap on > > xenstore, so that Xen can be fully aware of the changes Qemu makes to > > the guest''s memory map at any time. > > This is all handled by Xen or Xen support in Qemu internally and can be > > used to solve our save/restore framebuffer problem. > > > >>From the Qemu common code POV, we still need to avoid saving the guest''s > > ram when running on Xen, and we need to avoid resetting the videoram on > > restore (that is a benefit to the generic Qemu case too, because it > > saves few cpu cycles). > > For my understanding: Refraining from the memset is required as the > already restored vram would then be overwritten?Yep> Or what is the ordering > of init, RAM restore, and initial device reset now?RAM restore (done by Xen) physmap rebuild (done by xen_hvm_init in qemu) pc_init() qemu_system_reset() load_vmstate()
On 2012-01-23 11:47, Stefano Stabellini wrote:> On Fri, 20 Jan 2012, Jan Kiszka wrote: >> On 2012-01-20 18:20, Stefano Stabellini wrote: >>> Hi all, >>> this is the fourth version of the Xen save/restore patch series. >>> We have been discussing this issue for quite a while on #qemu and >>> qemu-devel: >>> >>> >>> http://marc.info/?l=qemu-devel&m=132346828427314&w=2 >>> http://marc.info/?l=qemu-devel&m=132377734605464&w=2 >>> >>> >>> A few different approaches were proposed to achieve the goal >>> of a working save/restore with upstream Qemu on Xen, however after >>> prototyping some of them I came up with yet another solution, that I >>> think leads to the best results with the less amount of code >>> duplications and ugliness. >>> Far from saying that this patch series is an example of elegance and >>> simplicity, but it is closer to acceptable anything else I have seen so >>> far. >>> >>> What''s new is that Qemu is going to keep track of its own physmap on >>> xenstore, so that Xen can be fully aware of the changes Qemu makes to >>> the guest''s memory map at any time. >>> This is all handled by Xen or Xen support in Qemu internally and can be >>> used to solve our save/restore framebuffer problem. >>> >>> >From the Qemu common code POV, we still need to avoid saving the guest''s >>> ram when running on Xen, and we need to avoid resetting the videoram on >>> restore (that is a benefit to the generic Qemu case too, because it >>> saves few cpu cycles). >> >> For my understanding: Refraining from the memset is required as the >> already restored vram would then be overwritten? > > Yep > >> Or what is the ordering >> of init, RAM restore, and initial device reset now? > > RAM restore (done by Xen) > > physmap rebuild (done by xen_hvm_init in qemu) > pc_init() > qemu_system_reset() > load_vmstate()Hmm, are you sure that this is the only case where a device init or reset handler writes to already restored guest memory? Preloading the RAM this way is a non-standard scenario for QEMU, thus conceptually fragile. Does restoring happen before QEMU is even started, or can this point be controlled from QEMU? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
On Mon, 23 Jan 2012, Jan Kiszka wrote:> >> Or what is the ordering > >> of init, RAM restore, and initial device reset now? > > > > RAM restore (done by Xen) > > > > physmap rebuild (done by xen_hvm_init in qemu) > > pc_init() > > qemu_system_reset() > > load_vmstate() > > Hmm, are you sure that this is the only case where a device init or > reset handler writes to already restored guest memory? Preloading the > RAM this way is a non-standard scenario for QEMU, thus conceptually > fragile. Does restoring happen before QEMU is even started, or can this > point be controlled from QEMU?Consider that this only happens with non-MMIO device memory, in practice only videoram. Vmware VGA does not memset the videoram in the reset handler, while QXL already has the following: /* pre loadvm reset must not touch QXLRam. This lives in * device memory, is migrated together with RAM and thus * already loaded at this point */ if (!loadvm) { qxl_reset_state(d); }
On 2012-01-23 12:59, Stefano Stabellini wrote:> On Mon, 23 Jan 2012, Jan Kiszka wrote: >>>> Or what is the ordering >>>> of init, RAM restore, and initial device reset now? >>> >>> RAM restore (done by Xen) >>> >>> physmap rebuild (done by xen_hvm_init in qemu) >>> pc_init() >>> qemu_system_reset() >>> load_vmstate() >> >> Hmm, are you sure that this is the only case where a device init or >> reset handler writes to already restored guest memory? Preloading the >> RAM this way is a non-standard scenario for QEMU, thus conceptually >> fragile. Does restoring happen before QEMU is even started, or can this >> point be controlled from QEMU? > > Consider that this only happens with non-MMIO device memory, in practice > only videoram. > Vmware VGA does not memset the videoram in the reset handler, while QXL > already has the following: > > /* pre loadvm reset must not touch QXLRam. This lives in > * device memory, is migrated together with RAM and thus > * already loaded at this point */ > if (!loadvm) { > qxl_reset_state(d); > }Yes, but QEMU restores the RAM _after_ device reset, not before it. That''s the problem with the Xen way - it is against the current QEMU standard. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
On Mon, 23 Jan 2012, Jan Kiszka wrote:> On 2012-01-23 12:59, Stefano Stabellini wrote: > > On Mon, 23 Jan 2012, Jan Kiszka wrote: > >>>> Or what is the ordering > >>>> of init, RAM restore, and initial device reset now? > >>> > >>> RAM restore (done by Xen) > >>> > >>> physmap rebuild (done by xen_hvm_init in qemu) > >>> pc_init() > >>> qemu_system_reset() > >>> load_vmstate() > >> > >> Hmm, are you sure that this is the only case where a device init or > >> reset handler writes to already restored guest memory? Preloading the > >> RAM this way is a non-standard scenario for QEMU, thus conceptually > >> fragile. Does restoring happen before QEMU is even started, or can this > >> point be controlled from QEMU? > > > > Consider that this only happens with non-MMIO device memory, in practice > > only videoram. > > Vmware VGA does not memset the videoram in the reset handler, while QXL > > already has the following: > > > > /* pre loadvm reset must not touch QXLRam. This lives in > > * device memory, is migrated together with RAM and thus > > * already loaded at this point */ > > if (!loadvm) { > > qxl_reset_state(d); > > } > > Yes, but QEMU restores the RAM _after_ device reset, not before it. > That''s the problem with the Xen way - it is against the current QEMU > standard.QEMU doesn''t save/restore the RAM (and the videoram) at all on Xen. To reply to your previous question more clearly: at restore time Qemu on Xen would run in a non-standard scenario; the restore of the RAM happens before QEMU is even started. That is unfortunate but it would be very hard to change (I can give you more details if you are interested in the reasons why it would be so difficult).
On 2012-01-23 15:46, Stefano Stabellini wrote:> On Mon, 23 Jan 2012, Jan Kiszka wrote: >> On 2012-01-23 12:59, Stefano Stabellini wrote: >>> On Mon, 23 Jan 2012, Jan Kiszka wrote: >>>>>> Or what is the ordering >>>>>> of init, RAM restore, and initial device reset now? >>>>> >>>>> RAM restore (done by Xen) >>>>> >>>>> physmap rebuild (done by xen_hvm_init in qemu) >>>>> pc_init() >>>>> qemu_system_reset() >>>>> load_vmstate() >>>> >>>> Hmm, are you sure that this is the only case where a device init or >>>> reset handler writes to already restored guest memory? Preloading the >>>> RAM this way is a non-standard scenario for QEMU, thus conceptually >>>> fragile. Does restoring happen before QEMU is even started, or can this >>>> point be controlled from QEMU? >>> >>> Consider that this only happens with non-MMIO device memory, in practice >>> only videoram. >>> Vmware VGA does not memset the videoram in the reset handler, while QXL >>> already has the following: >>> >>> /* pre loadvm reset must not touch QXLRam. This lives in >>> * device memory, is migrated together with RAM and thus >>> * already loaded at this point */ >>> if (!loadvm) { >>> qxl_reset_state(d); >>> } >> >> Yes, but QEMU restores the RAM _after_ device reset, not before it. >> That''s the problem with the Xen way - it is against the current QEMU >> standard. > > QEMU doesn''t save/restore the RAM (and the videoram) at all on Xen.But it does otherwise, and that''s the scenario the code you cited was written for. It won''t work as is under Xen.> > To reply to your previous question more clearly: at restore time Qemu on > Xen would run in a non-standard scenario; the restore of the RAM happens > before QEMU is even started. > > That is unfortunate but it would be very hard to change (I can give you > more details if you are interested in the reasons why it would be so > difficult).If you can''t change this, you need to properly introduce this new scenario - pre-initialized RAM - to the QEMU device model. Or you will see breakage outside cirrus sooner or later as well. So it might be good to explain the reason why it can''t be changed under Xen when motivating this concept extension to QEMU. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Anthony Liguori
2012-Jan-23 15:58 UTC
Re: [PATCH v4 1/6] vl.c: do not save the RAM state when Xen is enabled
On 01/20/2012 11:21 AM, Stefano Stabellini wrote:> From: Anthony PERARD<anthony.perard@citrix.com> > > In the Xen case, the guest RAM is not handle by QEMU, and it is saved by > Xen tools. > So, we just avoid to register the RAM save state handler. > > Signed-off-by: Anthony PERARD<anthony.perard@citrix.com> > Signed-off-by: Stefano Stabellini<stefano.stabellini@eu.citrix.com> > --- > vl.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/vl.c b/vl.c > index ba55b35..6f0435b 100644 > --- a/vl.c > +++ b/vl.c > @@ -3270,8 +3270,10 @@ int main(int argc, char **argv, char **envp) > default_drive(default_sdcard, snapshot, machine->use_scsi, > IF_SD, 0, SD_OPTS); > > - register_savevm_live(NULL, "ram", 0, 4, NULL, ram_save_live, NULL, > - ram_load, NULL); > + if (!xen_enabled()) { > + register_savevm_live(NULL, "ram", 0, 4, NULL, ram_save_live, NULL, > + ram_load, NULL); > + }Why not introduce new Xen specific commands like I suggested on IRC? This sort of change is extremely non-intuitive. We should do as much as we can to avoid magic #ifdefs like this. Regards, Anthony Liguori> > if (nb_numa_nodes> 0) { > int i;
On 01/23/2012 04:47 AM, Stefano Stabellini wrote:> On Fri, 20 Jan 2012, Jan Kiszka wrote: >> On 2012-01-20 18:20, Stefano Stabellini wrote: >>> Hi all, >>> this is the fourth version of the Xen save/restore patch series. >>> We have been discussing this issue for quite a while on #qemu and >>> qemu-devel: >>> >>> >>> http://marc.info/?l=qemu-devel&m=132346828427314&w=2 >>> http://marc.info/?l=qemu-devel&m=132377734605464&w=2 >>> >>> >>> A few different approaches were proposed to achieve the goal >>> of a working save/restore with upstream Qemu on Xen, however after >>> prototyping some of them I came up with yet another solution, that I >>> think leads to the best results with the less amount of code >>> duplications and ugliness. >>> Far from saying that this patch series is an example of elegance and >>> simplicity, but it is closer to acceptable anything else I have seen so >>> far. >>> >>> What''s new is that Qemu is going to keep track of its own physmap on >>> xenstore, so that Xen can be fully aware of the changes Qemu makes to >>> the guest''s memory map at any time. >>> This is all handled by Xen or Xen support in Qemu internally and can be >>> used to solve our save/restore framebuffer problem. >>> >>> > From the Qemu common code POV, we still need to avoid saving the guest''s >>> ram when running on Xen, and we need to avoid resetting the videoram on >>> restore (that is a benefit to the generic Qemu case too, because it >>> saves few cpu cycles). >> >> For my understanding: Refraining from the memset is required as the >> already restored vram would then be overwritten? > > Yep > >> Or what is the ordering >> of init, RAM restore, and initial device reset now? > > RAM restore (done by Xen) > > physmap rebuild (done by xen_hvm_init in qemu) > pc_init() > qemu_system_reset() > load_vmstate()That''s your problem. You don''t want to do load_vmstate(). You just want to load the device model, not RAM. You should have a separate load_device_state() function and mark anything that is RAM as RAM when doing savevm registration. Better yet, mark devices as devices since that''s what you really care about. Regards, Anthony Liguori
On Mon, 23 Jan 2012, Jan Kiszka wrote:> On 2012-01-23 15:46, Stefano Stabellini wrote: > > On Mon, 23 Jan 2012, Jan Kiszka wrote: > >> On 2012-01-23 12:59, Stefano Stabellini wrote: > >>> On Mon, 23 Jan 2012, Jan Kiszka wrote: > >>>>>> Or what is the ordering > >>>>>> of init, RAM restore, and initial device reset now? > >>>>> > >>>>> RAM restore (done by Xen) > >>>>> > >>>>> physmap rebuild (done by xen_hvm_init in qemu) > >>>>> pc_init() > >>>>> qemu_system_reset() > >>>>> load_vmstate() > >>>> > >>>> Hmm, are you sure that this is the only case where a device init or > >>>> reset handler writes to already restored guest memory? Preloading the > >>>> RAM this way is a non-standard scenario for QEMU, thus conceptually > >>>> fragile. Does restoring happen before QEMU is even started, or can this > >>>> point be controlled from QEMU? > >>> > >>> Consider that this only happens with non-MMIO device memory, in practice > >>> only videoram. > >>> Vmware VGA does not memset the videoram in the reset handler, while QXL > >>> already has the following: > >>> > >>> /* pre loadvm reset must not touch QXLRam. This lives in > >>> * device memory, is migrated together with RAM and thus > >>> * already loaded at this point */ if (!loadvm) { > >>> qxl_reset_state(d); } > >> > >> Yes, but QEMU restores the RAM _after_ device reset, not before it. > >> That''s the problem with the Xen way - it is against the current > >> QEMU standard. > > > > QEMU doesn''t save/restore the RAM (and the videoram) at all on Xen. > > But it does otherwise, and that''s the scenario the code you cited was > written for. It won''t work as is under Xen.Ah, I see your point now. In that regard, is the comment above even correct? I am referring to "migrated together with RAM and thus already loaded at this point"?> > To reply to your previous question more clearly: at restore time Qemu on > > Xen would run in a non-standard scenario; the restore of the RAM happens > > before QEMU is even started. > > > > That is unfortunate but it would be very hard to change (I can give you > > more details if you are interested in the reasons why it would be so > > difficult). > > If you can''t change this, you need to properly introduce this new > scenario - pre-initialized RAM - to the QEMU device model. Or you will > see breakage outside cirrus sooner or later as well. So it might be good > to explain the reason why it can''t be changed under Xen when motivating > this concept extension to QEMU.OK. Are you thinking about introducing this concept as a new runstate? This special runstate could be set at restore time only on Xen. BTW the main reasons for having Xen saving the RAM are: - the need to support PV guests, that often run without Qemu; - the current save format, that is built around the fact that Xen saves the memory; - the fact that Qemu might be running in a very limited stub-domain.
On 01/23/2012 10:16 AM, Stefano Stabellini wrote:> On Mon, 23 Jan 2012, Jan Kiszka wrote: >> On 2012-01-23 15:46, Stefano Stabellini wrote: >>> On Mon, 23 Jan 2012, Jan Kiszka wrote: >>>> On 2012-01-23 12:59, Stefano Stabellini wrote: >>>>> On Mon, 23 Jan 2012, Jan Kiszka wrote: >>>>>>>> Or what is the ordering >>>>>>>> of init, RAM restore, and initial device reset now? >>>>>>> >>>>>>> RAM restore (done by Xen) >>>>>>> >>>>>>> physmap rebuild (done by xen_hvm_init in qemu) >>>>>>> pc_init() >>>>>>> qemu_system_reset() >>>>>>> load_vmstate() >>>>>> >>>>>> Hmm, are you sure that this is the only case where a device init or >>>>>> reset handler writes to already restored guest memory? Preloading the >>>>>> RAM this way is a non-standard scenario for QEMU, thus conceptually >>>>>> fragile. Does restoring happen before QEMU is even started, or can this >>>>>> point be controlled from QEMU? >>>>> >>>>> Consider that this only happens with non-MMIO device memory, in practice >>>>> only videoram. >>>>> Vmware VGA does not memset the videoram in the reset handler, while QXL >>>>> already has the following: >>>>> >>>>> /* pre loadvm reset must not touch QXLRam. This lives in >>>>> * device memory, is migrated together with RAM and thus >>>>> * already loaded at this point */ if (!loadvm) { >>>>> qxl_reset_state(d); } >>>> >>>> Yes, but QEMU restores the RAM _after_ device reset, not before it. >>>> That''s the problem with the Xen way - it is against the current >>>> QEMU standard. >>> >>> QEMU doesn''t save/restore the RAM (and the videoram) at all on Xen. >> >> But it does otherwise, and that''s the scenario the code you cited was >> written for. It won''t work as is under Xen. > > Ah, I see your point now. > In that regard, is the comment above even correct? > I am referring to "migrated together with RAM and thus already loaded at > this point"? > > >>> To reply to your previous question more clearly: at restore time Qemu on >>> Xen would run in a non-standard scenario; the restore of the RAM happens >>> before QEMU is even started. >>> >>> That is unfortunate but it would be very hard to change (I can give you >>> more details if you are interested in the reasons why it would be so >>> difficult). >> >> If you can''t change this, you need to properly introduce this new >> scenario - pre-initialized RAM - to the QEMU device model. Or you will >> see breakage outside cirrus sooner or later as well. So it might be good >> to explain the reason why it can''t be changed under Xen when motivating >> this concept extension to QEMU. > > OK. > Are you thinking about introducing this concept as a new runstate? > This special runstate could be set at restore time only on Xen.A runstate is not the right approach. Don''t abuse existing commands/protocols to make them have a different function on Xen. Just introduce a new command that has the behavior you want. Regards, Anthony Liguori
On Mon, 23 Jan 2012, Anthony Liguori wrote:> On 01/23/2012 04:47 AM, Stefano Stabellini wrote: > > On Fri, 20 Jan 2012, Jan Kiszka wrote: > >> On 2012-01-20 18:20, Stefano Stabellini wrote: > >>> Hi all, > >>> this is the fourth version of the Xen save/restore patch series. > >>> We have been discussing this issue for quite a while on #qemu and > >>> qemu-devel: > >>> > >>> > >>> http://marc.info/?l=qemu-devel&m=132346828427314&w=2 > >>> http://marc.info/?l=qemu-devel&m=132377734605464&w=2 > >>> > >>> > >>> A few different approaches were proposed to achieve the goal > >>> of a working save/restore with upstream Qemu on Xen, however after > >>> prototyping some of them I came up with yet another solution, that I > >>> think leads to the best results with the less amount of code > >>> duplications and ugliness. > >>> Far from saying that this patch series is an example of elegance and > >>> simplicity, but it is closer to acceptable anything else I have seen so > >>> far. > >>> > >>> What''s new is that Qemu is going to keep track of its own physmap on > >>> xenstore, so that Xen can be fully aware of the changes Qemu makes to > >>> the guest''s memory map at any time. > >>> This is all handled by Xen or Xen support in Qemu internally and can be > >>> used to solve our save/restore framebuffer problem. > >>> > >>> > From the Qemu common code POV, we still need to avoid saving the guest''s > >>> ram when running on Xen, and we need to avoid resetting the videoram on > >>> restore (that is a benefit to the generic Qemu case too, because it > >>> saves few cpu cycles). > >> > >> For my understanding: Refraining from the memset is required as the > >> already restored vram would then be overwritten? > > > > Yep > > > >> Or what is the ordering > >> of init, RAM restore, and initial device reset now? > > > > RAM restore (done by Xen) > > > > physmap rebuild (done by xen_hvm_init in qemu) > > pc_init() > > qemu_system_reset() > > load_vmstate() > > That''s your problem. You don''t want to do load_vmstate(). You just want to > load the device model, not RAM.True> Why not introduce new Xen specific commands like I suggested on IRC?Introducing a Xen specific command is not an issue, but I didn''t want to duplicate all the functionalities currently in savevm.c.> You should have a separate load_device_state() function and mark anything that > is RAM as RAM when doing savevm registration. Better yet, mark devices as > devices since that''s what you really care about.I dropped this approach because I thought it causes too much code duplication. However, following your suggestion, if I add a generic "device" flag in SaveStateEntry and implement a generic qemu_save_device_state in savevm.c, I believe that the duplication of code would be small. And patch #1 could go away. However the issue of patch #4, "do not reset videoram on resume", still remains: no matter what parameter I pass to Qemu, if qemu_system_reset is called on resume the videoram is going to be overwritten by 0xff. In this regard, don''t you think it would be advantageous to Qemu in general not to reset the videram in resume? It can be pretty large, so it is a significant waste of a memset.
On 01/23/2012 10:46 AM, Stefano Stabellini wrote:> On Mon, 23 Jan 2012, Anthony Liguori wrote: >> On 01/23/2012 04:47 AM, Stefano Stabellini wrote: >>> On Fri, 20 Jan 2012, Jan Kiszka wrote: >>>> On 2012-01-20 18:20, Stefano Stabellini wrote: >>>>> Hi all, >>>>> this is the fourth version of the Xen save/restore patch series. >>>>> We have been discussing this issue for quite a while on #qemu and >>>>> qemu-devel: >>>>> >>>>> >>>>> http://marc.info/?l=qemu-devel&m=132346828427314&w=2 >>>>> http://marc.info/?l=qemu-devel&m=132377734605464&w=2 >>>>> >>>>> >>>>> A few different approaches were proposed to achieve the goal >>>>> of a working save/restore with upstream Qemu on Xen, however after >>>>> prototyping some of them I came up with yet another solution, that I >>>>> think leads to the best results with the less amount of code >>>>> duplications and ugliness. >>>>> Far from saying that this patch series is an example of elegance and >>>>> simplicity, but it is closer to acceptable anything else I have seen so >>>>> far. >>>>> >>>>> What''s new is that Qemu is going to keep track of its own physmap on >>>>> xenstore, so that Xen can be fully aware of the changes Qemu makes to >>>>> the guest''s memory map at any time. >>>>> This is all handled by Xen or Xen support in Qemu internally and can be >>>>> used to solve our save/restore framebuffer problem. >>>>> >>>>>> From the Qemu common code POV, we still need to avoid saving the guest''s >>>>> ram when running on Xen, and we need to avoid resetting the videoram on >>>>> restore (that is a benefit to the generic Qemu case too, because it >>>>> saves few cpu cycles). >>>> >>>> For my understanding: Refraining from the memset is required as the >>>> already restored vram would then be overwritten? >>> >>> Yep >>> >>>> Or what is the ordering >>>> of init, RAM restore, and initial device reset now? >>> >>> RAM restore (done by Xen) >>> >>> physmap rebuild (done by xen_hvm_init in qemu) >>> pc_init() >>> qemu_system_reset() >>> load_vmstate() >> >> That''s your problem. You don''t want to do load_vmstate(). You just want to >> load the device model, not RAM. > > True > > >> Why not introduce new Xen specific commands like I suggested on IRC? > > Introducing a Xen specific command is not an issue, but I didn''t want to > duplicate all the functionalities currently in savevm.c.The code is fairly reusable since live migration and savevm use the same internal bits. I think you would just need another version of qemu_loadvm_state(). That function is only a hundred lines or so so you shouldn''t be duplicating much at all.>> You should have a separate load_device_state() function and mark anything that >> is RAM as RAM when doing savevm registration. Better yet, mark devices as >> devices since that''s what you really care about. > > I dropped this approach because I thought it causes too much code > duplication.Then you''re doing it wrong :-) But even if there is, just refactor out the common code.> However, following your suggestion, if I add a generic "device" flag in > SaveStateEntry and implement a generic qemu_save_device_state in > savevm.c, I believe that the duplication of code would be small. > And patch #1 could go away.Yup.> > > However the issue of patch #4, "do not reset videoram on resume", still > remains: no matter what parameter I pass to Qemu, if qemu_system_reset > is called on resume the videoram is going to be overwritten by 0xff.The memset(0xff) looks dubious to me. My guess is that this could be moved to the vgabios-cirrus which would solve your problem.> In this regard, don''t you think it would be advantageous to Qemu in > general not to reset the videram in resume? It can be pretty large, so > it is a significant waste of a memset.It claims to fix a real bug. Moving the memset to vgabios would do what you want to do in a more robust way I think. Regards, Anthony Liguori
On Mon, 23 Jan 2012, Anthony Liguori wrote:> On 01/23/2012 10:46 AM, Stefano Stabellini wrote: > > On Mon, 23 Jan 2012, Anthony Liguori wrote: > >> On 01/23/2012 04:47 AM, Stefano Stabellini wrote: > >>> On Fri, 20 Jan 2012, Jan Kiszka wrote: > >>>> On 2012-01-20 18:20, Stefano Stabellini wrote: > >>>>> Hi all, > >>>>> this is the fourth version of the Xen save/restore patch series. > >>>>> We have been discussing this issue for quite a while on #qemu and > >>>>> qemu-devel: > >>>>> > >>>>> > >>>>> http://marc.info/?l=qemu-devel&m=132346828427314&w=2 > >>>>> http://marc.info/?l=qemu-devel&m=132377734605464&w=2 > >>>>> > >>>>> > >>>>> A few different approaches were proposed to achieve the goal > >>>>> of a working save/restore with upstream Qemu on Xen, however after > >>>>> prototyping some of them I came up with yet another solution, that I > >>>>> think leads to the best results with the less amount of code > >>>>> duplications and ugliness. > >>>>> Far from saying that this patch series is an example of elegance and > >>>>> simplicity, but it is closer to acceptable anything else I have seen so > >>>>> far. > >>>>> > >>>>> What''s new is that Qemu is going to keep track of its own physmap on > >>>>> xenstore, so that Xen can be fully aware of the changes Qemu makes to > >>>>> the guest''s memory map at any time. > >>>>> This is all handled by Xen or Xen support in Qemu internally and can be > >>>>> used to solve our save/restore framebuffer problem. > >>>>> > >>>>>> From the Qemu common code POV, we still need to avoid saving the guest''s > >>>>> ram when running on Xen, and we need to avoid resetting the videoram on > >>>>> restore (that is a benefit to the generic Qemu case too, because it > >>>>> saves few cpu cycles). > >>>> > >>>> For my understanding: Refraining from the memset is required as the > >>>> already restored vram would then be overwritten? > >>> > >>> Yep > >>> > >>>> Or what is the ordering > >>>> of init, RAM restore, and initial device reset now? > >>> > >>> RAM restore (done by Xen) > >>> > >>> physmap rebuild (done by xen_hvm_init in qemu) > >>> pc_init() > >>> qemu_system_reset() > >>> load_vmstate() > >> > >> That''s your problem. You don''t want to do load_vmstate(). You just want to > >> load the device model, not RAM. > > > > True > > > > > >> Why not introduce new Xen specific commands like I suggested on IRC? > > > > Introducing a Xen specific command is not an issue, but I didn''t want to > > duplicate all the functionalities currently in savevm.c. > > The code is fairly reusable since live migration and savevm use the same > internal bits. I think you would just need another version of > qemu_loadvm_state(). That function is only a hundred lines or so so you > shouldn''t be duplicating much at all. > > >> You should have a separate load_device_state() function and mark anything that > >> is RAM as RAM when doing savevm registration. Better yet, mark devices as > >> devices since that''s what you really care about. > > > > I dropped this approach because I thought it causes too much code > > duplication. > > Then you''re doing it wrong :-) > > But even if there is, just refactor out the common code. > > > However, following your suggestion, if I add a generic "device" flag in > > SaveStateEntry and implement a generic qemu_save_device_state in > > savevm.c, I believe that the duplication of code would be small. > > And patch #1 could go away. > > Yup. > > > > > > > However the issue of patch #4, "do not reset videoram on resume", still > > remains: no matter what parameter I pass to Qemu, if qemu_system_reset > > is called on resume the videoram is going to be overwritten by 0xff. > > The memset(0xff) looks dubious to me. My guess is that this could be moved to > the vgabios-cirrus which would solve your problem. > > > In this regard, don''t you think it would be advantageous to Qemu in > > general not to reset the videram in resume? It can be pretty large, so > > it is a significant waste of a memset. > > It claims to fix a real bug. Moving the memset to vgabios would do what you > want to do in a more robust way I think.I think it does fix a bug (Win2K expects RAM to be 0xff at boot, or so a comment on qemu-xen claims) but certainly it is not supposed to run at restore time. I agree that moving the memset to vgabios should be a better way to fix the problem, I''ll give it a look. Unfortutely it means finding a Win2K install CD to repro the bug, sigh.
On 01/23/2012 11:05 AM, Stefano Stabellini wrote:> On Mon, 23 Jan 2012, Anthony Liguori wrote: >>> However the issue of patch #4, "do not reset videoram on resume", still >>> remains: no matter what parameter I pass to Qemu, if qemu_system_reset >>> is called on resume the videoram is going to be overwritten by 0xff. >> >> The memset(0xff) looks dubious to me. My guess is that this could be moved to >> the vgabios-cirrus which would solve your problem. >> >>> In this regard, don''t you think it would be advantageous to Qemu in >>> general not to reset the videram in resume? It can be pretty large, so >>> it is a significant waste of a memset. >> >> It claims to fix a real bug. Moving the memset to vgabios would do what you >> want to do in a more robust way I think. > > I think it does fix a bug (Win2K expects RAM to be 0xff at boot, or so a > comment on qemu-xen claims) but certainly it is not supposed to run at > restore time. > I agree that moving the memset to vgabios should be a better way to fix > the problem, I''ll give it a look. > Unfortutely it means finding a Win2K install CD to repro the bug, sigh.Right, it''s a bit annoying, but a much nicer solution :-) Thanks, Anthony Liguori
On 2012-01-23 17:16, Stefano Stabellini wrote:> On Mon, 23 Jan 2012, Jan Kiszka wrote: >> On 2012-01-23 15:46, Stefano Stabellini wrote: >>> On Mon, 23 Jan 2012, Jan Kiszka wrote: >>>> On 2012-01-23 12:59, Stefano Stabellini wrote: >>>>> On Mon, 23 Jan 2012, Jan Kiszka wrote: >>>>>>>> Or what is the ordering >>>>>>>> of init, RAM restore, and initial device reset now? >>>>>>> >>>>>>> RAM restore (done by Xen) >>>>>>> >>>>>>> physmap rebuild (done by xen_hvm_init in qemu) >>>>>>> pc_init() >>>>>>> qemu_system_reset() >>>>>>> load_vmstate() >>>>>> >>>>>> Hmm, are you sure that this is the only case where a device init or >>>>>> reset handler writes to already restored guest memory? Preloading the >>>>>> RAM this way is a non-standard scenario for QEMU, thus conceptually >>>>>> fragile. Does restoring happen before QEMU is even started, or can this >>>>>> point be controlled from QEMU? >>>>> >>>>> Consider that this only happens with non-MMIO device memory, in practice >>>>> only videoram. >>>>> Vmware VGA does not memset the videoram in the reset handler, while QXL >>>>> already has the following: >>>>> >>>>> /* pre loadvm reset must not touch QXLRam. This lives in >>>>> * device memory, is migrated together with RAM and thus >>>>> * already loaded at this point */ if (!loadvm) { >>>>> qxl_reset_state(d); } >>>> >>>> Yes, but QEMU restores the RAM _after_ device reset, not before it. >>>> That''s the problem with the Xen way - it is against the current >>>> QEMU standard. >>> >>> QEMU doesn''t save/restore the RAM (and the videoram) at all on Xen. >> >> But it does otherwise, and that''s the scenario the code you cited was >> written for. It won''t work as is under Xen. > > Ah, I see your point now. > In that regard, is the comment above even correct? > I am referring to "migrated together with RAM and thus already loaded at > this point"?The comment is correct. It refers to the invocation of the function from the vmload handler. Here, a flag is passed that says "don''t overwrite".> > >>> To reply to your previous question more clearly: at restore time Qemu on >>> Xen would run in a non-standard scenario; the restore of the RAM happens >>> before QEMU is even started. >>> >>> That is unfortunate but it would be very hard to change (I can give you >>> more details if you are interested in the reasons why it would be so >>> difficult). >> >> If you can''t change this, you need to properly introduce this new >> scenario - pre-initialized RAM - to the QEMU device model. Or you will >> see breakage outside cirrus sooner or later as well. So it might be good >> to explain the reason why it can''t be changed under Xen when motivating >> this concept extension to QEMU. > > OK. > Are you thinking about introducing this concept as a new runstate? > This special runstate could be set at restore time only on Xen. > > > BTW the main reasons for having Xen saving the RAM are: > > - the need to support PV guests, that often run without Qemu; > - the current save format, that is built around the fact that Xen saves the memory; > - the fact that Qemu might be running in a very limited stub-domain.Your problem is not the fact that guest RAM is restored by an external component. Your problem is that QEMU has no control over the when. If you fix this, you could coordinate the restoring with the initial device reset and would solve all potential current and future issues, not only this single cirrus related one. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
On 01/23/2012 11:13 AM, Jan Kiszka wrote:>>>> To reply to your previous question more clearly: at restore time Qemu on >>>> Xen would run in a non-standard scenario; the restore of the RAM happens >>>> before QEMU is even started. >>>> >>>> That is unfortunate but it would be very hard to change (I can give you >>>> more details if you are interested in the reasons why it would be so >>>> difficult). >>> >>> If you can''t change this, you need to properly introduce this new >>> scenario - pre-initialized RAM - to the QEMU device model. Or you will >>> see breakage outside cirrus sooner or later as well. So it might be good >>> to explain the reason why it can''t be changed under Xen when motivating >>> this concept extension to QEMU. >> >> OK. >> Are you thinking about introducing this concept as a new runstate? >> This special runstate could be set at restore time only on Xen. >> >> >> BTW the main reasons for having Xen saving the RAM are: >> >> - the need to support PV guests, that often run without Qemu; >> - the current save format, that is built around the fact that Xen saves the memory; >> - the fact that Qemu might be running in a very limited stub-domain. > > Your problem is not the fact that guest RAM is restored by an external > component. Your problem is that QEMU has no control over the when. If > you fix this, you could coordinate the restoring with the initial device > reset and would solve all potential current and future issues, not only > this single cirrus related one.Generally speaking, RAM is an independent device in most useful cases. Onboard RAM is a very special case because it''s extremely unusual. But since some video cards can make use of dedicated external RAM, I don''t think any video card really depends on initial RAM state. What''s most likely here is that the VGA BIOS of a Cirrus card sets an initial RAM state during device initialization. We really should view RAM as just another device so I don''t like the idea of propagating a global concept of "when RAM is restored" because that treats it specially compared to other devices. But viewing RAM as just another device, having Xen only restore a subset of devices should be a reasonable thing to do moving forward. The main problem here I believe is that we have part of the VGA Bios functionality in the hardware emulation. Regards, Anthony Liguori> > Jan >
On 2012-01-23 18:18, Anthony Liguori wrote:> On 01/23/2012 11:13 AM, Jan Kiszka wrote: >>>>> To reply to your previous question more clearly: at restore time Qemu on >>>>> Xen would run in a non-standard scenario; the restore of the RAM happens >>>>> before QEMU is even started. >>>>> >>>>> That is unfortunate but it would be very hard to change (I can give you >>>>> more details if you are interested in the reasons why it would be so >>>>> difficult). >>>> >>>> If you can''t change this, you need to properly introduce this new >>>> scenario - pre-initialized RAM - to the QEMU device model. Or you will >>>> see breakage outside cirrus sooner or later as well. So it might be good >>>> to explain the reason why it can''t be changed under Xen when motivating >>>> this concept extension to QEMU. >>> >>> OK. >>> Are you thinking about introducing this concept as a new runstate? >>> This special runstate could be set at restore time only on Xen. >>> >>> >>> BTW the main reasons for having Xen saving the RAM are: >>> >>> - the need to support PV guests, that often run without Qemu; >>> - the current save format, that is built around the fact that Xen saves the memory; >>> - the fact that Qemu might be running in a very limited stub-domain. >> >> Your problem is not the fact that guest RAM is restored by an external >> component. Your problem is that QEMU has no control over the when. If >> you fix this, you could coordinate the restoring with the initial device >> reset and would solve all potential current and future issues, not only >> this single cirrus related one. > > Generally speaking, RAM is an independent device in most useful cases. Onboard > RAM is a very special case because it''s extremely unusual. > > But since some video cards can make use of dedicated external RAM, I don''t think > any video card really depends on initial RAM state. > > What''s most likely here is that the VGA BIOS of a Cirrus card sets an initial > RAM state during device initialization. > > We really should view RAM as just another device so I don''t like the idea of > propagating a global concept of "when RAM is restored" because that treats it > specially compared to other devices. > > But viewing RAM as just another device, having Xen only restore a subset of > devices should be a reasonable thing to do moving forward. The main problem > here I believe is that we have part of the VGA Bios functionality in the > hardware emulation.To my understanding, QXL will break identically on Xen for the same reason: the reset handler assumes it can deal with the VRAM as it likes. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
On 01/23/2012 11:31 AM, Jan Kiszka wrote:> On 2012-01-23 18:18, Anthony Liguori wrote: >> On 01/23/2012 11:13 AM, Jan Kiszka wrote: >>>>>> To reply to your previous question more clearly: at restore time Qemu on >>>>>> Xen would run in a non-standard scenario; the restore of the RAM happens >>>>>> before QEMU is even started. >>>>>> >>>>>> That is unfortunate but it would be very hard to change (I can give you >>>>>> more details if you are interested in the reasons why it would be so >>>>>> difficult). >>>>> >>>>> If you can''t change this, you need to properly introduce this new >>>>> scenario - pre-initialized RAM - to the QEMU device model. Or you will >>>>> see breakage outside cirrus sooner or later as well. So it might be good >>>>> to explain the reason why it can''t be changed under Xen when motivating >>>>> this concept extension to QEMU. >>>> >>>> OK. >>>> Are you thinking about introducing this concept as a new runstate? >>>> This special runstate could be set at restore time only on Xen. >>>> >>>> >>>> BTW the main reasons for having Xen saving the RAM are: >>>> >>>> - the need to support PV guests, that often run without Qemu; >>>> - the current save format, that is built around the fact that Xen saves the memory; >>>> - the fact that Qemu might be running in a very limited stub-domain. >>> >>> Your problem is not the fact that guest RAM is restored by an external >>> component. Your problem is that QEMU has no control over the when. If >>> you fix this, you could coordinate the restoring with the initial device >>> reset and would solve all potential current and future issues, not only >>> this single cirrus related one. >> >> Generally speaking, RAM is an independent device in most useful cases. Onboard >> RAM is a very special case because it''s extremely unusual. >> >> But since some video cards can make use of dedicated external RAM, I don''t think >> any video card really depends on initial RAM state. >> >> What''s most likely here is that the VGA BIOS of a Cirrus card sets an initial >> RAM state during device initialization. >> >> We really should view RAM as just another device so I don''t like the idea of >> propagating a global concept of "when RAM is restored" because that treats it >> specially compared to other devices. >> >> But viewing RAM as just another device, having Xen only restore a subset of >> devices should be a reasonable thing to do moving forward. The main problem >> here I believe is that we have part of the VGA Bios functionality in the >> hardware emulation. > > To my understanding, QXL will break identically on Xen for the same > reason: the reset handler assumes it can deal with the VRAM as it likes.QXL also has a VGA BIOS that it could use to initialize VRAM. A similar change could be made for it. In general, devices shouldn''t make assumptions about the state of other devices during reset. Writing to RAM assumes that RAM has been fully initialized. We don''t express reset dependencies right now and it''s not clear to me that this is something that we should be modeling. Regards, Anthony Liguori> > Jan >
Hi,>>> We really should view RAM as just another device so I don''t like the >>> idea of >>> propagating a global concept of "when RAM is restored" because that >>> treats it >>> specially compared to other devices. >>> >>> But viewing RAM as just another device, having Xen only restore a >>> subset of >>> devices should be a reasonable thing to do moving forward.I don''t think modeling device memory (i.e. vga vram) as something independent from the device (vga) is a good idea. Because it isn''t.>> To my understanding, QXL will break identically on Xen for the same >> reason: the reset handler assumes it can deal with the VRAM as it likes.Yes. Some data structures for host <-> guest communication are living in device memory, and a reset initializes these.> QXL also has a VGA BIOS that it could use to initialize VRAM. A similar > change could be made for it.Hmm. Not sure this is a good idea. cheers, Gerd
On 01/23/2012 07:18 PM, Anthony Liguori wrote:> Generally speaking, RAM is an independent device in most useful cases.Can you give examples? Do you mean a subdevice with composition, or a really independent device?> Onboard RAM is a very special case because it''s extremely unusual.What''s unusual (or even extremely unusual) about it? I don''t see a difference between a few billion bits of state and, say, the few bits of state in an RTC device.> But since some video cards can make use of dedicated external RAM, I > don''t think any video card really depends on initial RAM state. > > What''s most likely here is that the VGA BIOS of a Cirrus card sets an > initial RAM state during device initialization.Yes, that''s likely. DRAMs aren''t reset and some state survives even a short power off.> > We really should view RAM as just another device so I don''t like the > idea of propagating a global concept of "when RAM is restored" because > that treats it specially compared to other devices.Agree. In fact the first step has been taken as now creating a RAM region with memory_region_init_ram() doesn''t register it for migration. The next step would be a VMSTATE_RAM() to make it part of the containing device.> But viewing RAM as just another device, having Xen only restore a > subset of devices should be a reasonable thing to do moving forward. > The main problem here I believe is that we have part of the VGA Bios > functionality in the hardware emulation.Doesn''t the main BIOS clear the screen first thing at boot? Not even sure the reset is needed. -- error compiling committee.c: too many arguments to function
On 01/24/2012 12:21 PM, Gerd Hoffmann wrote:> >>> > >>> But viewing RAM as just another device, having Xen only restore a > >>> subset of > >>> devices should be a reasonable thing to do moving forward. > > I don''t think modeling device memory (i.e. vga vram) as something > independent from the device (vga) is a good idea. Because it isn''t.Right. We should have VMSTATE_RAM() to express the dependency. -- error compiling committee.c: too many arguments to function
On 01/24/2012 12:10 PM, Avi Kivity wrote:>> But viewing RAM as just another device, having Xen only restore a >> subset of devices should be a reasonable thing to do moving forward. >> The main problem here I believe is that we have part of the VGA Bios >> functionality in the hardware emulation. > > Doesn''t the main BIOS clear the screen first thing at boot? Not even > sure the reset is needed.Clearing the screen should only write to the RAM at 0xB8000 (and perhaps 0xA0000 since IIRC it''s where text-mode fonts lie). The option ROM cannot even assume that the main BIOS knows about the VESA framebuffer, can it? Paolo
On 01/24/2012 01:27 PM, Paolo Bonzini wrote:> On 01/24/2012 12:10 PM, Avi Kivity wrote: >>> But viewing RAM as just another device, having Xen only restore a >>> subset of devices should be a reasonable thing to do moving forward. >>> The main problem here I believe is that we have part of the VGA Bios >>> functionality in the hardware emulation. >> >> Doesn''t the main BIOS clear the screen first thing at boot? Not even >> sure the reset is needed. > > Clearing the screen should only write to the RAM at 0xB8000 (and > perhaps 0xA0000 since IIRC it''s where text-mode fonts lie). The > option ROM cannot even assume that the main BIOS knows about the VESA > framebuffer, can it?Yes, but why should anything else be needed? When you switch to a graphics mode, clear as much of the framebuffer as you need. -- error compiling committee.c: too many arguments to function
On 01/24/2012 12:32 PM, Avi Kivity wrote:>> > Clearing the screen should only write to the RAM at 0xB8000 (and >> > perhaps 0xA0000 since IIRC it''s where text-mode fonts lie). The >> > option ROM cannot even assume that the main BIOS knows about the VESA >> > framebuffer, can it? > Yes, but why should anything else be needed? > > When you switch to a graphics mode, clear as much of the framebuffer as > you need.Based on the comments, Windows apparently disagrees. :) Paolo
On Tue, 24 Jan 2012, Avi Kivity wrote:> On 01/24/2012 01:27 PM, Paolo Bonzini wrote: > > On 01/24/2012 12:10 PM, Avi Kivity wrote: > >>> But viewing RAM as just another device, having Xen only restore a > >>> subset of devices should be a reasonable thing to do moving forward. > >>> The main problem here I believe is that we have part of the VGA Bios > >>> functionality in the hardware emulation. > >> > >> Doesn''t the main BIOS clear the screen first thing at boot? Not even > >> sure the reset is needed. > > > > Clearing the screen should only write to the RAM at 0xB8000 (and > > perhaps 0xA0000 since IIRC it''s where text-mode fonts lie). The > > option ROM cannot even assume that the main BIOS knows about the VESA > > framebuffer, can it? > > Yes, but why should anything else be needed? > > When you switch to a graphics mode, clear as much of the framebuffer as > you need.After installing Win2K, I managed to reproduce the issue with the old qemu-xen and rombios (removing the memset in the vga emulator). However I cannot reproduce the issue with upstream qemu and seabios (even if I remove the memset). I think that the memset was a workaround a bug in rombios, hence it is not needed anymore. Removing the cirrus_vga memset would solve the main issue we have left with save/restore on Xen. However it wouldn''t solve the problem for QXL: as Gerd pointed out, QXL needs a reset to initialize some memory regions, so another "if we are restoring don''t do that" would be required to make it work on Xen... Maybe we can extend the loadvm == 1 to all the reset performed before load_vmstate?
On Tue, 24 Jan 2012, Avi Kivity wrote:> On 01/24/2012 12:21 PM, Gerd Hoffmann wrote: > > >>> > > >>> But viewing RAM as just another device, having Xen only restore a > > >>> subset of > > >>> devices should be a reasonable thing to do moving forward. > > > > I don''t think modeling device memory (i.e. vga vram) as something > > independent from the device (vga) is a good idea. Because it isn''t. > > Right. We should have VMSTATE_RAM() to express the dependency.So if I understand correctly you are planning on removing the call to vmstate_register_ram_global in hw/vga.c? In that case it might be better if I wait for your "VMSTATE_RAM" patch before I can send a new patch to save only non-RAM devices, because I guess they will clash with one another.
On 01/24/2012 05:10 AM, Avi Kivity wrote:> On 01/23/2012 07:18 PM, Anthony Liguori wrote: >> Generally speaking, RAM is an independent device in most useful cases. > > Can you give examples? Do you mean a subdevice with composition, or a > really independent device?I expect we''ll have one Ram device. It''s size will be configurable. One Ram device will hang off of the machine which would be the main ram. A video card would have a Ram device via composition. The important consideration about reset is how it propagates. My expectation is that we''ll propagate reset through the composition tree in a preorder transversal. That means in the VGA device reset function, it''s child device (Ram) has not been reset yet.>> Onboard RAM is a very special case because it''s extremely unusual. > > What''s unusual (or even extremely unusual) about it? I don''t see a > difference between a few billion bits of state and, say, the few bits of > state in an RTC device.Okay, but let''s not get philosophical here :-) There are very few devices that have a DRAM chip that is mapped through a bar into the main address space and behaves (usually) like normal RAM.>> We really should view RAM as just another device so I don''t like the >> idea of propagating a global concept of "when RAM is restored" because >> that treats it specially compared to other devices. > > Agree. In fact the first step has been taken as now creating a RAM > region with memory_region_init_ram() doesn''t register it for migration. > The next step would be a VMSTATE_RAM() to make it part of the containing > device.That''s not necessary (or wise). Let''s not confuse a Ram device with a MemoryRegion. They are separate things and should be treated as separate things. I thought we discussed that MemoryRegions are stateless (or at least, there state is all derived) and don''t need to be serialized? Regards, Anthony Liguori
On 01/24/2012 05:52 AM, Stefano Stabellini wrote:> On Tue, 24 Jan 2012, Avi Kivity wrote: >> On 01/24/2012 01:27 PM, Paolo Bonzini wrote: >>> On 01/24/2012 12:10 PM, Avi Kivity wrote: >>>>> But viewing RAM as just another device, having Xen only restore a >>>>> subset of devices should be a reasonable thing to do moving forward. >>>>> The main problem here I believe is that we have part of the VGA Bios >>>>> functionality in the hardware emulation. >>>> >>>> Doesn''t the main BIOS clear the screen first thing at boot? Not even >>>> sure the reset is needed. >>> >>> Clearing the screen should only write to the RAM at 0xB8000 (and >>> perhaps 0xA0000 since IIRC it''s where text-mode fonts lie). The >>> option ROM cannot even assume that the main BIOS knows about the VESA >>> framebuffer, can it? >> >> Yes, but why should anything else be needed? >> >> When you switch to a graphics mode, clear as much of the framebuffer as >> you need. > > After installing Win2K, I managed to reproduce the issue with the old > qemu-xen and rombios (removing the memset in the vga emulator). > However I cannot reproduce the issue with upstream qemu and seabios > (even if I remove the memset). > > I think that the memset was a workaround a bug in rombios, hence it is > not needed anymore. > > Removing the cirrus_vga memset would solve the main issue we have left > with save/restore on Xen. > However it wouldn''t solve the problem for QXL: as Gerd pointed out, QXL > needs a reset to initialize some memory regions, so another "if we are > restoring don''t do that" would be required to make it work on Xen...I believe this is a bug in QXL (or misdesign). It should move this logic into the VGA Bios. Regards, Anthony Liguori
On 01/24/2012 05:13 AM, Avi Kivity wrote:> On 01/24/2012 12:21 PM, Gerd Hoffmann wrote: >>>>> >>>>> But viewing RAM as just another device, having Xen only restore a >>>>> subset of >>>>> devices should be a reasonable thing to do moving forward. >> >> I don''t think modeling device memory (i.e. vga vram) as something >> independent from the device (vga) is a good idea. Because it isn''t. > > Right. We should have VMSTATE_RAM() to express the dependency.No, VMSTATE has nothign to do with reset. Ram should be a device and then you can hook up ram through the composition tree. But reset is going to propagate preorder, not postorder, so this isn''t going to help anyway. Postorder initialization doesn''t make a whole lot of sense when you think about the semantics of it. Regards, Anthony Liguori>
On 01/24/2012 04:21 AM, Gerd Hoffmann wrote:> Hi, > >>>> We really should view RAM as just another device so I don''t like the >>>> idea of >>>> propagating a global concept of "when RAM is restored" because that >>>> treats it >>>> specially compared to other devices. >>>> >>>> But viewing RAM as just another device, having Xen only restore a >>>> subset of >>>> devices should be a reasonable thing to do moving forward. > > I don''t think modeling device memory (i.e. vga vram) as something > independent from the device (vga) is a good idea. Because it isn''t.It is, but probably not in the way you''re assuming that I mean it is.> >>> To my understanding, QXL will break identically on Xen for the same >>> reason: the reset handler assumes it can deal with the VRAM as it likes. > > Yes. Some data structures for host<-> guest communication are living > in device memory, and a reset initializes these.This should be done by firmware or system software. Devices don''t initialize memory with data structures on power up. The first problem with doing this is that it assumes a reset order which would lengthen the quiescing process. The second problem is that this would require fairly sophisticated logic that could just as well live in firmware. The advantage of not enabling a device to behave like this is it lets us do a much simpler reset model. To model this properly, we would have to support multiple reset propagation hooks (at least a preorder and postorder hook). This adds a fair amount of complication for not a lot of benefit (the only benefit is moving firmware logic into QEMU which is really a downside :-)). Regards, Anthony Liguori> >> QXL also has a VGA BIOS that it could use to initialize VRAM. A similar >> change could be made for it. > > Hmm. Not sure this is a good idea. > > cheers, > Gerd > >
On 01/24/2012 02:00 PM, Stefano Stabellini wrote:> On Tue, 24 Jan 2012, Avi Kivity wrote: > > On 01/24/2012 12:21 PM, Gerd Hoffmann wrote: > > > >>> > > > >>> But viewing RAM as just another device, having Xen only restore a > > > >>> subset of > > > >>> devices should be a reasonable thing to do moving forward. > > > > > > I don''t think modeling device memory (i.e. vga vram) as something > > > independent from the device (vga) is a good idea. Because it isn''t. > > > > Right. We should have VMSTATE_RAM() to express the dependency. > > So if I understand correctly you are planning on removing the call to > vmstate_register_ram_global in hw/vga.c?I can''t say I''m planning to do so, but that''s the direction I think we need to go.> In that case it might be better if I wait for your "VMSTATE_RAM" patch > before I can send a new patch to save only non-RAM devices, because I > guess they will clash with one another.I don''t understand what sense a patch to save non-RAM devices makes (or even, what non-RAM devices are). We have a special case here where Xen manages the RAM even though qemu thinks it does. IMO an if (xen) is the best way to express this specialness. -- error compiling committee.c: too many arguments to function
On 01/24/2012 03:18 PM, Anthony Liguori wrote:> On 01/24/2012 05:13 AM, Avi Kivity wrote: >> On 01/24/2012 12:21 PM, Gerd Hoffmann wrote: >>>>>> >>>>>> But viewing RAM as just another device, having Xen only restore a >>>>>> subset of >>>>>> devices should be a reasonable thing to do moving forward. >>> >>> I don''t think modeling device memory (i.e. vga vram) as something >>> independent from the device (vga) is a good idea. Because it isn''t. >> >> Right. We should have VMSTATE_RAM() to express the dependency. > > No, VMSTATE has nothign to do with reset.It''s so that the device''s post_load happens after the RAM was loaded.> > Ram should be a device and then you can hook up ram through the > composition tree.I think that''s too heavy a hammer. Think about things like ivshmem. Does it really need to be a composite device? If we keep going in this direction the amount of know how needed to write a device for qemu will be overwhelming. RAM is a large array of bits. We model an 32-bit register with a uint32_t, memory_region_init_io(), and some vmstate. We should be able to do the same thing for RAM.> > But reset is going to propagate preorder, not postorder, so this isn''t > going to help anyway. > > Postorder initialization doesn''t make a whole lot of sense when you > think about the semantics of it.-- error compiling committee.c: too many arguments to function
On 01/24/2012 03:25 PM, Anthony Liguori wrote:> >> >>>> To my understanding, QXL will break identically on Xen for the same >>>> reason: the reset handler assumes it can deal with the VRAM as it >>>> likes. >> >> Yes. Some data structures for host<-> guest communication are living >> in device memory, and a reset initializes these. > > This should be done by firmware or system software.It should be done in the way the device spec says it does. What does the qxl spec says? If it''s lacking, we need to fill it up from existing practice.> Devices don''t initialize memory with data structures on power up. The > first problem with doing this is that it assumes a reset order which > would lengthen the quiescing process. The second problem is that this > would require fairly sophisticated logic that could just as well live > in firmware.If it''s in device firmware, there''s no way to tell. Nothing prevents the device from clearing its on-board memory during reset.> > The advantage of not enabling a device to behave like this is it lets > us do a much simpler reset model. To model this properly, we would > have to support multiple reset propagation hooks (at least a preorder > and postorder hook). This adds a fair amount of complication for not > a lot of benefit (the only benefit is moving firmware logic into QEMU > which is really a downside :-)). >You''re arguing about changing a device so it could fit qemu better. It should be the other way around. Granted this is a paravirt device so we have more leeway, but still qxl is out there and we can''t change it. -- error compiling committee.c: too many arguments to function
On 01/24/2012 01:44 PM, Paolo Bonzini wrote:> On 01/24/2012 12:32 PM, Avi Kivity wrote: >>> > Clearing the screen should only write to the RAM at 0xB8000 (and >>> > perhaps 0xA0000 since IIRC it''s where text-mode fonts lie). The >>> > option ROM cannot even assume that the main BIOS knows about the >>> VESA >>> > framebuffer, can it? >> Yes, but why should anything else be needed? >> >> When you switch to a graphics mode, clear as much of the framebuffer as >> you need. > > Based on the comments, Windows apparently disagrees. :)Based on the testing, it does not. -- error compiling committee.c: too many arguments to function
On 01/24/2012 03:14 PM, Anthony Liguori wrote:> On 01/24/2012 05:10 AM, Avi Kivity wrote: >> On 01/23/2012 07:18 PM, Anthony Liguori wrote: >>> Generally speaking, RAM is an independent device in most useful cases. >> >> Can you give examples? Do you mean a subdevice with composition, or a >> really independent device? > > I expect we''ll have one Ram device. It''s size will be configurable. > One Ram device will hang off of the machine which would be the main ram.We''ll also have a hotpluggable variant which talks to some GPIOs. A motherboard may support multiple slots with such devices.> A video card would have a Ram device via composition.IMO, overkill.> > The important consideration about reset is how it propagates. My > expectation is that we''ll propagate reset through the composition tree > in a preorder transversal. That means in the VGA device reset > function, it''s child device (Ram) has not been reset yet.Doesn''t depth first make more sense? A bus is considered reset after all devices in the bus have been reset.> >>> We really should view RAM as just another device so I don''t like the >>> idea of propagating a global concept of "when RAM is restored" because >>> that treats it specially compared to other devices. >> >> Agree. In fact the first step has been taken as now creating a RAM >> region with memory_region_init_ram() doesn''t register it for migration. >> The next step would be a VMSTATE_RAM() to make it part of the containing >> device. > > That''s not necessary (or wise). > > Let''s not confuse a Ram device with a MemoryRegion. They are separate > things and should be treated as separate things. I thought we > discussed that MemoryRegions are stateless (or at least, there state > is all derived) and don''t need to be serialized?Well, the actual bits in memory are state. All other attributes are indeed derived. I just think that making any device that has a bit of RAM a composed device is overkill. What do we gain from it? The cost is not trivial. -- error compiling committee.c: too many arguments to function
On 01/24/2012 09:56 AM, Avi Kivity wrote:> On 01/24/2012 03:14 PM, Anthony Liguori wrote: >> On 01/24/2012 05:10 AM, Avi Kivity wrote: >>> On 01/23/2012 07:18 PM, Anthony Liguori wrote: >>>> Generally speaking, RAM is an independent device in most useful cases. >>> >>> Can you give examples? Do you mean a subdevice with composition, or a >>> really independent device? >> >> I expect we''ll have one Ram device. It''s size will be configurable. >> One Ram device will hang off of the machine which would be the main ram. > > We''ll also have a hotpluggable variant which talks to some GPIOs. A > motherboard may support multiple slots with such devices.Yup. And boards can subclass the Ram device in order to be able to restrict the valid RAM sizes. -m just becomes an analysis to set the ram size for a ram device. The infrastructure around RAMBlock, etc. would go away as each RAMBlock would correspond to a Ram device.>> A video card would have a Ram device via composition. > > IMO, overkill.I think it''s a lot of refactoring to get there and it''s not the most critical infrastructure changes we have coming up, but I think it would be nice to get there eventually.> >> >> The important consideration about reset is how it propagates. My >> expectation is that we''ll propagate reset through the composition tree >> in a preorder transversal. That means in the VGA device reset >> function, it''s child device (Ram) has not been reset yet. > > Doesn''t depth first make more sense?Yes, it would be depth first, but you''re looking for a preorder and post order hook. In otherwords: void do_reset(DeviceState *dev) { dev->pre_reset(dev); // this is preorder do_reset_all(dev->children); dev->post_reset(dev); // this is postorder } The PCI bus overloads the reset handler and does a post-order reset. I guess I can rationalize it after all but it''s not entirely clear to me that a pre_reset hook isn''t necessary too.> A bus is considered reset after > all devices in the bus have been reset. > > >> >>>> We really should view RAM as just another device so I don''t like the >>>> idea of propagating a global concept of "when RAM is restored" because >>>> that treats it specially compared to other devices. >>> >>> Agree. In fact the first step has been taken as now creating a RAM >>> region with memory_region_init_ram() doesn''t register it for migration. >>> The next step would be a VMSTATE_RAM() to make it part of the containing >>> device. >> >> That''s not necessary (or wise). >> >> Let''s not confuse a Ram device with a MemoryRegion. They are separate >> things and should be treated as separate things. I thought we >> discussed that MemoryRegions are stateless (or at least, there state >> is all derived) and don''t need to be serialized? > > Well, the actual bits in memory are state. All other attributes are > indeed derived. > > I just think that making any device that has a bit of RAM a composed > device is overkill. What do we gain from it? The cost is not trivial.The cost of composition is pretty trivial. struct VGADevice { DeviceState parent; Ram vga_ram; }; static void vga_device_initfn(Object *obj) { VGADevice *vga = VGA(obj); object_property_add_child(obj, "vram", (Object *)&vga_ram, TYPE_RAM, NULL); } Now a user can control the VGA ram size by accessing vga/vram.size = 16M. Regards, Anthony Liguori
Hi all, this is the fourth version of the Xen save/restore patch series. We have been discussing this issue for quite a while on #qemu and qemu-devel: http://marc.info/?l=qemu-devel&m=132346828427314&w=2 http://marc.info/?l=qemu-devel&m=132377734605464&w=2 The principal changes in the this version are: - Following Anthony''s suggestion I have introduced a new monitor command to save the non-ram device state to file. - I have also removed the hack not to reset the cirrus videoram on restore, because it turns out that the videoram doesn''t need to be reset in the reset handler at all (tested on Win2K, where the problem was found in the first place). This is the list of patches with a diffstat: Anthony PERARD (2): xen mapcache: check if memory region has moved. xen: do not allocate RAM during INMIGRATE runstate Stefano Stabellini (4): cirrus_vga: do not reset videoram Introduce "save_devices" xen: record physmap changes to xenstore Set runstate to INMIGRATE earlier block-migration.c | 2 +- hmp-commands.hx | 14 +++++++ hw/cirrus_vga.c | 4 -- hw/hw.h | 1 + qmp-commands.hx | 17 +++++++++ savevm.c | 72 ++++++++++++++++++++++++++++++++++++- sysemu.h | 1 + vl.c | 4 +- xen-all.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++++++- xen-mapcache.c | 22 ++++++++++-- xen-mapcache.h | 9 ++++- 11 files changed, 235 insertions(+), 15 deletions(-) git://xenbits.xen.org/people/sstabellini/qemu-dm.git saverestore-4 Cheers, Stefano
There is no need to set the videoram to 0xff in cirrus_reset, because it is the BIOS'' job. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- hw/cirrus_vga.c | 4 ---- 1 files changed, 0 insertions(+), 4 deletions(-) diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c index f7b1d3d..5564478 100644 --- a/hw/cirrus_vga.c +++ b/hw/cirrus_vga.c @@ -2760,10 +2760,6 @@ static void cirrus_reset(void *opaque) } s->vga.cr[0x27] = s->device_id; - /* Win2K seems to assume that the pattern buffer is at 0xff - initially ! */ - memset(s->vga.vram_ptr, 0xff, s->real_vram_size); - s->cirrus_hidden_dac_lockindex = 5; s->cirrus_hidden_dac_data = 0; } -- 1.7.2.5
- add an "is_ram" flag to SaveStateEntry; - add an "is_ram" parameter to register_savevm_live; - introduce a "save_devices" monitor command that can be used to save the state of non-ram devices. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- block-migration.c | 2 +- hmp-commands.hx | 14 ++++++++++ hw/hw.h | 1 + qmp-commands.hx | 17 ++++++++++++ savevm.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++- sysemu.h | 1 + vl.c | 2 +- 7 files changed, 106 insertions(+), 3 deletions(-) diff --git a/block-migration.c b/block-migration.c index 2b7edbc..424a87d 100644 --- a/block-migration.c +++ b/block-migration.c @@ -720,6 +720,6 @@ void blk_mig_init(void) QSIMPLEQ_INIT(&block_mig_state.bmds_list); QSIMPLEQ_INIT(&block_mig_state.blk_list); - register_savevm_live(NULL, "block", 0, 1, block_set_params, + register_savevm_live(NULL, "block", 0, 1, 0, block_set_params, block_save_live, NULL, block_load, &block_mig_state); } diff --git a/hmp-commands.hx b/hmp-commands.hx index a586498..ba4c9d5 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -241,6 +241,20 @@ a snapshot with the same tag or ID, it is replaced. More info at ETEXI { + .name = "save_devices", + .args_type = "filename:F", + .params = "filename", + .help = "save the state of non-ram devices.", + .mhandler.cmd_new = do_save_device_state, + }, + +STEXI +@item save_devices @var{filename} +@findex save_devices +Save the state of non-ram devices. +ETEXI + + { .name = "loadvm", .args_type = "name:s", .params = "tag|id", diff --git a/hw/hw.h b/hw/hw.h index 932014a..9d3d76d 100644 --- a/hw/hw.h +++ b/hw/hw.h @@ -263,6 +263,7 @@ int register_savevm_live(DeviceState *dev, const char *idstr, int instance_id, int version_id, + int is_ram, SaveSetParamsHandler *set_params, SaveLiveStateHandler *save_live_state, SaveStateHandler *save_state, diff --git a/qmp-commands.hx b/qmp-commands.hx index 7e3f4b9..51f4436 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -429,6 +429,23 @@ Note: inject-nmi is only supported for x86 guest currently, it will EQMP { + .name = "save_devices", + .args_type = "filename:F", + .params = "filename", + .help = "save the state of non-ram devices.", + .user_print = monitor_user_noop, + .mhandler.cmd_new = do_save_device_state, + }, + +SQMP +save_devices +------- + +Save the state of non-ram devices. + +EQMP + + { .name = "migrate", .args_type = "detach:-d,blk:-b,inc:-i,uri:s", .params = "[-d] [-b] [-i] uri", diff --git a/savevm.c b/savevm.c index 80be1ff..d0e62bb 100644 --- a/savevm.c +++ b/savevm.c @@ -1177,6 +1177,7 @@ typedef struct SaveStateEntry { void *opaque; CompatEntry *compat; int no_migrate; + int is_ram; } SaveStateEntry; @@ -1223,6 +1224,7 @@ int register_savevm_live(DeviceState *dev, const char *idstr, int instance_id, int version_id, + int is_ram, SaveSetParamsHandler *set_params, SaveLiveStateHandler *save_live_state, SaveStateHandler *save_state, @@ -1241,6 +1243,7 @@ int register_savevm_live(DeviceState *dev, se->opaque = opaque; se->vmsd = NULL; se->no_migrate = 0; + se->is_ram = is_ram; if (dev && dev->parent_bus && dev->parent_bus->info->get_dev_path) { char *id = dev->parent_bus->info->get_dev_path(dev); @@ -1277,7 +1280,7 @@ int register_savevm(DeviceState *dev, LoadStateHandler *load_state, void *opaque) { - return register_savevm_live(dev, idstr, instance_id, version_id, + return register_savevm_live(dev, idstr, instance_id, version_id, 0, NULL, NULL, save_state, load_state, opaque); } @@ -1728,6 +1731,43 @@ out: return ret; } +static int qemu_save_device_state(Monitor *mon, QEMUFile *f) +{ + SaveStateEntry *se; + + qemu_put_be32(f, QEMU_VM_FILE_MAGIC); + qemu_put_be32(f, QEMU_VM_FILE_VERSION); + + cpu_synchronize_all_states(); + + QTAILQ_FOREACH(se, &savevm_handlers, entry) { + int len; + + if (se->is_ram) + continue; + if (se->save_state == NULL && se->vmsd == NULL) + continue; + + /* Section type */ + qemu_put_byte(f, QEMU_VM_SECTION_FULL); + qemu_put_be32(f, se->section_id); + + /* ID string */ + len = strlen(se->idstr); + qemu_put_byte(f, len); + qemu_put_buffer(f, (uint8_t *)se->idstr, len); + + qemu_put_be32(f, se->instance_id); + qemu_put_be32(f, se->version_id); + + vmstate_save(f, se); + } + + qemu_put_byte(f, QEMU_VM_EOF); + + return qemu_file_get_error(f); +} + static SaveStateEntry *find_se(const char *idstr, int instance_id) { SaveStateEntry *se; @@ -2109,6 +2149,36 @@ void do_savevm(Monitor *mon, const QDict *qdict) vm_start(); } +int do_save_device_state(Monitor *mon, const QDict *qdict, QObject **ret_data) +{ + int ret; + QEMUFile *f; + int saved_vm_running; + const char *filename = qdict_get_try_str(qdict, "filename"); + + saved_vm_running = runstate_is_running(); + vm_stop(RUN_STATE_SAVE_VM); + + f = qemu_fopen(filename, "wb"); + if (!f) { + monitor_printf(mon, "Could not open VM state file\n"); + ret = -1; + goto the_end; + } + ret = qemu_save_device_state(mon, f); + qemu_fclose(f); + if (ret < 0) { + monitor_printf(mon, "Error %d while writing VM\n", ret); + goto the_end; + } + ret = 0; + + the_end: + if (saved_vm_running) + vm_start(); + return ret; +} + int load_vmstate(const char *name) { BlockDriverState *bs, *bs_vm_state; diff --git a/sysemu.h b/sysemu.h index ddef2bb..4103b08 100644 --- a/sysemu.h +++ b/sysemu.h @@ -59,6 +59,7 @@ void qemu_remove_exit_notifier(Notifier *notify); void qemu_add_machine_init_done_notifier(Notifier *notify); void do_savevm(Monitor *mon, const QDict *qdict); +int do_save_device_state(Monitor *mon, const QDict *qdict, QObject **ret_data); int load_vmstate(const char *name); void do_delvm(Monitor *mon, const QDict *qdict); void do_info_snapshots(Monitor *mon); diff --git a/vl.c b/vl.c index ba55b35..c3a155f 100644 --- a/vl.c +++ b/vl.c @@ -3270,7 +3270,7 @@ int main(int argc, char **argv, char **envp) default_drive(default_sdcard, snapshot, machine->use_scsi, IF_SD, 0, SD_OPTS); - register_savevm_live(NULL, "ram", 0, 4, NULL, ram_save_live, NULL, + register_savevm_live(NULL, "ram", 0, 4, 1, NULL, ram_save_live, NULL, ram_load, NULL); if (nb_numa_nodes > 0) { -- 1.7.2.5
Stefano Stabellini
2012-Jan-25 13:05 UTC
[PATCH v4 3/6] xen: record physmap changes to xenstore
Write to xenstore any physmap changes so that the hypervisor can be aware of them. Read physmap changes from xenstore on boot. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen-all.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 77 insertions(+), 1 deletions(-) diff --git a/xen-all.c b/xen-all.c index c86ebf4..4b00a6c 100644 --- a/xen-all.c +++ b/xen-all.c @@ -63,7 +63,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu) typedef struct XenPhysmap { target_phys_addr_t start_addr; ram_addr_t size; - MemoryRegion *mr; + char *name; target_phys_addr_t phys_offset; QLIST_ENTRY(XenPhysmap) list; @@ -237,6 +237,7 @@ static int xen_add_to_physmap(XenIOState *state, XenPhysmap *physmap = NULL; target_phys_addr_t pfn, start_gpfn; target_phys_addr_t phys_offset = memory_region_get_ram_addr(mr); + char path[80], value[17]; if (get_physmapping(state, start_addr, size)) { return 0; @@ -275,6 +276,7 @@ go_physmap: physmap->start_addr = start_addr; physmap->size = size; + physmap->name = (char *)mr->name; physmap->phys_offset = phys_offset; QLIST_INSERT_HEAD(&state->physmap, physmap, list); @@ -283,6 +285,30 @@ go_physmap: start_addr >> TARGET_PAGE_BITS, (start_addr + size) >> TARGET_PAGE_BITS, XEN_DOMCTL_MEM_CACHEATTR_WB); + + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%"PRIx64"/start_addr", + xen_domid, (uint64_t)phys_offset); + snprintf(value, sizeof(value), "%"PRIx64, (uint64_t)start_addr); + if (!xs_write(state->xenstore, 0, path, value, strlen(value))) { + return -1; + } + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%"PRIx64"/size", + xen_domid, (uint64_t)phys_offset); + snprintf(value, sizeof(value), "%"PRIx64, (uint64_t)size); + if (!xs_write(state->xenstore, 0, path, value, strlen(value))) { + return -1; + } + if (mr->name) { + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%"PRIx64"/name", + xen_domid, (uint64_t)phys_offset); + if (!xs_write(state->xenstore, 0, path, mr->name, strlen(mr->name))) { + return -1; + } + } + return 0; } @@ -910,6 +936,55 @@ int xen_init(void) return 0; } +static void xen_read_physmap(XenIOState *state) +{ + XenPhysmap *physmap = NULL; + unsigned int len, num, i; + char path[80], *value = NULL; + char **entries = NULL; + + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap", xen_domid); + entries = xs_directory(state->xenstore, 0, path, &num); + if (entries == NULL) + return; + + for (i = 0; i < num; i++) { + physmap = g_malloc(sizeof (XenPhysmap)); + physmap->phys_offset = strtoull(entries[i], NULL, 16); + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%s/start_addr", + xen_domid, entries[i]); + value = xs_read(state->xenstore, 0, path, &len); + if (value == NULL) { + free(physmap); + continue; + } + physmap->start_addr = strtoull(value, NULL, 16); + free(value); + + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%s/size", + xen_domid, entries[i]); + value = xs_read(state->xenstore, 0, path, &len); + if (value == NULL) { + free(physmap); + continue; + } + physmap->size = strtoull(value, NULL, 16); + free(value); + + snprintf(path, sizeof(path), + "/local/domain/0/device-model/%d/physmap/%s/name", + xen_domid, entries[i]); + physmap->name = xs_read(state->xenstore, 0, path, &len); + + QLIST_INSERT_HEAD(&state->physmap, physmap, list); + } + free(entries); + return; +} + int xen_hvm_init(void) { int i, rc; @@ -982,6 +1057,7 @@ int xen_hvm_init(void) xen_be_register("console", &xen_console_ops); xen_be_register("vkbd", &xen_kbdmouse_ops); xen_be_register("qdisk", &xen_blkdev_ops); + xen_read_physmap(state); return 0; } -- 1.7.2.5
Stefano Stabellini
2012-Jan-25 13:05 UTC
[PATCH v4 4/6] xen mapcache: check if memory region has moved.
From: Anthony PERARD <anthony.perard@citrix.com> This patch changes the xen_map_cache behavior. Before trying to map a guest addr, mapcache will look into the list of range of address that have been moved (physmap/set_memory). There is currently one memory space like this, the vram, "moved" from were it''s allocated to were the guest will look into. This help to have a succefull migration. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen-all.c | 18 +++++++++++++++++- xen-mapcache.c | 22 +++++++++++++++++++--- xen-mapcache.h | 9 +++++++-- 3 files changed, 43 insertions(+), 6 deletions(-) diff --git a/xen-all.c b/xen-all.c index 4b00a6c..bb66c82 100644 --- a/xen-all.c +++ b/xen-all.c @@ -225,6 +225,22 @@ static XenPhysmap *get_physmapping(XenIOState *state, return NULL; } +static target_phys_addr_t xen_phys_offset_to_gaddr(target_phys_addr_t start_addr, + ram_addr_t size, void *opaque) +{ + target_phys_addr_t addr = start_addr & TARGET_PAGE_MASK; + XenIOState *xen_io_state = opaque; + XenPhysmap *physmap = NULL; + + QLIST_FOREACH(physmap, &xen_io_state->physmap, list) { + if (range_covers_byte(physmap->phys_offset, physmap->size, addr)) { + return physmap->start_addr; + } + } + + return start_addr; +} + #if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 340 static int xen_add_to_physmap(XenIOState *state, target_phys_addr_t start_addr, @@ -1039,7 +1055,7 @@ int xen_hvm_init(void) } /* Init RAM management */ - xen_map_cache_init(); + xen_map_cache_init(xen_phys_offset_to_gaddr, state); xen_ram_init(ram_size); qemu_add_vm_change_state_handler(xen_hvm_change_state_handler, state); diff --git a/xen-mapcache.c b/xen-mapcache.c index 9fecc64..d9c995b 100644 --- a/xen-mapcache.c +++ b/xen-mapcache.c @@ -76,6 +76,9 @@ typedef struct MapCache { uint8_t *last_address_vaddr; unsigned long max_mcache_size; unsigned int mcache_bucket_shift; + + phys_offset_to_gaddr_t phys_offset_to_gaddr; + void *opaque; } MapCache; static MapCache *mapcache; @@ -89,13 +92,16 @@ static inline int test_bits(int nr, int size, const unsigned long *addr) return 0; } -void xen_map_cache_init(void) +void xen_map_cache_init(phys_offset_to_gaddr_t f, void *opaque) { unsigned long size; struct rlimit rlimit_as; mapcache = g_malloc0(sizeof (MapCache)); + mapcache->phys_offset_to_gaddr = f; + mapcache->opaque = opaque; + QTAILQ_INIT(&mapcache->locked_entries); mapcache->last_address_index = -1; @@ -191,9 +197,14 @@ uint8_t *xen_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock) { MapCacheEntry *entry, *pentry = NULL; - target_phys_addr_t address_index = phys_addr >> MCACHE_BUCKET_SHIFT; - target_phys_addr_t address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1); + target_phys_addr_t address_index; + target_phys_addr_t address_offset; target_phys_addr_t __size = size; + bool translated = false; + +tryagain: + address_index = phys_addr >> MCACHE_BUCKET_SHIFT; + address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1); trace_xen_map_cache(phys_addr); @@ -235,6 +246,11 @@ uint8_t *xen_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, if(!test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT, entry->valid_mapping)) { mapcache->last_address_index = -1; + if (!translated && mapcache->phys_offset_to_gaddr) { + phys_addr = mapcache->phys_offset_to_gaddr(phys_addr, size, mapcache->opaque); + translated = true; + goto tryagain; + } trace_xen_map_cache_return(NULL); return NULL; } diff --git a/xen-mapcache.h b/xen-mapcache.h index da874ca..70301a5 100644 --- a/xen-mapcache.h +++ b/xen-mapcache.h @@ -11,9 +11,13 @@ #include <stdlib.h> +typedef target_phys_addr_t (*phys_offset_to_gaddr_t)(target_phys_addr_t start_addr, + ram_addr_t size, + void *opaque); #ifdef CONFIG_XEN -void xen_map_cache_init(void); +void xen_map_cache_init(phys_offset_to_gaddr_t f, + void *opaque); uint8_t *xen_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock); ram_addr_t xen_ram_addr_from_mapcache(void *ptr); @@ -22,7 +26,8 @@ void xen_invalidate_map_cache(void); #else -static inline void xen_map_cache_init(void) +static inline void xen_map_cache_init(phys_offset_to_gaddr_t f, + void *opaque) { } -- 1.7.2.5
Set runstate to RUN_STATE_INMIGRATE as soon as we can on resume. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- vl.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/vl.c b/vl.c index c3a155f..9b9a4fb 100644 --- a/vl.c +++ b/vl.c @@ -2972,6 +2972,7 @@ int main(int argc, char **argv, char **envp) break; case QEMU_OPTION_incoming: incoming = optarg; + runstate_set(RUN_STATE_INMIGRATE); break; case QEMU_OPTION_nodefaults: default_serial = 0; @@ -3466,7 +3467,6 @@ int main(int argc, char **argv, char **envp) } if (incoming) { - runstate_set(RUN_STATE_INMIGRATE); int ret = qemu_start_incoming_migration(incoming); if (ret < 0) { fprintf(stderr, "Migration failed. Exit code %s(%d), exiting.\n", -- 1.7.2.5
Stefano Stabellini
2012-Jan-25 13:05 UTC
[PATCH v4 6/6] xen: do not allocate RAM during INMIGRATE runstate
From: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen-all.c | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/xen-all.c b/xen-all.c index bb66c82..5b10d0c 100644 --- a/xen-all.c +++ b/xen-all.c @@ -190,6 +190,14 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr) xen_pfn_t *pfn_list; int i; + if (runstate_check(RUN_STATE_INMIGRATE)) { + /* RAM already populated in Xen */ + fprintf(stderr, "%s: do not alloc "RAM_ADDR_FMT + " bytes of ram at "RAM_ADDR_FMT" when runstate is INMIGRATE\n", + __func__, size, ram_addr); + return; + } + if (mr == &ram_memory) { return; } -- 1.7.2.5
On Wed, 25 Jan 2012, Stefano Stabellini wrote:> Hi all, > this is the fourth version of the Xen save/restore patch series. > We have been discussing this issue for quite a while on #qemu and > qemu-devel: > > > http://marc.info/?l=qemu-devel&m=132346828427314&w=2 > http://marc.info/?l=qemu-devel&m=132377734605464&w=2 > > > The principal changes in the this version are: > > - Following Anthony''s suggestion I have introduced a new monitor command > to save the non-ram device state to file. > > - I have also removed the hack not to reset the cirrus videoram on > restore, because it turns out that the videoram doesn''t need to be > reset in the reset handler at all (tested on Win2K, where the problem > was found in the first place). >Is everybody happy enough with this series? Do you have any additional comments?
On Tue, 31 Jan 2012, Stefano Stabellini wrote:> On Wed, 25 Jan 2012, Stefano Stabellini wrote: > > Hi all, > > this is the fourth version of the Xen save/restore patch series. > > We have been discussing this issue for quite a while on #qemu and > > qemu-devel: > > > > > > http://marc.info/?l=qemu-devel&m=132346828427314&w=2 > > http://marc.info/?l=qemu-devel&m=132377734605464&w=2 > > > > > > The principal changes in the this version are: > > > > - Following Anthony''s suggestion I have introduced a new monitor command > > to save the non-ram device state to file. > > > > - I have also removed the hack not to reset the cirrus videoram on > > restore, because it turns out that the videoram doesn''t need to be > > reset in the reset handler at all (tested on Win2K, where the problem > > was found in the first place). > > > > Is everybody happy enough with this series? > Do you have any additional comments? >ping
On Mon, 13 Feb 2012, Stefano Stabellini wrote:> On Tue, 31 Jan 2012, Stefano Stabellini wrote: > > On Wed, 25 Jan 2012, Stefano Stabellini wrote: > > > Hi all, > > > this is the fourth version of the Xen save/restore patch series. > > > We have been discussing this issue for quite a while on #qemu and > > > qemu-devel: > > > > > > > > > http://marc.info/?l=qemu-devel&m=132346828427314&w=2 > > > http://marc.info/?l=qemu-devel&m=132377734605464&w=2 > > > > > > > > > The principal changes in the this version are: > > > > > > - Following Anthony''s suggestion I have introduced a new monitor command > > > to save the non-ram device state to file. > > > > > > - I have also removed the hack not to reset the cirrus videoram on > > > restore, because it turns out that the videoram doesn''t need to be > > > reset in the reset handler at all (tested on Win2K, where the problem > > > was found in the first place). > > > > > > > Is everybody happy enough with this series? > > Do you have any additional comments? > > > > ping >Can we at least agree on the basic approach (especially for patch #1, #2 and #5 that are not xen specific), so that we can check in the libxl side of the series? It is currently blocking other patch series from being merged into xen-unstable.