Zhai, Edwin
2009-Aug-11 12:12 UTC
[Xen-devel] [PATCH] [IOEMU]: fix the crash of HVM live migration with intensive disk access
[IOEMU]: fix the crash of HVM live migration with intensive disk access

Intensive disk access (e.g. sum of a big file) during HVM live migration can cause guest errors and even a file system crash. Guest dmesg said:

"attempt to access beyond end of device
hda1: rw=0, want=10232032112, limit=10474317"

The map cache currently used by qemu DMA doesn't mark pages dirty, so these pages (probably holding DMA data structures) are not transferred in the last iteration of live migration.

This patch fixes that, and also merges in qemu's original dirty bitmap, which is used by other devices such as vga.

Signed-Off-By: Zhai Edwin <edwin.zhai@intel.com>

Index: hv/tools/ioemu-remote/cpu-all.h
===================================================================
--- hv.orig/tools/ioemu-remote/cpu-all.h
+++ hv/tools/ioemu-remote/cpu-all.h
@@ -975,6 +975,16 @@ static inline int cpu_physical_memory_ge
 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
     phys_ram_dirty[addr >> TARGET_PAGE_BITS] = 0xff;
+
+#ifndef CONFIG_STUBDOM
+    if (logdirty_bitmap != NULL) {
+        addr >>= TARGET_PAGE_BITS;
+        if (addr / 8 < logdirty_bitmap_size) {
+            logdirty_bitmap[addr / HOST_LONG_BITS]
+                |= 1UL << addr % HOST_LONG_BITS;
+        }
+    }
+#endif
 }
 
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
Index: hv/tools/ioemu-remote/i386-dm/exec-dm.c
===================================================================
--- hv.orig/tools/ioemu-remote/i386-dm/exec-dm.c
+++ hv/tools/ioemu-remote/i386-dm/exec-dm.c
@@ -806,6 +806,24 @@ void *cpu_physical_memory_map(target_phy
     if ((*plen) > l)
         *plen = l;
 #endif
+#ifndef CONFIG_STUBDOM
+    if (logdirty_bitmap != NULL) {
+        /* Record that we have dirtied this frame */
+        unsigned long pfn = addr >> TARGET_PAGE_BITS;
+        do {
+            if (pfn / 8 >= logdirty_bitmap_size) {
+                fprintf(logfile, "dirtying pfn %lx >= bitmap "
+                        "size %lx\n", pfn, logdirty_bitmap_size * 8);
+            } else {
+                logdirty_bitmap[pfn / HOST_LONG_BITS]
+                    |= 1UL << pfn % HOST_LONG_BITS;
+            }
+
+            pfn++;
+        } while ( (pfn << TARGET_PAGE_BITS) < addr + *plen );
+
+    }
+#endif
     return qemu_map_cache(addr, 1);
 }

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
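For readers who want to try the bitmap arithmetic from the second hunk in isolation, here is a minimal standalone sketch in C. It is not the qemu code: PAGE_BITS, LONG_BITS, mark_range_dirty and frame_is_dirty are hypothetical names chosen for illustration, 4 KiB pages are assumed, and unlike the do/while loop in the patch it marks nothing when the length is zero.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the qemu definitions used in the patch. */
#define PAGE_BITS 12                                  /* 4 KiB pages assumed */
#define LONG_BITS (sizeof(unsigned long) * CHAR_BIT)  /* role of HOST_LONG_BITS */

/*
 * Mark every page frame touched by [addr, addr + len) as dirty in a
 * bitmap that is size_bytes bytes long (one bit per frame).
 * Returns the number of frames that fell outside the bitmap and were
 * skipped instead of being marked.
 */
static unsigned long mark_range_dirty(unsigned long *bitmap, size_t size_bytes,
                                      uint64_t addr, uint64_t len)
{
    uint64_t first, last, pfn;
    unsigned long skipped = 0;

    assert(bitmap != NULL);
    if (len == 0)
        return 0;

    first = addr >> PAGE_BITS;
    last = (addr + len - 1) >> PAGE_BITS;

    for (pfn = first; pfn <= last; pfn++) {
        if (pfn / 8 >= size_bytes) {   /* 8 frames per bitmap byte */
            skipped++;
            continue;
        }
        bitmap[pfn / LONG_BITS] |= 1UL << (pfn % LONG_BITS);
    }
    return skipped;
}

/* Return nonzero if the given frame is marked dirty. */
static int frame_is_dirty(const unsigned long *bitmap, uint64_t pfn)
{
    return (int)((bitmap[pfn / LONG_BITS] >> (pfn % LONG_BITS)) & 1UL);
}
```

A mapping that starts in the middle of one frame and ends in the next dirties both frames, which is why the patch loops over the whole range rather than marking only the first frame.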
Stefano Stabellini
2009-Aug-11 12:56 UTC
Re: [Xen-devel] [PATCH] [IOEMU]: fix the crash of HVM live migration with intensive disk access
On Tue, 11 Aug 2009, Zhai, Edwin wrote:
> [IOEMU]: fix the crash of HVM live migration with intensive disk access
>
> Intensive disk access, e.g. sum of big file, during HVM live migration would
> cause guest error even file system crash. Guest dmesg said
> "attempt to access beyond end of device
> hda1: rw=0, want=10232032112, limit=10474317"
>
> Current map cache used by qemu dma doesn't mark the page dirty, so that these
> pages(probably holding DMA data struct) are not transferred in the last
> iteration during live migration.
>
> This patch fixes it, and also merges the qemu's original dirty bitmap used by
> other devices such as vga.
>
> Signed-Off-By: Zhai Edwin <edwin.zhai@intel.com>

I think the fix is correct, but we should think about dropping logdirty and using xc_hvm_modified_memory instead in all cases.

I think Gianluca may also have something to say about this, but he is on vacation this week.
Zhai, Edwin
2009-Aug-12 03:09 UTC
Re: [Xen-devel] [PATCH] [IOEMU]: fix the crash of HVM live migration with intensive disk access
Stefano Stabellini wrote:
> I think the fix is correct but we should thinking about dropping
> logdirty and start using xc_hvm_modified_memory instead for all cases.

A single interface would be better, but I'm not sure about the performance implications. qemu uses logdirty for its device emulation even without live migration, e.g. for vga screen refresh. Changing to xc_hvm_modified_memory would cause many hypercalls to set/get the bitmap in Xen...

> I think Gianluca also may have something to say about this but this week
> he is on vacation.

--
best rgds,
edwin
Stefano Stabellini
2009-Aug-12 11:28 UTC
Re: [Xen-devel] [PATCH] [IOEMU]: fix the crash of HVM live migration with intensive disk access
On Wed, 12 Aug 2009, Zhai, Edwin wrote:
> One interface should be better. But I'm not sure about the perf
> implications. You know, qemu use logdirty for its device emulation even
> without live migration, e.g. vga screen refresh. Changing to
> xc_hvm_modified_memory would cause many hypercall to set/get the bitmap
> in xen...

For vga screen refresh, qemu calls xc_hvm_track_dirty_vram and then cpu_physical_memory_set_dirty; the latter only needs to call xc_hvm_modified_memory when logdirty is active. In fact, your patch modifies logdirty_bitmap only when logdirty_bitmap != NULL, that is, when a logdirty event has been triggered on xenstore.
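The arrangement described above (always update qemu's own per-page dirty array, and report to the hypervisor only while log-dirty tracking is active) can be sketched roughly as follows. This is an illustration under stated assumptions, not the ioemu implementation: struct dirty_state, set_dirty, notify_fn and record_report are hypothetical names, the callback merely stands in for a call such as xc_hvm_modified_memory (whose real signature is not reproduced here), and a toy guest of 64 pages of 4 KiB is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12   /* 4 KiB pages assumed */
#define NR_PAGES  64   /* size of the toy guest, hypothetical */

/* Hypothetical callback standing in for the hypercall wrapper. */
typedef int (*notify_fn)(uint64_t first_pfn, uint64_t nr);

struct dirty_state {
    uint8_t   local[NR_PAGES];  /* plays the role of phys_ram_dirty */
    int       logdirty_active;  /* nonzero while a migration tracks writes */
    notify_fn notify;           /* invoked only while logdirty_active is set */
};

/* Recording stub, used here in place of a real hypercall. */
static uint64_t last_pfn_reported;
static unsigned long reports;

static int record_report(uint64_t first_pfn, uint64_t nr)
{
    (void)nr;
    last_pfn_reported = first_pfn;
    reports++;
    return 0;
}

/*
 * Mark the page containing addr dirty. The local array is always
 * updated; the hypervisor is told only when log-dirty is active, so
 * ordinary device emulation costs no extra calls.
 */
static int set_dirty(struct dirty_state *s, uint64_t addr)
{
    uint64_t pfn = addr >> PAGE_BITS;

    assert(s != NULL);
    if (pfn >= NR_PAGES)
        return -1;

    s->local[pfn] = 0xff;
    if (s->logdirty_active && s->notify != NULL)
        return s->notify(pfn, 1);
    return 0;
}
```

With logdirty_active cleared, set_dirty touches only the local array, which is the case Edwin's performance concern is about; the callback fires only once tracking has been switched on.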
Zhai, Edwin
2009-Sep-01 00:40 UTC
Re: [Xen-devel] [PATCH] [IOEMU]: fix the crash of HVM live migration with intensive disk access
Gianluca,

Do you have any comments on this?

Stefano and I may be on the same page. My assumption is that we need to keep cpu_physical_memory_set_dirty and phys_ram_dirty for qemu itself, and make cpu_physical_memory_set_dirty call xc_hvm_modified_memory during live migration.

Thanks,

--
best rgds,
edwin