Mike Sun
2008-Oct-02 19:19 UTC
[Xen-devel] Sources of page dirtying for HVM domains in Xen 3.2.x
Hi all,

I recently moved my research implementation from Xen 3.1.x to Xen 3.2.x and am trying to get a grasp on the shadow page table code changes. I need to trap all writes/modifications to memory pages mapped in an HVM domain's pseudo-physical address space, and I have been using log-dirty mode as a means to do this.

It seems that for an HVM domain in log-dirty mode, all writes to guest memory will trigger a page fault, which can then be handled in sh_page_fault(). The log-dirty handling code is buried inside _sh_propagate(), which is eventually called from sh_page_fault().

But I've noticed that paging_mark_dirty() is also called from other places that do not originate from the page fault handler. This makes sense to me for PV translated domains, as other Xen functions may modify guest pages. But for HVM domains, are there other sources of guest memory modification that will not originate from page faults?

Thanks,
Mike

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Ian Pratt
2008-Oct-02 20:49 UTC
RE: [Xen-devel] Sources of page dirtying for HVM domains in Xen 3.2.x
> But I've noticed that the paging_mark_dirty() is also called from
> other places that do not originate from the page fault handler. This
> makes sense to me for PV translated domains as other Xen functions may
> modify guest pages. But for HVM domains, are there other sources of
> guest memory modification that will not originate from page faults?

Qemu-dm emulated DMA, PV driver DMA.

Within page fault handling you've also got to consider the special cases of MMIO, PTE accessed/dirty bit update.

Ian
Mike Sun
2008-Oct-03 13:06 UTC
Re: [Xen-devel] Sources of page dirtying for HVM domains in Xen 3.2.x
Argh. Forgot to reply to the list. Sorry about that, Ian.

> Within page fault handling you've also got to consider the special cases
> of MMIO, PTE accessed/dirty bit update.

And the emulated writes to guest PTs as well, correct? One confusing thing: since _sh_propagate() is called before the emulated write or MMIO write, it seems that page_dirty() would already have been called, which would make the page_dirty() call after the emulated write redundant. I'm probably missing something though.

> Qemu-dm emulated DMA, PV driver DMA.

Any way I can have Qemu not do DMA? Essentially I'm looking for a clean way of doing copy-on-write on a guest domain's memory pages. Log-dirty mode seems most appropriate, but I have to make sure I can copy the page before it's dirtied/modified. Do you have any good suggestions?
Tim Deegan
2008-Oct-03 13:26 UTC
Re: [Xen-devel] Sources of page dirtying for HVM domains in Xen 3.2.x
Hi,

At 06:06 -0700 on 03 Oct (1223013977), Mike Sun wrote:
> One confusing thing is that it seems that since _sh_propagate() is
> called before the emulated write or MMIO write that the page_dirty()
> would already be called and that the page_dirty() called after the
> emulated write is redundant? I'm probably missing something though.

For emulated writes to pagetables, the _sh_propagate() call doesn't make a writeable mapping (since the target is a pagetable) so it shouldn't call page_dirty().

> Any way I can have Qemu not do DMA?

Don't use disks or network cards? :) It should be possible to set the IDE controller up to do PIO, but presumably your performance would then suck.

> Essentially I'm looking for a clean way of doing a copy-on-write on a
> guest domain's memory pages. The log-dirty mode seems most
> appropriate, but I have to make sure I can copy the page before it's
> dirtied/modified. Have any good suggestions.

The log-dirty mode seems like a good candidate, but there are a number of places where log-dirty marks a page dirty _after_ writing to it; in particular qemu DMA. It might be possible to reshuffle so that the marking happens before the write in all cases, but it needs some thought about race conditions like:

 1. mark dirty
 2. live-migration tool copies the page and clears the bitmap
 3. write to the page

I suspect you end up having to take action both before and after some writes (ones that happen inside Xen are OK because the domain is always paused when the bitmap is read, so as long as you finish the writes before returning from Xen you're safe).

Another option is to hook everything from the p2m table lookup operation, though you'd then need to plumb through a "read/write" argument to lookups so you don't trigger on reads too.

Cheers,

Tim.

--
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]
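The three-step race in Tim's message can be made concrete with a toy model. The sketch below is plain Python, not Xen code: ToyDomain, harvest(), racy_write() and safe_write() are hypothetical names invented for illustration, a set of page numbers stands in for the log-dirty bitmap, and one integer stands in for each page's contents. It only illustrates why marking before the write is not enough on its own when the bitmap can be read and cleared in between.

```python
class ToyDomain:
    """Toy guest: page contents plus a stand-in for the log-dirty bitmap."""

    def __init__(self, num_pages):
        self.pages = [0] * num_pages  # one integer models each page's contents
        self.dirty = set()            # page numbers currently marked dirty


def harvest(dom, snapshot):
    """Model of the migration tool: copy every dirty page, then clear the bitmap."""
    for pfn in sorted(dom.dirty):
        snapshot[pfn] = dom.pages[pfn]
    dom.dirty.clear()


def racy_write(dom, snapshot, pfn, value):
    """Mark only before the write, with the harvest landing in between."""
    dom.dirty.add(pfn)      # 1. mark dirty
    harvest(dom, snapshot)  # 2. tool copies the page and clears the bitmap
    dom.pages[pfn] = value  # 3. write to the page (now unrecorded)


def safe_write(dom, snapshot, pfn, value):
    """Same unlucky interleaving, but the page is marked again after the write."""
    dom.dirty.add(pfn)      # mark before the write
    harvest(dom, snapshot)  # harvest still lands in the middle
    dom.pages[pfn] = value  # the write itself
    dom.dirty.add(pfn)      # mark after the write, so the next harvest sees it
```

In this model, racy_write() leaves the snapshot holding the old contents with nothing left in the bitmap to flag the page, while safe_write() leaves the page marked so that the next harvest() copies the new contents. That matches the "take action both before and after some writes" conclusion above.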
Mike Sun
2008-Oct-03 15:40 UTC
Re: [Xen-devel] Sources of page dirtying for HVM domains in Xen 3.2.x
Thanks Tim. This is becoming a bit more involved than I had initially hoped--not as simple as CoW on a normal OS kernel :).

> For emulated writes to pagetables, the _sh_propagate() call doesn't make
> a writeable mapping (since the target is a pagetable) so it shouldn't
> call page_dirty().

Not quite sure what you mean by not "making a writeable mapping". Could you explain? When I look at the code for sh_page_fault(), _sh_propagate() would be called before the code starts on the emulate path. If log-dirty mode is on, wouldn't this trigger the paging_mark_dirty() that occurs when ft == WRITE for the faulting gva?

What I then see is that in the emulation code (in emulate_unmap_page()), page_dirty() is called again for an mfn (or two) given in the cpu context. Maybe I don't understand what the emulated write actually does?

> The log-dirty mode seems like a good candidate, but there are a
> number of places where log-dirty marks a page dirty _after_ writing to
> it; in particular qemu DMA. It might be possible to reshuffle so that
> the marking happens before the write in all cases, but it needs
> some thought about race conditions like:

Yeah, that's part of the fear I have. Ideally, I wanted a single point where I could trigger a fault on any attempted write, such as making all the guest's memory pages read-only and having CR0.WP turned on. But it seems like there are a bunch of other things that Xen does that modify/dirty pages without causing a protection fault.

> Another option is to hook everything from the p2m table lookup
> operation, though you'd then need to plumb through a "read/write"
> argument to lookups so you don't trigger on reads too.

p2m table lookup? Sorry, I'm realizing I probably don't fully understand what's really going on. Isn't the p2m lookup only done when the shadows are created? Could that really be an intervention point for any writes?

Thanks for all your help, Tim!
Mike
Tim Deegan
2008-Oct-03 15:57 UTC
Re: [Xen-devel] Sources of page dirtying for HVM domains in Xen 3.2.x
At 08:40 -0700 on 03 Oct (1223023218), Mike Sun wrote:
> Not quite sure what you mean by not "making a writeable mapping".
> Could you explain? When I look at the code for sh_page_fault(), the
> _sh_propagate() would be called before the code starts on the emulate
> path; If log-dirty mode is on, wouldn't this trigger the
> paging_mark_dirty() that occurs when ft == WRITE for the faulting gva?

Yes, it will. Sorry, I had misremembered the code. I thought that the check (just below that) for this being a pagetable would cause the mark-dirty not to happen, but I was wrong. I think that would be the correct behaviour, since this mark-dirty is intended to mean "we have given the guest direct write access to the page".

> What I then see is that in the emulation code (in the
> emulate_unmap_page()), a page_dirty() is called again for an mfn (or
> two) given in the cpu context. Maybe I don't understand what the
> emulate write actually does?

That mark is to say that we've actually changed the contents of the page. As it happens, you're correct and only one of the two is needed, but I think it's good practise to mark the page at the time that xen actually changes it.

> > The log-dirty mode seems like a good candidate, but there are a
> > number of places where log-dirty marks a page dirty _after_ writing to
> > it; in particular qemu DMA. It might be possible to reshuffle so that
> > the marking happens before the write in all cases, but it needs
> > some thought about race conditions like:
>
> Yeah, that's part of the fear I have. Ideally, I wanted a single
> point where I could trigger a fault on any attempted write, such as
> making all the guest's memory pages read only and having CR0.WP turned
> on. But it seems like there are a bunch of other things that Xen does
> that modifies/dirties pages without causing a protection fault.

Yep. :( At least most of the places in Xen are easy to find by the calls to paging_mark_dirty(). And probably for most of them it will be safe to do the marking before the write, since both of them will happen before returning from Xen, and the bitmap operations are done with the domain paused.

If your guest has PV drivers, then it might grant access to other guests, who don't have to be so picky (their writes are picked up by calling paging_mark_dirty() when the grant is destroyed). But special-casing the grant tables should be OK.

And in Xen 3.2 DMA accesses from qemu were completely un-marked (which I believe has either been fixed in xen-unstable or else there's a patch on the way). Since qemu already has a wrapper function for its writes to guest RAM it's easy to add a hypercall at the top of it.

Here's a harder question: What do you do if you hit a copy-on-write fault and don't have any memory available to copy into? Or are you planning to allocate a full domain of memory up-front for VM fork?

> p2m table lookup? Sorry, I'm really realizing I probably don't
> understand what's really going on fully. Isn't that p2m lookup only
> done when the shadows are created?

It's also done when more or less anything else wants to access the guest's memory. But I'm not sure it's any better than your current approach (and actually there have been more changes to that area since 3.2 than to log-dirty, so it might be storing up pain in the future).

Cheers,

Tim.
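The idea of hooking a single wrapper for writes to guest RAM, so that something runs "at the top of it" before any byte changes, can be sketched as follows. This is a toy model in Python and does not reproduce qemu's or Xen's real interfaces: GuestRam, its before_write callback (standing in for the hypercall Tim mentions) and the 4096-byte PAGE_SIZE are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size for the toy model


class GuestRam:
    """Toy guest memory whose single write path notifies a hook first."""

    def __init__(self, num_pages, before_write):
        self.mem = bytearray(num_pages * PAGE_SIZE)
        # Hypothetical stand-in for a hypercall: called as
        # before_write(pfn, current_page_contents) before the page changes.
        self.before_write = before_write

    def write(self, addr, data):
        """Write data at addr, notifying the hook for every page touched."""
        if not data:
            return
        if addr < 0 or addr + len(data) > len(self.mem):
            raise ValueError("write outside guest RAM")
        first = addr // PAGE_SIZE
        last = (addr + len(data) - 1) // PAGE_SIZE
        for pfn in range(first, last + 1):
            start = pfn * PAGE_SIZE
            # Hand over an immutable copy of the page as it is right now.
            self.before_write(pfn, bytes(self.mem[start:start + PAGE_SIZE]))
        # Only after every affected page has been reported do we modify memory.
        self.mem[addr:addr + len(data)] = data
```

A caller wanting copy-on-write behaviour could pass a hook that stores the contents only the first time it sees a given pfn; a write that straddles a page boundary then reports both pages before either is modified.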
Mike Sun
2008-Oct-03 18:34 UTC
Re: [Xen-devel] Sources of page dirtying for HVM domains in Xen 3.2.x
> And in Xen 3.2 DMA accesses from qemu were completely un-marked (which I
> believe has either been fixed in xen-unstable or else there's a patch on
> the way). Since qemu already has a wrapper function for its writes to
> guest RAM it's easy to add a hypercall at the top of it.

Meaning that they're not marked dirty at all? My understanding was that qemu-dm marked them dirty and that this dirty bitmap for DMA writes could be accessed. It seems like that's what the live migration code uses to check for pages dirtied by qemu DMA (init_qemu_maps(), qemu_flip_buffer()). I can see a need for a hypercall when qemu does the dirty marking in its code. I may just save myself some trouble and pre-copy those memory pages instead of letting them be CoW.

> Here's a harder question: What do you do if you hit a copy-on-write
> fault and don't have any memory available to copy into? Or are you
> planning to allocate a full domain of memory up-front for VM fork?

I'm actually not aiming for a VM fork; it's for a checkpoint. Yup, I'm having dom0 allocate a large enough buffer which I pass to the hypervisor. It's a crude approach for now, but I'll optimize it later with some sort of circular buffer, so that the checkpoint agent in dom0 can ensure there's always buffer space available.

You've been tremendously helpful in clarifying things for me. Thanks again for the help!

Mike
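The preallocated buffer and the "no memory available" question can be illustrated with a small bounded queue. This is only a sketch under stated assumptions, written in Python rather than as hypervisor code: CheckpointBuffer, save_original() and drain() are hypothetical names, and the policy of returning False when the buffer is full (leaving the caller to stall the writer until the dom0 agent drains some entries) is one possible choice, not something the thread specifies.

```python
from collections import deque


class CheckpointBuffer:
    """Bounded store of original page contents for one checkpoint."""

    def __init__(self, capacity):
        self.capacity = capacity  # number of page slots allocated up front
        self.queue = deque()      # (pfn, original contents), oldest first
        self.preserved = set()    # pfns already copied for this checkpoint

    def save_original(self, pfn, contents):
        """Preserve a page before its first write.

        Returns True if the original is safe (saved now or earlier),
        False if the buffer is full and the write must wait.
        """
        if pfn in self.preserved:
            return True           # later writes to the same page need no copy
        if len(self.queue) >= self.capacity:
            return False          # no free slot: caller must stall the writer
        self.queue.append((pfn, contents))
        self.preserved.add(pfn)
        return True

    def drain(self, max_pages):
        """Model of the dom0 agent consuming up to max_pages oldest entries."""
        out = []
        while self.queue and len(out) < max_pages:
            out.append(self.queue.popleft())
        return out
```

Keeping the preserved set separate from the queue matters: once the agent has drained a page, a later write to that page must not be treated as a first write, or the checkpoint would capture already-modified contents.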
Tim Deegan
2008-Oct-07 09:04 UTC
Re: [Xen-devel] Sources of page dirtying for HVM domains in Xen 3.2.x
Hi,

At 11:34 -0700 on 03 Oct (1223033670), Mike Sun wrote:
> Meaning that they're not marked dirty at all? My understanding was
> that qemu-dm marked them dirty and that this dirty bitmap for DMA
> writes could be accessed. It seems like that's what the live
> migration code uses to check for dirtied pages from qemu dma
> (init_qemu_maps(), qemu_flip_buffer()).

qemu tracks them in its own bitmap. What I meant is there's no marking done inside xen for those writes, so just hooking the log-dirty bitmap there won't catch them.

> I can see a need for a
> hypercall when qemu does the dirty marking in its code. I may just
> save myself some trouble and pre-copy those memory pages instead of
> letting them be CoW.

Yep, if you can do that from inside qemu, that should be fine.

Cheers,

Tim.
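Because qemu's DMA writes are tracked in qemu's own bitmap rather than in Xen's log-dirty bitmap, a tool that wants the full set of dirtied pages has to combine the two. The following Python sketch shows that union on raw bitmaps. It does not model init_qemu_maps() or qemu_flip_buffer(); merge_bitmaps() and dirty_pfns() are hypothetical helpers, and the layout (one bit per page, least significant bit of each byte first) is an assumption made for the example.

```python
def merge_bitmaps(xen_bitmap, qemu_bitmap):
    """Return the bitwise OR of two equally sized dirty bitmaps."""
    if len(xen_bitmap) != len(qemu_bitmap):
        raise ValueError("bitmaps must cover the same number of pages")
    return bytes(a | b for a, b in zip(xen_bitmap, qemu_bitmap))


def dirty_pfns(bitmap):
    """List the page numbers whose bit is set (assumed LSB-first per byte)."""
    return [
        byte_index * 8 + bit
        for byte_index, byte in enumerate(bitmap)
        for bit in range(8)
        if byte & (1 << bit)
    ]
```

A page counts as dirty if either source marked it, so hooking only the Xen-side bitmap would miss any page that appears in the qemu bitmap alone.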