Razvan Cojocaru
2013-Jan-17 14:02 UTC
[PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
This patch makes it possible to emulate an instruction that triggered
a page fault (received via the mem_event API). This is done by setting
the MEM_EVENT_FLAG_EMULATE in mem_event_response_t.flags. The purpose
of this is to be able to receive several distinct page fault mem_events
for the same address, and choose which ones are allowed to go through
from dom0 userspace.

Signed-off-by: Razvan Cojocaru <rzvncj@gmail.com>

diff -r b6195e277da5 -r c5db0882bfcf xen/arch/x86/mm/p2m.c
--- a/xen/arch/x86/mm/p2m.c Wed Jan 16 14:15:44 2013 +0000
+++ b/xen/arch/x86/mm/p2m.c Thu Jan 17 16:01:11 2013 +0200
@@ -1309,6 +1309,17 @@ bool_t p2m_mem_access_check(paddr_t gpa,
         }
     }
 
+    if ( v->arch.hvm_vmx.mem_event_emulate )
+    {
+        struct hvm_emulate_ctxt ctx[1] = {};
+
+        v->arch.hvm_vmx.mem_event_emulate = 0;
+        hvm_emulate_prepare(ctx, guest_cpu_user_regs());
+        hvm_emulate_one(ctx);
+
+        return 1;
+    }
+
     *req_ptr = NULL;
     req = xzalloc(mem_event_request_t);
     if ( req )
@@ -1347,8 +1358,15 @@ void p2m_mem_access_resume(struct domain
     /* Pull all responses off the ring */
     while( mem_event_get_response(d, &d->mem_event->access, &rsp) )
     {
+        d->vcpu[rsp.vcpu_id]->arch.hvm_vmx.mem_event_emulate = 0;
+
         if ( rsp.flags & MEM_EVENT_FLAG_DUMMY )
             continue;
+
+        /* Mark vcpu for skipping one instruction upon rescheduling */
+        if ( rsp.flags & MEM_EVENT_FLAG_EMULATE )
+            d->vcpu[rsp.vcpu_id]->arch.hvm_vmx.mem_event_emulate = 1;
+
         /* Unpause domain */
         if ( rsp.flags & MEM_EVENT_FLAG_VCPU_PAUSED )
             vcpu_unpause(d->vcpu[rsp.vcpu_id]);
diff -r b6195e277da5 -r c5db0882bfcf xen/include/asm-x86/hvm/vmx/vmcs.h
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h Wed Jan 16 14:15:44 2013 +0000
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h Thu Jan 17 16:01:11 2013 +0200
@@ -125,6 +125,8 @@ struct arch_vmx_struct {
     /* Remember EFLAGS while in virtual 8086 mode */
     uint32_t vm86_saved_eflags;
     int hostenv_migrated;
+    /* Should we emulate the first instruction on VCPU resume after a mem_event? */
+    uint8_t mem_event_emulate;
 };
 
 int vmx_create_vmcs(struct vcpu *v);
diff -r b6195e277da5 -r c5db0882bfcf xen/include/public/mem_event.h
--- a/xen/include/public/mem_event.h Wed Jan 16 14:15:44 2013 +0000
+++ b/xen/include/public/mem_event.h Thu Jan 17 16:01:11 2013 +0200
@@ -36,6 +36,7 @@
 #define MEM_EVENT_FLAG_EVICT_FAIL (1 << 2)
 #define MEM_EVENT_FLAG_FOREIGN (1 << 3)
 #define MEM_EVENT_FLAG_DUMMY (1 << 4)
+#define MEM_EVENT_FLAG_EMULATE (1 << 5) /* Emulate the instruction that caused the current mem_event */
 
 /* Reasons for the memory event request */
 #define MEM_EVENT_REASON_UNKNOWN 0 /* typical reason */
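To illustrate how a dom0 listener might use the new flag, here is a minimal sketch of building a response. It is not taken from the patch or from any Xen tool: `struct demo_response` is a simplified stand-in for mem_event_response_t, `demo_build_response()` is a hypothetical helper, and the value of MEM_EVENT_FLAG_VCPU_PAUSED (bit 0) is an assumption; only MEM_EVENT_FLAG_EMULATE (bit 5) comes from the patch itself.

```c
#include <stdint.h>

/* Local copies of the flag bits. EMULATE is the value added by the patch;
 * VCPU_PAUSED being bit 0 is an assumption about the public header. */
#define MEM_EVENT_FLAG_VCPU_PAUSED (1 << 0)
#define MEM_EVENT_FLAG_EMULATE     (1 << 5)

/* Simplified, hypothetical stand-in for mem_event_response_t. */
struct demo_response {
    uint32_t flags;
    uint32_t vcpu_id;
    uint64_t gfn;
};

/*
 * Build a response for one access event.
 * allow_once != 0: echo VCPU_PAUSED so the vcpu is unpaused, and set
 *                  EMULATE so Xen emulates the faulting instruction.
 * allow_once == 0: leave the flags clear, so the unpause branch in
 *                  p2m_mem_access_resume() is not taken.
 */
static struct demo_response
demo_build_response(uint32_t req_flags, uint32_t vcpu_id, uint64_t gfn,
                    int allow_once)
{
    struct demo_response rsp = { 0 };

    rsp.vcpu_id = vcpu_id;
    rsp.gfn = gfn;

    if (allow_once) {
        rsp.flags = req_flags & MEM_EVENT_FLAG_VCPU_PAUSED;
        rsp.flags |= MEM_EVENT_FLAG_EMULATE;
    }

    return rsp;
}
```

A real listener would then put the response on the mem_event ring and notify Xen; that part is omitted here because it depends on the actual ring and event-channel interfaces.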
Andres Lagar-Cavilla
2013-Jan-17 15:38 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
> This patch makes it possible to emulate an instruction that triggered
> a page fault (received via the mem_event API). This is done by setting
> the MEM_EVENT_FLAG_EMULATE in mem_event_response_t.flags. The purpose
> of this is to be able to receive several distinct page fault mem_events
> for the same address, and choose which ones are allowed to go through
> from dom0 userspace.
>
> Signed-off-by: Razvan Cojocaru <rzvncj@gmail.com>
>
> diff -r b6195e277da5 -r c5db0882bfcf xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c Wed Jan 16 14:15:44 2013 +0000
> +++ b/xen/arch/x86/mm/p2m.c Thu Jan 17 16:01:11 2013 +0200
> @@ -1309,6 +1309,17 @@ bool_t p2m_mem_access_check(paddr_t gpa,
>          }
>      }
>
> +    if ( v->arch.hvm_vmx.mem_event_emulate )

Lack of mem event support for AMD processors will be fixed as soon as someone with interest (and time!) gets going. I don't recall anything fundamental standing in the way.

So it's best if you lift this field into the generic hvm level.

> +    {
> +        struct hvm_emulate_ctxt ctx[1] = {};
> +
> +        v->arch.hvm_vmx.mem_event_emulate = 0;
> +        hvm_emulate_prepare(ctx, guest_cpu_user_regs());

Tim's point is that you won't get all the mem events, because the instruction can easily touch multiple pages. It's a question that addresses the need for this patch in the first place.

One potential (hairy) fix is to have all get_page_from_gfn check for and emit a mem event. It's a bit of a rat hole, because we'll need to pass the intended permissions down the stack, check against mem event status, etc etc. It will help extend mem event to catch all hypervisor-based accesses that currently it mostly can't, as well as foreign mappings. It's certainly not for the faint of heart.

> +        hvm_emulate_one(ctx);
> +
> +        return 1;
> +    }
> +
>      *req_ptr = NULL;
>      req = xzalloc(mem_event_request_t);
>      if ( req )
> @@ -1347,8 +1358,15 @@ void p2m_mem_access_resume(struct domain
>      /* Pull all responses off the ring */
>      while( mem_event_get_response(d, &d->mem_event->access, &rsp) )
>      {
> +        d->vcpu[rsp.vcpu_id]->arch.hvm_vmx.mem_event_emulate = 0;
> +
>          if ( rsp.flags & MEM_EVENT_FLAG_DUMMY )
>              continue;
> +
> +        /* Mark vcpu for skipping one instruction upon rescheduling */
> +        if ( rsp.flags & MEM_EVENT_FLAG_EMULATE )
> +            d->vcpu[rsp.vcpu_id]->arch.hvm_vmx.mem_event_emulate = 1;
> +
>          /* Unpause domain */
>          if ( rsp.flags & MEM_EVENT_FLAG_VCPU_PAUSED )
>              vcpu_unpause(d->vcpu[rsp.vcpu_id]);
> diff -r b6195e277da5 -r c5db0882bfcf xen/include/asm-x86/hvm/vmx/vmcs.h
> --- a/xen/include/asm-x86/hvm/vmx/vmcs.h Wed Jan 16 14:15:44 2013 +0000
> +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h Thu Jan 17 16:01:11 2013 +0200
> @@ -125,6 +125,8 @@ struct arch_vmx_struct {
>      /* Remember EFLAGS while in virtual 8086 mode */
>      uint32_t vm86_saved_eflags;
>      int hostenv_migrated;
> +    /* Should we emulate the first instruction on VCPU resume after a mem_event? */
> +    uint8_t mem_event_emulate;
>  };
>
>  int vmx_create_vmcs(struct vcpu *v);
> diff -r b6195e277da5 -r c5db0882bfcf xen/include/public/mem_event.h
> --- a/xen/include/public/mem_event.h Wed Jan 16 14:15:44 2013 +0000
> +++ b/xen/include/public/mem_event.h Thu Jan 17 16:01:11 2013 +0200
> @@ -36,6 +36,7 @@
>  #define MEM_EVENT_FLAG_EVICT_FAIL (1 << 2)
>  #define MEM_EVENT_FLAG_FOREIGN (1 << 3)
>  #define MEM_EVENT_FLAG_DUMMY (1 << 4)
> +#define MEM_EVENT_FLAG_EMULATE (1 << 5) /* Emulate the instruction that caused the current mem_event */

Line overflow, better stack the comment on top

Thanks
Andres

>
>  /* Reasons for the memory event request */
>  #define MEM_EVENT_REASON_UNKNOWN 0 /* typical reason */
Razvan Cojocaru
2013-Jan-17 15:50 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
Hello, and thanks for your comments.

>> +    if ( v->arch.hvm_vmx.mem_event_emulate )
>
> Lack of mem event support for AMD processors will be fixed as soon as someone with interest (and time!) gets going. I don't recall anything fundamental standing in the way.
>
> So it's best if you lift this field into the generic hvm level.

That would make it v->arch.mem_event_emulate, correct?

> Tim's point is that you won't get all the mem events, because the instruction can easily touch multiple pages. It's a question that addresses the need for this patch in the first place.

Yes, I did get Tim's point. Sorry if I haven't been able to make that as clear as I should have in my reply to Tim's email.

> One potential (hairy) fix is to have all get_page_from_gfn check for and emit a mem event. It's a bit of rat hole, because we'll need to pass the intended permissions down the stack, check against mem event status, etc etc. It will help extend mem event to catch all hypervisor-based accesses that currently it mostly can't, as well as foreign mappings. It's certainly not for the faint of heart.

Fortunately, for the purposes of my patch, the simple solution is enough.

>> +#define MEM_EVENT_FLAG_EMULATE (1 << 5) /* Emulate the instruction that caused the current mem_event */
> Line overflow, better stack the comment on top

No problem.

Thank you,
Razvan Cojocaru
Razvan Cojocaru
2013-Jan-21 23:13 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
Hello,

> Tim's point is that you won't get all the mem events, because the instruction can easily touch multiple pages. It's a question that addresses the need for this patch in the first place.

What would be an example of such an instruction?

Thanks,
Razvan Cojocaru
Tim Deegan
2013-Jan-22 12:31 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
At 01:13 +0200 on 22 Jan (1358817224), Razvan Cojocaru wrote:
> Hello,
>
> > Tim's point is that you won't get all the mem events, because the instruction can easily touch multiple pages. It's a question that addresses the need for this patch in the first place.
>
> what would be an example of such an instruction?

x86 instructions aren't aligned, so any instruction that encodes as more than one byte can touch two pages just in instruction fetch, if it overlaps the end of a page. The same goes for explicit memory operands, so an instruction like MOVSW that has two memory operands can touch six pages - two for the instruction fetch and two for each operand.

Each of those accesses is to a virtual address; in 64-bit mode a TLB miss can add four more memory accesses to walk the pagetables, so we're looking at a worst case of 25 pages of memory that might be touched by a successful MOVSW (the top-level pagetable only counts once).

But what if the last access caused a page fault? After up to 24 accesses, the CPU now needs to access the IDT to figure out what to do with #PF (+4 = 28), and the stack to push an exception (+8 = 36), and if the OS is using a task gate then it needs the old and new TSSes (+8 = 44). And if the stack faults, we need to do the whole thing again for #DF (-1, +12 = 55).

Now that's a pretty unlikely scenario (and I may have got some of the details wrong) but the upshot is: a single x86 instruction can access enormous amounts of memory, so turning off protection and single-stepping, especially if you don't trust the OS, is exposing a lot more than the single frame you took the first fault on.

Cheers,
Tim.
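The running totals in Tim's message can be reproduced with a few lines of C. This is only one reading of the tally, under stated assumptions: each access is charged one data page plus three non-top-level pagetable pages, the top-level pagetable is counted once, and the #DF step is taken to be an IDT access plus a two-page stack push. The function names are made up for this sketch.

```c
/* Assumption: 4-level paging, with the top-level table counted once. */
enum {
    PT_LEVELS = 4,
    /* One data page plus the three non-top-level pagetable pages. */
    PAGES_PER_ACCESS = 1 + (PT_LEVELS - 1)
};

/* Pages charged for n accesses to distinct pages. */
static int pages_for_accesses(int n)
{
    return n * PAGES_PER_ACCESS;
}

/* Successful MOVSW: 2 pages of instruction fetch, 2 per memory operand. */
static int movsw_success_pages(void)
{
    int accesses = 2 + 2 + 2;

    return pages_for_accesses(accesses) + 1;   /* +1: top-level pagetable */
}

/* The last access faults and #PF is delivered through a task gate. */
static int movsw_page_fault_pages(void)
{
    int total = movsw_success_pages() - 1;     /* faulting page not reached */

    total += pages_for_accesses(1);            /* IDT entry:       +4 */
    total += pages_for_accesses(2);            /* exception stack: +8 */
    total += pages_for_accesses(2);            /* old and new TSS: +8 */
    return total;
}

/* The stack push faults as well, so #DF has to be delivered. */
static int movsw_double_fault_pages(void)
{
    int total = movsw_page_fault_pages() - 1;  /* faulting stack page */

    total += pages_for_accesses(1);            /* IDT entry for #DF: +4 */
    total += pages_for_accesses(2);            /* #DF stack:         +8 */
    return total;
}
```

Called from a small main(), the three functions return 25, 44 and 55, matching the figures quoted above.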
Razvan Cojocaru
2013-Jan-22 12:53 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
> #DF (-1, +12 = 55). Now that's a pretty unlikely scenario (and I may
> have got some of the details wrong) but the upshot is: a single x86
> instruction can access enormous amounts of memory, so turning off
> protection and single-stepping, especially if you don't trust the OS, is
> exposing a lot more than the single frame you took the first fault on.

Thank you, Tim, for clearing that up. Now, 'touching' a page is quite different from 'writing to' a page, and I'm really only interested in the latter. So, in a scenario where reads are permitted by default and we're only interested in writes, are we still talking about these limitations? A MOVSW, for example, only needs to write to a single page, even though it does touch more pages in read mode.

Thanks,
Razvan Cojocaru
Tim Deegan
2013-Jan-22 13:20 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
At 14:53 +0200 on 22 Jan (1358866429), Razvan Cojocaru wrote:
> >#DF (-1, +12 = 55). Now that's a pretty unlikely scenario (and I may
> >have got some of the details wrong) but the upshot is: a single x86
> >instruction can access enormous amounts of memory, so turning off
> >protection and single-stepping, especially if you don't trust the OS, is
> >exposing a lot more than the single frame you took the first fault on.
>
> Thank you, Tim, for clearing that up. Now, 'touching' a page is quite
> different from 'writing to' a page, and I'm really only interested in
> the latter. So, in a scenario where reads are permitted by default and
> we're only interested in writes, are we still talking about these
> limitations? A MOVSW, for example, only needs to write to a single page,
> even though it does touch more pages in read mode.

Ok, talking only about writes, we have the destination operand, plus all the pagetables (for setting Accessed bits) plus any stacks and TSSes needed in delivering faults; something like 32 pages for the full double-fault scenario.

Tim
Razvan Cojocaru
2013-Jan-22 13:47 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
> Ok, talking only about writes, we have the destination operand, plus all
> the pagetables (for setting Accessed bits) plus any stacks and TSSes
> needed in delivering faults; something like 32 pages for the full
> double-fault scenario.

I see, but then, even setting aside Andres' argument that having all possible events sent to userspace is far from trivial, doing so would completely cripple the monitored domain speed-wise. Imagine having userspace having to decide if it allows a write 32 times per one instruction.

Even with its limitations, this patch at least gives us a shot at looking at what the domain does with memory - the existing model (releasing the page completely to allow the write) is not fit for anything like this at all.

So far, between the requirements of reasonable demands on the domU, and satisfactory levels of control provided by the received mem_events, at least as far as writes go, the patch has done its job quite nicely.

Thanks,
Razvan Cojocaru
Andres Lagar-Cavilla
2013-Jan-22 14:02 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
On Jan 22, 2013, at 8:47 AM, Razvan Cojocaru <rzvncj@gmail.com> wrote:

>> Ok, talking only about writes, we have the destination operand, plus all
>> the pagetables (for setting Accessed bits) plus any stacks and TSSes
>> needed in delivering faults; something like 32 pages for the full
>> double-fault scenario.
>
> I see, but then, even setting aside Andres' argument that having all possible events sent to userspace is far from trivial, doing so would completely cripple the monitored domain speed-wise. Imagine having userspace having to decide if it allows a write 32 times per one instruction.

Razvan, have you tried the n2rwx mode? It'll cause an event for the first access to each page. It'll automatically patch the access (hence 2 rwx) to allow execution to continue unhindered.

So user-space doesn't decide anything, but gets informed about everything (within the constraints of mem event, i.e. no foreign mappings, no Xen mappings).

If you want to know about every write and decide on every write… you can't have your cake and eat it too, right?

Andres

>
> Even with its limitations, this patch at least gives us a shot at looking at what the domain does with memory - the existing model (releasing the page completely to allow the write) is not fit for anything like this at all.
>
> So far, between the requirements of reasonable demands on the domU, and satisfactory levels of control provided by the received mem_events, at least as far as writes go, the patch has done its job quite nicely.
>
> Thanks,
> Razvan Cojocaru
Razvan Cojocaru
2013-Jan-22 14:22 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
> Razvan, have you tried the n2rwx mode? It'll cause an event for the first access to each page. It'll automatically patch the access (hence 2 rwx) to allow execution to continue unhindered.
>
> So user-space doesn't decide anything, but gets informed about everything (within the constraints of mem event, i.e. no foreign mappings, no Xen mappings).

Hello, yes, I did look at that. Unfortunately, my userspace application needs to be notified about subsequent writes to the same page, and once the page goes to rwx mode there are no further events. So that's not a very useful mechanism.

> If you want to know about every write and decide on every write… you can't have your cake and eat it too, right?

Right.

Thanks,
Razvan Cojocaru
Tim Deegan
2013-Jan-22 14:26 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
At 15:47 +0200 on 22 Jan (1358869654), Razvan Cojocaru wrote:
> >Ok, talking only about writes, we have the destination operand, plus all
> >the pagetables (for setting Accessed bits) plus any stacks and TSSes
> >needed in delivering faults; something like 32 pages for the full
> >double-fault scenario.
>
> I see, but then, even setting aside Andres' argument that having all
> possible events sent to userspace is far from trivial, doing so would
> completely cripple the monitored domain speed-wise. Imagine having
> userspace having to decide if it allows a write 32 times per one
> instruction.

Well, the _common_ case is only to have one write. Even rarer cases like needing an Accessed bit or crossing a page will only be two writes.

But sending all the events from userspace doesn't necessarily solve the problem. The guest could be overwriting the instruction stream from another vcpu, or by DMA from disk, so that by the time you've returned to Xen the instruction you emulate isn't even the one that faulted.

The only properly safe way to allow exactly one exception to your rules is to emulate the instruction in user-space. (Well, that or somehow move your policy into Xen and do the emulation there, but I'm quite strongly opposed to that).

> Even with its limitations, this patch at least gives us a shot at
> looking at what the domain does with memory - the existing model
> (releasing the page completely to allow the write) is not fit for
> anything like this at all.

If you're just using this to gather statistics about how often a page gets written, you could use sampling; you don't need to see _every_ write.

> So far, between the requirements of reasonable demands on the domU, and
> satisfactory levels of control provided by the received mem_events, at
> least as far as writes go, the patch has done its job quite nicely.

It might be helpful if you could give us a clear description of exactly what problem you're trying to solve.

Tim.
Razvan Cojocaru
2013-Jan-22 14:45 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
> The only properly safe way to allow exactly one exception to your rules
> is to emulate the instruction in user-space. (Well, that or somehow
> move your policy into Xen and do the emulation there, but I'm quite
> strongly opposed to that).

Is there an example of that somewhere in the Xen source code tree?

> If you're just using this to gather statistics about how often a page
> gets written, you could use sampling; you don't need to see _every_
> write.

I'm not gathering statistics.

> It might be helpful if you could give us a clear description of exactly
> what problem you're trying to solve.

I'm watching for suspicious activity on the domU. If any occurs, the domU should be paused (at least the VCPU in question). A dom0 userspace application should decide what constitutes suspicious activity, with (1) the least possible slowing down of the domU, and (2) as few "false positive" writes allowed as possible (ideally zero, if there's a way that doesn't go against requirement (1)).

Thanks,
Razvan Cojocaru
Tim Deegan
2013-Jan-24 11:05 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
At 16:45 +0200 on 22 Jan (1358873134), Razvan Cojocaru wrote:
> >The only properly safe way to allow exactly one exception to your rules
> >is to emulate the instruction in user-space. (Well, that or somehow
> >move your policy into Xen and do the emulation there, but I'm quite
> >strongly opposed to that).
>
> Is there an example of that somewhere in the Xen source code tree?

I don't think so.

It occurs to me that if you're willing to rely on the Xen x86_emulate() emulator, the model that we use for emulated MMIO might be better. There, Xen emulates the instruction directly in the fault handler and sends individual memory accesses to qemu for emulation. qemu receives them as a series of ioreqs (basically, address/size/data tuples).

So you could, for example:
 - invent a new p2m type (probably based very closely on p2m_ram_ro, maybe you could even just use p2m_ram_ro).
 - Use the HVMOP_set_mem_type to mark the pages you want readonly.
 - Use Julien Grall's new ioreq interfaces to register your helper as the handler for the pages you care about.

Then your user-space helper will get told about each actual write, rather than each faulting instruction. If the write is OK, the helper will map the target address and do the write.

Have a look at, e.g.
http://lists.xen.org/archives/html/xen-devel/2012-08/msg01767.html
for Julien's multiple-ioreq-handlers code; I'm not sure what the current state of that is, except that it doesn't seem to be checked in yet.

Cheers,
Tim.
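The "told about each actual write" model can be sketched as a helper that receives address/size/data tuples, applies a policy, and performs the write itself when the policy allows it. Everything below is hypothetical and for illustration only: `struct demo_write_req` is not the real ioreq layout, the policy is a simple allowed window, and the guest page is modelled as a local byte buffer rather than a mapped frame.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical write request: an address/size/data tuple. */
struct demo_write_req {
    uint64_t addr;   /* guest physical address of the write */
    uint32_t size;   /* 1 to 8 bytes */
    uint64_t data;   /* value to store, low bytes first */
};

/* Hypothetical watched region, backed by a local buffer. */
struct demo_region {
    uint64_t start;        /* first guest address covered */
    uint64_t len;          /* number of bytes covered */
    uint64_t allow_start;  /* policy: writes are allowed only inside */
    uint64_t allow_len;    /*         [allow_start, allow_start + allow_len) */
    uint8_t *backing;      /* stand-in for the mapped guest memory */
};

/* Nonzero if [addr, addr + size) lies wholly inside [start, start + len). */
static int demo_in_range(uint64_t addr, uint32_t size,
                         uint64_t start, uint64_t len)
{
    return addr >= start && size <= len && addr - start <= len - size;
}

/*
 * Handle one write request.
 * Returns 1 if the write was allowed and applied, 0 if the policy denied
 * it (a monitor would pause the vcpu at this point), and -1 if the
 * request is malformed or falls outside the watched region.
 */
static int demo_handle_write(const struct demo_region *r,
                             const struct demo_write_req *req)
{
    uint64_t off;
    uint32_t i;

    if (req->size == 0 || req->size > 8)
        return -1;
    if (!demo_in_range(req->addr, req->size, r->start, r->len))
        return -1;
    if (!demo_in_range(req->addr, req->size, r->allow_start, r->allow_len))
        return 0;

    /* Store byte by byte in little-endian order, as an x86 guest would. */
    off = req->addr - r->start;
    for (i = 0; i < req->size; i++)
        r->backing[off + i] = (uint8_t)(req->data >> (8 * i));

    return 1;
}
```

The point of the design is that the decision is taken per write rather than per faulting instruction, so the helper never has to release protection on the whole page.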
Razvan Cojocaru
2013-Jan-24 11:34 UTC
Re: [PATCH V2] mem_event: Allow emulating an instruction that caused a page fault
> Have a look at, e.g.
> http://lists.xen.org/archives/html/xen-devel/2012-08/msg01767.html
> for Julien's multiple-ioreq-handlers code; I'm not sure what the
> current state of that is, except that it doesn't seem to be checked in
> yet.

Thank you for your suggestion, I'll take a look at that.

Thanks,
Razvan Cojocaru