Petersson, Mats
2006-Jun-02 16:34 UTC
[Xen-devel] Fetching instructions after page-fault, near page boundary?
If we get a page-fault due to an MMIO access to a virtual MMIO device (such as the VGA screen in HVM), we shouldn't need to worry about crossing the page boundary at the end of the instruction, right? Let's say the instruction is a 7-byte instruction like this:

xxxx1FFD: 11 22 33 <page boundary to page xxxx2000> 44 55 66 77

If the page xxxx2000 isn't present when the instruction is started, then we'd FIRST get a page-fault for this address, so either we fail the instruction (if page xxxx2000 can't actually be fixed up), or we get the page fixed up and therefore the second time, when we get to the page-fault handler looking at the address the instruction is accessing [doing the MMIO part], the second page is present [assuming we haven't got any sneaky code going round modifying the page-tables for this guest domain - which I don't think is a VALID thing to expect, is it?]

The next case is where we have a short instruction before an empty (unused) page, say a three-byte instruction (RR is another instruction, such as a return instruction).

xxxx1FFC: 11 22 33 RR <page boundary to xxxx2000> [not readable since it's not present].

My design idea for the merged x86_emulate.c in QEMU is to read instruction bytes blind (i.e. not knowing the actual instruction length) by this method: try to read 15 bytes (MAX_INST_LEN), and if the instruction bytes happen to cross a page boundary, and the second page is not readable, I'll just cut the number of bytes short, assuming that the valid instruction is shorter than 15 bytes.

Does anyone see a problem with this method?

[By the way, this makes an improvement over the current setup, which fails if we try to read a page that isn't readable - which at least the SVM model does try sometimes].

-- Mats
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
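A minimal sketch of the blind fetch proposed in this message, assuming the caller already holds host pointers to the guest page containing the instruction start and (if present) to the page after it. The function name fetch_insn_bytes, the GUEST_PAGE_SIZE constant and the two-pointer interface are hypothetical and are not taken from x86_emulate.c or QEMU; only the 15-byte MAX_INST_LEN limit comes from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUEST_PAGE_SIZE 4096u
#define MAX_INST_LEN    15u

/*
 * Copy up to MAX_INST_LEN instruction bytes starting at guest address rip.
 * first_page maps the (readable) page containing rip; second_page maps the
 * following page, or is NULL if that page is not present. Both pointers are
 * assumptions made for this sketch. Returns the number of bytes copied.
 */
size_t fetch_insn_bytes(uint64_t rip, const uint8_t *first_page,
                        const uint8_t *second_page, uint8_t *buf)
{
    size_t offset = (size_t)(rip & (GUEST_PAGE_SIZE - 1));
    size_t got = GUEST_PAGE_SIZE - offset;  /* bytes left in the first page */

    assert(first_page != NULL && buf != NULL);

    if (got > MAX_INST_LEN)
        got = MAX_INST_LEN;
    memcpy(buf, first_page + offset, got);

    /* Only cross the boundary if the next page can actually be read. */
    if (got < MAX_INST_LEN && second_page != NULL) {
        memcpy(buf + got, second_page, MAX_INST_LEN - got);
        got = MAX_INST_LEN;
    }
    return got;
}
```

With the second page absent, a fetch starting at xxxx1FFD yields 3 bytes and one starting at xxxx1FFC yields 4, which are the two cases laid out in the message.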
Keir Fraser
2006-Jun-02 16:40 UTC
Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
On 2 Jun 2006, at 17:34, Petersson, Mats wrote:

> Does anyone see a problem with this method?

I wouldn't trust it. What if you have code running in paged memory (e.g., random privileged userspace process)? Pages can disappear under your feet. I think you need to remember how many bytes you managed to read and do the job thoroughly. It's not that much extra code.

-- Keir
Petersson, Mats
2006-Jun-02 17:07 UTC
RE: [Xen-devel] Fetching instructions after page-fault, near page boundary?
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: 02 June 2006 17:40
> To: Petersson, Mats
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
>
> On 2 Jun 2006, at 17:34, Petersson, Mats wrote:
>
> > Does anyone see a problem with this method?
>
> I wouldn't trust it. What if you have code running in paged
> memory (e.g., random privileged userspace process)? Pages can
> disappear under your feet. I think you need to remember how
> many bytes you managed to read and do the job thoroughly.
> It's not that much extra code.

But that means that we'd have to parse the instruction bytes in Xen (since we can't read them as trivially in QEMU) and figure out how many bytes the instruction is. Since both AMD and Intel have problems with getting the correct number of bytes from the processor during a page-fault intercept, it's no help that Intel SOMETIMES have a correct number of bytes in a VMCS entry...

How do we do it properly if there's a non-present page? Re-inject the page-fault, I guess?

-- Mats
Keir Fraser
2006-Jun-02 17:12 UTC
Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
On 2 Jun 2006, at 18:07, Petersson, Mats wrote:

>> I wouldn't trust it. What if you have code running in paged
>> memory (e.g., random privileged userspace process)? Pages can
>> disappear under your feet. I think you need to remember how
>> many bytes you managed to read and do the job thoroughly.
>> It's not that much extra code.
>
> But that means that we'd have to parse the instruction bytes in Xen
> (since we can't read them as trivially in QEMU) and figure out how many
> bytes the instruction is. Since both AMD and Intel have problems with
> getting the correct number of bytes from the processor during a
> page-fault intercept, it's no help that Intel SOMETIMES have a correct
> number of bytes in a VMCS entry...

Read as many as you can, up to 15. Tell QEMU how many you actually managed to read.

> How do we do it properly, if there's non-present page, re-inject the
> page-fault, I guess?

Just try re-executing the instruction (i.e. directly return to the guest). If the page has become unmapped then the processor should handle the fault on instruction fetch.

-- Keir
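One way to picture the suggestion above: the decoder consumes bytes from a stream whose length is whatever the fetch step managed to read, and if it runs out before it has a complete instruction it asks for the guest to be resumed instead of guessing. This is only a sketch under assumptions; struct insn_stream, read_opcode and the EMUL_* values are hypothetical names, and only prefix skipping is shown rather than a full x86 decode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum emul_result { EMUL_OK, EMUL_RETRY_GUEST };

/* Bytes handed over by the fetch step, plus a read cursor. */
struct insn_stream {
    const uint8_t *bytes;
    size_t len;   /* how many bytes were actually read (at most 15) */
    size_t pos;
};

/* x86 legacy prefixes: LOCK, REPNE, REP, segment overrides, operand/address size. */
static int is_legacy_prefix(uint8_t b)
{
    switch (b) {
    case 0xF0: case 0xF2: case 0xF3:
    case 0x26: case 0x2E: case 0x36: case 0x3E: case 0x64: case 0x65:
    case 0x66: case 0x67:
        return 1;
    default:
        return 0;
    }
}

/*
 * Skip prefixes and return the first opcode byte. If the stream runs dry
 * first, tell the caller to return to the guest and let the processor
 * take any instruction-fetch fault itself.
 */
enum emul_result read_opcode(struct insn_stream *s, uint8_t *opcode)
{
    uint8_t b;

    assert(s != NULL && opcode != NULL);
    do {
        if (s->pos >= s->len)
            return EMUL_RETRY_GUEST;
        b = s->bytes[s->pos++];
    } while (is_legacy_prefix(b));

    *opcode = b;
    return EMUL_OK;
}
```

For the byte sequence F3 A5 (REP MOVS), a stream of length 2 yields opcode A5, while a stream cut to length 1 yields EMUL_RETRY_GUEST.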
Petersson, Mats
2006-Jun-02 17:20 UTC
RE: [Xen-devel] Fetching instructions after page-fault, near page boundary?
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: 02 June 2006 18:13
> To: Petersson, Mats
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
>
> On 2 Jun 2006, at 18:07, Petersson, Mats wrote:
>
> >> I wouldn't trust it. What if you have code running in paged memory
> >> (e.g., random privileged userspace process)? Pages can disappear
> >> under your feet. I think you need to remember how many bytes you
> >> managed to read and do the job thoroughly.
> >> It's not that much extra code.
> >
> > But that means that we'd have to parse the instruction bytes in Xen
> > (since we can't read them as trivially in QEMU) and figure out how
> > many bytes the instruction is. Since both AMD and Intel have problems
> > with getting the correct number of bytes from the processor during a
> > page-fault intercept, it's no help that Intel SOMETIMES have a correct
> > number of bytes in a VMCS entry...
>
> Read as many as you can, up to 15. Tell QEMU how many you
> actually managed to read.

That was my original plan [telling how many I got, that is].

> > How do we do it properly, if there's non-present page, re-inject the
> > page-fault, I guess?
>
> Just try re-executing the instruction (i.e. directly return
> to the guest). If the page has become unmapped then the
> processor should handle the fault on instruction fetch.

Ok, that approach makes more sense than my silly ideas of counting instruction bytes... And hopefully the code that removed our very much needed page will eventually let us actually emulate the instruction at some point, without too many re-executions... ;-)

-- Mats
Keir Fraser
2006-Jun-02 18:50 UTC
Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
On 2 Jun 2006, at 18:20, Petersson, Mats wrote:

>> Just try re-executing the instruction (i.e. directly return
>> to the guest). If the page has become unmapped then the
>> processor should handle the fault on instruction fetch.
>
> Ok, that approach makes more sense than my silly ideas of counting
> instruction bytes... And hopefully the code that removed our very much
> needed page will eventually let us actually emulate the instruction at
> some point, without too many re-executions... ;-)

Bear in mind that we need to be able to inject page faults into the guest from the emulator anyway, for other reasons. For example, consider INSB/OUTSB -- the memory area being transferred to/from may be paged out. Current HVM MMIO code is rather lax about dealing with this (i.e., it doesn't -- it ignores error returns from gva_to_gpa(), which itself has a bogus error value anyway (0 is a valid pa)). Given we need the code, we may just want to inject faults for instruction-fetch errors too, but we do have a choice for those.

-- Keir
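The point about the error value can be illustrated with a translation helper that reports success separately from the address, so that physical address 0 stays a legitimate result. This is a toy model, not the Xen gva_to_gpa(): the flat toy_pte table, toy_gva_to_gpa and the constants are all hypothetical and stand in for a real page-table walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_PAGE_SHIFT 12
#define OFFSET_MASK      ((UINT64_C(1) << GUEST_PAGE_SHIFT) - 1)

/* Hypothetical flat mapping: one entry per guest-virtual page. */
struct toy_pte {
    bool present;
    uint64_t frame;   /* guest-physical frame number */
};

/*
 * Translate gva to gpa. Returns false if the page is not mapped, in which
 * case *gpa is left untouched and the caller can inject a page fault.
 * Success is never encoded in the address itself, so gpa 0 is valid.
 */
bool toy_gva_to_gpa(const struct toy_pte *table, size_t nr_pages,
                    uint64_t gva, uint64_t *gpa)
{
    uint64_t vpn = gva >> GUEST_PAGE_SHIFT;

    assert(table != NULL && gpa != NULL);
    if (vpn >= nr_pages || !table[vpn].present)
        return false;

    *gpa = (table[vpn].frame << GUEST_PAGE_SHIFT) | (gva & OFFSET_MASK);
    return true;
}
```

A caller handling INSB/OUTSB would check the return value for the buffer address and raise a fault in the guest on failure, rather than carrying on with a bogus address.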
Petersson, Mats
2006-Jun-02 19:04 UTC
RE: [Xen-devel] Fetching instructions after page-fault, near page boundary?
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: 02 June 2006 19:51
> To: Petersson, Mats
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
>
> On 2 Jun 2006, at 18:20, Petersson, Mats wrote:
>
> >> Just try re-executing the instruction (i.e. directly return to the
> >> guest). If the page has become unmapped then the processor should
> >> handle the fault on instruction fetch.
> >
> > Ok, that approach makes more sense than my silly ideas of counting
> > instruction bytes... And hopefully the code that removed our very much
> > needed page will eventually let us actually emulate the instruction at
> > some point, without too many re-executions... ;-)
>
> Bear in mind that we need to be able to inject page faults
> into the guest from the emulator anyway, for other reasons.
> For example, consider INSB/OUTSB -- the memory area being
> transferred to/from may be paged out. Current HVM MMIO code
> is rather lax about dealing with this (i.e., it doesn't -- it
> ignores error returns from gva_to_gpa(), which itself has a
> bogus error value anyway (0 is a valid pa)). Given we need
> the code, we may just want to inject faults for
> instruction-fetch errors too, but we do have a choice for those.

I was initially going to ignore IN/OUT instructions, because:

1. They are fairly "OK" to handle in SVM/VMX, because there's very little parsing that needs to be done, since most of the info is in VMC[BS] already - relative to page-fault MMIO handling at least... I think it's only the target memory address that isn't being handled by the chip itself. [In fact, we even get the next EIP given to us in the VMCB on this one, so no need to figure out how long the instruction is, and we only need to scan for prefixes if we see an unusual-length instruction].

2. They are not currently supported by x86_emulate.c anyway - so there's no apparent duplicated code - except for the SVM and VMX code being near-identical copies of each other - or at least they were before I re-arranged ours... ;-)

It is of course broken wrt page-faults on the destination address. Most drivers that do INS/OUTS, however, would be doing so in response to an interrupt, and if it's done IN the interrupt handler, then it's going to be to a non-pageable memory region. But it could of course be deferred to a lower priority level and thus be to a pageable memory region... Which would make for "interesting" crashes on an SMP system where that page is replaced by something completely different... <Kaboom> - can see bits flying every direction...

-- Mats
Anthony Liguori
2006-Jun-02 20:16 UTC
Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
I would think you would not only have to worry about crossing page boundaries, but also crossing a segment descriptor limit. These days, segmentation is used for some security purposes (to emulate an NX bit for instance).

Regards,

Anthony Liguori
Petersson, Mats
2006-Jun-02 20:29 UTC
RE: [Xen-devel] Fetching instructions after page-fault, near page boundary?
> -----Original Message-----
> From: Anthony Liguori [mailto:aliguori@us.ibm.com]
> Sent: 02 June 2006 21:16
> To: Petersson, Mats
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
>
> I would think you would not only have to worry about crossing
> page boundaries, but also crossing a segment descriptor
> limit. These days, segmentation is used for some security
> purposes (to emulate a NX bit for instance).

Ah, yet another place where segments show their "ugly" head.... And the current code is not doing this very well... In fact, it assumes that non-real-mode segments have base=0 and that the limit "is big enough".

Although, in a normal system, that sort of violation would be caught by the processor itself [GP faulting the instruction] before we get the page-fault, unless:

1. Someone is modifying the instructions we're emulating - and that would have to be done at exactly the right time, while the page-fault is in transit in Xen but before Xen has read the data from the page - which I'm sure someone can figure out how to do [it's actually several thousand cycles, so it's not exactly a tiny hole as such], but it's not exactly the most likely attack scenario I can think of.

2. Someone is updating the descriptor tables between the processor executing the original trapping instruction, and us emulating the same instruction.

However, I think we should START this project [moving x86_emulate_memop() into QEMU] by aiming to achieve something that is better than the current solution - not fill every hole and gap possible all in one go. So do you think it's fair to say that we can make a note of this lack of security and ignore it for now? [Otherwise, I fear that I will be moved to another project before I even get a chance to finish this project].

-- Mats
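For reference, the check that the current code skips (by assuming base=0 and a big enough limit) could look roughly like the following for a protected-mode code segment. This is a sketch under assumptions: struct seg_cache and both function names are hypothetical, the limit is taken to be the last valid offset after granularity scaling, and 64-bit mode is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached view of CS for a 32-bit protected-mode guest. */
struct seg_cache {
    uint32_t base;
    uint32_t limit;   /* last valid offset, inclusive, already scaled */
};

/* How many of the `want` bytes at CS:eip lie within the segment limit. */
size_t bytes_within_limit(const struct seg_cache *cs, uint32_t eip, size_t want)
{
    uint64_t room;

    assert(cs != NULL);
    if (eip > cs->limit)
        return 0;                          /* the fetch itself would #GP */
    room = (uint64_t)cs->limit - eip + 1;  /* 64-bit to avoid wrap at 4GB */
    return room < want ? (size_t)room : want;
}

/* Linear address of CS:eip; wraps at 32 bits as the unsigned add does. */
uint32_t linear_address(const struct seg_cache *cs, uint32_t eip)
{
    assert(cs != NULL);
    return cs->base + eip;
}
```

The fetch length would then be the smaller of this value and whatever the page-boundary check allows, and the bytes would be read from linear_address() rather than from the raw EIP.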
Anthony Liguori
2006-Jun-02 20:35 UTC
Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
Petersson, Mats wrote:

> Ah, yet another place where segments show their "ugly" head.... And the
> current code is not doing this very well... In fact, it assumes that
> non-real-mode segments have base=0 and that the limit "is big enough".
>
> Although, in a normal system, that sort of violation would be caught by
> the processor itself [GP faulting the instruction] before we get the
> page-fault, unless:
> 1. Someone is modifying the instructions we're emulating - and that
> would have to be done at exactly the right time for the page-fault to be
> in transit in Xen, but not yet read the data from the page - which I'm
> sure someone can figure out how to do [it's actually several thousand
> cycles, so it's not exactly a tiny hole as such], but it's not exactly
> the most likely attack scenario I can think of.
>
> 2. Someone is updating the descriptor tables between the processor
> executing the original trapping instruction, and us emulating the same
> instruction.
>
> However, I think we should START this project [moving
> x86_emulate_memop() into QEMU] by aiming to achieve something that is
> better than the current solution - not fill every hole and gap possible
> all in one go. So do you think it's fair to say that we can make a note
> of this lack of security and ignore it for now? [Otherwise, I fear that
> I will be moved to another project before I even get a chance to finish
> this project].

heh, sure, I think that's fine :-) I haven't been able to think of a way to actually exploit this FWIW. It's certainly a correctness issue so we should eventually address it but for now I think it's fine to just sweep it under the table.

Regards,

Anthony Liguori
Regarding the suggestion to use the container_of() macro to fish out more complex context for QEMU from the ops callback functions, do you have a suggestion of where the container_of() macro should be copied/moved to? At the moment, I've just made a copy of it in helper2 where I've currently put all new code - but that's not a particularly good way to reuse an existing macro... On the other hand, I don't think we want to include kernel.h into QEMU? Although looking further, it doesn't look like .../xen/include/xen/kernel.h contains ANYTHING kernel, just a few fairly generic macros... Maybe renaming the file is a better solution [or moving everything in it into a different file - leaving this one empty-ish]?

-- Mats
On 2 Jun 2006, at 22:39, Petersson, Mats wrote:

> Regarding the suggestion to use container_of() macro to fish out more
> complex context for QEMU from the ops callback functions, do you have a
> suggestion of where the container_of() macro should be copied/moved to.
> At the moment, I've just made a copy of it in helper2 where I've
> currently put all new code - but that's not a particulary good way to
> reuse an existing macro... On the other hand, I don't think we want to
> include kernel.h into QEMU?

Copying it in helper2.c for now is fine. It's a very simple generic piece of C trickery.

-- Keir
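For readers who have not met the trick, a minimal sketch of container_of() being used to get from an ops pointer back to an enclosing context follows. The macro here is a portable offsetof-based variant without the typeof type check found in the Linux/Xen version; struct emulate_ops, struct qemu_emul_ctxt and ctx_read are hypothetical names invented for the example, not the actual x86_emulate or QEMU structures.

```c
#include <assert.h>
#include <stddef.h>

/* Portable variant: step back from a member pointer to its container. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical callback table handed to the emulator. */
struct emulate_ops {
    int (*read)(struct emulate_ops *ops, unsigned long addr,
                unsigned long *val);
};

/* Hypothetical QEMU-side context that embeds the ops table. */
struct qemu_emul_ctxt {
    int vcpu_id;
    struct emulate_ops ops;
    unsigned long last_addr;
};

/* The callback only receives `ops`, yet can reach the whole context. */
static int ctx_read(struct emulate_ops *ops, unsigned long addr,
                    unsigned long *val)
{
    struct qemu_emul_ctxt *ctxt;

    assert(ops != NULL && val != NULL);
    ctxt = container_of(ops, struct qemu_emul_ctxt, ops);
    ctxt->last_addr = addr;
    *val = (unsigned long)ctxt->vcpu_id;
    return 0;
}
```

Because the ops table is embedded in the context rather than pointed to, no extra opaque pointer needs to be threaded through the emulator's callback signatures.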
Keir Fraser
2006-Jun-03 08:53 UTC
Re: [Xen-devel] Fetching instructions after page-fault, near page boundary?
On 2 Jun 2006, at 20:04, Petersson, Mats wrote:

> 2. They are not currently supported by x86_emulate.c anyways - so
> there's no apparent duplicated code - except that SVM and VMX code
> being near-identical copies of each other - or at least they were
> before I re-arranged ours... ;-)

What about MOVS? :-)

-- Keir