Sina Bahram
2008-Dec-11 02:59 UTC
[Xen-devel] paging and shadow paging in xen: trying to implement split memory
Hi all,

I've been reading through the paging code, spending a lot of time in mm/*.* as well as some of the other parts up a level or two, but I'm still unclear about some key things.

Here's what I think I know:

I think I know how a domain's shadow page table is first allocated (e.g. the hash_table is xmalloc'ed) and when it is destroyed (e.g. xfree'ed).

I believe I have identified the functions where a shadow page is inserted and deleted, with all the TLB modifications that go along with that.

I'm semi-comfortable with the format of the shadow page table itself. In 32-bit PAE, it follows the 2, 9, 9, 12 format.

Some questions:

Why do shadow page tables exist in xen for pv guests? What is their purpose, and how do pv guests interact with them?

How does one activate this?

Does one have to have pae enabled for 32-bit pv guests? I thought I read that I do, but when I look at the source, the classic 10, 10, 12 format for page tables is supported. Is that not supported for shadow page tables, and if so ... How can I learn more about this?

* general goal *

Here is what I'm trying to do, in a finite way.

I'd like to add a structure, for now a reference in the paging struct would be fine, let's call it hash_table2 for lack of a better name.

I'd like to mirror all operations to the page table, to hash_table, in my hash_table2.

Now to the purpose of why I'm doing this.

I'd like to make it so that if a page is accessed with the supervisory bit set, I direct all reads and writes to the original hash_table, but direct all executes to hash_table2, or vice versa; it hardly matters which one gets which.

Eventually I'd like to not even mirror pages that are just data (read and write only) or just code (execute only).

Again, I only want to do this for pages with the supervisory bit set, so as to only affect the kernel's pages. That's the kernel of the pv guest.
In this way, I hope to implement split memory as a way of preventing certain attacks on the guest.

Is there anyone I can speak to about this, perhaps over detailed emails, IM, or even phone?

Thanks so much

Take care,
Sina

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
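For readers comparing the two layouts mentioned in the message above, here is a minimal sketch of how a 32-bit virtual address is carved up under PAE (2, 9, 9, 12) versus classic non-PAE paging (10, 10, 12). The function names are hypothetical and chosen only for illustration; nothing here is taken from the Xen source.

```python
def split_pae(vaddr):
    """Split a 32-bit virtual address using the PAE 2/9/9/12 layout."""
    vaddr &= 0xFFFFFFFF
    l3 = (vaddr >> 30) & 0x3      # 2 bits: top-level (L3) index
    l2 = (vaddr >> 21) & 0x1FF    # 9 bits: page-directory (L2) index
    l1 = (vaddr >> 12) & 0x1FF    # 9 bits: page-table (L1) index
    offset = vaddr & 0xFFF        # 12 bits: byte offset within a 4 KiB page
    return l3, l2, l1, offset


def split_non_pae(vaddr):
    """Split a 32-bit virtual address using the classic 10/10/12 layout."""
    vaddr &= 0xFFFFFFFF
    l2 = (vaddr >> 22) & 0x3FF    # 10 bits: page-directory index
    l1 = (vaddr >> 12) & 0x3FF    # 10 bits: page-table index
    offset = vaddr & 0xFFF        # 12 bits: byte offset within a 4 KiB page
    return l2, l1, offset
```

Both layouts cover the same 32 bits; PAE spends them on three levels with 9-bit indices because its entries are 8 bytes wide, so only 512 of them fit in a 4 KiB table.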
Tim Deegan
2008-Dec-11 10:20 UTC
Re: [Xen-devel] paging and shadow paging in xen: trying to implement split memory
Hi,

At 21:59 -0500 on 10 Dec (1228946361), Sina Bahram wrote:
> Why do shadow page tables exist in xen for pv guests? What is their purpose,
> and how do pv guests interact with them?

They're used in live migration, to track which of the guest's pages have been written to. It's described in the paper I mentioned before:
http://www.cl.cam.ac.uk/research/srg/netos/papers/2005-nsdi-migration.pdf

> How does one activate this?

Using the XEN_DOMCTL_shadow_op domctl (see xen/include/public/domctl.h for details and the uses of xc_shadow_control() in libxc for examples).

> Does one have to have pae enabled for 32-bit pv guests?

Yes.

> I thought I read
> that I do, but when I look at the source, the classic 10, 10, 12 format for
> page tables is supported. Is that not supported for shadow page tables, and
> if so ... How can I learn more about this?

The shadow pagetable code does support non-PAE paging, because it has to handle HVM guests, which can't be constrained to particular paging behaviour.

> * general goal *
>
> Here is what I'm trying to do, in a finite way.
>
> I'd like to add a structure, for now a reference in the paging struct would
> be fine, let's call it hash_table2 for lack of a better name.
>
> I'd like to mirror all operations to the page table, to hash_table, in my
> hash_table2.

I'm not sure what hash table you're talking about here. The hash table in the shadow code just contains the list of which shadows there are of a guest pagetable, not any page permissions or such.

> Now to the purpose of why I'm doing this.
>
> I'd like to make it so that if a page is accessed, with the supervisory bit
> set, I direct all reads and writes to the original hash_table, but I want to
> direct all executes to hash_table2, or vice versa, that hardly matters which
> one gets which.
>
> Eventually I'd like to not even mirror pages that are just data (read and
> write only) or just code (execute only).
>
> Again, I only want to do this for pages with supervisory bit set so as to
> only affect the kernel's pages. That's the kernel of the pv guest.
>
> In this way, I hope to implement split memory as a way of preventing certain
> attacks to the guest.

Are you thinking of building two sets of shadow pagetables, one with only execute permissions and one with only write permissions? The CPU only ever uses one set of pagetables at a time, so you'd never be able to use the non-executable one.

I think it makes more sense to have just one set of shadow pagetables but switch the individual mappings of a page back and forth.

In fact, getting into the shadow pagetables is probably just making life difficult for yourself; if you can use a recent AMD processor that supports NPT, you could just change the p2m map back and forth, and use the nested-pagefault handler to know when to make the change. Much simpler, and easier to get right.

By the way, it's not possible using x86 pagetables to have a page that's executable but not readable.

> Is there anyone I can speak to about this, perhaps over detailed emails, IM,
> or even phone?

This mailing list (xen-devel) is the best place to discuss implementation details.

Cheers,

Tim.

--
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]
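The dirty-page tracking described in the reply above can be pictured with a small simulation. This is only a sketch under stated assumptions: it models the general idea of write-protecting pages and recording the first write fault on each one, and the class and method names (LogDirtyTracker, guest_write, collect_and_clean) are hypothetical, not Xen or libxc APIs.

```python
class LogDirtyTracker:
    """Toy model of dirty-page tracking via write-protected shadow entries."""

    def __init__(self, num_pages):
        # Every page starts write-protected in the (simulated) shadow tables.
        self.writable = [False] * num_pages
        self.dirty = set()

    def guest_write(self, pfn):
        """Simulate a guest write; return True if it took a write fault."""
        if not self.writable[pfn]:
            # Fault path: record the page as dirty, then allow further
            # writes to proceed without faulting again.
            self.dirty.add(pfn)
            self.writable[pfn] = True
            return True
        return False

    def collect_and_clean(self):
        """Return the dirty pages and re-protect them for the next round."""
        dirty = sorted(self.dirty)
        for pfn in dirty:
            self.writable[pfn] = False
        self.dirty.clear()
        return dirty
```

A migration tool in this model would call collect_and_clean() once per copy round and resend only the pages it returns; only the first write to a page in each round pays the cost of a fault.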
Sina Bahram
2008-Dec-11 17:50 UTC
RE: [Xen-devel] paging and shadow paging in xen: trying to implement split memory
Some things are clearer now, so thank you for that.

To elaborate, what I would like to do is direct all reads and writes from a guest to one page table and all executes to another. It doesn't matter whether the page is readable or not, because I would be directing all "read and write operations" to one page and all "execute" operations to another page.

What you mentioned with swapping the p2m mapping sounds like exactly what I want to do, but I'm hoping it's not constrained to AMD. Can I not do this inside of Xen, making it transparent to the pv guest?

How would I go about doing this?

Again, would trying to do this for HVM guests be easier, or even possible, because of the layer of abstraction?

Take care,
Sina
Tim Deegan
2008-Dec-12 10:11 UTC
Re: [Xen-devel] paging and shadow paging in xen: trying to implement split memory
At 12:50 -0500 on 11 Dec (1228999819), Sina Bahram wrote:
> To elaborate, what I would like to do is direct all reads and writes from a
> guest to one page table and all executes to another. It doesn't matter
> whether the page is readable or not, because I would be directing all "read
> and write operations" to one page and all "execute" operations to another
> page.

OK, as I said, there's no way of making a page executable but not readable, so I think

> What you mentioned with swapping the p2m mapping sounds like exactly what I
> want to do, but I'm hoping it's not constrained to AMD. Can I not do this
> inside of Xen, making it transparent to the pv guest?

Sorry, I should have said - that mechanism only applies to HVM guests, not PV ones. It would indeed be transparent to the guest. The Intel equivalent will be available in the "Nehalem" processor line, which I think is coming out early next year; support for it is already in Xen from version 3.3 on.

> How would I go about doing this?

In the p2m tree (arch/x86/mm/p2m.c), mark every page as non-executable. I think a little bit of tinkering with p2m_change_type_global() would work. Then in the NPT nested pagefault handler (arch/x86/hvm/svm/svm.c) when the guest faults trying to execute a page, call back to the p2m code to change its permissions so that page becomes executable but not writeable. Likewise, for write faults, switch the page back to being writeable but not executable.

Three drawbacks:

- code that writes to the page it's on (self-modifying code or on-stack trampolines) will just spin forever unless you do something cunning like emulate the instruction.
- when the page is executable it will also be readable. As I said there's no way of specifying that a page should be executable but not readable. (Intel's EPT spec will let you request that combination but the actual processors "may choose not to support" it.)
- this affects _all_ memory, not just kernel memory. Since it's dealing with physical memory you can't easily tell which frames contain user-space data and which are kernel.

Cheers,

Tim.
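The fault-driven permission flipping suggested in the message above, along with its first drawback, can be sketched as a small state machine. This is a toy model under stated assumptions, not Xen code: SplitP2M, run_instruction and the retry limit are hypothetical names introduced for illustration, and the real work would live in the p2m code and the nested pagefault handler named in the message.

```python
class SplitP2M:
    """Toy model: each guest frame is either read/write or read/execute."""

    def __init__(self, num_frames):
        # False => frame is read/write (not executable);
        # True  => frame is read/execute (not writeable).
        # Reads are allowed in both states, matching the point that an
        # executable page is always readable as well.
        self.executable = [False] * num_frames
        self.faults = 0

    def _nested_fault(self, gfn, make_executable):
        # Stand-in for the nested pagefault handler calling back into
        # the p2m code to flip the frame's permissions.
        self.faults += 1
        self.executable[gfn] = make_executable

    def run_instruction(self, code_gfn, write_gfn=None, max_retries=16):
        """Retry an instruction until it completes or the retry limit hits.

        Returns True if the instruction completed, False if it was still
        faulting after max_retries attempts.
        """
        for _ in range(max_retries):
            if not self.executable[code_gfn]:
                self._nested_fault(code_gfn, True)    # execute fault
                continue                              # re-run the instruction
            if write_gfn is not None and self.executable[write_gfn]:
                self._nested_fault(write_gfn, False)  # write fault
                continue                              # re-run the instruction
            return True
        return False
```

An instruction that executes from one frame and writes to another settles after at most two faults. An instruction that writes to the frame it is executing from (code_gfn == write_gfn) flips the frame back and forth on every retry and never completes, which is the "spin forever" case; the retry limit exists only so the sketch terminates.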
Sina Bahram
2008-Dec-12 15:02 UTC
RE: [Xen-devel] paging and shadow paging in xen: trying to implement split memory
Thanks for your response.

Take care,
Sina