Hi,

I'm developing my own 32-bit (no PAE) paravirtualized kernel for Xen with Mini-OS as a starting point. I am currently working on process page table support (the equivalent of arch/i386/mm/pgtable-xen.c) and mostly following Linux for the moment. I noticed that linux-2.6.18-xen never pins an L1 table (a pte page), yet __pgd_pin() walks the page directory, gives up write access on the kernel mappings of the pte pages, and pins only the pgd page. How do the set_pte() and set_pte_at() macros work if they write directly to the page table entries? Do we fault in the kernel to handle this?

Thanks,
satya.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
PTEs end up pinned by virtue of being referenced via a pinned PGD. When a PGD is pinned, Xen walks the whole pagetable structure.

 -- Keir

On 16/10/07 15:44, "Satya" <satyakiran@gmail.com> wrote:
> I noticed that linux-2.6.18-xen never pins an L1 table (a pte), yet
> __pgd_pin() walks the page directory and gives up write access on the
> kernel mappings of pte pages and only pins the pgd page. How do set_pte()
> and set_pte_at() macros work if they are writing directly to the page
> table entries? Do we fault in the kernel to handle this?
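To make the "Xen walks the whole pagetable structure" point concrete, here is a minimal toy model in C. It is not Xen code: the `toy_*` names, the 4-entry tables, and the single rule that no L1 entry may map a pagetable frame writably are all assumptions chosen for illustration. It only shows why pinning the top level cannot succeed until every L1 below it passes validation.

```c
/* Toy model only: none of these names or layouts are Xen's. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENTRIES 4      /* real non-PAE x86 tables have 1024 entries */
#define TOY_PRESENT 0x1u
#define TOY_RW      0x2u

enum toy_type { TOY_DATA, TOY_L1, TOY_L2 };

struct toy_frame {
    enum toy_type type;          /* what the guest uses this frame for */
    bool pinned;                 /* validated and locked to that use */
    uint32_t entry[TOY_ENTRIES]; /* (frame index << 12) | flags */
};

static size_t toy_target(uint32_t e) { return e >> 12; }

/* Assumed rule: an L1 entry must not map a pagetable frame writably. */
static bool toy_l1_ok(const struct toy_frame *mem, size_t nframes,
                      const struct toy_frame *l1)
{
    for (size_t i = 0; i < TOY_ENTRIES; i++) {
        uint32_t e = l1->entry[i];
        if (!(e & TOY_PRESENT))
            continue;
        if (toy_target(e) >= nframes)
            return false;
        if (mem[toy_target(e)].type != TOY_DATA && (e & TOY_RW))
            return false;
    }
    return true;
}

/* Pin an L2: every L1 it references is validated first, then pinned too. */
static bool toy_pin_l2(struct toy_frame *mem, size_t nframes, size_t l2)
{
    assert(l2 < nframes);
    if (mem[l2].type != TOY_L2)
        return false;

    /* Pass 1: validate the whole tree without changing any state. */
    for (size_t i = 0; i < TOY_ENTRIES; i++) {
        uint32_t e = mem[l2].entry[i];
        if (!(e & TOY_PRESENT))
            continue;
        size_t t = toy_target(e);
        if (t >= nframes || mem[t].type != TOY_L1 ||
            !toy_l1_ok(mem, nframes, &mem[t]))
            return false;
    }

    /* Pass 2: everything checked out, so mark the L1s and the L2 pinned. */
    for (size_t i = 0; i < TOY_ENTRIES; i++) {
        uint32_t e = mem[l2].entry[i];
        if (e & TOY_PRESENT)
            mem[toy_target(e)].pinned = true;
    }
    mem[l2].pinned = true;
    return true;
}
```

In this model a guest that still holds a writable mapping of its own L1 page gets a failed pin, which mirrors why __pgd_pin() first gives up write access on the kernel mappings of the pte pages.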
> How do set_pte() and set_pte_at() macros work if they are writing directly
> to the page table entries? Do we fault in the kernel to handle this?

Xen catches the faults on writes to pagetables. In more recent versions of Xen, it traps each write and emulates it. In older versions, it temporarily unhooks the pagetable, allowing the guest to write directly to it.

There's an explicit pagetable update API that lets guests batch changes to pagetables, rather than relying on trap-and-emulate, when there is a large group of changes to be made.

Cheers,
Mark

-- 
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
On 10/16/07, Mark Williamson <mark.williamson@cl.cam.ac.uk> wrote:
> Xen catches the faults on writing to pagetables. In more recent versions of
> Xen, it traps each write and emulates it. In older versions, it will unhook
> the pagetable temporarily, allowing the guest to write directly to it.

Does that need a vm_assist() call to enable writable page tables, or is this the default? Yes, I am using an older version of Xen (Xen 3.0).

> There's an explicit pagetable update API for guests to batch changes to
> pagetables rather than using trap-and-emulate if there is a large group of
> changes to be made.

I plan to use the HYPERVISOR_mmu_update() call to batch my pte changes. So, going by Keir's reply, I guess I have to use this hypercall in my set_pte() function that modifies a pte entry, even though I didn't explicitly issue an L1_PIN request to the hypervisor.

What's troubling me is that linux-2.6.18-xen writes to the pte entry directly by dereferencing a ptep! I think I am missing something here.

Thanks,
satya.
> Does that need a vm_assist() call to enable writable page tables? or is
> this the default? Yes I am using an older version of Xen (Xen 3.0).

A vm_assist() is required to enable "writeable pagetables", yes.

> I plan to use HYPERVISOR_mmu_update() call to batch my pte changes. So
> going by Keir's reply I guess I have to use this hypercall in my set_pte()
> function that modifies a pte entry - even though I didn't explicitly issue
> an L1_PIN request to the hypervisor.

That sounds about right; pagetables are pinned recursively - you can't pin an L2 table without implicitly pinning all its children. This is because the validity / safety of an L2 table's contents depends implicitly on the contents of the L1 tables as well. Pinning validates the pagetable as conforming to the constraints required by Xen; it wouldn't make sense to validate an L2 table without checking that the ptes in its children also conform to these constraints. So that's the rationale for this behaviour.

> What's troubling me is that linux-2.6.18-xen writes to the pte entry
> directly by dereferencing a ptep! I think I am missing something here.

You're allowed to do that, once you've activated writeable pagetable mode. Your Xen 3.0 release will then do something like the following:

1) verify that you're writing to an L1 pagetable, and unhook it from its parent L2 table
2) make the page writeable so that the write can succeed

The guest will run for a bit and may now issue further writes without trapping into Xen. If the guest tries to access a virtual memory address within the range covered by that L1 table then it'll cause a fault during the translation process. This will trap back into Xen, which will:

3) notice that the fault was caused by the writeable pagetable code having unhooked the pagetable
4) make the page read-only again and revalidate all the changes in that pagetable at once
5) hook the L1 page back into the pagetable structure
6) retry the faulting instruction in the guest; it should now succeed

This process makes it possible for a guest to get the illusion of writing directly to the pagetables but also to benefit from batching of update operations when many changes to the pagetables are being made.

More recent versions of Xen provide the same interface to the guest, but implement a trap-and-emulate approach: Xen traps the write faults and individually validates each attempt to update the pagetables before updating them on behalf of the guest. This is faster in the common case of the guest making a small number of updates. The guest can use the explicit batching interface when it wants to update a larger range.

I hope this helps clarify things a bit.

If you don't mind me asking, why are you using such an old version of Xen?

Cheers,
Mark
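The six steps above can be sketched as a small state machine. This is a toy model, not Xen's implementation: the `wp_*` names, the snapshot-and-compare revalidation, the validity rule, and the choice to revert rejected entries are all assumptions made for illustration (what a real hypervisor does with an invalid entry is not covered in this thread).

```c
/* Toy model of the unhook / revalidate cycle; not Xen code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define WP_ENTRIES  4
#define WP_PRESENT  0x1u
#define WP_RW       0x2u
#define WP_PT_FRAME 7u   /* hypothetical frame number of this L1 page */

struct wp_l1 {
    uint32_t entry[WP_ENTRIES];
    uint32_t snapshot[WP_ENTRIES]; /* contents at the time of unhooking */
    bool hooked;                   /* referenced from the parent L2 */
    bool writable;                 /* guest may store to the page directly */
    int traps;                     /* how often the "hypervisor" was entered */
};

/* Assumed rule: the table must not map its own frame writably. */
static bool wp_entry_ok(uint32_t e)
{
    if (!(e & WP_PRESENT))
        return true;
    return !((e >> 12) == WP_PT_FRAME && (e & WP_RW));
}

/* Guest stores to a pte. Only the first store after a rehook traps. */
static void wp_guest_write(struct wp_l1 *t, unsigned idx, uint32_t val)
{
    assert(idx < WP_ENTRIES);
    if (!t->writable) {
        t->traps++;
        memcpy(t->snapshot, t->entry, sizeof t->entry);
        t->hooked = false;   /* step 1: unhook from the parent L2 */
        t->writable = true;  /* step 2: let the write go through */
    }
    t->entry[idx] = val;     /* direct write, no trap */
}

/* Guest touches an address covered by this L1. Returns rejected entries. */
static unsigned wp_guest_access(struct wp_l1 *t)
{
    unsigned rejected = 0;

    if (t->hooked)
        return 0;            /* translation works, nothing to do */

    t->traps++;              /* step 3: fault caused by the unhooking */
    t->writable = false;     /* step 4: read-only again, then revalidate */
    for (unsigned i = 0; i < WP_ENTRIES; i++) {
        if (t->entry[i] != t->snapshot[i] && !wp_entry_ok(t->entry[i])) {
            t->entry[i] = t->snapshot[i]; /* toy choice: revert bad entry */
            rejected++;
        }
    }
    t->hooked = true;        /* step 5: hook back into the L2 */
    return rejected;         /* step 6: the caller retries the access */
}
```

Three consecutive calls to `wp_guest_write()` cost one trap in this model, and the revalidation of all three happens in the single later trap from `wp_guest_access()`, which is the batching effect described above.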
Jeremy Fitzhardinge
2007-Oct-16 19:57 UTC
Re: [Xen-devel] Xeno Linux never pins L1 tables ?
Satya wrote:
> I noticed that linux-2.6.18-xen never pins an L1 table (a pte), yet
> __pgd_pin() walks the page directory and gives up write access on the
> kernel mappings of pte pages and only pins the pgd page.

Pinning the top level of a pagetable implicitly pins all the lower levels, so they are all pinned.

    J
On 10/16/07, Mark Williamson <mark.williamson@cl.cam.ac.uk> wrote:
> I hope this helps clarify things a bit.

Yep. That clears up a lot. I'll just use a hypercall to update page table entries and not write to them directly.

> If you don't mind me asking, why are you using such an old version of Xen?

Well, we are building a research OS at our university, and the reasons for choosing xen-3.0 are really trivial. For example, this version installs from a binary (Debian package) without PAE support. (For one, we don't want to deal with 3-level page tables at this point, to keep things simple; and we are going to move to 64-bit anyway after this initial prototype.) I could compile newer versions without PAE; I am just being lazy. We're not concerned with efficiency at this time. Anything that works is fine for us.

Thanks,
satya.

-- 
http://cs.uic.edu/~spopuri
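As a rough illustration of the "use a hypercall in set_pte()" plan, the sketch below queues pte updates and submits them in batches. Everything here is hypothetical: `toy_mmu_update()` is a local stand-in that just applies the writes and counts calls, and the request layout is simplified. In a real guest the flush would go through HYPERVISOR_mmu_update(), whose request format and arguments should be taken from the Xen public headers rather than from this sketch.

```c
/* Toy batching layer; toy_mmu_update() is a stand-in, not a hypercall. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define Q_MAX 8   /* assumed queue depth */

struct toy_update {
    uint32_t *ptr;  /* which pte to change (a plain pointer in this model) */
    uint32_t val;   /* new pte value */
};

static unsigned toy_hypercalls;  /* counts simulated hypercalls */

/* Stand-in for the hypervisor side: apply every request in the batch. */
static int toy_mmu_update(const struct toy_update *req, size_t count,
                          size_t *done)
{
    toy_hypercalls++;
    for (size_t i = 0; i < count; i++)
        *req[i].ptr = req[i].val;
    *done = count;
    return 0;
}

static struct toy_update queue[Q_MAX];
static size_t queued;

/* Submit everything queued so far in one call. */
static int toy_flush(void)
{
    size_t done = 0;
    int rc;

    if (queued == 0)
        return 0;
    rc = toy_mmu_update(queue, queued, &done);
    if (rc == 0 && done != queued)
        rc = -1;
    queued = 0;
    return rc;
}

/* set_pte() replacement: queue the update, flushing first when full. */
static int toy_set_pte(uint32_t *ptep, uint32_t val)
{
    if (queued == Q_MAX) {
        int rc = toy_flush();
        if (rc != 0)
            return rc;
    }
    queue[queued].ptr = ptep;
    queue[queued].val = val;
    queued++;
    return 0;
}
```

One consequence of this design is that a queued update is not visible until `toy_flush()` runs, so a caller has to flush before relying on the new mapping. Issuing one hypercall per set_pte() is simpler, which may be preferable when efficiency is not a concern.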
On 10/16/07, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Pinning the top level of a pagetable implicitly pins all the lower
> levels, so they are all pinned.

I wonder what happens if an L1 page is created *after* a PGD is pinned. I think I have to pin the L1 page explicitly then? Or unlikely( does Xen pin it when I map it into the pinned PGD using a hypercall )? :)

Thanks,
satya.

-- 
http://cs.uic.edu/~spopuri
Jeremy Fitzhardinge
2007-Oct-16 21:18 UTC
Re: [Xen-devel] Xeno Linux never pins L1 tables ?
Satya wrote:
> Wonder what happens if an L1 page is created *after* a PGD is pinned ?
> I think I have to explicitly pin the L1 page then? or unlikely( does
> Xen pin it when I map it into the pinned PGD using a hypercall )

When you update the L2 entry to point to the L1, Xen checks to make sure the L1 page is all OK (read-only, valid contents), then pins it to the rest of the pagetable.

    J
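The check described here, made when an L1 is hooked into an already pinned L2, might be pictured as follows. This is only a toy model with hypothetical `hk_*` names. The two conditions come from the reply above (the L1 page is read-only and its contents are valid), while the concrete validity rule, that present entries may only reference frames the guest owns, is an assumption for illustration.

```c
/* Toy model of hooking a new L1 into a pinned L2; not Xen code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HK_ENTRIES      4
#define HK_PRESENT      0x1u
#define HK_GUEST_FRAMES 16u  /* assumed: the guest owns frames 0..15 */

struct hk_l1 {
    uint32_t entry[HK_ENTRIES];
    unsigned writable_mappings; /* writable guest mappings of this page */
    bool pinned;
};

struct hk_l2 {
    struct hk_l1 *slot[HK_ENTRIES];
    bool pinned;
};

/* Update one L2 slot. A pinned L2 only accepts an L1 that validates. */
static bool hk_set_l2_entry(struct hk_l2 *l2, unsigned idx, struct hk_l1 *l1)
{
    assert(idx < HK_ENTRIES);

    if (!l2->pinned) {
        /* Unpinned tables are checked later, when the L2 is pinned. */
        l2->slot[idx] = l1;
        return true;
    }

    /* "read-only": nobody may still hold a writable mapping of the L1. */
    if (l1->writable_mappings != 0)
        return false;

    /* "valid contents": every present entry must name a guest frame. */
    for (unsigned i = 0; i < HK_ENTRIES; i++) {
        uint32_t e = l1->entry[i];
        if ((e & HK_PRESENT) && (e >> 12) >= HK_GUEST_FRAMES)
            return false;
    }

    l1->pinned = true;     /* the L1 now belongs to the pinned tree */
    l2->slot[idx] = l1;
    return true;
}
```

In this model the guest never issues a separate pin request for the new L1; a successful L2 update is what brings it under the same protection as the rest of the tree.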