Hi guys,

During my refresh to the latest Linux, I noticed the direct mapping of all
non-RAM pages in xen_set_identity_and_release(). I currently don't map
them all up front, but only as needed, by looking at the PAGE_IO bit in the
pte. One result of that is a minor change to a common code macro:

    __set_fixmap(idx, phys, PAGE_KERNEL_NOCACHE)
to
    __set_fixmap(idx, phys, PAGE_KERNEL_IO_NOCACHE)

To avoid this change, and keep all my changes limited to Xen files only,
I thought I could just map all the non-RAM pages up front too. But
I am concerned the EPT may grow too large, especially when we get to
*really* large NUMA boxes. What do you guys think? Should I worry about
it?

thanks,
Mukesh

E820 on my small box:

Xen: [mem 0x0000000000000000-0x000000000009cfff] usable
Xen: [mem 0x000000000009d800-0x00000000000fffff] reserved
Xen: [mem 0x0000000000100000-0x00000000bf30cfff] usable
Xen: [mem 0x00000000bf30d000-0x00000000bf38cfff] ACPI NVS
Xen: [mem 0x00000000bf38d000-0x00000000bf3a2fff] reserved
Xen: [mem 0x00000000bf3a3000-0x00000000bf3a3fff] ACPI NVS
Xen: [mem 0x00000000bf3a4000-0x00000000bf3b4fff] reserved
Xen: [mem 0x00000000bf3b5000-0x00000000bf3b7fff] ACPI NVS
Xen: [mem 0x00000000bf3b8000-0x00000000bf3defff] reserved
Xen: [mem 0x00000000bf3df000-0x00000000bf3dffff] usable
Xen: [mem 0x00000000bf3e0000-0x00000000bf3e0fff] ACPI NVS
Xen: [mem 0x00000000bf3e1000-0x00000000bf415fff] reserved
Xen: [mem 0x00000000bf416000-0x00000000bf41ffff] ACPI data
Xen: [mem 0x00000000bf420000-0x00000000bf420fff] ACPI NVS
Xen: [mem 0x00000000bf421000-0x00000000bf422fff] ACPI data
Xen: [mem 0x00000000bf423000-0x00000000bf42afff] ACPI NVS
Xen: [mem 0x00000000bf42b000-0x00000000bf453fff] reserved
Xen: [mem 0x00000000bf454000-0x00000000bf656fff] ACPI NVS
Xen: [mem 0x00000000bf657000-0x00000000bf7fffff] usable
Xen: [mem 0x00000000c0000000-0x00000000cfffffff] reserved
Xen: [mem 0x00000000fec00000-0x00000000fec02fff] reserved
Xen: [mem 0x00000000fec90000-0x00000000fec90fff] reserved
Xen: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
Xen: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
Xen: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
Xen: [mem 0x0000000100000000-0x00000002bfffffff] usable
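
For a rough sense of how many frames "map all non-RAM up front" means on a map like this one, the E820 entries can simply be walked and the non-usable ones summed. The following is only a user-space sketch: `struct sketch_e820` and `sketch_non_ram_pages()` are hypothetical names invented for illustration, not kernel definitions, and the count ignores any holes between E820 entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* 4 KiB pages */

/* Hypothetical, simplified stand-in for an E820 entry (not the kernel's struct). */
struct sketch_e820 {
    uint64_t start;   /* first byte of the region */
    uint64_t end;     /* last byte of the region, inclusive, as printed in the log */
    int      is_ram;  /* 1 for "usable", 0 for reserved / ACPI data / ACPI NVS */
};

/* Count the 4 KiB frames touched by all non-RAM entries in the map. */
static uint64_t sketch_non_ram_pages(const struct sketch_e820 *map, size_t n)
{
    uint64_t pages = 0;

    for (size_t i = 0; i < n; i++) {
        assert(map[i].end >= map[i].start);
        if (map[i].is_ram)
            continue;
        /* Round the start down and the end up to whole frames. */
        pages += (map[i].end >> SKETCH_PAGE_SHIFT)
               - (map[i].start >> SKETCH_PAGE_SHIFT) + 1;
    }
    return pages;
}
```

On the map above, the 256 MiB reserved range at 0xc0000000 alone accounts for 65536 frames, so it dominates the total on this small box.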
On Fri, 15 Jun 2012, Mukesh Rathor wrote:
> Hi guys,
>
> During my refresh to latest linux, I noticed, direct mapping of all
> non-RAM pages in xen_set_identity_and_release(). I currently don't map
> all at front, but as needed looking at the PAGE_IO bit in the pte. One
> result of that is minor change to common code macro:
>
> __set_fixmap(idx, phys, PAGE_KERNEL_NOCACHE) to
> to __set_fixmap(idx, phys, PAGE_KERNEL_IO_NOCACHE)
>
>
> To avoid this change, and keep all my changes limited to xen files only,
> I thought I could just map the entire non-ram pages up front too. But
> I am concerned the EPT may grow too large? Specially, when we get to
> *really* large NUMA boxes. What do you guys think? Should I worry about
> it?

I would map them all up front and worry about it later.
On Fri, Jun 15, 2012 at 12:02:19PM +0100, Stefano Stabellini wrote:
> On Fri, 15 Jun 2012, Mukesh Rathor wrote:
> > Hi guys,
> >
> > During my refresh to latest linux, I noticed, direct mapping of all
> > non-RAM pages in xen_set_identity_and_release(). I currently don't map
> > all at front, but as needed looking at the PAGE_IO bit in the pte. One

PV doesn't look at that all the time either. The P2M tree code
has a couple of leaves for which, if IDENTITY_FRAME_BIT is set, it will
automatically stick _PAGE_IOMAP on the PTE.

> > result of that is minor change to common code macro:
> >
> > __set_fixmap(idx, phys, PAGE_KERNEL_NOCACHE) to
> > to __set_fixmap(idx, phys, PAGE_KERNEL_IO_NOCACHE)

I am really waffling on that. Jeremy posted a patch some time ago
to the x86 folks that would do something similar (I can't remember the
details), but hpa said: why don't you just consult the E820?

That is where the IDENTITY_FRAME_BIT thing in the P2M tree came
from. It could probably be implemented for your case using ranges,
similarly to how Xen permits/disallows certain IO regions to be touched.
Would something like that potentially allow you to do something like this:

	xen_hybrid_pte()

	phys_addr_t phys = pte.pte & PTE_PFN_MASK

	if (phys .. within ranges)
		pte |= _PAGE_IOMAP;
	return pte;

> >
> >
> > To avoid this change, and keep all my changes limited to xen files only,
> > I thought I could just map the entire non-ram pages up front too. But
> > I am concerned the EPT may grow too large? Specially, when we get to
> > *really* large NUMA boxes. What do you guys think? Should I worry about
> > it?

Would NUMA boxes have more than 1GB of E820 non-RAM regions? I can see them
having gobs of RAM regions, but non-RAM regions?

>
> I would map them all up front and worry about it later.
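
The range-based idea in the xen_hybrid_pte() pseudocode can be fleshed out as a small user-space sketch. Everything here is an assumption made for illustration: `struct io_range`, `phys_in_io_ranges()` and `sketch_hybrid_pte()` are hypothetical names, the ranges are assumed to be sorted and non-overlapping (for example, built once from the non-RAM E820 entries), and `SKETCH_PTE_PFN_MASK` / `SKETCH_PAGE_IOMAP` are stand-in values, not the kernel's own `PTE_PFN_MASK` / `_PAGE_IOMAP` definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in constants for a user-space sketch; not the kernel's definitions. */
#define SKETCH_PTE_PFN_MASK 0x000ffffffffff000ULL
#define SKETCH_PAGE_IOMAP   (1ULL << 10)

/* Hypothetical I/O range: inclusive bounds; array sorted, non-overlapping. */
struct io_range {
    uint64_t start;
    uint64_t end;
};

/* Binary search: return 1 if phys falls inside any range, else 0. */
static int phys_in_io_ranges(const struct io_range *r, size_t n, uint64_t phys)
{
    size_t lo = 0, hi = n;

    assert(r != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (phys < r[mid].start)
            hi = mid;
        else if (phys > r[mid].end)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}

/* Tag the PTE as an I/O mapping when its physical address is in a range. */
static uint64_t sketch_hybrid_pte(uint64_t pte, const struct io_range *r, size_t n)
{
    uint64_t phys = pte & SKETCH_PTE_PFN_MASK;

    if (phys_in_io_ranges(r, n, phys))
        pte |= SKETCH_PAGE_IOMAP;
    return pte;
}
```

With a sorted table the lookup is logarithmic in the number of ranges, which keeps the per-PTE cost small even if the E820 has many non-RAM entries. Whether that cost is acceptable on the real PTE construction path is not something the thread settles.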
>>> On 18.06.12 at 20:35, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Fri, Jun 15, 2012 at 12:02:19PM +0100, Stefano Stabellini wrote:
>> On Fri, 15 Jun 2012, Mukesh Rathor wrote:
>> > Hi guys,
>> >
>> > During my refresh to latest linux, I noticed, direct mapping of all
>> > non-RAM pages in xen_set_identity_and_release(). I currently don't map
>> > all at front, but as needed looking at the PAGE_IO bit in the pte. One
>
> PV doesn't look at that all the time either. The P2M tree code
> has a couple of leafs, that if they have IDENTITY_FRAME_BIT set it will
> automatically stick _PAGE_IOMAP on the PTE.
>
>> > result of that is minor change to common code macro:
>> >
>> > __set_fixmap(idx, phys, PAGE_KERNEL_NOCACHE) to
>> > to __set_fixmap(idx, phys, PAGE_KERNEL_IO_NOCACHE)
>
> I am really wafling on that. Jeremy posted a patch some time ago
> to x86 folks that would do something similar (I can't remember the
> details), but hpa said - why don't you just consult the E820.
>
> That is where the IDENTITY_FRAME_BIT thing in the P2M tree came
> about. It could probably be implemented for your cases using ranges.
>
> Similary to how Xen permits/disallows certain IO regions to be touched.
> Would something like that potentially allow you do something like this:
>
> xen_hybrid_pte()
>
> phys_addr_t phys = pte.pte & PTE_PFN_MASK
>
> if (phys .. within ranges)
> pte |= _PAGE_IOMAP;
> return pte;
>
>> >
>> >
>> > To avoid this change, and keep all my changes limited to xen files only,
>> > I thought I could just map the entire non-ram pages up front too. But
>> > I am concerned the EPT may grow too large? Specially, when we get to
>> > *really* large NUMA boxes. What do you guys think? Should I worry about
>> > it?
>
> Would NUMA boxes have more than 1GB of E820 non-RAM regions? I can see them
> having gobs of RAM regions, but non-RAM regions?

There can be multi-terabyte-sized non-RAM regions on systems with
discontiguous RAM.

Jan
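
To see what multi-terabyte non-RAM regions would mean for the size of the EPT itself, the paging-structure overhead can be estimated with simple arithmetic. The sketch below assumes a 4-level table in which every paging-structure page is 4 KiB and holds 512 eight-byte entries, and it assumes one contiguous region that starts at address 0. `sketch_ept_table_pages()` is a hypothetical helper written for this estimate. Whether the hypervisor would use 4 KiB, 2 MiB or 1 GiB leaves for such mappings is not stated in the thread, so the leaf size is left as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate how many 4 KiB paging-structure pages a 4-level table needs to
 * map `bytes` of contiguous address space starting at 0, using leaf
 * mappings of size 2^leaf_shift (12 = 4 KiB, 21 = 2 MiB, 30 = 1 GiB).
 */
static uint64_t sketch_ept_table_pages(uint64_t bytes, unsigned leaf_shift)
{
    uint64_t pages = 0;

    assert(leaf_shift == 12 || leaf_shift == 21 || leaf_shift == 30);
    assert(bytes > 0 && bytes <= (1ULL << 48));

    /* Each table one level up covers 512 (2^9) entries of the level below. */
    for (unsigned shift = leaf_shift + 9; shift <= 48; shift += 9) {
        uint64_t span = 1ULL << shift;

        pages += (bytes + span - 1) / span;  /* tables needed at this level */
    }
    return pages;
}
```

Under these assumptions, 1 GiB mapped with 4 KiB leaves needs 515 table pages (about 2 MiB), while 1 TiB needs 525315 table pages (about 2 GiB). The same 1 TiB mapped with 2 MiB leaves needs only 1027 table pages (about 4 MiB), so the leaf size largely decides whether mapping everything up front is a problem.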