In the paging mechanism of Xen, what is the role of the 'idle_pg_table*' variables?

For a 4-level paging system these variables are defined in x86_64.S and partially initialised. Here is the code, copied from x86_64.S:

__________________________________________________
...

/* Initial PML4 -- level-4 page table. */
        .org 0x2000
ENTRY(idle_pg_table)
ENTRY(idle_pg_table_4)
        .quad idle_pg_table_l3 - __PAGE_OFFSET + 7 # PML4[0]
        .fill 261,8,0
        .quad idle_pg_table_l3 - __PAGE_OFFSET + 7 # PML4[262]

/* Initial PDP -- level-3 page table. */
        .org 0x3000
ENTRY(idle_pg_table_l3)
        .quad idle_pg_table_l2 - __PAGE_OFFSET + 7

/* Initial PDE -- level-2 page table. Maps first 64MB physical memory. */
        .org 0x4000
ENTRY(idle_pg_table_l2)
        .macro identmap from=0, count=32
        .if \count-1
        identmap "(\from+0)","(\count/2)"
        identmap "(\from+(0x200000*(\count/2)))","(\count/2)"
        .else
        .quad 0x00000000000001e3 + \from
        .endif
        .endm
        identmap

        .org 0x4000 + PAGE_SIZE
        .code64

        .section ".bss.stack_aligned","w"
ENTRY(cpu0_stack)
        .fill STACK_SIZE,1,0
______________________________________________________

Trying to understand that:

- idle_pg_table_l4 is the same as idle_pg_table and contains 263 entries, all zeroed but two (identical) ones. These two pointers point somewhere close to idle_pg_table_l3. Why are there two identical pointers, and why shift them by __PAGE_OFFSET + 7?

- idle_pg_table_l3 is located between 0x3000 and 0x4000, with only the first slot initialised. The latter points to the level-2 table with some offset.

- idle_pg_table_l2 has terrible code with a recursive macro, which expands into 63 quad constants. It is unclear to me why this complicated macro is needed; I would simply have written out a table of constants. Every entry in that l2 table points to a fixed address, at intervals of 4K (a page). l2 tables are located between 0x01E3 and 0x03E001E3 in groups. Every group is apparently a set of 4 page tables and each table has a size of 128K. Groups are separated by approx. 256MB. Why these spacings and groups?

- idle_pg_table_l1 is not an entry and so l1 tables are not allocated. Why?

Thanks for help!

Armand

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com
> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of
> PUCCETTI Armand
> Sent: 01 September 2006 17:11
> To: xen-devel@lists.xensource.com
> Subject: [Xen-devel] idle_pg_tables??
>
> - idle_pg_table_l4 is the same as idle_pg_table and contains
> 263 entries, all zeroed but two (identical) ones. These
> two pointers point somewhere close to idle_pg_table_l3. Why are there
> two identical pointers and why shift them by __PAGE_OFFSET +7?

So that we can have a map for both LOW memory (address zero and 1GB forward) and a map for the upper range of memory where Xen's base virtual address is (__PAGE_OFFSET). I think you'll find that if you shift __PAGE_OFFSET right by 39 bits (the position of the PML4 index) and keep the low 9 bits, the remaining number is 262... [I haven't checked this]. Reusing the same page-table entry allows the use of a single entry in the next page-table level.

__PAGE_OFFSET is subtracted because the code is linked such that everything is based on the virtual address that we eventually will use in the system. But the page table wants to have a PHYSICAL address, so we subtract the virtual base address from the location that we want the PT entry to point to.

The magic number of 7 represents the flags for the page-table entry: bit 0 = Present, bit 1 = R/W (Writable) and bit 2 = U/S (User accessible). Since this is the top-level table, it makes sense to set it all to present and allow full access, since the next level down can always restrict a permission (but can't allow something forbidden by an upper level).

> - idle_pg_table_l3 is located between 0x3000 and 0x4000,
> with only the first slot initialised. The latter points to
> level 2 table with some offset.

This single entry points at the L2 table, which maps the first 64MB of memory (32 entries of 2MB each, as the comment in the code says). That is sufficient for the initialization of the system.

> - idle_pg_table_l2 has terrible code with a recursive macro,
> which expands into 63 quad constants. It is unclear
> to me why this complicated macro?? I would have put a table
> of constants pretty simply... Every entry in that l2 table points to a
> fixed address, at intervals of 4K (a page). l2 tables are
> located between 0x01E3 to 0x03E001E3 in groups. Every group
> is apparently a set of 4 page tables and each table has a
> size of 128K. Groups are separated by approx 256MB.
> Why are these spacings and groups?

I can't explain why there is a macro and why it does things in the way it does, except I think you'll find that it's related to the code being located at a virtual address which is non-zero at this level [I haven't checked this out].
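To see what the recursive identmap macro actually emits, its expansion can be mimicked outside the assembler. The following is a minimal Python sketch, not part of Xen: the function name and structure are illustrative and simply mirror the two branches of the macro as quoted in the original post.

```python
# Illustrative re-creation of the identmap macro's expansion (not Xen code).
def identmap(start=0, count=32):
    """Split the range in half until count == 1, then emit one entry."""
    if count - 1:                                  # .if \count-1
        half = count // 2
        low = identmap(start, half)                # identmap "(\from+0)", ...
        high = identmap(start + 0x200000 * half, half)
        return low + high
    return [0x1E3 + start]                         # .quad 0x1e3 + \from


entries = identmap()
for index, value in enumerate(entries):
    print(f"L2[{index:2d}] = {value:#012x}")

# Each entry maps one 2MB page, so the table covers this many megabytes.
mapped_mb = len(entries) * 2
print("entries:", len(entries), "mapped MB:", mapped_mb)
```

Under these assumptions the macro is invoked 63 times in total, but only the 32 leaf invocations emit a .quad. The result is a flat run of 32 consecutive entries, 0x1E3 through 0x3E001E3 in steps of 0x200000 (2MB), covering 64MB with no gaps or groups.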
The value 0x1E3 indicates that the pages are 2MB (bit 7), Dirty (pre-setting the bit saves the MMU from having to mark the entry dirty when the page is later written), Accessed (same reason as D), Writable and Present. Bit 8 (Global) is set as well.

> - idle_pg_table_l1 is not an entry and so l1 tables are not
> allocated. Why?

Because the value 1E3 (or part thereof) indicates that these are 2MB pages, so we don't need an L1 table for the pages defined in the above way.

May I ask what you're trying to achieve? As far as I know, the above code is working just fine, so messing with it doesn't seem like a good plan [getting page-table initialization and such things to work right is notoriously complicated, because it tends to break without any way of really debugging it].

--
Mats
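The flag values and the PML4 slot number discussed above can be checked with a few lines of arithmetic. This is a hedged Python sketch: the bit names follow the standard x86-64 page-table entry layout, and ASSUMED_PAGE_OFFSET is an assumption on my part (the thread never states the value of __PAGE_OFFSET), chosen because it is consistent with the PML4[262] comment in the quoted code.

```python
# Low flag bits of an x86-64 page-table entry (standard layout).
FLAG_NAMES = {
    0: "Present",
    1: "Writable",
    2: "User",
    5: "Accessed",
    6: "Dirty",
    7: "PageSize",   # in an L2 entry: the entry maps a 2MB page directly
    8: "Global",
}


def decode_flags(entry):
    """Return the names of the flag bits set in a page-table entry."""
    return [name for bit, name in FLAG_NAMES.items() if (entry >> bit) & 1]


def pml4_index(vaddr):
    """Bits 47..39 of a virtual address select the PML4 slot."""
    return (vaddr >> 39) & 0x1FF


# ASSUMPTION: value not given in the thread, used only for illustration.
ASSUMED_PAGE_OFFSET = 0xFFFF830000000000

print("7     ->", decode_flags(0x7))
print("0x1E3 ->", decode_flags(0x1E3))
print("PML4 slot of assumed __PAGE_OFFSET:", pml4_index(ASSUMED_PAGE_OFFSET))
```

With that assumed base address, the same L3 table is reachable both through slot 0 (low addresses) and through slot 262 (the high range), which matches the two identical .quad lines in the PML4.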
* Petersson, Mats <Mats.Petersson@amd.com> [2006-09-01 11:52]:
The above was a lot of help for me and I'm sure many others. Thanks. Do we have a good place in the xenwiki where this sort of low-level initialization can be captured for future reference? Maybe another entry in http://wiki.xensource.com/xenwiki/XenArchitecture ?

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@us.ibm.com
Petersson, Mats wrote:
> Because the value 1E3 (or part thereof) is indicating that the page is
> 2MB pages, so we don't need a L1 table entry for the pages defined in
> the above way.

OK, pages are 2MB in size on AMD64. Now, what is the supposed size of idle_pg_table_l2? According to page.h, the C view of that variable is

    extern l2_pgentry_t idle_pg_table_l2[ROOT_PAGETABLE_ENTRIES]

with ROOT_PAGETABLE_ENTRIES=512, but according to x86_64.S the memory area where that variable is placed, namely between 0x4000 and 0x4000+PAGE_SIZE, is much bigger. What is the real size of that variable? Same question for idle_pg_table_l4 and _l3: they are, a priori, 512*8 bytes = 2M long...

> May I ask what you're trying to achieve - as far as I know, the above
> code is working just fine, so messing with it doesn't seem like a good
> plan [Getting page-table initialization and such things to work right is
> notoriously complicated, because it tends to break without any way of
> really debugging it].

Sorry that I forgot to state my intentions: the purpose is to understand the source code and perform a static analysis of it. That's part of the IST FP6 OPENTC project.
> OK, pages are 2MB in size on AMD64. Now, what is the supposed size of
> idle_pg_table_l2? According to page.h, the C view of that
> variable is
>
> extern l2_pgentry_t idle_pg_table_l2[ROOT_PAGETABLE_ENTRIES]
> with ROOT_PAGETABLE_ENTRIES=512
>
> but according to x86_64.S, the memory area where that variable is
> placed, namely between 0x4000 and 0x4000+PAGE_SIZE, is much bigger.
> What is the real size of that variable?
> Same question for idle_pg_table_l4 and _l3? They are, a
> priori, 512*8 bytes=2M long...

Last time I calculated 512 * 8 it came to 4096 (sizeof(l2_pgentry_t) should be 8 bytes, as 64-bit code uses the PAE-style page-table format, and thus 8 bytes per page-table entry). PAGE_SIZE is the basic x86 page size of 4096, I would assume, so 512 * 8 = PAGE_SIZE, and thus it's NOT much bigger...

All page-table entries that point to another page table will point to a 4K-aligned page, which also means that each table has 512 entries in PAE mode. In non-PAE mode each entry is 4 bytes and there are 1024 entries on each page.

It depends on the actual entry in the table whether the page-table walk terminates at L2 or L1. If bit 7 of the L2 entry is set, the entry maps a 2MB page, and thus the page is identified by the bits above bit 20 (or put another way, bits 21 and up of the entry contain the physical address of the memory actually used by the processor; this is then combined with bits 20 and down of the address being accessed, which are the offset into the page).

For example: 0x000002bcdef0 as an address.
In binary:

0000.0000.0000.0000.0000.0010.1011.1100.1101.1110.1111.0000
47   43   39   35   31   27   23   19   15   11   7    3     (msb number)

CR3 = idle_pg_table = 0x102000

Bits 47..39 -> 0.0000.0000 => 0x102000 + (0 * 8)  => idle_pg_table_l3 (0x103000)
Bits 38..30 -> 0.0000.0000 => 0x103000 + (0 * 8)  => idle_pg_table_l2 (0x104000)
Bits 29..21 -> 0.0001.0101 => 0x104000 + (21 * 8) => 0x1040A8 -> 0x02A001e3

This means the base address is 42MB (0x2A00000) plus the offset given by the lower 21 bits of the address, 1.1100.1101.1110.1111.0000 -> 0x1CDEF0, which means that the physical address is 0x2BCDEF0 - which is what we'd expect, as it's a linear mapping.

--
Mats
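The walk in the example above can be replayed in software. The following is a simplified Python sketch under stated assumptions, not Xen code: the tables live in a dictionary keyed by physical address, the table addresses 0x102000, 0x103000 and 0x104000 are taken from the example, only 2MB pages are modelled, and ASSUMED_PAGE_OFFSET is a value I am assuming for the high mapping because the thread does not give it.

```python
PS = 1 << 7                        # PageSize bit: L2 entry maps a 2MB page
ADDR_MASK = 0x000FFFFFFFFFF000     # physical-address bits of an entry
L4, L3, L2 = 0x102000, 0x103000, 0x104000   # addresses from the example
ASSUMED_PAGE_OFFSET = 0xFFFF830000000000    # ASSUMPTION, not in the thread


def build_tables():
    """Model the initial tables as {physical address of entry: value}."""
    mem = {}
    mem[L4 + 0 * 8] = L3 | 7        # PML4[0]
    mem[L4 + 262 * 8] = L3 | 7      # PML4[262]
    mem[L3 + 0 * 8] = L2 | 7        # first (and only) L3 entry
    for i in range(32):             # 32 x 2MB identity-mapped pages
        mem[L2 + i * 8] = (i * 0x200000) | 0x1E3
    return mem


def translate(mem, cr3, vaddr):
    """Walk L4 -> L3 -> L2 and return the physical address."""
    table = cr3
    for shift in (39, 30, 21):
        index = (vaddr >> shift) & 0x1FF
        entry = mem.get(table + index * 8, 0)
        if not entry & 1:
            raise LookupError(f"not present at shift {shift}, index {index}")
        if shift == 21 and entry & PS:
            base = entry & ADDR_MASK & ~0x1FFFFF
            return base | (vaddr & 0x1FFFFF)
        table = entry & ADDR_MASK
    raise NotImplementedError("4KB pages (L1 tables) are not modelled")


mem = build_tables()
print(hex(translate(mem, L4, 0x2BCDEF0)))
print(hex(translate(mem, L4, ASSUMED_PAGE_OFFSET + 0x2BCDEF0)))
```

Both lookups should resolve to the same physical address, which illustrates the point of the duplicated PML4 entry: the low identity mapping and the high mapping share one L3 and one L2 table.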