Julien Grall
2013-Dec-10 17:36 UTC
[PATCH] libxc/arm: Correctly handle the difference between virtual and physical address
xc_dom_alloc_page deals with virtual address not physical address. When an ELF is loaded, virtual address and physical address may be different. Signed-off-by: Julien Grall <julien.grall@linaro.org> --- tools/libxc/xc_dom_arm.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c index a40e04d..75a6f1c 100644 --- a/tools/libxc/xc_dom_arm.c +++ b/tools/libxc/xc_dom_arm.c @@ -108,13 +108,15 @@ static int shared_info_arm(struct xc_dom_image *dom, void *ptr) static int vcpu_arm32(struct xc_dom_image *dom, void *ptr) { vcpu_guest_context_t *ctxt = ptr; + uint32_t rambase = dom->rambase_pfn << XC_PAGE_SHIFT; + uint32_t offset = dom->parms.virt_base - rambase; DOMPRINTF_CALLED(dom->xch); /* clear everything */ memset(ctxt, 0, sizeof(*ctxt)); - ctxt->user_regs.pc32 = dom->parms.virt_entry; + ctxt->user_regs.pc32 = dom->parms.virt_entry - offset; /* Linux boot protocol. See linux.Documentation/arm/Booting. */ ctxt->user_regs.r0_usr = 0; /* SBZ */ @@ -125,7 +127,7 @@ static int vcpu_arm32(struct xc_dom_image *dom, void *ptr) * like a valid pointer to a set of ATAGS or a DTB. */ ctxt->user_regs.r2_usr = dom->devicetree_blob ? - dom->devicetree_seg.vstart : 0xffffffff; + (dom->devicetree_seg.vstart - offset) : 0xffffffff; ctxt->sctlr = SCTLR_GUEST_INIT; @@ -280,15 +282,15 @@ int arch_setup_meminit(struct xc_dom_image *dom) if ( dom->devicetree_blob ) { - const uint64_t rambase = dom->rambase_pfn << XC_PAGE_SHIFT; - const uint64_t ramend = rambase + ( dom->total_pages << XC_PAGE_SHIFT ); + const uint64_t virtbase = dom->parms.virt_base; + const uint64_t virtend = virtbase + ( dom->total_pages << XC_PAGE_SHIFT ); const uint64_t dtbsize = ROUNDUP(dom->devicetree_size, XC_PAGE_SHIFT); /* Place at 128MB if there is sufficient RAM */ - if ( ramend >= rambase + 128*1024*1024 + dtbsize ) - dom->devicetree_seg.vstart = rambase + 128*1024*1024; + if ( virtend >= virtbase + 128*1024*1024 + dtbsize ) + dom->devicetree_seg.vstart = virtbase + 128*1024*1024; else /* otherwise at top of RAM */ - dom->devicetree_seg.vstart = ramend - dtbsize; + dom->devicetree_seg.vstart = virtend - dtbsize; dom->devicetree_seg.vend dom->devicetree_seg.vstart + dom->devicetree_size; -- 1.7.10.4
Ian Campbell
2013-Dec-10 17:44 UTC
Re: [PATCH] libxc/arm: Correctly handle the difference between virtual and physical address
On Tue, 2013-12-10 at 17:36 +0000, Julien Grall wrote:> xc_dom_alloc_page deals with virtual address not physical address. When > an ELF is loaded, virtual address and physical address may be different.Can you give an example of the program headers of an ELF file (readelf -l) which causes this? How was it constructed? When we are building the guest the MMU is disabled so virt == phys. If the ELF has virt != phys then how do we know whether it expects to be launched with the MMU on or off? We can only really launch with the MMU off, so do we require that the initial ELF entry point be PIC and know how to enable the MMU? Ian.> > Signed-off-by: Julien Grall <julien.grall@linaro.org> > --- > tools/libxc/xc_dom_arm.c | 16 +++++++++------- > 1 file changed, 9 insertions(+), 7 deletions(-) > > diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c > index a40e04d..75a6f1c 100644 > --- a/tools/libxc/xc_dom_arm.c > +++ b/tools/libxc/xc_dom_arm.c > @@ -108,13 +108,15 @@ static int shared_info_arm(struct xc_dom_image *dom, void *ptr) > static int vcpu_arm32(struct xc_dom_image *dom, void *ptr) > { > vcpu_guest_context_t *ctxt = ptr; > + uint32_t rambase = dom->rambase_pfn << XC_PAGE_SHIFT; > + uint32_t offset = dom->parms.virt_base - rambase; > > DOMPRINTF_CALLED(dom->xch); > > /* clear everything */ > memset(ctxt, 0, sizeof(*ctxt)); > > - ctxt->user_regs.pc32 = dom->parms.virt_entry; > + ctxt->user_regs.pc32 = dom->parms.virt_entry - offset; > > /* Linux boot protocol. See linux.Documentation/arm/Booting. */ > ctxt->user_regs.r0_usr = 0; /* SBZ */ > @@ -125,7 +127,7 @@ static int vcpu_arm32(struct xc_dom_image *dom, void *ptr) > * like a valid pointer to a set of ATAGS or a DTB. > */ > ctxt->user_regs.r2_usr = dom->devicetree_blob ? > - dom->devicetree_seg.vstart : 0xffffffff; > + (dom->devicetree_seg.vstart - offset) : 0xffffffff; > > ctxt->sctlr = SCTLR_GUEST_INIT; > > @@ -280,15 +282,15 @@ int arch_setup_meminit(struct xc_dom_image *dom) > > if ( dom->devicetree_blob ) > { > - const uint64_t rambase = dom->rambase_pfn << XC_PAGE_SHIFT; > - const uint64_t ramend = rambase + ( dom->total_pages << XC_PAGE_SHIFT ); > + const uint64_t virtbase = dom->parms.virt_base; > + const uint64_t virtend = virtbase + ( dom->total_pages << XC_PAGE_SHIFT ); > const uint64_t dtbsize = ROUNDUP(dom->devicetree_size, XC_PAGE_SHIFT); > > /* Place at 128MB if there is sufficient RAM */ > - if ( ramend >= rambase + 128*1024*1024 + dtbsize ) > - dom->devicetree_seg.vstart = rambase + 128*1024*1024; > + if ( virtend >= virtbase + 128*1024*1024 + dtbsize ) > + dom->devicetree_seg.vstart = virtbase + 128*1024*1024; > else /* otherwise at top of RAM */ > - dom->devicetree_seg.vstart = ramend - dtbsize; > + dom->devicetree_seg.vstart = virtend - dtbsize; > > dom->devicetree_seg.vend > dom->devicetree_seg.vstart + dom->devicetree_size;
Julien Grall
2013-Dec-10 18:16 UTC
Re: [PATCH] libxc/arm: Correctly handle the difference between virtual and physical address
On 12/10/2013 05:44 PM, Ian Campbell wrote:> On Tue, 2013-12-10 at 17:36 +0000, Julien Grall wrote: >> xc_dom_alloc_page deals with virtual address not physical address. When >> an ELF is loaded, virtual address and physical address may be different. > > Can you give an example of the program headers of an ELF file (readelf > -l) which causes this? How was it constructed?This is the beginning of readelf -l for a FreeBSD ARM binary: Elf file type is EXEC (Executable file) Entry point 0xc0100100 There are 5 program headers, starting at offset 52 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align EXIDX 0x2f6e54 0xc03f6e54 0xc03f6e54 0x0df90 0x0df90 R 0x4 PHDR 0x000034 0xc0100034 0xc0100034 0x000a0 0x000a0 R E 0x4 INTERP 0x2b67f0 0xc03b67f0 0xc03b67f0 0x0000d 0x0000d R 0x1 [Requesting program interpreter: /red/herring] LOAD 0x000000 0xc0100000 0xc0100000 0x32dadc 0x3538d8 RWE 0x8000 DYNAMIC 0x32da74 0xc042da74 0xc042da74 0x00068 0x00068 RW 0x4 FreeBSD will build the kernel with: - kernel physical address: 0x80100000 - kernel virtual address: 0xC0100000 And the ELF will be linked with the virtual address.> When we are building the guest the MMU is disabled so virt == phys.Right, but ELF loader is based on virtual address not physical address. Guest creation will fail when libxc is trying to allocate segment for the device tree (see xc_dom_alloc_segment). As it''s allocate after the kernel, virt_alloc_end will contains a virtual address. But device tree address is generated with a physical address... Therefore, the function will fail with "segment start too low ...".> If the ELF has virt != phys then how do we know whether it expects to be > launched with the MMU on or off? We can only really launch with the MMU > off, so do we require that the initial ELF entry point be PIC and know > how to enable the MMU?This issue is when as virt == phys, build ELF with virt != phys is very difficult. I took a couple of hours without any success. In any case, FreeBSD is building its ELF with virt = phys. When the guest is creating, the ELF should loaded like zImage at the specific physical address. Then the guest will start will MMU turn off, and during the first instructions it will use fixup to get the right address. -- Julien Grall
Ian Campbell
2013-Dec-11 10:02 UTC
Re: [PATCH] libxc/arm: Correctly handle the difference between virtual and physical address
On Tue, 2013-12-10 at 18:16 +0000, Julien Grall wrote:> This issue is when as virt == phys, build ELF with virt != phys is very difficult.OK, so I think I was mislead by your commit message, this is not actually about virt vs phys as such (or as represented in the ELF header) but is actually more about link address vs. load address. Where link address == ELF vaddr (==paddr). And because the MMU is disable load address == some vaddr (==paddr), which may differ from the ELF vaddr. The terminology in xc_dom, which uses vaddr a lot because of PV x86, isn''t helpful here but is that an accurate summary? [...]> When the guest is creating, the ELF should loaded like zImage at the specific > physical address.So that would be my next question -- where does this load address come from? I suppose it has to be the 0x80100000 kernel physical address you gave earlier? I think this means "where is dom->parms.virt_base" initialised. Have you added some ELF notes to your BSD kernel image?> Then the guest will start will MMU turn off, and during the > first instructions it will use fixup to get the right address."fixup" == enable the MMU, or "fixup" == some sort of relocation/PIC? Ian.
Julien Grall
2013-Dec-11 19:26 UTC
Re: [PATCH] libxc/arm: Correctly handle the difference between virtual and physical address
On 12/11/2013 10:02 AM, Ian Campbell wrote:> On Tue, 2013-12-10 at 18:16 +0000, Julien Grall wrote: >> This issue is when as virt == phys, build ELF with virt != phys is very difficult. > > OK, so I think I was mislead by your commit message, this is not > actually about virt vs phys as such (or as represented in the ELF > header) but is actually more about link address vs. load address. Where > link address == ELF vaddr (==paddr). And because the MMU is disable load > address == some vaddr (==paddr), which may differ from the ELF vaddr. > > The terminology in xc_dom, which uses vaddr a lot because of PV x86, > isn''t helpful here but is that an accurate summary?Right, and this address is actually equal to ELF vaddr. But the device tree base address will be computed with the physical address.> [...] >> When the guest is creating, the ELF should loaded like zImage at the specific >> physical address. > > So that would be my next question -- where does this load address come > from?Load address ? Virtual ? Physical ?> I suppose it has to be the 0x80100000 kernel physical address you gave > earlier? > > I think this means "where is dom->parms.virt_base" initialised. Have you > added some ELF notes to your BSD kernel image?I have tried to play with the ELF notes. My current configuration is: XEN_ELFNOTE_VIRT_BASE vaddr (0xC0000000) XEN_ELFNOTE_PADDR_OFFSET vaddr (0xC0000000) It''s copied from Linux. If I try to modify PADDR_OFFSET with (KERNBASE - PHYSADDR) or whatever value it doesn''t work. I got "segment kernel too large (0x355 > 0x4000 - 0x80100 pages): Out of memory".>> Then the guest will start will MMU turn off, and during the >> first instructions it will use fixup to get the right address. > > "fixup" == enable the MMU, or "fixup" == some sort of relocation/PIC?Some sort of relocation. When MMU is turn off, every time the kernel needs to call asm callback, it will add an offset. -- Julien Grall
Ian Campbell
2013-Dec-12 10:35 UTC
Re: [PATCH] libxc/arm: Correctly handle the difference between virtual and physical address
On Wed, 2013-12-11 at 19:26 +0000, Julien Grall wrote:> On 12/11/2013 10:02 AM, Ian Campbell wrote: > > On Tue, 2013-12-10 at 18:16 +0000, Julien Grall wrote: > >> This issue is when as virt == phys, build ELF with virt != phys is very difficult. > > > > OK, so I think I was mislead by your commit message, this is not > > actually about virt vs phys as such (or as represented in the ELF > > header) but is actually more about link address vs. load address. Where > > link address == ELF vaddr (==paddr). And because the MMU is disable load > > address == some vaddr (==paddr), which may differ from the ELF vaddr. > > > > The terminology in xc_dom, which uses vaddr a lot because of PV x86, > > isn''t helpful here but is that an accurate summary? > > Right, and this address is actually equal to ELF vaddr. But the device > tree base address will be computed with the physical address. > > > [...] > >> When the guest is creating, the ELF should loaded like zImage at the specific > >> physical address. > > > > So that would be my next question -- where does this load address come > > from? > > Load address ? Virtual ? Physical ?Load address == The literal address where the kernel image is loaded, e.g. on native it would be the physical address.> > I suppose it has to be the 0x80100000 kernel physical address you gave > > earlier? > > > > I think this means "where is dom->parms.virt_base" initialised. Have you > > added some ELF notes to your BSD kernel image? > > I have tried to play with the ELF notes. My current configuration is: > > XEN_ELFNOTE_VIRT_BASE vaddr (0xC0000000) > XEN_ELFNOTE_PADDR_OFFSET vaddr (0xC0000000) > > It''s copied from Linux.XEN_ELFNOTE_PADDR_OFFSET is 0 for x86 Linux though. It''s possible that for ARM it should be RAMBASE or something? Or perhaps it should be 0 and the RAMBASE should be added by the tools. Or maybe it should be omitted and only VIRT_BASE is needed? Does NetBSD absolutely require that it is loaded at precisely address 0x80100000 or is the requirement rambase + 0x00100000. If NetBSD has requirements about where RAM is the we may need a new note to designate the required RAM base address.> If I try to modify PADDR_OFFSET with (KERNBASE - PHYSADDR) or whatever > value it doesn''t work. I got > "segment kernel too large (0x355 > 0x4000 - 0x80100 pages): Out of memory".It is very possible that the existing code buggily assumes in various places that RAM starts at address 0, because it has only ever been run on x86. You will probably need to fix this by using rambase_pfn in various places.> >> Then the guest will start will MMU turn off, and during the > >> first instructions it will use fixup to get the right address. > > > > "fixup" == enable the MMU, or "fixup" == some sort of relocation/PIC? > > Some sort of relocation. When MMU is turn off, every time the kernel > needs to call asm callback, it will add an offset.Sounds a bit mad to me compared with writing PIC, but OK. Ian.
Julien Grall
2013-Dec-12 13:55 UTC
Re: [PATCH] libxc/arm: Correctly handle the difference between virtual and physical address
On 12/12/2013 10:35 AM, Ian Campbell wrote:> On Wed, 2013-12-11 at 19:26 +0000, Julien Grall wrote: >> On 12/11/2013 10:02 AM, Ian Campbell wrote: >>> On Tue, 2013-12-10 at 18:16 +0000, Julien Grall wrote: >>>> This issue is when as virt == phys, build ELF with virt != phys is very difficult. >>> >>> OK, so I think I was mislead by your commit message, this is not >>> actually about virt vs phys as such (or as represented in the ELF >>> header) but is actually more about link address vs. load address. Where >>> link address == ELF vaddr (==paddr). And because the MMU is disable load >>> address == some vaddr (==paddr), which may differ from the ELF vaddr. >>> >>> The terminology in xc_dom, which uses vaddr a lot because of PV x86, >>> isn''t helpful here but is that an accurate summary? >> >> Right, and this address is actually equal to ELF vaddr. But the device >> tree base address will be computed with the physical address. >> >>> [...] >>>> When the guest is creating, the ELF should loaded like zImage at the specific >>>> physical address. >>> >>> So that would be my next question -- where does this load address come >>> from? >> >> Load address ? Virtual ? Physical ? > > Load address == The literal address where the kernel image is loaded, > e.g. on native it would be the physical address. > >>> I suppose it has to be the 0x80100000 kernel physical address you gave >>> earlier? >>> >>> I think this means "where is dom->parms.virt_base" initialised. Have you >>> added some ELF notes to your BSD kernel image? >> >> I have tried to play with the ELF notes. My current configuration is: >> >> XEN_ELFNOTE_VIRT_BASE vaddr (0xC0000000) >> XEN_ELFNOTE_PADDR_OFFSET vaddr (0xC0000000) >> >> It''s copied from Linux. > > XEN_ELFNOTE_PADDR_OFFSET is 0 for x86 Linux though.Sorry I meant FreeBSD.> > It''s possible that for ARM it should be RAMBASE or something? Or perhaps > it should be 0 and the RAMBASE should be added by the tools. Or maybe it > should be omitted and only VIRT_BASE is needed?It''s not clear to me what is the usage of XEN_ELFNOTE_PADDR_OFFSET. Is it the physical address? An offset?> Does NetBSD absolutely require that it is loaded at precisely address > 0x80100000 or is the requirement rambase + 0x00100000. > > If NetBSD has requirements about where RAM is the we may need a new note > to designate the required RAM base address.FreeBSD requires to know at compile time the RAM base address. I think it''s fine now to hardcode to GUEST_RAMBASE. We can modify it later.>> If I try to modify PADDR_OFFSET with (KERNBASE - PHYSADDR) or whatever >> value it doesn''t work. I got >> "segment kernel too large (0x355 > 0x4000 - 0x80100 pages): Out of memory". > > It is very possible that the existing code buggily assumes in various > places that RAM starts at address 0, because it has only ever been run > on x86. You will probably need to fix this by using rambase_pfn in > various places.That would mean we need to fix up every seg->vstart/vend. The main issue is, at the end of xc_dom_alloc_segment, we will store vend to virt_alloc_end. This value we be retrieved to the next allocation.. I wrote the patch in this way because I think it''s fine to do all the allocation in the virtual address space (even if MMU is turn off during at boot time) and then, thanks to xc_dom_alloc_segment, translate to physical address. It''s the currently behaviour. But it works fine because for zImage, we made the assumption that virt_base == rambase_pfn << PAGE_SHIFT. The main change is using virt_base to allocate the device tree and substract different offset when the guest will start.>>>> Then the guest will start will MMU turn off, and during the >>>> first instructions it will use fixup to get the right address. >>> >>> "fixup" == enable the MMU, or "fixup" == some sort of relocation/PIC? >> >> Some sort of relocation. When MMU is turn off, every time the kernel >> needs to call asm callback, it will add an offset. > > Sounds a bit mad to me compared with writing PIC, but OK.Most of the code is writing PIC. But as in Linux, for assembly code you can have something like that: .word my_func my_func will be replaced by the virtual address not the physical address. So when the kernel want to call the function with MMU disabled, it will need to add an offset. Thankfully, you don''t need it everywhere. -- Julien Grall
Ian Campbell
2013-Dec-12 14:36 UTC
Re: [PATCH] libxc/arm: Correctly handle the difference between virtual and physical address
On Thu, 2013-12-12 at 13:55 +0000, Julien Grall wrote:> > It''s possible that for ARM it should be RAMBASE or something? Or perhaps > > it should be 0 and the RAMBASE should be added by the tools. Or maybe it > > should be omitted and only VIRT_BASE is needed? > > It''s not clear to me what is the usage of XEN_ELFNOTE_PADDR_OFFSET. Is > it the physical address? An offset?/* * The offset of the ELF paddr field from the actual required * pseudo-physical address (numeric). * * This is used to maintain backwards compatibility with older kernels * which wrote __PAGE_OFFSET into that field. This field defaults to 0 * if not present. * * LEGACY: ELF_PADDR_OFFSET. (n.b. legacy default is VIRT_BASE) */ #define XEN_ELFNOTE_PADDR_OFFSET 4 So apparently it is the offset between the paddr field in the ELF file and the real load address. Which seems like it should be 0xc0100000-0x80100000, but you''ll have to UTSL to be sure.> > Does NetBSD absolutely require that it is loaded at precisely address > > 0x80100000 or is the requirement rambase + 0x00100000. > > > > If NetBSD has requirements about where RAM is the we may need a new note > > to designate the required RAM base address. > > FreeBSD requires to know at compile time the RAM base address. I think > it''s fine now to hardcode to GUEST_RAMBASE. We can modify it later.OK, that makes it easier for these headers to fit I think.> >> If I try to modify PADDR_OFFSET with (KERNBASE - PHYSADDR) or whatever > >> value it doesn''t work. I got > >> "segment kernel too large (0x355 > 0x4000 - 0x80100 pages): Out of memory". > > > > It is very possible that the existing code buggily assumes in various > > places that RAM starts at address 0, because it has only ever been run > > on x86. You will probably need to fix this by using rambase_pfn in > > various places. > > That would mean we need to fix up every seg->vstart/vend.Quite possible, yes. This code was written to deal with a PV x86 guest, which has a different set of requirements to what is needed on ARM.> The main issue > is, at the end of xc_dom_alloc_segment, we will store vend to > virt_alloc_end. This value we be retrieved to the next allocation.. > > I wrote the patch in this way because I think it''s fine to do all the > allocation in the virtual address space (even if MMU is turn off during > at boot time) and then, thanks to xc_dom_alloc_segment, translate to > physical address. It''s the currently behaviour. But it works fine > because for zImage, we made the assumption that virt_base == rambase_pfn > << PAGE_SHIFT.The fact that the fields are called virt_* is irrelevant, these are physical addresses on ARM, the naming is from x86 where they are physical addresses. If you are trying to somehow shoehorn the kernels virtual addresses into these then things will get confusing fast. Everything should be thought of in terms of physical addresses and the *load* address that BSD expects.> The main change is using virt_base to allocate the device tree and > substract different offset when the guest will start.If this is done correctly then virt_base will contain physical addresses and everything works without offsets. Ignore the field names and think only about physical addresses.> >>>> Then the guest will start will MMU turn off, and during the > >>>> first instructions it will use fixup to get the right address. > >>> > >>> "fixup" == enable the MMU, or "fixup" == some sort of relocation/PIC? > >> > >> Some sort of relocation. When MMU is turn off, every time the kernel > >> needs to call asm callback, it will add an offset. > > > > Sounds a bit mad to me compared with writing PIC, but OK. > > Most of the code is writing PIC. But as in Linux, for assembly code you > can have something like that: > > .word my_func > > my_func will be replaced by the virtual address not the physical > address. So when the kernel want to call the function with MMU disabled, > it will need to add an offset. Thankfully, you don''t need it everywhere.That''s not PIC code then. PIC code would have an offset not an address and would call the function as the offset from current PC or something similar, but we are digressing here. Ian.
Possibly Parallel Threads
- [PATCH] libxc/arm: align to page size the base address of the device tree
- [PATCH v6 00/16] xen: arm: 64-bit guest support and domU FDT autogeneration
- [PATCH v2 00/14] xen: arm: 64-bit guest support and domU FDT autogeneration
- Bare-metal Xen on ARM boot
- [PATCH] xen/arm: Allow balooning working with 1:1 memory mapping