Konrad Rzeszutek Wilk
2013-Mar-25 16:48 UTC
[PATCH kindof] to http://wiki.xen.org/wiki/X86_Paravirtualised_Memory_Management
Was thinking to add this to the Wiki, but before I do that - does anybody have a handy segment decoder ring?

== Segments ==

The initial segments that the guest boots with depend on the mode of the guest and of the hypervisor. If a 64-bit hypervisor is used, the guest can be in either 64-bit mode or 32-bit mode. If a 32-bit hypervisor (deprecated) is used, then obviously only a 32-bit guest can be started.

The initial segment values are set to be flat. They are also located at the far end of the GDT so as not to conflict with OSes that have hard-coded segment values. This means that when a guest boots, the GDT indexes for segment selectors are usually past 259.

If the guest is 64-bit, then DS, ES, FS, and GS all point to the NULL selector. That is, they contain the value 0x00.

If it is 32-bit, then DS, ES, FS, and GS all point to a segment set for ring 1 (0xe021), index 260. The GDT entry looks like so:

    0x00cfb2000000ffff /* 0xe021 ring 1 data */

The SS segment in 64-bit mode contains the GDT index 262, and the RPL is set for ring 3 (0xe033). The GDT entry looks like so:

    0x00affa000000ffff /* 0xe033 ring 3 code, 64-bit mode */

In 32-bit mode, the SS GDT index is 260, in ring 1 (0xe021). The GDT entry looks like so:

    0x00cfb2000000ffff /* 0xe021 ring 1 data */

The CS segment in 64-bit mode contains the GDT index 262 and is in ring 3 (0xe033). The GDT entry looks like so:

    0x00affa000000ffff /* 0xe033 ring 3 code, 64-bit mode */

The CS segment in 32-bit mode is index 259, ring 1 (0xe019), and contains:

    0x00cfba000000ffff /* 0xe019 ring 1 code, compatibility */

The 32-bit guest and 32-bit hypervisor segments are not explained.

[TBD: decode the GDT, explain how to program them, point to AMD Vol 2 document, explain the transition]

== Hyperpage ==

Depending on the mode of the guest (32-bit or 64-bit) the hypervisor populates the hyperpage with appropriate instructions. The virtual address where the hypervisor needs to populate the hyperpage is set by the ELF PT_NOTE entry.
Specifically, it is the XEN_ELFNOTE_HYPERCALL_PAGE note (see http://xenbits.xen.org/docs/unstable/hypercall/include,public,elfnote.h.html#incontents_elfnote).

The contents of the page if the guest is 64-bit are:

    push %rcx
    push %r11
    mov $hypercall_entry, %eax
    syscall
    pop %r11
    pop %rcx
    ret

For 32-bit they are:

    mov $hypercall_entry, %eax
    int 0x82
    ret

The hypercall_entry is a value from zero to 64; the values correspond to the hypercalls listed at:
http://xenbits.xen.org/docs/unstable/hypercall/include,public,xen.h.html#incontents_hcalls

One of the hypercalls (iret) is different because it does not return and expects a special stack frame. Guests jump to this transfer point instead of calling it.

For 64-bit guests:

    push %rcx
    push %r11
    push %rax
    movl $__HYPERVISOR_iret, %eax
    syscall

For 32-bit guests:

    push %eax
    movl $__HYPERVISOR_iret, %eax
    int 0x82
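As a stand-in for the "decoder ring" asked about above, here is a minimal Python sketch that splits a selector into index, table indicator, and RPL, and unpacks the usual x86 code/data descriptor bit fields. The function names and dictionary keys are made up for illustration, and it only covers the flat code/data descriptors quoted in this mail, not system descriptors. Note that it reports the index as the raw selector shifted right by three bits, which may be counted differently from the index figures given in the text.

```python
def decode_selector(sel):
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": sel >> 3,                       # descriptor table index
        "table": "LDT" if sel & 0x4 else "GDT",  # TI bit
        "rpl": sel & 0x3,                        # requested privilege level
    }


def decode_descriptor(desc):
    """Unpack a 64-bit code/data segment descriptor (hypothetical helper)."""
    access = (desc >> 40) & 0xFF   # P, DPL, S, type
    flags = (desc >> 52) & 0xF     # G, D/B, L, AVL
    limit = (desc & 0xFFFF) | (((desc >> 48) & 0xF) << 16)
    base = ((desc >> 16) & 0xFFFFFF) | (((desc >> 56) & 0xFF) << 24)
    return {
        "base": base,
        "limit": limit,
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0x3,
        "code_or_data": bool(access & 0x10),  # S bit: 0 would mean a system descriptor
        "code": bool(access & 0x08),          # executable bit of the type field
        "long_mode": bool(flags & 0x2),       # L bit
        "default_32bit": bool(flags & 0x4),   # D/B bit
        "page_granular": bool(flags & 0x8),   # G bit
    }


if __name__ == "__main__":
    # Selector and descriptor pairs taken from the text above.
    for sel, desc in ((0xE019, 0x00CFBA000000FFFF),
                      (0xE021, 0x00CFB2000000FFFF),
                      (0xE033, 0x00AFFA000000FFFF)):
        print(hex(sel), decode_selector(sel), decode_descriptor(desc))
```

Run on the three descriptors above, the sketch reports DPL 1 data, DPL 1 32-bit code, and DPL 3 long-mode code respectively, which matches the comments next to each GDT entry.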
Ian Campbell
2013-Apr-10 08:28 UTC
Re: [PATCH kindof] to http://wiki.xen.org/wiki/X86_Paravirtualised_Memory_Management
On Mon, 2013-03-25 at 16:48 +0000, Konrad Rzeszutek Wilk wrote:
> Was thinking to add this to the Wiki, but before I do that - does anybody
> have a handy segment decoder ring?

They are described in the SDM, or were you after something else?

> == Segments ==
>
> The initial segments that the guest boots with are dependent on the mode
> of the guest and the hypervisor. If a 64-bit hypervisor is used, the guest
> can be in either 64-bit mode or 32-bit mode. If 32-bit hypervisor (deprecated)

Not just deprecated but gone altogether in 4.3.

> then obviously only a 32-bit guest can be started.
>
> The initial segment values are set to be flat. They are also located at
> the far end of the GDT as to not conflict with OS-es that have hard-coded

Strictly speaking it just reduces the chance of the conflict (based on the assumption that most OSes use the lowest entries first).

> segment values. This means that when a guest boots the GDT indexes for
> segment selectors are usually past 259.
>
>
> If the guest is 64-bit, then DS.ES.FS, and GS all point to the NULL selector.
> That is they contain the contents of 0x00.
>
> If it is 32-bit, then the DS,ES,FS, and GS all point to a segment set for
> ring 1 (0xe021), index 260. The GDT entry looks as so:
>
>     0x00cfb2000000ffff /* 0xe021 ring 1 data */
>
> The SS segment in 64-bit mode, contains the GDT index 262, and the RPL
> is set for ring 3 (0xe033). The GDT entry looks as so:
>
>     0x00affa000000ffff /* 0xe033 ring 3 code, 64-bit mode */
>
> In 32-bit mode, then the SS GDT index is 260, and in ring 1 (0xe021).
> The GDT entry looks as so:
>
>     0x00cfb2000000ffff /* 0xe021 ring 1 data */
>
> The CS segment in 64-bit mode, contains the GDT index 262, and is in ring 3
> (0xe033).
> The GDT entry looks as so:
>
>     0x00affa000000ffff /* 0xe033 ring 3 code, 64-bit mode */
>
> The CS segment in 32-bit mode is index 259, ring 1 (0xe019), and contains:
>
>     0x00cfba000000ffff /* 0xe019 ring 1 code, compatibility */
>
> The 32-bit guest and 32-bit hypervisor segments are not explained.
>
> [TBD: decode the GDT, explain how to program them, point to AMD Vol 2
> document, explain the transition]
>
> == Hyperpage ==
>
> Depending on the mode of the guest (32-bit or 64-bit) the hypervisor
> populates the hyperpage with appropriate instructions.

s/hyperpage/hypercall page/

> The virtual address
> where the hypervisor needs to populate the hyperpage is set by the ELF PT_NOTE
> entry. Specifically the XEN_ELFNOTE_HYPERCALL_PAGE
> (see http://xenbits.xen.org/docs/unstable/hypercall/include,public,elfnote.h.html#incontents_elfnote)
>
> The contents of the page if the guest is 64-bit is:
>     push %rcx
>     push %r11
>     mov $hypercall_entry, %eax
>     syscall
>     pop %r11
>     pop %%rcx
>     ret
>
> For 32-bit it is:
>     mov $hypercall_entry, %eax
>     int 0x82
>     ret

The actual instructions are supposed to be transparent to the guest, I think; all it needs to know is that it should call (or jmp for iret) the appropriate entry. In particular I'm not sure this doesn't vary for AMD vs Intel, and it certainly differs for VMX/SVM (which are a bit off topic in this context I suppose), PVH too I suppose.

e.g. We might in the future want to make the 32-on-64 entry jump to a special compat segment to do compat arg translation in ring != 0 or something clever like that.

> The hypercall_entry is a value from zero to 64 and they correspond to:
> http://xenbits.xen.org/docs/unstable/hypercall/include,public,xen.h.html#incontents_hcalls
>
> One of the hypercalls (iret) is different b/c it does not return and expects
> a special stack frame. Guests jump at this transfer point instead of calling it.
> For 64-bit guests:
>
>     push %rcx
>     push %r11
>     push %rax
>     movl $__HYPERVISOR_iret, %eax
>     syscall
>
> For 32-bit guests:
>     push %eax
>     movl $__HYPERVISOR_iret, %eax
>     int 0x82
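To illustrate the point that a guest only needs to find the right entry and call (or jmp to) it, here is a rough Python sketch of the address arithmetic. It assumes each entry occupies a fixed 32-byte slot indexed by hypercall number, which is how Linux guests address the page but is not stated anywhere in this thread. The function name, the constant names, and the example page address are hypothetical; the iret number is the one defined in Xen's public xen.h.

```python
PAGE_SIZE = 4096
HYPERCALL_SLOT_SIZE = 32   # assumption: one fixed-size stub per hypercall number
HYPERVISOR_IRET = 23       # __HYPERVISOR_iret as defined in Xen's public xen.h


def hypercall_entry_address(hypercall_page_va, nr):
    """Return the virtual address of the stub for hypercall number `nr`.

    `hypercall_page_va` is wherever the guest asked for the page via
    XEN_ELFNOTE_HYPERCALL_PAGE; it must be page aligned.
    """
    if hypercall_page_va % PAGE_SIZE:
        raise ValueError("hypercall page must be page aligned")
    offset = nr * HYPERCALL_SLOT_SIZE
    if not 0 <= offset < PAGE_SIZE:
        raise ValueError("hypercall number does not fit in the page")
    return hypercall_page_va + offset


if __name__ == "__main__":
    page = 0xFFFFFFFF81001000  # made-up example address
    # A guest would `call` most entries but `jmp` to the iret one.
    print(hex(hypercall_entry_address(page, HYPERVISOR_IRET)))
```

Keeping the guest's knowledge down to "page address plus slot offset" is what leaves the hypervisor free to change the instructions inside each stub.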