Dear all,

I don't understand why 64-bit Mini-OS doesn't check for nested Xen events (in entry.S). In the 32-bit code, a critical section is defined and re-entrance is checked: if a nested event arrives, a fix-up routine coalesces the stack frames, because there is a window in which nested events can occur after event delivery is re-enabled and before the direct iret.

In 64-bit Mini-OS, however, a critical section is defined but never used. My understanding is that, ideally, one could either a) unmask events and do a direct iret, but check for nested events and fix up the stack frames if one happens; or b) leave events masked and return with a hypercall iret (no stack-frame fix-up needed). 64-bit Mini-OS seems to mix the two (in HYPERVISOR_IRET): since Mini-OS has no userspace, unless an NMI happened it always performs the interrupt/exception return with the machine instruction iret, without checking for nested events. This looks wrong to me. Am I missing something here? Please advise.

Thank you,
Xu
--
xu
:q!
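
P.S. To make the two options concrete, here is a rough sketch of what I mean (illustrative pseudo-assembly, not the actual entry.S code; XEN_UNBLOCK_EVENTS/XEN_BLOCK_EVENTS, FRAME_SIZE and the labels are placeholders for whatever the port really uses):

        # Option (a), as in the 32-bit code: re-open event delivery, then
        # iret directly. A callback arriving anywhere in [scrit, ecrit)
        # lands on top of the frame we are popping, so a fixup routine
        # must coalesce the two frames afterwards.
scrit:
        XEN_UNBLOCK_EVENTS(%rsi)        # events can fire from here on
        movq    RBX(%rsp), %rbx         # restore registers (abridged)
        addq    $FRAME_SIZE, %rsp       # pop our part of the frame
        iretq                           # direct return
ecrit:

        # Option (b): keep events masked and let Xen do the return; there
        # is no window, hence no fixup table.
        XEN_BLOCK_EVENTS(%rsi)
        jmp     hypercall_page + (__HYPERVISOR_iret * 32)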
Xu Zhang, le Tue 23 Oct 2012 17:43:28 -0500, a écrit :
> 64-bit Mini-OS seems to mix the two (in HYPERVISOR_IRET): since Mini-OS
> has no userspace, unless an NMI happened it always performs the
> interrupt/exception return with the machine instruction iret, without
> checking for nested events. This looks wrong to me. Am I missing
> something here?

I don't think you are missing anything. Cc-ing Grzegorz, who actually wrote the code.

Samuel
Hello,

Xu Zhang, le Mon 05 Nov 2012 23:56:04 -0600, a écrit :
> I haven't seen any updates on this matter, so I tried to come up with a fix.

Thanks!

> Generally speaking, I want to mimic the 32-bit Mini-OS behaviour,

Ok. Probably a good thing :)

> The "git diff" output is attached to this email. I only did some naive
> "tests" by running it inside gdb.

Do you mean that you actually ran a stack merge, even if only in gdb? I would trust that well enough.

> I am wondering, is there a test suite for Mini-OS?

You can simply run it, giving it a blk dev (or also a fb dev); it will run app_main(), which keeps reading stuff and draws squares on the fb.

Ideally you should manage to trigger fixups by stuffing a wait loop in the critical section.

> A follow-up question: given this fix, do you still need hypercall iret?

Actually it's worse than that, see below.

> +        movq    %rbx,5*8(%rsp)
...
> +
> +        # check against re-entrance
> +        movq    RIP(%rsp),%rbx
> +        cmpq    $scrit,%rbx
> +        jb      10f
> +        cmpq    $ecrit,%rbx
> +        jb      critical_region_fixup
> +
> +10:     movq    RBX(%rsp),%rbx          # restore rbx

Do we really need to restore %rbx? I don't think do_hypervisor_callback uses it.

> +        .byte 0x78,0x78,0x78,0x78,0x78  # jmp hypercall_page + (__HYPERVISOR_iret * 32)

Here we would also need a fixup table for the code at hypercall_page! A nicer fix would be to inline the hypercall code here.

That said, I have to say I don't know any detail about the NMI flag: I don't see it defined in the Intel manual, and I don't see it being set in the Xen hypervisor. Unless somebody knows more about it, I would assume that it currently never happens, and simply stuff a

        ud2     /* TODO: fix re-entrance fixup for hypercall in NMI case (when does that happen, actually?) */

before the jmp, to catch it if it ever does happen.

Also, it would be good to check against a critical-section size change, in case somebody e.g. changes a value or a macro like XEN_PUT_VCPU_INFO. For instance, stuff right after the table:

        .if (ecrit-scrit) != (critical_fixup_table_end - critical_fixup_table)
        .error "The critical section has changed, the fixup table needs updating"
        .endif

More generally, some cleanup could go along with the patch, but I'd say keep it as you have done it, focused on only the fixup code, so it appears as such in the repository history; we can clean some things up afterwards.

Samuel
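
P.S. For reference, the table-driven scheme the 32-bit code uses looks roughly like this (written from memory, not copied from any tree; the register use, labels and byte values are only placeholders):

critical_region_fixup:
        # %rbx holds the RIP at which the nested event interrupted us,
        # already known to lie inside [scrit, ecrit).
        subq    $scrit, %rbx
        movzbq  critical_fixup_table(%rbx), %rax  # bytes already popped there
        # use %rax to slide the nested frame down over the partially popped
        # one, then resume the restore code on the merged frame

critical_fixup_table:
        # one byte per byte of [scrit, ecrit): how much of the original
        # frame had already been popped when the nested event hit
        .byte 0x00, 0x00, 0x00                    # placeholder values
        .byte 0x00, 0x00, 0x00, 0x00
critical_fixup_table_end: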
Samuel,

Thanks for the reply.

On 11/10/2012 07:52 AM, Samuel Thibault wrote:
> You can simply run it, giving it a blk dev (or also a fb dev); it will
> run app_main(), which keeps reading stuff and draws squares on the fb.
>
> Ideally you should manage to trigger fixups by stuffing a wait loop in
> the critical section.

I'll try this in the following week. Thank you for pointing it out.

>> A follow-up question: given this fix, do you still need hypercall iret?
>
> Actually it's worse than that, see below.
>
>> +        movq    %rbx,5*8(%rsp)
> ...
>> +
>> +        # check against re-entrance
>> +        movq    RIP(%rsp),%rbx
>> +        cmpq    $scrit,%rbx
>> +        jb      10f
>> +        cmpq    $ecrit,%rbx
>> +        jb      critical_region_fixup
>> +
>> +10:     movq    RBX(%rsp),%rbx          # restore rbx
>
> Do we really need to restore %rbx? I don't think do_hypervisor_callback
> uses it.

No, I don't think so. I was just being cautious. :-)

>> +        .byte 0x78,0x78,0x78,0x78,0x78  # jmp hypercall_page + (__HYPERVISOR_iret * 32)
>
> Here we would also need a fixup table for the code at hypercall_page!

Nice catch!

> A nicer fix would be to inline the hypercall code here. That said, I
> have to say I don't know any detail about the NMI flag: I don't see it
> defined in the Intel manual, and I don't see it being set in the Xen
> hypervisor. Unless somebody knows more about it, I would assume that it
> currently never happens, and simply stuff a
>
>         ud2     /* TODO: fix re-entrance fixup for hypercall in NMI case (when does that happen, actually?) */
>
> before the jmp, to catch it if it ever does happen.

I don't know that either. I tried to raise an NMI in the guest domain with "xl trigger", but it seems that only works for HVM guests.

> Also, it would be good to check against a critical-section size change, in
> case somebody e.g. changes a value or a macro like XEN_PUT_VCPU_INFO.
> For instance, stuff right after the table:
>
>         .if (ecrit-scrit) != (critical_fixup_table_end - critical_fixup_table)
>         .error "The critical section has changed, the fixup table needs updating"
>         .endif

Totally agree. And it somewhat saves the obligation of checking, from the disassembly, that the table size matches the critical section's. :-)

> More generally, some cleanup could go along with the patch, but I'd say
> keep it as you have done it, focused on only the fixup code, so it
> appears as such in the repository history; we can clean some things up
> afterwards.
>
> Samuel

I really appreciate your feedback. Thanks again, Samuel!

Xu
>>> +        .byte 0x78,0x78,0x78,0x78,0x78  # jmp hypercall_page + (__HYPERVISOR_iret * 32)
>>
>> Here we would also need a fixup table for the code at hypercall_page!
>
> Nice catch!
>
>> A nicer fix would be to inline the hypercall code here.

On second thought, it seems to me that such an "expansion" of hypercall_page into the fixup table is not necessary (and wrong). Instead, we can, and in fact need to, mask events before doing a hypercall iret. I think it is safe (and required, see below) to do this because:

1. if events are already disabled, it doesn't hurt to mask them again;
2. if events are enabled, we disable them and jump to hypercall_page to make a hypercall iret, which eventually calls do_iret.

In do_iret, line 309:

        /* Restore upcall mask from supplied EFLAGS.IF. */
        vcpu_info(v, evtchn_upcall_mask) = !(iret_saved.rflags & X86_EFLAGS_IF);

Whatever the current value of the upcall mask is, do_iret always restores it to the negation of the rflags.IF saved on the stack. In other words, it is safe to set the upcall mask before the hypercall iret.

In fact, I am afraid we must make it mandatory: one can guard against nested events in the guest OS, but never inside the hypercall page (because that code belongs to Xen). We should never jump to the hypercall_page iret with events enabled, so we have to set the upcall mask before the jump. I guess that is one of the reasons why Xen always restores the upcall mask from the stack.

>> Also, it would be good to check against a critical-section size change, in
>> case somebody e.g. changes a value or a macro like XEN_PUT_VCPU_INFO.
>> For instance, stuff right after the table:
>>
>>         .if (ecrit-scrit) != (critical_fixup_table_end - critical_fixup_table)
>>         .error "The critical section has changed, the fixup table needs updating"
>>         .endif
>
> Totally agree. And it somewhat saves the obligation of checking, from the
> disassembly, that the table size matches the critical section's. :-)

Correct me if I am wrong, but I think hypercall_page is mapped into the guest OS at runtime by Xen; it is not actually part of the guest's critical section, at least not at compile time. So having a fixup table for the code at hypercall_page would make such a check fail (mismatch). Following the discussion above, we can easily avoid such a fixup table by masking events.

I don't know if this makes sense at all. :-)
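
Concretely, what I have in mind for the tail of the return path is something like this (only a sketch, not my actual patch; XEN_BLOCK_EVENTS is assumed to be the macro that sets evtchn_upcall_mask, and the pushed flag value is just a placeholder):

use_hypercall_iret:
        XEN_BLOCK_EVENTS(%rsi)  # mask events: safe, since do_iret recomputes
                                # the upcall mask from the saved rflags.IF,
                                # and required, since re-entrance cannot be
                                # fixed up once we run inside hypercall_page
        pushq   $0              # HYPERVISOR_iret flags argument (placeholder)
        jmp     hypercall_page + (__HYPERVISOR_iret * 32)

The nice property is that the slow path becomes self-contained: nothing after XEN_BLOCK_EVENTS needs to appear in the fixup table.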
Xu Zhang, le Tue 13 Nov 2012 19:49:27 -0600, a écrit :
> 1. if events are already disabled, it doesn't hurt to mask them again;
> 2. if events are enabled, we disable them and jump to hypercall_page to
>    make a hypercall iret, which eventually calls do_iret.
>
> In do_iret, line 309:
>
>         /* Restore upcall mask from supplied EFLAGS.IF. */
>         vcpu_info(v, evtchn_upcall_mask) = !(iret_saved.rflags & X86_EFLAGS_IF);

Ah, right. Disabling events just before the jmp seems all right to me then.

> Correct me if I am wrong, but I think hypercall_page is mapped into the
> guest OS at runtime by Xen; it is not actually part of the guest's
> critical section, at least not at compile time.

Sure. I meant it'd mean a second fixup table; but who knows what code is there, it could be tampering with the stack.

> Following the discussion above, we can easily avoid such a fixup table
> by masking events.

Completely.

Samuel
On Sat, 2012-11-10 at 13:52 +0000, Samuel Thibault wrote:
> > +        .byte 0x78,0x78,0x78,0x78,0x78  # jmp hypercall_page + (__HYPERVISOR_iret * 32)
>
> Here we would also need a fixup table for the code at hypercall_page!
> A nicer fix would be to inline the hypercall code here. That said, I
> have to say I don't know any detail about the NMI flag: I don't see it
> defined in the Intel manual, and I don't see it being set in the Xen
> hypervisor.

It's a "software defined" bit (i.e. stolen out of eflags :-() used in the classic-Xen Linux guest to indicate that the return must go via HYPERVISOR_iret (rather than a direct iret), I think in order to let the hypervisor implement the correct semantics w.r.t. NMI masking. The hypervisor itself doesn't actually know anything about the bit, and the guest could implement this some other way if it wanted (doesn't 64-bit Mini-OS *always* use HYPERVISOR_iret?).

> Unless somebody knows more about it, I would assume that it
> currently never happens,

That's probably a pretty good assumption: you would expect NMIs only in dom0 (real NMIs getting forwarded) or in a guest which you explicitly sent an NMI to via the toolstack, which I doubt you would do to an unsuspecting Mini-OS domain (not in the expectation of anything good happening, at least).

Ian.
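
P.S. From memory, the classic-Xen Linux macro does roughly this (paraphrased, not copied from any particular tree; NMI_MASK and the 2*8(%rsp) offset of the saved rflags are assumptions about the frame layout):

.macro HYPERVISOR_IRET flag
        testl   $NMI_MASK, 2*8(%rsp)    # software bit stashed in the saved rflags
        jnz     2f                      # set => must return via Xen
1:      iretq                           # fast path: plain iret
2:      andl    $~NMI_MASK, 2*8(%rsp)   # clear the software bit
        pushq   $\flag
        jmp     hypercall_page + (__HYPERVISOR_iret * 32)
.endm

So the bit only tells the guest's own exit path which way to go; Xen itself never looks at it.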
On Sun, 2012-11-18 at 17:43 +0000, Samuel Thibault wrote:
> > Correct me if I am wrong, but I think hypercall_page is mapped into the
> > guest OS at runtime by Xen; it is not actually part of the guest's
> > critical section, at least not at compile time.
>
> Sure. I meant it'd mean a second fixup table; but who knows what code is
> there, it could be tampering with the stack.

In practice it isn't. Since the hypercall page is a black box to the guest, I don't think it is really reasonable for it to be messing with the stack without making its own arrangements for correctness of nesting etc.

Ian.