David Vrabel
2012-Oct-17  11:42 UTC
[PATCH] xen/x86: don't corrupt %eip when returning from a signal handler
From: David Vrabel <david.vrabel@citrix.com>

In 32-bit guests, if a userspace process has %eax == -ERESTARTSYS
(-512) or -ERESTARTNOINTR (-513) when it is interrupted by an event
/and/ the process has a pending signal, then %eip (and %eax) are
corrupted when returning to the main process after handling the
signal. The application may then crash with SIGSEGV or SIGILL, or it
may have subtly incorrect behaviour (depending on what instruction it
returned to).

This occurs because handle_signal() incorrectly thinks that there
is a system call that needs to be restarted, so it adjusts %eip and
%eax to re-execute the system call instruction (even though user space
had not made a system call).

If %eax == -ERESTARTNOHAND (-514) or -ERESTART_RESTARTBLOCK (-516),
then handle_signal() only corrupted %eax (by setting it to -EINTR).
This may cause the application to crash or have incorrect behaviour.

handle_signal() assumes that regs->orig_ax >= 0 means a system call, so
any kernel entry point that is not for a system call must push a
negative value for orig_ax. For example, for physical interrupts on
bare metal the inverse of the vector is pushed, and page_fault() sets
regs->orig_ax to -1, overwriting the hardware-provided error code.

xen_hypervisor_callback() was incorrectly pushing 0 for orig_ax
instead of -1. For consistency, we also change
xen_failsafe_callback().

Classic Xen kernels pushed %eax, which works as %eax cannot be both
non-negative and -ERESTARTSYS (etc.), but using -1 avoids the
additional tests in handle_signal().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: stable@kernel.org
---
 arch/x86/kernel/entry_32.S |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 2c63407..6a19e66 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1042,7 +1042,7 @@ ENTRY(xen_sysenter_target)
 
 ENTRY(xen_hypervisor_callback)
 	CFI_STARTPROC
-	pushl_cfi $0
+	pushl_cfi $-1 /* orig_ax = -1 => not a system call */
 	SAVE_ALL
 	TRACE_IRQS_OFF
 
@@ -1078,7 +1078,7 @@ ENDPROC(xen_hypervisor_callback)
 # We distinguish between categories by maintaining a status value in EAX.
 ENTRY(xen_failsafe_callback)
 	CFI_STARTPROC
-	pushl_cfi %eax
+	pushl_cfi $-1 /* orig_ax = -1 => not a system call */
 	movl $1,%eax
 1:	mov 4(%esp),%ds
 2:	mov 8(%esp),%es
-- 
1.7.2.5
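
The restart decision described in the commit message can be modelled in a few lines of C. This is only a sketch for following the argument, not the kernel's code: struct model_regs, model_handle_signal() and the sa_restart flag are hypothetical stand-ins, and the numeric codes are the ones quoted in the patch description.

```c
#include <stdbool.h>

/* Restart codes as quoted in the patch description (model-local names). */
enum {
    MODEL_EINTR                 = 4,
    MODEL_ERESTARTSYS           = 512,
    MODEL_ERESTARTNOINTR        = 513,
    MODEL_ERESTARTNOHAND        = 514,
    MODEL_ERESTART_RESTARTBLOCK = 516,
};

/* Hypothetical, reduced stand-in for the saved user registers. */
struct model_regs {
    long ip;      /* saved %eip */
    long ax;      /* saved %eax */
    long orig_ax; /* value pushed by the kernel entry point */
};

/*
 * Simplified model of the syscall-restart check: it only runs when
 * orig_ax >= 0, which is taken to mean "entered via a system call".
 * sa_restart stands in for the handler having SA_RESTART set.
 */
void model_handle_signal(struct model_regs *regs, bool sa_restart)
{
    if (regs->orig_ax < 0)
        return; /* not a system call: leave the registers alone */

    switch (regs->ax) {
    case -MODEL_ERESTART_RESTARTBLOCK:
    case -MODEL_ERESTARTNOHAND:
        regs->ax = -MODEL_EINTR;   /* only %eax is rewritten */
        break;
    case -MODEL_ERESTARTSYS:
        if (!sa_restart) {
            regs->ax = -MODEL_EINTR;
            break;
        }
        /* fall through */
    case -MODEL_ERESTARTNOINTR:
        regs->ax = regs->orig_ax;  /* reload the "syscall number" */
        regs->ip -= 2;             /* rewind over a 2-byte syscall insn */
        break;
    }
}
```

With orig_ax == 0 and ax == -513 the model rewinds ip by two bytes and overwrites ax, which is the corruption the patch describes. With orig_ax == -1 the same registers pass through untouched.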
David Vrabel
2012-Oct-17  11:48 UTC
Re: [PATCH] xen/x86: don't corrupt %eip when returning from a signal handler
On 17/10/12 12:42, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> In 32 bit guests, if a userspace process has %eax == -ERESTARTSYS
> (-512) or -ERESTARTNOINTR (-513) when it is interrupted by an event
> /and/ the process has a pending signal then %eip (and %eax) are
> corrupted when returning to the main process after handling the
> signal. The application may then crash with SIGSEGV or a SIGILL or it
> may have subtly incorrect behaviour (depending on what instruction it
> returned to).

The following test program shows the bug. It will receive the signal
while in the infinite loop and return to the preceding int 3
instruction.

Big thanks to Frediano for producing the test program and the majority
of the effort in tracking down this bug.

David

8<--------------------------
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <assert.h>
#include <sys/time.h>

static void handler(int sig)
{
	static unsigned count = 0;

	/* Exit cleanly once 60,000 signals were handled without a crash. */
	if (++count == 60 * 1000)
		exit(0);
}

int main(void)
{
	struct sigaction act;

	// set signal
	sigfillset(&act.sa_mask);
	act.sa_flags = 0;
	act.sa_handler = handler;
	int err = sigaction(SIGALRM, &act, NULL);
	assert(!err);

	// set timer (SIGALRM every 1000 microseconds)
	struct itimerval ival = { { 0, 1000 }, { 0, 1000 } };
	err = setitimer(ITIMER_REAL, &ival, NULL);
	assert(!err);

#if !defined(__x86_64__) && !defined(__i386__)
# error This code works only on Intel architecture!
#endif

	// wait for a core !!
	// Load -ERESTARTNOINTR (-513) into the accumulator, then spin.
	asm(
#ifdef __x86_64__
	"	mov $-513, %rax\n"
#else
	"	mov $-513, %eax\n"
#endif
	"	jmp infinite\n"
	"	int $3\n"
	"	int $3\n"
	"	int $3\n"
	"	int $3\n"
	"infinite:\n"
	"	jmp infinite\n"
	);

	return 0;
}
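
Why the process ends up on an int 3 can be checked with a little byte arithmetic. The sketch below assumes the assembler emits the one-byte 0xcc encoding for each "int $3" and the two-byte short form (0xeb 0xfe) for the jump to itself. It also assumes that the bogus restart rewinds %eip by two bytes, the length of a system call instruction. The array and function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding of the tail of the asm block in the test program. */
const uint8_t model_code[] = {
    0xcc, 0xcc, 0xcc, 0xcc, /* four times "int $3" (one byte each)   */
    0xeb, 0xfe,             /* infinite: jmp infinite (rel8 == -2)   */
};

enum {
    MODEL_LOOP_OFFSET = 4, /* offset of the "infinite" label          */
    MODEL_REWIND      = 2, /* bytes subtracted from %eip on "restart" */
};

/* Offset at which user space resumes after the bogus restart. */
size_t model_resume_offset(void)
{
    /* The interrupted %eip always points at the jmp-to-self. */
    return MODEL_LOOP_OFFSET - MODEL_REWIND;
}

/* Opcode byte found at the resume offset. */
uint8_t model_resume_opcode(void)
{
    return model_code[model_resume_offset()];
}
```

Under these assumptions the resume offset falls inside the int 3 padding, so the process takes a breakpoint trap instead of continuing to spin.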
Ian Campbell
2012-Oct-17  11:52 UTC
Re: [PATCH] xen/x86: don't corrupt %eip when returning from a signal handler
On Wed, 2012-10-17 at 12:42 +0100, David Vrabel wrote:
> [...]
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich
2012-Oct-17  12:07 UTC
Re: [PATCH] xen/x86: don't corrupt %eip when returning from a signal handler
>>> On 17.10.12 at 13:42, David Vrabel <david.vrabel@citrix.com> wrote:
> [...]
> xen_hypervisor_callback() was incorrectly pushing 0 for orig_ax
> instead of -1. For consistency, we also change
> xen_failsafe_callback().

Is this really just for consistency? There is a way for the failsafe
callback to continue to ret_from_exception, and I would think that
the same situation could arise there (and for the x86-64 case too).

Jan

> [...]
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
David Vrabel
2012-Oct-17  12:19 UTC
Re: [PATCH] xen/x86: don't corrupt %eip when returning from a signal handler
On 17/10/12 13:07, Jan Beulich wrote:
>>>> On 17.10.12 at 13:42, David Vrabel <david.vrabel@citrix.com> wrote:
>> [...]
>> xen_hypervisor_callback() was incorrectly pushing 0 for orig_ax
>> instead of -1. For consistency, we also change
>> xen_failsafe_callback().
>
> Is this really just for consistency? There is a way for the failsafe
> callback to continue to ret_from_exception, and I would think that
> the same situation could arise there (and for the x86-64 case too).

The 32-bit xen_failsafe_callback was using %eax for orig_ax which is
safe (see comment on classic kernel behaviour below).

We couldn't repro the issue in 64-bit guests and looking at entry_64.S
xen_hypervisor_callback is correct (see the zeroentry macro).

I must admit to not really being able to follow what
xen_failsafe_callback is doing, but it does look like that last pushq
before the SAVE_ALL should be pushq_cfi $-1 as well. Do you agree?

David

>> Classic Xen kernels pushed %eax which works as %eax cannot be both
>> non-negative and -ERESTARTSYS (etc.), but using -1 avoids the
>> additional tests in handle_signal().
>> [...]
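
The claim that pushing %eax is safe while pushing 0 is not can be checked by brute force. The sketch below is an illustration under stated assumptions only: the policy names and helper functions are hypothetical, the restart codes are the four quoted in the patch description, and "misread" means that orig_ax >= 0 while %eax holds one of those codes.

```c
#include <stdbool.h>

/* What an entry point stores in the orig_ax slot (hypothetical labels). */
enum model_policy {
    MODEL_PUSH_ZERO,      /* pre-patch xen_hypervisor_callback: pushl $0 */
    MODEL_PUSH_EAX,       /* pre-patch xen_failsafe_callback: pushl %eax */
    MODEL_PUSH_MINUS_ONE, /* patched behaviour: pushl $-1                */
};

/* Restart codes named in the patch description. */
bool model_is_restart_code(long ax)
{
    return ax == -512 || ax == -513 || ax == -514 || ax == -516;
}

/* orig_ax value the given policy records for a user-space %eax. */
long model_orig_ax(enum model_policy policy, long ax)
{
    switch (policy) {
    case MODEL_PUSH_ZERO:
        return 0;
    case MODEL_PUSH_EAX:
        return ax;
    default:
        return -1;
    }
}

/*
 * Count the user-space %eax values in [-1024, 1024] that would be
 * misread as "system call returning a restart code".
 */
int model_count_misread(enum model_policy policy)
{
    int count = 0;

    for (long ax = -1024; ax <= 1024; ax++) {
        if (model_orig_ax(policy, ax) >= 0 && model_is_restart_code(ax))
            count++;
    }
    return count;
}
```

Only the push-zero policy leaves any %eax value open to misinterpretation. Pushing %eax is safe because a value cannot be both non-negative and equal to a negative restart code, and pushing -1 is safe without relying on that second comparison at all.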
Jan Beulich
2012-Oct-17  12:28 UTC
Re: [PATCH] xen/x86: don't corrupt %eip when returning from a signal handler
>>> On 17.10.12 at 14:19, David Vrabel <david.vrabel@citrix.com> wrote:
> On 17/10/12 13:07, Jan Beulich wrote:
>> Is this really just for consistency? There is a way for the failsafe
>> callback to continue to ret_from_exception, and I would think that
>> the same situation could arise there (and for the x86-64 case too).
>
> The 32-bit xen_failsafe_callback was using %eax for orig_ax which is
> safe (see comment on classic kernel behaviour below).

Ah, yes, I didn't spot that %eax was used there instead of 0 (and
other than in the classic Xen kernel, which I will want to fix too).

> We couldn't repro the issue in 64-bit guests and looking at entry_64.S
> xen_hypervisor_callback is correct (see the zeroentry macro).

Yes, the normal callback path isn't affected there.

> I must admit to not really being able to follow what
> xen_failsafe_callback is doing, but it does look like that last pushq
> before the SAVE_ALL should be pushq_cfi $-1 as well. Do you agree?

That's what I was trying to hint at. Feel free to put my ack on the
patch if you re-submit with that adjustment (and ideally the commit
message adjusted too).

Jan

>>> Classic Xen kernels pushed %eax which works as %eax cannot be both
>>> non-negative and -ERESTARTSYS (etc.), but using -1 avoids the
>>> additional tests in handle_signal().
>>> [...]