Zhu Yanhai
2013-Nov-07 04:15 UTC
Xen PV ABI on FPU doesn't match the pvops kernel FPU code, leading to serious memory data corruption
Hi guys,

The Xen PV ABI clears CR0.TS before trapping into the PV guest kernel's exception handler, so the handler in the guest kernel starts running with CR0.TS clear (which is different from bare metal). In Xenolinux and in mainline Linux kernels before 2.6.26 everything is fine, since they neither sleep nor enable interrupts during the handler. However, the current mainline pvops kernel has a scheduling window open within it. Please see the code below:

    void math_state_restore(void)
    {
            struct task_struct *tsk = current;

            if (!tsk_used_math(tsk)) {
                    local_irq_enable();     <<<< Here it might open a schedule window
                    /*
                     * does a slab alloc which can sleep
                     */
                    if (init_fpu(tsk)) {
                            /*
                             * ran out of memory!
                             */
                            do_group_exit(SIGKILL);
                            return;
                    }
                    local_irq_disable();    <<<< Here it closes
            }

            __thread_fpu_begin(tsk);

            /*
             * Paranoid restore. send a SIGSEGV if we fail to restore the state.
             */
            if (unlikely(restore_fpu_checking(tsk))) {
                    drop_init_fpu(tsk);
                    force_sig(SIGSEGV, tsk);
                    return;
            }

            tsk->fpu_counter++;
    }

This can cause serious data corruption in Xen PV guests (including the dom0 kernel). We have hit this issue both in stress tests and on production systems. Please take a look at http://comments.gmane.org/gmane.comp.emulators.xen.devel/176970 for more discussion and test cases.

Ian and George, we have confirmed this is the root cause of http://comments.gmane.org/gmane.comp.emulators.xen.devel/175491

The Xen ABI is part of history and can't be changed, and simply adding an stts()/clts() pair around the enabling/disabling of interrupts is a really ugly hack. Maintainers, any thoughts?

--
Thanks,
Zhu Yanhai
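
To make the window easier to reason about, here is a small user-space toy model of it. It is only a sketch under assumptions, not kernel code: the structs, the single-integer "register", and the names fpu_access(), switch_out() and run_scenario() are all hypothetical, and the exact corruption path it shows (a task using the FPU without trapping, so its registers are never saved) is my reading of the problem rather than something confirmed above. The guard flag models the stts()/clts() pair mentioned as the "ugly hack".

```c
#include <stdbool.h>

/* Toy CPU: 'ts' stands in for CR0.TS, 'reg' for the whole FPU register file. */
struct cpu  { bool ts; int reg; };
/* Toy task: 'has_fpu' means the kernel knows this task's state is live in 'reg'. */
struct task { bool has_fpu; int saved; };

/* Any FPU access: with TS set it traps first, and the (modelled) handler
 * clears TS, restores the task's saved state and records ownership. */
static void fpu_access(struct cpu *c, struct task *t)
{
    if (c->ts) {
        c->ts = false;
        c->reg = t->saved;
        t->has_fpu = true;
    }
}

/* Context switch away from 't': state is saved only if the kernel
 * believes 't' owns the FPU; otherwise TS is left untouched. */
static void switch_out(struct cpu *c, struct task *t)
{
    if (t->has_fpu) {
        t->saved = c->reg;
        t->has_fpu = false;
        c->ts = true;
    }
}

/* Returns the value task B reads back after having written 42. */
int run_scenario(bool guard)
{
    struct cpu c = { .ts = true, .reg = 0 };
    struct task a = { false, 0 };   /* first-time FPU user, takes the trap */
    struct task b = { false, 1 };   /* has older saved state (value 1) */

    c.ts = false;                   /* Xen PV ABI: TS cleared before A's handler runs */
    if (guard)
        c.ts = true;                /* hypothetical stts() before enabling interrupts */

    /* Window: A sleeps in the handler, B is scheduled and uses the FPU. */
    fpu_access(&c, &b);
    c.reg = 42;                     /* B writes its register */
    switch_out(&c, &b);             /* B is preempted again */

    if (guard)
        c.ts = false;               /* hypothetical clts() after disabling interrupts */

    /* A resumes, gets a fresh FPU state, and is later switched out. */
    c.reg = 0;
    a.has_fpu = true;
    switch_out(&c, &a);

    /* B runs again and reads its register back. */
    fpu_access(&c, &b);
    return c.reg;
}
```

In this model, run_scenario(false) gives B back its stale value instead of the 42 it wrote, because B never trapped and so was never marked as owning the FPU. run_scenario(true) returns 42, since the modelled stts() forces B through the trap path. A real kernel has more state than this (preloading, per-CPU ownership tracking), so treat it as an illustration of the ordering problem only.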