Georgi Mungov
2005-Feb-22 10:59 UTC
[Xen-devel] Floating Point Exception when compiling w/ gcc
Hi,

I've twice received a floating point exception on different virtual machines during compilation. The first time it was with mysql, the second with kernel 2.6.0. After I ran "make" again the compilation finished without problems.

I'm using Xen 2.0.4 built from sources on kernel 2.6.10 installed on standard slackware 9.1. The machine is a PIII w/ 512MB ram running 4 VMs, each with 64MB ram. The two errors happened on different VMs.

Could this be some Xen issue?

Regards,
Georgi Mungov

-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
Ian Pratt
2005-Feb-22 16:07 UTC
RE: [Xen-devel] Floating Point Exception when compiling w/ gcc
> Hi,
> I've twice received a floating point exception on different virtual machines
> during compilation. The first time it was with mysql, the second with kernel
> 2.6.0. After I ran "make" again the compilation finished without problems.
>
> I'm using Xen 2.0.4 built from sources on kernel 2.6.10 installed on standard
> slackware 9.1. The machine is a PIII w/ 512MB ram running 4 VMs, each with 64MB
> ram. The two errors happened on different VMs.
>
> Could this be some Xen issue?

It's not totally impossible...

Are you running an Xserver? What modules does it have loaded?
Are you using the default kernel config?
Please can you confirm this is a PIII and not an Athlon.

Thanks,
Ian
Stephan Diestelhorst
2005-Feb-22 16:21 UTC
Re: [Xen-devel] Floating Point Exception when compiling w/ gcc
> Hi,
> I've twice received a floating point exception on different virtual machines
> during compilation. The first time it was with mysql, the second with kernel
> 2.6.0. After I ran "make" again the compilation finished without problems.
>
> I'm using Xen 2.0.4 built from sources on kernel 2.6.10 installed on standard
> slackware 9.1. The machine is a PIII w/ 512MB ram running 4 VMs, each with 64MB
> ram. The two errors happened on different VMs.
>
> Could this be some Xen issue?
>
> Regards,
> Georgi Mungov

Were you running X at that time? If so, could you have a look (in XFree86.log or xorg.log) at whether it uses the "int10" and/or "vesa" library?

Thanks,
Stephan
Georgi Mungov
2005-Feb-22 16:36 UTC
Re: [Xen-devel] Floating Point Exception when compiling w/ gcc
On Tuesday 22 February 2005 18:07, Ian Pratt wrote:
> > Hi,
> > I've twice received a floating point exception on different virtual machines
> > during compilation. The first time it was with mysql, the second with kernel
> > 2.6.0. After I ran "make" again the compilation finished without problems.
> >
> > I'm using Xen 2.0.4 built from sources on kernel 2.6.10 installed on standard
> > slackware 9.1. The machine is a PIII w/ 512MB ram running 4 VMs, each with 64MB
> > ram. The two errors happened on different VMs.
> >
> > Could this be some Xen issue?
>
> It's not totally impossible...
> Are you running an Xserver? What modules does it have loaded?

I'm not using X on the machine.

> Are you using the default kernel config?

The default + smb support.

> Please can you confirm this is a PIII and not an Athlon.

Sure:

root@vm4:/usr/src/linux-2.6.0# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 10
cpu MHz         : 940.641
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : yes
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu tsc msr pae mce cx8 apic mca cmov pat pse36 mmx fxsr sse
bogomips        : 627.50

The CPU is overclocked but there weren't such problems so far. I wanted to test how stable Xen is by running kernel compilation 100 times on 4 VMs simultaneously, and the 4th VM failed on the second compilation. The other 3 are still running without problems.

> Thanks,
> Ian
Ian Pratt
2005-Feb-22 17:10 UTC
RE: [Xen-devel] Floating Point Exception when compiling w/ gcc
Were they always showing up as floating point exceptions, or sometimes segv?

Can you do some floating point specific tests instead of kernel builds? E.g. using fptest/paranoia.

Robin Green was seeing something similar to this, but it was on an Athlon and required X to be running. I'm not aware of any other reports of this category.

BTW: Do your VMs have swap? Are you sure the memory in the machine is good?

Ian

> The CPU is overclocked but there weren't such problems so far. I wanted to test
> how stable Xen is by running kernel compilation 100 times on 4 VMs
> simultaneously, and the 4th VM failed on the second compilation. The other 3
> are still running without problems.
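[Editorial sketch] The suggestion above, to run a floating-point-specific test rather than a kernel build, could look roughly like the following. This is not fptest or paranoia, only a minimal illustration of the idea: repeat a deterministic computation and count results that differ from the first run. The function names fp_workload and count_fp_mismatches are hypothetical and do not come from the thread.

```c
#include <assert.h>

/*
 * Deterministic floating-point workload: a partial sum of the series 1/k^2.
 * The accumulator is volatile so that every step is rounded to a double in
 * memory, which helps keep repeated runs bit-for-bit comparable.
 */
double fp_workload(int terms)
{
    assert(terms > 0);
    volatile double sum = 0.0;
    for (int k = 1; k <= terms; k++)
        sum += 1.0 / ((double)k * (double)k);
    return sum;
}

/*
 * Repeat the workload and count the rounds whose result differs from the
 * first one. Any nonzero count means the same computation produced two
 * different answers, which points at hardware or FPU state handling.
 */
int count_fp_mismatches(int rounds, int terms)
{
    const double reference = fp_workload(terms);
    int mismatches = 0;
    for (int round = 0; round < rounds; round++) {
        if (fp_workload(terms) != reference)
            mismatches++;
    }
    return mismatches;
}
```

One way to use it would be to call count_fp_mismatches from a small main() in each VM, in a long-running loop, while the other VMs are kept busy. A nonzero count does not by itself distinguish a Xen bug from bad memory or an overclocked CPU, which is why the swap and memory questions above still matter.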
Robin Green
2005-Feb-22 19:26 UTC
Re: [Xen-devel] Floating Point Exception when compiling w/ gcc
On Tue, 22 Feb 2005, Georgi Mungov wrote:
> Hi,
> I've twice received a floating point exception on different virtual machines
> during compilation. The first time it was with mysql, the second with kernel
> 2.6.0. After I ran "make" again the compilation finished without problems.

I found a floating-point bug in xen-unstable, which, since I don't know what causes it, might also be present in xen 2.0.4. (See thread "reproducable data corruption in xen-unstable".)

HOWEVER, I did create a workaround patch, which effectively solves the problem for me. Remember to apply this patch both to xen0 and xenU kernels. Here it is:

--- arch/xen/i386/kernel/process.c.orig 2005-02-12 03:39:44.000000000 +0000
+++ arch/xen/i386/kernel/process.c      2005-02-13 02:46:03.000000000 +0000
@@ -563,6 +563,8 @@
 	if (prev_p->thread_info->status & TS_USEDFPU) {
 		save_init_fpu(prev_p);
 		queue_multicall0(__HYPERVISOR_fpu_taskswitch);
+	} else {
+		stts ();
 	}
 
 	/*

> I'm using Xen 2.0.4 built from sources on kernel 2.6.10 installed on standard
> slackware 9.1. The machine is a PIII w/ 512MB ram running 4 VMs, each with 64MB
> ram. The two errors happened on different VMs.

My machine is an Athlon XP, so this shows that it isn't some strange AMD-only bug (unless of course you have found a _different_ floating-point bug).

Thanks for the bug report!

-- Robin
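[Editorial sketch] To illustrate why setting the TS flag on every task switch can act as a workaround, here is a toy user-space model of lazy FPU switching. It is not kernel or Xen code. The struct layouts, the function names, and in particular the "stray clear" of TS are assumptions made for illustration; the thread states that the real cause of the incorrectly cleared flag was unknown at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one FPU register, two tasks. All names are hypothetical. */
struct task {
    double saved_fpu; /* FPU state saved for this task */
    bool used_fpu;    /* stands in for the TS_USEDFPU status bit */
};

struct cpu {
    double fpu_reg;       /* the live FPU register */
    bool ts;              /* stands in for CR0.TS: trap on next FPU use */
    struct task *current; /* task currently running */
};

/* Models the trap taken on the first FPU use while TS is set. */
static void fpu_trap(struct cpu *c)
{
    c->ts = false;                      /* clear TS */
    c->fpu_reg = c->current->saved_fpu; /* reload this task's own state */
    c->current->used_fpu = true;
}

void fpu_write(struct cpu *c, double value)
{
    if (c->ts)
        fpu_trap(c);
    c->fpu_reg = value;
}

double fpu_read(struct cpu *c)
{
    if (c->ts)
        fpu_trap(c);
    return c->fpu_reg;
}

/* Task switch; always_set_ts models the added "else { stts (); }" branch. */
void task_switch(struct cpu *c, struct task *next, bool always_set_ts)
{
    struct task *prev = c->current;
    assert(next != NULL);
    if (prev->used_fpu) {
        prev->saved_fpu = c->fpu_reg; /* save the outgoing task's state */
        prev->used_fpu = false;
        c->ts = true;
    } else if (always_set_ts) {
        c->ts = true; /* workaround: force a trap for the next FPU user */
    }
    c->current = next;
}

/* Returns what task A reads back after task B ran with TS wrongly clear. */
double run_scenario(bool always_set_ts)
{
    struct task a = { 0.0, false };
    struct task b = { 0.0, false };
    struct cpu c = { 0.0, true, &a };

    fpu_write(&c, 1.5);                 /* A traps once, then stores 1.5 */
    task_switch(&c, &b, always_set_ts); /* A's 1.5 is saved, TS is set */
    c.ts = false;                       /* ASSUMED fault: TS cleared behind B's back */
    fpu_write(&c, 9.0);                 /* B writes without trapping or being marked */
    task_switch(&c, &a, always_set_ts); /* B is not marked as an FPU user */
    return fpu_read(&c);                /* does A see its own 1.5 again? */
}
```

In this model run_scenario(false) returns 9.0, B's stale value, while run_scenario(true) returns 1.5 because the forced trap reloads A's saved state. Note that B's own value is still lost in both cases, which is consistent with describing the patch as a workaround rather than a fix.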
Keir Fraser
2005-Feb-22 23:47 UTC
Re: [Xen-devel] Floating Point Exception when compiling w/ gcc
On 22 Feb 2005, at 19:26, Robin Green wrote:
> I found a floating-point bug in xen-unstable, which, since I don't know
> what causes it, might also be present in xen 2.0.4. (See thread
> "reproducable data corruption in xen-unstable".)
>
> HOWEVER, I did create a workaround patch, which effectively solves the
> problem for me. Remember to apply this patch both to xen0 and xenU
> kernels.

I'd like to get to the bottom of this problem. I've checked in some cleanups to FPU code in both -testing and -unstable trees (the cleanups are particularly extensive in the unstable tree). Can you see if the problem still occurs?

If it does, it would be useful to see at task-switch time, where you see that TS is incorrectly cleared, whether Xen per-domain flags EDF_GUEST_STTS and EDF_USEDFPU are true or false. You'll have to hack in a call-down into Xen to print out the flags' status -- perhaps hack into the 'xen_version' hypercall?

-- Keir
Robin Green
2005-Feb-23 01:17 UTC
Re: [Xen-devel] Floating Point Exception when compiling w/ gcc
On Tue, 22 Feb 2005, Keir Fraser wrote:
> On 22 Feb 2005, at 19:26, Robin Green wrote:
>
>> I found a floating-point bug in xen-unstable, which, since I don't know
>> what causes it, might also be present in xen 2.0.4. (See thread
>> "reproducable data corruption in xen-unstable".)
>>
>> HOWEVER, I did create a workaround patch, which effectively solves the
>> problem for me. Remember to apply this patch both to xen0 and xenU
>> kernels.
>
> I'd like to get to the bottom of this problem. I've checked in some cleanups
> to FPU code in both -testing and -unstable trees (the cleanups are
> particularly extensive in the unstable tree). Can you see if the problem
> still occurs?

When did you check in those changes?

-- Robin
Keir Fraser
2005-Feb-23 07:02 UTC
Re: [Xen-devel] Floating Point Exception when compiling w/ gcc
On 23 Feb 2005, at 01:17, Robin Green wrote:
>> I'd like to get to the bottom of this problem. I've checked in some
>> cleanups to FPU code in both -testing and -unstable trees (the cleanups
>> are particularly extensive in the unstable tree). Can you see if the
>> problem still occurs?
>
> When did you check in those changes?

Just before sending the email. They were in public xen.bkbits.net at that point but the changes don't get into our tarball snapshots until our overnight build is completed.

-- Keir
Robin Green
2005-Mar-05 19:22 UTC
[Xen-devel] Re: Floating Point Exception when compiling w/ gcc
On Tue, 22 Feb 2005 23:47:37 +0000, Keir Fraser wrote:
> On 22 Feb 2005, at 19:26, Robin Green wrote:
>
>> I found a floating-point bug in xen-unstable, which, since I don't know
>> what causes it, might also be present in xen 2.0.4. (See thread
>> "reproducable data corruption in xen-unstable".)
>>
>> HOWEVER, I did create a workaround patch, which effectively solves the
>> problem for me. Remember to apply this patch both to xen0 and xenU
>> kernels.
>
> I'd like to get to the bottom of this problem. I've checked in some
> cleanups to FPU code in both -testing and -unstable trees (the cleanups
> are particularly extensive in the unstable tree). Can you see if the
> problem still occurs?

I just tried xen-unstable from 20050302 (I was waiting for it to be packaged in the fedora development repo), and it's much better - thanks!

However, there is still a less frequent problem with ptrace (which also occurred even when I'd applied my workaround patch). When running firefox under gdb only, display corruption and odd crashes in floating-point code are seen - which are the same symptoms as I had before.

So I assume that something is still wrong with the floating-point save/restore in the case where a process is being ptraced.

Not nearly such a big deal, but annoying when you are trying to debug something.

-- Robin
Ian Pratt
2005-Mar-05 19:37 UTC
RE: [Xen-devel] Re: Floating Point Exception when compiling w/ gcc
> I just tried xen-unstable from 20050302 (I was waiting for it to be
> packaged in the fedora development repo), and it's much better - thanks!

Please could you try out 2.0-testing.bk and report back. If it works for you, I'll declare it 2.0.5.

We never actually managed to produce a testcase, so we're not sure the FP issue you observed is actually fixed.

> However, there is still a less frequent problem with ptrace (which also
> occurred even when I'd applied my workaround patch). When running
> firefox under gdb only, display corruption and odd crashes in
> floating-point code are seen - which are the same symptoms as I had before.
>
> So I assume that something is still wrong with the floating-point
> save/restore in the case where a process is being ptraced.

Hmmm. Please can you try producing a simple set of instructions that exhibits the bug reliably.

Thanks,
Ian
Keir Fraser
2005-Mar-06 09:45 UTC
Re: [Xen-devel] Re: Floating Point Exception when compiling w/ gcc
On 5 Mar 2005, at 19:22, Robin Green wrote:
> I just tried xen-unstable from 20050302 (I was waiting for it to be
> packaged in the fedora development repo), and it's much better -
> thanks!
>
> However, there is still a less frequent problem with ptrace (which also
> occurred even when I'd applied my workaround patch). When running
> firefox under gdb only, display corruption and odd crashes in
> floating-point code are seen - which are the same symptoms as I had
> before.
>
> So I assume that something is still wrong with the floating-point
> save/restore in the case where a process is being ptraced.

Does this bug still exhibit if you add in your patch to always set CR0_TS on task switch?

-- Keir
Robin Green
2005-Mar-07 15:03 UTC
RE: [Xen-devel] Re: Floating Point Exception when compiling w/ gcc
On Sat, 5 Mar 2005, Ian Pratt wrote:
>> I just tried xen-unstable from 20050302 (I was waiting for it to be
>> packaged in the fedora development repo), and it's much better - thanks!
>
> Please could you try out 2.0-testing.bk and report back. If it works for
> you, I'll declare it 2.0.5.

Yes, it works for me!

>> However, there is still a less frequent problem with ptrace (which also
>> occurred even when I'd applied my workaround patch). When running
>> firefox under gdb only, display corruption and odd crashes in
>> floating-point code are seen - which are the same symptoms as I had before.
>>
>> So I assume that something is still wrong with the floating-point
>> save/restore in the case where a process is being ptraced.
>
> Hmmm. Please can you try producing a simple set of instructions that
> exhibits the bug reliably.

I couldn't find a reliable test case, but I think I've fixed this, too. It appears to have been a typo in this patch:

http://sourceforge.net/mailarchive/message.php?msg_id=10775274

which has not yet been applied to xen-unstable, because it's a patch for linux 2.6.11. In the last change of the patch,

    if (tsk_used_math(tsk))

should read

    if (!tsk_used_math(tsk))

The fix is already checked in to fedora CVS.

-- Robin