Adler, Avraham
2013-May-28 22:36 UTC
[Rd] R-3.0.1 - "transient" make check failure in splines-EX.r
Hello.

I seem to be having the same problem that Paul had in the thread titled "[Rd] R 2.15.2 make check failure on 32-bit --with-blas="-lgoto2"" from October of last year <https://stat.ethz.ch/pipermail/r-devel/2012-October/065103.html>. Unfortunately, that thread ended without an answer to his last question.

Briefly, I am trying to compile an Rblas for Windows NT 32bit using OpenBlas (successor to GotoBlas) (Nehalem - corei7), and the compiled version passes all tests except for the "splines-Ex" test in the exact same place that Paul had issues:

~~~~
> stopifnot(identical(ns(x), ns(x, df = 1)),
+           identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)), # not true till 2.15.2
+           !is.null(kk <- attr(ns(x), "knots")), # not true till 1.5.1
+           length(kk) == 0)
Error: identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)) is not TRUE
~~~~

Yet, opening up R and running the actual code shows that the error is transient:

~~~~
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] TRUE
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] TRUE
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] TRUE
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] FALSE
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] TRUE
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] TRUE
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] TRUE
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] TRUE
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] TRUE
> identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
[1] FALSE
~~~~

This is the only error I have on the 32-bit version, I believe (trying to build a blas for 64-bit on SandyBridge is a completely different kettle of fish that is causing me to pull out what little hair I have left), and if it can be solved that would be great.

Thank you,

Avraham
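A minimal sketch of how the intermittent failure might be quantified rather than eyeballed: repeat the comparison many times, count the mismatches, and record the largest absolute difference between the two bases. The vector `x <- women$height` is only an assumed stand-in for whatever `x` the splines example defines, and `n_runs` is an arbitrary choice. On a BLAS that behaves deterministically, the mismatch count would be expected to be zero.

```r
library(splines)

# Assumption: `x` is a stand-in; the real test may use a different vector.
x <- women$height

# Arbitrary number of repetitions for this sketch.
n_runs <- 200L

# TRUE where the two calls agree bit for bit on that repetition.
same <- vapply(seq_len(n_runs), function(i) {
  identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))
}, logical(1))

# Largest absolute element-wise difference seen on each repetition.
diffs <- vapply(seq_len(n_runs), function(i) {
  a <- as.numeric(ns(x, df = 2))
  b <- as.numeric(ns(x, df = 2, knots = NULL))
  max(abs(a - b))
}, numeric(1))

n_fail <- sum(!same)
worst  <- max(diffs)

cat("mismatches:", n_fail, "of", n_runs, "\n")
cat("largest absolute difference:", worst, "\n")
```

If the mismatches are real, the size of `worst` may hint at the cause: differences near machine epsilon would point toward rounding, while large ones would be more consistent with something like uninitialized memory.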
Paul Gilbert
2013-May-30 04:25 UTC
[Rd] R-3.0.1 - "transient" make check failure in splines-EX.r
Avraham

I resolved this only by switching to a different BLAS on the 32 bit machine. Since no one else seemed to be having problems, I considered it possible that there was a hardware issue on my old 32 bit machine. The R check test failed somewhat randomly, but often. Most disconcertingly, it failed because it gives different answers. If you source the code in an R session a few times you have no trouble reproducing this. It gives the impression of an improperly zeroed matrix.

(All this from memory, I'm on the road.)

Paul
Adler, Avraham
2013-May-30 21:17 UTC
[Rd] R-3.0.1 - "transient" make check failure in splines-EX.r
I just found this thread on StackOverflow <http://stackoverflow.com/questions/13871818/ns-varies-for-no-apparent-reason/13878936> which had the same problem with the `ns` call changing with Revolution, and the answer given by tech support was that the MKL BLAS sometimes returns ever-so-slightly different floating point results than a reference BLAS. The problem with that answer is that, if it were true, the results should not change between runs *on the same machine*, but it is another example of this issue. Unfortunately, it seems to dead-end too.

Avraham

-----Original Message-----
From: Adler, Avraham
Sent: Thursday, May 30, 2013 3:12 PM
To: Paul Gilbert
Cc: r-devel at r-project.org
Subject: RE: R-3.0.1 - "transient" make check failure in splines-EX.r

Thank you very much, Paul.

Serendipitously, I seem to have stumbled on a solution. In my parallel (still unsuccessful) attempt to build a BLAS for a 64bit machine (see <https://stat.ethz.ch/pipermail/r-devel/2013-May/066731.html>) I remembered from ATLAS that under the newer Windows there is a divergence from the "standard" ABI (see <http://math-atlas.sourceforge.net/atlas_install/node57.html>).

Looking through the various makefiles under OpenBLAS, I found the following:

    ifeq ($(C_COMPILER), GCC)
    #Test for supporting MS_ABI
    GCCVERSIONGTEQ4 := $(shell expr `$(CC) -dumpversion | cut -f1 -d.` \>= 4)
    GCCVERSIONGT4 := $(shell expr `$(CC) -dumpversion | cut -f1 -d.` \> 4)
    GCCMINORVERSIONGTEQ7 := $(shell expr `$(CC) -dumpversion | cut -f2 -d.` \>= 7)
    ifeq ($(GCCVERSIONGT4), 1)
    # GCC Majar version > 4
    # It is compatible with MSVC ABI.
    CCOMMON_OPT += -DMS_ABI
    endif

I had been building OpenBLAS using gcc 4.8.0, which is ostensibly "compatible" with the newer ABI, but Rtools still lives in 4.6.3, which isn't. Recompiling the BLAS with MinGW32 for 4.6.3 created a file that has passed `make check-all` twice now. I plan on comparing the speed with the ATLAS-based blas, and if it is faster, I hope to e-mail the dll and check results to Dr. Ligges.

I say "stumbled serendipitously" because using the 64 bit version of MinGW 4.6.3 resulted in the same `optim`-based error in `factanal` which I describe in the thread linked to above. I will try using different versions of MinGW or even trying under Cygwin, I guess.

In any event, Paul, I am curious: when you were trying to compile and had the same issue, were you using a different version or generation of gcc in the BLAS compilation than in the R compilation?

Once again, thank you very much.

Avraham Adler
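To make the version tests in the quoted makefile excerpt concrete, here is a small sketch that applies the same three comparisons to the two gcc versions mentioned above. `gcc_version_flags` is a hypothetical helper written for illustration only; it is not part of OpenBLAS, Rtools, or R, and it takes the version as a string instead of calling the compiler.

```r
# Hypothetical helper: mirrors the three `expr` tests from the makefile
# excerpt, using a version string such as "4.6.3".
gcc_version_flags <- function(version_string) {
  parts <- as.integer(strsplit(version_string, ".", fixed = TRUE)[[1]])
  major <- parts[1]  # what `cut -f1 -d.` extracts
  minor <- parts[2]  # what `cut -f2 -d.` extracts
  c(GCCVERSIONGTEQ4      = major >= 4L,
    GCCVERSIONGT4        = major > 4L,
    GCCMINORVERSIONGTEQ7 = minor >= 7L)
}

# One column per compiler version discussed in the message.
flags <- sapply(c("4.6.3", "4.8.0"), gcc_version_flags)
print(flags)
```

Under these tests the two compilers differ only in the minor-version flag. How the rest of the makefile uses that flag is not part of the excerpt, so the sketch stops at computing the flags.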
Adler, Avraham
2013-May-31 16:43 UTC
[Rd] R-3.0.1 - "transient" make check failure in splines-EX.r
Fascinating. Of course, I am so far out of my league now that I wouldn't have any idea of how to address the issues, let alone break down the performance into assembly calls and identify the problem.

Once again, thank you for the insight!

Avraham Adler

-----Original Message-----
From: Mike Marchywka
Sent: Friday, May 31, 2013 11:43 AM
To: Adler, Avraham
Subject: RE: [Rd] R-3.0.1 - "transient" make check failure in splines-EX.r

> From: Avraham.Adler at guycarp.com
> To: marchywka at hotmail.com; r-devel at r-project.org
> Date: Fri, 31 May 2013 10:16:11 -0500
> Subject: RE: [Rd] R-3.0.1 - "transient" make check failure in splines-EX.r
>
> Thank you, Mike, I did not know that!
>
> I tried to prevent multi-threaded issues by setting the compiler options to be single-threaded, but I know so little about this area that there may be something else going on.
>
> Do you think that the same problem may be causing the 64-bit issue I am having (<https://stat.ethz.ch/pipermail/r-devel/2013-May/066731.html>)? I tend to think not, as I haven't seen changing results in the call to `optim`, and I still don't know what "NEW_X" means.

I really did not even know what you were trying to do, and multi-threading had not occurred to me until I checked the latest on their site :). I'd just browse their performance-related publications. Floating point is not reproducible except in Java, although I guess there too the multithreading could mess it up if the order of summations changes, for example. But of course do not ignore real bugs like uninitialized memory. I remember once our codec started running really slowly even though the audio it was encoding still sounded OK. This turned out to be doing some processing on uninitialized memory, which typically caused FP exceptions that are VERY slow. So even something of no consequence to the output could affect performance.

> Once again, thank you.
>
> Avraham Adler
>
> -----Original Message-----
> From: Mike Marchywka [mailto:marchywka at hotmail.com]
> Sent: Thursday, May 30, 2013 7:21 PM
> To: Adler, Avraham; 'r-devel at r-project.org'
> Subject: RE: [Rd] R-3.0.1 - "transient" make check failure in splines-EX.r
>
>> From: Avraham.Adler at guycarp.com
>> To: r-devel at r-project.org
>> Date: Thu, 30 May 2013 16:17:36 -0500
>> Subject: Re: [Rd] R-3.0.1 - "transient" make check failure in splines-EX.r
>>
>> I just found this thread on StackOverflow <http://stackoverflow.com/questions/13871818/ns-varies-for-no-apparent-reason/13878936> which had the same problem with the `ns` call changing with Revolution, and the answer given by tech support was that the MKL BLAS sometimes returns ever-so-slightly different floating point results than a reference BLAS. The problem with that answer is that, if it were true, the results should not change between runs *on the same machine*, but it is another example of this issue. Unfortunately, it seems to dead-end too.
>
> Read some of the documents on the Intel site about floating point consistency and compiler optimizations. There are some reasons that you could get a different result from repeated runs on the same machine. One of these would be bugs like uninitialized memory, but another would be things like the state of the FPU and issues with multi-threaded code having some order dependencies, etc.
>
> ( hotmail can not believe I am trying to post text but maybe you can figure it out from whatever this link ends up looking like.... )
> <http://www.google.com/search?biw=1253&bih=542&hl=en&q=floating+point+low+bits+vary+fpu+prior+state+site%253Aintel.com&oq=floating+point+low+bits+vary+fpu+prior+state+site%253Aintel.com>
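The point about summation order can be illustrated with a short sketch. It assumes ordinary IEEE 754 double-precision arithmetic, and the three constants are arbitrary illustrative values, not anything taken from the `ns` computation. Adding the same numbers in a different grouping gives results that `identical()` treats as different while `all.equal()` treats them as the same.

```r
# Same three numbers, two different groupings of the additions.
a <- (0.1 + 0.2) + 0.3
b <- 0.1 + (0.2 + 0.3)

# Bitwise comparison: sensitive to the last bit of the result.
exact_match <- identical(a, b)

# Tolerance-based comparison: ignores differences near machine precision.
close_match <- isTRUE(all.equal(a, b))

cat("identical:", exact_match, "\n")
cat("all.equal:", close_match, "\n")
cat("absolute difference:", abs(a - b), "\n")
```

If a multi-threaded or differently optimized BLAS reorders its sums between runs, a bitwise test such as `identical()` could fail intermittently even when the results agree to within rounding. That would not by itself rule out a genuine bug such as uninitialized memory.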