Colleagues, I have a routine in package labdsv that calls a FORTRAN subroutine. Recently, I was informed that it sometimes gives different results on a PC and Mac, and that the PC version is clearly wrong. I tested it on linux (because I don't have a PC), and I get the same (incorrect) behavior as the PC. Simply by inserting debug WRITE statements in the FORTRAN I would get different, and correct, results, so I assumed it was an optimization error. So, 1) R CMD SHLIB duleg.f does not work, and produces bogus code however, 2) gfortran -c alt_duleg.f gcc -O -std=gnu99 -shared -L/usr/local/lib -o alt_duleg.so alt_duleg.o -lgfortran -lm -lgcc_s does work (note the -O low optimization). Otherwise the gcc command is identical to the one produced by R CMD SHLIB. Has anyone else seen this? Is there a simple way to have my package enforce no optimization so others don't struggle with this? As far as I know the same code worked under g77. Thanks, Dave Roberts -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ David W. Roberts office 406-994-4548 Professor and Head FAX 406-994-3190 Department of Ecology email droberts at montana.edu Montana State University Bozeman, MT 59717-3460
On 24/10/2008 6:53 PM, Dave Roberts wrote:> Colleagues, > > I have a routine in package labdsv that calls a FORTRAN subroutine. > Recently, I was informed that it sometimes gives different results on a > PC and Mac, and that the PC version is clearly wrong. I tested it on > linux (because I don't have a PC), and I get the same (incorrect) > behavior as the PC. > > Simply by inserting debug WRITE statements in the FORTRAN I would get > different, and correct, results, so I assumed it was an optimization > error. > > So, > > 1) R CMD SHLIB duleg.f does not work, and produces bogus code > > however, > > 2) gfortran -c alt_duleg.f > gcc -O -std=gnu99 -shared -L/usr/local/lib -o alt_duleg.so > alt_duleg.o -lgfortran -lm -lgcc_s > > does work (note the -O low optimization). Otherwise the gcc command is > identical to the one produced by R CMD SHLIB. > > Has anyone else seen this? Is there a simple way to have my package > enforce no optimization so others don't struggle with this? As far as I > know the same code worked under g77.I haven't heard of this before, but I think we would be very interested in hearing of cases where our default level of optimization produces incorrect results. This could be a compiler bug, which we would want to work around. It might also be an unstable algorithm, in which case you probably should fix it, but you might choose instead to use Makevars or Makefiles in your package to explicitly describe how you want it built. Would it be possible for you to extract a minimal example that illustrates the bug? Duncan Murdoch
Dave, yes, we have experienced different results in different platforms, the problem was partly due to the different versions of gfortran installed across platforms, are you using the same version for all the platforms? Clearly, gfortran is undergoing rapid changes so that the results might vary a lot across versions. You might want to contact the developers should you find out a compiler bug. Kind regards Simone On Sat, Oct 25, 2008 at 12:53 AM, Dave Roberts <droberts@montana.edu> wrote:> Colleagues, > > I have a routine in package labdsv that calls a FORTRAN subroutine. > Recently, I was informed that it sometimes gives different results on a PC > and Mac, and that the PC version is clearly wrong. I tested it on linux > (because I don't have a PC), and I get the same (incorrect) behavior as the > PC. > > Simply by inserting debug WRITE statements in the FORTRAN I would get > different, and correct, results, so I assumed it was an optimization error. > > So, > > 1) R CMD SHLIB duleg.f does not work, and produces bogus code > > however, > > 2) gfortran -c alt_duleg.f > gcc -O -std=gnu99 -shared -L/usr/local/lib -o alt_duleg.so > alt_duleg.o -lgfortran -lm -lgcc_s > > does work (note the -O low optimization). Otherwise the gcc command is > identical to the one produced by R CMD SHLIB. > > Has anyone else seen this? Is there a simple way to have my package > enforce no optimization so others don't struggle with this? As far as I > know the same code worked under g77. > > Thanks, Dave Roberts > -- > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > David W. Roberts office 406-994-4548 > Professor and Head FAX 406-994-3190 > Department of Ecology email droberts@montana.edu > Montana State University > Bozeman, MT 59717-3460 > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >-- ______________________________________________________ Simone Giannerini Dipartimento di Scienze Statistiche "Paolo Fortunati" Universita' di Bologna Via delle belle arti 41 - 40126 Bologna, ITALY Tel: +39 051 2098262 Fax: +39 051 232153 http://www2.stat.unibo.it/giannerini/ ______________________________________________________ [[alternative HTML version deleted]]
Dave Roberts <droberts at montana.edu> wrote:> I have a routine in package labdsv that calls a FORTRAN subroutine. > Recently, I was informed that it sometimes gives different results on a > PC and Mac, and that the PC version is clearly wrong. I tested it on > linux (because I don't have a PC), and I get the same (incorrect) > behavior as the PC. > > Simply by inserting debug WRITE statements in the FORTRAN I would get > different, and correct, results, so I assumed it was an optimization > error. > >[...]Not to prejudge the case, but such sometime errors are classic symptoms of access violations in Fortran, such as accessing a variable whose value has not been set, exceeding array bounds, or something similar. The usual suggestion is to recompile the Fortran with ALL error-checking options turned on and see if such a bug turns up. -- Mike Prager, NOAA, Beaufort, NC * Opinions expressed are personal and not represented otherwise. * Any use of tradenames does not constitute a NOAA endorsement.
Bill and Mike, Thanks for your help. I think Mike was right about the 80-bit/64-bit compare. As Mike thought, the error occurs at a test for equality, i.e. if (x .ge. y) then I know better than to test reals for equality, but this is a closed interval test, and it still fails. if (x-y .gt. -0.00001) worked. Bill, thanks for the tip on --ffloat-store. I tried gfortran -c alt_duleg.f gcc -std=gnu99 -ffloat-store -shared -L/usr/local/lib -o alt_duleg.so alt_duleg.o -lgfortran -lm -lgcc_s (standard R CMD SHLIB except for the addition of --float-store) and it seems to have cleared up the problem. I should have seen this coming, since I once had a comparison fail on a 4-byte real versus a double precision that I know are exactly the same in base10. To fix that I canged EVERYTHING to double precision, but I didn't see the register problem coning at all. Now I need to figure out how to implement a makefile for my package, but that's another story and easily solved I suspect. Thanks to everyone who took a stab at this one. I learned a lot. Sorry to have taken so long to get back to it, but other priorities demanded so. Dave ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ David W. Roberts office 406-994-4548 Professor and Head FAX 406-994-3190 Department of Ecology email droberts at montana.edu Montana State University Bozeman, MT 59717-3460 William Dunlap wrote:> > >> -----Original Message----- >> From: William Dunlap >> Sent: Monday, November 03, 2008 8:34 AM >> To: 'Mike Prager'; 'droberts at montana.edu' >> Subject: RE: [Rd] gfortran optimization problems > ... >> Another common problem, in C or Fortran, is that optimization >> can result in a very small floating point number remaining in >> an 80-bit floating-point-unit register instead of being >> truncated to the 64-bits available in main memory. The >> difference in roundoff error can result in big differences in >> results if your algorithm is ill conditioned or has a test >> for an exact 0.0. >> Adding a debugging WRITE or printf statement generally causes >> the variable to be copied to 64-bit main memory. > > A quick way to see if using floating point registers or not > affects your results is to add the gcc flag (probably a gfortran > flag as well) '-ffloat-store'. 'man gcc' says: > > The following options control compiler behavior regarding > floating > point arithmetic. These options trade off between speed and > correct- > ness. All must be specifically enabled. > > -ffloat-store > Do not store floating point variables in registers, and > inhibit > other options that might change whether a floating point > value is > taken from a register or memory. > > This option prevents undesirable excess precision on machines > such > as the 68000 where the floating registers (of the 68881) keep > more > precision than a "double" is supposed to have. Similarly for > the > x86 architecture. For most programs, the excess precision > does > only good, but a few programs rely on the precise definition > of > IEEE floating point. Use -ffloat-store for such programs, > after > modifying them to store all pertinent intermediate > computations > into variables. > > Bill Dunlap > TIBCO Spotfire Inc > wdunlap tibco.com > > >-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ David W. Roberts office 406-994-4548 Professor and Head FAX 406-994-3190 Department of Ecology email droberts at montana.edu Montana State University Bozeman, MT 59717-3460