Rosenberger George
2013-Oct-15 13:00 UTC
[Rd] randomForest: Numeric deviation between 32/64 Windows builds
Dear R Developers

I'm using the great randomForest package (4.6-7) for many projects and recently stumbled upon a problem when I wrote unit tests for one of my projects:

On Windows, there are small numeric deviations when using the 32- / 64-bit version of R, which doesn't seem to be a problem on Linux or Mac.

R64 on Windows produces the same results as R64/R32 on Linux or Mac:

> set.seed(131)
> importance(randomForest(Species ~ ., data=iris))
             MeanDecreaseGini
Sepal.Length         9.452470
Sepal.Width          2.037092
Petal.Length        43.603071
Petal.Width         44.116904

R32 on Windows produces the following:

> set.seed(131)
> importance(randomForest(Species ~ ., data=iris))
             MeanDecreaseGini
Sepal.Length         9.433986
Sepal.Width          2.249871
Petal.Length        43.594159
Petal.Width         43.941870

Is there a reason why this is different for the Windows builds? Are the compilers on Windows doing different things for 32- / 64-bit builds than the ones on Linux or Mac?

Thank you very much for your help.

Best regards,
George
Prof Brian Ripley
2013-Oct-16 10:58 UTC
[Rd] randomForest: Numeric deviation between 32/64 Windows builds
On 15/10/2013 14:00, Rosenberger George wrote:
> Dear R Developers
>
> I'm using the great randomForest package (4.6-7) for many projects and recently stumbled upon a problem when I wrote unit tests for one of my projects:
>
> On Windows, there are small numeric deviations when using the 32- / 64-bit version of R, which doesn't seem to be a problem on Linux or Mac.
>
> R64 on Windows produces the same results as R64/R32 on Linux or Mac:
>
>> set.seed(131)
>> importance(randomForest(Species ~ ., data=iris))
>              MeanDecreaseGini
> Sepal.Length         9.452470
> Sepal.Width          2.037092
> Petal.Length        43.603071
> Petal.Width         44.116904
>
> R32 on Windows produces the following:
>
>> set.seed(131)
>> importance(randomForest(Species ~ ., data=iris))
>              MeanDecreaseGini
> Sepal.Length         9.433986
> Sepal.Width          2.249871
> Petal.Length        43.594159
> Petal.Width         43.941870
>
> Is there a reason why this is different for the Windows builds? Are the compilers on Windows doing different things for 32- / 64-bit builds than the ones on Linux or Mac?

Yes, no (but these are not R issues).

There are bigger differences in the OS's equivalent of libm on Windows. You did not tell us what CPUs your compilers targeted on Linux and OS X (sic), but generally they assume more than the i386 assumed on 32-bit Windows by Microsoft. OTOH, all x86_64 OSes, including Windows, can assume more as all such CPUs have post-i686 features.

Remember Windows XP is still supported, and that was released in 2001.

Based on much wider experience than you give (e.g. reference results from R itself and recommended packages), deviations from x86_64 results are increasingly likely on OS X, i686 Linux and then i386 Windows.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
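
For the unit tests that prompted the question, one possible approach is to compare against reference values with a tolerance rather than exactly. The following is only a sketch using the two sets of importance values quoted in this thread. The variable names (win64, win32, rel_diff) and the tolerance of 0.01 are illustrative assumptions, not recommendations from either poster, and a suitable tolerance would depend on the project.

```r
# Importance values copied from the thread (MeanDecreaseGini on iris).
win64 <- c(Sepal.Length = 9.452470, Sepal.Width = 2.037092,
           Petal.Length = 43.603071, Petal.Width = 44.116904)
win32 <- c(Sepal.Length = 9.433986, Sepal.Width = 2.249871,
           Petal.Length = 43.594159, Petal.Width = 43.941870)

# An exact comparison fails across the two builds.
exact_match <- identical(win64, win32)

# all.equal() on numeric vectors compares the mean relative difference
# against `tolerance`; the default is far too strict for these values.
# The tolerance 0.01 is an assumed, illustrative choice.
default_match <- isTRUE(all.equal(win64, win32))
loose_match   <- isTRUE(all.equal(win64, win32, tolerance = 0.01))

# Per-variable relative differences show where the deviation concentrates.
rel_diff <- abs(win64 - win32) / abs(win64)

# A coarser property that may be more stable across platforms:
# the ranking of the variables by importance.
same_ranking <- identical(names(sort(win64)), names(sort(win32)))

print(c(exact = exact_match, default = default_match,
        loose = loose_match, ranking = same_ranking))
print(round(rel_diff, 4))
```

Note that all.equal() averages over all four variables, so a single variable (here Sepal.Width) can deviate by considerably more than the tolerance and still pass. A test that cares about individual variables would need to check rel_diff element by element.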