Dear R experts,

we are preparing an R package to compute the Oja median. It contains some C++ code that needs random numbers. To generate them we use the following Mersenne Twister implementation:

// MersenneTwister.h
// Mersenne Twister random number generator -- a C++ class MTRand
// Based on code by Makoto Matsumoto, Takuji Nishimura, and Shawn Cokus
// Richard J. Wagner v1.0 15 May 2003 rjwagner at writeme.com

The seed for the Mersenne Twister comes from our R function, which passes a (random) integer to the C++ function srand(), which in turn sets the seed in the code.

Calling set.seed() in R now makes the results reproducible, but they differ between Windows and Linux.

Does anyone know what the problem is?

Our suspicion is that some libraries are implemented differently by the compilers on Linux and Windows (XP).

After the program starts, we set the seed in line 447 of vkm.cpp with srand(int);

When the median is calculated, an internal seed is set with unsigned int seed = rand(); (line 100 of vkm.cpp). This seed is used to calculate some random subsets and to create a Mersenne Twister object with MTRand rr(seed); (line 156 of vkm.cpp).

The MTRand object rr is constructed from an unsigned integer, so the relevant function in MersenneTwister.h is the constructor in line 87: MTRand( const uint32& oneSeed );

Accordingly, the random number generator uses the methods initialize(oneSeed); and reload(); (inside the method, beginning in line 215).

Both of these methods (line 283 and line 301) use, among other things, registers. Could it be that these behave differently on Windows and Linux?

We do not want to rely on srand() alone, since we might need more pseudo-random numbers than that algorithm can provide.
For those who are interested and would like to see the code, a first version of the package, called OjaMedian, is available as a source file and as a Windows binary on my homepage:
http://www.uta.fi/~klaus.nordhausen/down.html

The problem is in the ojaMedian function when the evolutionary algorithm is used. The C++ files involved are mainly vkm.cpp and MersenneTwister.h.

We would be very grateful for any advice on how to solve this problem (a demonstration follows below).

Thank you very much in advance,

Klaus

Results on Windows XP:

Compiler used: gcc version 4.2.1-sjlj (mingw32-2)

> library(OjaMedian)
> set.seed(1)
> testD <- rmvnorm(20,c(0,0))
> summary(testD)
       V1                V2
 Min.   :-2.2147   Min.   :-1.989352
 1st Qu.:-0.3844   1st Qu.:-0.399466
 Median : 0.3597   Median :-0.054967
 Mean   : 0.1905   Mean   :-0.006472
 3rd Qu.: 0.7590   3rd Qu.: 0.655663
 Max.   : 1.5953   Max.   : 1.358680
> set.seed(1)
> ojaMedian(testD)
[1]  0.21423705 -0.05799643
> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=Finnish_Finland.1252;LC_CTYPE=Finnish_Finland.1252;LC_MONETARY=Finnish_Finland.1252;LC_NUMERIC=C;LC_TIME=Finnish_Finland.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] OjaMedian_0.0-14 ICSNP_1.0-3      ICS_1.2-1        survey_3.14
[5] mvtnorm_0.9-5

loaded via a namespace (and not attached):
[1] tools_2.9.0
>

Results on Linux Kubuntu 8.10

Result of cat /proc/version:
Linux version 2.6.28-11-generic (buildd at palmer) (gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4) ) #42-Ubuntu SMP Fri Apr 17 01:57:59 UTC 2009

> library(OjaMedian)
> set.seed(1)
> testD <- rmvnorm(20,c(0,0))
> summary(testD)
       V1                V2
 Min.   :-2.2147   Min.   :-1.989352
 1st Qu.:-0.3844   1st Qu.:-0.399466
 Median : 0.3597   Median :-0.054967
 Mean   : 0.1905   Mean   :-0.006472
 3rd Qu.: 0.7590   3rd Qu.: 0.655663
 Max.   : 1.5953   Max.   : 1.358680
> set.seed(1)
> ojaMedian(testD)
(-0.501381, 0.193929)[1] 0.119149071 0.002732100
> sessionInfo()
R version 2.8.1 (2008-12-22)
i486-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] OjaMedian_0.0-14 ICSNP_1.0-3      ICS_1.2-1        survey_3.14
[5] mvtnorm_0.9-5

-- 
Klaus Nordhausen
Researcher
Tampere School of Public Health
FIN-33014 University of Tampere

phone: +358 3 3551 4153
fax: +358 3 3551 4150
e-mail: Klaus.Nordhausen at uta.fi
>>>>> On Tue, 12 May 2009 13:30:21 +0300,
>>>>> Klaus Nordhausen (KN) wrote:

  > Dear R experts,
  > we are preparing an R-package to compute the Oja Median which contains
  > some C++ code in which random numbers are needed. To generate the random
  > numbers we use the following Mersenne-Twister implementation:

  > // MersenneTwister.h
  > // Mersenne Twister random number generator -- a C++ class MTRand
  > // Based on code by Makoto Matsumoto, Takuji Nishimura, and Shawn Cokus
  > // Richard J. Wagner v1.0 15 May 2003 rjwagner at writeme.com

  > the random seed for the Mersenne-Twister is provided by our R-function
  > which gives an (random) integer to the C++ function srand() which in
  > turn sets the seed in the code.

  > Using the set.seed in R makes now the results reproducible, but the
  > results differ between windows and linux.

  > Does anyone know what the problem there is?

[...]

I cannot directly help with the problem, but one quick question: Why do you ship your own random number generator rather than use the one that ships with R (which by default is Mersenne Twister anyway)? The API is documented in "Writing R Extensions".

The advantage in using R's RNG is that the user can change it, e.g., to take care of streams for parallel processing on clusters etc.

Best,
Fritz

-- 
-----------------------------------------------------------------------
Prof. Dr. Friedrich Leisch

Institut für Statistik                          Tel: (+49 89) 2180 3165
Ludwig-Maximilians-Universität                  Fax: (+49 89) 2180 5308
Ludwigstraße 33
D-80539 München                     http://www.statistik.lmu.de/~leisch
-----------------------------------------------------------------------
   Journal Computational Statistics --- http://www.springer.com/180
          Münchner R Kurse --- http://www.statistik.lmu.de/R
Hi Klaus,

Why not just use R's random number generator? See section 6.3 of Writing R Extensions. It should give you the same sequence of pseudorandom numbers on all platforms.

HTH,
Kjell

On 12 May 2009, at 12:30, Klaus Nordhausen wrote:

> Dear R experts,
>
> we are preparing an R-package to compute the Oja Median which contains
> some C++ code in which random numbers are needed.
> [...]
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
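A minimal sketch of how the random-subset step could draw from a caller-supplied uniform source instead of a privately seeded generator. Everything here is an assumption for illustration: `sample_subset` and `ConstantSource` are hypothetical names that do not appear in vkm.cpp, and the partial Fisher-Yates shuffle is only one possible way to pick a subset. In an R package the uniform source would be `unif_rand()` from R's C API, called between `GetRNGstate()` and `PutRNGstate()`; the stand-in source below exists only so that the sketch compiles without R.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (not from vkm.cpp): draw k distinct indices from
// 0..n-1 with a partial Fisher-Yates shuffle. Requires 0 <= k <= n.
// `uniform01` is any callable returning a double in [0, 1).
template <class Uniform01>
std::vector<int> sample_subset(int n, int k, Uniform01 uniform01) {
    std::vector<int> idx(static_cast<std::size_t>(n));
    std::iota(idx.begin(), idx.end(), 0);  // 0, 1, ..., n-1
    for (int i = 0; i < k; ++i) {
        const int remaining = n - i;
        int offset = static_cast<int>(uniform01() * remaining);
        offset = std::min(offset, remaining - 1);  // guard against a draw of exactly 1.0
        std::swap(idx[i], idx[i + offset]);
    }
    idx.resize(static_cast<std::size_t>(k));  // keep the k chosen indices
    return idx;
}

// Deterministic stand-in for the uniform source so the sketch runs
// without R; it always returns the same value.
struct ConstantSource {
    double value;
    double operator()() const { return value; }
};
```

Inside the package, the call might look like `GetRNGstate(); std::vector<int> s = sample_subset(n, k, unif_rand); PutRNGstate();`. With that arrangement set.seed() in R controls the whole stream, and neither srand() nor rand() is involved.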
Hi

2009/5/12 Klaus Nordhausen <klaus.nordhausen at uta.fi>:
> Our suspicion is that the reason is that some libraries are different
> implemented on linux and windows (XP) compilers.

This may be useful:
http://www.derkeiler.com/Newsgroups/sci.crypt/2004-10/1325.html

I think you would be better off not using rand() and srand().

-- 
EI-JI Nakama <nakama (a) ki.rim.or.jp>
"中間栄治" <nakama (a) ki.rim.or.jp>
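To illustrate why rand() and srand() are a likely suspect: the C standard does not fix the algorithm behind rand(), so the same srand() argument can yield a different first rand() value on each platform, and that value is what ends up seeding MTRand. The sketch below emulates the linear congruential generator commonly documented for the Microsoft C runtime, which a MinGW build is assumed (not verified here) to use. The names `MsvcrtStyleRand` and `derived_seed_msvcrt_style` are hypothetical and not part of the package.

```cpp
#include <cstdint>

// Emulates the linear congruential generator commonly documented for the
// Microsoft C runtime's rand(). Illustrative stand-in, not code from vkm.cpp.
class MsvcrtStyleRand {
public:
    explicit MsvcrtStyleRand(std::uint32_t seed) : state_(seed) {}

    // Returns a value in [0, 32767], matching RAND_MAX == 0x7fff.
    int next() {
        state_ = state_ * 214013u + 2531011u;  // unsigned arithmetic wraps modulo 2^32
        return static_cast<int>((state_ >> 16) & 0x7fffu);
    }

private:
    std::uint32_t state_;
};

// Hypothetical helper: the value that `unsigned int seed = rand();` would
// pass on to the Mersenne Twister after srand(r_seed) on such a runtime.
inline unsigned int derived_seed_msvcrt_style(std::uint32_t r_seed) {
    MsvcrtStyleRand gen(r_seed);
    return static_cast<unsigned int>(gen.next());
}
```

For an srand() argument of 1 this generator yields 41 as its first value, whereas glibc's rand(), which uses a different algorithm, is commonly reported to return 1804289383. If that is what happens in the package, the two platforms construct MTRand with different seeds even though the Mersenne Twister code itself is identical. Passing the integer from R straight to the MTRand constructor, or using R's own generator, would sidestep the issue.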
I cannot say what the problem is in your code, but in general it is possible to get the same random sequences on Linux and Windows with R's Mersenne Twister generator. If you generate long sequences (say tens of thousands of numbers) and then start doing comparisons involving remainders, like differencing the sums of two mean-zero sequences, then differences between libraries will show up. I have found bigger differences between Linux variants than between Windows and most Linuxes.

Paul Gilbert

Klaus Nordhausen wrote:
> Dear R experts,
>
> we are preparing an R-package to compute the Oja Median which contains
> some C++ code in which random numbers are needed.
> [...]
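A small sketch of the distinction drawn above, using std::mt19937 from the C++11 standard library as a stand-in; it is neither R's generator nor the MTRand class, and the helper names are hypothetical. The raw 32-bit integer stream of a Mersenne Twister is fully specified (the C++ standard requires the 10000th output of a default-seeded std::mt19937, seed 5489, to be 4123659995), so it should match bit for bit on every platform. Floating-point work done on top of it, such as long sums, is where compilers and libraries may start to disagree in the last digits.

```cpp
#include <cstdint>
#include <random>

// Returns the n-th output (1-based, n >= 1) of a std::mt19937 seeded with `seed`.
inline std::uint32_t nth_mt19937_output(std::uint32_t seed, unsigned long long n) {
    std::mt19937 engine(seed);
    engine.discard(n - 1);  // skip the first n-1 outputs
    return static_cast<std::uint32_t>(engine());
}

// Maps a 32-bit integer to [0, 1) with a single division by 2^32,
// avoiding the implementation-defined standard distributions.
inline double to_unit_interval(std::uint32_t x) {
    return static_cast<double>(x) / 4294967296.0;
}

// Hypothetical helper: sum of the first `count` uniforms for a given seed.
// The integers feeding this sum are portable; the accumulated double may
// differ slightly between platforms that evaluate floating point differently.
inline double sum_of_uniforms(std::uint32_t seed, int count) {
    std::mt19937 engine(seed);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        total += to_unit_interval(static_cast<std::uint32_t>(engine()));
    }
    return total;
}
```

Comparing `nth_mt19937_output` across two machines is a quick way to check whether the generator itself agrees before looking for differences in the numerical code that consumes it.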