Elizabeth Purdom
2019-Apr-12 00:38 UTC
[R] Question about behavior of sample.kind in set.seed (R 3.6)
Hello,

I am trying to update a package for the upcoming release of R, and my unit tests are affected by the change in the sample. I understand that to reproduce the old sampling, I need to set sample.kind="Rounding" in RNGkind or set.seed. But I am confused by the behavior of the sample.kind argument in set.seed, as it doesn't seem to change my results.

In particular, I was trying to understand what happens if you make a call to set.seed within a function to the global environment. So I set up a test as follows:

###Test set.seed
f<-function(n,sample.kind){ #="Rounding" or "Rejection"
  cat("RNG at beginning\n")
  print(RNGkind())
  # RNGkind(sample.kind=sample.kind)
  # cat("RNG at after set\n")
  # print(RNGkind())
  set.seed(23,sample.kind=sample.kind)
  cat("RNG at after set seed\n")
  print(RNGkind())
  sample(1:400000,size=n,replace=TRUE)
}

RNGkind(sample.kind="Rejection")
print(RNGkind())
n<-1000000
y<-f(n,"Rounding")
print(RNGkind())
y2<-f(n,"Rejection")
print(RNGkind())
all(y==y2)

However, it didn't do anything:

> RNGkind(sample.kind="Rejection")
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> n<-1000000
> y<-f(n,"Rounding")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"
Warning message:
In set.seed(23, sample.kind = sample.kind) :
  non-uniform 'Rounding' sampler used
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> y2<-f(n,"Rejection")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> all(y==y2)
[1] TRUE

If I run the same test with calls to RNGkind, however, it does change the method (and I discovered in answer to my question, it appears to change the global method, which is an unfortunate fact for what I am trying to do).
###Test RNGkind
f<-function(n,sample.kind){ #="Rounding" or "Rejection"
  cat("RNG at beginning\n")
  print(RNGkind())
  RNGkind(sample.kind=sample.kind)
  cat("RNG at after set\n")
  print(RNGkind())
  set.seed(23)
  cat("RNG at after set seed\n")
  print(RNGkind())
  sample(1:400000,size=n,replace=TRUE)
}

RNGkind(sample.kind="Rejection")
print(RNGkind())
n<-1000000
y<-f(n,"Rounding")
print(RNGkind())
y2<-f(n,"Rejection")
print(RNGkind())
all(y==y2)

> RNGkind(sample.kind="Rejection")
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> n<-1000000
> y<-f(n,"Rounding")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set
[1] "Mersenne-Twister" "Inversion"        "Rounding"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rounding"
Warning message:
In RNGkind(sample.kind = sample.kind) :
  non-uniform 'Rounding' sampler used
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rounding"
> y2<-f(n,"Rejection")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rounding"
RNG at after set
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> all(y==y2)
[1] FALSE

So clearly I should use RNGkind to change it, but what is the argument actually doing in set.seed?

Thanks,
Elizabeth Purdom

> sessionInfo()
R version 3.6.0 alpha (2019-04-09 r76363)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] BiocManager_1.30.4 compiler_3.6.0     tools_3.6.0
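On the side effect noted above (RNGkind changing the global method), one possible workaround is to record the current sampler at the top of the function and put it back with on.exit. The following is only a sketch under some assumptions: it needs R >= 3.6.0, where RNGkind() returns three values, and the name sample_with_kind is made up for illustration and is not part of base R. It restores the sampler kind only; the seed set inside the function still replaces the caller's seed state.

```r
# Hypothetical helper, not part of base R: draw with a chosen sample.kind and
# restore the caller's sampler when the function exits.
sample_with_kind <- function(n, sample.kind, seed = 23) {
  old_kind <- RNGkind()[3]  # third element is sample.kind (R >= 3.6.0)
  # Restore on exit; suppressWarnings because selecting "Rounding" warns.
  on.exit(suppressWarnings(RNGkind(sample.kind = old_kind)), add = TRUE)
  suppressWarnings(RNGkind(sample.kind = sample.kind))
  set.seed(seed)            # note: this still resets the global seed state
  sample(1:400000, size = n, replace = TRUE)
}

RNGkind(sample.kind = "Rejection")
y  <- sample_with_kind(1000, "Rounding")
y2 <- sample_with_kind(1000, "Rejection")
print(RNGkind())            # the caller's sampler is unchanged afterwards
```

In a package test suite the same pattern could go in a setup/teardown pair instead of inside each function; which is preferable depends on how the tests are organised.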
Tierney, Luke
2019-Apr-14 16:01 UTC
[R] [External] Question about behavior of sample.kind in set.seed (R 3.6)
Thanks for the report. The sample.kind argument was not being passed on to the .Internal. This is now fixed in R-devel and the R 3.6.0 branch.

Best,

luke

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
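For anyone wanting to confirm that a given R build includes the fix described in this reply, a short check along the following lines may help. It is a sketch, not an official test: the variable names are made up, and it assumes R >= 3.6.0 (earlier versions have no sample.kind argument at all). On a build with the fix, set.seed(..., sample.kind=) should switch the sampler reported by RNGkind() and the two draws should differ.

```r
# Sketch: does set.seed() honour its sample.kind argument on this build?
old_kind <- RNGkind()[3]                      # remember the current sampler
RNGkind(sample.kind = "Rejection")

# "Rounding" triggers a warning about the non-uniform sampler; silence it here.
suppressWarnings(set.seed(23, sample.kind = "Rounding"))
kind_after <- RNGkind()[3]                    # "Rounding" if the argument is passed on
y_rounding <- sample(1:400000, size = 1000, replace = TRUE)

set.seed(23, sample.kind = "Rejection")
y_rejection <- sample(1:400000, size = 1000, replace = TRUE)

seed_arg_honored <- kind_after == "Rounding" && !identical(y_rounding, y_rejection)
print(seed_arg_honored)

suppressWarnings(RNGkind(sample.kind = old_kind))  # put the sampler back
```

Note that, like RNGkind, set.seed with sample.kind changes the sampler for the whole session, so the last line restores the previous setting.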