Elizabeth Purdom
2019-Apr-12 00:38 UTC
[R] Question about behavior of sample.kind in set.seed (R 3.6)
Hello,
I am trying to update a package for the upcoming release of R, and my unit tests
are affected by the change in sample(). I understand that to reproduce the old
sampling, I need to set sample.kind="Rounding" in RNGkind or set.seed. But I am
confused by the behavior of the sample.kind argument in set.seed, as it doesn't
seem to change my results.
In particular, I was trying to understand what a call to set.seed made inside a
function does to the global environment. So I set up a test as follows:
###Test set.seed
f<-function(n,sample.kind){ #="Rounding" or "Rejection"
  cat("RNG at beginning\n")
  print(RNGkind())
  # RNGkind(sample.kind=sample.kind)
  # cat("RNG at after set\n")
  # print(RNGkind())
  set.seed(23,sample.kind=sample.kind)
  cat("RNG at after set seed\n")
  print(RNGkind())
  sample(1:400000,size=n,replace=TRUE)
}

RNGkind(sample.kind="Rejection")
print(RNGkind())
n<-1000000
y<-f(n,"Rounding")
print(RNGkind())
y2<-f(n,"Rejection")
print(RNGkind())
all(y==y2)
However, it didn't do anything:

> RNGkind(sample.kind="Rejection")
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> n<-1000000
> y<-f(n,"Rounding")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"
Warning message:
In set.seed(23, sample.kind = sample.kind) :
  non-uniform 'Rounding' sampler used
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> y2<-f(n,"Rejection")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> all(y==y2)
[1] TRUE
If I run the same test with calls to RNGkind, however, it does change the method
(and, in answer to my original question, it appears to change the global method,
which is unfortunate for what I am trying to do).
###Test RNGkind
f<-function(n,sample.kind){ #="Rounding" or "Rejection"
  cat("RNG at beginning\n")
  print(RNGkind())
  RNGkind(sample.kind=sample.kind)
  cat("RNG at after set\n")
  print(RNGkind())
  set.seed(23)
  cat("RNG at after set seed\n")
  print(RNGkind())
  sample(1:400000,size=n,replace=TRUE)
}

RNGkind(sample.kind="Rejection")
print(RNGkind())
n<-1000000
y<-f(n,"Rounding")
print(RNGkind())
y2<-f(n,"Rejection")
print(RNGkind())
all(y==y2)
> RNGkind(sample.kind="Rejection")
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> n<-1000000
> y<-f(n,"Rounding")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set
[1] "Mersenne-Twister" "Inversion"        "Rounding"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rounding"
Warning message:
In RNGkind(sample.kind = sample.kind) : non-uniform 'Rounding' sampler used
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rounding"
> y2<-f(n,"Rejection")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rounding"
RNG at after set
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> all(y==y2)
[1] FALSE
So clearly I should use RNGkind to change it, but what is the argument actually
doing in set.seed?
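
Since RNGkind changes the sampler globally, one possible way to limit the change to a single call is to record the current sample.kind and restore it on exit. The following is only a sketch, assuming R >= 3.6.0; the helper name with_sample_kind is hypothetical and not part of base R. It restores only sample.kind, on the assumption that the generator and normal kinds are not touched inside the call.

```r
# Hypothetical helper (not part of base R): evaluate `code` under the
# requested sample.kind, then put the caller's sample.kind back on exit.
with_sample_kind <- function(sample.kind, code) {
  old <- RNGkind()  # c(kind, normal.kind, sample.kind) in R >= 3.6.0
  # Restore only sample.kind; suppressWarnings() hides the
  # "non-uniform 'Rounding' sampler used" warning if the old kind was "Rounding".
  on.exit(suppressWarnings(RNGkind(sample.kind = old[3])), add = TRUE)
  suppressWarnings(RNGkind(sample.kind = sample.kind))
  code  # the promise is forced here, i.e. under the requested sampler
}

# Usage: draw with the pre-3.6.0 sampler without changing the global setting.
RNGkind(sample.kind = "Rejection")
set.seed(23)
old_style <- with_sample_kind("Rounding",
                              sample(1:400000, size = 5, replace = TRUE))
kind_outside <- RNGkind()[3]  # sampler in effect after the call returns
```

The seed state still advances as usual inside the call; only the choice of sampler is scoped.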
Thanks,
Elizabeth Purdom
> sessionInfo()
R version 3.6.0 alpha (2019-04-09 r76363)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] BiocManager_1.30.4 compiler_3.6.0     tools_3.6.0
Tierney, Luke
2019-Apr-14 16:01 UTC
[R] [External] Question about behavior of sample.kind in set.seed (R 3.6)
Thanks for the report. The sample.kind argument was not being passed on to the
.Internal. This is now fixed in R-devel and the R 3.6.0 branch.

Best,

luke

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
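
To see whether a given R build includes the fix described in the reply, one could check that set.seed actually changes the reported sample.kind and the draws. This is a minimal sketch, assuming R >= 3.6.0 (release or later); the variable names are illustrative, and the smaller sample size is chosen only to keep the check quick.

```r
# Assumes R >= 3.6.0, where set.seed() accepts a sample.kind argument.
RNGkind(sample.kind = "Rejection")        # start from the 3.6.0 default sampler

# "Rounding" raises a "non-uniform 'Rounding' sampler used" warning; silence it.
suppressWarnings(set.seed(23, sample.kind = "Rounding"))
kind_after <- RNGkind()[3]                # sampler in effect after set.seed()
y <- sample(1:400000, size = 1000, replace = TRUE)

set.seed(23, sample.kind = "Rejection")   # same seed, switch the sampler back
y2 <- sample(1:400000, size = 1000, replace = TRUE)

print(kind_after)         # should report the kind passed to set.seed()
print(identical(y, y2))   # the two samplers should give different draws
```

As with RNGkind, the sample.kind set through set.seed is a global setting and persists after the call.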