Paul Johnson
2014-Aug-06 18:10 UTC
[Rd] portableParalleSeeds Package violation, CRAN exception?
I'm writing to ask for a policy exception, or for advice on how to make
this package acceptable to CRAN.
http://rweb.quant.ku.edu/kran/src/contrib/portableParallelSeeds_0.9.tar.gz
Yesterday I tried to submit a package to CRAN, and Dr Ripley pointed
out that I had not understood the instructions about packages. Here's
the part where the R check gives a NOTE:
* checking R code for possible problems ... NOTE
Found the following assignments to the global environment:
File ‘portableParallelSeeds/R/initPortableStreams.R’:
assign("currentStream", n, envir = .GlobalEnv)
assign("currentStates", curStates, envir = .GlobalEnv)
assign("currentStream", 1L, envir = .GlobalEnv)
assign("startStates", runSeeds, envir = .GlobalEnv)
assign("currentStates", runSeeds, envir = .GlobalEnv)
assign("currentStream", as.integer(currentStream), envir = .GlobalEnv)
assign("startStates", runSeeds, envir = .GlobalEnv)
assign("currentStates", runSeeds, envir = .GlobalEnv)
Altering the user's environment requires a special arrangement with
CRAN. I believe this is justified, and I'll sketch the reasons now. But
mostly I'm at your mercy, and if there is any way to make this
possible, I would be very grateful.
To control & replace random number streams, it really is necessary to
alter the workspace. That's where the random generator state is
stored. This is acknowledged in Robert Gentleman's book, R Programming
for Bioinformatics: "The decision to have these [random generator]
functions manipulate a global variable, .Random.seed, is slightly
unfortunate as it makes it somewhat more difficult to manage several
different random number streams simultaneously" (Gentleman, 2009, p.
201).
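A minimal base-R sketch of the point above, not taken from the package: the generator state is the integer vector .Random.seed in .GlobalEnv, so replaying a stream means assigning into the global environment. The names stateA, x1 and x2 are illustrative only.

```r
# The RNG state lives in .GlobalEnv as the integer vector .Random.seed.
set.seed(101)                                       # creates/overwrites .Random.seed
stateA <- get(".Random.seed", envir = .GlobalEnv)   # snapshot of the stream's state

x1 <- runif(3)                                      # drawing advances the global state

# To replay the stream, the snapshot has to be written back into .GlobalEnv,
# because that is where the generator reads its state from.
assign(".Random.seed", stateA, envir = .GlobalEnv)
x2 <- runif(3)

identical(x1, x2)                                   # the second draw repeats the first
```

Only .Random.seed itself has to live in the global environment; any bookkeeping about which stream is active can, in principle, be stored elsewhere.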
I have developed an understandable set of wrapper functions that handle this.
Some of you may recall this project. I've asked about it here a couple
of times. The package allows separate streams of random numbers for
different purposes within a single R run. There is a framework to save
thousands of those sets of streams in a file, so it can be used on a
cluster or on a single workstation. This is handy because, when 1 run in
10,000 on the cluster exhibits some weird behavior, we can easily
re-initiate that run interactively and see what's going on.
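The idea of saving per-run stream states to a file and replaying one run later might look roughly like the sketch below. It is an assumption-laden illustration, not the package's actual code: it uses the L'Ecuyer-CMRG generator and nextRNGStream() from the base parallel package, and the helper names makeSeedCollection and replayRun, the file name, and the list layout are all hypothetical.

```r
library(parallel)

# Hypothetical helper: a list with one element per run, each holding the
# starting states of several independent streams.
# Side effect: switches the session's RNG kind to L'Ecuyer-CMRG.
makeSeedCollection <- function(nRuns, streamsPerRun, seed = 12345L) {
  RNGkind("L'Ecuyer-CMRG")
  set.seed(seed)
  state <- get(".Random.seed", envir = .GlobalEnv)
  runs <- vector("list", nRuns)
  for (i in seq_len(nRuns)) {
    streams <- vector("list", streamsPerRun)
    for (j in seq_len(streamsPerRun)) {
      streams[[j]] <- state
      state <- nextRNGStream(state)   # advance to the next independent stream
    }
    runs[[i]] <- streams
  }
  runs
}

# Hypothetical helper: restore the saved starting state of one stream of one run.
replayRun <- function(file, run, stream = 1L) {
  saved <- readRDS(file)
  assign(".Random.seed", saved[[run]][[stream]], envir = .GlobalEnv)
  invisible(NULL)
}

seeds <- makeSeedCollection(nRuns = 4, streamsPerRun = 2)
seedFile <- tempfile(fileext = ".rds")   # illustrative location only
saveRDS(seeds, seedFile)

# Re-create run 3 twice and confirm the draws agree.
replayRun(seedFile, run = 3)
first <- rnorm(5)
replayRun(seedFile, run = 3)
second <- rnorm(5)
identical(first, second)
```

Because the file holds full generator states rather than integer seeds, the same file can be read on a cluster node or on a workstation to reproduce a given run.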
I have a vignette "pps" that explains. I dropped a copy of that here
in case you don't want to get the package:
http://pj.freefaculty.org/scraps/pps.pdf
While working on that, I gained a considerably deeper understanding of
random generators and seeds. That is what this vignette is about:
http://pj.freefaculty.org/scraps/PRNG-basics.pdf
We've been running simulations on our cluster with the
portableParallelSeeds framework for 2 years, and we've never had any
trouble. We are able to re-start runs and verify random number draws in
separate streams.
PJ
--
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504         Center for Research Methods
University of Kansas              University of Kansas
http://pj.freefaculty.org         http://quant.ku.edu
Gábor Csárdi
2014-Aug-06 18:20 UTC
[Rd] portableParalleSeeds Package violation, CRAN exception?
Why not place them in the package environment?

Gabor
William Dunlap
2014-Aug-06 19:48 UTC
[Rd] portableParalleSeeds Package violation, CRAN exception?
You can make an environment called streamsEnv in your package by adding
   streamsEnv <- new.env()
to one of your R/*.R files (its parent environment will be
namespace:yourPackage), and your functions can assign things to this
environment instead of to .GlobalEnv.

Bill Dunlap
TIBCO Software
wdunlap tibco.com
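A minimal sketch of the package-environment approach, assuming the code sits at top level in one of the package's R/*.R files. The variable name currentStream comes from the check NOTE in the original post; the accessor names setCurrentStream and getCurrentStream are hypothetical.

```r
# Created once when the package namespace is loaded; private to the package.
streamsEnv <- new.env()

# Hypothetical accessors that replace assign(..., envir = .GlobalEnv).
setCurrentStream <- function(n) {
  assign("currentStream", as.integer(n), envir = streamsEnv)
  invisible(n)
}

getCurrentStream <- function() {
  get("currentStream", envir = streamsEnv)
}

setCurrentStream(2)
getCurrentStream()   # the value is read back from streamsEnv, not from .GlobalEnv
```

Environments are mutable, so the bookkeeping variables can be updated this way even though the namespace bindings themselves are locked after loading.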
Michael Lawrence
2014-Aug-07 21:11 UTC
[Rd] portableParalleSeeds Package violation, CRAN exception?
I would recommend against maintaining multiple global variables and
would instead take an object-oriented approach. You should probably
define a reference class representing a random number stream (think
java.util.Random). Then define a reference class representing a
collection of them, tracking which one is current. Instantiate the
collection class and keep a reference to it inside your package
namespace.

You could then define a new API that expects a random stream object as
an argument, using the active one as the default. Your wrappers would
delegate to that API, relying on the default.
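One way the reference-class design could look is sketched below. Everything here is an illustrative assumption rather than the package's API: the class names RandomStream and StreamCollection, the methods draw, use and active, the helper newStream, and the wrapper rnormStream are all hypothetical.

```r
library(methods)

# A stream object owns its own copy of the generator state.
RandomStream <- setRefClass(
  "RandomStream",
  fields = list(state = "integer"),
  methods = list(
    draw = function(rfun, ...) {
      # Swap this stream's state in, draw, keep the advanced state,
      # then put the caller's global state back (or remove ours).
      hadSeed <- exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE)
      if (hadSeed) old <- get(".Random.seed", envir = .GlobalEnv)
      assign(".Random.seed", state, envir = .GlobalEnv)
      out <- rfun(...)
      state <<- get(".Random.seed", envir = .GlobalEnv)
      if (hadSeed) {
        assign(".Random.seed", old, envir = .GlobalEnv)
      } else {
        rm(".Random.seed", envir = .GlobalEnv)
      }
      out
    }
  )
)

# A collection of streams that tracks which one is current.
StreamCollection <- setRefClass(
  "StreamCollection",
  fields = list(streams = "list", current = "integer"),
  methods = list(
    use = function(i) {
      current <<- as.integer(i)
      invisible(.self)
    },
    active = function() streams[[current]]
  )
)

# Hypothetical constructor; note that set.seed() overwrites the global state.
newStream <- function(seed) {
  set.seed(seed)
  RandomStream$new(state = get(".Random.seed", envir = .GlobalEnv))
}

# Kept inside the package namespace, not in the user's workspace.
.streams <- StreamCollection$new(
  streams = list(newStream(1), newStream(2)),
  current = 1L
)

# API taking a stream argument, defaulting to the active stream.
rnormStream <- function(n, stream = .streams$active()) stream$draw(rnorm, n)

a1 <- rnormStream(2)   # two draws from stream 1
.streams$use(2)
b <- rnormStream(2)    # two draws from stream 2
.streams$use(1)
a2 <- rnormStream(2)   # stream 1 resumes where it left off

set.seed(1)
ref <- rnorm(4)
identical(c(a1, a2), ref)
```

In this sketch the global .Random.seed is still touched briefly inside draw(), but it is restored before the method returns, and nothing else is written to the user's workspace.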