Paul Johnson
2014-Aug-06 18:10 UTC
[Rd] portableParallelSeeds Package violation, CRAN exception?
I'm writing to ask for a policy exception, or advice on how to make this package CRAN allowable.

http://rweb.quant.ku.edu/kran/src/contrib/portableParallelSeeds_0.9.tar.gz

Yesterday I tried to submit a package to CRAN and Dr Ripley pointed out that I had not understood the instructions about packages. Here is the part of the R check that gives a NOTE:

    * checking R code for possible problems ... NOTE
    Found the following assignments to the global environment:
    File 'portableParallelSeeds/R/initPortableStreams.R':
      assign("currentStream", n, envir = .GlobalEnv)
      assign("currentStates", curStates, envir = .GlobalEnv)
      assign("currentStream", 1L, envir = .GlobalEnv)
      assign("startStates", runSeeds, envir = .GlobalEnv)
      assign("currentStates", runSeeds, envir = .GlobalEnv)
      assign("currentStream", as.integer(currentStream), envir = .GlobalEnv)
      assign("startStates", runSeeds, envir = .GlobalEnv)
      assign("currentStates", runSeeds, envir = .GlobalEnv)

Altering the user's environment requires a special arrangement with CRAN. I believe this is justified, and I'll sketch the reasons now. But, mostly, I'm at your mercy, and if there is any way to make this possible, I would be very grateful.

To control and replace random number streams, it really is necessary to alter the workspace. That's where the random generator state is stored. This is acknowledged in Robert Gentleman's book, R Programming for Bioinformatics: "The decision to have these [random generator] functions manipulate a global variable, .Random.seed, is slightly unfortunate as it makes it somewhat more difficult to manage several different random number streams simultaneously" (Gentleman, 2009, p. 201).

I have developed an understandable set of wrapper functions that handle this.

Some of you may recall this project. I've asked about it here a couple of times. We allow separate streams of randoms for different purposes within a single R run. There is a framework to save 1000s of those sets in a file, so it can be used on a cluster or on a single workstation. This is handy because, when 1 run in 10,000 on the cluster exhibits some weird behavior, we can easily re-initiate that run interactively and see what's going on.

I have a vignette "pps" that explains. I dropped a copy of it here in case you don't want to get the package:

http://pj.freefaculty.org/scraps/pps.pdf

While working on that, I gained a considerably deeper understanding of random generators and seeds. That is what this vignette is about:

http://pj.freefaculty.org/scraps/PRNG-basics.pdf

We've been running simulations on our cluster with the portableParallelSeeds framework for 2 years, and we've never had any trouble. We are able to re-start runs and verify random number draws in separate streams.

PJ
--
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504         Center for Research Methods
University of Kansas              University of Kansas
http://pj.freefaculty.org         http://quant.ku.edu
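To make the point about the workspace concrete, here is a minimal sketch of switching between two streams by saving and restoring .Random.seed. It is not the portableParallelSeeds code; the variable names (stateA, stateB, a1, b1, a2) are made up for illustration, and it assumes the L'Ecuyer-CMRG generator together with parallel::nextRNGStream() from base R's parallel package.

```r
# Base R keeps the generator state in .Random.seed in the global environment.
RNGkind("L'Ecuyer-CMRG")
set.seed(12345)
stateA <- get(".Random.seed", envir = globalenv())  # start of stream A
stateB <- parallel::nextRNGStream(stateA)           # start of stream B

# Draw from stream A, then remember where it left off.
assign(".Random.seed", stateA, envir = globalenv())
a1 <- runif(2)
stateA <- get(".Random.seed", envir = globalenv())

# Switch to stream B for a different purpose.
assign(".Random.seed", stateB, envir = globalenv())
b1 <- runif(2)

# Switch back: stream A resumes exactly where it stopped.
assign(".Random.seed", stateA, envir = globalenv())
a2 <- runif(2)

# The two pieces of stream A match four uninterrupted draws from the same seed.
set.seed(12345)
stopifnot(identical(c(a1, a2), runif(4)))
```

Run as a script this is harmless; the CRAN question above is about a package doing this kind of assignment, and about where the bookkeeping variables (currentStream, currentStates, startStates) are kept.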
Gábor Csárdi
2014-Aug-06 18:20 UTC
[Rd] portableParallelSeeds Package violation, CRAN exception?
Why not place them in the package environment?

Gabor
William Dunlap
2014-Aug-06 19:48 UTC
[Rd] portableParallelSeeds Package violation, CRAN exception?
You can make an environment called streamsEnv in your package by adding

    streamsEnv <- new.env()

to one of your R/*.R files (its parent environment will be namespace:yourPackage), and your functions can assign things to this environment instead of to .GlobalEnv.

Bill Dunlap
TIBCO Software
wdunlap tibco.com
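The package-environment suggestion can be sketched as follows. This is only an illustration: the file name R/streams.R and the accessor names setCurrentStream() and getCurrentStream() are hypothetical, and only streamsEnv and currentStream come from the thread. When the code is run as a plain script instead of inside a package, the parent of streamsEnv is the global environment, not the package namespace, but the assignments still go into streamsEnv.

```r
# Would live in a package source file such as R/streams.R (hypothetical name).
streamsEnv <- new.env()

# Hypothetical accessor: store the stream index in streamsEnv, not .GlobalEnv.
setCurrentStream <- function(n) {
  assign("currentStream", as.integer(n), envir = streamsEnv)
  invisible(n)
}

# Hypothetical accessor: read the stream index back from streamsEnv.
getCurrentStream <- function() {
  get("currentStream", envir = streamsEnv)
}

setCurrentStream(1L)
getCurrentStream()

# Nothing named currentStream was created in the user's workspace.
exists("currentStream", envir = globalenv(), inherits = FALSE)
```

The same pattern would apply to currentStates and startStates, which should remove the "assignments to the global environment" NOTE for those variables.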
Michael Lawrence
2014-Aug-07 21:11 UTC
[Rd] portableParallelSeeds Package violation, CRAN exception?
I would recommend against maintaining multiple global variables and would instead take an object-oriented approach. Probably should define a reference class representing a random number stream (think java.util.Random). Then define a reference class representing a collection of them, tracking which one is current. Instantiate the collection class and keep a reference to it inside your package namespace.

You could then define a new API that expects a random stream object as an argument, using the active one as the default. Your wrappers would delegate to that API, relying on the default.
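A rough sketch of the reference-class design described above, under several assumptions: the class names RandomStream and StreamCollection, the methods draw(), active() and use(), the object streamSet, and the function drawUniform() are all invented for illustration, and the streams are seeded with L'Ecuyer-CMRG states via parallel::nextRNGStream(). Base R generators still read .Random.seed from the global environment, so the draw() method swaps the stream's state in before drawing; for brevity it does not restore the caller's previous state.

```r
library(methods)

# One random number stream: it owns a saved generator state.
RandomStream <- setRefClass("RandomStream",
  fields = list(state = "integer"),
  methods = list(
    draw = function(n) {
      # Swap this stream's state in, draw, then save the advanced state.
      assign(".Random.seed", state, envir = globalenv())
      x <- stats::runif(n)
      state <<- get(".Random.seed", envir = globalenv())
      x
    }
  )
)

# A collection of streams that tracks which one is current.
StreamCollection <- setRefClass("StreamCollection",
  fields = list(streams = "list", current = "integer"),
  methods = list(
    active = function() streams[[current]],
    use = function(i) {
      current <<- as.integer(i)
      invisible(.self)
    }
  )
)

# Build two starting states (assumes the L'Ecuyer-CMRG generator).
RNGkind("L'Ecuyer-CMRG")
set.seed(2014)
s1 <- get(".Random.seed", envir = globalenv())
s2 <- parallel::nextRNGStream(s1)

# In a package, this object would be kept in the package namespace.
streamSet <- StreamCollection$new(
  streams = list(RandomStream$new(state = s1), RandomStream$new(state = s2)),
  current = 1L
)

# New API: takes a stream object, defaulting to the active one.
drawUniform <- function(n, stream = streamSet$active()) stream$draw(n)

a  <- drawUniform(2)   # stream 1
streamSet$use(2L)
b  <- drawUniform(2)   # stream 2
streamSet$use(1L)
a2 <- drawUniform(2)   # stream 1 resumes where it left off
```

Because each stream object carries its own state, wrappers that rely on the default argument keep working while callers who need explicit control can pass a particular stream, for example drawUniform(5, stream = streamSet$streams[[2]]).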