> On 30/10/2019 9:08 a.m., peter dalgaard wrote: > > You can fairly easily work around that by saving and restoring .Random.seed.This is actually quite tedious to get correct; it requires you to under how and when .Random.seed is set, and what are valid values on .Random.seed. For instance, a common mistake (me too) is to reset to .GlobalEnv$.Random.seed <- NULL in a fresh R session but this will produce a warning on: ".Random.seed' is not an integer vector but of type 'NULL', so ignored". You end up having to do things such as: oseed <- .GlobalEnv$.Random.seed on.exit({ if (is.null(oseed)) { rm(list=".Random.seed", envir= .GlobalEnv) } else { assign(".Random.seed", value=oseed, envir= .GlobalEnv) } }) to avoid that warning. So, having support functions for this in base R would be helpful, e.g. oseed <- base::getRandomSeed() on.exit(base::setRandomSeed(oseed)) Back to Marvin's point/question. I think it would be useful if the CRAN Policies would explicitly say that functions must not change the random seed to a fixed one, without undoing it, unless the user specifies it via an argument. If not, there's a great risk it will mess up statistical analysis. Also, if there would a way to test against such practices, which I think is really hard, I would be the first one backing it up. On a related note; there are packages that forward the .Random.seed when loaded. This is also an unfortunate behavior because you will give different RNGs depending on that package was already loaded before you called a function that depends on it or not. For example, if pkgA forwards the .Random.seed when loaded, the following is *not* reproducible: # User might or might not have loaded pkgA already if (runif(1) < 0.5) loadNamespace(pkgA) set.seed(0xBEEF) loadNamespace(pkgA) y <- runif(1) I ran into this problem when doing some strict testing. I argue this also falls into the set of "bad" practices to be avoided. /Henrik On Wed, Oct 30, 2019 at 6:23 AM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:> > On 30/10/2019 9:08 a.m., peter dalgaard wrote: > > We commit a similar sin in the help pages, e.g. > > > > example(set.seed) ; runif(2) > > example(set.seed) ; runif(2) > > > > gives you the same random uniforms both times. (Of course it isn't that much of an issue, since you would rarely be running examples before any serious simulations.) > > I think it's pretty common in example code, and that's justifiable. But > it could be avoided by using withr::with_seed() or something equivalent. > > Duncan Murdoch > > > > > You can fairly easily work around that by saving and restoring .Random.seed. I wonder if that isn't also true of the cases using set.seed() for other reasons? > > > > -pd > > > > > >> On 30 Oct 2019, at 13:46 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote: > >> > >> On 30/10/2019 3:28 a.m., Marvin Wright wrote: > >>> Hi all, > >>> I recently found several calls of set.seed() in a CRAN package. These calls are in a plot function, which could lead to unexpected behaviour. See https://github.com/sammo3182/interplot/issues/33 <https://github.com/sammo3182/interplot/issues/33> for a description of the problem. > >>> I checked the CRAN repository policies and could not find anything about this. I would have expected a policy against setting fixed seeds somewhere in a package. Am I missing something? > >> > >> set.seed() writes .Random.seed in the user's global environment, which violates this policy: > >> > >> - Packages should not modify the global environment (user?s workspace). > >> > >> However, every call to a random number generator creates or modifies .Random.seed as well, and most of those are expected and shouldn't be flagged. And interplot() is documented to do random simulations, so it would be expected to change the seed: the issue is that given the same inputs it always changes it to the same thing. I think that would be quite hard for a test to detect. > >> > >> Should it be a policy with no test? Maybe, because I agree with you that interplot()'s set.seed(324) is bad practice. > >> > >> Duncan Murdoch > >> > >> ______________________________________________ > >> R-devel at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-devel > > > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
Forgot to say: For, oseed <- base::getRandomSeed() on.exit(base::setRandomSeed(oseed)) one could upgrade set.seed() to take this role, e.g. oseed <- set.seed(0xBEEF) on.exit(set.seed(oseed)) Current, set.seed() always return NULL. BTW, and my memory might be bad, I think I mentioned this in the past but was told that you cannot reset the RNG state for all types of RNG kinds. That might complicate things, but on the other hand, that could be checked for at run-time by the above functions. /Henrik On Wed, Oct 30, 2019 at 9:50 AM Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:> > > On 30/10/2019 9:08 a.m., peter dalgaard wrote: > > > You can fairly easily work around that by saving and restoring .Random.seed. > > This is actually quite tedious to get correct; it requires you to > under how and when .Random.seed is set, and what are valid values on > .Random.seed. For instance, a common mistake (me too) is to reset to > .GlobalEnv$.Random.seed <- NULL in a fresh R session but this will > produce a warning on: ".Random.seed' is not an integer vector but of > type 'NULL', so ignored". You end up having to do things such as: > > oseed <- .GlobalEnv$.Random.seed > on.exit({ > if (is.null(oseed)) { > rm(list=".Random.seed", envir= .GlobalEnv) > } else { > assign(".Random.seed", value=oseed, envir= .GlobalEnv) > } > }) > > to avoid that warning. So, having support functions for this in base > R would be helpful, e.g. > > oseed <- base::getRandomSeed() > on.exit(base::setRandomSeed(oseed)) > > Back to Marvin's point/question. I think it would be useful if the > CRAN Policies would explicitly say that functions must not change the > random seed to a fixed one, without undoing it, unless the user > specifies it via an argument. If not, there's a great risk it will > mess up statistical analysis. Also, if there would a way to test > against such practices, which I think is really hard, I would be the > first one backing it up. > > On a related note; there are packages that forward the .Random.seed > when loaded. This is also an unfortunate behavior because you will > give different RNGs depending on that package was already loaded > before you called a function that depends on it or not. For example, > if pkgA forwards the .Random.seed when loaded, the following is *not* > reproducible: > > # User might or might not have loaded pkgA already > if (runif(1) < 0.5) loadNamespace(pkgA) > > set.seed(0xBEEF) > loadNamespace(pkgA) > y <- runif(1) > > I ran into this problem when doing some strict testing. I argue this > also falls into the set of "bad" practices to be avoided. > > /Henrik > > On Wed, Oct 30, 2019 at 6:23 AM Duncan Murdoch <murdoch.duncan at gmail.com> wrote: > > > > On 30/10/2019 9:08 a.m., peter dalgaard wrote: > > > We commit a similar sin in the help pages, e.g. > > > > > > example(set.seed) ; runif(2) > > > example(set.seed) ; runif(2) > > > > > > gives you the same random uniforms both times. (Of course it isn't that much of an issue, since you would rarely be running examples before any serious simulations.) > > > > I think it's pretty common in example code, and that's justifiable. But > > it could be avoided by using withr::with_seed() or something equivalent. > > > > Duncan Murdoch > > > > > > > > You can fairly easily work around that by saving and restoring .Random.seed. I wonder if that isn't also true of the cases using set.seed() for other reasons? > > > > > > -pd > > > > > > > > >> On 30 Oct 2019, at 13:46 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote: > > >> > > >> On 30/10/2019 3:28 a.m., Marvin Wright wrote: > > >>> Hi all, > > >>> I recently found several calls of set.seed() in a CRAN package. These calls are in a plot function, which could lead to unexpected behaviour. See https://github.com/sammo3182/interplot/issues/33 <https://github.com/sammo3182/interplot/issues/33> for a description of the problem. > > >>> I checked the CRAN repository policies and could not find anything about this. I would have expected a policy against setting fixed seeds somewhere in a package. Am I missing something? > > >> > > >> set.seed() writes .Random.seed in the user's global environment, which violates this policy: > > >> > > >> - Packages should not modify the global environment (user?s workspace). > > >> > > >> However, every call to a random number generator creates or modifies .Random.seed as well, and most of those are expected and shouldn't be flagged. And interplot() is documented to do random simulations, so it would be expected to change the seed: the issue is that given the same inputs it always changes it to the same thing. I think that would be quite hard for a test to detect. > > >> > > >> Should it be a policy with no test? Maybe, because I agree with you that interplot()'s set.seed(324) is bad practice. > > >> > > >> Duncan Murdoch > > >> > > >> ______________________________________________ > > >> R-devel at r-project.org mailing list > > >> https://stat.ethz.ch/mailman/listinfo/r-devel > > > > > > > ______________________________________________ > > R-devel at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel
Thanks for your answers. I agree with Duncan and Henrik that we should have such a policy, even if we cannot test it.> On 30. Oct 2019, at 17:59, Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote: > > Forgot to say: For, > > oseed <- base::getRandomSeed() > on.exit(base::setRandomSeed(oseed)) > > one could upgrade set.seed() to take this role, e.g. > > oseed <- set.seed(0xBEEF) > on.exit(set.seed(oseed)) > > Current, set.seed() always return NULL. > > BTW, and my memory might be bad, I think I mentioned this in the past > but was told that you cannot reset the RNG state for all types of RNG > kinds. That might complicate things, but on the other hand, that > could be checked for at run-time by the above functions. > > /Henrik > > On Wed, Oct 30, 2019 at 9:50 AM Henrik Bengtsson > <henrik.bengtsson at gmail.com> wrote: >> >>> On 30/10/2019 9:08 a.m., peter dalgaard wrote: >>>> You can fairly easily work around that by saving and restoring .Random.seed. >> >> This is actually quite tedious to get correct; it requires you to >> under how and when .Random.seed is set, and what are valid values on >> .Random.seed. For instance, a common mistake (me too) is to reset to >> .GlobalEnv$.Random.seed <- NULL in a fresh R session but this will >> produce a warning on: ".Random.seed' is not an integer vector but of >> type 'NULL', so ignored". You end up having to do things such as: >> >> oseed <- .GlobalEnv$.Random.seed >> on.exit({ >> if (is.null(oseed)) { >> rm(list=".Random.seed", envir= .GlobalEnv) >> } else { >> assign(".Random.seed", value=oseed, envir= .GlobalEnv) >> } >> }) >> >> to avoid that warning. So, having support functions for this in base >> R would be helpful, e.g. >> >> oseed <- base::getRandomSeed() >> on.exit(base::setRandomSeed(oseed)) >> >> Back to Marvin's point/question. I think it would be useful if the >> CRAN Policies would explicitly say that functions must not change the >> random seed to a fixed one, without undoing it, unless the user >> specifies it via an argument. If not, there's a great risk it will >> mess up statistical analysis. Also, if there would a way to test >> against such practices, which I think is really hard, I would be the >> first one backing it up. >> >> On a related note; there are packages that forward the .Random.seed >> when loaded. This is also an unfortunate behavior because you will >> give different RNGs depending on that package was already loaded >> before you called a function that depends on it or not. For example, >> if pkgA forwards the .Random.seed when loaded, the following is *not* >> reproducible: >> >> # User might or might not have loaded pkgA already >> if (runif(1) < 0.5) loadNamespace(pkgA) >> >> set.seed(0xBEEF) >> loadNamespace(pkgA) >> y <- runif(1) >> >> I ran into this problem when doing some strict testing. I argue this >> also falls into the set of "bad" practices to be avoided. >> >> /Henrik >> >> On Wed, Oct 30, 2019 at 6:23 AM Duncan Murdoch <murdoch.duncan at gmail.com> wrote: >>> >>> On 30/10/2019 9:08 a.m., peter dalgaard wrote: >>>> We commit a similar sin in the help pages, e.g. >>>> >>>> example(set.seed) ; runif(2) >>>> example(set.seed) ; runif(2) >>>> >>>> gives you the same random uniforms both times. (Of course it isn't that much of an issue, since you would rarely be running examples before any serious simulations.) >>> >>> I think it's pretty common in example code, and that's justifiable. But >>> it could be avoided by using withr::with_seed() or something equivalent. >>> >>> Duncan Murdoch >>> >>>> >>>> You can fairly easily work around that by saving and restoring .Random.seed. I wonder if that isn't also true of the cases using set.seed() for other reasons? >>>> >>>> -pd >>>> >>>> >>>>> On 30 Oct 2019, at 13:46 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote: >>>>> >>>>> On 30/10/2019 3:28 a.m., Marvin Wright wrote: >>>>>> Hi all, >>>>>> I recently found several calls of set.seed() in a CRAN package. These calls are in a plot function, which could lead to unexpected behaviour. See https://github.com/sammo3182/interplot/issues/33 <https://github.com/sammo3182/interplot/issues/33> for a description of the problem. >>>>>> I checked the CRAN repository policies and could not find anything about this. I would have expected a policy against setting fixed seeds somewhere in a package. Am I missing something? >>>>> >>>>> set.seed() writes .Random.seed in the user's global environment, which violates this policy: >>>>> >>>>> - Packages should not modify the global environment (user?s workspace). >>>>> >>>>> However, every call to a random number generator creates or modifies .Random.seed as well, and most of those are expected and shouldn't be flagged. And interplot() is documented to do random simulations, so it would be expected to change the seed: the issue is that given the same inputs it always changes it to the same thing. I think that would be quite hard for a test to detect. >>>>> >>>>> Should it be a policy with no test? Maybe, because I agree with you that interplot()'s set.seed(324) is bad practice. >>>>> >>>>> Duncan Murdoch >>>>> >>>>> ______________________________________________ >>>>> R-devel at r-project.org mailing list >>>>> https://stat.ethz.ch/mailman/listinfo/r-devel >>>> >>> >>> ______________________________________________ >>> R-devel at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >