thr3ads.net - R devel - [Rd] simulate in stats [Sep 2005]

Paul Gilbert

2005-Sep-14 19:45 UTC

[Rd] simulate in stats

Can the arguments nsim and seed be passed as part of ... in the new 
simulate generic in R-2.2.0alpha package stats?

This would potentially allow me to use the stats generic rather than the 
one I define in dse. There are contexts where nsim and seed do not make 
sense. I realize that the default arguments could be ignored, but it 
does not really make sense to introduce a new generic with that in mind. 
(I would also prefer that the "object" argument was called "model" but 
this is less important.)

Paul Gilbert

Paul Gilbert

2005-Sep-15 00:50 UTC

[Rd] simulate in stats

(Sorry if this was posted twice. I seem to be having some email issues.)

Can the arguments nsim and seed be passed as part of ... in the new 
simulate generic in R-2.2.0alpha package stats?

This would potentially allow me to use the stats generic rather than the 
one I define in dse. There are contexts where nsim and seed do not make 
sense. I realize that the default arguments could be ignored, but it 
does not really make sense to introduce a new generic with that in mind. 
(I would also prefer that the "object" argument was called "model" but 
this is less important.)

Paul Gilbert

Paul Gilbert

2005-Sep-15 16:07 UTC

[Rd] simulate in stats

BTW, I think there is a problem with the way the argument "seed" is used 
in the new simulate in stats. The problem is that repeated calls to 
simulate using the default argument will introduce a new pattern into 
the RNG:

 > stats:::simulate
function (object, nsim = 1, seed = as.integer(runif(1, 0, 
.Machine$integer.max)),   ...)
UseMethod("simulate")
<environment: namespace:stats>

 > stats:::simulate.lm
function (object, nsim = 1, seed = as.integer(runif(1, 0, 
.Machine$integer.max)),    ...)
{
    if (!exists(".Random.seed", envir = .GlobalEnv))
        runif(1)
    RNGstate <- .Random.seed
    set.seed(seed)
  ...

This should not be done, as the resulting RNG has not been studied or 
proven. A better mechanism is to have a default argument equal to NULL, 
and not touch the seed in that case. There are several examples of this 
in the package dse1 (in bundle dse); see, for example, simulate.ARMA and 
simulate.SS. They also use the utilities in the setRNG package to save 
more of the information necessary to reproduce simulations. Roughly, it 
is done like this:

simulate.x <- function (model, rng = NULL, ...)
  {if (is.null(rng))
        # returns the RNG setting, to be saved with the result
        rng <- setRNG()
    else {
        old.rng <- setRNG(rng)
        on.exit(setRNG(old.rng))
        }
   ...

The seed by itself is not very useful if the purpose is to be able to 
reproduce things, and I think it would be a good idea to incorporate the 
few small functions in setRNG into stats (especially if the simulate 
mechanism is being introduced).
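
The NULL-default idea can also be sketched with base R alone. What follows 
is only an illustration, not the dse or setRNG implementation: 
simulate_sketch and its arguments are hypothetical names. It records 
RNGkind() together with the starting .Random.seed, since a seed value 
without the generator kind may not be enough to reproduce a draw.

```r
# Hypothetical helper, not part of stats, dse or setRNG.
# seed = NULL: draw from the caller's RNG stream and let it advance normally.
# seed given:  seed temporarily, then restore the caller's RNG state on exit.
simulate_sketch <- function(n, seed = NULL) {
  if (!exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE))
    runif(1)                        # forces .Random.seed to be created
  kinds <- RNGkind()                # generator kinds in use
  if (!is.null(seed)) {
    old_state <- get(".Random.seed", envir = .GlobalEnv)
    on.exit(assign(".Random.seed", old_state, envir = .GlobalEnv))
    set.seed(seed)
  }
  # State the draws start from; saved with the result for reproducibility.
  start_state <- get(".Random.seed", envir = .GlobalEnv)
  x <- rnorm(n)
  structure(x, rng = list(kind = kinds, state = start_state, seed = seed))
}
```

Calling simulate_sketch(3) twice in a row gives different draws because the 
stream simply advances, while simulate_sketch(3, seed = 42) gives the same 
draws each time and leaves the caller's stream where it was.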

The argument "nsim" presumably alleviates to some extent the above 
concern about changing the RNG pattern. However, in my fairly extensive 
experience it is not very workable to produce all the simulations and 
then do the analysis of them. In a Monte Carlo experiment the generated 
data set is just too big. A better approach is to do the analysis and 
save only necessary information after each simulation. That is the 
approach, for example, in dse2:::EstEval.
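
The analyse-as-you-go approach can be illustrated with a small loop. This is 
a sketch under assumed names and an assumed toy model (a straight-line 
regression), not the code in dse2:::EstEval: each replication is simulated, 
analysed, and reduced to one number before the next one is generated.

```r
# Hypothetical experiment: sampling distribution of the slope estimate
# in y = 1 + 2 * x + noise. Only one number is kept per replication.
n_rep <- 200
x <- seq(0, 1, length.out = 50)

set.seed(123)                          # seed once, before the whole experiment
slopes <- numeric(n_rep)
for (i in seq_len(n_rep)) {
  y <- 1 + 2 * x + rnorm(length(x))    # simulate one data set
  slopes[i] <- coef(lm(y ~ x))[["x"]]  # analyse it and keep only the summary
  # y is overwritten on the next pass, so the full data are never stored
}

c(mean = mean(slopes), sd = sd(slopes))
```

Memory use here grows with the number of saved summaries rather than with 
the number of simulated observations.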

Paul

Paul Gilbert wrote:
> Can the arguments nsim and seed be passed as part of ... in the new 
> simulate generic in R-2.2.0alpha package stats?
>
> This would potentially allow me to use the stats generic rather than 
> the one I define in dse. There are contexts where nsim and seed do not 
> make sense. I realize that the default arguments could be ignored, but 
> it does not really make sense to introduce a new generic with that in 
> mind. (I would also prefer that the "object" argument was called 
> "model" but this is less important.)
>
> Paul Gilbert

Kasper Daniel Hansen

2005-Sep-15 20:18 UTC

[Rd] simulate in stats

I agree: no function should by default touch the random number  
stream. Otherwise this will undoubtedly lead to misuse. And while one  
may want to include a seed argument in case a user wants to set it  
explicitly, I would argue that the preferred usage is to do
   set.seed(SOMETHING)
   someFunction()
and then educate users that this is the way to go.
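
A minimal illustration of that usage, with someFunction as a hypothetical 
stand-in that only consumes random numbers and never sets a seed itself:

```r
# someFunction is a hypothetical placeholder; it draws from whatever
# RNG stream the caller has set up.
someFunction <- function(n = 5) mean(rnorm(n))

set.seed(2005)
first  <- someFunction()
second <- someFunction()   # continues the same stream, so it differs

set.seed(2005)             # resetting the seed reproduces the first call
again <- someFunction()
identical(first, again)
```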

Kasper


On Sep 15, 2005, at 9:07 AM, Paul Gilbert wrote:
> BTW, I think there is a problem with the way the argument "seed" is used
> in the new simulate in stats.  The problem is that repeated calls to
> simulate using the default argument will introduce a new pattern into
> the RNG:
>
>
>> stats:::simulate
>>
> function (object, nsim = 1, seed = as.integer(runif(1, 0,
> .Machine$integer.max)),   ...)
> UseMethod("simulate")
> <environment: namespace:stats>
>
>
>
>> stats:::simulate.lm
>>
> function (object, nsim = 1, seed = as.integer(runif(1, 0,
> .Machine$integer.max)),    ...)
> {
>     if (!exists(".Random.seed", envir = .GlobalEnv))
>         runif(1)
>     RNGstate <- .Random.seed
>     set.seed(seed)
>   ...
>
> This should not be done, as the resulting RNG has not been studied or
> proven. A better mechanism is  to have a default argument equal NULL,
> and not touch the seed in that case. There are several examples of this
> in the package dse1 (in bundle dse),  see for example simulate.ARMA and
> simulate.SS. They also use the utilities in the setRNG package to save
> more of the information necessary to reproduce simulations. Roughly it
> is done like this
> simulate.x <- function (model, rng = NULL,  ...)
>   {if (is.null(rng)) rng <- setRNG() #returns the RNG setting to be saved with the result
>     else {
>         old.rng <- setRNG(rng)
>         on.exit(setRNG(old.rng))
>         }
>    ...
>
>
> The seed by itself is not very useful if the purpose is to be able to
> reproduce things, and I think it would be a good idea to incorporate the
> few small functions setRNG into stats (especially if the simulate
> mechanism is being introduced).
>
> The argument "nsim" presumably alleviates to some extent the above
> concern about changing the RNG pattern. However, in my fairly extensive
> experience it is not very workable to produce all the simulations and
> then do the analysis of them. In a Monte Carlo experiment the generated
> data set is just too big. A better approach is to do the analysis and
> save only necessary information after each simulation. That is the
> approach, for example, in dse2:::EstEval.
>
> Paul
>
> Paul Gilbert wrote:
>
>
>> Can the arguments nsim and seed be passed as part of ... in the new
>> simulate generic in R-2.2.0alpha package stats?
>>
>> This would potentially allow me to use the stats generic rather than
>> the one I define in dse. There are contexts where nsim and seed do not
>> make sense. I realize that the default arguments could be ignored, but
>> it does not really make sense to introduce a new generic with that in
>> mind. (I would also prefer that the "object" argument was called
>> "model" but this is less important.)
>>
>> Paul Gilbert
>>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
