thr3ads.net - R help - [R] A reproducibility puzzle with NORM [Sep 2007]

If this information is useful, please help other people find it:
Share via:

(Ted Harding)

2007-Sep-21 13:13 UTC

[R] A reproducibility puzzle with NORM

Hi Folks,
I'm using the 'norm' package (based on Shafer's NORM)
on some  data. In outline, (X,Y) are bivariate normal,
var(X)=0.29, var(Y)=24.4, cov(X,Y)=-0.277,
there are some 900 cases, and some 170 values of Y
have been set "missing" (NA).

The puzzle is that, repeating the multiple imputation
starting from the same random seed, I get different
answers from the repeats depending if I do an odd number
of imputations, but the same answer on the repeats
if I do en even number (which includes the second
repeat of an odd number).

It may possibly have something to do with how I've
written the code for the loop, but if so then I'm
not seeing it!

CODE:

## Set up the situation:
Data<-read.csv("MyData.csv")
X<-Data$X; Y<-Data$Y
##(If you want to try it, set your own data here)
Raw<-cbind(X,Y)
library(norm)

## Initialise stuff
s<-prelim.norm(Raw)
t0<-em.norm(s)

##########################
## Set the Random Seed
rngseed(31425)

## Do the first imputation:
t     <-  da.norm(s,t0,steps=20)
Imp   <- imp.norm(s,t, Raw)
X.Imp <- Imp[,1]; Y.Imp<-Imp[,2]

## Now do the rest, and accumulate lists of the results
## Est.Imp = list of estimated coeffs
## SE.Imp  = list of SEs of estimated coeffs:
Est.Imp <- list(summary(lm(Y.Imp~X.Imp))$coef[,1])
SE.Imp  <- list(summary(lm(Y.Imp~X.Imp))$coef[,2])
N=4
for(i in (2:N)){
  t<-da.norm(s,t,steps=20)
  Imp<-imp.norm(s,t,Raw)
  X.Imp<-Imp[,1]; Y.Imp<-Imp[,2]
  Est.Imp<-c(Est.Imp,list(summary(lm(Y.Imp~X.Imp))$coef[,1]))
  SE.Imp <-c( SE.Imp,list(summary(lm(Y.Imp~X.Imp))$coef[,2]))
}

## Finally, combine the imputations:
mi.inference(Est.Imp,SE.Imp)


I've illustrated N=4 (even) above, for 4 imputations.

Now, I run the code repeatedly from "## Set the Random Seed"
down to "mi.inference(Est.Imp,SE.Imp)"

With N=4, I always get the same result.

If I set N=5, I alternately get different results:
The second run is different from the first, but the
third is the same as the first, and the fourth is the
same as the second, ...

In general, for even N, it is as for N=4, and for odd N
it is as for N=5.

Any ideas???

Thanks,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 21-Sep-07                                       Time: 14:13:27
------------------------------ XFMail ------------------------------

Prof Brian Ripley

2007-Sep-21 13:44 UTC

head link

[R] A reproducibility puzzle with NORM

Norm uses a Box-Muller normal RNG, and rngseed does not reset its state
(it has some Fortran save variables).  So if you ask for an odd number of 
normals and call rngseed, the next normal 'generated' is the second half
of the last pair with the previous seed.

Ideally packages should be converted to use R's number generators which do 
not have such quirks.

On Fri, 21 Sep 2007, Ted.Harding at manchester.ac.uk wrote:
> Hi Folks,
> I'm using the 'norm' package (based on Shafer's NORM)
> on some  data. In outline, (X,Y) are bivariate normal,
> var(X)=0.29, var(Y)=24.4, cov(X,Y)=-0.277,
> there are some 900 cases, and some 170 values of Y
> have been set "missing" (NA).
>
> The puzzle is that, repeating the multiple imputation
> starting from the same random seed, I get different
> answers from the repeats depending if I do an odd number
> of imputations, but the same answer on the repeats
> if I do en even number (which includes the second
> repeat of an odd number).
>
> It may possibly have something to do with how I've
> written the code for the loop, but if so then I'm
> not seeing it!
>
> CODE:
>
> ## Set up the situation:
> Data<-read.csv("MyData.csv")
> X<-Data$X; Y<-Data$Y
> ##(If you want to try it, set your own data here)
> Raw<-cbind(X,Y)
> library(norm)
>
> ## Initialise stuff
> s<-prelim.norm(Raw)
> t0<-em.norm(s)
>
> ##########################
> ## Set the Random Seed
> rngseed(31425)
>
> ## Do the first imputation:
> t     <-  da.norm(s,t0,steps=20)
> Imp   <- imp.norm(s,t, Raw)
> X.Imp <- Imp[,1]; Y.Imp<-Imp[,2]
>
> ## Now do the rest, and accumulate lists of the results
> ## Est.Imp = list of estimated coeffs
> ## SE.Imp  = list of SEs of estimated coeffs:
> Est.Imp <- list(summary(lm(Y.Imp~X.Imp))$coef[,1])
> SE.Imp  <- list(summary(lm(Y.Imp~X.Imp))$coef[,2])
> N=4
> for(i in (2:N)){
>  t<-da.norm(s,t,steps=20)
>  Imp<-imp.norm(s,t,Raw)
>  X.Imp<-Imp[,1]; Y.Imp<-Imp[,2]
>  Est.Imp<-c(Est.Imp,list(summary(lm(Y.Imp~X.Imp))$coef[,1]))
>  SE.Imp <-c( SE.Imp,list(summary(lm(Y.Imp~X.Imp))$coef[,2]))
> }
>
> ## Finally, combine the imputations:
> mi.inference(Est.Imp,SE.Imp)
>
>
> I've illustrated N=4 (even) above, for 4 imputations.
>
> Now, I run the code repeatedly from "## Set the Random Seed"
> down to "mi.inference(Est.Imp,SE.Imp)"
>
> With N=4, I always get the same result.
>
> If I set N=5, I alternately get different results:
> The second run is different from the first, but the
> third is the same as the first, and the fourth is the
> same as the second, ...
>
> In general, for even N, it is as for N=4, and for odd N
> it is as for N=5.
>
> Any ideas???
>
> Thanks,
> Ted.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 21-Sep-07                                       Time: 14:13:27
> ------------------------------ XFMail ------------------------------
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Seemingly Similar Threads

Search for more reasonably related threads

R help - Sep 2007 - A reproducibility puzzle with NORM

[R] A reproducibility puzzle with NORM

[R] A reproducibility puzzle with NORM

Seemingly Similar Threads