Anthony Dick
2011-Feb-24  19:23 UTC
[R] parallel bootstrap linear model on multicore mac (re-post)
Hello all,
I am re-posting my previous question with simpler, more transparent, 
commented code.
I have been banging my head against this problem, and I wondered if 
anyone could lend a hand. I want to parallelize a bootstrap of a 
linear mixed model on my 8-core Mac. Below is the process that I want to 
run in parallel (namely, the boot.out<-boot(dat.res,boot.fun, R = nboot) 
command). This is an extension to lmer of the bootstrapping linear 
models example in Venables and Ripley. Please excuse my rather terrible 
programming skills. I am always open to suggestions. Below the example I 
describe the methods I have tried.
library(boot)
library(lme4)
dat <- read.table("http://www2.fiu.edu/~adick/downloads/toy2.dat",
                  header = TRUE)
nboot <- 1000 # number of bootstraps
x <- dat[,2] # IV number 1
y <- dat[,4] # DV
z <- dat[,3] # IV number 2
subj <- dat[,1] # random factor
boot.fun <- function(data, i) { # function to resample residuals
  d <- data
  # populate new y values based on resampled residuals
  d$y <- d$fitted + d$res[i]
  # update the linear model (fit, defined below) and output the coefficients
  as.numeric(coef(update(fit, data = d))[1][[1]][1, c(1:4)])
}
fit <- lmer(y ~ x*z + (1|(subj))) # the linear model
# add residuals and fitted values to dat
dat.res <- data.frame(y, x, z, subj, res = resid(fit), fitted = fitted(fit))
# run the bootstrap using the boot.fun
boot.out <- boot(dat.res, boot.fun, R = nboot)
boot.out
Methods attempted:
Using the multicore package, I tried 
boot.out<-collect(parallel(boot(dat.res,boot.fun, R = nboot))). This 
returned a correct result, but did not speed things up. Not sure why...
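One possible reason for the missing speed-up: parallel() in multicore evaluates its whole expression in a single forked child, so wrapping one boot() call this way still runs all replicates on one core. The following is a minimal sketch of spreading the replicates over several workers instead, assuming a version of boot that provides the parallel and ncpus arguments. The simulated data, the lm() stand-in for lmer(), and the names sim, fit0, sim.res, boot.fun0, and boot.par are all hypothetical and chosen only to keep the example self-contained.

```r
library(boot)  # recommended package shipped with R

set.seed(1)
# Simulated stand-in for the toy data (hypothetical values, not toy2.dat)
n <- 60
sim <- data.frame(x = rnorm(n), z = rnorm(n))
sim$y <- 1 + 0.5 * sim$x - 0.3 * sim$z + rnorm(n)

# Plain lm() stands in for lmer() so the sketch needs only base R plus boot
fit0 <- lm(y ~ x * z, data = sim)
sim.res <- data.frame(sim, res = resid(fit0), fitted = fitted(fit0))

# Same residual-resampling idea as boot.fun in the post
boot.fun0 <- function(data, i) {
  d <- data
  d$y <- d$fitted + d$res[i]      # new response from resampled residuals
  coef(lm(y ~ x * z, data = d))   # refit and return the four coefficients
}

# Ask boot() itself to distribute the replicates over forked workers
boot.par <- boot(sim.res, boot.fun0, R = 200,
                 parallel = "multicore", ncpus = 2)
dim(boot.par$t)  # one row per replicate, one column per coefficient
```

On platforms without forking, boot() should fall back to serial evaluation for parallel = "multicore", so the sketch still runs, only without the speed-up.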
I also tried snowfall and snow. While I can create a cluster and run 
simple processes (e.g., the examples provided in the literature), I can't get 
the bootstrap to run. For example, using snow:
cl <- makeCluster(8)
clusterSetupRNG(cl)
clusterEvalQ(cl,library(boot))
clusterEvalQ(cl,library(lme4))
boot.out<-clusterCall(cl,boot(dat.res,boot.fun, R = nboot))
stopCluster(cl)
returns the following error:
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
   8 nodes produced errors; first error: could not find function "fun"
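The error appears consistent with clusterCall() expecting a function as its second argument, whereas the call above hands it the boot(...) expression. The following is a minimal sketch of the pattern with a function, using the base parallel package, whose cluster functions mirror those of snow (an assumption about the installed R version). The simulated data, the lm() stand-in for lmer(), the two-worker cluster, and the names sim, fit0, sim.res, boot.fun0, pieces, and boot.t are hypothetical.

```r
library(parallel)  # base R; cluster functions modelled on snow
library(boot)

set.seed(1)
# Simulated stand-in data and lm() in place of lmer() (both hypothetical)
n <- 60
sim <- data.frame(x = rnorm(n), z = rnorm(n))
sim$y <- 1 + 0.5 * sim$x - 0.3 * sim$z + rnorm(n)
fit0 <- lm(y ~ x * z, data = sim)
sim.res <- data.frame(sim, res = resid(fit0), fitted = fitted(fit0))

boot.fun0 <- function(data, i) {
  d <- data
  d$y <- d$fitted + d$res[i]      # new response from resampled residuals
  coef(lm(y ~ x * z, data = d))   # refit and return the four coefficients
}

cl <- makeCluster(2)
clusterSetRNGStream(cl, 123)      # separate random-number stream per worker
clusterEvalQ(cl, library(boot))   # load boot on every worker

# Pass a function plus its arguments; each worker runs its own 100 replicates
pieces <- clusterCall(cl,
                      function(dat, stat, R) boot(dat, stat, R = R)$t,
                      sim.res, boot.fun0, 100)
stopCluster(cl)

boot.t <- do.call(rbind, pieces)  # stack the per-worker replicate matrices
dim(boot.t)
```

Passing the data and the statistic as arguments means the workers receive everything they need with the call itself. With the lmer version, the fitted model used inside boot.fun would also have to reach the workers, for example through clusterExport().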
I am stuck and at the limit of my programming knowledge and am punting 
to the R-help list. I need to run this process thousands of times, which 
is the reason to make it parallel. Any suggestions are much appreciated.
Anthony
-- 
Anthony Steven Dick, Ph.D.
Assistant Professor
Department of Psychology
Florida International University
Modesto A. Maidique Campus DM 296B
11200 S.W. 8th Street
Miami, FL 33199
Phone: 305-348-4202
Lab Phone: 305-348-9057 or 305-348-9055 (I am usually here)
Fax: 305-348-3879
Email: adick at fiu.edu
Webpage: http://www.fiu.edu/~adick
Anthony Dick
2011-Mar-02  22:38 UTC
[R] parallel bootstrap linear model on multicore mac (re-post)