Doran, Harold wrote:
> Dear List:
>
> It appears that simulating data where all dataframes are stored as a
> list will only work for relatively small analyses. Instead, it appears
> that creating N individual dataframes, saving them, and loading them
> when needed is the best way to save memory and make this a feasible
> task.
>
> As such, I now have a new(er) question with respect to dealing with
> individual files within a loop. To begin, I construct 250 individual
> data files as follows:
>
> library(MASS)
> Sigma<-matrix(c(400,80,80,80,80,400,80,80,80,80,400,80,80,80,80,400),4,4)
> mu<-c(100,150,200,250)
> sample.size<-5000
> N=250 #Number of dataframes
>
> # Step 1 Create dataframes
> for(i in 1:N)
> {
> assign(paste("Data.", i, sep=''),
> as.data.frame(cbind(seq(1:sample.size),(mvrnorm(n=sample.size, mu,
> Sigma)))))
> }
>
>
> My goal is to now save each file, remove them from memory, and load each
> file individually, run it through a linear model, and save the output.
>
> To attempt to save each file I try
>
> for (i in 1:N) {
> save(paste("Data.", i, sep=""), file=paste("Data.",i, ".Rdata", sep=""))
>
> }
>
> Which gives me an error message. It essentially states that Object
> "paste("Data.", i, sep = "")" not found
>
> I'm not quite sure what I am doing wrong here.
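save() does not evaluate its unnamed arguments; it treats them as the
names of the objects to save, so it goes looking for an object literally
called paste("Data.", i, sep = ""). In general you need to use 'get' to
convert a character string into an object. For save() the alternative is
to pass the names through the 'list' argument, as in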
for (nm in paste("Data.", 1:N, sep = '')) {
save(list = nm, file = paste(nm, ".Rdata", sep = ''))
}
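To check the whole round trip (save, remove from memory, load again), here
is a minimal sketch. It assumes small random stand-in data frames instead
of the mvrnorm simulation, N = 3 instead of 250, and a hypothetical output
directory 'out.dir' set to tempdir(); none of those choices come from the
original post.

```r
set.seed(1)
out.dir <- tempdir()  # hypothetical output location (assumption)
N <- 3                # small stand-in for the 250 in the original post

for (nm in paste("Data.", 1:N, sep = "")) {
  # Stand-in data: 50 rows, columns named V1..V5 by as.data.frame
  assign(nm, as.data.frame(matrix(rnorm(50 * 5), ncol = 5)))
  # 'list' takes the object name as a character string
  save(list = nm, file = file.path(out.dir, paste(nm, ".Rdata", sep = "")))
  # Drop the object so only one data frame is in memory at a time
  rm(list = nm)
}

# load() restores the object under its saved name and returns that name
restored <- load(file.path(out.dir, "Data.1.Rdata"))
dim(get(restored))
```

Because load() returns the names of the objects it restored, you do not
have to rebuild the name with paste() after loading.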
> The other issue I am encountering is how to best work with an object in
> a loop. I get the following to work, but I'm not sure if this is the
> best method for doing so.
>
> for(i in 1:N){
> assign(paste("out.",i,sep=""),
> lm((get(paste("Data.",i,sep=""))[["V3"]])~(get(paste("Data.",i,sep=""))[["V2"]])))
> }
As shown above, looping in R does not need to be over a numeric index
vector. You can loop over any vector object, including the vector of
names. Another way you can simplify this is to use the formula/data
specification for lm. If your model formula is always going to be V3 ~
V2 then just use that as in
for (nm in paste("Data.", 1:N, sep = '')) {
assign(paste(nm, '.out', sep=''), lm(V3 ~ V2, data = get(nm)))
}
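If the aim is to keep only one data frame in memory at a time, the same
loop can load each file, fit the model, keep what is needed, and remove the
data again. The following is only a sketch under assumptions: the files are
small random stand-ins written to a hypothetical 'out.dir', N is 3, and
only the coefficients are kept (in a list I have called 'coefs') rather
than the full fitted objects.

```r
set.seed(1)
out.dir <- tempdir()  # hypothetical location of the saved files (assumption)
N <- 3                # small stand-in for the 250 in the original post
data.names <- paste("Data.", 1:N, sep = "")

# Write stand-in files so the example is self-contained
for (nm in data.names) {
  assign(nm, as.data.frame(matrix(rnorm(50 * 5), ncol = 5)))
  save(list = nm, file = file.path(out.dir, paste(nm, ".Rdata", sep = "")))
  rm(list = nm)
}

# Load one file at a time, fit, keep the coefficients, free the data
coefs <- list()
for (nm in data.names) {
  load(file.path(out.dir, paste(nm, ".Rdata", sep = "")))
  fit <- lm(V3 ~ V2, data = get(nm))
  coefs[[nm]] <- coef(fit)
  rm(list = nm)
}
```

By default an lm fit carries a copy of the model frame, so storing 250
complete fits can use a good deal of memory. Keeping just coef(fit), or
whatever summary you actually need, avoids that.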
There are probably better approaches but I think you would prefer that I
work on the lme4 code instead of spending more time answering these
questions. :-)
(BTW, I believe I have a very general formulation for fitting models
with non-nested grouping factors, such as the student-teacher-school
data from DC. Still a few bugs though.)