new ruser
2007-May-31 12:45 UTC
[R] loading several "samples" of data from hard-drive, run "lm", "rlm", etc, save results in a list
I have many "sample" datasets (e.g. sample 5, sample 6, etc.), each
identified by a numeric suffix. These datasets are saved as individual R
objects on my hard drive (e.g. "Wind.5.r", "Wind.6.r", "Solar.5.r",
"Solar.6.r"). For example purposes, I have written code that creates
similar data files using the "airquality" dataset (see below).
# this creates my sample data files
library(datasets)
getwd() # fyi: the files are written to the working directory
for (m in 5:9) {
  tempdata <- subset(airquality, Month == m)
  for (col in 1:4) {
    tempdata2 <- tempdata[col]
    tempname <- paste(names(airquality)[col], m, sep = ".")
    assign(tempname, tempdata2, pos = .GlobalEnv)
    save(list = tempname, file = paste(tempname, ".r", sep = ""))
    rm(tempdata2, list = tempname, tempname)
  }
  rm(tempdata, col, m)
}
(While it might be possible to combine all the data into one large R object, I
have chosen not to do so. Due to the large size of my datasets, and the way
they are organized, I feel it makes sense to keep them as individual
files.)
I wish to load several variables from each "sample", perform a
regression using the "lm" function, and then save all the
regressions as objects in a list.
Here is the code I have written. Is there a "better/simpler" way to
do this? Ideally, I'd like the model I specify to be flexible, and to be
able to use not only lm, but also rlm, etc. (I have simplified my code for this
example, but I think it represents the essential parts of what I am trying to
accomplish.)
# my code to run a regression for each "sample" (i.e. "samples" 5, 6, 7, 8 & 9)
# this saves the regression results in a list called "results"
y <- "Ozone"
x <- c("Wind", "Temp")
results <- list()
for (i in 5:9) {
  # load the response variable for sample i
  load(file = paste(y, i, "r", sep = "."), envir = .GlobalEnv)
  y1 <- get(paste(y, i, sep = "."))
  for (d in seq_along(x)) {
    # load predictor d and keep a copy as x1, x2, ...
    load(file = paste(x[d], i, "r", sep = "."), envir = .GlobalEnv)
    assign(paste("x", d, sep = ""), get(paste(x[d], i, sep = ".")))
  } # end d loop
  reg <- lm(y1[, 1] ~ x1[, 1] + x2[, 1])
  # store the fit under its sample number ("5", "6", ...)
  results[[as.character(i)]] <- reg
  # remove the data objects loaded for this sample
  rm(list = paste(c(y, x), i, sep = "."))
}
summary(results[[1]])
summary(results[[2]])
lapply(results,summary)
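
One possible simplification, given only as a minimal sketch: load each file into a throwaway environment so nothing is left in the workspace, build a data frame per sample, and pass the fitting function in as an argument. The helper names load_var and fit_samples, the data_dir variable, and the use of tempdir() are assumptions made for this illustration; they are not part of the code above. The first loop only recreates the sample files so that the sketch runs on its own.

```r
library(datasets)

# Recreate the per-sample files in a temporary directory (assumed location),
# using the same "<variable>.<sample>.r" naming scheme as above.
data_dir <- tempdir()
for (m in 5:9) {
  for (v in c("Ozone", "Wind", "Temp")) {
    obj_name <- paste(v, m, sep = ".")
    assign(obj_name, subset(airquality, Month == m)[v])
    save(list = obj_name, file = file.path(data_dir, paste0(obj_name, ".r")))
    rm(list = obj_name)
  }
}

# Hypothetical helper: load one saved variable into a private environment
# and return its single column as a vector.
load_var <- function(var, i, dir) {
  e <- new.env()
  load(file.path(dir, paste(var, i, "r", sep = ".")), envir = e)
  get(paste(var, i, sep = "."), envir = e)[, 1]
}

# Hypothetical helper: fit `fit_fun` (lm by default) to every sample and
# return the fits as a list named by sample number.
fit_samples <- function(y, x, samples, dir, fit_fun = lm, ...) {
  fits <- lapply(samples, function(i) {
    cols <- lapply(c(y, x), load_var, i = i, dir = dir)
    dat <- as.data.frame(setNames(cols, c(y, x)))
    fit_fun(reformulate(x, response = y), data = dat, ...)
  })
  setNames(fits, samples)
}

results <- fit_samples("Ozone", c("Wind", "Temp"), 5:9, data_dir)
lapply(results, summary)
```

If the MASS package is installed, the same call with fit_fun = MASS::rlm should give robust fits instead, since rlm also accepts a formula and a data argument. Because the data are loaded into a private environment, no clean-up step is needed after each sample.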
