new ruser
2007-May-31 12:45 UTC
[R] loading several "samples" of data from hard-drive, run "lm", "rlm", etc, save results in a list
I have many "sample" datasets (e.g. sample 5, sample 6, etc.), each identified by a number as a suffix. These datasets are saved as individual R objects on my hard drive (e.g. "Wind.5.r", "Wind.6.r", "Solar.5.r", "Solar.6.r").

For example purposes, I have written code that creates similar data files using the "airquality" dataset (see below).

#this creates my sample data files
library(datasets)
getwd() #fyi
for (m in 5:9) {
  tempdata = subset(airquality, Month == m)
  for (col in 1:4) {
    tempdata2 = tempdata[col]
    tempname = paste(names(airquality)[col], m, sep = ".")
    assign(tempname, tempdata2, pos = .GlobalEnv)
    save(list = tempname, file = paste(tempname, ".r", sep = ""))
    rm(tempdata2, list = tempname, tempname)
  }
  rm(tempdata, col, m)
}

(While it might be possible to combine all the data into one large R object, I have chosen not to do so. Due to the large size of my datasets, and the way they are organized, I feel it makes sense to keep them as individual files.)

I wish to load several variables from each "sample", perform a regression using the "lm" function, and then save all the regressions as objects in a list.

Here is the code I have written. Is there a "better/simpler" way to do this? (Ideally, I'd like the model I specify to be flexible, and to be able to use not only lm, but also rlm, etc. I have simplified my code for this example, but I think it represents the essential parts of what I am trying to accomplish.)

#my code to run a regression for each "sample" (i.e. "samples" 5, 6, 7, 8 & 9);
#this saves the regression results in a list called "results"
y = 'Ozone'
x = c('Wind', 'Temp')
results = list()
for (i in 5:9) {
  load(file = paste(y, i, "r", sep = "."), envir = .GlobalEnv)
  y1 = get(paste(y, i, sep = "."))
  for (d in 1:length(x)) {
    load(file = paste(x[d], i, "r", sep = "."), envir = .GlobalEnv)
    assign(paste("x", d, sep = ""), get(paste(x[d], i, sep = ".")))
  } #end d loop
  reg <- lm(y1[,1] ~ x1[,1] + x2[,1])
  results[i-4] <- list(reg)   #i-4 so that sample 5 is stored in position 1
  names(results)[i-4] <- i
  #need to add a line to remove any data files loaded
}
summary(results[[1]])
summary(results[[2]])
lapply(results, summary)
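One possible simplification, offered only as a sketch under stated assumptions: load each file into a private environment instead of .GlobalEnv (so nothing needs to be removed afterwards), assemble the variables for one sample into a data frame, and pass the fitting function in as an argument. The helper names load_var and fit_samples, the use of tempdir(), and the fit_fun argument are hypothetical and not part of the code above; the files are assumed to follow the same "<Variable>.<sample>.r" naming.

```r
library(datasets)

# Recreate the example files in a temporary directory
# (assumption: same "<Variable>.<sample>.r" naming as in the post).
data_dir <- tempdir()
for (m in 5:9) {
  month_data <- subset(airquality, Month == m)
  for (v in names(airquality)[1:4]) {
    obj_name <- paste(v, m, sep = ".")
    e <- new.env()
    assign(obj_name, month_data[v], envir = e)
    save(list = obj_name, envir = e,
         file = file.path(data_dir, paste0(obj_name, ".r")))
  }
}

# Hypothetical helper: load one variable of one sample into a private
# environment and return its single column as a plain vector.
load_var <- function(var, sample_id, dir = ".") {
  e <- new.env()
  loaded <- load(file.path(dir, paste(var, sample_id, "r", sep = ".")),
                 envir = e)
  get(loaded[1], envir = e)[[1]]
}

# Hypothetical helper: for each sample, build a data frame from the
# requested variables and fit y ~ x1 + x2 + ... with fit_fun.
fit_samples <- function(y, x, samples, fit_fun = lm, dir = ".", ...) {
  vars <- c(y, x)
  fits <- lapply(samples, function(s) {
    cols <- lapply(vars, load_var, sample_id = s, dir = dir)
    dat <- as.data.frame(setNames(cols, vars))
    fit_fun(reformulate(x, response = y), data = dat, ...)
  })
  names(fits) <- samples
  fits
}

results <- fit_samples("Ozone", c("Wind", "Temp"), 5:9, dir = data_dir)
lapply(results, coef)
```

Because the fitting function is an argument, a robust fit should only require something like fit_samples("Ozone", c("Wind", "Temp"), 5:9, fit_fun = MASS::rlm, dir = data_dir), assuming the MASS package is installed. The results are named by sample, so results[["5"]] refers to sample 5 regardless of its position in the list.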