Hello,

I have a question concerning "for loops" on multiple columns. I made 91 columns with results (all made together with a for loop), and I want to compare each of these 91 calculated columns with one column of observed values. I use the function lm to fit the model and calculate r.squared. I manage to do this for each column separately. For example, my calculated results are in the data frame "results6" and my observed results are in data (data$observed).

    # To calculate R2 for column 1:
    lm.modelobs1 <- lm(results6[,c(1)] ~ data$observed)
    R2.1 <- summary(lm.modelobs1)["r.squared"]

    # To calculate R2 for column 91:
    lm.modelobs91 <- lm(results6[,c(91)] ~ data$observed)
    R2.91 <- summary(lm.modelobs91)["r.squared"]

But I think there has to be a method to do this automatically and not 91 times. I tried to use a for loop:

    ### (length(C) = 91)
    results7 <- data.frame(lm.modelobs=rep(NA,length(C)))
    for (i in (1:91))
    {
    results7$lm.modelobs[i] <- lm(results6[i] ~ data$observed)
    R2.[i] <- summary(lm.modelobs[i])["r.squared"]
    }

I also tried just to calculate results7$lm.modelobs[i] without directly calculating r.squared, but I didn't manage that either. It seems like it's not possible to refer to a column inside a for loop or a function. (If I just ask R for the data in column 5 with "results6[5]", that works. "results6[,c(5)]" gives the same, but replacing results6[i] by results6[,c([i])] in the for loop is apparently also not a solution.)

I'm looking for a way to repeat a calculation/function on several columns. I will need this further on in my script as well, not only in this part...

I would greatly appreciate any suggestions!
Thanks!
Nerak

--
View this message in context: http://r.789695.n4.nabble.com/Accomplishing-a-loop-on-multiple-columns-tp4284974p4284974.html
Sent from the R help mailing list archive at Nabble.com.
Lists are the answer.

    LIST <- list()
    for(i in 1:ncol(results6))
    {
      LIST[[i]] <- lm(results6[,i] ~ data$observed)
    }

You'll now have a 91-entry list of lm() fits. You can then do something like this:

    LIST2 <- list()
    for(i in 1:length(LIST))
    {
      # r.squared is stored in the summary of the fit, not in the lm object itself
      LIST2[[i]] <- summary(LIST[[i]])$r.squared
    }

This should now be a list of 91 R-squared values, which you can unlist() and save in matrix form if you want.

-----
Isaac
Research Assistant
Quantitative Finance Faculty, UTS

--
View this message in context: http://r.789695.n4.nabble.com/Accomplishing-a-loop-on-multiple-columns-tp4284974p4285136.html
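A more compact variant of the same idea, offered only as a sketch: sapply() can fit the model for every column and return the R-squared values directly as a named numeric vector, so no second loop or unlist() is needed. The object names `data$observed` and `results6` come from the thread; the simulated values, the reduced column count, and the `depth_*` column names are assumptions made for illustration.

```r
set.seed(1)

# Simulated stand-ins for the poster's objects (all values are made up)
data <- data.frame(observed = rnorm(60))
results6 <- as.data.frame(
  sapply(1:5, function(k) data$observed * k + rnorm(60, sd = k))
)
names(results6) <- paste0("depth_", seq(1, by = 0.1, length.out = 5))

# Fit one lm per column and keep only the R-squared from its summary
r2 <- sapply(results6, function(col) summary(lm(col ~ data$observed))$r.squared)
r2
```

Because sapply() keeps the column names, each R-squared value stays labelled with the column it was computed from.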
Many thanks! I had never used lists before, but it's a great solution! It works very well!

However, I have a follow-up question. I want to know for which value (column) I have the maximal R squared. Therefore, I unlist the LIST so that it's written like a vector. The columns were always named in the same way: they always start with results4$depth_ followed by the number. The numbers are constructed as seq(1,10,0.1). But if the R squared values are now in one column, I don't know for which column they were calculated. So I made a new data frame with both columns:

    R2 <- unlist(LIST)
    Cvalue <- c(seq(1,10,0.1))
    results5 <- data.frame(Cvalue,R2)

    # I know I can calculate the max value of Rsquared this way:
    max(results5$R2)
    # Now I want to know which Cvalue this belongs to. I would write it like this:
    results5$Cvalue[which(results5$R2 == "max(results5$R2)")]
    # But I always get the result:
    numeric(0)

I don't know if these R squared values are in some kind of format for which this doesn't work? (I have used this before for similar things, and I know that, for example, it cannot work if R recognises the values as a date.) Maybe because they have decimals? I know that max(results5$R2) is 0.6081547 in this example, and I can see that it belongs to Cvalue == 1.8. It works the other way around:

    results5$R2[which(results5$Cvalue == "1.8")]

But neither

    results5$Cvalue[which(results5$R2 == "0.6081547")]

nor

    results5$Cvalue[which(results5$R2 == "max(results5$R2)")]

works...

I have another question concerning calculations on several columns. Again, there is a loop involved... I don't know if I should ask it in this topic as well, but I don't want to start too many similar topics. I searched the help forum but unfortunately couldn't find anything similar. Again, I manage to do it for one column (using the specific name of this column). In each column, I have 60 values. But to compare a column to another column, I have to reorganize the values.

I want each value to move down one position: the second row gets the first value, the third row gets the second value, and so on. The first value would be NA. If I did this for one column, I would do it like this:

    results$newdepth[1] <- NA
    for (t in 2:60)
    {
    results$newdepth[t] <- results$depth[t-1]
    }

As I mentioned before, the names of the columns are all constructed in the same way: results$depth_ followed by a number (seq(1,10,0.1)). So I don't know how to repeat this for all the columns at the same time. I would think of a for loop with, for example, for (i in 1:91), because there are 91 columns, but then I don't know how to say that it should happen for each column. I was thinking about using this:

    for (u in 1:91)
    {
    results$newdepth[,u] <- results$depth[,u]
    for (t in 2:60)
    {
    results$newdepth[,u][t] <- results$depth[,u][t-1]
    }}

But I can see that there are several reasons why a for loop like this cannot work (like [ ][ ], ...). I just really cannot find another way to repeat a calculation or something else on several columns...

--
View this message in context: http://r.789695.n4.nabble.com/Accomplishing-a-loop-on-multiple-columns-tp4284974p4289137.html
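On the numeric(0) result, a likely explanation, given here as a sketch rather than a definitive diagnosis: putting max(results5$R2) in quotes turns it into the literal text "max(results5$R2)", which no R2 value can equal. The comparison with "0.6081547" most likely fails because that is the rounded printed value, while the stored number carries more digits. Dropping the quotes, or using which.max(), avoids both problems. The R2 values below are invented; only the layout and the position of the maximum at Cvalue 1.8 mirror the post.

```r
# Invented R-squared values; only the layout mirrors the post
Cvalue <- seq(1, 10, 0.1)
set.seed(42)
R2 <- runif(length(Cvalue), 0, 0.5)
R2[9] <- 0.60815472913   # place the maximum at the 9th Cvalue (1.8)
results5 <- data.frame(Cvalue, R2)

# Quoted: every R2 is compared with a piece of text, so nothing matches
quoted <- results5$Cvalue[which(results5$R2 == "max(results5$R2)")]

# Unquoted comparison, or which.max(), returns the matching Cvalue
best1 <- results5$Cvalue[results5$R2 == max(results5$R2)]
best2 <- results5$Cvalue[which.max(results5$R2)]
```

which.max() returns the position of the first maximum, so it also works as a row index, for example results5[which.max(results5$R2), ].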
I just saw a little mistake in my last post: at the very end, in the last line of the last loop, results$depth[,u][t-1] should be results$newdepth[,u][t-1]. My apologies.

    for (u in 1:91)
    {
    results$newdepth[,u] <- results$depth[,u]
    for (t in 2:60)
    {
    results$newdepth[,u][t] <- results$newdepth[,u][t-1]
    }}

--
View this message in context: http://r.789695.n4.nabble.com/Accomplishing-a-loop-on-multiple-columns-tp4284974p4289248.html
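For shifting every column down by one row, one possible sketch is to apply the single-column operation to all columns at once with lapply(), which sidesteps the nested [,u][t] indexing. Note that a loop reading from the already shifted column (newdepth[t-1]) would copy the first value all the way down, so this sketch reads from the original values instead. The data frame is a small made-up stand-in for the 60 x 91 one in the thread, and `shift_down` is a hypothetical helper name.

```r
# Small made-up stand-in for the 60 x 91 data frame in the thread
results <- data.frame(depth_1 = 1:6, depth_1.1 = 11:16, depth_1.2 = 21:26)

# Hypothetical helper: put NA first and drop the last value (a lag of one row)
shift_down <- function(x) c(NA, head(x, -1))

# Apply the helper to every column; assigning into newdepth[] keeps
# the data frame shape and the column names
newdepth <- results
newdepth[] <- lapply(results, shift_down)
newdepth
```

The same pattern (write the operation for one vector, then lapply() it over the columns) should carry over to other per-column calculations later in a script.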