Ben Ammar
2013-Oct-28 13:19 UTC
[R] How to extract residuals from multiple regressions from a loop
Dear all,

I've got the following problem: I want to extract the residuals from regression loops. The problem here is that some columns include NAs at the beginning and end (i.e. each time series of stocks starts at a different point in time and ends at a different point in time). When I want to transfer these residuals into a matrix to determine the residual matrix, I get the error message "number of items to replace is not a multiple of replacement length". I tried it with na.action=na.exclude, but that doesn't work because that command doesn't actually change the vector length. With a loop I came this far (number of stocks is 50 and maximum time period is 258 months):

    for (i in 1:50) {CAPM.res[,i] <- residuals(lm(timeseries[,i]~exc.mkt), na.action=na.exclude)}

As I said, it doesn't work because of the different column lengths in the matrix "timeseries". So right now I'm doing it manually, which works perfectly but is quite laborious and looks like this:

    test.1 <- lm(timeseries[,1]~exc.mkt, na.action=na.exclude)
    residual.test.1 <- residuals(test.1)
    CAPM.res[,1] <- residual.test.1

    test.2 <- lm(timeseries[,2]~exc.mkt, na.action=na.exclude)
    residual.test.2 <- residuals(test.2)
    CAPM.res[,2] <- residual.test.2

....and so on for the remaining 48 stocks. When I look at that, I obviously see that this must be done with a loop, but in the end I can't put the results in the matrix because of the different lengths. So far I got this:

    test<-matrix(0,50,258)
    residual.test<-matrix(0,50,258)
    for (i in 1:50) {lm(timeseries[,i]~exc.mkt, na.action=na.exclude)
    {residual.test[i] <- residuals(test[i])
    {CAPM.res[,i] <- residual.test[i]
    }}}

but here I get the error message "Error: $ operator is invalid for atomic vectors", and I don't think "test" and "residual.test" are defined correctly because I don't know where to look for the residuals.

Does anyone have an idea how to extract the residuals and put them in a 258x50 matrix? Any help would be very much appreciated!

Cheers,
Ben
William Dunlap
2013-Oct-28 15:33 UTC
[R] How to extract residuals from multiple regressions from a loop
Can you trim down your example to a size where you can show us the data (using dump() or dput()) and the commands you used, so one can just copy it from your mail and paste it into R to reproduce the problem?

I don't see your problem when I made up data similar to what you described:

> timeseries <- ts(cbind(c(NA, 2:9, NA), c(1,NA,3,NA,5:10), c(rep(NA,9), 10)))
> exc.mkt <- log(1:10)
> for(i in 1:ncol(timeseries)) print(length(residuals(lm(timeseries[,i]~exc.mkt, na.action=na.exclude))))
[1] 10
[1] 10
[1] 10
> # without na.action=na.exclude residuals do vary in length
> for(i in 1:ncol(timeseries)) print(length(residuals(lm(timeseries[,i]~exc.mkt))))
[1] 8
[1] 8
[1] 1

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
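A note on the first loop in the original post: as written there, na.action=na.exclude sits outside the lm() call and is handed to residuals() instead, where it is silently absorbed by the "..." argument. That may well explain the "number of items to replace" error. The sketch below uses made-up data (not the poster's) and assumes the default setting options(na.action = "na.omit"); only the object names timeseries, exc.mkt and CAPM.res are taken from the post.

```r
# Made-up data in the spirit of the example above (not the poster's data)
timeseries <- cbind(c(NA, 2:9, NA), c(1, NA, 3, NA, 5:10))
exc.mkt <- log(1:10)

# na.action given to residuals(): lm() still drops the NA rows
# (assuming the default na.omit), so the residual vector is shorter
short <- residuals(lm(timeseries[, 1] ~ exc.mkt), na.action = na.exclude)

# na.action given to lm(): residuals() pads the dropped rows with NA
padded <- residuals(lm(timeseries[, 1] ~ exc.mkt, na.action = na.exclude))

length(short)
length(padded)

# With the argument inside lm(), every column fits the pre-allocated matrix
CAPM.res <- matrix(NA_real_, nrow = nrow(timeseries), ncol = ncol(timeseries))
for (i in seq_len(ncol(timeseries))) {
  fit <- lm(timeseries[, i] ~ exc.mkt, na.action = na.exclude)
  CAPM.res[, i] <- residuals(fit)
}
```

With the poster's data the matrix would be pre-allocated as 258 rows by 50 columns; the loop body stays the same.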
Adams, Jean
2013-Oct-28 15:50 UTC
[R] How to extract residuals from multiple regressions from a loop
Ben,

It helps us help you if you provide a simple example with code that anyone can run. When I created a simplified version of your situation, the NAs in the response variable yielded NAs in the residuals, just as the help for na.exclude indicates.

?na.exclude

Can you replicate the problem (and the error message) you are having using reproducible code?

Jean

# number of observations
n1 <- 10
# number of response variables
n2 <- 5
# randomly generate the independent variable
x <- rnorm(n1)
# create a matrix to store the residuals in
keep <- matrix(NA, nrow=n1, ncol=n2)
# fit a linear model for each response variable
for(i in 1:n2) {
    # randomly generate the response
    y <- rnorm(n1)
    # randomly put in some missing values
    y[sample(n1, 2)] <- NA
    # fit the data
    fit <- lm(y ~ x, na.action=na.exclude)
    # keep the residuals
    keep[, i] <- resid(fit)
}
# look at the resulting matrix of residuals
keep

            [,1]        [,2]       [,3]        [,4]       [,5]
 [1,] -1.1435507  0.29915778 -0.4593465 -0.61984029  0.8691960
 [2,]  1.3410409  0.51701634         NA  0.65691397         NA
 [3,]  1.0896517          NA  0.2239847 -0.63233644 -1.0831747
 [4,]         NA -1.80344171 -1.2984848 -0.26543679  0.4486482
 [5,] -2.0836253  1.05313477         NA -0.02031142 -0.1059559
 [6,]  1.4498942  0.61520388 -0.2172015 -0.90647457  0.4935462
 [7,] -0.6265764 -0.08396366  0.5153020          NA -0.2501000
 [8,]         NA          NA  0.9337658 -0.10289794         NA
 [9,] -0.4217575 -0.42633169  0.1070141  1.89038347 -0.6588342
[10,]  0.3949231 -0.17077571  0.1949662          NA  0.2866744
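On the second error quoted in the original post ("$ operator is invalid for atomic vectors"): it appears to come from calling residuals() on test[i], which is a single number taken from a numeric matrix rather than a fitted model. One way around this, sketched here under assumptions, is to keep the fits in a list and collect the residuals afterwards. The data are made up, the sizes n.months and n.stocks are hypothetical stand-ins for the poster's 258 and 50, and the seed is arbitrary.

```r
# Made-up data; object names follow the original post
set.seed(42)                      # arbitrary seed, for repeatability only
n.months <- 12                    # hypothetical; the post has 258
n.stocks <- 3                     # hypothetical; the post has 50
exc.mkt <- rnorm(n.months)
timeseries <- matrix(rnorm(n.months * n.stocks), n.months, n.stocks)
timeseries[1:2, 1] <- NA          # stock 1 starts late
timeseries[11:12, 2] <- NA        # stock 2 ends early

# A list can hold whole lm objects; a numeric matrix cannot
test <- lapply(seq_len(n.stocks), function(i) {
  lm(timeseries[, i] ~ exc.mkt, na.action = na.exclude)
})

# With na.exclude every residual vector has n.months entries,
# so sapply() simplifies the result to an n.months x n.stocks matrix
CAPM.res <- sapply(test, residuals)
dim(CAPM.res)
```

Keeping the list of fits around also leaves the coefficients and summaries available later, which the manual test.1, test.2, ... approach provided as well.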