Ben Ammar
2013-Oct-28 13:19 UTC
[R] How to extract residuals from multiple regressions from a loop
Dear all,
I want to extract the residuals from a loop of regressions. The problem
is that some columns include NAs at the beginning and end (i.e. each
stock's time series starts and ends at a different point in time). When
I try to transfer these residuals into a matrix, I get the error message
"number of items to replace is not a multiple of replacement length".
I tried na.action=na.exclude, but that doesn't work because that command
doesn't actually change the vector length. With a loop I came this far
(the number of stocks is 50 and the maximum time period is 258 months):
for (i in 1:50) {CAPM.res[,i] <- residuals(lm(timeseries[,i]~exc.mkt),
na.action=na.exclude)}
As I said, it doesn't work because of the different column lengths in the
matrix "timeseries". So right now I'm doing it more or less manually,
which works perfectly but is quite laborious and looks like this:
test.1 <- lm(timeseries[,1]~exc.mkt, na.action=na.exclude)
residual.test.1 <- residuals(test.1)
CAPM.res[,1] <- residual.test.1
test.2 <- lm(timeseries[,2]~exc.mkt, na.action=na.exclude)
residual.test.2 <- residuals(test.2)
CAPM.res[,2] <- residual.test.2
...and so on for the remaining 49 stocks. When I look at that I obviously
see that this must be done with a loop, but in the end I can't put the
residuals in the matrix because of the different lengths. So far I got this:
test<-matrix(0,50,258)
residual.test<-matrix(0,50,258)
for (i in 1:50) {lm(timeseries[,i]~exc.mkt, na.action=na.exclude)
{residual.test[i] <- residuals(test[i])
{CAPM.res[,i] <- residual.test[i]
}}}
but here I get the error message "Error: $ operator is invalid for atomic
vectors", and I don't think "test" and "residual.test" are defined
correctly because I don't know where to look for the residuals.
Does anyone have an idea how to extract the residuals and put them in a
258x50 matrix?
Any help would be very much appreciated!
Cheers,
Ben
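
In the first loop above, na.action=na.exclude sits outside the lm() call, so it is passed to residuals() rather than to lm() and appears to have no effect there. The following is a minimal sketch with the argument moved inside lm(). It uses simulated data in place of the poster's timeseries and exc.mkt; the seed, the random values and the pattern of leading and trailing NAs are assumptions for illustration only.

```r
set.seed(1)                       # illustrative seed, not from the post
n.months <- 258
n.stocks <- 50

# simulated stand-ins for the poster's objects
exc.mkt    <- rnorm(n.months)
timeseries <- matrix(rnorm(n.months * n.stocks), n.months, n.stocks)

# give each stock a random number of missing months at the start and end
for (i in seq_len(n.stocks)) {
  lead  <- sample(0:20, 1)
  trail <- sample(0:20, 1)
  if (lead > 0)  timeseries[seq_len(lead), i] <- NA
  if (trail > 0) timeseries[(n.months - trail + 1):n.months, i] <- NA
}

# na.action goes inside lm(); residuals() then pads the dropped rows with NA
CAPM.res <- matrix(NA_real_, n.months, n.stocks)
for (i in seq_len(n.stocks)) {
  fit <- lm(timeseries[, i] ~ exc.mkt, na.action = na.exclude)
  CAPM.res[, i] <- residuals(fit)
}

dim(CAPM.res)
```

Because every residual vector has the full 258 entries, the column assignment no longer raises the "not a multiple of replacement length" error in this sketch.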
William Dunlap
2013-Oct-28 15:33 UTC
[R] How to extract residuals from multiple regressions from a loop
Can you trim down your example to a size where you can show us the data
(using dump() or dput()) and the commands you used so one can just copy
it from your mail and paste it into R to reproduce the problem? I don't
see your problem when I made up data similar to what you described:

> timeseries <- ts(cbind(c(NA, 2:9, NA), c(1,NA,3,NA,5:10), c(rep(NA,9), 10)))
> exc.mkt <- log(1:10)
> for(i in 1:ncol(timeseries)) print(length(residuals(lm(timeseries[,i]~exc.mkt, na.action=na.exclude))))
[1] 10
[1] 10
[1] 10
> # without na.action=na.exclude residuals do vary in length
> for(i in 1:ncol(timeseries)) print(length(residuals(lm(timeseries[,i]~exc.mkt))))
[1] 8
[1] 8
[1] 1

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
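
The length difference shown above can also be reconciled by hand, which may help in seeing what na.exclude does. This is a sketch only, assuming plain numeric vectors instead of a ts object and the default na.action option (na.omit); the names x, y, kept and padded are illustrative.

```r
y <- c(NA, 2:9, NA)   # same shape as the first made-up series
x <- log(1:10)

fit.omit <- lm(y ~ x)                          # default na.action (na.omit)
fit.excl <- lm(y ~ x, na.action = na.exclude)

length(residuals(fit.omit))   # 8
length(residuals(fit.excl))   # 10

# the names of the shorter vector record which rows were kept
kept <- as.integer(names(residuals(fit.omit)))

# put the 8 residuals back into their original positions
padded <- rep(NA_real_, length(y))
padded[kept] <- residuals(fit.omit)

all.equal(padded, unname(residuals(fit.excl)))
```

Both routes should give the same padded vector; na.exclude simply does the bookkeeping automatically.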
Adams, Jean
2013-Oct-28 15:50 UTC
[R] How to extract residuals from multiple regressions from a loop
Ben,
It helps us help you if you provide a simple example with code that anyone
can run. When I created a simplified version of your situation, the NAs in
the response variable yielded NAs in the residuals, just as the help for
na.exclude indicates.
?na.exclude
Can you replicate the problem (and the error message) you are having using
reproducible code?
Jean
# number of observations
n1 <- 10
# number of response variables
n2 <- 5
# randomly generate the independent variable
x <- rnorm(n1)
# create a matrix to store the residuals in
keep <- matrix(NA, nrow=n1, ncol=n2)
# fit a linear model for each response variable
for(i in 1:n2) {
  # randomly generate the response
  y <- rnorm(n1)
  # randomly put in some missing values
  y[sample(n1, 2)] <- NA
  # fit the data
  fit <- lm(y ~ x, na.action=na.exclude)
  # keep the residuals
  keep[, i] <- resid(fit)
}
# look at the resulting matrix of residuals
keep
            [,1]        [,2]       [,3]        [,4]       [,5]
 [1,] -1.1435507  0.29915778 -0.4593465 -0.61984029  0.8691960
 [2,]  1.3410409  0.51701634         NA  0.65691397         NA
 [3,]  1.0896517          NA  0.2239847 -0.63233644 -1.0831747
 [4,]         NA -1.80344171 -1.2984848 -0.26543679  0.4486482
 [5,] -2.0836253  1.05313477         NA -0.02031142 -0.1059559
 [6,]  1.4498942  0.61520388 -0.2172015 -0.90647457  0.4935462
 [7,] -0.6265764 -0.08396366  0.5153020          NA -0.2501000
 [8,]         NA          NA  0.9337658 -0.10289794         NA
 [9,] -0.4217575 -0.42633169  0.1070141  1.89038347 -0.6588342
[10,]  0.3949231 -0.17077571  0.1949662          NA  0.2866744
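
The same idea can be written without an explicit loop when the responses are already stored as the columns of a matrix, as in the original question. This is a sketch under assumptions: Y is a hypothetical stand-in for the poster's timeseries matrix, and the seed and data are simulated.

```r
set.seed(42)          # illustrative seed
n1 <- 10              # number of observations
n2 <- 5               # number of response variables
x  <- rnorm(n1)

# hypothetical response matrix with two missing values per column
Y <- matrix(rnorm(n1 * n2), nrow = n1, ncol = n2)
for (i in seq_len(n2)) Y[sample(n1, 2), i] <- NA

# one regression per column; sapply() binds the equal-length
# residual vectors into an n1 x n2 matrix
keep <- sapply(seq_len(n2), function(i) {
  resid(lm(Y[, i] ~ x, na.action = na.exclude))
})

dim(keep)
all(is.na(keep) == is.na(Y))
```

Fitting each column separately matters here: a single lm() call with a matrix response would drop every row that has an NA in any column.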