francesca casalino
2011-Oct-24  14:23 UTC
[R] Creating data frame with residuals of a data frame
Dear experts,
I am trying to create a data frame from the residuals I get after
having applied a linear regression to each column of a data frame, but
I don't know how to create this data frame from the resulting list
since the list has differing numbers of rows.
So for example:
age <- c(5, 6, 10, 14, 16, NA, 18)
value1 <- c(30, 70, 40, 50, NA, NA, NA)
value2 <- c(2, 4, 1, 4, 4, 4, 4)
df <- data.frame(age, value1, value2)
# Run linear regression to adjust for age and get residuals:
lm_f <- function(x) {
  x <- residuals(lm(data = df, formula = x ~ age))
}
resid <- apply(df, 2, lm_f)
resid <- resid[-1]
Then resid is a list whose elements have different lengths, because
rows with missing values are dropped from each fit:
$value1
         1          2          3          4
-16.945813  22.906404  -7.684729   1.724138
$value2
          1           2           3           4           5           7
-0.37398374  1.50406504 -1.98373984  0.52845528  0.28455285  0.04065041
I am trying to get both the original variables and their residuals in
the same data frame, like this:
age, value1, value2, resid_value1, resid_value2
But when I try cbind or other operations I get an error message
because they do not have the same number of rows. Can you please help
me figure out how to solve this?
Thank you.
Try this:

> age<- c(5,6,10,14,16,NA,18)
> value1<- c(30,70,40,50,NA,NA,NA)
> value2<- c(2,4,1,4,4,4,4)
> df<- data.frame(age, value1, value2)
>
> #Run linear regression to adjust for age and get residuals:
>
> lm_f <- function(x) {
+ x<- residuals(lm(data=df, formula= x ~ age))
+ }
> resid <- apply(df,2,lm_f)
> resid<- resid[-1]
> for (i in names(resid)){
+     newCol <- paste(i, 'res', sep = '')
+     df[[newCol]] <- NA  # initialize
+     df[[newCol]][as.integer(names(resid[[i]]))] <- resid[[i]]
+ }
> df
  age value1 value2  value1res   value2res
1   5     30      2 -16.945813 -0.37398374
2   6     70      4  22.906404  1.50406504
3  10     40      1  -7.684729 -1.98373984
4  14     50      4   1.724138  0.52845528
5  16     NA      4         NA  0.28455285
6  NA     NA      4         NA          NA
7  18     NA      4         NA  0.04065041

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
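
The loop above places each residual by its row name. A possible alternative, offered only as a sketch, is to ask lm() to pad the residuals itself with na.action = na.exclude, so every residual vector has one entry per row of df and can be bound on directly. The helper name resid_on_age, the variables vars, res and out, and the "resid_" column prefix are made up for this illustration (the prefix follows the column names requested in the question); they are not part of the original reply.

```r
# Example data from the question
age    <- c(5, 6, 10, 14, 16, NA, 18)
value1 <- c(30, 70, 40, 50, NA, NA, NA)
value2 <- c(2, 4, 1, 4, 4, 4, 4)
df <- data.frame(age, value1, value2)

# Hypothetical helper: regress one column of df on age and return
# residuals padded with NA for rows that were dropped from the fit.
resid_on_age <- function(col) {
  fit <- lm(reformulate("age", response = col), data = df,
            na.action = na.exclude)
  unname(residuals(fit))  # length equals nrow(df) because of na.exclude
}

vars <- setdiff(names(df), "age")      # every column except the predictor
res  <- sapply(vars, resid_on_age)     # matrix: nrow(df) rows, one column per variable
colnames(res) <- paste0("resid_", vars)

out <- cbind(df, res)
print(out)
```

With na.exclude, rows that had a missing age or a missing response come back as NA instead of being omitted, which is why cbind no longer complains about differing numbers of rows.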