Murtaza Das
2008-Dec-26 17:37 UTC
[R] lm() with same formula but different column/factor combinations in data frame
Hi,

I am trying to find an efficient way of applying a linear regression model to different factor combinations in a data frame. I want to obtain the output with minimal or no use of loops, if possible. Please let me know if this query is unclear.

Thanks,
Murtaza

The data frame TEST1 has four factor columns followed by twelve numeric columns, defined as:
1) Community, levels: "20232"
2) WT, levels: "B", "E", "M"
3) LTC, levels: "L", "M", "S", "1"
4) UC, levels: "1X1", "2X2"
5) UncDmd: response variable in the linear model
6-16) M1...M11: explanatory variables in the linear model

A few sample rows of the data frame:

> TEST1[1:15,]
   Community WT LTC  UC   UncDmd M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11
1      20232  E   L 1X1 1.000000  0  0  0  0  0  0  0  0  0   0   1
2      20232  E   L 2X2 0.000000  0  0  0  0  0  0  0  0  0   0   1
3      20232  E   M 1X1 1.000000  0  0  0  0  0  0  0  0  0   0   1
4      20232  E   M 2X2 1.000000  0  0  0  0  0  0  0  0  0   0   1
5      20232  E   S 1X1 0.000000  0  0  0  0  0  0  0  0  0   0   1
6      20232  E   S 2X2 0.000000  0  0  0  0  1  0  0  0  0   0   0
7      20232  B   1 1X1 0.209117  0  0  0  0  0  0  0  0  0   0   1
8      20232  B   1 2X2 0.190605  0  0  0  0  0  0  0  0  0   0   1
9      20232  B   L 1X1 0.000000  0  0  0  0  1  0  0  0  0   0   0
10     20232  B   L 2X2 1.000000  0  0  0  0  0  0  0  0  0   0   1
11     20232  B   M 1X1 4.000000  0  0  0  0  0  0  0  0  0   0   1
12     20232  B   M 2X2 0.000000  0  0  0  0  0  0  0  0  0   0   1
13     20232  B   S 1X1 0.000000  1  0  0  0  0  0  0  0  0   0   0
14     20232  B   S 2X2 0.000000  0  0  0  0  0  0  0  0  0   0   1
15     20232  M   1 1X1 0.618689  0  0  0  0  0  0  0  0  0   1   0

I need to store the coefficients from lm() for different combinations of the 4 factors, or different combinations of 3 factors, or different combinations of 2 factors, or different combinations of 1 factor.

The formula remains fixed as:

> Formula
UncDmd ~ M1 + M2 + M3 + M4 + M5 + M6 + M7 + M8 + M9 + M10 + M11

So, the different models I want to solve in R are:

 1) Community      : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") ) , ])
 2) WT             : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="B") ) , ])
 3) WT             : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="E") ) , ])
 4) WT             : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="M") ) , ])
 5) LTC            : lm(Formula,TEST1[ as.logical( (TEST1[[3]]=="L") ) , ])
 6) LTC            : lm(Formula,TEST1[ as.logical( (TEST1[[3]]=="M") ) , ])
 7) LTC            : lm(Formula,TEST1[ as.logical( (TEST1[[3]]=="S") ) , ])
 8) LTC            : lm(Formula,TEST1[ as.logical( (TEST1[[3]]=="1") ) , ])
 9) UC             : lm(Formula,TEST1[ as.logical( (TEST1[[4]]=="1X1") ) , ])
10) UC             : lm(Formula,TEST1[ as.logical( (TEST1[[4]]=="2X2") ) , ])
11) Community, WT  : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[2]]=="B") ) , ])
12) Community, WT  : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[2]]=="E") ) , ])
13) Community, WT  : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[2]]=="M") ) , ])
14) Community, LTC : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[3]]=="L") ) , ])
15) Community, LTC : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[3]]=="M") ) , ])
16) Community, LTC : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[3]]=="S") ) , ])
17) Community, LTC : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[3]]=="1") ) , ])
18) Community, UC  : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[4]]=="1X1") ) , ])
19) Community, UC  : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[4]]=="2X2") ) , ])
20) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="B") * (TEST1[[3]]=="L") ) , ])
21) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="B") * (TEST1[[3]]=="M") ) , ])
22) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="B") * (TEST1[[3]]=="S") ) , ])
23) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="B") * (TEST1[[3]]=="1") ) , ])
24) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="E") * (TEST1[[3]]=="L") ) , ])
25) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="E") * (TEST1[[3]]=="M") ) , ])
26) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="E") * (TEST1[[3]]=="S") ) , ])
27) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="E") * (TEST1[[3]]=="1") ) , ])
28) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="M") * (TEST1[[3]]=="L") ) , ])
29) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="M") * (TEST1[[3]]=="M") ) , ])
30) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="M") * (TEST1[[3]]=="S") ) , ])
31) WT, LTC        : lm(Formula,TEST1[ as.logical( (TEST1[[2]]=="M") * (TEST1[[3]]=="1") ) , ])
32) WT, UC         : ...
...
xx) LTC, UC        : ...
xxx) Community, WT, LTC : ...
...
and so on, up to:
xxxx) Community, WT, LTC, UC : lm(Formula,TEST1[ as.logical( (TEST1[[1]]=="20232") * (TEST1[[2]]=="M") * (TEST1[[3]]=="1") * (TEST1[[4]]=="2X2") ) , ])

Desired output format (or something similar):

    Factor1 Factor2 Factor3 Factor4 Intercept M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11
 1) 20232                           x         x  x  x  x  x  x  x  x  x  x   x
 2) B                               x         x  x  x  x  x  x  x  x  x  x   x
 3) E                               x         x  x  x  x  x  x  x  x  x  x   x
 4) M                               x         x  x  x  x  x  x  x  x  x  x   x
 5) L                               x         x  x  x  x  x  x  x  x  x  x   x
 6) M                               x         x  x  x  x  x  x  x  x  x  x   x
 7) S                               x         x  x  x  x  x  x  x  x  x  x   x
 8) 1                               x         x  x  x  x  x  x  x  x  x  x   x
 9) 1X1                             x         x  x  x  x  x  x  x  x  x  x   x
10) 2X2                             x         x  x  x  x  x  x  x  x  x  x   x
11) 20232   B                       x         x  x  x  x  x  x  x  x  x  x   x
..
and so on.

Here x is the respective coefficient obtained from the linear fit.
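The list of models above can be enumerated mechanically rather than written out by hand. A minimal sketch, assuming only the level sets given in the post; the names levs, grid and n_used are made up for illustration, and NA is used here to mean "this factor is not part of the model":

```r
# Level sets as listed in the post
levs <- list(
  Community = "20232",
  WT        = c("B", "E", "M"),
  LTC       = c("L", "M", "S", "1"),
  UC        = c("1X1", "2X2")
)

# Each factor is either unused (NA) or fixed at one of its levels
grid <- expand.grid(lapply(levs, function(l) c(NA, l)),
                    stringsAsFactors = FALSE)

# Drop the single row in which no factor is used at all
n_used <- rowSums(!is.na(grid))
grid   <- grid[n_used > 0, ]
n_used <- n_used[n_used > 0]

nrow(grid)      # 119 = 2 * 4 * 5 * 3 - 1
table(n_used)   # number of models using 1, 2, 3 and 4 factors
```

The count of 119 holds only if every level combination actually occurs in TEST1; combinations with no rows would have to be skipped before fitting.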
Gabor Grothendieck
2008-Dec-26 17:58 UTC
[R] lm() with same formula but different column/factor combinations in data frame
See the leaps package.
Murtaza Das
2008-Dec-26 18:57 UTC
[R] lm() with same formula but different column/factor combinations in data frame
Thanks for replying, Gabor.

I checked the leaps() function, and I think it is intended to find the best combination of predictors in the linear model. Does leaps have a way to combine different factor columns in my data frame, as follows?

My regression model is fixed. The combination of predictor variables always remains the same:

    UncDmd ~ M1 + M2 + M3 + M4 + M5 + M6 + M7 + M8 + M9 + M10 + M11

I want to get the coefficients of this linear model for different combinations of factors (a combination selected from the first four columns of the data frame) and their levels, that is, to apply lm() to each combination of levels within the selected factor columns. Thus, for each combination, the data used to determine the model coefficients will be different.

I am attaching the data and R files (a long method using loops) that I use to get the result. Currently, I modify keys to get different combinations. Also note that in the script the data frame is named LRO1.

Thanks again,
Murtaza
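The requirement described in this message (same formula, different rows for each combination of factor levels) can be sketched in base R with combn(), split() and lapply(). This is one possible approach under stated assumptions, not a method proposed in the thread: the TEST1 built below is a made-up stand-in with random values, the reading of M1..M11 as indicator columns is a guess from the sample rows, and fit_subset is a hypothetical helper name.

```r
# Made-up stand-in for TEST1 with the same column layout (values are random)
set.seed(1)
n <- 600
month <- sample(1:12, n, replace = TRUE)   # assumed meaning of M1..M11
M <- outer(month, 1:11, "==") + 0L         # n x 11 indicator matrix
colnames(M) <- paste0("M", 1:11)
TEST1 <- cbind(
  data.frame(
    Community = factor(rep("20232", n)),
    WT     = factor(sample(c("B", "E", "M"), n, replace = TRUE)),
    LTC    = factor(sample(c("L", "M", "S", "1"), n, replace = TRUE)),
    UC     = factor(sample(c("1X1", "2X2"), n, replace = TRUE)),
    UncDmd = runif(n)
  ),
  M
)

# The fixed formula from the post
Formula <- reformulate(paste0("M", 1:11), response = "UncDmd")
factors <- c("Community", "WT", "LTC", "UC")

# Every non-empty subset of the factor columns (15 subsets for 4 factors)
subsets <- unlist(
  lapply(seq_along(factors), function(k) combn(factors, k, simplify = FALSE)),
  recursive = FALSE
)

# Hypothetical helper: fit the formula once per observed level combination
fit_subset <- function(vars, data, formula, factors) {
  groups <- split(data, data[vars], drop = TRUE)
  rows <- lapply(groups, function(d) {
    # Unused factors stay NA; used ones record the level of this group
    key <- setNames(rep(NA_character_, length(factors)), factors)
    key[vars] <- vapply(d[1, vars, drop = FALSE], as.character, character(1))
    cbind(as.data.frame(as.list(key), stringsAsFactors = FALSE),
          as.data.frame(t(coef(lm(formula, data = d)))))
  })
  do.call(rbind, unname(rows))
}

# One row per model: four key columns, then intercept and M1..M11
result <- do.call(rbind, lapply(subsets, fit_subset, data = TEST1,
                                formula = Formula, factors = factors))
rownames(result) <- NULL
head(result[, 1:6])
```

Keeping one key column per factor, with NA for unused factors, distinguishes level "M" of WT from level "M" of LTC, which a single label column would not. lm() reports NA for any coefficient it cannot estimate in a small group, so every row still has the same twelve coefficient columns.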
Gabor Grothendieck
2008-Dec-26 19:38 UTC
[R] lm() with same formula but different column/factor combinations in data frame
Try variations of this:

    library(leaps)
    # best-subsets search over the predictors of Fertility
    b <- regsubsets(Fertility ~ ., data = swiss)
    # logical matrix: one row per selected model, TRUE where a term is included
    w <- summary(b)$which
    # refit each selected model; the "(Intercept)" column of w lines up with
    # the response column Fertility, so swiss[w[i, ]] keeps the response too
    lapply(1:nrow(w), function(i) coef(lm(Fertility ~ ., swiss[w[i, ]])))
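The regsubsets snippet above varies which predictor columns enter the model, whereas the original question varies which rows are used. A small sketch of the difference using only base R and the built-in swiss data; the logical vector sel is a hand-written stand-in for one row of summary(b)$which, and the split on Catholic > 50 is an arbitrary grouping chosen only for illustration:

```r
# swiss ships with base R; column 1 is the response, Fertility
sel  <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)  # stand-in for one row of w
kept <- names(swiss)[sel]          # "Fertility" "Agriculture" "Education"
coef(lm(Fertility ~ ., swiss[sel]))  # all rows, fewer predictor columns

# Row-wise analogue: same formula, different rows per group
by_group <- split(swiss, swiss$Catholic > 50)
coefs <- t(sapply(by_group, function(d) coef(lm(Fertility ~ ., data = d))))
coefs                              # one row of coefficients per group
```

Adapting the lapply() idea to the question therefore means iterating over row groups rather than over rows of a column-selection matrix.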