Greetings, I am interested in creating a stepwise fixed order regression function. There's a function for this already called add1( ). The F statistics are calculated using type 2 anova (the SS and the F changes don't match SPSS's). You can see my use of this at the very end of the email. What I want: a function to make an anova table with f changes and delt R^2. I ran into 10 snags to making this a fully automated function using the full linear model (order matters here). Each snag is marked with a Comment #. Some snags are repeated because I couldn't do the first time and certainly couldn't do it after that. Help with any/all snags would be very appreciated. I'm a 2 1/2 month [R] user who's reading everything online (incl. manuals) & ordering every book I can (looking at Dalgaard, Crawly's and Teetor's very helpful books right now). Loops and their usage is a foreign thing to me, despite studying them, and unfortunately I think that my function may call for them. I also realize that beyond 10 predictors may make this function way to bulky. I'm running the latest version of R (2.12.2)on a windows 7 machine. DATASET mtcars full.model<-lm(mpg~cyl+disp+hp+drat, data=mtcars) CODE stepFO<-function(model) { m<-data.frame(model.frame(model)) num.of.var<-length(colnames(m)) mod1<-lm(m[,1]~m[,2]) mod2<-lm(m[,1]~m[,2]+m[,3]) mod3<-lm(m[,1]~m[,2]+m[,3]+m[,4]) mod4<-lm(m[,1]~m[,2]+m[,3]+m[,4]+m[,5]) #Comment 1--I don't know how to automated this process(above) of adding #...additional variables. Probably a loop is needed but I don't understand #...how to apply it here. Maybe update.model [1:num.ofvar]? a1<-anova(mod1) a2<-anova(mod2) a3<-anova(mod3) a4<-anova(mod4) #Comment 2--SAME AS COMMENT 1 except applied to the anova tables. How do I make #...[R] add a5, a6...an as necessary? rb<-rbind(a1,a2,a3,a4) #Comment 3--again I can't automate this to make the addition of a's automated anova1<-rbind(rb[1,],rb[4,],rb[8,],rb[13,],rb[14,]) #Comment 4--the rb's follow a pattern of 1+3+4+5...+n variables #then I row bind these starting with 1 and rowbind one more after the last #...rb to include the bottom piece of the anova table that tells #...about residuals (how do I aoutomate this?) anova<-anova1[,1:num.of.var] anova.table<-data.frame(anova) #Comment 5--Something that bugs me here is that I have to turn it into a dataframe to #...add the totals row and the delta R^2 (tried playing w/ tkinsert to no avail) #...I miss the stuff that's at the bottom of the anova table (sig values) #Comment 6--I'd love to turn the place value to round to 2 after the decimal. #...I've worked this many ways including changing my options but this does #...not seem to apply to a data frame Total<-c(sum(anova[,1]),sum(anova[,2])," ", " ", " ") anova.table<-rbind(anova.table,Total) R1<-summary(mod1)[[8]][[1]] R2<-summary(mod2)[[8]][[1]] R3<-summary(mod3)[[8]][[1]] R4<-summary(mod4)[[8]][[1]] #Comment 7--SAME AS COMMENT 2. How do I make #...[R] add R5, R6...Rn as necessary? deltaR.1<-R1 deltaR.2<-R2-R1 deltaR.3<-R3-R2 deltaR.4<-R4-R3 #Comment 8--SAME AS COMMENT 7. How would I aoutomate this process? Delta.R.Squared<-c(deltaR.1,deltaR.2,deltaR.3,deltaR.4," ","") #Comment 9--I need a way to add as many deltaR's as #...necessary(n of R = n of predictors) anova.table<-cbind(anova.table, Delta.R.Squared) colnames(anova.table)<-c("df","Sum Sq","Mean Sq","F-change", "P-value","Delta.R.Squared") rownames(anova.table)<-c("X1", "X2 elminating for X1", "X3 eliminating for X2 & X3", "X4 eliminating for X1,X2, & X3","Residuals", "Total") anova.table } #Comment 10--Again I would need [R] to automate the list for row names as we #...add more predictors. #See the final product below I'm after (with two places after the decimal)> anova.tabledf Sum Sq Mean Sq F-change P-value Delta.R.Squared X1 1 817.712952354614 817.712952354614 79.5610275293349 6.112687142581e-10 0.726180005093805 X2 elminating for X1 1 37.5939529324851 37.5939529324851 4.0268283172755 0.0541857201845285 0.0333857704630917 X3 eliminating for X2 & X3 1 9.37092926438942 9.37092926438942 1.00388976918267 0.324951851250774 0.00832196853596723 X4 eliminating for X1,Xx2, & X3 1 16.4674349222492 16.4674349222492 1.81550535203668 0.189048514740917 0.0146241073243205 Residuals 27 244.901918026262 9.07044140838007 <NA> <NA> Total 31 1126.0471875 USING THE ADD1() FUNCTION> NOT WHAT I WANT> mod1<-lm(mpg~cyl, data=mtcars) add1(mod1,~cyl+disp+hp+drat, data=mtcars, test="F") Model: mpg ~ cyl Df Sum of Sq RSS AIC F value Pr(F) <none> 308.33 76.494 disp 1 37.594 270.74 74.334 4.0268 0.05419 . hp 1 16.360 291.98 76.750 1.6249 0.21253 drat 1 15.841 292.49 76.807 1.5706 0.22012 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 [[alternative HTML version deleted]]
Frank Harrell
2011-Apr-07 20:43 UTC
[R] Automated Fixed Order Stepwise Regression Function
This is a statistically invalid procedure. I would not spend much time on it. It is a good idea to first bootstrap an idea or do regular Monte Carlo simulation to see if the statistical properties are OK (confidence interval coverage, bias, type I error, etc.). You will find they are not in this case. Frank Tyler Rinker wrote:> > Greetings, > > I am interested in creating a stepwise fixed order regression function. > There's a function for this already called add1( ). The F statistics are > calculated using type 2 anova (the SS and the F changes don't match > SPSS's). You can see my use of this at the very end of the email. > > What I want: a function to make an anova table with f changes and delt > R^2. > > I ran into 10 snags to making this a fully automated function using the > full linear model (order matters here). Each snag is marked with a > Comment #. Some snags are repeated because I couldn't do the first time > and certainly couldn't do it after that. Help with any/all snags would be > very appreciated. > > I'm a 2 1/2 month [R] user who's reading everything online (incl. manuals) > & ordering every book I can (looking at Dalgaard, Crawly's and Teetor's > very helpful books right now). Loops and their usage is a foreign thing > to me, despite studying them, and unfortunately I think that my function > may call for them. I also realize that beyond 10 predictors may make this > function way to bulky. > > I'm running the latest version of R (2.12.2)on a windows 7 machine. > > DATASET > > mtcars > full.model<-lm(mpg~cyl+disp+hp+drat, data=mtcars) > > CODE > > stepFO<-function(model) > { > m<-data.frame(model.frame(model)) > num.of.var<-length(colnames(m)) > mod1<-lm(m[,1]~m[,2]) > mod2<-lm(m[,1]~m[,2]+m[,3]) > mod3<-lm(m[,1]~m[,2]+m[,3]+m[,4]) > mod4<-lm(m[,1]~m[,2]+m[,3]+m[,4]+m[,5]) > #Comment 1--I don't know how to automated this process(above) of adding > #...additional variables. Probably a loop is needed but I don't > understand > #...how to apply it here. Maybe update.model [1:num.ofvar]? > a1<-anova(mod1) > a2<-anova(mod2) > a3<-anova(mod3) > a4<-anova(mod4) > #Comment 2--SAME AS COMMENT 1 except applied to the anova tables. How do > I make > #...[R] add a5, a6...an as necessary? > > rb<-rbind(a1,a2,a3,a4) > #Comment 3--again I can't automate this to make the addition of a's > automated > > anova1<-rbind(rb[1,],rb[4,],rb[8,],rb[13,],rb[14,]) > #Comment 4--the rb's follow a pattern of 1+3+4+5...+n variables > #then I row bind these starting with 1 and rowbind one more after the last > #...rb to include the bottom piece of the anova table that tells > #...about residuals (how do I aoutomate this?) > > anova<-anova1[,1:num.of.var] > anova.table<-data.frame(anova) > #Comment 5--Something that bugs me here is that I have to turn it into a > dataframe to > #...add the totals row and the delta R^2 (tried playing w/ tkinsert to no > avail) > #...I miss the stuff that's at the bottom of the anova table (sig values) > #Comment 6--I'd love to turn the place value to round to 2 after the > decimal. > #...I've worked this many ways including changing my options but this does > #...not seem to apply to a data frame > > Total<-c(sum(anova[,1]),sum(anova[,2])," ", " ", " ") > anova.table<-rbind(anova.table,Total) > R1<-summary(mod1)[[8]][[1]] > R2<-summary(mod2)[[8]][[1]] > R3<-summary(mod3)[[8]][[1]] > R4<-summary(mod4)[[8]][[1]] > #Comment 7--SAME AS COMMENT 2. How do I make > #...[R] add R5, R6...Rn as necessary? > > deltaR.1<-R1 > deltaR.2<-R2-R1 > deltaR.3<-R3-R2 > deltaR.4<-R4-R3 > #Comment 8--SAME AS COMMENT 7. How would I aoutomate this process? > > Delta.R.Squared<-c(deltaR.1,deltaR.2,deltaR.3,deltaR.4," ","") > #Comment 9--I need a way to add as many deltaR's as > #...necessary(n of R = n of predictors) > > anova.table<-cbind(anova.table, Delta.R.Squared) > colnames(anova.table)<-c("df","Sum Sq","Mean Sq","F-change", > "P-value","Delta.R.Squared") > rownames(anova.table)<-c("X1", "X2 elminating for X1", > "X3 eliminating for X2 & X3", "X4 eliminating for X1,X2, & > X3","Residuals", > "Total") > anova.table > } > #Comment 10--Again I would need [R] to automate the list for row names as > we > #...add more predictors. > #See the final product below I'm after (with two places after the decimal) >> anova.table > df Sum Sq Mean Sq > F-change P-value Delta.R.Squared > X1 1 817.712952354614 817.712952354614 > 79.5610275293349 6.112687142581e-10 0.726180005093805 > X2 elminating for X1 1 37.5939529324851 37.5939529324851 > 4.0268283172755 0.0541857201845285 0.0333857704630917 > X3 eliminating for X2 & X3 1 9.37092926438942 9.37092926438942 > 1.00388976918267 0.324951851250774 0.00832196853596723 > X4 eliminating for X1,Xx2, & X3 1 16.4674349222492 16.4674349222492 > 1.81550535203668 0.189048514740917 0.0146241073243205 > Residuals 27 244.901918026262 9.07044140838007 > Total 31 1126.0471875 > > > USING THE ADD1() FUNCTION> NOT WHAT I WANT> > > mod1<-lm(mpg~cyl, data=mtcars) > add1(mod1,~cyl+disp+hp+drat, data=mtcars, test="F") > > Model: > mpg ~ cyl > Df Sum of Sq RSS AIC F value Pr(F) > 308.33 76.494 > disp 1 37.594 270.74 74.334 4.0268 0.05419 . > hp 1 16.360 291.98 76.750 1.6249 0.21253 > drat 1 15.841 292.49 76.807 1.5706 0.22012 > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > > > [[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Automated-Fixed-Order-Stepwise-Regression-Function-tp3433343p3434522.html Sent from the R help mailing list archive at Nabble.com.
Maybe Matching Threads
- R package for bootstrapping (comparing two quadratic regression models)
- lmer specification for random effects: contradictory reults
- te( ) interactions and AIC model selection with GAM
- strange PREDICTIONS from a PIECEWISE LINEAR (mixed) MODEL
- stepAIC/lme problem (1.7.0 only)