zhuanyi at zay.name
2007-Apr-01 05:54 UTC
[R] Doing partial-f test for stepwise regression
Hello all,

I am trying to figure out an optimal linear model by using stepwise regression, which requires a partial F-test. I did some Googling on the Internet and realised that someone seems to have asked the question before:

Jim Milks <jrclmilks at joimail.com> writes:
> Dear all:
>
> I have a regression model that has collinearity problems (between
> three regressor variables). I need a F-test that will allow me to
> compare between full (with all variables) and partial models (minus
> 1=< variables). The general F-test formula I'm using is:
>
> F = {[SS(full model) - SS(reduced model)] / (#variables taken out)} /
> MSS(full model)
>
> Unfortunately, the ANOVA table parses the SS and MSS between the
> variables and does not give the statistics for the regression model as
> a whole, otherwise I'd do this by hand.
>
> So, really, I have two questions: 1) Can I just add up all the SS and
> MSS for all the variables to get the model SS and MSS and 2) Are
> there any functions or packages I can use to calculate the F-statistic?
>
> Just use anova(model1, model2).
> (One potential catch: Make sure that both models are fitted to the same
> data set. Missing values in predictors may interfere.)

However, in the answer provided by Mr. Peter Dalgaard (use anova(model1, model2)), I could not understand what model1 and model2 are supposed to refer to. Which one is supposed to be the full model and which one the partial model? Or does it not matter?

Thanks in advance for help from anyone!

Regards,
Anyi Zhu
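The "potential catch" quoted above (missing values in predictors) can be illustrated with a small sketch. The data frame `d`, its columns and the seed are made up for illustration and are not from the thread; the idea is only to restrict both fits to the rows that are complete for the larger model before calling anova().

```r
# Hypothetical data with missing values in one predictor
# (all names and values here are illustrative assumptions).
set.seed(7)
d <- data.frame(y = rnorm(25), x1 = rnorm(25), x2 = rnorm(25))
d$x2[c(3, 11)] <- NA

# Keep only the rows that are complete for every variable in the larger model.
d_cc <- d[complete.cases(d[, c("y", "x1", "x2")]), ]

reduced <- lm(y ~ x1, data = d_cc)        # partial model
full    <- lm(y ~ x1 + x2, data = d_cc)   # full model

# Both fits now use the same rows, so the comparison is meaningful.
same_rows <- nobs(reduced) == nobs(full)
print(same_rows)
print(anova(reduced, full))
```

Without the complete-case step, lm() would silently drop the two incomplete rows from the full model only, and the two fits would no longer be based on the same data set.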
And what about to read the help page ?anova ...?

>>>
When given a sequence of objects, 'anova' tests the models against one another in the order specified.
<<<

Generally you almost never fit a full model (including all possible interactions etc) - no one can interpret such complicated models. Anova gives you a comparison between a broader model (the first argument to anova) and its submodel(s).

Petr

zhuanyi at zay.name wrote:
> However, in the answer provided by Mr. Peter Dalgaard,(use
> anova(model1,model2) I could not understand what model1 and model2 are
> supposed to referring to, which one is supposedly to be the full model and
> which one is to be the partial model? Or it does not matter?
>
> Thanks in advance for help from anyone!
>
> Regards,
> Anyi Zhu

--
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
On Apr 1, 2007, at 1:54 AM, zhuanyi at zay.name wrote:
> However, in the answer provided by Mr. Peter Dalgaard,(use
> anova(model1,model2) I could not understand what model1 and model2 are
> supposed to referring to, which one is supposedly to be the full
> model and
> which one is to be the partial model? Or it does not matter?

You can tell which is which by looking at the degrees of freedom.

_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400 Charlottesville, VA 22904-4400
Parcels: Room 102 Gilmer Hall
McCormick Road Charlottesville, VA 22903
Office: B011 +1-434-982-4729
Lab: B019 +1-434-982-4751
Fax: +1-434-982-4766
WWW: http://www.people.virginia.edu/~mk9y/
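A minimal sketch of the degrees-of-freedom check suggested above, assuming two nested lm() fits on the same made-up data (the data frame `d` and the names `model1` and `model2` are illustrative only). The model with the larger residual degrees of freedom estimates fewer coefficients, so it is the reduced one.

```r
# Illustrative data; names and seed are assumptions, not from the thread.
set.seed(42)
d <- data.frame(y = rnorm(30), x1 = rnorm(30), x2 = rnorm(30))

model1 <- lm(y ~ x1 + x2, data = d)   # 3 coefficients
model2 <- lm(y ~ x1, data = d)        # 2 coefficients

# Residual df = number of observations minus number of coefficients.
df.residual(model1)   # 30 - 3 = 27
df.residual(model2)   # 30 - 2 = 28

# The fit with more residual df is the reduced (partial) model.
reduced_is <- if (df.residual(model1) > df.residual(model2)) "model1" else "model2"
print(reduced_is)
```

The same numbers appear in the Res.Df column of the table printed by anova(model1, model2).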
rolf at math.unb.ca
2007-Apr-01 14:06 UTC
[R] Doing partial-f test for stepwise regression
Petr Klasterecky <klaster at karlin.mff.cuni.cz> wrote:

> And what about to read the help page ?anova ...?
>
> >>>
> When given a sequence of objects, 'anova' tests the models against
> one another in the order specified.
> <<<

One perfectly reasonable response to ``what about'' is that it is not *at all* clear as to what the statement in the help page actually means.

> Generally you almost never fit a full model (including all possible
> interactions etc) - no one can interpret such complicated models.

This assertion is certainly open to some dispute.

> Anova gives you a comparison between a broader model (the first
> argument to anova) and its submodel(s).

As I read the above statement, it seems you've got it exactly backwards. Broader model == full model, submodel = model under the null hypothesis, is it not so?

You should actually specify the ``reduced'' model (the model under the null hypothesis) first, and the full model second. E.g.:

    > y <- runif(20)
    > x1 <- runif(20)
    > x2 <- runif(20)
    > x3 <- runif(20)
    > f1 <- lm(y~x1+x2+x3)
    > f2 <- lm(y~x1)
    > anova(f1,f2)
    Analysis of Variance Table

    Model 1: y ~ x1 + x2 + x3
    Model 2: y ~ x1
      Res.Df     RSS Df Sum of Sq      F Pr(>F)
    1     16 0.93225
    2     18 1.07998 -2  -0.14774 1.2678 0.3083

Doing it your way --- full model first --- gives a negative sum of squares. And negative degrees of freedom for the effect being tested.

Not that it really matters --- the anova() function gives you the same F statistic and p-value either way. And the negative SS is a dead giveaway that something is a bit skew-wiff.

cheers,

Rolf Turner
rolf at math.unb.ca
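For anyone who wants to check the arithmetic, here is a rough sketch that fits the same kind of simulated models with the reduced model first, then reproduces the partial F statistic by hand from the residual sums of squares. The seed and the names `reduced`, `full`, `f_by_hand` and `p_by_hand` are assumptions added for repeatability and readability; the numbers will not match the table above, which came from an unseeded run.

```r
# Simulated data in the spirit of the example above; the seed is an assumption.
set.seed(1)
y  <- runif(20)
x1 <- runif(20)
x2 <- runif(20)
x3 <- runif(20)

reduced <- lm(y ~ x1)             # model under the null hypothesis
full    <- lm(y ~ x1 + x2 + x3)   # model with the extra terms

# Reduced model first: Df and Sum of Sq come out positive.
tab <- anova(reduced, full)
print(tab)

# The same partial F by hand. For an lm fit, deviance() is the residual SS.
rss_reduced <- deviance(reduced)
rss_full    <- deviance(full)
df_dropped  <- df.residual(reduced) - df.residual(full)   # terms taken out
f_by_hand   <- ((rss_reduced - rss_full) / df_dropped) /
               (rss_full / df.residual(full))
p_by_hand   <- pf(f_by_hand, df_dropped, df.residual(full), lower.tail = FALSE)

# Full model first: signs of Df and Sum of Sq flip, F and p-value do not.
tab_rev <- anova(full, reduced)
print(c(F_table = tab$F[2], F_by_hand = f_by_hand, F_reversed = tab_rev$F[2]))
```

Written in terms of residual sums of squares, the numerator is RSS(reduced) - RSS(full), divided by the number of terms taken out, and the denominator is the residual mean square of the full model.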