Bjørn-Helge Mevik
2003-Aug-11 13:30 UTC
[R] Marginal (type II) SS for powers of continuous variables in a linear model?
I've used Anova() from the car package to get marginal (aka type II) sum-of-squares and tests for linear models with categorical variables. Is it possible to get marginal SSs also for continuous variables, when the model includes powers of the continuous variables?

For instance, if A and B are categorical ("factor"s) and x is continuous ("numeric"),

    Anova (lm (y ~ A*B + x, ...))

will produce marginal SSs for all terms (A, B, A:B and x). However, with

    Anova (lm (y ~ A*B + x + I(x^2), ...))

the SS for 'x' is calculated with I(x^2) present in the model, i.e. it is no longer marginal.

Using poly (x, 2) instead of x + I(x^2), one gets a marginal SS for the total effect of x, but not for the linear and quadratic effects separately. (summary.aov() has a 'split' argument that can be used to get separate SSs, but these are not marginal.)

-- 
Bjørn-Helge Mevik
Spencer Graves
2003-Aug-11 14:24 UTC
[R] Marginal (type II) SS for powers of continuous variables in a linear model?
I'm confused. Consider the following example:

> Df <- data.frame(x=1:9, y=rep(c(-1,1), length=9))
> anova(lm(y~x, Df))
Analysis of Variance Table

Response: y
          Df    Sum Sq   Mean Sq   F value Pr(>F)
x          1 2.861e-34 2.861e-34 2.253e-34      1
Residuals  7    8.8889    1.2698
> anova(lm(y~x+I(x^2), Df))
Analysis of Variance Table

Response: y
          Df    Sum Sq   Mean Sq   F value Pr(>F)
x          1 2.861e-34 2.861e-34 2.065e-34 1.0000
I(x^2)     1    0.5772    0.5772    0.4167 0.5425
Residuals  6    8.3117    1.3853
> anova(lm(y~I(x^2)+x, Df))
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
I(x^2)     1 0.0282  0.0282  0.0203 0.8912
x          1 0.5490  0.5490  0.3963 0.5522
Residuals  6 8.3117  1.3853
>

In S-Plus 6.1, the ANOVA table is preceded by the statement "Terms added sequentially (first to last)". From these examples, it certainly looks as though that is what anova() is doing here as well. Apart from round-off error, the sum of squares and mean square for x are identical in the models without and with I(x^2). In an example with a nonzero sum of squares for x, the F value would differ between the two models, because the residual mean square is different, and Pr(>F) would also be affected by the differing residual degrees of freedom.

The third example puts I(x^2) before x in the model formula and gets a clearly different anova table. (The coefficients should not change when the order of the terms is modified, though they could change if other terms are added. I didn't check that for this example, but I've done this before and would be surprised if they were different.)
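The two claims above (coefficients do not depend on term order, while sequential sums of squares do) can be checked directly. The following is a minimal sketch using the same Df as in the example; the object names fit_x_first and fit_x2_first are only illustrative. It also uses drop1() from base R, which reports the sum of squares for each term given all the other terms, so it should agree with whichever term anova() entered last.

```r
# Same data as in the example above (length.out spelled out in full)
Df <- data.frame(x = 1:9, y = rep(c(-1, 1), length.out = 9))

# Two fits that differ only in the order of the terms
fit_x_first  <- lm(y ~ x + I(x^2), data = Df)
fit_x2_first <- lm(y ~ I(x^2) + x, data = Df)

# Coefficients, matched by name, should agree up to rounding
b1 <- coef(fit_x_first)
b2 <- coef(fit_x2_first)[names(b1)]
print(all.equal(b1, b2))

# Sequential (type I) tables: the SS for a term depends on what precedes it
a1 <- anova(fit_x_first)
a2 <- anova(fit_x2_first)
print(c(x_entered_first = a1["x", "Sum Sq"],
        x_entered_last  = a2["x", "Sum Sq"]))

# drop1() gives the SS for each term adjusted for all the others;
# this matches the term that anova() added last in each ordering
d1 <- drop1(fit_x_first, test = "F")
print(d1)
```

Note that the drop1() value for x is the SS with I(x^2) already in the model, which is exactly the quantity the original question describes as "no longer marginal".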
Best Wishes,
Spencer

Bjørn-Helge Mevik wrote:
> I've used Anova() from the car package to get marginal (aka type II)
> sum-of-squares and tests for linear models with categorical
> variables. Is it possible to get marginal SSs also for continuous
> variables, when the model includes powers of the continuous variables?
>
> For instance, if A and B are categorical ("factor"s) and x is
> continuous ("numeric"),
>
> Anova (lm (y ~ A*B + x, ...))
>
> will produce marginal SSs for all terms (A, B, A:B and x). However,
> with
>
> Anova (lm (y ~ A*B + x + I(x^2), ...))
>
> the SS for 'x' is calculated with I(x^2) present in the model, i.e. it
> is no longer marginal.
>
> Using poly (x, 2) instead of x + I(x^2), one gets a marginal SS for
> the total effect of x, but not for the linear and quadratic effects
> separately. (summary.aov() has a 'split' argument that can be used to
> get separate SSs, but these are not marginal.)
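
One way to read the quoted question is that the linear term x should be adjusted for A, B and A:B but not for I(x^2), while I(x^2) should be adjusted for everything including x. Under that reading (an assumption, not something confirmed in this thread), the two sums of squares can be obtained as differences in residual sums of squares between nested fits. The sketch below uses simulated data, since the thread gives no data set with A, B and x; the data frame d and the model names m0, m1, m2 are hypothetical.

```r
# Simulated data for illustration only; not from the thread
set.seed(1)
d <- expand.grid(A = factor(1:2), B = factor(1:3), rep = 1:5)
d$x <- runif(nrow(d), 0, 10)
d$y <- 0.5 * d$x + rnorm(nrow(d))

# Nested sequence: no x terms, linear term added, quadratic term added
m0 <- lm(y ~ A * B, data = d)
m1 <- lm(y ~ A * B + x, data = d)
m2 <- lm(y ~ A * B + x + I(x^2), data = d)

# deviance() on an lm fit is the residual sum of squares
ss_x  <- deviance(m0) - deviance(m1)  # x given A, B, A:B (not I(x^2))
ss_x2 <- deviance(m1) - deviance(m2)  # I(x^2) given A, B, A:B and x

# anova() on the sequence reports the same differences; by default its
# F tests use the residual mean square of the largest model (m2)
tab <- anova(m0, m1, m2)
print(tab)
print(c(x = ss_x, x2 = ss_x2))
```

This is the same sequential logic as in the reply above, applied only to the powers of x, with the factor terms held fixed in every model.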