Hi,

I am trying to run ANOVA with an interaction term on 2 factors (treat has 7 levels, group has 2 levels). I found that the coefficient for the last interaction term is always NA; see the attached dataset and the code below:

> test<-read.table("test.txt",sep='\t',header=T,row.names=NULL)
> lm(y~factor(treat)*factor(group),test)

Call:
lm(formula = y ~ factor(treat) * factor(group), data = test)

Coefficients:
                  (Intercept)                 factor(treat)2                 factor(treat)3
                     0.429244                       0.499982                       0.352971
               factor(treat)4                 factor(treat)5                 factor(treat)6
                    -0.204752                       0.142042                       0.044155
               factor(treat)7                 factor(group)2  factor(treat)2:factor(group)2
                    -0.007775                      -0.337907                      -0.208734
factor(treat)3:factor(group)2  factor(treat)4:factor(group)2  factor(treat)5:factor(group)2
                    -0.195138                       0.800029                       0.227514
factor(treat)6:factor(group)2  factor(treat)7:factor(group)2
                     0.331548                             NA

I guess this is due to the model matrix being singular, or to collinearity among the matrix columns? But I can't figure out how the matrix is singular in this case. Can someone show me why this is the case?

Thanks

John

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20111107/430e48a8/attachment.txt>

On Nov 7, 2011, at 7:33 PM, array chip wrote:

> [...]
>
> I guess this is due to model matrix being singular or collinearity
> among the matrix columns? But I can't figure out how the matrix is
> singular in this case? Can someone show me why this is the case?

Because you have no cases in one of the crossed categories.

--
David Winsemius, MD
West Hartford, CT
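To make the point about the empty cell concrete, here is a minimal sketch in R. It assumes only the cell counts from the with(test,table(treat,group)) output quoted later in this thread; the response y is simulated because test.txt is not reproduced here, and the names counts, cells, sim and X are made up for illustration. With one empty cell there are 13 observed cells but 14 columns in the model matrix, and the group2 column is exactly the sum of the six interaction columns, so lm() marks the last column in that dependency as NA.

```r
# Cell counts as reported in the thread (treat 1 / group 2 is empty).
counts <- c(8, 1, 5, 7, 7, 3, 8,   # group 1, treat 1..7
            0, 5, 5, 3, 4, 3, 2)   # group 2, treat 1..7

# One row per observation; expand.grid varies treat fastest, matching counts.
cells <- expand.grid(treat = 1:7, group = 1:2)
sim <- cells[rep(seq_len(nrow(cells)), counts), ]
sim$treat <- factor(sim$treat)
sim$group <- factor(sim$group)

# Model matrix for main effects plus interaction (treatment contrasts).
X <- model.matrix(~ treat * group, sim)
n_cols <- ncol(X)       # number of coefficients requested
rank_X <- qr(X)$rank    # number that can actually be estimated

# Every group-2 observation has treat in 2..7, so the group2 indicator
# equals the sum of the interaction indicators.
inter <- grep(":", colnames(X), value = TRUE)
dependent <- all(X[, "group2"] == rowSums(X[, inter]))

# Simulated response (an assumption, not the thread's data).
set.seed(1)
sim$y <- rnorm(nrow(sim))
fit <- lm(y ~ treat * group, sim)
aliased <- names(coef(fit))[is.na(coef(fit))]

print(c(columns = n_cols, rank = rank_X))
print(dependent)
print(aliased)
```

Because lm() works through the columns in order, the column that gets dropped is the last one taking part in the dependency (here treat7:group2), not one labelled with the empty cell. Calling alias(fit) on such a fit should report the same dependency.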

Hi Dennis,

The cell mean mu_12 from the model involves the intercept and the factor(group)2 coefficient:

Coefficients:
                  (Intercept)                 factor(treat)2                 factor(treat)3
                     0.429244                       0.499982                       0.352971
               factor(treat)4                 factor(treat)5                 factor(treat)6
                    -0.204752                       0.142042                       0.044155
               factor(treat)7                 factor(group)2  factor(treat)2:factor(group)2
                    -0.007775                      -0.337907                      -0.208734
factor(treat)3:factor(group)2  factor(treat)4:factor(group)2  factor(treat)5:factor(group)2
                    -0.195138                       0.800029                       0.227514
factor(treat)6:factor(group)2  factor(treat)7:factor(group)2
                     0.331548                             NA

So mu_12 = 0.429244 - 0.337907 = 0.091337. This can be verified by:

> predict(fit,data.frame(list(treat=1,group=2)))
         1
0.09133691
Warning message:
In predict.lm(fit, data.frame(list(treat = 1, group = 2))) :
  prediction from a rank-deficient fit may be misleading

But as you can see, it gave a warning about a rank-deficient fit. Why is this a rank-deficient fit? "treat 1_group 2" has no cases, so why is it still estimable while, on the contrary, "treat 7_group 2", which has 2 cases, is not?

Thanks

John

________________________________
From: Dennis Murphy <djmuser@gmail.com>
Sent: Monday, November 7, 2011 9:29 PM
Subject: Re: [R] why NA coefficients

Hi John:

What is the estimate of the cell mean \mu_{12}? Which model effects involve that cell mean? With this data arrangement, the expected population marginal means of treatment 1 and group 2 are not estimable either, unless you're willing to assume a no-interaction model.

Chapters 13 and 14 of Milliken and Johnson's Analysis of Messy Data (vol. 1) cover this topic in some detail, but they assume you're familiar with the matrix form of a linear statistical model. Both chapters cover the two-way model with interaction: Ch. 13 from the cell means model approach and Ch. 14 from the model effects approach. Because this was written in the mid 80s and republished in the early 90s, all the code used is in SAS.

HTH,
Dennis

> Thanks David.
> The only category that has no cases is "treat 1-group 2":
>
> > with(test,table(treat,group))
>      group
> treat 1 2
>     1 8 0
>     2 1 5
>     3 5 5
>     4 7 3
>     5 7 4
>     6 3 3
>     7 8 2
>
> But why is the coefficient for "treat 7-group 2" not estimable?
>
> Thanks
>
> John
>
> ________________________________
> From: David Winsemius <dwinsemius@comcast.net>
> Cc: "r-help@r-project.org" <r-help@r-project.org>
> Sent: Monday, November 7, 2011 5:13 PM
> Subject: Re: [R] why NA coefficients
>
> [...]
>
> Because you have no cases in one of the crossed categories.
>
> --David Winsemius, MD
> West Hartford, CT
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
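
On why the empty "treat 1-group 2" cell still gets a number from predict() while the "treat 7-group 2" coefficient is NA, the following sketch may help. It rests on the same assumptions as any reconstruction without test.txt: the cell counts are the ones tabulated in this thread, y is simulated, and the names sim, m and p12 are illustrative only. Once lm() has dropped the treat7:group2 column (in effect fixing that coefficient at zero), the group2 coefficient is no longer a general group effect; it is the group 2 minus group 1 difference within treat 7. The value predicted for the empty cell is then pieced together from three other cells, which is what the rank-deficiency warning is about.

```r
# Simulated data with the thread's cell counts (treat 1 / group 2 is empty).
counts <- c(8, 1, 5, 7, 7, 3, 8,   # group 1, treat 1..7
            0, 5, 5, 3, 4, 3, 2)   # group 2, treat 1..7
cells <- expand.grid(treat = 1:7, group = 1:2)
sim <- cells[rep(seq_len(nrow(cells)), counts), ]
sim$treat <- factor(sim$treat)
sim$group <- factor(sim$group)
set.seed(1)
sim$y <- rnorm(nrow(sim))          # assumption: stand-in for the real y

fit <- lm(y ~ treat * group, sim)
b <- coef(fit)

# Observed cell means; the empty cell comes back as NA.
m <- with(sim, tapply(y, list(treat, group), mean))

# With treat7:group2 dropped, group2 is the group difference within treat 7.
d7 <- m["7", "2"] - m["7", "1"]
same_as_diff <- all.equal(unname(b["group2"]), d7)

# Prediction for the empty cell: intercept + group2, i.e. the treat 1 /
# group 1 mean shifted by the treat 7 group difference.
new_cell <- data.frame(treat = factor(1, levels = 1:7),
                       group = factor(2, levels = 1:2))
p12 <- as.numeric(suppressWarnings(predict(fit, new_cell)))
same_as_combo <- all.equal(p12, m["1", "1"] + d7)

print(m)
print(same_as_diff)
print(same_as_combo)
```

So the number reported for mu_12 depends on which column happened to be dropped rather than on any data in that cell. The 13 cells that do have observations are all fitted exactly by their own means, including "treat 7-group 2"; only the labelling of the coefficients hides that.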