How does R estimate the intercept term \alpha in a loglinear model with Poisson model and log link for a contingency table of counts? (E.g., for a 2-by-2 table {n_{ij}) with \log(\mu) = \alpha + \beta_{i} + \gamma_{j}) I fitted such a model and checked the calculations by hand. I agreed with the main effect terms but not the intercept. Interestingly, I agreed with the fitted value provided by R for the first cell {11} in the table. If my estimate of intercept = \hat{\alpha}, my estimate of the fitted value for the first cell = exp(\hat{\alpha}) but R seems to be doing something else for the estimate of the intercept. However if I check the R $fitted_value for n_{11} it agrees with my exp(\hat{\alpha}). I would expect that with the corner-point parametrization, the estimates for a 2 x 2 table would correspond to expected frequencies exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha + \beta + \gamma). The MLE of \alpha appears to be log(n_{.1} * n_{1.}/n_{..}), but this is not equal to the intercept given by R in the example I tried. With thanks in anticipation, Colin Aitken -- Professor Colin Aitken, Professor of Forensic Statistics, School of Mathematics, King?s Buildings, University of Edinburgh, Mayfield Road, Edinburgh, EH9 3JZ. Tel: 0131 650 4877 E-mail: c.g.g.aitken at ed.ac.uk Fax : 0131 650 6553 http://www.maths.ed.ac.uk/~cgga The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
On Nov 7, 2011, at 12:59 PM, Colin Aitken wrote:> How does R estimate the intercept term \alpha in a loglinear > model with Poisson model and log link for a contingency table of > counts? > > (E.g., for a 2-by-2 table {n_{ij}) with \log(\mu) = \alpha + > \beta_{i} + \gamma_{j}) > > I fitted such a model and checked the calculations by hand. I > agreed with the main effect terms but not the intercept. > Interestingly, I agreed with the fitted value provided by R for the > first cell {11} in the table. > > If my estimate of intercept = \hat{\alpha}, my estimate of the > fitted value for the first cell = exp(\hat{\alpha}) but R seems to > be doing something else for the estimate of the intercept. > > However if I check the R $fitted_value for n_{11} it agrees with my > exp(\hat{\alpha}). > > I would expect that with the corner-point parametrization, the > estimates for a 2 x 2 table would correspond to expected frequencies > exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha + > \beta + \gamma). The MLE of \alpha appears to be log(n_{.1} * n_{1.}/ > n_{..}), but this is not equal to the intercept given by R in the > example I tried. > > With thanks in anticipation, > > Colin Aitken > > > -- > Professor Colin Aitken, > Professor of Forensic Statistics,Do you suppose you could provide a data-corpse for us to dissect? Noting the tag line for every posting ....> and provide commented, minimal, self-contained, reproducible code.-- David Winsemius, MD West Hartford, CT
On Nov 07, 2011 at 7:59pm Colin Aitken wrote:> How does R estimate the intercept term \alpha in a loglinear > model with Poisson model and log link for a contingency table of counts?Colin, If you fitted this using a GLM then the default in R is to use so-called treatment contrasts (i.e. Dunnett contrasts). See ?contr.treatment. Take the first example on the ?glm help page ## Dobson (1990) Page 93: Randomized Controlled Trial : counts <- c(18,17,15,20,10,20,25,13,12) outcome <- gl(3,1,9) treatment <- gl(3,3) print(d.AD <- data.frame(treatment, outcome, counts)) glm.D93 <- glm(counts ~ outcome + treatment, family=poisson()) anova(glm.D93) summary(glm.D93) < snip > Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 3.045e+00 1.709e-01 17.815 <2e-16 *** outcome2 -4.543e-01 2.022e-01 -2.247 0.0246 * outcome3 -2.930e-01 1.927e-01 -1.520 0.1285 treatment2 1.338e-15 2.000e-01 0.000 1.0000 treatment3 1.421e-15 2.000e-01 0.000 1.0000 < snip >> levels(outcome)[1] "1" "2" "3"> levels(treatment)[1] "1" "2" "3" So here the intercept represents the estimated counts at the first level of "outcome" (i.e. outcome = 1) and the first level of "treatment" (i.e. treatment = 1).> predict(glm.D93, newdata=data.frame(outcome="1", treatment="1"))1 3.044522 Regards, Mark. ----- Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4012346.html Sent from the R help mailing list archive at Nabble.com.
On Nov 07, 2011 at 9:04pm Mark Difford wrote:> So here the intercept represents the estimated counts...Perhaps I should have added (though surely unnecessary in your case) that exponentiation gives the predicted/estimated counts, viz 21 (compared to 18 for the saturated model). ##> exp(3.044522)[1] 20.99999 Regards, Mark. ----- Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4012723.html Sent from the R help mailing list archive at Nabble.com.
On 08/11/11 07:11, David Winsemius wrote: (in response to>> Professor Colin Aitken, >> Professor of Forensic Statistics,!!!) <SNIP>> > Do you suppose you could provide a data-corpse for us to dissect?Fortune nomination!!! cheers, Rolf Turner
Sorry about that. However I have solved the problem by declaring the explanatory variables as factors. An unresolved problem is: what does R do when the explanatory factors are not defined as factors when it obtains a different value for the intercept but the correct value for the fitted value? A description of the data and the R code and output is attached for anyone interested. Best wishes, Colin Aitken ------------------- David Winsemius wrote:> > On Nov 7, 2011, at 12:59 PM, Colin Aitken wrote: > >> How does R estimate the intercept term \alpha in a loglinear >> model with Poisson model and log link for a contingency table of counts? >> >> (E.g., for a 2-by-2 table {n_{ij}) with \log(\mu) = \alpha + \beta_{i} >> + \gamma_{j}) >> >> I fitted such a model and checked the calculations by hand. I agreed >> with the main effect terms but not the intercept. Interestingly, I >> agreed with the fitted value provided by R for the first cell {11} in >> the table. >> >> If my estimate of intercept = \hat{\alpha}, my estimate of the fitted >> value for the first cell = exp(\hat{\alpha}) but R seems to be doing >> something else for the estimate of the intercept. >> >> However if I check the R $fitted_value for n_{11} it agrees with my >> exp(\hat{\alpha}). >> >> I would expect that with the corner-point parametrization, the >> estimates for a 2 x 2 table would correspond to expected frequencies >> exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha + >> \beta + \gamma). The MLE of \alpha appears to be log(n_{.1} * >> n_{1.}/n_{..}), but this is not equal to the intercept given by R in >> the example I tried. >> >> With thanks in anticipation, >> >> Colin Aitken >> >> >> -- >> Professor Colin Aitken, >> Professor of Forensic Statistics, > > Do you suppose you could provide a data-corpse for us to dissect? > > Noting the tag line for every posting .... >> and provide commented, minimal, self-contained, reproducible code. >-- Professor Colin Aitken, Professor of Forensic Statistics, School of Mathematics, King?s Buildings, University of Edinburgh, Mayfield Road, Edinburgh, EH9 3JZ. Tel: 0131 650 4877 E-mail: c.g.g.aitken at ed.ac.uk Fax : 0131 650 6553 http://www.maths.ed.ac.uk/~cgga The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.