> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of David Firth
> Sent: Tuesday, March 16, 2004 1:12 PM
> To: Paul Johnson
> Cc: r-help at r-project.org
> Subject: Re: [R] glm questions
>
> Dear Paul
>
> Here are some attempts at your questions. I hope it's of some help.
>
> On Tuesday, Mar 16, 2004, at 06:00 Europe/London, Paul Johnson wrote:
>
> > Greetings, everybody. Can I ask some glm questions?
> >
> > 1. How do you find out -2*lnL(saturated model)?
> >
> > In the output from glm, I find:
> >
> > Null deviance: which I think is -2[lnL(null) - lnL(saturated)]
> > Residual deviance: -2[lnL(fitted) - lnL(saturated)]
> >
> > The Null model is the one that includes the constant only (plus
> > offset if specified). Right?
> >
> > I can use the Null and Residual deviance to calculate the "usual
> > model Chi-squared" statistic
> > -2[lnL(null) - lnL(fitted)].
> >
> > But, just for curiosity's sake, what's the saturated model's -2lnL?
>
> It's important to remember that lnL is defined only up to an additive
> constant. For example a Poisson model has lnL contributions -mu +
> y*log(mu) + constant, and the constant is arbitrary. The differencing
> in the deviance calculation eliminates it. What constant would you
> like to use??

I have always been under the impression that the constant chosen by glm
is that which makes the deviance of the saturated model 0, the saturated
model being the one with one parameter per observation in the dataset.
For example:

> y <- sample( 0:10, 15, replace=T )
> A <- factor( rep( 1:5, 3 ) )
> B <- factor( rep( 1:3, each=5 ) )
> data.frame( y, A, B )
    y A B
1   1 1 1
2   4 2 1
3   3 3 1
4   7 4 1
5   1 5 1
6   0 1 2
7   5 2 2
8   8 3 2
9   4 4 2
10  2 5 2
11  6 1 3
12 10 2 3
13  6 3 3
14  0 4 3
15  1 5 3
> glm( y ~ A + B, family=poisson )

Call:  glm(formula = y ~ A + B, family = poisson)

Coefficients:
(Intercept)           A2           A3           A4           A5           B2
     0.6581       0.9985       0.8873       0.4520      -0.5596       0.1719
         B3
     0.3629

Degrees of Freedom: 14 Total (i.e. Null);  8 Residual
Null Deviance:      40.33
Residual Deviance: 24.07        AIC: 78.9

> glm( y ~ A * B, family=poisson )

Call:  glm(formula = y ~ A * B, family = poisson)

Coefficients:
(Intercept)           A2           A3           A4           A5           B2
  2.535e-15    1.386e+00    1.099e+00    1.946e+00   -1.293e-14   -2.330e+01
         B3        A2:B2        A3:B2        A4:B2        A5:B2        A2:B3
  1.792e+00    2.353e+01    2.428e+01    2.274e+01    2.400e+01   -8.755e-01
      A3:B3        A4:B3        A5:B3
 -1.099e+00   -2.704e+01   -1.792e+00

Degrees of Freedom: 14 Total (i.e. Null);  0 Residual
Null Deviance:      40.33
Residual Deviance: 3.033e-10    AIC: 70.84

----------------------
Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2
DK-2820 Gentofte
Denmark
tel: +45 44 43 87 38
mob: +45 30 75 87 38
fax: +45 44 43 07 06
bxc at steno.dk
www.biostat.ku.dk/~bxc
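
As a possible answer to the original question, here is a minimal sketch of how -2*lnL(saturated) could be computed for this Poisson example, assuming one adopts the same convention as R's logLik() for Poisson fits, namely the full probability mass function including the -log(y!) term. The data are typed in from the printout above because the original call used sample(); the names m2lnL_sat and m2lnL_fit are made up for illustration.

```r
# Data copied from the data.frame printed in the example above.
y <- c(1, 4, 3, 7, 1, 0, 5, 8, 4, 2, 6, 10, 6, 0, 1)
A <- factor(rep(1:5, 3))
B <- factor(rep(1:3, each = 5))

fit <- glm(y ~ A + B, family = poisson)

# Saturated Poisson model: each fitted mean equals its observation (mu_i = y_i).
# dpois() evaluates the full pmf, so the -log(y!) term is kept.
m2lnL_sat <- -2 * sum(dpois(y, lambda = y, log = TRUE))

# -2 lnL of the fitted model under the same convention.
m2lnL_fit <- -2 * as.numeric(logLik(fit))

# Residual deviance = -2[lnL(fitted) - lnL(saturated)]
print(m2lnL_sat)
print(m2lnL_fit - m2lnL_sat)
print(deviance(fit))
```

Under this convention m2lnL_sat should come out near 40.84, which agrees with AIC minus twice the 15 parameters (70.84 - 30) in the A * B fit above, and the last two printed values should both match the residual deviance of 24.07.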
On Tuesday, Mar 16, 2004, at 14:51 Europe/London, BXC (Bendix Carstensen) wrote:

>>> 1. How do you find out -2*lnL(saturated model)?
>>>
>>> In the output from glm, I find:
>>>
>>> Null deviance: which I think is -2[lnL(null) - lnL(saturated)]
>>> Residual deviance: -2[lnL(fitted) - lnL(saturated)]
>>>
>>> But, just for curiosity's sake, what's the saturated model's -2lnL?
>>
>> It's important to remember that lnL is defined only up to an additive
>> constant. For example a Poisson model has lnL contributions -mu +
>> y*log(mu) + constant, and the constant is arbitrary. The differencing
>> in the deviance calculation eliminates it. What constant would you
>> like to use??
>
> I have always been under the impression that the constant chosen by
> glm is that which makes the deviance of the saturated model 0, the
> saturated model being the one with one parameter per observation in
> the dataset.
> ...

But a look at the deviance formula above, -2[lnL(fitted) -
lnL(saturated)], shows us that *any* constant can be added to lnL, and
the deviance for the saturated model will still be zero.

David
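
The cancellation described above can be checked numerically. The following is only a sketch: lnL_with_const() and dev_with_const() are hypothetical helpers written for this illustration, not part of glm, and the data are typed in from the printout earlier in the thread. Whatever constant is added to lnL, the same residual deviance comes back.

```r
# Data copied from the example earlier in the thread.
y <- c(1, 4, 3, 7, 1, 0, 5, 8, 4, 2, 6, 10, 6, 0, 1)
A <- factor(rep(1:5, 3))
B <- factor(rep(1:3, each = 5))

fit <- glm(y ~ A + B, family = poisson)
mu  <- fitted(fit)

# Poisson log-likelihood, -mu + y*log(mu), plus an arbitrary additive constant.
# (Hypothetical helper. y*log(mu) is taken as 0 when y == 0, which also
# covers the saturated case mu == y == 0.)
lnL_with_const <- function(mu, y, const = 0) {
  sum(ifelse(y == 0, 0, y * log(mu)) - mu) + const
}

# Deviance = -2[lnL(fitted) - lnL(saturated)], same constant in both terms.
dev_with_const <- function(const) {
  -2 * (lnL_with_const(mu, y, const) - lnL_with_const(y, y, const))
}

# Three different choices of constant, including the "proper" -sum(log(y!)).
devs <- sapply(c(0, -sum(lfactorial(y)), 123.4), dev_with_const)
print(devs)
print(deviance(fit))
```

All three entries of devs should agree with deviance(fit) up to rounding error, because the constant appears in both lnL terms and drops out of the difference.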
"BXC (Bendix Carstensen)" <bxc at steno.dk> writes:

> > It's important to remember that lnL is defined only up to an additive
> > constant. For example a Poisson model has lnL contributions -mu +
> > y*log(mu) + constant, and the constant is arbitrary. The differencing
> > in the deviance calculation eliminates it. What constant would you
> > like to use??
>
> I have always been under the impression that the constant chosen by glm
> is that which makes the deviance of the saturated model 0, the saturated
> model being the one with one parameter per observation in the dataset.

As David pointed out, the deviance of a saturated model is zero by
definition. However, there's nothing arbitrary about the constant in a
likelihood either, since it is supposed to be a density if seen as a
function of y (well, if you *really* want to quibble, it's a density
with respect to an arbitrary measure, so you could get an arbitrary
constant in if you insist, I suppose).

The point is that the constant is *uninformative* since it depends on y
only, not mu, and hence that people tend to throw some bits of the
likelihood away, and not always the same bits.

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
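
To illustrate the point that the constant depends on y only, here is a rough sketch comparing the full Poisson log density (via dpois) with the "kernel" -mu + y*log(mu) that keeps only the terms involving mu. The helper names full_lnL() and kernel_lnL() are invented for this example, and the data are again typed in from the printout earlier in the thread.

```r
# Data copied from the example earlier in the thread.
y <- c(1, 4, 3, 7, 1, 0, 5, 8, 4, 2, 6, 10, 6, 0, 1)

# Full log-likelihood: the Poisson log pmf, including -log(y!).
full_lnL <- function(mu) sum(dpois(y, lambda = mu, log = TRUE))

# Kernel: the same expression with the -log(y!) term thrown away.
kernel_lnL <- function(mu) sum(y * log(mu) - mu)

# Evaluate both at several common means mu; the gap should not change.
mus   <- c(0.5, 2, 3.9, 10)
diffs <- sapply(mus, function(m) full_lnL(m) - kernel_lnL(m))

print(diffs)
print(-sum(lfactorial(y)))   # the discarded bit: a function of y alone
```

Every entry of diffs should equal -sum(log(y!)) whatever value of mu is used, which is why dropping that term changes neither the maximum likelihood estimate nor any deviance, only the reported value of lnL itself.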