John Sorkin
2008-Dec-01 23:00 UTC
[R] Comparing output from linear regression to output from quasipoisson to determine the model that fits best.
R 2.7
Windows XP

I have two models that have been run using exactly the same data, both fit using glm(). One model is a linear regression (gaussian(link = "identity")), the other a quasipoisson(link = "log"). I have log likelihoods from each model. Is there any way I can determine which model is a better fit to the data? anova() does not appear to work, as the models have the same residual degrees of freedom:

fit1<-glm(PHYSFUNC~HIV,data=KA)
summary(fit1)

fitQP<-glm(PHYSFUNC~HIV,data=KA,family=quasipoisson)
summary(fitQP)

anova(fit1,fitQP)

Program OUTPUT:

> fit1<-glm(PHYSFUNC~HIV,data=KA)
> summary(fit1)

Call:
glm(formula = PHYSFUNC ~ HIV, data = KA)

Deviance Residuals:
   Min      1Q  Median      3Q     Max
-4.197  -4.192  -2.192   2.808  19.808

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.19670    0.08508   49.33   <2e-16 ***
HIV         -0.00487    0.12071   -0.04    0.968
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 22.78134)

    Null deviance: 142429  on 6253  degrees of freedom
Residual deviance: 142429  on 6252  degrees of freedom
  (213 observations deleted due to missingness)
AIC: 37302

Number of Fisher Scoring iterations: 2

> fitQP<-glm(PHYSFUNC~HIV,data=KA,family=quasipoisson)
> summary(fitQP)

Call:
glm(formula = PHYSFUNC ~ HIV, family = quasipoisson, data = KA)

Deviance Residuals:
   Min      1Q  Median      3Q     Max
-2.897  -2.895  -1.193   1.250   6.644

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.434297   0.020280   70.72   <2e-16 ***
HIV         -0.001161   0.028780   -0.04    0.968
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 5.432011)

    Null deviance: 35439  on 6253  degrees of freedom
Residual deviance: 35439  on 6252  degrees of freedom
  (213 observations deleted due to missingness)
AIC: NA

Number of Fisher Scoring iterations: 5

> anova(fit1,fitQP)
Analysis of Deviance Table

Model 1: PHYSFUNC ~ HIV
Model 2: PHYSFUNC ~ HIV
  Resid. Df Resid. Dev Df Deviance
1      6252     142429
2      6252      35439  0   106989

Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Uwe Ligges
2008-Dec-02 08:58 UTC
[R] Comparing output from linear regression to output from quasipoisson to determine the model that fits best.
John Sorkin wrote:
> R 2.7
> Windows XP
>
> I have two models that have been run using exactly the same data, both fit
> using glm(). One model is a linear regression (gaussian(link = "identity")),
> the other a quasipoisson(link = "log"). I have log likelihoods from each
> model. Is there any way I can determine which model is a better fit to the
> data? anova() does not appear to work, as the models have the same residual
> degrees of freedom:

Since the classes of the two models are quite different, I'd proceed by looking carefully at the residuals.

Uwe Ligges
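One way to act on the suggestion of looking at the residuals is sketched below. The KA data are not available here, so KA_sim is a simulated, hypothetical stand-in (overdispersed counts with a 0/1 HIV indicator), and the choice of plots is only an assumption about what "looking carefully" might involve, not a prescribed procedure. The sketch also relies on the fact that quasi families define no full likelihood, which is why the posted summary shows "AIC: NA" and why a likelihood-based comparison of the two fits is not available.

```r
# Hypothetical stand-in for the KA data frame from the thread (not the real data).
set.seed(1)
n <- 2000
KA_sim <- data.frame(HIV = rbinom(n, 1, 0.5))
# Overdispersed counts with mean about 4.2 (negative binomial, assumed for illustration)
KA_sim$PHYSFUNC <- rnbinom(n, mu = 4.2, size = 1)

# Same two models as in the thread, fit to the simulated data
fit1  <- glm(PHYSFUNC ~ HIV, data = KA_sim)                         # gaussian, identity link
fitQP <- glm(PHYSFUNC ~ HIV, data = KA_sim, family = quasipoisson)  # log link

# Pearson residuals divide the raw residual by sqrt(V(mu)) for each family:
# V(mu) = 1 for gaussian, V(mu) = mu for quasipoisson
r1  <- residuals(fit1,  type = "pearson")
rQP <- residuals(fitQP, type = "pearson")

# Scale by the estimated dispersion so both sets are on a comparable footing
s1  <- r1  / sqrt(summary(fit1)$dispersion)
sQP <- rQP / sqrt(summary(fitQP)$dispersion)

# Side-by-side diagnostics: spread by group and normal Q-Q plots
op <- par(mfrow = c(2, 2))
boxplot(s1 ~ KA_sim$HIV,  main = "gaussian: scaled Pearson residuals",
        xlab = "HIV", ylab = "residual")
boxplot(sQP ~ KA_sim$HIV, main = "quasipoisson: scaled Pearson residuals",
        xlab = "HIV", ylab = "residual")
qqnorm(s1,  main = "gaussian: normal Q-Q");     qqline(s1)
qqnorm(sQP, main = "quasipoisson: normal Q-Q"); qqline(sQP)
par(op)

# With a single two-level predictor, both models reproduce the group means,
# so the fitted values agree and only the assumed variance differs.
print(all.equal(unname(fitted(fit1)), unname(fitted(fitQP)), tolerance = 1e-5))
```

If HIV is a two-level indicator, as the single coefficient in the posted output suggests, the two fits give the same fitted means: exp(1.434297) is about 4.197, matching the gaussian intercept of 4.19670. In that case the residual plots differ only through the variance each family assumes, and with the two group means nearly equal (the HIV coefficient is close to zero) the data may say little about which variance function is more appropriate.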