Jerome Myers
2012-Jan-18 15:27 UTC
[R] confint function in MASS package for logistic regression analysis
I have the following binary data set: Sex Response 0 1 0 159 162 1 4 37 My commands library(MASS) sib.glm=glm(sib~sex,family=binomial,data=sib.data) summary(sib.glm) The coefficients in the output are Estimate Std. Error z value Pr(>|z|) (Intercept) -3.6826 0.5062 -7.274 3.48e-13 *** sex 2.2059 0.5380 4.100 4.13e-05 *** I have calculated the .95 confidencce interval for sex two ways: (1) confint(sib.glm) The result is 2.5 % 97.5 % (Intercept) -4.861153 -2.823206 sex 1.263976 3.428764 Using the usual confidence interval formula, (2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038 The results from (2) are identical to those from SPSS but do not agree with those from the confint function. I have reviewed the MASS pdf file and, seeing no solution there, have tried to get the Venables & Ripley book from the local college libraries but the only copies are out on loan. I suspect there is a simple explanation of the discrepancy, perhaps a modification to account for pre-asymptotic distribution. Or perhaps I misunderstand the application of the confint fuuction in the MASS package. If someone knows the explanation, I'd appreciate it. -- Jerome L. Myers [[alternative HTML version deleted]]
Marc Schwartz
2012-Jan-18 17:55 UTC
[R] confint function in MASS package for logistic regression analysis
On Jan 18, 2012, at 9:27 AM, Jerome Myers wrote:> I have the following binary data set: > Sex > Response 0 1 > 0 159 162 > 1 4 37 > My commands > library(MASS) > sib.glm=glm(sib~sex,family=binomial,data=sib.data) > summary(sib.glm) > The coefficients in the output are > Estimate Std. Error z value Pr(>|z|) > (Intercept) -3.6826 0.5062 -7.274 3.48e-13 *** > sex 2.2059 0.5380 4.100 4.13e-05 *** > I have calculated the .95 confidencce interval for sex two ways: > (1) confint(sib.glm) The result is > 2.5 % 97.5 % > (Intercept) -4.861153 -2.823206 > sex 1.263976 3.428764 > > Using the usual confidence interval formula, > (2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038 > The results from (2) are identical to those from SPSS but do not agree > with those from the confint function. > > I have reviewed the MASS pdf file and, seeing no solution there, > have tried to get the Venables & Ripley book from the local college > libraries but the only copies are out on loan. I suspect there is a > simple explanation of the discrepancy, perhaps a modification to account > for pre-asymptotic distribution. Or perhaps I misunderstand the > application of the confint fuuction in the MASS package. If someone > knows the explanation, I'd appreciate it.The confint.glm() function in V&R's MASS package provides "profile likelihood" confidence intervals for better coverage, as compared to the formula you have in (2), which presumes a normally distributed parameter estimate (eg. Wald type CI's). If SPSS is using (2) for logistic regression by default, I would have to question why, but not being a user, would also have to think that they offer alternative methods. Do a Google search for "profile likelihood confidence interval" which will lead you to a number of suitable references on theory. HTH, Marc Schwartz
Prof Brian Ripley
2012-Jan-18 17:55 UTC
[R] confint function in MASS package for logistic regression analysis
Yes, the results from confint() are much more accurate than yours and SPSS's. (As Bill Venables once said in a similar circumstance: this is not the place to report bugs in SPSS.) Hint: the word 'profile' appears all over the place on the help pages. confint() uses profile likelihood methods. These are particularly appropriate[*] for logistic regression, as explained in the book for which this is support software, so you will need to get hold of it. [*] Rather, what you style 'usual' methods are particularly inappropriate. On 18/01/2012 15:27, Jerome Myers wrote:> I have the following binary data set: > Sex > Response 0 1 > 0 159 162 > 1 4 37 > My commands > library(MASS) > sib.glm=glm(sib~sex,family=binomial,data=sib.data) > summary(sib.glm) > The coefficients in the output are > Estimate Std. Error z value Pr(>|z|) > (Intercept) -3.6826 0.5062 -7.274 3.48e-13 *** > sex 2.2059 0.5380 4.100 4.13e-05 *** > I have calculated the .95 confidencce interval for sex two ways: > (1) confint(sib.glm) The result is > 2.5 % 97.5 % > (Intercept) -4.861153 -2.823206 > sex 1.263976 3.428764 > > Using the usual confidence interval formula, > (2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038 > The results from (2) are identical to those from SPSS but do not agree > with those from the confint function. > > I have reviewed the MASS pdf file and, seeing no solution there, > have tried to get the Venables& Ripley book from the local college > libraries but the only copies are out on loan. I suspect there is a > simple explanation of the discrepancy, perhaps a modification to account > for pre-asymptotic distribution. Or perhaps I misunderstand the > application of the confint fuuction in the MASS package. If someone > knows the explanation, I'd appreciate it. >-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Nordlund, Dan (DSHS/RDA)
2012-Jan-18 18:02 UTC
[R] confint function in MASS package for logistic regression analysis
> -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Jerome Myers > Sent: Wednesday, January 18, 2012 7:27 AM > To: r-help at r-project.org > Subject: [R] confint function in MASS package for logistic regression > analysis > > I have the following binary data set: > Sex > Response 0 1 > 0 159 162 > 1 4 37 > My commands > library(MASS) > sib.glm=glm(sib~sex,family=binomial,data=sib.data) > summary(sib.glm) > The coefficients in the output are > Estimate Std. Error z value Pr(>|z|) > (Intercept) -3.6826 0.5062 -7.274 3.48e-13 *** > sex 2.2059 0.5380 4.100 4.13e-05 *** > I have calculated the .95 confidencce interval for sex two ways: > (1) confint(sib.glm) The result is > 2.5 % 97.5 % > (Intercept) -4.861153 -2.823206 > sex 1.263976 3.428764 > > Using the usual confidence interval formula, > (2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038 > The results from (2) are identical to those from SPSS but do not agree > with those from the confint function. > > I have reviewed the MASS pdf file and, seeing no solution there, > have tried to get the Venables & Ripley book from the local college > libraries but the only copies are out on loan. I suspect there is a > simple explanation of the discrepancy, perhaps a modification to > account > for pre-asymptotic distribution. Or perhaps I misunderstand the > application of the confint fuuction in the MASS package. If someone > knows the explanation, I'd appreciate it. > > -- > Jerome L. Myers >Jerome, I suspect that the difference you are seeing is due to your "usual confidence interval formula" being based on asymptotic normal methods, while the confint.glm method from MASS uses profile likelihoods. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
William Dunlap
2012-Jan-18 18:07 UTC
[R] confint function in MASS package for logistic regression analysis
Your original data must have looked something like the following: sib.data <- data.frame(sib=rep(c(0,1,0,1), c(159,4,162,37)), sex=rep(c(0,0,1,1), c(159, 4, 162, 37))) as that gives the 2x2 table you showed (with 'Response' -> 'sib'): > table(sib.data) sex sib 0 1 0 159 162 1 4 37 With that data we can recreate your results: > sib.glm <- glm(sib~sex,family=binomial,data=sib.data) > summary(sib.glm)$coefficients Estimate Std. Error z value Pr(>|z|) (Intercept) -3.682610 0.5062440 -7.274377 3.480223e-13 sex 2.205931 0.5380361 4.099969 4.132054e-05 You can get confidence intervals matching SPSS's and "the usual confidence interval formula" by calling confint.default (which uses the variance of the coefficient estimates, vcov(sib.glm), and the asymptotic formula you gave). > confint.default(sib.glm) 2.5 % 97.5 % (Intercept) -4.67483 -2.690390 sex 1.15140 3.260463 I believe special glm method for confint uses "profile likelihood" to find the confidence intervals. There are quite a few descriptions of that available. > confint(sib.glm) Waiting for profiling to be done... 2.5 % 97.5 % (Intercept) -4.861153 -2.823206 sex 1.263976 3.428764 SPSS probably can do the same sort of calculation, as the following SPSS document describes the algorithm: http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Falg_genlin_gzlm_est_ci.htm Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com> -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jerome Myers > Sent: Wednesday, January 18, 2012 7:27 AM > To: r-help at r-project.org > Subject: [R] confint function in MASS package for logistic regression analysis > > I have the following binary data set: > Sex > Response 0 1 > 0 159 162 > 1 4 37 > My commands > library(MASS) > sib.glm=glm(sib~sex,family=binomial,data=sib.data) > summary(sib.glm) > The coefficients in the output are > Estimate Std. Error z value Pr(>|z|) > (Intercept) -3.6826 0.5062 -7.274 3.48e-13 *** > sex 2.2059 0.5380 4.100 4.13e-05 *** > I have calculated the .95 confidencce interval for sex two ways: > (1) confint(sib.glm) The result is > 2.5 % 97.5 % > (Intercept) -4.861153 -2.823206 > sex 1.263976 3.428764 > > Using the usual confidence interval formula, > (2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038 > The results from (2) are identical to those from SPSS but do not agree > with those from the confint function. > > I have reviewed the MASS pdf file and, seeing no solution there, > have tried to get the Venables & Ripley book from the local college > libraries but the only copies are out on loan. I suspect there is a > simple explanation of the discrepancy, perhaps a modification to account > for pre-asymptotic distribution. Or perhaps I misunderstand the > application of the confint fuuction in the MASS package. If someone > knows the explanation, I'd appreciate it. > > -- > Jerome L. Myers > > > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.