Jerome Myers
2012-Jan-18 15:27 UTC
[R] confint function in MASS package for logistic regression analysis
I have the following binary data set:
Sex
Response 0 1
0 159 162
1 4 37
My commands
library(MASS)
sib.glm=glm(sib~sex,family=binomial,data=sib.data)
summary(sib.glm)
The coefficients in the output are
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.6826 0.5062 -7.274 3.48e-13 ***
sex 2.2059 0.5380 4.100 4.13e-05 ***
I have calculated the .95 confidencce interval for sex two ways:
(1) confint(sib.glm) The result is
2.5 % 97.5 %
(Intercept) -4.861153 -2.823206
sex 1.263976 3.428764
Using the usual confidence interval formula,
(2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038
The results from (2) are identical to those from SPSS but do not agree
with those from the confint function.
I have reviewed the MASS pdf file and, seeing no solution there,
have tried to get the Venables & Ripley book from the local college
libraries but the only copies are out on loan. I suspect there is a
simple explanation of the discrepancy, perhaps a modification to account
for pre-asymptotic distribution. Or perhaps I misunderstand the
application of the confint fuuction in the MASS package. If someone
knows the explanation, I'd appreciate it.
--
Jerome L. Myers
[[alternative HTML version deleted]]
Marc Schwartz
2012-Jan-18 17:55 UTC
[R] confint function in MASS package for logistic regression analysis
On Jan 18, 2012, at 9:27 AM, Jerome Myers wrote:> I have the following binary data set: > Sex > Response 0 1 > 0 159 162 > 1 4 37 > My commands > library(MASS) > sib.glm=glm(sib~sex,family=binomial,data=sib.data) > summary(sib.glm) > The coefficients in the output are > Estimate Std. Error z value Pr(>|z|) > (Intercept) -3.6826 0.5062 -7.274 3.48e-13 *** > sex 2.2059 0.5380 4.100 4.13e-05 *** > I have calculated the .95 confidencce interval for sex two ways: > (1) confint(sib.glm) The result is > 2.5 % 97.5 % > (Intercept) -4.861153 -2.823206 > sex 1.263976 3.428764 > > Using the usual confidence interval formula, > (2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038 > The results from (2) are identical to those from SPSS but do not agree > with those from the confint function. > > I have reviewed the MASS pdf file and, seeing no solution there, > have tried to get the Venables & Ripley book from the local college > libraries but the only copies are out on loan. I suspect there is a > simple explanation of the discrepancy, perhaps a modification to account > for pre-asymptotic distribution. Or perhaps I misunderstand the > application of the confint fuuction in the MASS package. If someone > knows the explanation, I'd appreciate it.The confint.glm() function in V&R's MASS package provides "profile likelihood" confidence intervals for better coverage, as compared to the formula you have in (2), which presumes a normally distributed parameter estimate (eg. Wald type CI's). If SPSS is using (2) for logistic regression by default, I would have to question why, but not being a user, would also have to think that they offer alternative methods. Do a Google search for "profile likelihood confidence interval" which will lead you to a number of suitable references on theory. HTH, Marc Schwartz
Prof Brian Ripley
2012-Jan-18 17:55 UTC
[R] confint function in MASS package for logistic regression analysis
Yes, the results from confint() are much more accurate than yours and SPSS's. (As Bill Venables once said in a similar circumstance: this is not the place to report bugs in SPSS.) Hint: the word 'profile' appears all over the place on the help pages. confint() uses profile likelihood methods. These are particularly appropriate[*] for logistic regression, as explained in the book for which this is support software, so you will need to get hold of it. [*] Rather, what you style 'usual' methods are particularly inappropriate. On 18/01/2012 15:27, Jerome Myers wrote:> I have the following binary data set: > Sex > Response 0 1 > 0 159 162 > 1 4 37 > My commands > library(MASS) > sib.glm=glm(sib~sex,family=binomial,data=sib.data) > summary(sib.glm) > The coefficients in the output are > Estimate Std. Error z value Pr(>|z|) > (Intercept) -3.6826 0.5062 -7.274 3.48e-13 *** > sex 2.2059 0.5380 4.100 4.13e-05 *** > I have calculated the .95 confidencce interval for sex two ways: > (1) confint(sib.glm) The result is > 2.5 % 97.5 % > (Intercept) -4.861153 -2.823206 > sex 1.263976 3.428764 > > Using the usual confidence interval formula, > (2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038 > The results from (2) are identical to those from SPSS but do not agree > with those from the confint function. > > I have reviewed the MASS pdf file and, seeing no solution there, > have tried to get the Venables& Ripley book from the local college > libraries but the only copies are out on loan. I suspect there is a > simple explanation of the discrepancy, perhaps a modification to account > for pre-asymptotic distribution. Or perhaps I misunderstand the > application of the confint fuuction in the MASS package. If someone > knows the explanation, I'd appreciate it. >-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Nordlund, Dan (DSHS/RDA)
2012-Jan-18 18:02 UTC
[R] confint function in MASS package for logistic regression analysis
> -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Jerome Myers > Sent: Wednesday, January 18, 2012 7:27 AM > To: r-help at r-project.org > Subject: [R] confint function in MASS package for logistic regression > analysis > > I have the following binary data set: > Sex > Response 0 1 > 0 159 162 > 1 4 37 > My commands > library(MASS) > sib.glm=glm(sib~sex,family=binomial,data=sib.data) > summary(sib.glm) > The coefficients in the output are > Estimate Std. Error z value Pr(>|z|) > (Intercept) -3.6826 0.5062 -7.274 3.48e-13 *** > sex 2.2059 0.5380 4.100 4.13e-05 *** > I have calculated the .95 confidencce interval for sex two ways: > (1) confint(sib.glm) The result is > 2.5 % 97.5 % > (Intercept) -4.861153 -2.823206 > sex 1.263976 3.428764 > > Using the usual confidence interval formula, > (2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038 > The results from (2) are identical to those from SPSS but do not agree > with those from the confint function. > > I have reviewed the MASS pdf file and, seeing no solution there, > have tried to get the Venables & Ripley book from the local college > libraries but the only copies are out on loan. I suspect there is a > simple explanation of the discrepancy, perhaps a modification to > account > for pre-asymptotic distribution. Or perhaps I misunderstand the > application of the confint fuuction in the MASS package. If someone > knows the explanation, I'd appreciate it. > > -- > Jerome L. Myers >Jerome, I suspect that the difference you are seeing is due to your "usual confidence interval formula" being based on asymptotic normal methods, while the confint.glm method from MASS uses profile likelihoods. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
William Dunlap
2012-Jan-18 18:07 UTC
[R] confint function in MASS package for logistic regression analysis
Your original data must have looked something like the following:
sib.data <- data.frame(sib=rep(c(0,1,0,1), c(159,4,162,37)),
sex=rep(c(0,0,1,1), c(159, 4, 162, 37)))
as that gives the 2x2 table you showed (with 'Response' ->
'sib'):
> table(sib.data)
sex
sib 0 1
0 159 162
1 4 37
With that data we can recreate your results:
> sib.glm <- glm(sib~sex,family=binomial,data=sib.data)
> summary(sib.glm)$coefficients
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.682610 0.5062440 -7.274377 3.480223e-13
sex 2.205931 0.5380361 4.099969 4.132054e-05
You can get confidence intervals matching SPSS's and
"the usual confidence interval formula" by calling
confint.default (which uses the variance of the coefficient
estimates, vcov(sib.glm), and the asymptotic formula you
gave).
> confint.default(sib.glm)
2.5 % 97.5 %
(Intercept) -4.67483 -2.690390
sex 1.15140 3.260463
I believe special glm method for confint uses
"profile likelihood" to find the confidence intervals.
There are quite a few descriptions of that available.
> confint(sib.glm)
Waiting for profiling to be done...
2.5 % 97.5 %
(Intercept) -4.861153 -2.823206
sex 1.263976 3.428764
SPSS probably can do the same sort of calculation, as the
following SPSS document describes the algorithm:
http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Falg_genlin_gzlm_est_ci.htm
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at
r-project.org] On Behalf Of Jerome Myers
> Sent: Wednesday, January 18, 2012 7:27 AM
> To: r-help at r-project.org
> Subject: [R] confint function in MASS package for logistic regression
analysis
>
> I have the following binary data set:
> Sex
> Response 0 1
> 0 159 162
> 1 4 37
> My commands
> library(MASS)
> sib.glm=glm(sib~sex,family=binomial,data=sib.data)
> summary(sib.glm)
> The coefficients in the output are
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -3.6826 0.5062 -7.274 3.48e-13 ***
> sex 2.2059 0.5380 4.100 4.13e-05 ***
> I have calculated the .95 confidencce interval for sex two ways:
> (1) confint(sib.glm) The result is
> 2.5 % 97.5 %
> (Intercept) -4.861153 -2.823206
> sex 1.263976 3.428764
>
> Using the usual confidence interval formula,
> (2) 2.2059 +/- 1.96*.538 = 1.15142. 3.26038
> The results from (2) are identical to those from SPSS but do not agree
> with those from the confint function.
>
> I have reviewed the MASS pdf file and, seeing no solution there,
> have tried to get the Venables & Ripley book from the local college
> libraries but the only copies are out on loan. I suspect there is a
> simple explanation of the discrepancy, perhaps a modification to account
> for pre-asymptotic distribution. Or perhaps I misunderstand the
> application of the confint fuuction in the MASS package. If someone
> knows the explanation, I'd appreciate it.
>
> --
> Jerome L. Myers
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.