Pankaj Choudhary
2003-Jan-21 03:29 UTC
[R] Logistic regression: At times correlation matrix of coefficients gets messed up
Hi,
When I include a categorical variable (RACE with 3 levels - "white",
"black" and "other") in my logistic regression model, the correlation
matrix of the coefficients gets messed up. I get something like:
-----------------------------------------
Correlation of Coefficients:
( A L RACEb
AGE , 1
LWT , 1
RACEblack 1
RACEother . .
attr(,"legend")
[1] 0 ` ' 0.3 `.' 0.6 `,' 0.8 `+' 0.9 `*' 0.95 `B' 1
-------------------------------------
I couldn't figure out how to interpret it. Here is the sequence of
commands and the complete output. (I am using R 1.6.2)
-----------------------------------------
> lowbwt.alr <- glm(LOW~AGE+LWT+RACE, family=binomial, data=lowbwt)
> summary(lowbwt.alr, correlation=TRUE)
Call:
glm(formula = LOW ~ AGE + LWT + RACE, family = binomial, data = lowbwt)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.4052 -0.8946 -0.7209 1.2484 2.0982
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.306741 1.069558 1.222 0.2218
AGE -0.025524 0.033244 -0.768 0.4426
LWT -0.014353 0.006521 -2.201 0.0277 *
RACEblack 1.003821 0.497957 2.016 0.0438 *
RACEother 0.443460 0.360184 1.231 0.2182
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 234.67 on 188 degrees of freedom
Residual deviance: 222.66 on 184 degrees of freedom
AIC: 232.66
Number of Fisher Scoring iterations: 3
Correlation of Coefficients:
( A L RACEb
AGE , 1
LWT , 1
RACEblack 1
RACEother . .
attr(,"legend")
[1] 0 ` ' 0.3 `.' 0.6 `,' 0.8 `+' 0.9 `*' 0.95 `B' 1
---------------------------------------------------------------------
Strangely enough, when I just use (AGE and RACE) or (LWT and RACE) or
(AGE and LWT) or just RACE as the explanatory variable(s), there is no
problem.
Am I doing something wrong? I will greatly appreciate any help.
With best wishes,
Pankaj Choudhary
U. of Texas at Dallas
Prof Brian D Ripley
2003-Jan-21 09:29 UTC
[R] Logistic regression: At times correlation matrix of coefficients gets messed up
It's not messed up, just someone's idea of a compact display.
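To make the compact display concrete: the legend in the output above matches the default cutpoints of symnum() for correlation matrices, where each symbol stands for a range of absolute correlation (for example `,' means between 0.6 and 0.8, and a blank means below 0.3). The following is only a minimal sketch; the small matrix `R` and its values are made up for illustration and are not the poster's results.

```r
# Hypothetical 3x3 correlation matrix (values invented for illustration).
nm <- c("(Intercept)", "AGE", "LWT")
R <- matrix(c( 1.00, -0.70, -0.65,
              -0.70,  1.00,  0.10,
              -0.65,  0.10,  1.00),
            nrow = 3, byrow = TRUE, dimnames = list(nm, nm))

# symnum() replaces each number by a symbol for the range its absolute
# value falls in, and shows only the lower triangle of a correlation matrix.
sym <- symnum(R)
print(sym)

# The mapping from symbols to ranges is stored with the result.
attr(sym, "legend")
```

Here the -0.70 between AGE and the intercept is displayed as `,' and the 0.10 between LWT and AGE is displayed as a blank, which is how the rows in the original output should be read.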
Options are
1) Use vcov(fit) instead
2) Use print(summary(fit), symbolic.cor=FALSE)
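A rough sketch of both options follows. The poster's `lowbwt` data frame is not available here, so the data below are simulated under the same column names purely as a stand-in (an assumption, not the original data). Note that vcov() returns covariances rather than correlations; cov2cor() is used here as one way to convert them.

```r
# Simulated stand-in for the poster's data (hypothetical values).
set.seed(1)
n <- 189
lowbwt <- data.frame(
  AGE  = round(runif(n, 14, 45)),
  LWT  = round(rnorm(n, 130, 30)),
  RACE = factor(sample(c("white", "black", "other"), n, replace = TRUE),
                levels = c("white", "black", "other"))
)
lowbwt$LOW <- rbinom(n, 1, plogis(1.3 - 0.015 * lowbwt$LWT))

fit <- glm(LOW ~ AGE + LWT + RACE, family = binomial, data = lowbwt)

# Option 1: numeric covariance matrix of the coefficients,
# converted to correlations with cov2cor().
V <- vcov(fit)
R <- cov2cor(V)
print(round(R, 2))

# Option 2: ask the print method for numbers instead of symbols.
s <- summary(fit, correlation = TRUE)
print(s, symbolic.cor = FALSE)

# The numeric correlation matrix is also stored in the summary object.
s$correlation
```

Either route gives the full numeric matrix, so nothing has to be decoded from the legend.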
Does anyone think that the current arrangement (use this scheme for more
than 4 coefficients) is sensible? Surely the abbreviations are not
("(" for intercept?), and why is the diagonal shown while the top row
and last column are omitted? If the whole matrix were shown, the
column labels could be omitted.
I'd much prefer symbolic.cor=FALSE to be the default.
On Mon, 20 Jan 2003, Pankaj Choudhary wrote:
>
> Hi,
>
> When I include a categorical variable (RACE with 3 levels - "white",
> "black" and "other") in my logistic regression model, the correlation
> matrix of the coefficients gets messed up. I get something like:
>
> -----------------------------------------
> Correlation of Coefficients:
> ( A L RACEb
> AGE , 1
> LWT , 1
> RACEblack 1
> RACEother . .
> attr(,"legend")
> [1] 0 ` ' 0.3 `.' 0.6 `,' 0.8 `+' 0.9 `*' 0.95 `B' 1
> -------------------------------------
>
> I couldn't figure out how to interpret it. Here is the sequence of
> commands and the complete output. (I am using R 1.6.2)
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595