thr3ads.net - R help - [R] polr (MASS) and lrm (Design) differences in tests of statistical signifcance [Sep 2004]

Paul Johnson

2004-Sep-30 21:40 UTC

[R] polr (MASS) and lrm (Design) differences in tests of statistical significance

Greetings:

I'm running R-1.9.1 on Fedora Core 2 Linux.

I fitted a proportional odds logistic regression with MASS's polr and 
Design's lrm.  The parameter estimates from the two agree, but the 
standard errors are quite different, and the conclusions from the t and 
Wald tests are dramatically different. I cranked the "abstol" argument
up quite a bit in the polr call, and that did not make the differences 
go away.

So:

1. Can you help me see why the standard errors from polr are so much 
smaller, and

2. Can I hear more opinions on the question of t vs. Wald in making 
these significance tests? So far, I understand that the t is based on 
the asymptotic Normality of the estimate of b, and that for finite 
samples b/se is not exactly distributed as a t. But I also had the 
impression that the Wald value was an approximation as well.

 > summary(polr(as.factor(RENUCYC) ~ DOCS + PCT65PLS*RANNEY2 + OLDCRASH +
     FISCAL2 + PCTMETRO + ADMLICEN, data=elaine1))

Re-fitting to get Hessian

Call:
polr(formula = as.factor(RENUCYC) ~ DOCS + PCT65PLS * RANNEY2 +
     OLDCRASH + FISCAL2 + PCTMETRO + ADMLICEN, data = elaine1)

Coefficients:
                         Value  Std. Error   t value
DOCS              0.004942217 0.002952001  1.674192
PCT65PLS          0.454638558 0.113504288  4.005475
RANNEY2           0.110473483 0.010829826 10.200855
OLDCRASH          0.139808663 0.042245692  3.309418
FISCAL2           0.025592117 0.011465812  2.232037
PCTMETRO          0.018184093 0.007792680  2.333484
ADMLICEN         -0.028490387 0.011470999 -2.483688
PCT65PLS:RANNEY2 -0.008559228 0.001456543 -5.876400

Intercepts:
       Value   Std. Error t value
2|3    6.6177  0.3019    21.9216
3|4    7.1524  0.2773    25.7938
4|5   10.5856  0.2149    49.2691
5|6   12.2132  0.1858    65.7424
6|8   12.2704  0.1856    66.1063
8|10  13.0345  0.2184    59.6707
10|12 13.9801  0.3517    39.7519
12|18 14.6806  0.5587    26.2782

Residual Deviance: 587.0995
AIC: 619.0995


 > lrm(RENUCYC ~ DOCS + PCT65PLS*RANNEY2 + OLDCRASH + FISCAL2 +
     PCTMETRO + ADMLICEN, data=elaine1)

Logistic Regression Model

lrm(formula = RENUCYC ~ DOCS + PCT65PLS * RANNEY2 + OLDCRASH +
     FISCAL2 + PCTMETRO + ADMLICEN, data = elaine1)


Frequencies of Responses
   2   3   4   5   6   8  10  12  18
  21  12 149  46   1  10   6   2   2

Frequencies of Missing Values Due to Each Variable
  RENUCYC     DOCS PCT65PLS  RANNEY2 OLDCRASH  FISCAL2 PCTMETRO ADMLICEN
        5        0        0        6        0        5        0        5

        Obs  Max Deriv Model L.R.       d.f.          P          C        Dxy
        249      7e-05      56.58          8          0      0.733      0.465
      Gamma      Tau-a         R2      Brier
       0.47      0.278       0.22      0.073

                    Coef       S.E.     Wald Z P
y>=3                -6.617857 6.716688 -0.99  0.3245
y>=4                -7.152561 6.716571 -1.06  0.2869
y>=5               -10.585705 6.742222 -1.57  0.1164
y>=6               -12.213340 6.755656 -1.81  0.0706
y>=8               -12.270506 6.755571 -1.82  0.0693
y>=10              -13.034584 6.756829 -1.93  0.0537
y>=12              -13.980235 6.767724 -2.07  0.0389
y>=18              -14.680760 6.786639 -2.16  0.0305
DOCS                 0.004942 0.002932  1.69  0.0918
PCT65PLS             0.454653 0.552430  0.82  0.4105
RANNEY2              0.110475 0.076438  1.45  0.1484
OLDCRASH             0.139805 0.042104  3.32  0.0009
FISCAL2              0.025592 0.011374  2.25  0.0245
PCTMETRO             0.018184 0.007823  2.32  0.0201
ADMLICEN            -0.028490 0.011576 -2.46  0.0138
PCT65PLS * RANNEY2  -0.008559 0.006417 -1.33  0.1822

 >

-- 
Paul E. Johnson                       email: pauljohn at ku.edu
Dept. of Political Science            http://lark.cc.ku.edu/~pauljohn
1541 Lilac Lane, Rm 504
University of Kansas                  Office: (785) 864-9086
Lawrence, Kansas 66044-3177           FAX: (785) 864-5700

John Fox

2004-Oct-01 00:34 UTC


Dear Paul,

I tried polr() and lrm() on a different problem and (except for the
difference in signs for the cut-points/intercepts) got identical results for
both coefficients and standard errors. There might be something
ill-conditioned about your problem that produces the discrepancy -- I
noticed, for example, that some of the upper categories of the response are
very sparse. Perhaps the two functions use different forms of the
information matrix. I expect that someone else will be able to supply more
details.
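
One way to look into the conditioning question is sketched below. It is only an illustration under stated assumptions: it uses the `housing` data that ships with MASS, not the `elaine1` data from the original post, and it relies on polr() storing a numerically computed Hessian in the fitted object when `Hess = TRUE`. A very large condition number would be a hint, not proof, that the reported standard errors are numerically fragile.

```r
library(MASS)

# Illustrative data only: the housing example from the MASS package,
# NOT the elaine1 data discussed in this thread.
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq,
            data = housing, Hess = TRUE)

# Standard errors as used by summary(): square roots of the diagonal
# of the estimated covariance matrix.
se <- sqrt(diag(vcov(fit)))

# Condition number of the Hessian stored in the fit. Very large values
# suggest the inverse (and hence the standard errors) may be unstable.
cond <- kappa(fit$Hessian, exact = TRUE)

print(se)
print(cond)
```

If the condition number is huge for a given problem, refitting after rescaling or centering the predictors and seeing whether the standard errors change is a cheap follow-up check.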

I believe that the t-statistics in the polr() output are actually Wald
statistics.
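
As a quick check of that point, the arithmetic can be redone from the numbers printed in the original post. The sketch below takes the PCT65PLS row from each printout; the variable names and the helper `p_value` are made up for illustration. Both statistics are simply estimate divided by standard error, so the disagreement between the two tests comes entirely from the standard errors.

```r
# Values copied from the two printouts in the original post (PCT65PLS row).
polr_est <- 0.454638558; polr_se <- 0.113504288
lrm_est  <- 0.454653;    lrm_se  <- 0.552430

# Both statistics are the estimate divided by its standard error.
polr_t <- polr_est / polr_se   # printed by polr as "t value"
lrm_z  <- lrm_est / lrm_se     # printed by lrm as "Wald Z"

# Two-sided p-value against the standard normal distribution; this
# reproduces the P column in the lrm printout.
p_value <- function(z) 2 * pnorm(abs(z), lower.tail = FALSE)

print(round(c(polr_t = polr_t, lrm_z = lrm_z), 2))
print(round(c(polr_p = p_value(polr_t), lrm_p = p_value(lrm_z)), 4))

# Ratio of the two standard errors for the same coefficient.
print(lrm_se / polr_se)
```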

I hope this helps,
 John


