Dear Colleagues

I have been fitting some multinomial logistic regression models using R
(version 1.6.1 on a linux box) and Stata 7. Although the vast majority of
the parameter estimates and standard errors I get from R are the same as
those from Stata (allowing for rounding errors and so on), there are a few
estimates for the same model which are quite different. I would be most
grateful if colleagues could advise me as to what might be causing this,
and whether I should worry.

Anyway, with R, I have been using the function multinom in the nnet
package. Below are two examples where the standard error estimates differ
substantially between R and Stata:

            beta        s.e.
  R:      5.939880    2.920165
  Stata:  5.939747    5.455495

  R:     11.228705    2.191625
  Stata: 11.22761     4.630293

The parameters concerned are the quadratic term of a quantitative variable
(measuring social status). I notice that the s.e. for this quadratic term
are large anyway compared to the other s.e. in the model.

There are other differences between R and Stata, and these concern the
intercept terms. Here is an example:

            beta        s.e.
  R:      0.2870793   0.4512347
  Stata: -0.2109653   0.5053566

Since neither estimate is significantly different from zero, I trust I can
ignore the difference between them. Or can I?

Many thanks in advance for any help. Please let me know if I should
provide further information.

With best wishes,

Wing
--
Department of Sociology, University of Oxford,
Littlegate House, St Ebbes, Oxford OX1 1PT, UK
tel: +44 (1865) 286176, fax: +44 (1865) 286171
http://users.ox.ac.uk/~sfos0006
ripley@stats.ox.ac.uk
2003-Mar-27 10:58 UTC
[R] Multinomial logistic regression under R and Stata
Perhaps you should ask Stata how it finds its estimates, and why it
disagrees with R?

R uses the observed information matrix for the standard errors. It is also
possible to use the expected (Fisher) information matrix. Where they
differ, the observed one is generally regarded as a better choice,
especially when, as here, the curvature is measured over a
reasonably-sized neighbourhood.

Intercepts often depend on coding, and you should cross-check the coding.

More generally, such differences can be caused by the Hauck-Donner effect
and lack of convergence, so it is almost always worth playing with the
convergence criteria.

On Thu, 27 Mar 2003, Tak Wing Chan wrote:

> The parameters concerned are the quadratic term of a quantitative
> variable (measuring social status). I notice that the s.e. for this
> quadratic term are large anyway compared to other s.e. in the model.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595
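[A minimal sketch of the two suggestions above, with hypothetical names
(d1, y, soc.st) standing in for Wing's data. multinom() passes maxit,
abstol and reltol through to the nnet optimiser, and vcov() returns the
covariance matrix based on the observed information:]

```r
library(nnet)

## hypothetical data frame d1 with outcome y and covariate soc.st
fit <- multinom(y ~ soc.st + I(soc.st^2), data = d1)

## refit with tighter convergence criteria to rule out
## lack-of-convergence as the source of the discrepancy
fit2 <- multinom(y ~ soc.st + I(soc.st^2), data = d1,
                 maxit = 1000, reltol = 1e-12)

## standard errors from the observed information matrix
se <- sqrt(diag(vcov(fit2)))
```

[If the coefficients or standard errors move appreciably between fit and
fit2, the convergence criteria, not the information matrix, are the likely
culprit.]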
Thomas W Blackwell
2003-Mar-28 01:06 UTC
[R] Multinomial logistic regression under R and Stata
Wing -

Apologies if I'm overdoing the answer to a fundamentally simple question.
There's another experiment you might do: try doing the orthogonalization
between the quadratic and linear terms yourself, explicitly. (See below.)
Find out whether and how much the reported standard errors change in each
package when you do the orthogonalization explicitly, ahead of time,
versus when you let the multinom() fitting routine use its own default
procedure. The package whose standard error estimate doesn't change is
doing it right.

I think it's not possible to do the orthogonalization exactly, because one
doesn't have access ahead of time to the weights that will be used in the
multinomial fit. But maybe approximate is good enough to show whether
there is a difference in the standard errors.

Suppose your "social status" covariate is a column named "soc.st" in a
data frame d1. The non-orthogonalized quadratic term would be soc.st^2.
Define a new data frame, d2, by

  d2 <- cbind(d1, soc.quad = residuals(lm(I(soc.st^2) ~ soc.st, data = d1)))

(note the I(), which is needed so that ^ is treated as arithmetic squaring
rather than formula crossing). The column named "soc.quad" in d2 is the
quadratic term, crudely orthogonalized to the constant and linear terms in
soc.st. Use it in your multinomial regression in place of the squared term
(if that's what you were doing before) and see whether both packages give
the same estimates and standard errors that they did before. I'll be most
surprised if the estimates are different, but I expect that one package or
the other will change its reported standard errors.

-  tom blackwell  -  u michigan medical school  -  ann arbor  -

On Thu, 27 Mar 2003, Tak Wing Chan wrote:

> Below are two examples where the estimates for standard
> error differ substantially between R and Stata:
>
>             beta        s.e.
>   R:      5.939880    2.920165
>   Stata:  5.939747    5.455495
>
>   R:     11.228705    2.191625
>   Stata: 11.22761     4.630293
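[As a side note, R's poly() computes orthogonal polynomial terms directly,
so an alternative sketch of the same experiment, again using the
hypothetical names y, soc.st and d1, would be:]

```r
library(nnet)

## poly() generates linear and quadratic columns that are orthogonal
## to each other and to the intercept
fit.orth <- multinom(y ~ poly(soc.st, 2), data = d1)

## compare against the raw (non-orthogonal) parameterization
fit.raw <- multinom(y ~ soc.st + I(soc.st^2), data = d1)
```

[The fitted probabilities from the two parameterizations should agree;
only the coefficients and their standard errors are reparameterized.]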
Dear Colleagues

I posted to the R-help and Stata lists a little while ago concerning some
disagreement in the results I obtained from the multinom function in R and
the mlogit command in Stata. Many thanks to colleagues for your comments
and ideas. I have checked out some of your suggestions, and here is a
report.

The disagreements I reported are of two types: (1) the parameter estimates
for the intercepts, and (2) the standard errors of a quadratic term of a
quantitative variable.

Regarding (1): yes, as some of you suggested, this is due to the coding of
another covariate in the model. Thanks!

As for (2), it turns out that the problem has to do with the scale of the
quadratic term. In my original model I had, out of habit, scaled down the
quadratic term by 100, so as to make its scale comparable to that of the
linear term. That is, I did the following:

  varsq <- var*var/100

This is in fact unnecessary in the present case, given the scale of the
linear term. But anyway, with the division, R and Stata disagree:

            beta        s.e.
  R:      5.939880    2.920165
  Stata:  5.939747    5.455495

  R:     11.228705    2.191625
  Stata: 11.22761     4.630293

However, if I don't divide the quadratic term by 100, then R and Stata
agree:

  R:      0.05939645  0.05455490
  Stata:  0.0593975   0.0545549

  R:      0.11227038  0.04630296
  Stata:  0.1122761   0.0463029

So it appears that R might have some precision problem in calculating the
Hessian when the scale of a variable is very small. I talked to a
colleague, David Firth, about this, and he suggests

> Possibly it would be worth implementing an algebraic vcov method for
> multinomial logit models [in R]?

Once again, many thanks to colleagues for your time and help.

Cheers,

Wing
--
Department of Sociology, University of Oxford,
Littlegate House, St Ebbes, Oxford OX1 1PT, UK
tel: +44 (1865) 286176, fax: +44 (1865) 286171
http://users.ox.ac.uk/~sfos0006
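[For the record, the invariance that makes this diagnosable: rescaling a
covariate by a constant c should multiply its coefficient and standard
error by exactly c, with everything else unchanged. A sketch of that check
in R, with hypothetical names y, var and d1:]

```r
library(nnet)

fit.raw    <- multinom(y ~ var + I(var^2),         data = d1)
fit.scaled <- multinom(y ~ var + I(var^2 / 100),   data = d1)

## the coefficient and s.e. on I(var^2/100) should be exactly 100
## times those on I(var^2); any other ratio points to a numerical
## problem in the Hessian rather than a real statistical difference
summary(fit.raw)$standard.errors
summary(fit.scaled)$standard.errors
```

[In the figures above, Stata's scaled results are exactly 100 times its
unscaled ones, while R's scaled standard errors are not, which is what
implicates the numerically computed Hessian.]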
Prof Brian Ripley
2003-Apr-24 16:10 UTC
[R] Multinomial logistic regression under R and Stata
On Mon, 21 Apr 2003, Tak Wing Chan wrote:

> the Hessian when the scale of a variable is very small. I talked to a
> colleague, David Firth, about this, and he suggests
>
> > Possibly it would be worth implementing an algebraic vcov method for
> > multinomial logit models [in R]?

Yes, and I look forward to receiving your contribution of it. (See the R
startup banner.)

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595