thr3ads.net - R help - [R] Stata and R user GLM method [Jan 2010]

Jean-Baptiste Combes

2010-Jan-22 14:25 UTC

[R] Stata and R user GLM method

Hello people,

I am in the process of migrating from Stata to R and I would like to check
whether my results agree between the two packages.

Here is my GLM command in R:
nurse.model<-glm(pQSfteHT~dQSvacrateHTQuali3_2 + dQSvacrateHTQuali3_3 +
dQSvacrateHTQuali3_4 + dQSvacrateHTQuali3_5 + cluster_32 + cluster_33 +
cluster_34 ,family=binomial(link = "logit"))


and below is the Stata command:
glm pQSfteHT dQSvacrateHTQuali3_2 dQSvacrateHTQuali3_3 dQSvacrateHTQuali3_4 ///
    dQSvacrateHTQuali3_5 cluster_32 cluster_33 cluster_34, link(probit) ///
    family(binomial) robust

Apart from the robust option, my understanding is that the two commands
should give the same results.
Stata output:

Second model (N=690)

                       Coef.      p-value
Constant               0.241***   0.000
QV>SV>0                0.076***   0.001
SV>QV>0                0.071**    0.027
QV>SV=0                0.051**    0.019
SV>QV=0                0.042      0.368
Mental Health HTs     -0.226***   0.000
Acute Teaching HTs     0.159***   0.000
Other HTs              0.084      0.200

R output (sorry for the presentation; I am not yet able to produce nice
tables. The variables are in the same order as above):
Call:
glm(formula = pQSfteHT ~ dQSvacrateHTQuali3_2 + dQSvacrateHTQuali3_3 +
    dQSvacrateHTQuali3_4 + dQSvacrateHTQuali3_5 + cluster_32 +
    cluster_33 + cluster_34, family = binomial(link = "logit"))

Deviance Residuals:
       Min          1Q      Median          3Q         Max
-2.297e+00   2.107e-08   2.107e-08   6.275e-06   3.850e-01

Coefficients:
                       Estimate Std. Error   z value Pr(>|z|)
(Intercept)           4.476e+01  1.950e+04     0.002    0.998
dQSvacrateHTQuali3_2 -1.112e+00  2.136e+04 -5.21e-05    1.000
dQSvacrateHTQuali3_3 -5.365e-01  2.576e+04 -2.08e-05    1.000
dQSvacrateHTQuali3_4 -2.011e+01  1.693e+04    -0.001    0.999
dQSvacrateHTQuali3_5 -6.509e-01  4.040e+04 -1.61e-05    1.000
cluster_32           -3.194e-01  1.788e+04 -1.79e-05    1.000
cluster_33           -2.857e-02  2.475e+04 -1.15e-06    1.000
cluster_34           -2.209e+01  9.666e+03    -0.002    0.998

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 15.0690  on 688  degrees of freedom
Residual deviance:  7.2049  on 681  degrees of freedom
AIC: 23.205

Number of Fisher Scoring iterations: 24



My guess is that something is wrong with my data in R (I am confident in
the Stata results). What do you think? I do not expect you to solve my
problem, since that is difficult without knowing the data; I would just like
an opinion on the differences between the two packages. Do you agree that
something is wrong?

Thank you for reading this e-mail.

I would like to thank you in advance, and also to thank the people who
answered my previous e-mail; that was very kind of you.

Jean-Baptiste

Joseph Magagnoli

2010-Jan-22 16:32 UTC

[R] Stata and R user GLM method

Jean-Baptiste,
The most immediate difference I see is that you use a logit link in the R
code but a probit link in the Stata code.
Joe

-- 
Joseph C. Magagnoli
Doctoral Student
Department of Political Science
University of North Texas
1155 Union Circle #305340
Denton, Texas 76203-5017
Email: jcm0250@unt.edu
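
To illustrate why the link mismatch matters, here is a minimal sketch on
simulated data. The variables x and y and the true coefficients are
assumptions made up for the example, not the poster's data. It fits the same
binomial GLM once with each link: the coefficients come out on different
scales (as a rule of thumb the logit estimates are roughly 1.6 to 1.8 times
the probit ones), while the fitted probabilities stay close.

```r
# Simulated data; x, y and the true coefficients are hypothetical.
set.seed(1)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = pnorm(0.3 + 0.8 * x))

# Same model, two different link functions
fit_logit  <- glm(y ~ x, family = binomial(link = "logit"))
fit_probit <- glm(y ~ x, family = binomial(link = "probit"))

# Coefficients are on different scales and are not directly comparable
coefs <- cbind(logit = coef(fit_logit), probit = coef(fit_probit))
print(coefs)

# Fitted probabilities from the two fits are nevertheless similar
gap <- max(abs(fitted(fit_logit) - fitted(fit_probit)))
print(gap)
```

So a coefficient table from a logit fit should not be expected to match one
from a probit fit, even when both models describe the data about equally
well.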

ONKELINX, Thierry

2010-Jan-22 16:33 UTC

[R] Stata and R user GLM method

Jean-Baptiste,

You are not doing the same thing in R as in Stata. In Stata you used the
probit link; in R, the logit link.

HTH,

Thierry 


----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
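
For readers who want the R fit to mirror the Stata call more closely, the
following is a hedged sketch, again on simulated data (dat, x1, x2 and y are
hypothetical names, not the poster's variables). It uses the probit link and
then computes a basic heteroskedasticity-robust (HC0 sandwich) covariance by
hand with base R. Stata's robust option may apply a finite-sample adjustment
and other defaults, so exact agreement with Stata's standard errors is not
guaranteed.

```r
# Simulated data; all names and coefficients here are hypothetical.
set.seed(2)
n <- 500
dat <- data.frame(x1 = rbinom(n, 1, 0.4), x2 = rnorm(n))
dat$y <- rbinom(n, 1, pnorm(-0.2 + 0.5 * dat$x1 + 0.3 * dat$x2))

# Probit link, matching link(probit) family(binomial) in the Stata call
fit <- glm(y ~ x1 + x2, data = dat, family = binomial(link = "probit"))

# Per-observation score contributions: x_i * (working residual * working weight)
X <- model.matrix(fit)
score <- X * (residuals(fit, "working") * weights(fit, "working"))

# HC0 sandwich: bread %*% meat %*% bread, with bread = (X'WX)^{-1}
bread <- summary(fit)$cov.unscaled
vc_robust <- bread %*% crossprod(score) %*% bread
se_robust <- sqrt(diag(vc_robust))

# Compare model-based and robust standard errors side by side
print(cbind(estimate  = coef(fit),
            se_model  = sqrt(diag(vcov(fit))),
            se_robust = se_robust))
```

The robust option only changes the standard errors, not the coefficient
estimates, so the coefficients themselves should already be comparable once
the link functions match.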
