Hello people,

I am in the process of migrating from Stata to R and I would like to check whether my results are the same in the two packages.

Here is my GLM command in R:

    nurse.model <- glm(pQSfteHT ~ dQSvacrateHTQuali3_2 + dQSvacrateHTQuali3_3 +
        dQSvacrateHTQuali3_4 + dQSvacrateHTQuali3_5 + cluster_32 + cluster_33 +
        cluster_34, family = binomial(link = "logit"))

and below is the Stata command:

    glm pQSfteHT dQSvacrateHTQuali3_2 dQSvacrateHTQuali3_3 dQSvacrateHTQuali3_4 dQSvacrateHTQuali3_5 cluster_32 cluster_33 cluster_34, link(probit) family(binomial) robust

Apart from the robust option, as far as I understand I should get the same results.

Stata output:

    Second model (N=690)

                            Coef.       p-value
    Constant                0.241***    0.000
    QV>SV>0                 0.076***    0.001
    SV>QV>0                 0.071**     0.027
    QV>SV=0                 0.051**     0.019
    SV>QV=0                 0.042       0.368
    Mental Health HTs      -0.226***    0.000
    Acute Teaching HTs      0.159***    0.000
    Other HTs               0.084       0.200

R output (sorry for the presentation, I am not yet able to produce nice tables; the variables are in the same order as above):

    Call:
    glm(formula = pQSfteHT ~ dQSvacrateHTQuali3_2 + dQSvacrateHTQuali3_3 +
        dQSvacrateHTQuali3_4 + dQSvacrateHTQuali3_5 + cluster_32 +
        cluster_33 + cluster_34, family = binomial(link = "logit"))

    Deviance Residuals:
           Min          1Q      Median          3Q         Max
    -2.297e+00   2.107e-08   2.107e-08   6.275e-06   3.850e-01

    Coefficients:
                           Estimate  Std. Error    z value  Pr(>|z|)
    (Intercept)           4.476e+01   1.950e+04      0.002     0.998
    dQSvacrateHTQuali3_2 -1.112e+00   2.136e+04  -5.21e-05     1.000
    dQSvacrateHTQuali3_3 -5.365e-01   2.576e+04  -2.08e-05     1.000
    dQSvacrateHTQuali3_4 -2.011e+01   1.693e+04     -0.001     0.999
    dQSvacrateHTQuali3_5 -6.509e-01   4.040e+04  -1.61e-05     1.000
    cluster_32           -3.194e-01   1.788e+04  -1.79e-05     1.000
    cluster_33           -2.857e-02   2.475e+04  -1.15e-06     1.000
    cluster_34           -2.209e+01   9.666e+03     -0.002     0.998

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 15.0690  on 688  degrees of freedom
    Residual deviance:  7.2049  on 681  degrees of freedom
    AIC: 23.205

    Number of Fisher Scoring iterations: 24

My guess is that something is wrong with my data in R (I am confident in the Stata results). What do you think? I am not expecting you to solve my problem, as I reckon that is difficult without knowing the data. I would just like an opinion on the differences between the two packages: do you agree that something is wrong?

Thank you for reading this e-mail.

Thank you in advance, and thanks also to the people who answered my previous e-mail; that was very kind of you.

Jean-Baptiste
Jean-Baptiste,

The most immediate difference I see is that you use a logit link in the R code but a probit link function in the Stata code.

Joe

On Fri, Jan 22, 2010 at 8:25 AM, Jean-Baptiste Combes <jbcombes@laposte.net> wrote:
> Here is my GLM command under R
> nurse.model<-glm(pQSfteHT~dQSvacrateHTQuali3_2 + dQSvacrateHTQuali3_3 +
> dQSvacrateHTQuali3_4 + dQSvacrateHTQuali3_5 + cluster_32 + cluster_33 +
> cluster_34 ,family=binomial(link = "logit"))
>
> and below the stata command
> glm pQSfteHT dQSvacrateHTQuali3_2 dQSvacrateHTQuali3_3 dQSvacrateHTQuali3_4
> dQSvacrateHTQuali3_5 cluster_32 cluster_33 cluster_34, link(probit)
> family(binomial) robust

--
Joseph C. Magagnoli
Doctoral Student
Department of Political Science
University of North Texas
1155 Union Circle #305340
Denton, Texas 76203-5017
Email: jcm0250@unt.edu
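To put the two fits on the same footing, the R call would need the probit link as well. Here is a minimal sketch on simulated data; the variables y, x1 and x2 are made up for illustration and are not the poster's data.

```r
# Simulated stand-in data; all variable names here are hypothetical.
set.seed(1)
n  <- 690
x1 <- rbinom(n, 1, 0.4)
x2 <- rbinom(n, 1, 0.3)
y  <- rbinom(n, 1, pnorm(-0.5 + 0.6 * x1 - 0.4 * x2))
d  <- data.frame(y, x1, x2)

# The kind of model the R command in the post fits: logit link
fit_logit <- glm(y ~ x1 + x2, data = d, family = binomial(link = "logit"))

# Counterpart of Stata's link(probit) family(binomial)
fit_probit <- glm(y ~ x1 + x2, data = d, family = binomial(link = "probit"))

# The coefficients sit on different scales under the two links,
# so they should not be compared directly
print(round(cbind(logit = coef(fit_logit), probit = coef(fit_probit)), 3))
```

In the original call this would mean changing only link = "logit" to link = "probit". The link mismatch is the first thing to fix before comparing; whether it accounts for the whole discrepancy cannot be told from the output alone.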
Jean-Baptiste,

You are not doing the same thing in R as in Stata. In Stata you used the probit link, in R the logit link.

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Research Institute for Nature and Forest (Instituut voor natuur- en bosonderzoek)
team Biometrics & Quality Assurance (team Biometrie & Kwaliteitszorg)
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

-----Original message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On behalf of Jean-Baptiste Combes
Sent: Friday 22 January 2010 15:25
To: r-help at r-project.org
Subject: [R] Stata and R user GLM method

Here is my GLM command under R
nurse.model<-glm(pQSfteHT~dQSvacrateHTQuali3_2 + dQSvacrateHTQuali3_3 + dQSvacrateHTQuali3_4 + dQSvacrateHTQuali3_5 + cluster_32 + cluster_33 + cluster_34 ,family=binomial(link = "logit"))

and below the stata command
glm pQSfteHT dQSvacrateHTQuali3_2 dQSvacrateHTQuali3_3 dQSvacrateHTQuali3_4 dQSvacrateHTQuali3_5 cluster_32 cluster_33 cluster_34, link(probit) family(binomial) robust
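Once the links agree, the remaining difference named in the original post is Stata's robust option, which changes the standard errors but not the coefficients. Below is a rough base-R sketch of a sandwich (HC0) covariance for a probit GLM, on simulated data with hypothetical variable names. Finite-sample adjustments differ between implementations, so digit-for-digit agreement with Stata should not be expected.

```r
# Simulated stand-in data; x and y are hypothetical, not the poster's variables.
set.seed(2)
n <- 690
x <- rnorm(n)
y <- rbinom(n, 1, pnorm(-0.3 + 0.5 * x))

fit <- glm(y ~ x, family = binomial(link = "probit"))

X <- model.matrix(fit)

# Per-observation score contributions:
# design row * working residual * working weight
score <- X * (residuals(fit, type = "working") * weights(fit, type = "working"))

bread <- vcov(fit)          # model-based covariance (dispersion fixed at 1)
meat  <- crossprod(score)   # sum of outer products of the scores

# HC0 sandwich estimator, with no small-sample correction
vc_robust <- bread %*% meat %*% bread

print(cbind(model_se  = sqrt(diag(bread)),
            robust_se = sqrt(diag(vc_robust))))
```

The sandwich package provides vcovHC() for the same purpose; the hand-rolled version above is only meant to show what the robust option changes.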