Сергей С.
2016-Mar-15 19:02 UTC
[R] About calculation of the gravity model in R and STATA software
Dear colleagues! We spent calculation of the gravity model in R and STATA software. For calculations we used the standard package glmm in R (with parameter family = quasipoisson) and ppml in STATA. Call the calculation procedure in R: summary(glmm<-glm(formula=exports ~ ln_GDPimporter + ln_GDPexporter + ln_GDPimppc + ln_GDPexppc + ln_Distance + ln_Tariff + ln_ExchangeRate + Contig + Comlang + Colony_CIS + EAEU_CIS + EU_European_Union, family=quasipoisson(link="log"),data=data_pua)) The results of the calculations in R following: ------------------------------------------------------------------------ Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -12.53224 15.30072 -0.819 0.41357 ln_GDPimporter 0.10180 0.14988 0.679 0.49765 ln_GDPexporter 0.14612 0.79823 0.183 0.85491 ln_GDPimppc 0.34998 0.30247 1.157 0.24840 ln_GDPexppc 0.65811 0.82189 0.801 0.42409 ln_Distance 0.21838 0.16623 1.314 0.19020 ln_Tariff -0.05499 0.04913 -1.119 0.26411 ln_ExchangeRate -0.11748 0.04275 -2.748 0.00646 ** Contig 1.48321 0.28684 5.171 4.92e-07 *** Comlang 1.50727 0.26199 5.753 2.67e-08 *** Colony_CIS 2.15272 0.46899 4.590 7.16e-06 *** EAEU_CIS -0.94417 0.29315 -3.221 0.00146 ** EU_European_Union -0.08335 0.76733 -0.109 0.91359 --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 (Dispersion parameter for quasipoisson family taken to be 2100.979) Null deviance: 1886758 on 251 degrees of freedom Residual deviance: 316332 on 239 degrees of freedom AIC: NA Number of Fisher Scoring iterations: 8 ------------------------------------------------------------------------ On the same data, we done calculations in STATA, using ppml procedure. Call of the calculation procedure in STATA was next: ppml exports ln_gdpimporter ln_gdpexporter ln_gdpimppc ln_gdpexppc ln_distance ln_tariff ln_exchangerate contig comlang colony_cis eaeu_cis eu_european_union The results of the calculations in STATA were following: ------------------------------------------------------------------------ Iteration 1: deviance = 425911.3 Iteration 2: deviance = 327020.8 Iteration 3: deviance = 316763.3 Iteration 4: deviance = 316335.1 Iteration 5: deviance = 316332.3 Iteration 6: deviance = 316332.3 Iteration 7: deviance = 316332.3 Number of parameters: 13 Number of observations: 252 Pseudo log-likelihood: -158930.64 R-squared: .75348104 Option strict is: off ----------------------------------------------------------------------------------- | Robust exports | Coef. Std. Err. z P>|z| [95% Conf. Interval] ------------------ +---------------------------------------------------------------- ln_gdpimporter | .1018021 .0982091 1.04 0.300 -.0906843 .2942885 ln_gdpexporter | .1461135 1.084255 0.13 0.893 -1.978988 2.271215 ln_gdpimppc | .349982 .201011 1.74 0.082 -.0439924 .7439564 ln_gdpexppc | .6581201 1.098236 0.60 0.549 -1.494383 2.810624 ln_distance | .2183809 .156757 1.39 0.164 -.0888572 .525619 ln_tariff | -.0549914 .0551489 -1.00 0.319 -.1630811 .0530984 ln_exchangerate | -.1174816 .0343881 -3.42 0.001 -.1848812 -.0500821 contig | 1.483213 .168467 8.80 0.000 1.153024 1.813402 comlang | 1.507272 .2745761 5.49 0.000 .9691126 2.045431 colony_cis | 2.152723 .2338133 9.21 0.000 1.694457 2.610988 eaeu_cis | -.9441651 .2469764 -3.82 0.000 -1.42823 -.4601003 eu_european_union | -.0833477 .4955678 -0.17 0.866 -1.054643 .8879474 _cons | -12.53206 21.18599 -0.59 0.554 -54.05585 28.99172 ----------------------------------------------------------------------------------- As you can see, model coefficients (second column in the results table) are the same at least until the 4th mark (!) However, other results (columns in the table of results, since the third) is not the same. Could you explain differences in the results? In particular, why coefficients the same (the first result table columns), but standard errors is not? With best regards, Sergey S. [[alternative HTML version deleted]]
Jeff Newmiller
2016-Mar-15 19:37 UTC
[R] About calculation of the gravity model in R and STATA software
I am not the person to answer your question, but have some suggestions: Make your examples reproducible so others can confirm your results and explore other issues you may not have seen. [1] [2] Post your question on the R-sig-mixed mailing list, where mixed models experts hang out, instead of here where the R language is on topic rather than contributed packages like glmm. I don't know SAS or the glmm package, but the fact that the top description line for the glmm package says it uses Monte Carlo suggests to me that it might not be "standard" in SAS or R and that your results may vary depending on how you configure it and how the random number seed is initialized. [1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example [2] See also the Posting Guide mentioned at the bottom of this and every email on this list. -- Sent from my phone. Please excuse my brevity. On March 15, 2016 12:02:07 PM PDT, "?????? ?." <salnsg at gmail.com> wrote:>Dear colleagues! > >We spent calculation of the gravity model in R and STATA software. >For calculations we used the standard package glmm in R (with parameter >family = quasipoisson) >and ppml in STATA. > >Call the calculation procedure in R: > >summary(glmm<-glm(formula=exports ~ ln_GDPimporter + ln_GDPexporter + >ln_GDPimppc + ln_GDPexppc + ln_Distance + ln_Tariff + ln_ExchangeRate + >Contig + Comlang + Colony_CIS + EAEU_CIS + EU_European_Union, >family=quasipoisson(link="log"),data=data_pua)) > >The results of the calculations in R following: > >------------------------------------------------------------------------ > >Coefficients: > Estimate Std. Error t value Pr(>|t|) >(Intercept) -12.53224 15.30072 -0.819 0.41357 >ln_GDPimporter 0.10180 0.14988 0.679 0.49765 >ln_GDPexporter 0.14612 0.79823 0.183 0.85491 >ln_GDPimppc 0.34998 0.30247 1.157 0.24840 >ln_GDPexppc 0.65811 0.82189 0.801 0.42409 >ln_Distance 0.21838 0.16623 1.314 0.19020 >ln_Tariff -0.05499 0.04913 -1.119 0.26411 >ln_ExchangeRate -0.11748 0.04275 -2.748 0.00646 ** >Contig 1.48321 0.28684 5.171 4.92e-07 *** >Comlang 1.50727 0.26199 5.753 2.67e-08 *** >Colony_CIS 2.15272 0.46899 4.590 7.16e-06 *** >EAEU_CIS -0.94417 0.29315 -3.221 0.00146 ** >EU_European_Union -0.08335 0.76733 -0.109 0.91359 >--- >Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > >(Dispersion parameter for quasipoisson family taken to be 2100.979) > > Null deviance: 1886758 on 251 degrees of freedom >Residual deviance: 316332 on 239 degrees of freedom >AIC: NA > >Number of Fisher Scoring iterations: 8 > >------------------------------------------------------------------------ > >On the same data, we done calculations in STATA, using ppml procedure. >Call of the calculation procedure in STATA was next: > >ppml exports ln_gdpimporter ln_gdpexporter ln_gdpimppc ln_gdpexppc >ln_distance ln_tariff ln_exchangerate contig comlang colony_cis >eaeu_cis >eu_european_union > >The results of the calculations in STATA were following: > >------------------------------------------------------------------------ > >Iteration 1: deviance = 425911.3 >Iteration 2: deviance = 327020.8 >Iteration 3: deviance = 316763.3 >Iteration 4: deviance = 316335.1 >Iteration 5: deviance = 316332.3 >Iteration 6: deviance = 316332.3 >Iteration 7: deviance = 316332.3 > >Number of parameters: 13 >Number of observations: 252 >Pseudo log-likelihood: -158930.64 >R-squared: .75348104 >Option strict is: off >----------------------------------------------------------------------------------- > | Robust > exports | Coef. Std. Err. z P>|z| > [95% Conf. Interval] >------------------ >+---------------------------------------------------------------- > ln_gdpimporter | .1018021 .0982091 1.04 0.300 >-.0906843 .2942885 > ln_gdpexporter | .1461135 1.084255 0.13 0.893 >-1.978988 2.271215 > ln_gdpimppc | .349982 .201011 1.74 0.082 >-.0439924 .7439564 > ln_gdpexppc | .6581201 1.098236 0.60 0.549 >-1.494383 2.810624 > ln_distance | .2183809 .156757 1.39 0.164 >-.0888572 .525619 > ln_tariff | -.0549914 .0551489 -1.00 0.319 >-.1630811 .0530984 > ln_exchangerate | -.1174816 .0343881 -3.42 0.001 >-.1848812 -.0500821 > contig | 1.483213 .168467 8.80 0.000 >1.153024 1.813402 > comlang | 1.507272 .2745761 5.49 0.000 >.9691126 2.045431 > colony_cis | 2.152723 .2338133 9.21 0.000 >1.694457 2.610988 > eaeu_cis | -.9441651 .2469764 -3.82 0.000 >-1.42823 -.4601003 >eu_european_union | -.0833477 .4955678 -0.17 0.866 >-1.054643 >.8879474 > _cons | -12.53206 21.18599 -0.59 0.554 >-54.05585 28.99172 >----------------------------------------------------------------------------------- > >As you can see, model coefficients (second column in the results table) >are >the same at least until the 4th mark (!) >However, other results (columns in the table of results, since the >third) >is not the same. >Could you explain differences in the results? >In particular, why coefficients the same (the first result table >columns), >but standard errors is not? >With best regards, >Sergey S. > > [[alternative HTML version deleted]] > >______________________________________________ >R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code.[[alternative HTML version deleted]]
S Ellison
2016-Mar-16 13:10 UTC
[R] About calculation of the gravity model in R and STATA software
> As you can see, model coefficients (second column in the results table) are the > same at least until the 4th mark (!) However, other results (columns in the > table of results, since the third) is not the same. > Could you explain differences in the results?R and stata are clearly doing different things. Look up how stata calculates its standard errors (which I note are described as '_robust_ ') and compare that with the documentation for the R function. Without knowing anything about stata or your data set, my guess would be that R's quasibinomial family is allowing for overdispersion and stata is not. S Ellison ******************************************************************* This email and any attachments are confidential. Any use, copying or disclosure other than by the intended recipient is unauthorised. If you have received this message in error, please notify the sender immediately via +44(0)20 8943 7000 or notify postmaster at lgcgroup.com and delete this message and any copies from your computer and network. LGC Limited. Registered in England 2991879. Registered office: Queens Road, Teddington, Middlesex, TW11 0LY, UK
peter dalgaard
2016-Mar-16 13:41 UTC
[R] About calculation of the gravity model in R and STATA software
On 16 Mar 2016, at 14:10 , S Ellison <S.Ellison at LGCGroup.com> wrote:>> As you can see, model coefficients (second column in the results table) are the >> same at least until the 4th mark (!) However, other results (columns in the >> table of results, since the third) is not the same. >> Could you explain differences in the results? > > R and stata are clearly doing different things. > Look up how stata calculates its standard errors (which I note are described as '_robust_ ') and compare that with the documentation for the R function. > Without knowing anything about stata or your data set, my guess would be that R's quasibinomial family is allowing for overdispersion and stata is not. > > S EllisonWith a dispersion parameter of 2100, that would be rather more of a dramatic difference.... More likely, STATA is using some sort of sandwich estimator whereas R is just upscaling the results for a Poisson regression. You might want to check out package "sandwich". -pd -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com