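From your Stata output, it looks like you need to use the survey package in R.
glm() treats wst7 as ordinary prior weights, so you get the same betas but
model-based standard errors; Stata's svy: prefix reports linearized
(design-based) ones, which is why your tests differ.

For step-by-step instructions on how to do this (and comparisons to Stata), see
http://journal.r-project.org/archive/2009-2/RJournal_2009-2_Damico.pdf

Once you're ready to run the regression, use svyglm() instead of glm() and
drop the weights argument (since the weights will already be part of the
survey design) :)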
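In case it helps, here is a minimal sketch of what that could look like. It assumes a weights-only design (no clustering, one stratum), which is what the Stata header suggests (1 stratum, 248 PSUs for 248 observations). The data frame `dat` and its values are simulated stand-ins, not your data, and the formula is shortened to three of your variables to keep the example small.

```r
library(survey)  # install.packages("survey") if it is not installed

# Simulated stand-in data: `dat` and every value in it are made up for
# illustration. Only the variable names come from the original post.
set.seed(1)
n <- 248
dat <- data.frame(
  bach  = rbinom(n, 1, 0.5),
  job2  = rbinom(n, 1, 0.5),
  mujer = rbinom(n, 1, 0.5),
  wst7  = runif(n, 5, 40)   # hypothetical sampling weights
)

# Assumed design: no clusters (ids = ~1), no strata, weights in wst7.
des <- svydesign(ids = ~1, weights = ~wst7, data = dat)

# No `weights =` argument here: the design object already carries them.
# quasibinomial() avoids the non-integer-successes warning from binomial().
fit <- svyglm(bach ~ job2 + mujer, design = des, family = quasibinomial())
summary(fit)

# For comparison: the weighted glm() gives the same point estimates,
# but its standard errors are model-based rather than design-based.
naive <- glm(bach ~ job2 + mujer, data = dat, weights = wst7,
             family = quasibinomial())
cbind(svyglm = coef(fit), glm = coef(naive))
cbind(svyglm = sqrt(diag(vcov(fit))), glm = sqrt(diag(vcov(naive))))
```

With your real data you would pass your own data frame to svydesign() and use your full formula in svyglm(). If the survey actually has clusters or strata, they belong in the `ids` and `strata` arguments of svydesign().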
On Fri, Nov 23, 2012 at 3:13 PM, Pablo Menese <pmenese@gmail.com> wrote:
> Until a week ago I used Stata for everything.
> Now I'm learning R and trying to move over. At this stage I'm testing R
> by trying to do the same things I used to do in Stata, with the same
> outputs.
> I have a problem with the logit when applying weights.
>
> In Stata I get this output:
> . svy: logit bach job2 mujer i.egp4 programa delay mdeo i.str evprivate
> (running logit on estimation sample)
>
> Survey: Logistic regression
>
> Number of strata =   1        Number of obs   =       248
> Number of PSUs   = 248        Population size = 5290.1639
>                               Design df       =       247
>                               F(  11,    237) =      4.39
>                               Prob > F        =    0.0000
>
>
>                         Linearized
> bach              Coef.  Std. Err.       t   P>|t|   [95% Conf. Interval]
>
> job2          -.4437446   .4385934   -1.01   0.313   -1.307605   .4201154
> mujer          1.070595   .4169919    2.57   0.011    .2492812   1.891908
>
> egp4
>   2           -.4839342    .539808   -0.90   0.371   -1.547148   .5792796
>   3           -1.288947   .5347344   -2.41   0.017   -2.342168  -.2357263
>   4           -.8569793   .5106425   -1.68   0.095   -1.862748   .1487898
>
> programa       .9694352   .5677642    1.71   0.089   -.1488415   2.087712
> delay         -1.552582   .5714967   -2.72   0.007   -2.678211   -.426954
> mdeo          -.7938904   .3727571   -2.13   0.034   -1.528078  -.0597025
>
> str
>   2           -1.122691   .5731879   -1.96   0.051    -2.25165   .0062682
>   3           -2.056682   .6350485   -3.24   0.001   -3.307483  -.8058812
>
> evprivate     -1.962431   .5674143   -3.46   0.001   -3.080018  -.8448431
> _cons          2.308699   .7274924    3.17   0.002    .8758187   3.741578
>
>
> The best I could get in R was:
>
> glm(formula = bach ~ job2 + mujer + egp4 + programa + delay +
>     mdeo + str + evprivate, family = quasibinomial(link = "logit"),
>     weights = wst7)
>
> Deviance Residuals:
>      Min        1Q    Median        3Q       Max
> -12.5951   -3.9034   -0.9412    3.8268   11.2750
>
> Coefficients:
>                            Estimate Std. Error t value Pr(>|t|)
> (Intercept)                  2.3087     0.7173   3.218  0.00147 **
> job2                        -0.4437     0.4355  -1.019  0.30926
> mujer                        1.0706     0.3558   3.009  0.00290 **
> egp4intermediate (iii, iv)  -0.4839     0.4946  -0.978  0.32890
> egp4skilled manual workers  -1.2889     0.5268  -2.447  0.01514 *
> egp4working class           -0.8570     0.4625  -1.853  0.06514 .
> programa                     0.9694     0.4951   1.958  0.05141 .
> delay                       -1.5526     0.4878  -3.183  0.00166 **
> mdeo                        -0.7939     0.4207  -1.887  0.06037 .
> strest. ii                  -1.1227     0.4809  -2.334  0.02042 *
> strestr. iii                -2.0567     0.5134  -4.006 8.28e-05 ***
> evprivate                   -1.9624     0.6490  -3.024  0.00277 **
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for quasibinomial family taken to be 23.14436)
>
> Null deviance: 7318.5 on 246 degrees of freedom
> Residual deviance: 5692.8 on 235 degrees of freedom
> (103 observations deleted due to missingness)
> AIC: NA
>
> Number of Fisher Scoring iterations: 6
>
> Warning message:
> In summary.glm(logit) :
> observations with zero weight not used for calculating dispersion
>
> This gives the same betas, but the hypothesis tests have different values...
>
>
> HELP!!!!
>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.