thr3ads.net - R help - [R] Explaining Survival difference between Stata and R [May 2004]

If this information is useful, please help other people find it:
Share via:

Paul Johnson

2004-May-10 23:59 UTC

[R] Explaining Survival difference between Stata and R

Dear Everybody:

I'm doing my usual "how does that work in R" thing with some Stata
projects.  I find a gross gap between the Stata and R in Cox PH models, 
and I hope you can give me some pointers about what goes wrong.  I'm 
getting signals from R/Survival that the model just can't be estimated, 
but Stata spits out numbers just fine.

I wonder if I should specify initial values for coxph?

I got a dataset from a student who uses Stata and try to replicate in R. 
I will share data to you in case you want to see for yourself.  Let me 
know if you want text or Stata data file.

In R, I try this:

 > cox2 <- coxph(Surv(yrs2,ratify)~ accession+ haz.wst+ haz.in +haz.out+ 
wefgov+ rle+ rqe + pol.free +tai.2001 + ny.gdp.pcap.pp.cd + eio, 
data=dat3, control=coxph.control(iter.max=1000),singular.ok=T)
Warning message:
Ran out of iterations and did not converge in: fitter(X, Y, strats, 
offset, init, control, weights = weights,

So I wrote out the file exatly as it was in R into Stata dataset

 > write.dta(dat3,"cleanBasel.dta")
Warning message:
Abbreviating variable names in: write.dta(dat3, "cleanBasel.dta")


Here's the Stata output:

. use
"/home/pauljohn/ps/ps909/AdvancedRegression/duration_2/cleanBasel.dta"
(Written by R.              )

. stset yrs2, failure (ratify)

      failure event:  ratify != 0 & ratify < .
obs. time interval:  (0, yrs2]
  exit on or before:  failure

----------------------------------------------------------------------------
 > --
        21  total obs.
         0  exclusions
----------------------------------------------------------------------------
 > --
        21  obs. remaining, representing
        21  failures in single record/single failure data
        78  total analysis time at risk, at risk from t =         0
                              earliest observed entry t =         0

. stcox accessin haz_wst  haz_in  haz_out  wefgov  rle  rqe  pol_free 
tai_2001 ny_gd  eio, robust
 >  nohr

          failure _d:  ratify
    analysis time _t:  yrs2

Iteration 0:   log pseudo-likelihood = -49.054959
Iteration 1:   log pseudo-likelihood = -45.021682
Iteration 2:   log pseudo-likelihood = -44.525187
Iteration 3:   log pseudo-likelihood = -44.521588
Iteration 4:   log pseudo-likelihood = -44.521586
Refining estimates:
Iteration 0:   log pseudo-likelihood = -44.521586

Cox regression -- Breslow method for ties

No. of subjects       =           21               Number of obs   = 
     21
No. of failures       =           21
Time at risk          =           78
                                                    Wald chi2(11)   = 
   81.64
Log pseudo-likelihood =   -44.521586               Prob > chi2     = 
0.0000

------------------------------------------------------------------------------
              |               Robust
           _t |      Coef.   Std. Err.      z    P>|z|
-------------+----------------------------------------------------------------
     accessin |  -1.114101   .6343663    -1.76   0.079
      haz_wst |   2.32e-08   1.08e-07     0.22   0.829
       haz_in |   3.78e-06   2.46e-06     1.54   0.124
      haz_out |  -3.80e-07   3.76e-07    -1.01   0.312
       wefgov |   2.139127   .9136992     2.34   0.019
          rle |   1.827482   1.500878     1.22   0.223
          rqe |  -3.126696   1.332069    -2.35   0.019
     pol_free |  -.4498276    .291764    -1.54   0.123
     tai_2001 |  -2.895922   2.577401    -1.12   0.261
     ny_gd___ |  -.0003223   .0002194    -1.47   0.142
          eio |  -.0577773   .0726064    -0.80   0.426
------------------------------------------------------------------------------

.
                                  last observed exit t =         7


----------------------------------



Paul Johnson
Dept. of Political Science
University of Kansas

Thomas Lumley

2004-May-11 14:10 UTC

head link

[R] Explaining Survival difference between Stata and R

On Mon, 10 May 2004, Paul Johnson wrote:
> Dear Everybody:
>
> I'm doing my usual "how does that work in R" thing with some
Stata
> projects.  I find a gross gap between the Stata and R in Cox PH models,
> and I hope you can give me some pointers about what goes wrong.  I'm
> getting signals from R/Survival that the model just can't be estimated,
> but Stata spits out numbers just fine.
>
> I wonder if I should specify initial values for coxph?
It's worth a try. The other question is whether Stata has in fact
converged -- if the range of rqe is not small then its coefficient may
actually be infinite.
> I got a dataset from a student who uses Stata and try to replicate in R.
> I will share data to you in case you want to see for yourself.  Let me
> know if you want text or Stata data file.
I'd like to look at the data.  We should be able to get coxph to converge
when there is a finite mle -- the log partial likelihood is concave.

	-thomas

> In R, I try this:
>
>  > cox2 <- coxph(Surv(yrs2,ratify)~ accession+ haz.wst+ haz.in
+haz.out+
> wefgov+ rle+ rqe + pol.free +tai.2001 + ny.gdp.pcap.pp.cd + eio,
> data=dat3, control=coxph.control(iter.max=1000),singular.ok=T)
> Warning message:
> Ran out of iterations and did not converge in: fitter(X, Y, strats,
> offset, init, control, weights = weights,
>
> So I wrote out the file exatly as it was in R into Stata dataset
>
>  > write.dta(dat3,"cleanBasel.dta")
> Warning message:
> Abbreviating variable names in: write.dta(dat3, "cleanBasel.dta")
>
>
> Here's the Stata output:
>
> . use
"/home/pauljohn/ps/ps909/AdvancedRegression/duration_2/cleanBasel.dta"
> (Written by R.              )
>
> . stset yrs2, failure (ratify)
>
>       failure event:  ratify != 0 & ratify < .
> obs. time interval:  (0, yrs2]
>   exit on or before:  failure
>
>
----------------------------------------------------------------------------
>  > --
>         21  total obs.
>          0  exclusions
>
----------------------------------------------------------------------------
>  > --
>         21  obs. remaining, representing
>         21  failures in single record/single failure data
>         78  total analysis time at risk, at risk from t =         0
>                               earliest observed entry t =         0
>
> . stcox accessin haz_wst  haz_in  haz_out  wefgov  rle  rqe  pol_free
> tai_2001 ny_gd  eio, robust
>  >  nohr
>
>           failure _d:  ratify
>     analysis time _t:  yrs2
>
> Iteration 0:   log pseudo-likelihood = -49.054959
> Iteration 1:   log pseudo-likelihood = -45.021682
> Iteration 2:   log pseudo-likelihood = -44.525187
> Iteration 3:   log pseudo-likelihood = -44.521588
> Iteration 4:   log pseudo-likelihood = -44.521586
> Refining estimates:
> Iteration 0:   log pseudo-likelihood = -44.521586
>
> Cox regression -- Breslow method for ties
>
> No. of subjects       =           21               Number of obs   >    
21
> No. of failures       =           21
> Time at risk          =           78
>                                                     Wald chi2(11)   >   
81.64
> Log pseudo-likelihood =   -44.521586               Prob > chi2     >
0.0000
>
>
------------------------------------------------------------------------------
>               |               Robust
>            _t |      Coef.   Std. Err.      z    P>|z|
>
-------------+----------------------------------------------------------------
>      accessin |  -1.114101   .6343663    -1.76   0.079
>       haz_wst |   2.32e-08   1.08e-07     0.22   0.829
>        haz_in |   3.78e-06   2.46e-06     1.54   0.124
>       haz_out |  -3.80e-07   3.76e-07    -1.01   0.312
>        wefgov |   2.139127   .9136992     2.34   0.019
>           rle |   1.827482   1.500878     1.22   0.223
>           rqe |  -3.126696   1.332069    -2.35   0.019
>      pol_free |  -.4498276    .291764    -1.54   0.123
>      tai_2001 |  -2.895922   2.577401    -1.12   0.261
>      ny_gd___ |  -.0003223   .0002194    -1.47   0.142
>           eio |  -.0577773   .0726064    -0.80   0.426
>
------------------------------------------------------------------------------
>
> .
>                                   last observed exit t =         7
>
>
> ----------------------------------
>
>
>
> Paul Johnson
> Dept. of Political Science
> University of Kansas
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
>
Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

Seemingly Similar Threads

Search for more reasonably related threads

R help - May 2004 - Explaining Survival difference between Stata and R

[R] Explaining Survival difference between Stata and R

[R] Explaining Survival difference between Stata and R

Seemingly Similar Threads