hazbro
2012-Aug-08 22:03 UTC
[R] basehaz() in package 'Survival' and warnings() with coxph
Hello,

I have a couple of questions regarding fitting a coxph model to a data set in R.

I have a very large dataset and want to get the baseline hazard using the basehaz() function in the 'survival' package. If I use all the covariates, the output from basehaz(fit), where fit is a model fitted with coxph(), gives 507 unique values for the time and the corresponding cumulative hazard. However, if I use a subset of the variables, basehaz() gives 611 values for the time and cumulative hazard. The latter makes more sense because, out of my 73000 observations, there are 611 unique times. However, I wish to use all the variables to get the baseline hazard.

I also get a couple of warnings when I fit the coxph() model:

1) In fitter(X, Y, strats, offset, init, control, weights = weights, : Loglik converged before variable ; beta may be infinite.
2) X is deemed to be singular.

I am aware that the second one is caused by multicollinearity, and none of the coefficients are infinite, so I thought I could ignore these warnings. Removing the variables that cause these problems does not solve my problem with basehaz(). The only reason I can think of is that the baseline hazard may be undefined at some time points.

Thanks for the help.

--
View this message in context: http://r.789695.n4.nabble.com/basehaz-in-package-Survival-and-warnings-with-coxph-tp4639687.html
Sent from the R help mailing list archive at Nabble.com.
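One thing that may be worth checking here (an assumption, not something established in this thread): by default coxph() drops every row that has a missing value in any model variable (na.action = na.omit), so a model with more covariates can end up fitted to fewer rows, and basehaz() only reports times from the rows actually used. A minimal sketch, using the lung data shipped with 'survival' as a stand-in for the real data:

```r
library(survival)

# Stand-in data: 'lung' ships with the survival package.
# age and sex have no missing values; meal.cal and wt.loss do.
fit_small <- coxph(Surv(time, status) ~ age + sex, data = lung)
fit_large <- coxph(Surv(time, status) ~ age + sex + meal.cal + wt.loss,
                   data = lung)

# Number of rows in the data versus rows actually used by each fit
c(rows = nrow(lung), small = fit_small$n, large = fit_large$n)

# Number of distinct time points reported by basehaz() for each fit
n_times_small <- length(unique(basehaz(fit_small)$time))
n_times_large <- length(unique(basehaz(fit_large)$time))
c(small = n_times_small, large = n_times_large)

# Distinct times among the rows with no missing model variables,
# for comparison with the larger model
vars <- c("time", "status", "age", "sex", "meal.cal", "wt.loss")
used <- complete.cases(lung[, vars])
length(unique(lung$time[used]))
```

If fit$n for the full model is well below the number of rows in the data, comparing the unique times among the complete cases with the 507 reported by basehaz() would confirm or rule out this explanation.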
hazbro
2012-Aug-10 00:53 UTC
[R] basehaz() in package 'Survival' and warnings() with coxph
My sessionInfo is as follows:

R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] mi_0.09-16       arm_1.5-05       foreign_0.8-50   abind_1.4-0
 [5] R2WinBUGS_2.1-18 coda_0.15-2      lme4_0.999999-0  Matrix_1.0-6
 [9] lattice_0.20-6   car_2.0-12       nnet_7.3-4       MASS_7.3-20
[13] MuMIn_1.7.11     survival_2.36-14

loaded via a namespace (and not attached):
[1] grid_2.15.1  nlme_3.1-104 stats4_2.15.1

It will be difficult to provide a reproducible example here, as the data set I am using is very large. I can show the model calls, though:

    fit3.1 <- coxph(formula = y ~ sex + ns(ageyrs, df = 2) + AdmissionSource +
        X1 + X2 + X3 + X5 + X6 + X7 + X11 + X12 + X13 + X14 + X15 +
        X16 + X17 + X18 + X19 + X20 + X22 + X24 + X25 + X26 + X27 +
        X28 + X29 + X32 + X33 + X35 + X38 + X39 + X40 + X41 + X42 +
        X43 + X44 + X47 + X49 + X53 + X54 + X55 + X58 + X59 + X62 +
        X68 + X69 + X78 + X80 + X81 + X84 + X85 + X86 + X93 + X95 +
        X98 + X100 + X101 + X102 + X105 + X107 + X108 + X109 + X110 +
        X112 + X113 + X114 + X115 + X116 + X117 + X121 + X122 + X125 +
        X127 + X128 + X129 + X131 + X132 + X133 + X134 + X138 + X140 +
        X143 + X145 + X146 + X148 + X150 + X151 + X153 + X157 + X158 +
        X159 + X164 + X197 + X200 + X202 + X203 + X204 + X205 + X211 +
        X214 + X217 + X224 + X228 + X233 + X237 + X244 + X249 + X254 +
        X258 + X259 + X260 + CharlsonIndex + ethnic + day + season +
        ln, data = dat2)
    haz <- basehaz(fit3.1)  # gives 507 unique time points in haz$time

    fit2 <- coxph(y ~ ns(ageyrs, df = 2) + day + ln + sex + AdmissionSource +
        season + CharlsonIndex, data = dat1)
    haz <- basehaz(fit2)    # gives 611 unique time points in haz$time

I get the following warning with fit3.1:

    Warning message:
    In fitter(X, Y, strats, offset, init, control, weights = weights, :
      Loglik converged before variable ; beta may be infinite.

Also, the coefficients of the variables for which the warning occurs are very large. The Wald test suggests dropping these terms, whereas the LRT suggests keeping them. What should I do in terms of model selection?

--
View this message in context: http://r.789695.n4.nabble.com/basehaz-in-package-Survival-and-warnings-with-coxph-tp4639687p4639838.html
Sent from the R help mailing list archive at Nabble.com.
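For reference, the "Convergence" section of ?coxph describes this situation: when the estimate of a coefficient is effectively infinite (for example, a binary covariate with no events in one group), the Wald statistic is not valid, while the likelihood ratio test remains valid. The sketch below illustrates the disagreement on simulated data. The data frame d and the variables x and rare are invented for illustration and have nothing to do with the real data set.

```r
library(survival)

set.seed(1)
n <- 200

# Hypothetical data: 'rare' marks 10 subjects, none of whom has an event,
# so the partial likelihood keeps increasing as its coefficient -> -Inf.
d <- data.frame(
  time   = rexp(n),
  status = rbinom(n, 1, 0.7),
  x      = rbinom(n, 1, 0.5),
  rare   = rep(c(1, 0), times = c(10, n - 10))
)
d$status[d$rare == 1] <- 0

# The first fit is expected to warn that a coefficient may be infinite
fit1 <- coxph(Surv(time, status) ~ x + rare, data = d)
fit0 <- coxph(Surv(time, status) ~ x, data = d)

# Wald test for 'rare' (second coefficient): estimate / standard error
se     <- sqrt(diag(fit1$var))
wald_z <- unname(coef(fit1)[2] / se[2])
wald_p <- 2 * pnorm(-abs(wald_z))

# Likelihood ratio test for 'rare': compare final log-likelihoods
lrt   <- 2 * (fit1$loglik[2] - fit0$loglik[2])
lrt_p <- pchisq(lrt, df = 1, lower.tail = FALSE)

c(coef = unname(coef(fit1)[2]), se = se[2],
  wald_p = wald_p, lrt_p = lrt_p)
```

In a case like this the large coefficient comes with a very large standard error, so the Wald p-value says little, and the likelihood ratio comparison is the one to rely on. Whether the same mechanism is behind the warning for fit3.1 would need to be checked by tabulating events against the flagged variables.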