Dear R-help,

I have to calculate the percent inclusion of each variable in a bootstrap validation of a Cox proportional hazards model (described in Sauerbrei and Schumacher, Stat Med 11:1093, 1992).

First I need to draw a bootstrap sample from my dataset, which I did with the sample function. Then I fit a cph model and check which covariates are significant. I would repeat this 200 times and at the end calculate in what percentage of the samples each covariate was included.

This is what I entered:

boot1 <- sample(Dataset, 300, replace=T)
cph1 <- cph(Surv(months,status) ~ cov1 + cov2 + cov3 + cov4, data=boot1)

Unfortunately, I get exactly the same results (coefficients, SEs, p-values) as when I fit a Cox model without drawing a bootstrap sample first.

How do I do it right? Or is there another way to calculate the percentage?

Sorry for my bad English.

Thanks.

Dott. Mario Rossi
University of Foggia, Italy

sushi4u wrote:
> Dear R-help,
>
> I have to calculate the percent inclusion of each variable in a bootstrap validation of a Cox proportional hazards model (described in Sauerbrei and Schumacher, Stat Med 11:1093, 1992).

This approach is not recommended. Collinearities can ruin the result, and the selection frequencies will just replay what the original P-values tell you. Further, there is no reason to do variable selection in your case, and selection will distort all statistical inferences. Just pre-specify a model, fit it, and stop.

Frank

> First I need to draw a bootstrap sample from my dataset, which I did with the sample function. Then I fit a cph model and check which covariates are significant. I would repeat this 200 times and at the end calculate in what percentage of the samples each covariate was included.
>
> This is what I entered:
> boot1 <- sample(Dataset, 300, replace=T)
> cph1 <- cph(Surv(months,status) ~ cov1 + cov2 + cov3 + cov4, data=boot1)
>
> Unfortunately, I get exactly the same results (coefficients, SEs, p-values) as when I fit a Cox model without drawing a bootstrap sample first.
>
> How do I do it right? Or is there another way to calculate the percentage?
>
> Sorry for my bad English.
>
> Thanks.
>
> Dott. Mario Rossi
> University of Foggia, Italy

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
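
On the mechanics of the question: the identical results are most likely because sample() applied to a data frame draws columns, not rows, so every "bootstrap" data set still contains all of the original observations. Below is a minimal sketch of row-wise resampling, under several assumptions that are not from the thread: the data are simulated stand-ins for Dataset, coxph() from the survival package is used in place of cph() so the example needs no extra packages, and "included" is taken to mean a Wald p-value below 0.05 in the full model. The caveats Frank raises about selection frequencies still apply to whatever this produces.

```r
library(survival)  # Surv() and coxph(); used here in place of rms::cph (assumption)

set.seed(1)

# Simulated stand-in for the poster's `Dataset`; all values are made up
n <- 300
Dataset <- data.frame(cov1 = rnorm(n), cov2 = rnorm(n),
                      cov3 = rnorm(n), cov4 = rnorm(n))
true_time <- rexp(n, rate = exp(0.7 * Dataset$cov1))  # only cov1 affects the hazard
cens_time <- rexp(n, rate = 0.3)                      # independent censoring
Dataset$months <- pmin(true_time, cens_time)
Dataset$status <- as.integer(true_time <= cens_time)

form <- Surv(months, status) ~ cov1 + cov2 + cov3 + cov4
covs <- c("cov1", "cov2", "cov3", "cov4")

B   <- 200  # number of bootstrap replicates, as in the question
sig <- matrix(FALSE, nrow = B, ncol = length(covs),
              dimnames = list(NULL, covs))

for (b in seq_len(B)) {
  # Resample row indices with replacement, then subset the rows
  rows <- sample(nrow(Dataset), replace = TRUE)
  boot <- Dataset[rows, ]
  fit  <- coxph(form, data = boot)
  p    <- summary(fit)$coefficients[, "Pr(>|z|)"]
  sig[b, ] <- p[covs] < 0.05
}

# Percentage of bootstrap samples in which each covariate had p < 0.05
inclusion_pct <- 100 * colMeans(sig)
print(inclusion_pct)
```

The only change that matters relative to the code in the question is Dataset[sample(nrow(Dataset), replace = TRUE), ]; the same indexing should work unchanged with cph() if rms is loaded.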