I want use survfit() and basehaz() inside a function, but it doesn't work.
Could you take a look at this problem. Thanks for your help. Following is my
codes: 
library(survival)
n <- 50      # total sample size
nclust <- 5  # number of clusters
clusters <- rep(1:nclust,each=n/nclust)
beta0 <- c(1,2)
set.seed(13)
#generate phmm data set
Z <- cbind(Z1=sample(0:1,n,replace=TRUE),
       Z2=sample(0:1,n,replace=TRUE),
       Z3=sample(0:1,n,replace=TRUE))
b <-
cbind(rep(rnorm(nclust),each=n/nclust),rep(rnorm(nclust),each=n/nclust))
Wb <- matrix(0,n,2)
for( j in 1:2) Wb[,j] <- Z[,j]*b[,j]
Wb <- apply(Wb,1,sum)
T <- -log(runif(n,0,1))*exp(-Z[,c('Z1','Z2')]%*%beta0-Wb)
C <- runif(n,0,1)
time <- ifelse(T<C,T,C)
event <- ifelse(T<=C,1,0)
mean(event)
phmmd <- data.frame(Z)
phmmd$cluster <- clusters
phmmd$time <- time
phmmd$event <- event
fmla <- as.formula("Surv(time, event) ~ Z1 + Z2")
BaseFun <- function(x){
start.coxph <- coxph(x, phmmd)  
print(start.coxph)
betahat <- start.coxph$coefficient
print(betahat) 
print(333)  
print(survfit(start.coxph))
m <- basehaz(start.coxph)
print(m)
}  
BaseFun(fmla)
Error in formula.default(object, env = baseenv()) : invalid formula
But the following function works:
coxph(fmla, phmmd)
--
View this message in context:
http://r.789695.n4.nabble.com/formula-error-inside-function-tp4326549p4326549.html
Sent from the R help mailing list archive at Nabble.com.
Le mardi 24 janvier 2012 ? 23:02 -0800, moli a ?crit :> I want use survfit() and basehaz() inside a function, but it doesn't work. > Could you take a look at this problem. Thanks for your help. Following is my > codes:<snip>> Error in formula.default(object, env = baseenv()) : invalid formula > > > But the following function works: > coxph(fmla, phmmd)Please provide us with a shorter code to reproduce the problem. You can't expect people to read debug the whole function to identify the problem. You can use traceback() after the error occurs to see where it comes from. Also, reproducing the code here doesn't lead to the same error: Error in model.frame(formula = x, data = phmmd) : object 'x' not found Cheers
> I want use survfit() and basehaz() inside a function, but it doesn't > work. Could you take a look at this problem. Thanks for your help.Your problem has to do with environments, and these lines fmla <- as.formula("Surv(time, event) ~ Z1 + Z2") BaseFun <- function(x){ start.coxph <- coxph(x, phmmd) ... survfit(start.coxph) } Basefun(fmla) The survfit routine needs to reconstruct the model matrix, and by default in R this is done in the context where the model formula was first defined. Unfortunately this is outside the function, leading to problems -- your argument "x" is is unknown in the outer envirnoment. The solution is to add "model=TRUE" to the coxph call so that the model frame is saved and survfit doesn't have to do reconstruction. If you think this should work as is, well, so do I. I spent a lot of time on this issue a few months ago and finally threw in the towel. The interaction of environments with model.frame and model.matrix is subtle and far from obvious. (Just to be clear: I didn't say "broken". Each aspect of the process has well thought out reasons.) The standard modeling functions lm, glm, etc changed their defaults from model=F to model=T at some point. This costs some space & memory, but coxph may need to do the same. Terry T
On Jan 25, 2012, at 2:02 AM, moli wrote:> I want use survfit() and basehaz() inside a function, but it doesn't > work. > Could you take a look at this problem. Thanks for your help. > Following is my > codes: > > library(survival) > n <- 50 # total sample size > nclust <- 5 # number of clusters > clusters <- rep(1:nclust,each=n/nclust) > beta0 <- c(1,2) > set.seed(13) > #generate phmm data set > Z <- cbind(Z1=sample(0:1,n,replace=TRUE), > Z2=sample(0:1,n,replace=TRUE), > Z3=sample(0:1,n,replace=TRUE)) > b <- > cbind(rep(rnorm(nclust),each=n/nclust),rep(rnorm(nclust),each=n/ > nclust)) > Wb <- matrix(0,n,2) > for( j in 1:2) Wb[,j] <- Z[,j]*b[,j] > Wb <- apply(Wb,1,sum) > T <- -log(runif(n,0,1))*exp(-Z[,c('Z1','Z2')]%*%beta0-Wb) > C <- runif(n,0,1) > time <- ifelse(T<C,T,C) > event <- ifelse(T<=C,1,0) > mean(event) > phmmd <- data.frame(Z) > phmmd$cluster <- clusters > phmmd$time <- time > phmmd$event <- event > > fmla <- as.formula("Surv(time, event) ~ Z1 + Z2") > > BaseFun <- function(x){You should adopt a practice of using more meaningful parameter names. In this case "x" happens to be one of the parameter names for coxph. So I suguggest something like:> BaseFun <- function(frm){> start.coxph <- coxph(x, phmmd)Then try changing that line to start.coxph <- coxph(frm, phmmd, x=TRUE,y=TRUE) It turns out that unless I included x=TRUE and y=TRUE, (as suggested by an error essage I encountered during debugging) that the right information was not available inside your function. Personally I would have passed the data to the functions as well but that doesn't seem to have been necessary (or sufficient to solve the problem). -- David.> > print(> ) > > betahat <- start.coxph$coefficient > print(betahat) > print(333) > print(survfit(start.coxph)) > m <- basehaz(start.coxph) > print(m) > } > BaseFun(fmla) > > > > Error in formula.default(object, env = baseenv()) : invalid formula > > > But the following function works: > coxph(fmla, phmmd) > > -- > View this message in context: http://r.789695.n4.nabble.com/formula-error-inside-function-tp4326549p4326549.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.David Winsemius, MD West Hartford, CT
Thanks for you help. I also spent a lot of time on it. Your explanation is very clear. -- View this message in context: http://r.789695.n4.nabble.com/formula-error-inside-function-tp4326549p4328117.html Sent from the R help mailing list archive at Nabble.com.