Dear R people, Thanks to everyone who replied to my questions about environments and lexical scoping. It was very helpful. I am still working on understanding this. My phone line has been dead for the last 24 hrs, so I was cut off and could not reply earlier. I now have another question. I am trying to use princomp with a data frame of size (2061,98) so quite large. I successfully applied princomp in Splus to it, but the R version exits with the following incomprehensible error: Error in if (symmetric) { : missing value where logical needed if (symmetric) does not appear in the code for princomp so I am flummoxed. My code follows. It is identical to that which I used for Splus. I doubt if the problem will be clear from that, but if anyone wants my dataset to try to reproduce the results, feel free to ask. In the meantime I am going to take the princomp object from splus, dump it to a file, transfer it to R, and pray it works. Sincerely, Faheem Mitha. ************************************************************************ num <- 1:33 tanf <- rep("tanf",33) tanf <- paste(tanf,num, sep="") med <- rep("med",33) med <- paste(med,num, sep="") wage <- rep("wage",33) wage <- paste(wage,num, sep="") names <- c(tanf,med,wage) #tanf, medicaid, wages (33,33,33) rm(num,tanf,med,wage) prin.df <- read.table("clus.dat",header= FALSE, row.names=NULL, col.names=names) rm(names) prin.df <- prin.df[ ,2:99] prin.pr <- princomp(prin.df, cor = T) -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
At 01:08 PM 5/20/00 -0400, Faheem Mitha wrote:> >I now have another question. I am trying to use princomp with a data frame >of size (2061,98) so quite large. I successfully applied princomp in Splus >to it, but the R version exits with the following incomprehensible error: > >Error in if (symmetric) { : missing value where logical needed > >if (symmetric) does not appear in the code for princomp so I am flummoxed.Did traceback() say it came from princomp? Did you use traceback() at all? Do you know about traceback()? Don''t worry if the answer to all three questions is a shrug - many users are like that, but it really is a fundamental tool for errors like this and I recommend it to all. traceback() would have told you that the error is not in princomp() but in eigen(), the workhorse that princomp() uses. That has to be a clue. Have you looked at the code for eigen()? Bill Venables. -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Sun, 21 May 2000, Bill Venables wrote:> At 01:08 PM 5/20/00 -0400, Faheem Mitha wrote: > > > >I now have another question. I am trying to use princomp with a data frame > >of size (2061,98) so quite large. I successfully applied princomp in Splus > >to it, but the R version exits with the following incomprehensible error: > > > >Error in if (symmetric) { : missing value where logical needed > > > >if (symmetric) does not appear in the code for princomp so I am flummoxed. > > Did traceback() say it came from princomp?No.> traceback()[1] "eigen(cv)" "princomp(prin.df, cor = T)"> princomp> Did you use traceback() at all?Not till Douglas Bates mentioned it.> Do you know about traceback()?Used it in Splus, but Splus helpfully reminds you about it. Never realised it was in R, never thought of using it. I will from now on, though. I really would like to learn this stuff.> Don''t worry if the answer to all three questions is a shrug - many users > are like that, but it really is a fundamental tool for errors like this and > I recommend it to all.> traceback() would have told you that the error is not in princomp() but in > eigen(), the workhorse that princomp() uses. That has to be a clue. Have > you looked at the code for eigen()?Yup. I''m guessing the relevant part is if (symmetric) { if (complex.x) { xr <- Re(x) xi <- Im(x) z <- .Fortran("ch", n, n, xr, xi, values = dbl.n, !only.values, vectors = xr, ivectors = xi, dbl.n, dbl.n, double(2 * n), ierr = integer(1), PACKAGE = "base") if (z$ierr) stop(paste("ch returned code ", z$ierr, " in eigen")) if (!only.values) z$vectors <- matrix(complex(re = z$vectors, im z$ivectors), nc = n) } But this does not enlighten me. It seems to involve a call to some Fortran code, and I don''t know Fortran. And I am still freaked that it worked in Splus and not in R (same data set, same code). Could someone speculate on possible reasons? I have given up hope of running this successfully on R (even if I figure out what it wrong, princomp for a data set this large requires an unholy amount of memory, probably more than I have available) but I am curious what it causing it to go kerblooey. I tried princomp with small data sets and it works fine. Faheem. -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Faheem Mitha <faheem at email.unc.edu> writes:> Yup. I''m guessing the relevant part is > > if (symmetric) { > if (complex.x) { > xr <- Re(x) > xi <- Im(x) > z <- .Fortran("ch", n, n, xr, xi, values = dbl.n, > !only.values, vectors = xr, ivectors = xi, dbl.n, > dbl.n, double(2 * n), ierr = integer(1), PACKAGE = "base") > if (z$ierr) > stop(paste("ch returned code ", z$ierr, " in eigen")) > if (!only.values) > z$vectors <- matrix(complex(re = z$vectors, im > z$ivectors), > nc = n) > } > > But this does not enlighten me. It seems to involve a call to some Fortran > code, and I don''t know Fortran. And I am still freaked that it worked in > Splus and not in R (same data set, same code).The error message said that the problem was in "if (symmetric) {..." so I''d suspect that it is because symmetric is not a logical - you''re not getting to the Fortran at all. Now, since eigen is not explicitly being passed a "symmetric" argument, that must come from missing values in the covariance matrix itself. So where do they come from? One more hint: debug(princomp) and have a look at the generation of "cv". It looks like it should be protected from missings, but there might be a bug somewhere. -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /''_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Is there a function somewhere for estimating the survivor function (plus confidence limits) when data are both left truncated and right censored? survfit in survival5 doesn''t like left truncation. -- G?ran Brostr?m tel: +46 90 786 5223 Department of Statistics fax: +46 90 786 6614 Ume? University SE-90187 Ume?, Sweden e-mail: gb at stat.umu.se -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Dear R people, I have been doing some more sherlocking of a rather primitive sort. Thanks very much for hints supplied by Peter Dalgaard and Bill Venables. I found debug(princomp) would just exit with the same error. So, since the calculations are really quite simple, I decided to follow them along, creating the objects as needed. This approach was successful in detecting the source of the prolem. The first column of my data set, corresponding to the first variable, was all ones. This reasonably enough gave a lot of zero covariances, and hence NaN''s are generated down the line when calculating the correlation matrix, hence throwing R into a tizzy. This seems reasonable enough. I don''t know enough about principal components analysis to know whether R should be expected to do anything further with a correlation matrix with missing values, but I doubt it. (Yes, the correlation matrix seems to be the problem). I wonder how Splus coped with it. I may take a look later if I have the energy. I deleted this column and then ran into a further problem. When I tried to do princomp(prin.df, cor = T) again I got Error in princomp(prin.df, cor = T) : covariance matrix is not non-negative definite. I took a look at the covariance matrix and indeed some of the eigenvalues were negative. However, they were very small negative numbers, ie [86] -1.861163e-18 -3.507169e-18 -1.400653e-17 -4.185918e-17 -5.495597e-17 [91] -8.742655e-17 -1.227594e-16 -1.515011e-16 -2.975572e-16 -3.650603e-16 [96] -5.166079e-16 -1.719909e-15 Now, I am no expert on this stuff, but I think the covariance matrix estimator is simply calculating the covariance matrix of the sample distribution, so should be guaranteed to give a non-negative definite matrix. So, I am inclined to think that this is probably a rounding error and not my fault. What should be done about it? No, please don''t tell me to rewrite the code. I don''t know enough. Faheem. -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Dear R people, A followup on my previous message. I did a little more investigation and experimentation. Firstly, I tried princomp, with cor = F. I was relieved that it gave me the "covariance matrix is not non-negative definite" error, otherwise I would have been really puzzled. So it seems clear that the first column of my matrix being all ones is what is causing the if(symmetric) error. Moving on, I was curious why Splus was not giving me a error for the same reason. I looked at the covariance matrix, and guess what, I get (for example) tanf2 tanf2 1.215290e-27 where R gave me zero. (tanf2 is the variable corresponding to the first column). Clearly zero is the correct answer, so R is the voice of sanity here. Three cheers for R! So, I have decided that I will deleted that first column of ones and run the results again. Goodness knows what terrible things this rounding error may be doing later on in the calculation. To close off, I would like to briefly explain what I am doing (which is very simple) and ask if it seems reasonable. I don''t want to be one of those people out there doing nonsense statistics. I have a data set, of 2061 rows and 99 columns originally. Now I guess it is going to be 97 columns, since the first column was all zeros (even Splus choked on this and I deleted it earlier) and the second one was all ones. Anyway, the first 64 (was 66) columns are binary data. The last 33 are numeric data. Now, I thought that a reasonable thing to do (in fact, the only thing I could think of) was to treat the first 64 columns as numeric zeros and ones, and then use the cor=TRUE flag (ie use the correlation matrix instead of the corelation matrix). This is advertised as a way of handling cases when the data is not all of the same scale. So that is what I did. Any comments/suggestions? Sincerely, Faheem Mitha. -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Faheem Mitha <faheem at email.unc.edu> writes:> I have a data set, of 2061 rows and 99 columns originally. Now I guess it > is going to be 97 columns, since the first column was all zeros (even > Splus choked on this and I deleted it earlier) and the second one was all > ones. Anyway, the first 64 (was 66) columns are binary data. The last 33 > are numeric data. Now, I thought that a reasonable thing to do (in fact, > the only thing I could think of) was to treat the first 64 columns as > numeric zeros and ones, and then use the cor=TRUE flag (ie use the > correlation matrix instead of the corelation matrix). This is advertised > as a way of handling cases when the data is not all of the same scale. So > that is what I did. Any comments/suggestions?Whatever the software, you''re likely to get in trouble trying to interpret the result of a PCA on binary data (using correlations or not). It can be tricky enough with continuous data.... Anyway, I bet some of those 64 binary columns will turn out to be linearly dependent, either overtly by some of them summing to a constant or more subtly because of some combinations being absent. A QR decomposition of your data matrix might be enlightening. Look at the rank and the pivoting information. It does seem that we''re handling the singular case suboptimally, though. Ideally, one should apply a fuzz factor before declaring that the matrix isn''t NND and I don''t think there''s a problem with factoring out a null space in the PCA. -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /''_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Sun, 21 May 2000, gb wrote:> > Is there a function somewhere for estimating the survivor function > (plus confidence limits) when data are both left truncated and right > censored? survfit in survival5 doesn''t like left truncation.I got one answer, but a function for left censoring (for which I am grateful). So I wrote my own function. Here it is. Be careful, there are a few nasty loops... and no serious testing yet. Tips on improvements are welcome! (Note, it just plots, no return values. Easy to change, though.) G?ran ------------------------------------------------------------------------ plot.surv <- function(enter = rep( 0, length(exit) ), exit, event = rep( 1, length(exit) ), group = rep( 1, length(exit) ), limits = F, conf = 0.95) { ## Input data: ## ## enter : left truncation point ## exit : exit time point ## event : if zero, a censored observation; otherwise an event. ## group : one curve for each distinct value in group. ## limits: if TRUE, and only one group, pointwise confidence ## limits (Greenwoods formula, log(-log) type). ## conf : confidence level. Can be given as a percentage. ## Check input data: n <- length(exit) if (length(enter) != n)stop("enter and exit must have equal length.") if (length(event) != n) stop("event and exit must have equal length.") if (length(group) != n) stop("group and exit must have equal length.") if (min(exit - enter) <= 0) stop("Interval length must be positive.") if (conf > 1) conf <- conf / 100 ## conf given as a percentage(?) if ( (conf < 0.5) | (conf >=1) ) stop("Bad conf value") grupp <- as.character(group) strata <- sort(unique(grupp)) if (length(strata) > 9) stop("Too many groups. No more than 9 are allowed.") ## if (length(strata) > 1) limits <- F # No limits if multiple curves. gang <- 0 for (stratum in strata) { atom <- table.events(enter[grupp == stratum], exit[grupp == stratum], event[grupp == stratum]) gang <- gang + 1 surv <- c( 1, cumprod(1 - atom$events / atom$riskset.sizes) ) if (gang == 1) { plot(c(0, atom$times), surv, type = "s", xlab = "duration", ylab = "Surviving fraction", main = "Survivor function(s)", ylim = c(0, 1), col = gang%%5) if (limits) ## Greenwood''s formula, ## Kalbfleisch & Prentice, p. 15 (note error!). { q.alpha <- abs(qnorm((1 - conf) / 2)) survived <- (atom$riskset.size - atom$events) se <- sqrt(cumsum(atom$events / ( atom$riskset.sizes * survived ) ) )/ cumsum(-log(survived / atom$riskset.sizes)) upper <- surv ^ exp(q.alpha * c(0, se)) lower <- surv ^ exp(-q.alpha * c(0, se)) lines(c(0, atom$times), upper, type = "s", col = gang%%5 + 1) lines(c(0, atom$times), lower, type = "s", col = gang%%5 + 1) } } else { lines(c(0, atom$times), surv, type = "s", col = gang%%5) } } abline(h=0) if (length(strata) > 1) { ## legend.txt <- as.character(strata) colors <- (1:length(strata))%%5 legend(0, 0.6, lty = 1, legend = strata, col = colors) } } table.events <- function(enter = rep(0, length(exit)), exit, event) { n <- length(exit) ## Check input data: if ( length(enter) != n ) stop("enter and exit must have equal length.") if ( length(event) != n ) stop("event and exit must have equal length.") ## event <- (event != 0) ## 0 (F) = censoring, else (T) = event times <- c(unique(sort(exit[event]))) nn <- length(times) rs.size <- double(nn) n.events <- double(nn) for (i in 1:nn) ## Try to avoid this loop! { rs.size[i] <- sum( (enter < times[i]) & (exit >= times[i]) ) n.events[i] <- sum( (exit == times[i]) & event ) } stop.at <- which(rs.size == n.events) if (length(stop.at)) { stop.at <- min(stop.at) - 1 if (stop.at <= 0) stop("Bad data. All died immediately!") times <- times[1:stop.at] n.events <- n.events[1:stop.at] rs.size <- rs.size[1:stop.at] } return ( list(times = times, events = n.events, riskset.sizes = rs.size) ) } Have fun! And send bug reports to -- G?ran Brostr?m e-mail: gb at stat.umu.se Professor tel: +46 90 786 5223 Department of Statistics fax: +46 90 786 6614 Ume? University SE-90187 Ume?, Sweden http://www.stat.umu.se/egna/gb -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
I have got two late answers to this question, from Mai Zhou and Bendix Carstensen. Both point out a ''trick'': Use coxph to fit a model with no covariates and ask for the ''baseline hazard'' of that fit! Unlike survfit, coxph allows for left truncated data. Thanks to both! G?ran> On Sun, 21 May 2000, gb wrote: > > > > > Is there a function somewhere for estimating the survivor function > > (plus confidence limits) when data are both left truncated and right > > censored? survfit in survival5 doesn''t like left truncation.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
R-users! How do I ask for the ''baseline hazard'' of a coxph fit?? Fredrik Lundgren ----- Ursprungligt meddelande ----- Fr?n: "gb" <gb at stat.umu.se> Till: <r-help at stat.math.ethz.ch> Skickat: den 25 maj 2000 21:25 ?mne: Re: [R] Kaplan-Meier for left truncated data?> > I have got two late answers to this question, from Mai Zhou and Bendix > Carstensen. Both point out a ''trick'': Use coxph to fit a model with > no covariates and ask for the ''baseline hazard'' of that fit! > Unlike survfit, coxph allows for left truncated data. > > Thanks to both! > > G?ran > > > On Sun, 21 May 2000, gb wrote: > > > > > > > > Is there a function somewhere for estimating the survivor function > > > (plus confidence limits) when data are both left truncated and right > > > censored? survfit in survival5 doesn''t like left truncation. > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Fri, 26 May 2000, Fredrik Lundgren wrote:> R-users! > > How do I ask for the ''baseline hazard'' of a coxph fit??This interesting:> start <- c(1,2,1,1) > stopp <- c(3,4,5,6) > event <- c(1,1,1,1) > library(survival5) > survfit(Surv(start, stopp, event))Error in survfit.km(X, Y, casewt, ...) : Can only handle right censored data Does not work :-(> res <- coxph(Surv(start, stopp, event) ~ 1) > survfit(res)Is ok! :-) G?ran> > Fredrik Lundgren > ----- Ursprungligt meddelande ----- > Fr?n: "gb" <gb at stat.umu.se> > Till: <r-help at stat.math.ethz.ch> > Skickat: den 25 maj 2000 21:25 > ?mne: Re: [R] Kaplan-Meier for left truncated data? > > > > > > I have got two late answers to this question, from Mai Zhou and Bendix > > Carstensen. Both point out a ''trick'': Use coxph to fit a model with > > no covariates and ask for the ''baseline hazard'' of that fit! > > Unlike survfit, coxph allows for left truncated data. > > > > Thanks to both! > > > > G?ran > > > > > On Sun, 21 May 2000, gb wrote: > > > > > > > > > > > Is there a function somewhere for estimating the survivor function > > > > (plus confidence limits) when data are both left truncated and right > > > > censored? survfit in survival5 doesn''t like left truncation. > > > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > > Send "info", "help", or "[un]subscribe" > > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ > >-- G?ran Brostr?m, professor tel: +46 90 786 5223 Department of Statistics fax: +46 90 786 6614 Ume? University SE-90187 Ume?, Sweden e-mail: gb at stat.umu.se -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Fri, 26 May 2000, Fredrik Lundgren wrote:> R-users! > > How do I ask for the ''baseline hazard'' of a coxph fit??One way is to take -log of the survival function survfit(a.coxph.fit)$surv gives the survival function at the mean covariates, so -log(survfit(a.coxph.fit)$surv) gives the hazard at the mean covariates. Many people define the baseline hazard as the hazard at all covariates=0 in which case you need to supply these values to survfit. This can be numerically unstable and in any case is usually not what you really want. -thomas> > Fredrik Lundgren > ----- Ursprungligt meddelande ----- > Från: "gb" <gb at stat.umu.se> > Till: <r-help at stat.math.ethz.ch> > Skickat: den 25 maj 2000 21:25 > Ämne: Re: [R] Kaplan-Meier for left truncated data? > > > > > > I have got two late answers to this question, from Mai Zhou and Bendix > > Carstensen. Both point out a ''trick'': Use coxph to fit a model with > > no covariates and ask for the ''baseline hazard'' of that fit! > > Unlike survfit, coxph allows for left truncated data. > > > > Thanks to both! > > > > Göran > > > > > On Sun, 21 May 2000, gb wrote: > > > > > > > > > > > Is there a function somewhere for estimating the survivor function > > > > (plus confidence limits) when data are both left truncated and right > > > > censored? survfit in survival5 doesn''t like left truncation. > > > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > > Send "info", "help", or "[un]subscribe" > > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >Thomas Lumley Assistant Professor, Biostatistics University of Washington, Seattle -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
summary of "How do I ask for the ''baseline hazard'' of a coxph fit??" Thank you to Thomas Lumley, Brian Ripley, G?ran Bodstr?m and Bendix Cartensen for their tips to use plot(survfit(a.coxph.fit)). Thomas Lumley suggests that to get the "real baseline hazard" I should use -log(survfit(a.coxph.fit)$surv) and set all covariates=0 in which case I need to supply these values to survfit. But how do you supply covariate values to survfit? Shouldn''t they be supplied to coxph in some way? Fredrik Lundgren "Thomas Lumley wrote may 26 2000 "> One way is to take -log of the survival function > survfit(a.coxph.fit)$surv > gives the survival function at the mean covariates, so > -log(survfit(a.coxph.fit)$surv) > gives the hazard at the mean covariates. > > Many people define the baseline hazard as the hazard at all covariates=0 > in which case you need to supply these values to survfit. This can be > numerically unstable and in any case is usually not what you really want. > > -thomas-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Fri, 26 May 2000, Fredrik Lundgren wrote:> summary of "How do I ask for the ''baseline hazard'' of a coxph fit??" > > Thank you to Thomas Lumley, Brian Ripley, Göran Bodström and Bendix > Cartensen for their tips to use plot(survfit(a.coxph.fit)). Thomas > Lumley suggests that to get the "real baseline hazard" I should use > -log(survfit(a.coxph.fit)$surv) and set all covariates=0 in which case > I need to supply these values to survfit. But how do you supply > covariate values to survfit? Shouldn''t they be supplied to coxph in > some way?No, to survfit. coxph fits the model, it doesn''t care what you do to it afterwards. If you look at the documentation for survfit() it has a data= argument to specify which covariates values you want the survival curve for. -thomas Thomas Lumley Assistant Professor, Biostatistics University of Washington, Seattle -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Hi R experts, I have a problem. . . I am using R 1.1 in Win98 if I do: age <- c( 20,25,30,50,20,30) sex <- c("M","F","F","M","M","F") hgt <- c(1.80,1.65,1.70,1.75,1.85,1.68) obs <-c("first","second","third","fourth","fifth","sixth") base<-data.frame(rbind(age=age,sex=sex,hgt=hgt,obs=obs)) base age sex hgt obs 1 20 M 1.8 first 2 25 F 1.65 second 3 30 F 1.7 third 4 50 M 1.75 fourth 5 20 M 1.85 fifth 6 30 F 1.68 sixth i.e. I created a data.frame with I think, 2 real columns, 1 factor and 1 character. . . I thought lage<-log(age) base2<-cbind(base,lage) base2 age sex hgt obs lage 1 20 M 1.8 first 2.995732 2 25 F 1.65 second 3.218876 3 30 F 1.7 third 3.401197 4 50 M 1.75 fourth 3.912023 5 20 M 1.85 fifth 2.995732 6 30 F 1.68 sixth 3.401197 I can add a numeric column with no problems. . .I thought Now I want to add a new observation. . log(40) [1] 3.688879 new.guy <- c(40,"M",1.82,"seventh",3.688879) base<-rbind(base,new.guy) Warning messages: 1: invalid factor level, NAs generated in: [<-.factor(*tmp*, ri, value = "40") 2: invalid factor level, NAs generated in: [<-.factor(*tmp*, ri, value = "1.82") base$age [1] 20 25 30 50 20 30 NA Levels: 20 25 30 50 base$sex [1] M F F M M F M Levels: F M base$age [1] 20 25 30 50 20 30 NA Levels: 20 25 30 50 base$lage [1] 2.99573227355399 3.2188758248682 3.40119738166216 3.91202300542815 [5] 2.99573227355399 3.40119738166216 3.688879 Levels: 2.99573227355399 3.2188758248682 3.40119738166216 3.688879 3.91202300542815>i.e. I can not add a new value of age, All are factors!!! including log(age)! why? so now I use the I construct as in the help base<-data.frame(cbind(age=I(age),sex=I(sex),hgt=I(hgt)))> baseage sex hgt 1 20 M 1.8 2 25 F 1.65 3 30 F 1.7 4 50 M 1.75 5 20 M 1.85 6 30 F 1.68> new.guy<-c(40,"B",1.82) > base2<-rbind(base,new.guy)Warning messages: 1: invalid factor level, NAs generated in: [<-.factor(*tmp*, ri, value = "40") 2: invalid factor level, NAs generated in: [<-.factor(*tmp*, ri, value = "B") 3: invalid factor level, NAs generated in: [<-.factor(*tmp*, ri, value = "1.82")> is.factor(base$age)[1] TRUE>Any help in how to turn all numeric factors into numeric, character vectors into character and how to add new observations to a data.frame . As always, many thanks to the members of the list. . R. Heberto Ghezzo Ph.D. Meakins-Christie Labs McGill University Montreal - Canada heberto at meakins.lan.mcgill.ca -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Mark Myatt wrote:> > How do I check if a library has already been loaded and then load it if > required?To check, which libraries are loaded: search() But you might want to use require() Uwe Ligges -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Thu, 14 Sep 2000, Uwe Ligges wrote:> Mark Myatt wrote: > > > > How do I check if a library has already been loaded and then load it if > > required? > > To check, which libraries are loaded: > search() > > > But you might want to use > require()Actually, just library(foo) does exactly what is required, that is to load the library if it is not already loaded. require() checks for the use of provide(), but that is deprecated now so require() is really just a wrapper for library(). Brian -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._