One of the things that have been annoying me with both Splus and R is that the "simple tests" are inconsistent with the lm/glm/dataframe conventions, and that they become quite awkward to use when data have to be extracted from a dataframe: t.test(data$bp[data$sex=="F" & data$age>25], data$bp[data$sex=="M" & data$age>25]) OK, so it's better to do eval(t.test(bp[sex=="F"],bp[sex=="M"]),subset(data,age>25)) but try explaining that to a class of MD's! Below is a first sketch of a set of functions that extend t.test to allow specification in terms of a list of vectors or a vector and a grouping variable or a model formula, allowing data= and subsetarguments as well. The paired test can also be specified using a Pairs(x,y) construction, which should help prevent people from using the wrong test. It's supposed to be downward compatible with the Splus syntax. I've been pushing Kurt to use a similar interface for all the ctest functions, so I thought you'd all like to see them. If nothing else, they should be interesting to study (they were definitely fun to write), since they use some *really* dirty tricks relating to R's evaluation model. Anyway, here goes (remember, I said *first* sketch - there is at least one bug in it, can you spot it?): ---- ### ### The first couple of lines just guard against multiple loading ### if(exists("t.test",envir=.GlobalEnv))rm(t.test) if(exists("t.test.default",envir=.GlobalEnv))rm(t.test.default) ### Utility function to extract the environment the call is evaluated ### in. If you do eval(x,list), list is found as the variable "envir" ### in the stack frame immediately below the local frame .ParentEnv<-function() { parent<-sys.parent() grandparent <- sys.parent(2) if ( parent - grandparent == 1 ) sys.frame(grandparent) else get("envir",envir=sys.frame(parent-1)) } .t.test<-t.test t.test.default<-function(x,...,group) { e<-.ParentEnv() call <- if ( is.list(x) ) substitute(t.test.list(x,...)) else if (missing(group)) substitute(.t.test(x,...)) else substitute(t.test.list(split(x, group),...)) eval(call,e) } t.test<-function(...,data=sys.frame(sys.parent()),subset) { dname.add<- if (!missing(data)) paste(", data frame:", deparse(substitute(data))) else "" if (!missing(subset)){ dname.add<-paste(dname.add, ", subset:", deparse(substitute(subset))) subset <- eval(substitute(subset),data) data <- data[subset,] # had better be a data frame... } res<-eval(substitute(t.test.generic(...)), data) res$data.name <- paste(res$data.name, dname.add) res } t.test.generic<-function(x,...) { UseMethod("t.test") } t.test.list<-function(l,...) { if (length(l) != 2) stop("need exactly 2 groups") res<-.t.test(l[[1]],l[[2]],...) res$data.name<-deparse(substitute(l)) nn<-names(l) if (length(nn) != 2) nn<-c("group 1", "group 2") names(res$estimate)<-nn res } t.test.formula<-function(f,...) { e<-.ParentEnv() f[[1]]<-as.name("split") call<-as.call(list(as.name("t.test.list"), f, ...)) eval(call,e) } Pairs<-function(x,y) structure(match.call(),class="paired") t.test.paired<-function(l,...) { e<-.ParentEnv() call<-c(as.list(l),list(...),list(paired=T)) call[[1]]<-as.name(".t.test") call<-as.call(call) eval(call,e) } -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._