Hi R users,

I have a very large data set that has two conditioning variables for the test I want to perform.

A toy set can be simulated:

type<-sample(1:3,100,replace=TRUE)
class<-sample(1:20,100,replace=TRUE)
value<-rnorm(100)
data<-cbind(type,class,value)

(though type and class are alphanum)

I want to perform the three pair-wise t-tests between types for each class in data.

Can someone help me out with this?

Any help is greatly appreciated.
Dan
Hi Jorge,

That is exactly what I wanted - I should have given a reasonable number of observations (my set has *almost* all paired observations, so it will still break with that approach unless I manicure the data set).

Is there a way to fail nicely on a single one of the tests without the whole thing failing?

again, thanks for your help
Dan

On 25/03/2009, at 7:46 AM, Jorge Ivan Velez wrote:
> # Data
> set.seed(1)
> x<-sample(1:3,100,replace=TRUE)
> y<-sample(1:20,100,replace=TRUE)
> z<-rnorm(100)
> Data<-data.frame(x,y,z)
>
> # Observations for Type and Class
> with(Data, table(x,y))
>
> # Splitting the data by Class
> SD<-with(Data,split(Data,y))
>
> res<-lapply(SD, function(.data){
>     # Type combinations by Class
>     combs<-t(combn(sort(unique(.data[,1])),2))
>
>     # Applying the t-test for them
>     apply(combs,1, function(.r){
>         x1<-.data[.data[,1]==.r[1],3] # select third column
>         x2<-.data[.data[,1]==.r[2],3] # select third column
>         tvalue<-t.test(x1,x2)
>         res<-c(tvalue$statistic,tvalue$parameter,tvalue$p.value)
>         names(res)<-c('stat','df','pvalue')
>         res
>     })
> })
>
> res
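
One possible way to let a single test fail without stopping the whole run is to wrap t.test() in tryCatch() and return NA values when it errors (for example when a type has fewer than two observations in a class). The following is only a sketch under that assumption, not the approach anyone in the thread confirmed; safe_t is a hypothetical helper name, and the toy data mirror the simulation quoted above.

```r
# Toy data as in the thread; many (type, class) cells have fewer than 2 observations
set.seed(1)
Data <- data.frame(x = sample(1:3, 100, replace = TRUE),
                   y = sample(1:20, 100, replace = TRUE),
                   z = rnorm(100))

# safe_t (hypothetical helper): run t.test(), return NAs instead of stopping on error
safe_t <- function(x1, x2) {
  tryCatch({
    tt <- t.test(x1, x2)
    c(stat = unname(tt$statistic), df = unname(tt$parameter), pvalue = tt$p.value)
  }, error = function(e) c(stat = NA_real_, df = NA_real_, pvalue = NA_real_))
}

res <- lapply(split(Data, Data$y), function(.data) {
  types <- sort(unique(.data$x))
  if (length(types) < 2) return(NULL)   # only one type present: nothing to compare
  combs <- t(combn(types, 2))           # one row per pair of types
  out <- t(apply(combs, 1, function(.r)
    safe_t(.data$z[.data$x == .r[1]], .data$z[.data$x == .r[2]])))
  data.frame(type1 = combs[, 1], type2 = combs[, 2], out)
})

# Drop classes with nothing to compare and stack the rest into one table
res_all <- do.call(rbind, Filter(Negate(is.null), res))
```

Rows with NA in pvalue mark the comparisons that could not be run, so they can be inspected separately rather than silently disappearing.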
... and you will end up, in your example, with 60 t-statistics and p-values (3 pairs of types x 20 classes), so will you do a Bonferroni adjustment or something like that?! Sometimes the question "How do I ..." should be read as "What is the question I *really* want answered ...". You may want to consider a more sophisticated analysis.

Dan Kortschak wrote:
> I want to perform the three pair-wise t-tests between types for each
> class in data.
That is a valid point. The number of samples I expect to be different is actually quite small, but it is supportable (or otherwise) by other experimental data.

Unfortunately the question I really want answered is pretty much covered by doing this.

thanks
Dan

On 25/03/2009, at 10:25 AM, Eik Vettorazzi wrote:
> .. and you will end up - in your example - with 60 t-statistics and
> p-values (so you do Bonferroni adjustment or something like that)?!
> Sometimes the question for "How do I ..." should be read as "What is
> the question I *really* want to be answered ...". You may consider
> doing some more sophisticated analysis.
So you want to find a needle in a haystack, which is not an easy task. You should account for multiple testing, which as far as I can see is not done in the code yet. Otherwise you have to accept that you will find a bunch of hay that happens to look pretty much like a needle.
There are established approaches to this kind of problem, for instance for finding relevant SNPs in microarray data. Maybe your task is quite similar.

Eik

Dan Kortschak wrote:
> That is a valid point, the number of samples I expect to be different
> is actually quite small, but it is supportable (or otherwise) by other
> experimental data.
>
> Unfortunately the question I really want answered is pretty much
> covered by doing this.

-- 
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/42803-8243
F ++49/40/42803-7790
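
As a rough illustration of the multiple-testing point, base R's p.adjust() can correct a vector of p-values. The sketch below is an assumption-laden example, not a recommendation from the thread: it uses 60 uniform random numbers as stand-ins for the p-values of 60 tests in which no real difference exists, and the 0.05 cutoff is chosen only for illustration.

```r
# Stand-in p-values: under the null hypothesis p-values are uniform on [0, 1],
# so runif() mimics 60 tests where nothing is really different (an assumption)
set.seed(1)
p <- runif(60)

p_bonf <- p.adjust(p, method = "bonferroni")  # controls the family-wise error rate
p_bh   <- p.adjust(p, method = "BH")          # controls the false discovery rate

# Number of "significant" results at 0.05 before and after adjustment
c(raw = sum(p < 0.05), bonferroni = sum(p_bonf < 0.05), BH = sum(p_bh < 0.05))
```

Any raw p-values below 0.05 here are the "hay that looks like a needle"; the adjusted values are never smaller than the raw ones, so fewer of them (often none) pass the same cutoff.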