arun
2013-May-01 21:43 UTC
[R] Multiple Paired T test from large Data Set with multiple pairs
Hi,

Assuming that your dataset is similar to the one below:

set.seed(25)
dat1 <- data.frame(Algae.Mass=sample(40:50,10,replace=TRUE),
                   Seagrass.Mass=sample(30:70,10,replace=TRUE),
                   Terrestrial.Mass=sample(80:100,10,replace=TRUE),
                   Other.Mass=sample(40:60,10,replace=TRUE),
                   Site.X.Treatment=rep(c("ALA1A","ALA1U"),each=5),
                   stringsAsFactors=FALSE)
library(reshape2)
dat2 <- melt(dat1,id.var="Site.X.Treatment")
sapply(split(dat2,dat2$variable),function(x)
  t.test(x[x$Site.X.Treatment=="ALA1A",3],x[x$Site.X.Treatment=="ALA1U",3],paired=TRUE)$p.value)
#      Algae.Mass    Seagrass.Mass Terrestrial.Mass       Other.Mass
#       1.0000000        0.4624989        0.4388211        0.7521036

#or
library(plyr)
ddply(dat2,.(variable),function(x)
  summarize(x,Pvalue=t.test(value~Site.X.Treatment,data=x,na.rm=TRUE,paired=TRUE)$p.value))
#          variable    Pvalue
#1       Algae.Mass 1.0000000
#2    Seagrass.Mass 0.4624989
#3 Terrestrial.Mass 0.4388211
#4       Other.Mass 0.7521036

A.K.

> Hey,
>
> I have a fairly large data set with multiple pairs of Sites. Each site has two levels (the pairs) "A" and "U". For each pair I want to do a paired t test of 4 different metrics that exist as columns in my data set.
>
> Here is the long version
>
> t.test(Algae.Mass[Site.X.Treatment=="ALA1A"],Algae.Mass[Site.X.Treatment=="ALA1U"], paired=T)
> t.test(Seagrass.Mass[Site.X.Treatment=="ALA1A"],Seagrass.Mass[Site.X.Treatment=="ALA1U"], paired=T)
> t.test(Terrestrial.Mass[Site.X.Treatment=="ALA1A"],Terrestrial.Mass[Site.X.Treatment=="ALA1U"], paired=T)
> t.test(Other.Mass[Site.X.Treatment=="ALA1A"],Other.Mass[Site.X.Treatment=="ALA1U"], paired=T)
>
> How can I do this in one line of code? I have tried lapply, tapply etc. but keep running into issues. It would also be great to not have to keep defining "Site.X.Treatment". I do have Site.X.Treatment broken down by just Site and Treatment in separate columns in the data set. Any ideas?
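Since the question mentions that Site and Treatment also exist as separate columns, the same idea can probably be extended to several sites at once. What follows is only a minimal base-R sketch under stated assumptions: the data frame `dat`, the site names "ALA1" and "ALA2", and the vector `metrics` are made up for illustration, and the rows within each site are assumed to be in matching order for "A" and "U" (a paired test is only meaningful if they are).

```r
# Hypothetical data: two sites, each with treatments "A" and "U",
# five paired samples per treatment. Not the poster's real data.
set.seed(25)
dat <- data.frame(
  Site = rep(c("ALA1", "ALA2"), each = 10),
  Treatment = rep(rep(c("A", "U"), each = 5), times = 2),
  Algae.Mass = sample(40:50, 20, replace = TRUE),
  Seagrass.Mass = sample(30:70, 20, replace = TRUE),
  Terrestrial.Mass = sample(80:100, 20, replace = TRUE),
  Other.Mass = sample(40:60, 20, replace = TRUE),
  stringsAsFactors = FALSE
)

# Columns to test (assumed names, taken from the question)
metrics <- c("Algae.Mass", "Seagrass.Mass", "Terrestrial.Mass", "Other.Mass")

# One paired t-test per site and per metric.
# The result is a matrix of p-values: metrics in rows, sites in columns.
pvals <- sapply(split(dat, dat$Site), function(d) {
  sapply(metrics, function(m) {
    t.test(d[d$Treatment == "A", m],
           d[d$Treatment == "U", m],
           paired = TRUE)$p.value
  })
})
pvals
```

Here `split()` handles the per-site grouping, so the treatment column only has to be named once inside the function. With many sites and metrics, the resulting p-values may need a multiple-comparison adjustment, for example with `p.adjust()`.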
arun
2013-May-02 17:58 UTC
[R] Multiple Paired T test from large Data Set with multiple pairs
My code was based on the assumption that your dataset was similar to the one I provided. Please provide an example dataset (use dput(head(dataset, 20))). See:
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

A.K.

> Arun,
>
> I have tried applying your suggestions to my data set but I cannot get it to work. I think my lack of R skills may be a contributing factor. I will keep trying though.
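For anyone unsure what the dput() request involves, here is a minimal sketch. The data frame `dataset` below is a made-up stand-in (its column names are borrowed from the question, its values are invented); the point is only that dput() prints R code that others can paste to rebuild the same rows.

```r
# Hypothetical stand-in for the poster's data; values are invented.
dataset <- data.frame(
  Site = rep(c("ALA1", "ALA2"), each = 4),
  Treatment = rep(c("A", "U"), times = 4),
  Algae.Mass = c(41L, 47L, 44L, 50L, 42L, 45L, 48L, 43L),
  stringsAsFactors = FALSE
)

# Print a text representation of the first 20 rows (here: all 8 rows).
# This printed text is what gets pasted into the email.
dput(head(dataset, 20))

# A reader can rebuild the data by evaluating the pasted text.
# capture.output() is used here only to simulate the copy and paste step.
txt <- paste(capture.output(dput(head(dataset, 20))), collapse = "\n")
restored <- eval(parse(text = txt))
str(restored)
```

Note that the row count goes inside head(), not dput(): dput(head(dataset, 20)).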