I am new to R and I suspect my problem is easily solved, but I haven't been able to figure it out without using loops. I am trying to implement Blair & Karniski's (1993) permutation test. I've included a sample data frame below. This data frame represents the conditional means (C1, C2) for 3 subjects in 2 consecutive samples of a continuous data set (e.g. an ERP waveform). Each sample includes all possible permutations of the subject means (2^N), which is 8 in this case.

The problem: I need to run a paired t-test on each Sample x Permutation set and save the maximum t-value obtained for each sample. The real data set has 16 subjects (2^16 permutations) and 500 samples, which leads to more than 32 million t-tests. I have a loop version of the program working, but it would take a few weeks to complete the job, and I was hoping that someone could tell me how to do it faster.

Thank you kindly,
Matthew Finkbeiner

--------------------------------
"Sample" "C1" "C2" "PermN"
1 5 8 perm1
1 4 3 perm1
1 6 4 perm1
2 2 6 perm1
2 3 1 perm1
2 7 4 perm1
1 8 5 perm2
1 3 4 perm2
1 6 4 perm2
2 6 2 perm2
2 1 3 perm2
2 7 4 perm2
1 5 8 perm3
1 3 4 perm3
1 6 4 perm3
2 2 6 perm3
2 1 3 perm3
2 7 4 perm3
1 8 5 perm4
1 4 3 perm4
1 4 6 perm4
2 6 2 perm4
2 3 1 perm4
2 4 7 perm4
1 5 8 perm5
1 4 3 perm5
1 4 6 perm5
2 2 6 perm5
2 3 1 perm5
2 4 7 perm5
1 8 5 perm6
1 3 4 perm6
1 4 6 perm6
2 6 2 perm6
2 1 3 perm6
2 4 7 perm6
1 5 8 perm7
1 3 4 perm7
1 4 6 perm7
2 2 6 perm7
2 1 3 perm7
2 4 7 perm7
1 8 5 perm8
1 4 3 perm8
1 6 4 perm8
2 6 2 perm8
2 3 1 perm8
2 7 4 perm8

--
Dr. Matthew Finkbeiner
Senior Lecturer & ARC Australian Research Fellow
Macquarie Centre for Cognitive Science (MACCS)
Macquarie University, Sydney, NSW 2109
Phone: +61 2 9850-6718
Fax: +61 2 9850-6059
Homepage: http://www.maccs.mq.edu.au/~mfinkbei
Lab Homepage: http://www.maccs.mq.edu.au/laboratories/action/
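One possible loop-free formulation of the problem described above, offered as a sketch rather than a tested solution for the real data: in a paired design, swapping C1 and C2 for a subject only flips the sign of that subject's difference, so every permutation can be expressed as a row of +1/-1 signs, and all permutation means come out of a single matrix product. The names D, S, M, V, Tmat and max.t are made up for this illustration, and the data are the perm1 rows of the sample table.

```r
# Observed condition means from the perm1 rows of the sample table:
# rows = subjects, columns = samples
C1 <- cbind(s1 = c(5, 4, 6), s2 = c(2, 3, 7))
C2 <- cbind(s1 = c(8, 3, 4), s2 = c(6, 1, 4))
D  <- C1 - C2                 # paired differences, N x n.samples
N  <- nrow(D)

# All 2^N sign patterns: one row per permutation, entries +1 or -1
# (the first row is all +1, i.e. the observed data)
S <- as.matrix(expand.grid(rep(list(c(1, -1)), N)))

# Mean difference for every permutation x sample in one matrix product
M <- (S %*% D) / N

# The sum of squared differences does not change under sign flips,
# so the variance follows directly from the permuted mean
SS <- matrix(colSums(D^2), nrow(S), ncol(D), byrow = TRUE)
V  <- (SS - N * M^2) / (N - 1)

# Paired t-value for every permutation (rows) and sample (columns)
Tmat <- M / sqrt(V / N)

# Maximum t-value obtained for each sample
max.t <- apply(Tmat, 2, max)
```

With 16 subjects and 500 samples, Tmat would have 2^16 rows and 500 columns, which is roughly 260 MB of doubles; if that is too much, the columns of D could be processed in blocks. The sketch assumes no NAs and no permutation with zero variance.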
Hi Matt, see the example below. It took me a while to figure it out, so I
suggest you work through it carefully, step by step. It computes t-values
for a dataset with three variables and eight unique combinations of two
binning variables. The code should extend easily to larger datasets. It also
aggregates each variable only once and reuses the results for every pair,
which gains further efficiency. If you want a paired t-test, you will have
to adjust the t-value computation in tt.test below (I implemented an
unpaired, unequal-variance t-test). The code cannot handle NAs; you would
have to add that yourself if needed.
#Simulate data
data=data.frame(
grim=rnorm(100),
flik=rnorm(100,1,1),
prok=rnorm(100,0.5,1)
)
#Create a vector from 1 to the number of variables to be tested in your data
#(needed for indexing)
b=1:length(data)
#Create binning variables that jointly define unique combinations
id1=rep(c(1:4),each=25)
id2=rep(c(1:2),50)
#Get unique combinations of two variable names
comb.a=t(combn(names(data),2))
#same as above for numeric variable indicators
comb.b=t(combn(b,2))
#Get unique combinations of the binning variables
comb.id=expand.grid(unique(id1),unique(id2))
#Aggregate data needed to compute t-tests
#for each unique binning variable combination
#i.e., for each row in comb.id
meanss=aggregate(data,by=list(id1,id2),mean)
varss=aggregate(data,by=list(id1,id2),var)
Nss=aggregate(data,by=list(id1,id2),length)
#Define a function for the t-test
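tt.test=function(x){
  #x holds the two column indicators of the variables to compare;
  #the first two columns of the aggregates are the binning variables
  test.means=meanss[,2+x]
  test.vars=varss[,2+x]
  test.Ns=Nss[,2+x]
  (test.means[1]-test.means[2])/sqrt(test.vars[1]/test.Ns[1]+test.vars[2]/test.Ns[2])
}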
#Apply the t-test over each row of comb.b
#where the rows in comb.b serve as the
#column indicator x used in tt.test
results=apply(comb.b,1,tt.test)
#Show results
#Results are in order of the eight unique
#combinations of id1 and id2 in comb.id
#presented in blocks for each combination
#of two variables (grim/flik, grim/prok, flik/prok)
results
#Compare the first element of the results
#with the individual t-test of the bin
# id1==1 and id2==1
t.test(data$grim[id1==1&id2==1],data$flik[id1==1&id2==1])
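For the paired case, one way the adjustment might look (a sketch, not something I have run on the real data): aggregate the within-row differences instead of the two variables separately, then divide the mean difference by its standard error. It assumes the rows of the two variables are matched pairs; the names d, mean.d, var.d, n.d and paired.t are invented for this illustration.

```r
#Simulate two paired variables and the same binning variables as above
set.seed(1)
data=data.frame(grim=rnorm(100),flik=rnorm(100,1,1))
id1=rep(c(1:4),each=25)
id2=rep(c(1:2),50)

#Within-row differences (assumes rows are matched pairs)
d=data.frame(diff=data$grim-data$flik)

#Aggregate the differences for each unique binning variable combination
mean.d=aggregate(d,by=list(id1,id2),mean)
var.d=aggregate(d,by=list(id1,id2),var)
n.d=aggregate(d,by=list(id1,id2),length)

#Paired t-value for each bin: mean difference over its standard error
paired.t=mean.d$diff/sqrt(var.d$diff/n.d$diff)
paired.t

#Compare the first element with the individual paired t-test
#of the bin id1==1 and id2==1
sel=id1==1&id2==1
check=t.test(data$grim[sel],data$flik[sel],paired=TRUE)$statistic
```

The values in paired.t follow the row order of mean.d, so the first element belongs to the bin id1==1 and id2==1.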
Hope this helps,
Daniel