Kristi Glover
2018-Sep-11 06:52 UTC
[R] loop for comparing two or more groups using bootstrapping
Hi R users,

I am trying to test the null hypothesis that the difference between two groups is 0. I have data for many years (year1, year2, year3, year4), and I want to compare year1 with year2, year1 with year3, year2 with year3, and so on. I have used the following code with example data.

I tried to write a loop to compare many years, but it did not work. I also want to obtain the exact p-value. Would you mind helping me write the loop?

Thanks for your help.

KG

daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,
0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,
0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,
0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,
0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,
0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,
0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,
0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,
0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,
0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,
0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,
0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,
0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,
0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,
0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,
0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,
0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,
0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,
0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,
0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,
0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,
0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names = c("year1",
"year2", "year3", "year4"), row.names = c(NA, -50L), class = "data.frame")

head(daT)

# null hypothesis: difference is equal to zero
dif1.2<-daT$year2-daT$year1
k=10000
mysamples1.2=replicate(k, sample(dif1.2, replace=TRUE))
mymeans1.2=apply(mysamples1.2, 2, mean)
quantile(mymeans1.2, c(0.025, 0.975))
hist(mysamples1.2)
mean(mymeans1.2)
# what is the p-value?

# similarly, now I want to compare year1 and year3
dif1.3<-daT$year3-daT$year1
mysamples1.3=replicate(k, sample(dif1.3, replace=TRUE))
mymeans1.3=apply(mysamples1.3, 2, mean)
quantile(mymeans1.3, c(0.025, 0.975))
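On the "what is p value?" comment: one common way to get a p-value from this kind of resampling is to impose the null first. Shift the paired differences so their mean is exactly zero, resample those, and count how often a resampled mean is at least as extreme as the observed one. The sketch below assumes that approach and a two-sided test. It uses only the first ten rows of year1 and year2 from the data above so that it runs on its own; the seed and the +1 correction are choices made here, not part of the original post.

```r
# First ten rows of year1 and year2 from the post (subset for a self-contained sketch)
year1 <- c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606, 0.667, 0.7, 0.545, 0.462)
year2 <- c(0.385, 0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588)

set.seed(1)          # arbitrary seed, only so the run is reproducible
k <- 10000
d <- year2 - year1   # paired differences
obs <- mean(d)       # observed mean difference

# Impose the null hypothesis: centre the differences so their mean is 0
d0 <- d - obs
null_means <- replicate(k, mean(sample(d0, replace = TRUE)))

# Two-sided p-value: share of null means at least as extreme as the observed mean.
# The +1 terms (an assumption of this sketch) keep the estimate above 0.
p_value <- (sum(abs(null_means) >= abs(obs)) + 1) / (k + 1)
p_value
```

The result is a Monte Carlo estimate rather than an exact p-value; it changes slightly with k and the seed, and its smallest possible value is 1/(k + 1).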
Jim Lemon
2018-Sep-11 07:44 UTC
[R] loop for comparing two or more groups using bootstrapping
Hi Kristi,

Try this:

colname.mat<-combn(paste0("year",1:4),2)
samplenames<-apply(colname.mat,2,paste,collapse="")
k<-10000
for(column in 1:ncol(colname.mat)) {
 assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
}

Then use get(samplenames[1]) and so on to access the values.

Jim

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
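A note on the loop above: sample(..., 3, TRUE) draws three values from the pooled values of both years, which is not the same as resampling the paired differences used in the original post. A minimal sketch of a loop over all pairs that resamples the differences instead, and keeps the results in a named list rather than going through assign() and get(), might look like this. It assumes the later-minus-earlier convention from the original post and a percentile interval, uses only the first six rows of the posted data to stay self-contained, and the names pairs, pair_labels, boot_means and ci are illustrative.

```r
# First six rows of the posted data (subset for a self-contained sketch)
daT <- data.frame(
  year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606),
  year2 = c(0.385, 0.552, 0.645, 0.516, 0.629, 0.595),
  year3 = c(0.435, 0.759, 0.759, 0.759, 0.714, 0.593),
  year4 = c(0.333, 0.533, 0.6, 0.483, 0.743, 0.5)
)

set.seed(42)   # arbitrary seed, only so the run is reproducible
k <- 2000

# 2 x 6 matrix: each column holds one pair of column names
pairs <- combn(names(daT), 2)
pair_labels <- apply(pairs, 2, function(p) paste(p[2], p[1], sep = "-"))

# For each pair, bootstrap the mean of the paired differences
boot_means <- lapply(seq_len(ncol(pairs)), function(j) {
  d <- daT[[pairs[2, j]]] - daT[[pairs[1, j]]]   # later year minus earlier year
  replicate(k, mean(sample(d, replace = TRUE)))
})
names(boot_means) <- pair_labels

# Percentile 95% interval for each comparison, one row per pair
ci <- t(sapply(boot_means, quantile, probs = c(0.025, 0.975)))
ci
```

Each comparison is then available by name, for example boot_means[["year2-year1"]], and further summaries can be added with another sapply() over the same list.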
Kristi Glover
2018-Sep-11 13:55 UTC
[R] loop for comparing two or more groups using bootstrapping
Dear Jim,

Thank you very much for the code. I ran it, but it gave me row names like "year224" and "year142". Are these the differences between columns?

If we want the bootstrapped means of the differences between years (year2-year1; year3-year1), their CIs, and the exact p-value, how can we get them?

Thanks,

KG

----

head(daT)
colname.mat<-combn(paste0("year",1:4),2)
samplenames<-apply(colname.mat,2,paste,collapse="")
k<-10
for(column in 1:ncol(colname.mat)) {
 assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
}

> get(samplenames[1])
         [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
year224 0.556 0.667 0.571 0.526 0.629 0.696 0.323 0.526 0.256 0.667
year142 0.324 0.324 0.706 0.638 0.600 0.294 0.612 0.688 0.432 0.387
year237 0.571 0.696 0.629 0.471 0.462 0.471 0.452 0.595 0.333 0.435
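On the row-name question: the names are not differences. unlist() on a data frame names each element by column name plus row number, so "year224" is row 24 of year2 and "year142" is row 42 of year1, which matches the posted data (0.556 and 0.324). sample() carries those names along, and replicate() appears to keep only the names of the first draw as row names, so they describe the first column and nothing else. A small sketch with two rows of the posted data shows the naming; the object names pooled and m and the seed are illustrative.

```r
# Two rows of year1 and year2 from the post are enough to show the naming
daT <- data.frame(year1 = c(0.417, 0.538), year2 = c(0.385, 0.552))

# Pool both columns into one named vector, as in the loop from the thread
pooled <- unlist(daT[, c("year1", "year2")])
names(pooled)
# "year11" "year12" "year21" "year22"

set.seed(7)   # arbitrary seed, only so the run is reproducible
m <- replicate(4, sample(pooled, 3, TRUE))

# The row names label the values drawn in column 1 only
rownames(m)
m[, 1]
```

Wrapping the sampled vector in unname() before calling sample() would avoid the misleading labels.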