Kristi Glover
2018-Sep-11 06:52 UTC
[R] loop for comparing two or more groups using bootstrapping
Hi R users,
I was trying to test a null hypothesis of difference between two groups was 0. I
have many years data, such as year1, year2,, year3, year4 and I was trying to
compare between year1 and year2, year1 and year3, year2 and year3 and so on and
have used following code with an example data.
I tried to make a loop but did not work to compare between many years, and also
want to obtain the exact p value. Would you mind to help me to make a loop?
Thanks for your help.
KG
daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,
0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,
0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,
0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,
0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,
0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,
0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,
0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,
0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,
0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,
0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,
0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,
0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,
0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,
0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,
0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,
0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,
0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,
0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,
0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,
0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,
0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names =
c("year1",
"year2", "year3", "year4"), row.names = c(NA,
-50L), class = "data.frame")
head(daT)
# null hypothesis; difference is equal to zero
dif1.2<-daT$year2-daT$year1
k=10000
mysamples1.2=replicate(k, sample(dif1.2, replace=T))
mymeans1.2=apply(mysamples1.2, 2, mean)
quantile(mymeans1.2, c(0.025, 0.975))
hist(mysamples1.2)
mean(mymeans1.2)
#what is p value?
#similarly Now I want to compare between year 1 and year3,
dif1.3<-daT$year3-daT$year1
mysamples1.3=replicate(k, sample(dif1.3, replace=T))
mymeans1.3=apply(mysamples1.3, 2, mean)
quantile(mymeans1.3, c(0.025, 0.975))
[[alternative HTML version deleted]]
Jim Lemon
2018-Sep-11 07:44 UTC
[R] loop for comparing two or more groups using bootstrapping
Hi Kristy,
Try this:
colname.mat<-combn(paste0("year",1:4),2)
samplenames<-apply(colname.mat,2,paste,collapse="")
k<-10000
for(column in 1:ncol(colname.mat)) {
assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
}
Then use get(samplenames[1]) and so on to access the values.
Jim
On Tue, Sep 11, 2018 at 4:52 PM Kristi Glover <kristi.glover at
hotmail.com> wrote:>
> Hi R users,
>
> I was trying to test a null hypothesis of difference between two groups was
0. I have many years data, such as year1, year2,, year3, year4 and I was trying
to compare between year1 and year2, year1 and year3, year2 and year3 and so on
and have used following code with an example data.
>
>
> I tried to make a loop but did not work to compare between many years, and
also want to obtain the exact p value. Would you mind to help me to make a loop?
>
> Thanks for your help.
>
>
> KG
>
>
> daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,
>
> 0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,
>
> 0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,
>
> 0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,
>
> 0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,
>
> 0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,
>
> 0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,
>
> 0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,
>
> 0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,
>
> 0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,
>
> 0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,
>
> 0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,
>
> 0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,
>
> 0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,
>
> 0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,
>
> 0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,
>
> 0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,
>
> 0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,
>
> 0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,
>
> 0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,
>
> 0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,
>
> 0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names =
c("year1",
>
> "year2", "year3", "year4"), row.names = c(NA,
-50L), class = "data.frame")
>
> head(daT)
>
> # null hypothesis; difference is equal to zero
>
> dif1.2<-daT$year2-daT$year1
>
> k=10000
>
> mysamples1.2=replicate(k, sample(dif1.2, replace=T))
>
> mymeans1.2=apply(mysamples1.2, 2, mean)
>
> quantile(mymeans1.2, c(0.025, 0.975))
>
> hist(mysamples1.2)
>
> mean(mymeans1.2)
>
> #what is p value?
>
>
> #similarly Now I want to compare between year 1 and year3,
>
> dif1.3<-daT$year3-daT$year1
>
> mysamples1.3=replicate(k, sample(dif1.3, replace=T))
>
> mymeans1.3=apply(mysamples1.3, 2, mean)
>
> quantile(mymeans1.3, c(0.025, 0.975))
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Kristi Glover
2018-Sep-11 13:55 UTC
[R] loop for comparing two or more groups using bootstrapping
Dear Jim,
Thank you very much for the code. I run it but it gave me row names like
"year224", "year142".
are these the difference between columns? If we want to get bootstrapping means
of difference between years (year2-year1; year3-year1), its CI and exact p
value, how can we get it?
thanks
KG
----
head(daT)
colname.mat<-combn(paste0("year",1:4),2)
samplenames<-apply(colname.mat,2,paste,collapse="")
k<-10
for(column in 1:ncol(colname.mat)) {
assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
}
> get(samplenames[1])
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
year224 0.556 0.667 0.571 0.526 0.629 0.696 0.323 0.526 0.256 0.667
year142 0.324 0.324 0.706 0.638 0.600 0.294 0.612 0.688 0.432 0.387
year237 0.571 0.696 0.629 0.471 0.462 0.471 0.452 0.595 0.333 0.435
________________________________
From: Jim Lemon <drjimlemon at gmail.com>
Sent: September 11, 2018 1:44 AM
To: Kristi Glover
Cc: r-help mailing list
Subject: Re: [R] loop for comparing two or more groups using bootstrapping
Hi Kristy,
Try this:
colname.mat<-combn(paste0("year",1:4),2)
samplenames<-apply(colname.mat,2,paste,collapse="")
k<-10000
for(column in 1:ncol(colname.mat)) {
assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
}
Then use get(samplenames[1]) and so on to access the values.
Jim
On Tue, Sep 11, 2018 at 4:52 PM Kristi Glover <kristi.glover at
hotmail.com> wrote:>
> Hi R users,
>
> I was trying to test a null hypothesis of difference between two groups was
0. I have many years data, such as year1, year2,, year3, year4 and I was trying
to compare between year1 and year2, year1 and year3, year2 and year3 and so on
and have used following code with an example data.
>
>
> I tried to make a loop but did not work to compare between many years, and
also want to obtain the exact p value. Would you mind to help me to make a loop?
>
> Thanks for your help.
>
>
> KG
>
>
> daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,
> 0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,
> 0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,
> 0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,
> 0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,
> 0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,
> 0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,
> 0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,
> 0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,
> 0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,
> 0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,
> 0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,
> 0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,
> 0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,
> 0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,
> 0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,
> 0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,
> 0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,
> 0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,
> 0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,
> 0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,
> 0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names =
c("year1",
> "year2", "year3", "year4"), row.names = c(NA,
-50L), class = "data.frame")
>
> head(daT)
>
> # null hypothesis; difference is equal to zero
>
> dif1.2<-daT$year2-daT$year1
>
> k=10000
>
> mysamples1.2=replicate(k, sample(dif1.2, replace=T))
>
> mymeans1.2=apply(mysamples1.2, 2, mean)
>
> quantile(mymeans1.2, c(0.025, 0.975))
>
> hist(mysamples1.2)
>
> mean(mymeans1.2)
>
> #what is p value?
>
>
> #similarly Now I want to compare between year 1 and year3,
>
> dif1.3<-daT$year3-daT$year1
>
> mysamples1.3=replicate(k, sample(dif1.3, replace=T))
>
> mymeans1.3=apply(mysamples1.3, 2, mean)
>
> quantile(mymeans1.3, c(0.025, 0.975))
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
thz.ch/mailman/listinfo/r-help>
stat.ethz.ch
The main R mailing list, for announcements about the development of R and the
availability of new code, questions and answers about problems and solutions
using R, enhancements and patches to the source code and documentation of R,
comparison and compatibility with S and S-plus, and for the posting of nice
examples and benchmarks.
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]]