Dear all,

I am trying to generate bootstrap replicate matrices (rows = samples, columns = species, sampling with replacement) from a matrix dataset, but I do not know how to do it in R. I have tried boot() and bootstrap(), but they require a statistic, which in my case is a cluster analysis (generating bootstrap values for a cluster analysis is a topic that has been mentioned previously on this list). I have been trying to use sample() and matrix() to generate the replicate matrix, but they seem to generate a single vector rather than the entire matrix. What I want is to resample the entire matrix by resampling the columns (species), so that the bootstrap values give me an idea of how similar the samples are. Any ideas will be very helpful. An example of the data matrix is below.

Thanks

Hector

          X36C  X40C  X58C  X60C  X62C  X66C  X77C  X92C  X95C  X96C X107C X109C X116C
26Y          0     0     0    59   919   351   128     0   104   214     0     0     0
C-0          0     0     0   368  1343  1826   211     0   253   352     0     0     0
C-50         0     0     0   211  1032  1701    50     0    54    56     0     0     0
C-90        64     0    65   260   769   876     0     0    87     0     0    91    96
C-127-1      0     0   127   149   364  3990     0     0     0     0     0     0     0
C-164        0     0     0    68   179  2373     0     0   105     0     0     0     0
C-198        0     0     0    89   327  1458   314     0   209   298     0     0     0
C-226        0     0     0     0   206   858     0     0   363   304     0     0     0
C-268        0     0     0    75   270   629     0     0   107     0     0     0     0
C-294-C     54     0     0   112   379   753     0   220   823   325     0     0     0
C-310        0     0     0     0   116   305     0   396  1049   355     0     0     0
C-357-2     96     0     0   445   201   405     0   114  2265     0   178    99   125
C-375       90     0    56   231   385   817     0   211  2776     0    57    79   106
C-399      110     0    50   563  1060  1244     0   414  2933     0    54   107   123
C-414       64     0     0   197   408   825     0   111  1875     0     0    82   104
C-428       63     0     0    80   100   695     0   162  2374     0   481   132   369
C-434        0     0     0   269   261  1689     0  2923  3496     0     0     0     0
C-454       77     0     0   257   170   963     0   377  3984     0     0    90    96
C-465        0     0     0   234   406   860     0   428  1601     0     0     0     0
C-479      111     0     0   349   297  1538    51   494  3753     0    75   102    95
On Tue, 2003-09-09 at 05:11, Hector L. Ayala-del-Rio wrote:
> I am trying to generate bootstrap replicate matrices (rows=samples,
> column=species, sampling with replacement) from a matrix dataset, but I do
> not know how to do it in R. [...]
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Hi Hector,

I'm not sure I've understood your problem; you should describe your data so that people can fully understand it.

I think you should try the boot function. It already has a lot of analyses programmed that are extremely useful. Your statistic must be the result of an R function applied to your dataset. Just be careful that the result of your function always has the same dimension, otherwise boot will fail.

Regarding the species issue, what I understand is that you want to bootstrap the observations of each species independently and then compute the statistic. You can do that with the "strata" argument of boot. Change the matrix to a data frame with columns for species, samples and observations, and tell boot that species is the strata.

Hope this helps

EJ

--
Ernesto Jardim <ernesto at ipimar.pt>
Biólogo Marinho/Marine Biologist
IPIMAR - Instituto Nacional de Investigação Agrária e das Pescas
IPIMAR - National Research Institute for Agriculture and Fisheries
Av. Brasilia, 1400-006 Lisboa, Portugal
Tel: +351 213 027 000
Fax: +351 213 015 948
http://ernesto.freezope.org
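
One possible reading of the advice above, sketched under stated assumptions rather than as what Ernesto had in mind: boot() resamples rows, so the matrix is transposed to put species in rows, and the statistic returns the between-sample distances as a vector of fixed length. The toy matrix `counts` and the function name `sample_dist` are hypothetical, and the "strata" suggestion is not shown here.

```r
library(boot)

# Toy counts standing in for the real data: 6 samples (rows) x 8 species (columns).
set.seed(1)
counts <- matrix(rpois(48, lambda = 20), nrow = 6,
                 dimnames = list(paste0("S", 1:6), paste0("sp", 1:8)))

# boot() resamples rows, so pass the transpose: one row per species.
# The statistic returns the between-sample distances as a plain vector,
# whose length (choose(6, 2) = 15) is the same for every replicate.
sample_dist <- function(species_by_sample, idx) {
  resampled <- species_by_sample[idx, , drop = FALSE]
  as.vector(dist(t(resampled)))
}

res <- boot(t(counts), statistic = sample_dist, R = 200)
dim(res$t)   # one row per replicate, one column per pair of samples
```

Each row of `res$t` can be turned back into a distance structure and passed to a clustering routine, which sidesteps the problem that a dendrogram itself is not a fixed-dimension result.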
Putting aside the issue of whether you should be using boot() or not, you can resample your matrix by doing something like this.

> a <- matrix(1:10, nrow=10, ncol=10, byrow=TRUE)
> a
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1    2    3    4    5    6    7    8    9    10
 [2,]    1    2    3    4    5    6    7    8    9    10
 [3,]    1    2    3    4    5    6    7    8    9    10
 [4,]    1    2    3    4    5    6    7    8    9    10
 [5,]    1    2    3    4    5    6    7    8    9    10
 [6,]    1    2    3    4    5    6    7    8    9    10
 [7,]    1    2    3    4    5    6    7    8    9    10
 [8,]    1    2    3    4    5    6    7    8    9    10
 [9,]    1    2    3    4    5    6    7    8    9    10
[10,]    1    2    3    4    5    6    7    8    9    10
> b <- sample(1:10, replace=TRUE)
> b
 [1]  2 10  7 10  5  1  4  5  9  5
> d <- a[,b]
> d
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    2   10    7   10    5    1    4    5    9     5
 [2,]    2   10    7   10    5    1    4    5    9     5
 [3,]    2   10    7   10    5    1    4    5    9     5
 [4,]    2   10    7   10    5    1    4    5    9     5
 [5,]    2   10    7   10    5    1    4    5    9     5
 [6,]    2   10    7   10    5    1    4    5    9     5
 [7,]    2   10    7   10    5    1    4    5    9     5
 [8,]    2   10    7   10    5    1    4    5    9     5
 [9,]    2   10    7   10    5    1    4    5    9     5
[10,]    2   10    7   10    5    1    4    5    9     5

HTH,

Jim

James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

>>> "Hector L. Ayala-del-Rio" <ayalahec at msu.edu> 09/09/03 12:11AM >>>
I am trying to generate bootstrap replicate matrices (rows=samples, column=species, sampling with replacement) from a matrix dataset, but I do not know how to do it in R. [...]
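
The column-resampling idea in Jim's reply (d <- a[,b]) can be repeated to build many replicate matrices and cluster each one. The following is only a minimal sketch under assumptions not taken from the thread: the toy matrix `counts`, the number of replicates `n_boot`, the number of groups `k`, and the use of dist(), hclust() and cutree() with their defaults are all illustrative choices.

```r
# Toy counts standing in for the real data: 6 samples (rows) x 8 species (columns).
set.seed(42)
counts <- matrix(rpois(48, lambda = 20), nrow = 6,
                 dimnames = list(paste0("S", 1:6), paste0("sp", 1:8)))

n_boot <- 100   # number of replicates (arbitrary choice)
k <- 2          # number of groups to cut each tree into (arbitrary choice)

# Each replicate keeps every sample (row) and draws species (columns)
# with replacement, as in d <- a[, b].
replicates <- lapply(seq_len(n_boot), function(i) {
  counts[, sample(ncol(counts), replace = TRUE), drop = FALSE]
})

# Cluster the samples in each replicate and record which pairs of samples
# fall into the same group.
same_group <- lapply(replicates, function(m) {
  groups <- cutree(hclust(dist(m)), k = k)
  outer(groups, groups, "==")
})

# Proportion of replicates in which each pair of samples clustered together.
support <- Reduce(`+`, same_group) / n_boot
round(support, 2)
```

Entries of `support` near 1 mark pairs of samples that group together regardless of which species were drawn. This co-membership proportion is a simple stand-in for the bootstrap values asked about, not the per-node support reported by phylogenetic software.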