Hi,

I have a dataset that looks like the one below.

data
plot    plantno.    species
H       31          ABC
D       2           DEF
Y       54          GFE
E       12          ERF
Y       98          FVD
H       4           JKU
J       7           JFG
A       55          EGD
.       .           .
.       .           .
.       .           .

I want to select rows belonging to 7 random plots for 100 times.
(There are 50 plots in total)
So I created a list of 100 vectors, each vector has 7 elements.

    samp <- lapply(1:100, function(i) sample(LETTERS))
    samp2 <- lapply(samp2, "[", 1:7)

How can I select the 26 plots from 'data' using 'samp'?

    samp3 <- sample(LETTERS, 7)
    samp4 <- subset(data, plot %in% samp3) # this works
    samp5 <- subset(data, plot %in% samp2[[1]]) # this works as well, but

I used a for loop to get it to select 7 plots 100 times.

    for (i in nrow(samp2)) {
        samp6 <- subset(data, plot %in% samp2[[i]])
    } # this doesn't work

Am I missing something, or is there a better solution?

Thanks.
Kang Min
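A note on the loop in the question: `nrow()` is defined for matrices and data frames, and for a plain list it returns NULL, so `for (i in nrow(samp2))` has nothing to iterate over. Even on a data frame it would yield a single number rather than a sequence, and `samp6` would be overwritten on each pass. The following is only a minimal sketch of one way to repair it. It assumes a stand-in `data` built from the eight rows shown above, and it uses 5 draws instead of 100 to keep it small.

```r
# Hypothetical stand-in for the poster's data (only the 8 rows shown in the post).
data <- data.frame(
  plot     = c("H", "D", "Y", "E", "Y", "H", "J", "A"),
  plantno. = c(31, 2, 54, 12, 98, 4, 7, 55),
  species  = c("ABC", "DEF", "GFE", "ERF", "FVD", "JKU", "JFG", "EGD"),
  stringsAsFactors = FALSE
)

set.seed(1)  # only so the sketch is repeatable
samp2 <- lapply(1:5, function(i) sample(LETTERS, 7))

# For a plain list nrow() is NULL, so the original loop body never runs.
is.null(nrow(samp2))

# Iterate over the list indices and keep every result instead of overwriting.
samp6 <- vector("list", length(samp2))
for (i in seq_along(samp2)) {
  samp6[[i]] <- subset(data, plot %in% samp2[[i]])
}

length(samp6)  # one data frame per draw
```

Storing the results in a list means each of the draws stays available afterwards as `samp6[[1]]`, `samp6[[2]]`, and so on.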
try this:

> x <- read.table(textConnection("plot plantno. species
+ H 31 ABC
+ D 2 DEF
+ Y 54 GFE
+ E 12 ERF
+ Y 98 FVD
+ H 4 JKU
+ J 7 JFG
+ A 55 EGD"), header=TRUE, as.is=TRUE)
> closeAllConnections()
> # choose 10 groups of 3 samples
> choice <- lapply(1:10, function(.dummy){
+     x[sample(nrow(x),3),]
+ })
>
> choice
[[1]]
  plot plantno. species
3    Y       54     GFE
8    A       55     EGD
4    E       12     ERF

[[2]]
  plot plantno. species
8    A       55     EGD
2    D        2     DEF
6    H        4     JKU

[[3]]
  plot plantno. species
8    A       55     EGD
5    Y       98     FVD
4    E       12     ERF

........

On Sun, May 23, 2010 at 10:00 AM, Kang Min <ngokangmin at gmail.com> wrote:
> [...]
> I want to select rows belonging to 7 random plots for 100 times.
> (There are 50 plots in total)
> [...]
> Am I missing something, or is there a better solution?
>
> Thanks.
> Kang Min
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
On May 23, 2010, at 10:00 AM, Kang Min wrote:

> Hi,
>
> I have a dataset that looks like the one below.
>
> data
> plot    plantno.    species
> H       31          ABC
> D       2           DEF
> Y       54          GFE
> E       12          ERF
> Y       98          FVD
> H       4           JKU
> J       7           JFG
> A       55          EGD
> .       .           .
> .       .           .
> .       .           .
>
> I want to select rows belonging to 7 random plots for 100 times.

So you should be thinking about a function that will do what you want exactly once and then wrapping it in replicate().

> (There are 50 plots in total)
> So I created a list of 100 vectors, each vector has 7 elements.
>
> samp <- lapply(1:100, function(i) sample(LETTERS))

Please. "Minimal"!!! 5 samples should be enough for testing.

> samp2 <- lapply(samp2, "[", 1:7)
>
> How can I select the 26 plots from 'data' using 'samp'?
>
> samp3 <- sample(LETTERS, 7)

You do not want to sample from LETTERS but rather from the vector of data named "plot". Otherwise you will not be creating a representative sample. And ... "plot" is a really crappy name for a column. Try to avoid naming your columns with names that are common functions. Confusion of the humans reading your code is the predictable result, and occasional "confusion" of the R interpreter also may occur. [After reading your reply to Holtman.... Or maybe you do want to sample from LETTERS. The fix would be obvious.]

> samp4 <- subset(data, plot %in% samp3) # this works

So this is what you want to do once:

    samp1 <- function() subset(data, plot %in% sample(data$plot, 7))
    samp10 <- replicate(10, samp1())

samp10[,1] will be one sampled subset. (samp10 is now an array of lists.)

Unfortunately, I noticed that even with the minimal "data" example you provided (not in reproducible form, unfortunately) I was getting 7 or 8 samples, and realized that using letters to subset was creating some overlaps whenever "H" was sampled.
So this is safer:

    samp1 <- function() data[sample(1:nrow(data), 7), ]
    samp5 <- replicate(5, samp1())
    for (i in 1:5) print(samp5[,i])

Then I noticed your reply to Holtman, so perhaps you really do want the first solution. Just so you understand, it might not be statistically correct.

-- 
David.

> samp5 <- subset(data, plot %in% samp2[[1]]) # this works as well, but
> I used a for loop to get it to select 7 plots 100 times.
>
> for (i in nrow(samp2)) {
>     samp6 <- subset(data, plot %in% samp2[[i]])
> } # this doesn't work
>
> Am I missing something, or is there a better solution?
>
> Thanks.
> Kang Min

David Winsemius, MD
West Hartford, CT
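The overlap described above comes from drawing labels out of a column in which "H" and "Y" each occur twice. If the aim is a fixed number of distinct plots per draw, with all of their rows kept, one possible variation is to sample from `unique(data$plot)` and to ask `replicate()` for a plain list with `simplify = FALSE`. This is only a sketch under that assumption: `pick_plots`, `n_plots` and `draws` are hypothetical names, the stand-in `data` holds just the eight rows posted, and it draws 3 plots rather than 7 because those rows contain only 6 distinct plots.

```r
# Hypothetical stand-in for the poster's data (only the 8 rows shown in the post).
data <- data.frame(
  plot     = c("H", "D", "Y", "E", "Y", "H", "J", "A"),
  plantno. = c(31, 2, 54, 12, 98, 4, 7, 55),
  species  = c("ABC", "DEF", "GFE", "ERF", "FVD", "JKU", "JFG", "EGD"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: draw n_plots distinct plot labels, return all their rows.
pick_plots <- function(d, n_plots) {
  chosen <- sample(unique(d$plot), n_plots)
  d[d$plot %in% chosen, , drop = FALSE]
}

set.seed(42)  # only so the sketch is repeatable
# simplify = FALSE keeps each draw as an ordinary data frame in a list.
draws <- replicate(5, pick_plots(data, 3), simplify = FALSE)

# Number of distinct plots in each draw; every value is 3 by construction.
sapply(draws, function(d) length(unique(d$plot)))
```

Each draw can then be used directly as `draws[[1]]`, `draws[[2]]`, and so on. The number of rows still varies between draws, because plots contain different numbers of plants.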