I don't use R much, and I have been unable to figure out how to get the
subset of my data frame that I would like.

For example, if this were my data frame:

> dfr <- data.frame(x=rep(letters[1:3], 4), y=(1:12), z=(LETTERS[1:12]))
> dfr
   x  y z
1  a  1 A
2  b  2 B
3  c  3 C
4  a  4 D
5  b  5 E
6  c  6 F
7  a  7 G
8  b  8 H
9  c  9 I
10 a 10 J
11 b 11 K
12 c 12 L

I would like to randomly select one row for each level of the factor x
and create a new data frame with the results. For example, the result
might be:

  x y z
1 a 1 A
5 b 5 E
6 c 6 F

Any help would be greatly appreciated!

Thanks,
Kelly

--
K. Kelly Hildner, Ph.D.
NOAA Fisheries
Southwest Fisheries Science Center
110 Shaffer Rd.
Santa Cruz, CA 95060
Gabor Grothendieck
2006-Apr-20 05:31 UTC
[R] Randomly selecting one row for each factor level
Try this:

set.seed(1)
f <- function(x) {
  dd <- dfr[dfr$x == x,]
  dd[sample(nrow(dd),1),]
}
do.call("rbind", lapply(levels(dfr$x), f ))
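A note for readers on current R: since R 4.0.0, data.frame() no longer converts strings to factors by default, so levels(dfr$x) returns NULL unless x is made a factor explicitly. The sketch below is one possible variant of the same idea, not the method from the reply: it splits the row indices by x and draws one index per group. The names pick_one, rows and result are made up for illustration, and it assumes every level of x has at least one row.

```r
# Same example data as in the question. stringsAsFactors = TRUE makes x a
# factor, which was the default behaviour before R 4.0.0.
dfr <- data.frame(x = rep(letters[1:3], 4), y = 1:12, z = LETTERS[1:12],
                  stringsAsFactors = TRUE)

set.seed(1)  # only so the draw is reproducible

# Draw one element from a vector of row indices. Indexing into idx avoids
# the special case where sample(n, 1) with a single number n draws from 1:n.
# (hypothetical helper; assumes idx is not empty)
pick_one <- function(idx) idx[sample.int(length(idx), 1)]

# Group the row numbers by x, then pick one row number from each group.
rows <- vapply(split(seq_len(nrow(dfr)), dfr$x), pick_one, integer(1))

# One randomly chosen row per level of x, as a new data frame.
result <- dfr[rows, ]
result
```

Because split() groups by value, this should also work when x is a plain character column rather than a factor.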