Using this data as an example:

dat <- read.table(textConnection("Id myvar
12 1
12 2
12 6
34 9
34 4
34 8
65 15
65 23"), header = TRUE)
closeAllConnections()

how can I create another data set that does not have duplicate entries for 'Id', with the included values randomly selected from the available ones?

Thanks!

Juliet
Try this:

do.call(rbind, lapply(split(dat, dat$Id), function(x) x[sample(1:nrow(x), 1), ]))

On 7/9/08, Juliet Hannah <juliet.hannah at gmail.com> wrote:
> how can I create another data set that does not have duplicate entries
> for 'Id', but the included values
> are randomly selected from the available ones.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
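The split/lapply/rbind idiom above can be checked with a short, reproducible sketch. The seed value and the name `picked` are only illustrative assumptions, and the data frame is built directly with `data.frame()` instead of a text connection. `sample.int(n, 1)` draws one row index from 1..n, which is the same draw as `sample(1:nrow(x), 1)`.

```r
# Same example data as in the original post, built without a text connection.
dat <- data.frame(Id    = c(12, 12, 12, 34, 34, 34, 65, 65),
                  myvar = c(1, 2, 6, 9, 4, 8, 15, 23))

set.seed(1)  # arbitrary seed, only so the random draw is repeatable

# Split into one data frame per Id, draw one row index per group, recombine.
picked <- do.call(rbind, lapply(split(dat, dat$Id), function(x) {
  x[sample.int(nrow(x), 1), , drop = FALSE]
}))

# Sanity checks: exactly one row per Id, no duplicated Ids.
stopifnot(nrow(picked) == length(unique(dat$Id)))
stopifnot(!any(duplicated(picked$Id)))

print(picked)
```

Because whole rows are kept, this form also works unchanged when the data frame has more columns than 'Id' and 'myvar'.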
How about this:

> dat <- read.table(textConnection("Id myvar
+ 12 1
+ 12 2
+ 12 6
+ 34 9
+ 34 4
+ 34 8
+ 65 15
+ 65 23"), header = TRUE)
> closeAllConnections()
> # split by the id and then choose one row at random
> # (nrow(), not length(): length() of a data frame counts its columns)
> x <- lapply(split(dat, dat$Id), function(.grp){
+     .grp[sample(seq_len(nrow(.grp)), 1),]
+ })
> do.call(rbind, x)
   Id myvar
12 12     1
34 34     9
65 65    15

On Wed, Jul 9, 2008 at 3:17 PM, Juliet Hannah <juliet.hannah at gmail.com> wrote:
> how can I create another data set that does not have duplicate entries
> for 'Id', but the included values
> are randomly selected from the available ones.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
on 07/09/2008 02:17 PM Juliet Hannah wrote:
> how can I create another data set that does not have duplicate entries
> for 'Id', but the included values
> are randomly selected from the available ones.

> aggregate(dat$myvar, list(dat$Id), sample, 1)
  Group.1  x
1      12  6
2      34  4
3      65 15

> aggregate(dat$myvar, list(dat$Id), sample, 1)
  Group.1  x
1      12  2
2      34  9
3      65 15

> aggregate(dat$myvar, list(dat$Id), sample, 1)
  Group.1  x
1      12  1
2      34  8
3      65 23

HTH,

Marc Schwartz
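One caveat with passing `sample` directly to `aggregate`: when `sample(x, 1)` is given a single number x >= 1, it draws from 1:x instead of returning x. Every Id in the example data has at least two values, so the calls above are unaffected, but an Id with only one row could return a value that is not in the data. The sketch below is one possible guard, not part of the original replies. The extra Id 99 row and the helper name `pick_one` are hypothetical, added only to show the single-value case.

```r
# Example data from the thread plus one hypothetical Id (99) with a single row.
dat <- data.frame(Id    = c(12, 12, 12, 34, 34, 34, 65, 65, 99),
                  myvar = c(1, 2, 6, 9, 4, 8, 15, 23, 50))

# Hypothetical helper: sample a position, then index, so a length-one
# vector always returns its only element.
pick_one <- function(v) v[sample.int(length(v), 1)]

set.seed(42)  # arbitrary seed, only so the random draw is repeatable
res <- aggregate(dat$myvar, list(Id = dat$Id), pick_one)

# The singleton group can only ever yield its own value.
stopifnot(res$x[res$Id == 99] == 50)

print(res)
```

Naming the grouping list (`list(Id = dat$Id)`) also labels the first output column `Id` instead of `Group.1`.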