Hi all,

I have a data frame with two variables, group and its size.

mydat <- read.table(text = 'group count
G1 25
G2 15
G3 12
G4 31
G5 10', header = TRUE, as.is = TRUE)

I want to select group IDs randomly (without replacement) until the sum of count reaches 40. So, in the first case, the data frame could be

G4 31
G5 10

In another case, it could be

G5 10
G2 15
G3 12

How do I impose the restriction that the sum of the count variable is at least 40?

Thank you in advance
First expand your data frame into a vector where G1 is repeated 25 times, G2 is repeated 15 times, etc. Then draw a random sample of 40 from that vector:

> grp <- rep(mydat$group, mydat$count)
> grp.sam <- sample(grp, 40)
> table(grp.sam)
grp.sam
G1 G2 G3 G4 G5
10  9  5 13  3

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Val
Sent: Monday, February 11, 2019 4:36 PM
To: r-help at R-project.org (r-help at r-project.org) <r-help at r-project.org>
Subject: [R] Select

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Thank you, David. However, this will not work for me. If a group ID is selected, then all of its observations should be included.

On Mon, Feb 11, 2019 at 4:51 PM David L Carlson <dcarlson at tamu.edu> wrote:
>
> First expand your data frame into a vector where G1 is repeated 25 times, G2 is repeated 15 times, etc. Then draw random samples of 40 from that vector:
>
> > grp <- rep(mydat$group, mydat$count)
> > grp.sam <- sample(grp, 40)
> > table(grp.sam)
On 2019-02-11 23:35, Val wrote:
> I have a data frame with two variables, group and its size.
> mydat<- read.table( text='group count
> G1 25
> G2 15
> G3 12
> G4 31
> G5 10' , header = TRUE, as.is = TRUE )
>

How about

x <- sample(1:5)               # random order of the five rows
total <- mydat$count[x[1]]     # count of the first selected group
i <- 1
while (total < 40) {           # add groups until the sum reaches 40
    i <- i + 1
    total <- total + mydat$count[x[i]]
}
print(mydat$group[x[1:i]])     # IDs of the selected groups

Göran
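For what it's worth, the loop above can probably also be written without an explicit loop by shuffling the rows and taking a running total with cumsum(). This is only a minimal sketch: select_groups() and its target argument are hypothetical names chosen for illustration, not something from the thread, and it assumes all counts are positive and that the counts add up to at least the target.

```r
# Same data as in the original post
mydat <- read.table(text = "group count
G1 25
G2 15
G3 12
G4 31
G5 10", header = TRUE, as.is = TRUE)

# select_groups() is a hypothetical helper, not part of the thread.
# It keeps whole groups, drawn in random order without replacement,
# and stops at the first group that brings the total to at least 'target'.
select_groups <- function(dat, target = 40) {
  stopifnot(sum(dat$count) >= target)    # otherwise the target is unreachable
  ord <- sample.int(nrow(dat))           # random permutation of the row indices
  running <- cumsum(dat$count[ord])      # running total of count in that order
  k <- which(running >= target)[1]       # first position where the total reaches target
  dat[ord[seq_len(k)], ]                 # the selected rows, in the order drawn
}

set.seed(1)   # only to make the example reproducible
sel <- select_groups(mydat)
sel
sum(sel$count)
```

Because whole rows are returned, every selected group comes with its full count, and the selection stops as soon as the running total reaches 40, so the last group drawn is the one that crosses the threshold.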