Hi all,

I'm looking for some help to bias the sample function. Basically, I'd like to generate a data frame where the first column is completely random, the second is conditional on the first, the third is conditional on the first and the second, and so on. By conditional I mean that no value should be repeated within a row. I know this could easily be done with a permutation, but that does not fit my case. I need at least five columns. Any idea how to achieve what I need?

set.seed(51)
data <- data.frame(
    id=as.factor(1:100),
    a=as.factor(sample(1:10, size=100, replace=TRUE)),
    b=as.factor(sample(1:10, size=100, replace=TRUE)),
    c=as.factor(sample(1:10, size=100, replace=TRUE)),
    d=as.factor(sample(1:10, size=100, replace=TRUE)),
    e=as.factor(sample(1:10, size=100, replace=TRUE))
)
Hello,

The function that follows returns a matrix, not a data.frame, but it does what you ask for.

fun <- function(x, y, n){
    # Draw one value from x that does not yet occur in the row y.
    # Note: this loops forever if every value of x is already in y.
    f <- function(x, y){
        while(TRUE){
            rnd <- sample(x, 1)
            if(!any(rnd %in% y)) break
        }
        rnd
    }
    # Append n new columns, each drawn conditionally on the columns so far.
    for(i in seq_len(n)){
        tmp <- apply(y, 1, function(.y) f(x, .y))
        y <- cbind(y, tmp)
    }
    y
}

a <- cbind(sample(1:10, 100, TRUE))  # must have dims
fun(1:10, a, 4)  # returns 5 columns, 'a' plus 4

Hope this helps,

Rui Barradas

On 11-11-2012 19:06, dms at riseup.net wrote:
> I'm looking for some help to bias the sample function. [...]
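As a sketch only, the same conditional draw can also be written without the retry loop: rather than sampling until an unused value turns up, sample directly from the values not yet present in the row. The names draw_unused and extend_rows are made up for this illustration and do not appear in the thread; the pool 1:10 and the 100 rows are taken from the original example.

```r
# Hypothetical helper: pick one value from 'pool' that is not in 'used'.
draw_unused <- function(pool, used) {
  candidates <- setdiff(pool, used)
  if (length(candidates) == 0L) stop("no unused values left in pool")
  # Index explicitly: sample(v, 1) would treat a single number v as 1:v.
  candidates[sample.int(length(candidates), 1L)]
}

# Hypothetical wrapper: append 'n' columns to the one-column matrix 'y'.
extend_rows <- function(pool, y, n) {
  for (i in seq_len(n)) {
    new_col <- apply(y, 1, function(row) draw_unused(pool, row))
    y <- cbind(y, new_col)
  }
  unname(y)
}

set.seed(51)
first <- cbind(sample(1:10, 100, replace = TRUE))  # must have dims
m <- extend_rows(1:10, first, 4)

# Wrap the matrix in a data frame shaped like the one in the question.
dat <- data.frame(id = factor(1:100), m)
names(dat)[-1] <- c("a", "b", "c", "d", "e")
```

Both versions draw uniformly among the values still unused in a row. This one stops with an error instead of looping when the pool is exhausted, which would happen if more columns were requested than there are distinct values.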
Hi,

If the question is to remove the rows with duplicated/repeated values from the example "data", then

# apply() sees every column of the row, so id takes part in the comparison too
dat2<-data[apply(data,1,function(x) all(!duplicated(x)|duplicated(x,fromLast=TRUE))),]
head(dat2)
#    id a b  c d  e
#6    6 9 5 10 1  7
#8    8 5 2  6 7  4
#11  11 6 4  9 8  5
#12  12 7 1  8 9 10
#15  15 1 9  8 4  7
#16  16 6 1  3 7 10

A.K.

----- Original Message -----
From: "dms at riseup.net" <dms at riseup.net>
To: r-help at r-project.org
Sent: Sunday, November 11, 2012 2:06 PM
Subject: [R] biasing conditional sample

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
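A similar row filter can be sketched with anyDuplicated(), which returns 0 when a vector contains no repeated element. The version below checks only columns a to e, on the assumption that id is a label and should not count as a value; value_cols, keep and dat3 are names introduced here for illustration.

```r
set.seed(51)
data <- data.frame(
  id = as.factor(1:100),
  a = as.factor(sample(1:10, size = 100, replace = TRUE)),
  b = as.factor(sample(1:10, size = 100, replace = TRUE)),
  c = as.factor(sample(1:10, size = 100, replace = TRUE)),
  d = as.factor(sample(1:10, size = 100, replace = TRUE)),
  e = as.factor(sample(1:10, size = 100, replace = TRUE))
)

# Assumption: only these columns hold sampled values; id is left out.
value_cols <- c("a", "b", "c", "d", "e")

# TRUE for rows in which no value occurs more than once.
keep <- apply(data[value_cols], 1, function(x) anyDuplicated(x) == 0L)
dat3 <- data[keep, ]
head(dat3)
```

Because id is excluded here, dat3 may keep some rows that a check over all columns drops, namely those where the id happens to match one of the sampled values.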