Hello,
Try
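dat <- data  # 'data' is the matrix from the original post

fun <- function(x){
    one <- which(x$score == 1)  # indices of the rows with score 1
    if(length(one) == 1)
        x  # exactly one score of 1: keep the group unchanged
    else if(length(one) > 1)
        x[-one[-sample(seq_along(one), 1)], ]  # drop all score-1 rows but a randomly sampled one
    # groups with no score of 1 return NULL and are dropped by rbind below
}

res <- lapply(split(data.frame(dat), dat[, "group"]), fun)
res
do.call(rbind, res)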
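
To check the result against the rule in the original post, here is a minimal self-contained sketch. The names `keep_one` and `out` are only illustrative, and `set.seed()` is there just to make the random choice reproducible. It is one way to verify the idea, not the only one.

```r
# Example data from the original post, stored here under the name 'dat'
dat <- cbind(group  = c(1,1,1,2,2,3,3,3,4,4,4,4,4),
             member = c(1,2,3,1,2,1,2,3,1,2,3,4,5),
             score  = c(0,1,0,0,0,1,0,1,0,1,1,1,0))

# Illustrative helper: keep exactly one score-1 row per group,
# and drop groups that have no score-1 row at all
keep_one <- function(x){
    one <- which(x$score == 1)        # indices of the rows with score 1
    if(length(one) == 0)
        return(NULL)                  # no score of 1: remove the whole group
    if(length(one) > 1)
        x <- x[-one[-sample(seq_along(one), 1)], ]  # keep one at random
    x
}

set.seed(1)  # assumption: any seed will do, it only fixes the random draw
out <- do.call(rbind, lapply(split(data.frame(dat), dat[, "group"]), keep_one))

# Every remaining group should have a score sum of exactly 1
tapply(out$score, out$group, sum)

# Group 2 had no score of 1, so it should be gone
unique(out$group)
```

With the example data this leaves 8 rows, the same number as the newdata in the original post. Which member of group 3 and of group 4 survives depends on the random draw.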
Hope this helps,
Rui Barradas
ck wrote:
> Dear R users,
>
> I am working on a big dataset and have got a problem with data cleaning.
> My data set looks like this:
>
> data <- cbind (group = c(1,1,1,2,2,3,3,3,4,4,4,4,4), member =
> c(1,2,3,1,2,1,2,3,1,2,3,4,5), score = c(0,1,0,0,0,1,0,1,0,1,1,1,0))
>
> I just want to keep the group in which the sum of score is equal to 1 and
> remove the whole group in which the sum of score is equal to 0. For the
> group in which the sum of the score is greater than 1, e.g., sum of score
> = 3, I want to randomly select two group members with score equal to 1 and
> remove them from the group. Then the data may look like this:
>
> newdata <- cbind (group = c(1,1,1,3,3,4,4,4), member = c(1,2,3,2,3,1,3,5),
> score = c(0,1,0,0,1,0,1,0))
>
> Can anybody help me get this done? Thank you in advance.
>
> ck
>