thr3ads.net - R help - [R] Randomly drawing observations from factors. [Jul 2008]

Economics Guy

2008-Jul-31 20:20 UTC

[R] Randomly drawing observations from factors.

I have a large data set where one of the columns needs to be a unique
identifier (ID) for each row. However, a few of the rows share the
same ID. What I need to do is randomly draw one of those rows to
keep in the data frame and drop all the others that have the same
ID.

For example:

v1 <- c(1,2,3,4,5,6,7)
v2 <- c(10,20,30,40,50,60,70)
ID <- c("A","A","B","B","C","D","E")
DF <- data.frame(v1,v2,ID)

But I only need one of the A rows and one of the B rows in the data
frame. I tried making ID a factor and using apply() to randomly draw
one but I could not get it to work.

Any ideas would be greatly appreciated.

Thanks,

EG

Marc Schwartz

2008-Jul-31 20:45 UTC

on 07/31/2008 03:20 PM Economics Guy wrote:
> I have a large data set where one of the columns needs to be a unique
> identifier (ID) for each row. However, a few of the rows share the
> same ID. What I need to do is randomly draw one of those rows to
> keep in the data frame and drop all the others that have the same
> ID.
> 
> For example:
> 
> v1 <- c(1,2,3,4,5,6,7)
> v2 <- c(10,20,30,40,50,60,70)
> ID <- c("A","A","B","B","C","D","E")
> DF <- data.frame(v1,v2,ID)
> 
> But I only need one of the A rows and one of the B rows in the data
> frame. I tried making ID a factor and using apply() to randomly draw
> one but I could not get it to work.
> 
> Any ideas would be greatly appreciated.
> 
> Thanks,
> 
> EG

Try this:
> do.call(rbind, lapply(split(DF, DF$ID), function(x) x[sample(nrow(x), 1), ]))
  v1 v2 ID
A  1 10  A
B  3 30  B
C  5 50  C
D  6 60  D
E  7 70  E


Essentially, I am split()ting DF by ID, randomly selecting one row from 
each ID within lapply() and then rbind()ing it all back together.
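
The one-liner above can be unpacked into its three steps. What follows is only a sketch: the intermediate names (pieces, picked, result, shuffled, result2) and the set.seed() call are assumptions added for illustration and are not part of the original reply. The last two lines show a different technique that is not from this thread, namely shuffling all rows and keeping the first occurrence of each ID with duplicated().

```r
# Example data from the original post
v1 <- c(1, 2, 3, 4, 5, 6, 7)
v2 <- c(10, 20, 30, 40, 50, 60, 70)
ID <- c("A", "A", "B", "B", "C", "D", "E")
DF <- data.frame(v1, v2, ID)

set.seed(1)  # assumption: only here so the random draw is repeatable

# 1. split() returns a list of data frames, one per distinct ID
pieces <- split(DF, DF$ID)

# 2. Draw one row at random within each piece.
#    nrow(x) is a single number n, so sample(n, 1) picks one value from 1:n.
picked <- lapply(pieces, function(x) x[sample(nrow(x), 1), ])

# 3. Stack the one-row data frames back into a single data frame
result <- do.call(rbind, picked)

# Alternative (not from the thread): shuffle all rows, then keep
# only the first occurrence of each ID
shuffled <- DF[sample(nrow(DF)), ]
result2 <- shuffled[!duplicated(shuffled$ID), ]
```

Both results hold exactly one row per ID. The split() version returns rows ordered by ID with the ID as the row name, while the duplicated() version keeps the original row names and leaves the rows in shuffled order.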

BTW, a real name would be appreciated.

HTH,

Marc Schwartz
