thr3ads.net - R help - [R] dataframe indexing by number of cases per group [Nov 2011]

Johannes Radinger

2011-Nov-24 12:02 UTC

[R] dataframe indexing by number of cases per group

Hello,

Assume we have the following data frame:

group <-c(rep("A",5),rep("B",6),rep("C",4))
x <- c(runif(5,1,5),runif(6,1,10),runif(4,2,15))
df <- data.frame(group,x)

Now I want to select all cases (rows) for those groups
that have five or more cases (so I want to select
all cases of groups A and B).
How can I use indexing for this?

df[??]... I think it is probably quite easy but I really
don't know how to do that at the moment.

maybe someone can help me...

/johannes
--

Dennis Murphy

2011-Nov-24 13:47 UTC

A very similar question was asked a couple of days ago - see the
thread titled "Removing rows in dataframe w'o duplicated values" - in
particular, the responses by Dimitris Rizopoulos and David Winsemius.
The adaptation to this problem is

df[ave(as.numeric(df$group), as.numeric(df$group), FUN = length) > 4, ]
   group        x
1      A 3.903747
2      A 3.599547
3      A 2.449991
4      A 2.740639
5      A 4.268988
6      B 8.649600
7      B 5.493841
8      B 1.892154
9      B 6.781754
10     B 1.459250
11     B 6.749522

HTH,
Dennis
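
The as.numeric(df$group) call above relies on group being a factor, which was the data.frame() default in 2011. Since R 4.0.0 the default is stringsAsFactors = FALSE, so group is a character column and as.numeric() returns NA for it. A minimal sketch of the same ave() idea that does not depend on the column type is shown here; the seed value and the names n_per_row and res are assumptions added for illustration, not part of the original post.

```r
set.seed(42)  # arbitrary seed, only to make the random x values repeatable

group <- c(rep("A", 5), rep("B", 6), rep("C", 4))
x <- c(runif(5, 1, 5), runif(6, 1, 10), runif(4, 2, 15))
df <- data.frame(group, x)

# ave() applies FUN within each level of the grouping variable and
# returns one value per row, here the size of that row's group.
# seq_along() supplies a numeric vector of the right length, so it
# does not matter whether group is a factor or a character column.
n_per_row <- ave(seq_along(df$group), df$group, FUN = length)

# Keep only rows whose group has more than 4 (i.e. at least 5) cases.
res <- df[n_per_row > 4, ]
res
```

With the example data this keeps the 11 rows of groups A and B and drops the 4 rows of group C.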

Gabor Grothendieck

2011-Nov-24 14:12 UTC

Here are three approaches:

subset(merge(df, xtabs(~ group, df)), Freq >= 5)

subset(transform(df, len = ave(x, group, FUN = length)), len >= 5)

library(sqldf)
sqldf('select a.*
    from df a join (select "group", count(*) "count" from df group by "group")
    using ("group")
    where "count" >= 5')

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
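
For comparison, the same filter can probably also be written with table() and %in% in base R. This is not one of the three approaches above but an alternative sketch; the seed value and the names counts, keep and big are assumptions chosen for illustration.

```r
set.seed(1)  # arbitrary seed, only to make the random x values repeatable

group <- c(rep("A", 5), rep("B", 6), rep("C", 4))
x <- c(runif(5, 1, 5), runif(6, 1, 10), runif(4, 2, 15))
df <- data.frame(group, x)

# Count the cases in each group.
counts <- table(df$group)

# Names of the groups that have at least 5 cases.
keep <- names(counts)[counts >= 5]

# Select every row that belongs to one of those groups.
big <- df[df$group %in% keep, ]
big
```

Unlike the merge() and transform() variants, this leaves the columns of df unchanged: no Freq or len helper column is added to the result.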
