Oscar Franzén
2010-Mar-24 13:38 UTC
[R] removing fields of the same group from a data frame
Dear all,

I'm trying to find a way to remove certain fields belonging to the same
group from a data frame structure.

I have a data frame like this:

 foo v1 v2 v3
      1  1  a
      6  2  a
      3  8  a
      4  4  b
      4  4  b
      2  1  c
      1  6  d

Each row can then be grouped according to the third column: a, b, c, d.
I would then like to remove all fields that belong to a group with fewer
than X members, for example fewer than 3 members, so that the resulting
data frame structure would look like:

 foo v1 v2 v3
      1  1  a
      6  2  a
      3  8  a

Is there some simple way to do this in R?

Thanks in advance.
/Oscar
Henrique Dallazuanna
2010-Mar-24 13:59 UTC
[R] removing fields of the same group from a data frame
Try this:

subset(foo, v3 %in% names(which(!table(foo$v3) < 3)))

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
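For anyone who wants to run the one-liner above, here is a minimal sketch that rebuilds the example data from the original question. The column types (numeric v1 and v2, character v3) and the intermediate names counts, big_groups and result are assumptions made for illustration; they are not part of the original post.

```r
# Rebuild the example data frame from the question
# (column types are an assumption; the post only shows the printed values)
foo <- data.frame(
  v1 = c(1, 6, 3, 4, 4, 2, 1),
  v2 = c(1, 2, 8, 4, 4, 1, 6),
  v3 = c("a", "a", "a", "b", "b", "c", "d")
)

# Number of rows in each group of v3
counts <- table(foo$v3)

# Names of the groups that do NOT have fewer than 3 rows.
# Note that !counts < 3 parses as !(counts < 3) in R.
big_groups <- names(which(!counts < 3))

# Keep only the rows whose group is in that set
result <- subset(foo, v3 %in% big_groups)
print(result)
```

Splitting the expression into steps only makes the intermediate values easy to inspect; the result should be the same as the single subset() call.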
Marc Schwartz
2010-Mar-24 14:01 UTC
[R] removing fields of the same group from a data frame
> DF
  v1 v2 v3
1  1  1  a
2  6  2  a
3  3  8  a
4  4  4  b
5  4  4  b
6  2  1  c
7  1  6  d

> subset(DF, !v3 %in% names(which(table(v3) < 3)))
  v1 v2 v3
1  1  1  a
2  6  2  a
3  3  8  a

The use of table() gets us:

> table(DF$v3) < 3

    a     b     c     d 
FALSE  TRUE  TRUE  TRUE 

followed by:

> names(which(table(DF$v3) < 3))
[1] "b" "c" "d"

which gives us the values of v3 that don't have at least 3 entries.

When using subset(), the variables are evaluated first within the data
frame, hence we can drop the 'DF$' in the function call. The use of
"%in%" in subset() allows us to include or exclude certain values from a
set comparison.

We could also reverse the logic, yielding the same result:

> subset(DF, v3 %in% names(which(table(v3) >= 3)))
  v1 v2 v3
1  1  1  a
2  6  2  a
3  3  8  a

See ?table, ?subset and ?"%in%" for more information.

HTH,

Marc Schwartz
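Since the question asks for an arbitrary threshold X, the same idea could be wrapped in a small function. The sketch below is not from the thread: the helper name keep_large_groups and its arguments are hypothetical, and it swaps table() and %in% for ave(), which computes each row's group size directly. It also uses "[" rather than subset(), because ?subset recommends standard subsetting for programmatic use.

```r
# Example data as printed in the reply above
# (column types are an assumption)
DF <- data.frame(
  v1 = c(1, 6, 3, 4, 4, 2, 1),
  v2 = c(1, 2, 8, 4, 4, 1, 6),
  v3 = c("a", "a", "a", "b", "b", "c", "d")
)

# Hypothetical helper: keep only the rows whose group (named by the
# column `group`) has at least `min_n` members
keep_large_groups <- function(df, group, min_n) {
  # ave() applies FUN within each group and returns one value per row,
  # so group_size[i] is the size of the group that row i belongs to
  group_size <- ave(seq_len(nrow(df)), df[[group]], FUN = length)
  df[group_size >= min_n, , drop = FALSE]
}

res3 <- keep_large_groups(DF, "v3", 3)  # only group a remains
res2 <- keep_large_groups(DF, "v3", 2)  # groups a and b remain
print(res3)
```

Passing the grouping column as a string keeps the helper free of non-standard evaluation, at the cost of being slightly more verbose than the interactive subset() calls.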