Simone Gabbriellini
2011-Mar-02 15:10 UTC
[R] how to simplify a data.frame and add the counts of duplicate rows as a new column
Hello List,

I would like to simplify a data.frame like this:

columnA columnB
user10  proj12
user10  proj19
user10  proj12

into something like:

columnA columnB columnC
user10  proj12  2
user10  proj19  1

I know unique() can simplify the data.frame, but how can I count and store the duplicates?

Thanks in advance for any help.

Best regards,
Simone
Scott Chamberlain
2011-Mar-02 15:22 UTC
See the plyr package, especially the function ddply(). For example, in your case:

    ddply(dataframe, .(columnA, columnB), summarise,
          columnC = length(columnB))

Scott
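For readers without plyr installed, the same counts can be obtained in base R. This is only a sketch of an alternative technique (aggregate() rather than ddply()), not the method suggested above; the data frame name df and the helper column of ones are assumptions made for illustration, built from the three example rows in the question.

```r
# Example data taken from the question (the name `df` is an assumption).
df <- data.frame(
  columnA = c("user10", "user10", "user10"),
  columnB = c("proj12", "proj19", "proj12"),
  stringsAsFactors = FALSE
)

# Give every row a weight of 1, then sum the weights within each
# (columnA, columnB) combination to get the number of duplicates.
df$columnC <- 1L
counts <- aggregate(columnC ~ columnA + columnB, data = df, FUN = sum)

print(counts)
```

Unlike as.data.frame(table(...)), this approach only returns combinations that actually occur in the data, which matches the output requested in the question.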
Simone Gabbriellini
2011-Mar-02 15:34 UTC
Many thanks, this is really a great solution!

Best,
Simone