Dear R users,

I want to aggregate data in the following way:

###

X <- data.frame(u = c("T1","T1","T1","T2"), v=c("a","a","b","a"))
X
library(sqldf)
sqlOut <- sqldf("select count(distinct(v)) from X group by u")
sqlOut

###

Now I want to get the same result without using SQL. How can I achieve that?

Thanks for your help,

Gildas
There are so many ways... Here is one:

aggregate(v ~ u, data=X, function(...) length(unique(...)))
#    u v
# 1 T1 2
# 2 T2 1

Hope this helps

Allan.

On 22/07/10 12:52, Gildas Mazo wrote:
> Now I want to get the same result without using SQL. How can I achieve
> that ?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On 7/22/2010 5:01 AM, Allan Engelhardt wrote:
> There are so many ways.... Here is one:
>
> aggregate(v ~ u, data=X, function(...) length(unique(...)))

Here is one other way, using the plyr package, which is very good for taking a data structure (data.frame, list, array), pulling it apart by some criteria, doing something on each of the parts, and putting the results back together:

library("plyr")
ddply(X, .(u), function(x) {length(unique(x$v))})
#    u V1
#1 T1  2
#2 T2  1

--
Brian Diggs
Senior Research Associate, Department of Surgery,
Oregon Health & Science University
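The "pull apart, do something on each part, put back together" pattern described for plyr above can also be spelled out by hand in base R. This is only a minimal sketch on the toy data from the original question, not something posted in the thread; the names parts, counts, result and the column name n_distinct are made up for illustration.

```r
# Same toy data as in the original question.
X <- data.frame(u = c("T1", "T1", "T1", "T2"),
                v = c("a", "a", "b", "a"))

# 1. Split: one vector of v values per level of u.
parts <- split(X$v, X$u)

# 2. Apply: count the distinct values within each piece.
counts <- sapply(parts, function(v) length(unique(v)))

# 3. Combine: put the per-group counts back into a data frame.
#    (n_distinct is a hypothetical column name chosen for this example.)
result <- data.frame(u = names(counts), n_distinct = unname(counts))
result
```

On this data the count is 2 for T1 and 1 for T2, matching the aggregate() and ddply() answers. Steps 1 and 2 can probably be collapsed into a single tapply(X$v, X$u, function(v) length(unique(v))) call if a named vector is enough.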
Thanks for your answers,

Best,

Gildas