Dear all,

I'm puzzled by the following example inspired by a recent question on R-help,

cc <- textConnection("user_id website time
20 google 0930
21 yahoo 0935
20 facebook 1000
25 facebook 1015
61 google 0940")

d <- read.table(cc, head=T) ; close(cc)

table(d$user_id) # count the occurrences

# now I'd like to include these results in the original data.frame,

ddply(d, .(website), transform, count = table(user_id)) # why two new columns?

I just can't understand how this is different from,

ddply(d, .(website), transform, count = sum(user_id))

Many thanks,

baptiste
baptiste auguie-2 wrote:
>
> ddply(d, .(website), transform, count = table(user_id)) # why two new
> columns?
>

Try this to see why:

as.data.frame(table(d$user_id))

This works more like you expect:

ddply(d, .(website), transform, count = unclass(table(user_id)))

- Tom
On Fri, Apr 3, 2009 at 4:43 AM, baptiste auguie <ba208 at exeter.ac.uk> wrote:
> Dear all,
>
> I'm puzzled by the following example inspired by a recent question on
> R-help,
>
> cc <- textConnection("user_id website time
> 20 google 0930
> 21 yahoo 0935
> 20 facebook 1000
> 25 facebook 1015
> 61 google 0940")
>
> d <- read.table(cc, head=T) ; close(cc)
>
> table(d$user_id) # count the occurrences
>
> # now I'd like to include these results in the original data.frame,
>
> ddply(d, .(website), transform, count = table(user_id)) # why two new
> columns?

Because ddply expects a data frame as output from your aggregation function. When the output isn't a data frame, it calls as.data.frame, which in this case produces a data frame with two columns.

Hadley

--
http://had.co.nz/
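The conversion described above can be reproduced in base R without plyr. The following is only a sketch of the mechanism, not plyr's actual internals: it rebuilds the thread's data with read.table(text = ...) (a shortcut assumed here in place of textConnection), takes the "google" rows by hand to stand in for one ddply group, and the names tab_df, g, expanded and flat are illustrative only.

```r
# Rebuild the data frame from the thread.
d <- read.table(text = "user_id website time
20 google 0930
21 yahoo 0935
20 facebook 1000
25 facebook 1015
61 google 0940", header = TRUE)

# A one-way table converts to a data frame with a level column and Freq.
tab_df <- as.data.frame(table(d$user_id))
names(tab_df)

# One group, selected by hand to mimic a single ddply piece.
g <- d[d$website == "google", ]

# data.frame() spreads a table-valued component over two columns,
# so the result has the 3 original columns plus 2 new ones.
expanded <- data.frame(g, count = table(g$user_id))
ncol(expanded)

# After unclass() the counts are a plain integer array: one new column.
flat <- data.frame(g, count = unclass(table(g$user_id)))
ncol(flat)
```

By contrast, sum(user_id) returns a single number, which is recycled into one ordinary column; that is why the sum() call in the original post adds only one column.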
of course!

Thanks,

baptiste

On 3 Apr 2009, at 14:48, hadley wickham wrote:
> On Fri, Apr 3, 2009 at 8:43 AM, baptiste auguie <ba208 at exeter.ac.uk>
> wrote:
>> That makes sense, so I can do something like,
>>
>> count <- function(x){
>>   as.integer(unclass(table(x)))
>> }
>>
>> count(d$user_id)
>>
>> ddply(d, .(user_id), transform, count = count(user_id))
>>
>>>   user_id  website time count
>>> 1      20   google  930     2
>>> 2      20 facebook 1000     2
>>> 3      21    yahoo  935     1
>>> 4      25 facebook 1015     1
>>> 5      61   google  940     1
>>
>> Have I missed a built-in function to obtain this result?
>
> ddply(d, .(user_id), transform, count = nrow)
>
> ?
>
> Hadley
>
> --
> http://had.co.nz/

_____________________________

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
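On the closing question about a built-in: one base-R candidate is ave(), which applies a function within groups and returns a vector aligned with the original rows. This is offered as a sketch of an alternative, not as the answer given in the thread; the data are rebuilt with read.table(text = ...) for self-containment. With plyr loaded, writing count = length(user_id) in the same ddply call should give the same counts.

```r
# Rebuild the data frame from the thread.
d <- read.table(text = "user_id website time
20 google 0930
21 yahoo 0935
20 facebook 1000
25 facebook 1015
61 google 0940", header = TRUE)

# Group user_id by itself and take the group size for every row.
d$count <- ave(d$user_id, d$user_id, FUN = length)
d
```

Unlike the ddply output quoted above, which is ordered by user_id, this keeps the rows in their original order.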