Is length(unique()) what you are looking for?

Andy

> From: Christoph Lehmann
>
> NOW: Is there a way to compute the number of meas(ures) for
> each id with not identical date (e.g using diff()?
> so that I get eg:
>
>   id x
> 1  a 3
> 2  b 2
> 3  c 4
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
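To make the hint concrete, here is a minimal sketch using only the id and date columns from the posted data frame. The name `dates_by_id` is made up for illustration, and `tapply()` is just one possible way to apply the idea per group; it is not part of the reply above.

```r
# id and date columns copied from the posted example
id   <- rep(c("a", "b", "c"), each = 4)
date <- c(1, 2, 2, 3, 2, 2, 3, 3, 1, 2, 3, 4)

# Number of distinct dates for a single id
n_a <- length(unique(date[id == "a"]))

# The same count for every id at once (dates_by_id is a hypothetical name)
dates_by_id <- tapply(date, id, function(x) length(unique(x)))
dates_by_id
```

For the posted dates this should reproduce the counts asked for in the question (3, 2 and 4 for a, b and c).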
Christoph Lehmann wrote on 4/15/2005 9:51 AM:
> NOW: Is there a way to compute the number of meas(ures) for each id with
> not identical date (e.g using diff()?
> so that I get eg:
>
>   id x
> 1  a 3
> 2  b 2
> 3  c 4

How about:

m <- aggregate(data["date"], data["id"], function(x) length(unique(x)))

--sundar
Hi

I have a question concerning aggregation (simple demo code, see below).

I have the data.frame

   id        meas date
1   a 0.637513747    1
2   a 0.187710063    2
3   a 0.247098459    2
4   a 0.306447690    3
5   b 0.407573577    2
6   b 0.783255085    2
7   b 0.344265082    3
8   b 0.103893068    3
9   c 0.738649586    1
10  c 0.614154037    2
11  c 0.949924371    3
12  c 0.008187858    4

When I want for each id the sum of its meas I do:

aggregate(data$meas, list(id = data$id), sum)

If I want to know the number of meas(ures) for each id I do, e.g.

aggregate(data$meas, list(id = data$id), length)

NOW: Is there a way to compute, for each id, the number of meas(ures) with
non-identical dates (e.g. using diff())? So that I get, e.g.:

  id x
1  a 3
2  b 2
3  c 4

I am sure it must be possible.

Thanks for any (even short) hint

cheers
Christoph

--------------
data <- data.frame(c(rep("a", 4), rep("b", 4), rep("c", 4)),
                   runif(12), c(1, 2, 2, 3, 2, 2, 3, 3, 1, 2, 3, 4))
names(data) <- c("id", "meas", "date")

m <- aggregate(data$meas, list(id = data$id), sum)
names(m) <- c("id", "cum.meas")
If I understood you correctly, here's one way:

> sumWO2 <- sapply(split(dat, dat$id), function(d) sum(d$meas[d$date != 2]))
> sumWO2
        a         b         c
0.9439614 0.4481582 1.6967618

Andy

> From: Christoph Lehmann
>
> Dear Sundar, dear Andy
>
> Many thanks for the length(unique(x)) hint. It of course solves my
> problem in a very elegant way. Just out of curiosity (or for potential
> future problems): how could I solve it in a conceptually different way,
> namely with the computation on 'meas' depending on the variable 'date'?
> That is, the computation on a variable x in the function passed to
> aggregate is conditional on the value of another variable y. I hope you
> understand what I mean; let's think of an example:
>
> E.g. for the example data.frame from my original post, the sum shall be
> taken over the variable meas only for entries with a corresponding
> 'date' != 2.
>
> For this, do I have to nest two aggregate statements, or is there a way
> using sapply or similar apply-based commands?
>
> Thanks a lot for your kind help.
>
> Cheers!
>
> Christoph
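As a reproducible sketch of the conditional sum, the block below rebuilds the data frame with the meas values printed in the original post (instead of `runif()`), under the assumption that `dat` in the reply refers to that data frame. The names `sum_wo2`, `keep` and `agg_wo2` are made up for illustration. The second route, subsetting before `aggregate()`, is an alternative that was not proposed in the thread.

```r
# Data copied from the posted example (meas values fixed instead of runif)
dat <- data.frame(
  id   = rep(c("a", "b", "c"), each = 4),
  meas = c(0.637513747, 0.187710063, 0.247098459, 0.306447690,
           0.407573577, 0.783255085, 0.344265082, 0.103893068,
           0.738649586, 0.614154037, 0.949924371, 0.008187858),
  date = c(1, 2, 2, 3, 2, 2, 3, 3, 1, 2, 3, 4)
)

# Route 1: split into per-id data frames, then sum meas where date != 2
sum_wo2 <- sapply(split(dat, dat$id), function(d) sum(d$meas[d$date != 2]))

# Route 2 (alternative): drop the date == 2 rows first, then aggregate as usual
keep    <- dat[dat$date != 2, ]
agg_wo2 <- aggregate(keep["meas"], keep["id"], sum)
```

The two routes differ in one respect: an id whose rows all have date 2 would get a sum of 0 from route 1, but would be missing from the result of route 2, because its rows are removed before grouping.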
> From: Christoph Lehmann
>
> Great, Andy! Thanks a lot. I didn't know split.
> So 'split' can be used as an alternative to 'aggregate', with the
> advantage that the self-defined function passed along can consider more
> than one variable of the to-be-aggregated data.frame?

split() only splits the data frame into a list of data frames, according to
the variable supplied as the second argument. You can then use
sapply()/lapply() to apply the same operation to each piece, where each
piece contains all the variables.

Andy
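A small sketch of what that looks like in practice, using a made-up six-row data frame (the values and the names `pieces` and `latest_mean` are hypothetical, chosen only to keep the example short):

```r
# Toy data with the same columns as in the thread (values are made up)
dat <- data.frame(
  id   = rep(c("a", "b"), each = 3),
  meas = c(1, 2, 3, 4, 5, 6),
  date = c(1, 2, 2, 3, 3, 4)
)

# split() returns a named list with one data frame per id
pieces <- split(dat, dat$id)
names(pieces)
pieces$a        # all three columns, restricted to the rows with id "a"

# Each piece carries every column, so the function can combine them,
# e.g. the mean of meas on the latest date of each id
latest_mean <- sapply(pieces, function(d) mean(d$meas[d$date == max(d$date)]))
latest_mean
```

This is the practical difference from calling `aggregate()` on a single column: there, the function only receives the values of the column being aggregated, whereas here it receives the whole sub-data-frame.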