Hi,

I have a dataset with info on individuals (B) that have been involved in projects (A) during multiple years (C). The dataset contains three columns: A, B, C. Example:

  A B    C
1 1 a 1999
2 1 b 1999
3 1 c 1999
4 2 c 2001
5 2 d 2001
6 3 a 2004
7 3 b 2004

I am interested in the average tenure of all individuals for each project (assuming that the tenure of an individual = 0 in the first project this individual is involved in). So based on the data above:

  A D
1 1 0
2 2 1
3 3 5

where D = average project tenure. How do I do this?

Your help is very much appreciated. Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/Data-manipulation-tp3302717p3302717.html
Sent from the R help mailing list archive at Nabble.com.

mathijsdevaan <mathijsdevaan at gmail.com> [Sat, Feb 12, 2011 at 03:00:18PM CET]:
> [...]
> I am interested in the average tenure of all individuals for each project
> (assuming that the tenure of an individual = 0 in the first project this
> individual is involved in). So based on the data above:
>   A D
> 1 1 0
> 2 2 1
> 3 3 5
>
> where D = average project tenure. How do I do this?

I don't follow how you arrive at D by calculating an "average". Could you write down the arithmetic operations involved?

--
Johannes Hüsing
mailto:johannes at huesing.name
http://derwisch.wikidot.com

There is something fascinating about science. One gets such wholesale returns of conjecture from such a trifling investment of fact. (Mark Twain, "Life on the Mississippi")

Will this do it for you:

> x <- read.table(textConnection("  A B    C
+ 1 1 a 1999
+ 2 1 b 1999
+ 3 1 c 1999
+ 4 2 c 2001
+ 5 2 d 2001
+ 6 3 a 2004
+ 7 3 b 2004"), header = TRUE)
> closeAllConnections()
> # add a tenure column
> x$tenure <- ave(x$C, x$B, FUN = function(yr) yr - min(yr))
> x
  A B    C tenure
1 1 a 1999      0
2 1 b 1999      0
3 1 c 1999      0
4 2 c 2001      2
5 2 d 2001      0
6 3 a 2004      5
7 3 b 2004      5
> # compute tenure on project
> aggregate(x$tenure, list(project = x$A), mean)
  project x
1       1 0
2       2 1
3       3 5

On Sat, Feb 12, 2011 at 9:00 AM, mathijsdevaan <mathijsdevaan at gmail.com> wrote:
> [...]
> I am interested in the average tenure of all individuals for each project
> (assuming that the tenure of an individual = 0 in the first project this
> individual is involved in).
> [...]
> where D = average project tenure. How do I do this?

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
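
For anyone who wants to paste the approach from the reply above into a script rather than a console session, here is a minimal sketch of the same two steps: per-individual tenure with ave(), then a per-project mean with aggregate(). It assumes that "tenure" means years elapsed since the individual's first recorded year, which is the reading that reproduces the D column in the question. The text = argument of read.table(), the formula form of aggregate(), and the name avg_tenure are substitutions made here for illustration, not part of the original reply.

```r
# Example data from the question: project (A), individual (B), year (C).
x <- read.table(text = "
A B C
1 a 1999
1 b 1999
1 c 1999
2 c 2001
2 d 2001
3 a 2004
3 b 2004", header = TRUE, stringsAsFactors = FALSE)

# Tenure = years since the individual's first recorded year,
# so it is 0 in the first project that individual appears in.
x$tenure <- ave(x$C, x$B, FUN = function(yr) yr - min(yr))

# Average tenure per project, renamed to match the question's A / D layout.
# Project 1: (0 + 0 + 0) / 3 = 0
# Project 2: (2 + 0) / 2     = 1   (c joined in 1999, d is new)
# Project 3: (5 + 5) / 2     = 5   (a and b both joined in 1999)
avg_tenure <- aggregate(tenure ~ A, data = x, FUN = mean)
names(avg_tenure) <- c("A", "D")
print(avg_tenure)
```

The comments spell out the arithmetic asked about earlier in the thread: each project's D is the plain mean of its members' tenure values.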