I have a data frame containing children, with variables 'year' = birth year and 'm.id' = mother's id number. Let's assume that all the births of each mother are represented in the data frame.

Now I want to create a subset of this data frame containing all children whose mother's first birth was in the year 1816 or later. This seems to work:

    mid <- tapply(dat$year, dat$m.id, min)
    mid <- as.numeric(names(mid)[mid >= 1816])
    dat <- dat[dat$m.id %in% mid, ]

but I'm worried about the second line, because the output from 'tapply' isn't documented to have a 'dimnames' attribute (although it has one, at least in R-2.1.0, 2005-01-19). Another aspect is that this code relies on m.id being numeric; I would have to change it if the type of m.id changes to, e.g., character.

So, question: Is there a better way of doing this?

--
Göran Broström                 tel: +46 90 786 5223
Department of Statistics       fax: +46 90 786 6614
Umeå University                http://www.stat.umu.se/egna/gb/
SE-90187 Umeå, Sweden          e-mail: gb at stat.umu.se
> From: Göran Broström
>
> So, question: Is there a better way of doing this?

Would this work?

    dat <- dat[ave(dat$year, dat$m.id, FUN = min) >= 1816, ]

Andy
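To make the ave() suggestion concrete, here is a minimal sketch on a small made-up data frame (the name 'toy' and its values are hypothetical, not from the original post). One detail worth noting: in ave(x, ..., FUN = mean) the function argument comes after '...', so it has to be passed by name as FUN = min; passed positionally it would be taken as another grouping variable.

```r
# Hypothetical toy data: three mothers whose first births
# fall in 1810, 1816 and 1820 respectively
toy <- data.frame(
  m.id = c(1, 1, 2, 2, 3),
  year = c(1810, 1818, 1816, 1819, 1820)
)

# ave() returns one value per row: the earliest birth year
# recorded for that row's mother
first.birth <- ave(toy$year, toy$m.id, FUN = min)

# Keep the children whose mother's first birth was in 1816 or later
kept <- toy[first.birth >= 1816, ]
print(kept)
```

Because the comparison is done row by row on the result of ave(), the ids themselves are never converted to names and back, so the type of m.id should not matter here.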
Your approach, after omitting the "as.numeric()" in the second line, seems to work even for `m.id' being a factor, i.e.,

    # m.id is made a factor explicitly; year is a random sample
    dat <- data.frame(m.id = factor(rep(letters[1:10], 10)),
                      year = sample(1805:1950, 100, TRUE))
    ###########
    mid <- tapply(dat$year, dat$m.id, min)
    mid <- names(mid)[mid >= 1816]
    dat. <- dat[dat$m.id %in% mid, ]
    dat; dat.

but maybe there is something better.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm

----- Original Message -----
From: "Göran Broström" <gb at tal.stat.umu.se>
To: <r-help at stat.math.ethz.ch>
Sent: Tuesday, January 25, 2005 3:55 PM
Subject: [R] tapply and names

> So, question: Is there a better way of doing this?
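As a rough check of the two routes discussed in this thread, the sketch below runs the tapply/names version (without as.numeric) and the ave() version on the same made-up data with character ids and compares the results. The data frame 'toy' and the helper names 'keep.ids', 'by.names' and 'by.ave' are hypothetical and chosen only for illustration.

```r
# Hypothetical toy data with character ids
toy <- data.frame(
  m.id = c("a", "a", "b", "b", "c"),
  year = c(1810, 1818, 1816, 1819, 1820),
  stringsAsFactors = FALSE
)

# Route 1: tapply, then pick the qualifying ids by name (no as.numeric)
mid <- tapply(toy$year, toy$m.id, min)
keep.ids <- names(mid)[mid >= 1816]
by.names <- toy[toy$m.id %in% keep.ids, ]

# Route 2: ave, which never goes through names at all
by.ave <- toy[ave(toy$year, toy$m.id, FUN = min) >= 1816, ]

# Compare the two subsets
same <- identical(by.names, by.ave)
print(same)
```

The second route avoids relying on the names of the tapply result, which was the concern raised in the original question.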