I have a data frame containing children, with variables 'year' = birth
year, and 'm.id' = mother's id number. Let's assume that all the births
of each mother are represented in the data frame.
Now I want to create a subset of this data frame containing all children
whose mother's first birth was in the year 1816 or later. This seems to
work:
mid <- tapply(dat$year, dat$m.id, min)
mid <- as.numeric(names(mid)[mid >= 1816])
dat <- dat[dat$m.id %in% mid, ]
but I'm worried about the second line, because the output from 'tapply'
isn't documented to have a 'dimnames' attribute (although it has one, at
least in R-2.1.0, 2005-01-19). Another aspect is that this code relies on
m.id being numeric; I would have to change it if the type of m.id changes
to, e.g., character.
So, question: Is there a better way of doing this?
--
Göran Broström tel: +46 90 786 5223
Department of Statistics fax: +46 90 786 6614
Umeå University http://www.stat.umu.se/egna/gb/
SE-90187 Umeå, Sweden e-mail: gb at stat.umu.se
> From: Göran Broström
>
> So, question: Is there a better way of doing this?

Would this work?

dat <- dat[ave(dat$year, dat$m.id, FUN = min) >= 1816, ]

Andy
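For what it's worth, here is a minimal sketch of the ave() idea on a made-up data frame (the ids and years are invented for illustration, not taken from the real data). FUN is passed by name, since ave() treats unnamed arguments after the first as grouping variables.

```r
# Toy data (invented): mother 1 first gave birth in 1810,
# mother 2 in 1820, mother 3 in 1816.
dat <- data.frame(m.id = c(1, 1, 2, 2, 3),
                  year = c(1810, 1818, 1820, 1825, 1816))

# ave() returns, for every row, the minimum birth year within that
# row's m.id group, so the result lines up with the rows of dat.
first.birth <- ave(dat$year, dat$m.id, FUN = min)

# Keep children whose mother's first birth was in 1816 or later.
sub <- dat[first.birth >= 1816, ]
```

Because the comparison is done row by row on the years, nothing here depends on names() of a tapply() result or on the type of m.id.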
Your approach, after omitting the "as.numeric()" in the second line,
seems to work even when 'm.id' is a factor, i.e.,
dat <- data.frame(m.id=rep(letters[1:10], 10), year=sample(1805:1950,
100, TRUE))
###########
mid <- tapply(dat$year, dat$m.id, min)
mid <- names(mid)[mid >= 1816]
dat. <- dat[dat$m.id %in% mid, ]
dat; dat.
but maybe there is something better.
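As a rough check of that claim, the sketch below wraps the names-based approach in a helper and runs it with m.id stored as numeric, character, and factor. The function name late.mothers, its cutoff argument, and the toy values are all hypothetical, chosen only for this illustration.

```r
# Hypothetical helper wrapping the names-based approach from the thread.
late.mothers <- function(d, cutoff = 1816) {
  first <- tapply(d$year, d$m.id, min)   # one minimum per mother
  ids <- names(first)[first >= cutoff]   # ids as character strings
  d[d$m.id %in% ids, ]                   # %in% compares as character here
}

# Toy values (invented for illustration).
year <- c(1810, 1818, 1820, 1825, 1816)
id <- c(1, 1, 2, 2, 3)

num <- late.mothers(data.frame(m.id = id, year = year))
chr <- late.mothers(data.frame(m.id = as.character(id), year = year,
                               stringsAsFactors = FALSE))
fac <- late.mothers(data.frame(m.id = factor(id), year = year))
```

In all three cases the same three rows should be kept, since the ids are matched as character strings whatever their original type.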
Best,
Dimitris
----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat
http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
----- Original Message -----
From: "Göran Broström" <gb at tal.stat.umu.se>
To: <r-help at stat.math.ethz.ch>
Sent: Tuesday, January 25, 2005 3:55 PM
Subject: [R] tapply and names