Hi,

I have a data.frame with many variables, and I am computing the mean by subgroup for one pair of variables at a time, where one variable in each pair defines the subgroup. The subgroups in x$cm1 are 0, 1 and 2.

> x
     ph1 cm1
  0.2345   2
  1.2222   1
  2.0033   0
  0.0000   2
  1.0033   1
  0.2345   0
  1.2222   2
  2.0033   0
  0.0000   1
  1.0033   2

> meanbygroup <- as.vector(with(x, by(x$ph1, x$cm1, mean)))
> meanbygroup

If ph1 has no missing values, the above statements work fine:

[1] 1.4137000 0.7418333 0.6150000

As soon as I introduce a missing value (NA) in ph1

> x
     ph1 cm1
  0.2345   2
      NA   1
  1.2222   1
  .............

the result becomes

[1] 1.4137000 NA 0.6150000

Question: is there a way to protect this calculation from the NA values in ph1 (something like na.rm=T)?

TIA,

Aldi
Dear Aldi,

Yes. Here it is:

  as.vector(with(x, by(ph1, cm1, mean, na.rm=TRUE)))

or

  with(x, tapply(ph1, cm1, mean, na.rm=TRUE))

See ?mean and ?tapply for more details.

HTH,

Jorge

On Wed, Mar 25, 2009 at 7:36 PM, Aldi Kraja <aldi@wustl.edu> wrote:

> Question: is there a way I can protect this calculations from the NA values
> in the ph1 (some kind of: na.rm=T)?
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
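For anyone who wants to try the na.rm approach, here is a minimal sketch on the ten complete rows from the original post. The NA version of the data was only partly shown there, so which value is set to NA below is an assumption made purely for illustration.

```r
# Sample data retyped from the ten complete rows in the original post.
x <- data.frame(
  ph1 = c(0.2345, 1.2222, 2.0033, 0.0000, 1.0033,
          0.2345, 1.2222, 2.0033, 0.0000, 1.0033),
  cm1 = c(2, 1, 0, 2, 1, 0, 2, 0, 1, 2)
)

# Assumption (not from the post): blank out one ph1 value in group 1
# to mimic the missing-data case.
x$ph1[2] <- NA

# Without na.rm, a group that contains an NA has an NA mean.
with_na <- with(x, tapply(ph1, cm1, mean))

# Extra arguments given to tapply() are passed on to FUN,
# so na.rm = TRUE reaches mean() and the NA is dropped per group.
without_na <- with(x, tapply(ph1, cm1, mean, na.rm = TRUE))

print(with_na)
print(without_na)
```

The same pass-through of extra arguments is what makes the by() form work. Note that a group whose values are all NA would come back as NaN rather than NA with na.rm = TRUE.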
On 25/03/2009 7:36 PM, Aldi Kraja wrote:

> > meanbygroup <- as.vector(with(x, by(x$ph1, x$cm1, mean)))

You don't need with() here, as you are explicitly extracting the vectors from x.

> Question: is there a way I can protect this calculations from the NA
> values in the ph1 (some kind of: na.rm=T)?

You could use with(), and extract the vectors from a subset of x:

  with(x[!is.na(x$ph1),], by(ph1, cm1, mean))

This is untested. If you had provided sample data in a usable format I would have tried it, but you didn't, and I'm too lazy to create my own.

Duncan Murdoch
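Since the subsetting suggestion was posted untested, here is a small sketch that tries it on the data from the original post, entered in a usable form. The position of the NA is an assumption for illustration, because the post elided most of the NA version of the data.

```r
# Sample data retyped from the ten complete rows in the original post.
x <- data.frame(
  ph1 = c(0.2345, 1.2222, 2.0033, 0.0000, 1.0033,
          0.2345, 1.2222, 2.0033, 0.0000, 1.0033),
  cm1 = c(2, 1, 0, 2, 1, 0, 2, 0, 1, 2)
)

# Assumption (not from the post): one missing ph1 value in group 1.
x$ph1[2] <- NA

# Keep only the rows where ph1 is observed, then take group means.
complete <- x[!is.na(x$ph1), ]
by_subset <- as.vector(with(complete, by(ph1, cm1, mean)))

# For comparison: group means with the NA dropped inside mean().
by_narm <- as.vector(with(x, tapply(ph1, cm1, mean, na.rm = TRUE)))

print(by_subset)
print(isTRUE(all.equal(by_subset, by_narm)))
```

On this data the two approaches agree. They can differ when every ph1 value in a group is NA: subsetting removes that group's rows entirely, so with a numeric cm1 the group drops out of the result, whereas na.rm = TRUE keeps the group and reports NaN.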