I need advice on how to create a variable that is the group mean of another variable.

For example, I have a variable called x for which each row in the data set has a value. I also have a nominal variable called g that indicates which of 100 different groups each row belongs to.

So, I want to create a new variable called w, which is the group mean of x for whichever group the row belongs to (as indicated by variable g). Ideally, I'd also like to take out each row's value of x before calculating the group mean assigned to that row.

I've already tried the aggregate command. That gives me the group means, but does not allow me to assign them to each row in the data set.

THANKS!
--
Casey A. Klofstad
University of Miami
Department of Political Science
Coral Gables, FL
klofstad at gmail.com
http://moya.bus.miami.edu/~cklofstad
Gabor Grothendieck
2007-Nov-13 00:01 UTC
[R] how to assign a group mean to individual cases?
Look at: ?ave
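As a minimal sketch of what ?ave describes, using made-up toy data (the names x, g and w follow the question; the values and the data frame name df are assumptions for illustration):

```r
# Toy data: 10 rows in 3 groups (values are invented for illustration).
df <- data.frame(g = rep(3:5, length.out = 10), x = as.numeric(1:10))

# ave() applies FUN within each level of the grouping variable and returns
# a vector the same length as x, so it can be assigned directly as a column.
df$w <- ave(df$x, df$g, FUN = mean)

print(df)
```

Because the result already has one value per row, no separate merge step back onto the data set is needed.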
You can do the following:

x <- 1:10
g <- rep(3:5, len=10)
df <- data.frame(g=g, x=x)
y <- aggregate(df$x, list(df$g), mean)           # group means (aggregate needs a summary function)
z <- sapply(df$g, function(x) which(y[,1]==x))   # for each row, the matching row of y
df1 <- data.frame(df, group.mean=y[z,2])
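The same aggregate-then-look-up idea can also be written with match() in place of the sapply()/which() step. This is only a sketch on the same kind of toy data; the names means and group.mean are chosen here for illustration:

```r
# Toy data (invented values).
df <- data.frame(g = rep(3:5, length.out = 10), x = as.numeric(1:10))

# One row per group: column g holds the group, column x holds its mean.
means <- aggregate(df$x, by = list(g = df$g), FUN = mean)

# match() returns, for each row of df, the row of `means` with the same g.
df$group.mean <- means$x[match(df$g, means$g)]
```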
Emmanuel Charpentier
2007-Nov-13 14:09 UTC
[R] how to assign a group mean to individual cases?
The first question is easy: you just have to choose which siege cannon you need to shoot your fly. What about (assuming g is stored as a factor):

your.dataset$w <- aov(x ~ g, data=your.dataset)$fitted.values

No a priori idea about the second one, but it has a strong flavor of jackknife; I'd have a look in this direction...

HTH,
Emmanuel Charpentier
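One caveat with the aov approach is that the fitted values are group means only when the grouping variable is treated as a factor. The sketch below, on invented toy data, compares the two cases against ave(); the object names fit_numeric and fit_factor are assumptions for illustration:

```r
# Toy data (invented values); g is stored as a number here.
df <- data.frame(g = rep(3:5, length.out = 10), x = as.numeric(1:10))

# With g numeric, aov() fits a straight line in g, so the fitted values
# are regression predictions, not group means.
fit_numeric <- aov(x ~ g, data = df)

# Wrapping g in factor() fits one mean per group.
fit_factor <- aov(x ~ factor(g), data = df)
df$w <- unname(fitted(fit_factor))

# Check against the group means computed directly with ave().
print(all.equal(df$w, ave(df$x, df$g)))
```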
For the first question use ave along with mean like Gabor suggested.

For the second question (finding the mean with that value removed) you can use ave with the sum function ("ave(x, group, FUN=sum)") to find the sum rather than the average, then use ave again with the length function to find out how many are in each group, then subtract your x value from the sum to get the sum of all other values in the group and divide that sum by n-1 (from using ave with length) to get the mean.

Hope this helps,
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
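The sum-and-count recipe for the mean with each row's own value removed could be sketched as follows. The toy data and the names grp_sum, grp_n and w_loo are assumptions for illustration, and the sketch assumes x has no missing values:

```r
# Toy data (invented values).
df <- data.frame(g = rep(3:5, length.out = 10), x = as.numeric(1:10))

# Per-row copies of each group's total and size.
grp_sum <- ave(df$x, df$g, FUN = sum)
grp_n   <- ave(df$x, df$g, FUN = length)

# Mean of the other members of the row's group: remove the row's own x
# from the total and divide by n - 1. A group with a single row gives
# 0/0, i.e. NaN, since there are no other members to average.
df$w_loo <- (grp_sum - df$x) / (grp_n - 1)

print(df)
```

This stays vectorised over all rows, so it should scale to the 100 groups in the question without an explicit loop.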