You can use 'ave' to add a new column with the state average:
> id<-as.character(c(01001:01010, 02001:02010))
> st<-substr(id,1,1)
> cnty<-substr(id,2,5)
> tfr10<-rnorm(20)
>
> mydata<-data.frame(id,st,cnty,tfr10)
> mydata$stAvg <- ave(mydata$tfr10, mydata$st)
> print(mydata)
     id st cnty      tfr10      stAvg
1  1001  1  001  1.1896489 -0.3190678
2  1002  1  002 -1.0504707 -0.3190678
3  1003  1  003 -1.6130538 -0.3190678
4  1004  1  004 -1.1573924 -0.3190678
5  1005  1  005 -0.2013412 -0.3190678
6  1006  1  006  0.5176950 -0.3190678
7  1007  1  007 -1.3256951 -0.3190678
8  1008  1  008  0.4367956 -0.3190678
9  1009  1  009  0.2025659 -0.3190678
10 1010  1  010 -0.1894306 -0.3190678
11 2001  2  001 -0.9337906 -0.3842536
12 2002  2  002  0.2999035 -0.3842536
13 2003  2  003  0.5091345 -0.3842536
14 2004  2  004 -0.4787584 -0.3842536
15 2005  2  005 -1.6958660 -0.3842536
16 2006  2  006 -0.4430861 -0.3842536
17 2007  2  007  0.2100123 -0.3842536
18 2008  2  008 -1.7471779 -0.3842536
19 2009  2  009  0.1778717 -0.3842536
20 2010  2  010  0.2592210 -0.3842536
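
Once the state average is a column, the "new variable using the average at the state level" is ordinary vectorized arithmetic, so no loop over states is needed. Below is a minimal sketch of that step. It assumes a small made-up data frame; the column names 'tfrDev' and 'stMax' and the seed are only illustrative and are not from the original data. It also shows that 'ave' accepts a different summary through its FUN argument, which has to be passed by name.

```r
# Hypothetical example data; 'st' and 'tfr10' follow the names used above.
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
mydata <- data.frame(st = rep(c("1", "2"), each = 5),
                     tfr10 = rnorm(10))

# ave() returns a vector as long as its input, with each state's mean
# repeated on every row belonging to that state (FUN defaults to mean).
mydata$stAvg <- ave(mydata$tfr10, mydata$st)

# Derived variable: each county's deviation from its state mean.
# 'tfrDev' is an illustrative name, not part of the original data.
mydata$tfrDev <- mydata$tfr10 - mydata$stAvg

# A different per-state summary; FUN must be named because unnamed
# arguments after the first are treated as grouping variables.
mydata$stMax <- ave(mydata$tfr10, mydata$st, FUN = max)

print(mydata)
```

Note that this relies on 'mydata' being a data frame. Building it with 'cbind' on character vectors gives a character matrix, so 'tfr10' would no longer be numeric.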

On Wed, Aug 31, 2011 at 12:50 PM, jour4life <jour4life at gmail.com> wrote:
> Hello all,
>
> I hope something is not already posted regarding this exact problem I am
> trying to solve. I've read through the forums and previous postings and am
> still confused as to how to approach this. Basically, what I am trying to do
> is construct variables that use an average of a variable from a grouping,
> or higher-order, variable. For instance, in my dataset each observation is
> a county. Each county has an ID variable, and I have extracted other
> variables from substrings of that ID. In this way I extracted a state
> variable, and I want to calculate averages at the state level and use those
> averages in another variable. I know this may be confusing, so I'm posting
> an example dataset here:
>
> id<-as.character(01001:01010)
> st<-substr(id,1,1)
> cnty<-substr(id,2,5)
> tfr10<-rnorm(10)
>
> mydata<-cbind(id,st,cnty,tfr10)
> print(mydata)
>       id     st  cnty  tfr10
>  [1,] "1001" "1" "001" "1.07505442756833"
>  [2,] "1002" "1" "002" "-0.882434417011687"
>  [3,] "1003" "1" "003" "2.29276525788035"
>  [4,] "1004" "1" "004" "-0.312320296652298"
>  [5,] "1005" "1" "005" "1.09001860766383"
>  [6,] "1006" "1" "006" "-0.781940988103414"
>  [7,] "1007" "1" "007" "-0.614135968631341"
>  [8,] "1008" "1" "008" "0.515142965880679"
>  [9,] "1009" "1" "009" "0.0274456168157293"
> [10,] "1010" "1" "010" "-0.538584996182184"
>
> What I want to do is get the average of the variable "tfr10" by state.
> Based on that, I will create another calculation that will output variables.
> In other words, for each observation, calculate a new variable using the
> average at the state level. Of course, this is a simple example; the real
> data will have 32 states, and I do not want to create a "mean variable" for
> each state to calculate another variable. I would rather do this using a loop.
>
> Or, I can potentially create a "mean" variable, but based on the
> observations at the state level using a loop. Whichever way is best and
> easiest. I hope that this example is understandable. Any help or direction
> would be greatly appreciated!!!
>
> Thanks,
>
> Carlos
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/looping-by-grouping-variable-tp3781580p3781580.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?