thr3ads.net - R help - [R] Create variables with common values for each group [Jun 2006]

Stephan Lindner

2006-Jun-20 08:42 UTC

[R] Create variables with common values for each group

Dear all,

sorry, this is surely very basic, but I searched a lot on the
internet and just couldn't find a solution.

The problem is to create new variables in a data frame which
contains both individual and group variables, such as the mean age for a
household. My data frame:



df 

       hhid h.age
1  10010020    23
2  10010020    23
3  10010126    42
4  10010126    60
5  10010142    20
6  10010142    49
7  10010142    52
8  10010150    18
9  10010150    51
10 10010150    28


where hhid is the same number for each household, h.age the age for
each household member. 
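
For anyone who wants to run the suggestions in this thread, the printed data can be rebuilt as below. This is only a sketch: the post shows the printed output alone, so the numeric column types are an assumption.

```r
# Reproducible copy of the data frame printed above.
# Assumption: both columns are numeric; the original post does not say.
df <- data.frame(
  hhid  = c(10010020, 10010020, 10010126, 10010126, 10010142,
            10010142, 10010142, 10010150, 10010150, 10010150),
  h.age = c(23, 23, 42, 60, 20, 49, 52, 18, 51, 28)
)
str(df)
```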

I tried tapply, by(), and aggregate. The best I could get was:

by(df, df$hhid, function(subset) rep(mean(subset$h.age, na.rm = TRUE), nrow(subset)))

df$hhid: 10010020
[1] 23 23
------------------------------------------------------------ 
df$hhid: 10010126
[1] 51 51
------------------------------------------------------------ 
df$hhid: 10010142
[1] 40.33333 40.33333 40.33333
------------------------------------------------------------ 
df$hhid: 10010150
[1] 32.33333 32.33333 32.33333


Now, in principle, I would only have to stack up the mean values, and
this is where I'm stuck. The function aggregate works nicely, and I
could then loop, but I was wondering whether there is a better way to
do that.

My end result should look like this (assigning mean.age to the data frame):



       hhid h.age  mean.age
1  10010020    23     23.00
2  10010020    23     23.00
3  10010126    42     51.00
4  10010126    60     51.00
5  10010142    20     40.33
6  10010142    49     40.33
7  10010142    52     40.33
8  10010150    18     32.33
9  10010150    51     32.33
10 10010150    28     32.33



Cheers, and thanks a lot,


Stephan Lindner




-- 
-----------------------
Stephan Lindner, Dipl.Vw.
1512 Gilbert Ct., V-17
Ann Arbor, Michigan 48105
U.S.A.
Tel.: 001-734-272-2437
E-Mail: lindners at umich.edu

"The prevailing ideas of a time were always only the ideas of the
ruling class" -- Karl Marx

Chuck Cleland

2006-Jun-20 09:02 UTC

You could use aggregate() and then merge() the result with df.
Something like this:

> df.agg <- aggregate(df$h.age, list(hhid = df$hhid), mean)
> names(df.agg)[2] <- "mean.age"
> merge(df, df.agg)
        hhid h.age mean.age
1  10010020    23 23.00000
2  10010020    23 23.00000
3  10010126    42 51.00000
4  10010126    60 51.00000
5  10010142    20 40.33333
6  10010142    49 40.33333
7  10010142    52 40.33333
8  10010150    18 32.33333
9  10010150    51 32.33333
10 10010150    28 32.33333
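
A possible one-step alternative, not proposed in the thread itself, is base R's ave(), which returns a vector as long as its input holding each row's group summary (its default FUN is mean). Unlike merge(), which by default sorts the result on the key column, this leaves the rows of df in their original order. The data frame below is a rebuilt copy of the poster's example and its column types are an assumption.

```r
# Rebuilt copy of the example data (assumed numeric columns).
df <- data.frame(
  hhid  = c(10010020, 10010020, 10010126, 10010126, 10010142,
            10010142, 10010142, 10010150, 10010150, 10010150),
  h.age = c(23, 23, 42, 60, 20, 49, 52, 18, 51, 28)
)

# Group mean repeated for every member of the household.
# The explicit FUN passes na.rm = TRUE, as in the original by() call.
df$mean.age <- ave(df$h.age, df$hhid,
                   FUN = function(x) mean(x, na.rm = TRUE))
df
```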
-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

Dimitris Rizopoulos

2006-Jun-20 09:03 UTC

you can use something like:

dat <- data.frame(hhid = rep(c(10010020, 10010126, 10010142, 
10010150), c(2, 2, 3, 3)), h.age = sample(18:50, 10, TRUE))
###########
dat$mean.age <- rep(tapply(dat$h.age, dat$hhid, mean), 
tapply(dat$h.age, dat$hhid, length))
dat
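
One caveat on the rep(tapply(...), tapply(..., length)) idiom above: it lines up with the rows only when each household's rows are contiguous and appear in sorted hhid order, as they do in the example. A sketch of an order-independent variant looks up each row's mean by household name instead. The shuffled data below is hypothetical and serves only to illustrate the point.

```r
# Hypothetical, deliberately unsorted data.
dat <- data.frame(
  hhid  = c(10010142, 10010020, 10010150, 10010126, 10010020, 10010142),
  h.age = c(20, 23, 18, 42, 23, 49)
)

# One mean per household, named by hhid.
grp.means <- tapply(dat$h.age, dat$hhid, mean)

# Index the named means by each row's hhid, so row order does not matter.
dat$mean.age <- as.vector(grp.means[as.character(dat$hhid)])
dat
```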


I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm



Dieter Menne

2006-Jun-20 09:03 UTC

Stephan Lindner <lindners <at> umich.edu> writes:

> The problem is to create new variables from a data frame which
> contains both individual and group variables, such as mean age for an
> household. My data frame:
> 
> df 
> 
>        hhid h.age
> 1  10010020    23
> 2  10010020    23
> ...
> where hhid is the same number for each household, h.age the age for
> each household member. 
> 
> I tried tapply, by(), and aggregate. The best I could get was:
> 
> by(df, df$hhid, function(subset)
> rep(mean(subset$h.age,na.rm=T),nrow(subset)))
> 
> df$hhid: 10010020
> [1] 23 23
> ------------------------------------------------------------ 
> df$hhid: 10010126
> [1] 51 51
try something like 

do.call("rbind",byresult)

As you did not provide a running example, the suggestion is only approximately
correct.
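
A runnable take on this idea, offered as a sketch rather than as what Dieter had in mind: because the households differ in size, rbind() would recycle the shorter vectors into a matrix, so unlist() is used here to stack the per-group vectors instead. The name byresult follows the suggestion above, the data frame is a rebuilt copy of the poster's example, and the assignment is only valid because the rows are already sorted by hhid.

```r
# Rebuilt copy of the example data (assumed numeric columns, sorted by hhid).
df <- data.frame(
  hhid  = c(10010020, 10010020, 10010126, 10010126, 10010142,
            10010142, 10010142, 10010150, 10010150, 10010150),
  h.age = c(23, 23, 42, 60, 20, 49, 52, 18, 51, 28)
)

# The poster's by() call: one vector of repeated means per household.
byresult <- by(df, df$hhid, function(sub)
  rep(mean(sub$h.age, na.rm = TRUE), nrow(sub)))

# Stack the per-household vectors in group order and attach them.
df$mean.age <- unlist(byresult, use.names = FALSE)
df
```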

Dieter
