Julia Moeller
2011-Aug-12 15:10 UTC
[R] recode Variable in dependence of values of two other variables
Hi,
As an R beginner, I have a recoding problem and hope you can help me:
I am working on an SPSS dataset, which I loaded into R (load("C:/...)
I have two existing variables, "ID" and "X",
and one variable to be computed: meanX.dependID (= mean of X over all rows
in which ID has the same value)
ID = subject ID. Since it is a longitudinal dataset, there are repeated
measurement points for each subject, each of which appears in a new row.
So, each ID value appears in many rows. (e.g. ID ==1 in row 1:5; ID ==2
in rows 6:8 etc).
Now: for all rows in which ID has a certain value, meanX.dependID should
be the mean of X over those rows. How can I automate that without
having to specify the row numbers each time?
e.g.
ID   X   meanX.dependID
1    2   2.25
1    3   2.25
1    1   2.25
1    3   2.25
2    5   3.3
2    2   3.3
2    3   3.3
3    4   3
3    1   3
3    2   3
3    3   3
3    4   3
3    5   3
Thanks a lot! Hope this is the right place to post, if not, please tell me!
best,
Julia
Mikhail Titov
2011-Aug-12 17:05 UTC
[R] recode Variable in dependence of values of two other variables
?aggregate

aggregate(X ~ ID, your.data.frame.goes.here, "mean")

Mikhail
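Note that aggregate() returns one row per ID rather than one value per original row. The following is a minimal sketch of how the group means could be merged back onto the data. It assumes the example values from the question; the object names d, means and d2 are made up for illustration. With these values the means for ID 2 and ID 3 come out at about 3.33 and 3.17, so the figures in the question's table appear to be approximate.

```r
# Example data copied from the table in the question
d <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3),
  X  = c(2, 3, 1, 3, 5, 2, 3, 4, 1, 2, 3, 4, 5)
)

# One row per ID, holding the mean of X within that ID
means <- aggregate(X ~ ID, data = d, FUN = mean)
names(means)[2] <- "meanX.dependID"

# Attach the group mean to every original row by matching on ID
d2 <- merge(d, means, by = "ID")
print(d2)
```

merge() sorts the result by ID, so the row order may differ from the input if the data were not already sorted.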
Dennis Murphy
2011-Aug-12 19:49 UTC
[R] recode Variable in dependence of values of two other variables
Hi:
Here are several equivalent ways to produce your desired output:
# The examples assume a data frame `df` with columns `id` and `x`
# (substitute your own names, e.g. ID and X).

# Base R: ave() returns the group mean for every row
df <- transform(df, mean = ave(x, id, FUN = mean))

# plyr package
library('plyr')
ddply(df, .(id), transform, mean = mean(x))

# data.table package
library('data.table')
dt <- data.table(df, key = 'id')
dt[, list(x, mean = mean(x)), by = 'id']

# doBy package
library('doBy')
transformBy(~ id, data = df, mean = mean(x))
HTH,
Dennis
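For reference, the ave() approach can also be written with the column names from the question and no extra packages. This is a minimal sketch, assuming the example values from the question; the data frame name d is made up for illustration.

```r
# Example data copied from the table in the question
d <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3),
  X  = c(2, 3, 1, 3, 5, 2, 3, 4, 1, 2, 3, 4, 5)
)

# ave() computes mean(X) within each ID and repeats it for every row
# of that ID, so the result has the same length and order as d$X
d$meanX.dependID <- ave(d$X, d$ID, FUN = mean)
print(d)
```

If X contains missing values, something like FUN = function(v) mean(v, na.rm = TRUE) could be passed instead, depending on how missing values should be treated.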