Chad Danyluck
2015-Feb-19 20:32 UTC
[R] Averaging column scores when participants vary in number of observations
I have a data set that includes the identity of a number of Video Coders who scored participants' behaviors in a video. Every participant was scored once, but some participants were randomly assigned to have their data scored twice so I could calculate inter-rater reliabilities. I have completed the reliability analyses and want to use the average score for participants who had their behavior coded twice. I'd like to create a 'for loop' or function that allows me to calculate these column means iteratively because the number of observations is quite large (N = 168). Given the organization of the data, with some participants on multiple rows, I am not sure how to proceed.

The original data looks something like this:

               Participant ID   Video Coder   Score
Observation A        1            Donald        4
Observation B        1            Tracy         5
Observation C        2            Donald        6
Observation D        3            Sam           2
Observation E        3            Tracy         3
Observation F        4            Donald        2
Observation G        4            Tracy         1
Observation H        5            Sam           8

When the data processing is completed, I would like the new data set to look like this:

Participant ID   Score
      1           4.5
      2           6
      3           2.5
      4           1.5
      5           8

Any tips or suggestions would be appreciated.

Kind regards,

Chad

--
Chad M. Danyluck, MA
PhD Candidate, Psychology
University of Toronto

"There is nothing either good or bad but thinking makes it so." - William Shakespeare
Nordlund, Dan (DSHS/RDA)
2015-Feb-19 21:58 UTC
[R] Averaging column scores when participants vary in number of observations
How about something like

    aggregate(Score ~ Participant_ID, data=rating, mean)

Hope this is helpful,

Dan

Daniel J. Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services
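For anyone wanting to try the aggregate() suggestion end to end, here is a minimal sketch. It assumes the data frame is called rating and that the ID column is named Participant_ID; both names are assumptions (the original post shows "Participant ID" with a space), and the data are retyped from the example table in the question.

```r
# Hypothetical reconstruction of the example data from the original post.
# The object name `rating` and the underscore column names are assumptions.
rating <- data.frame(
  Participant_ID = c(1, 1, 2, 3, 3, 4, 4, 5),
  Video_Coder    = c("Donald", "Tracy", "Donald", "Sam",
                     "Tracy", "Donald", "Tracy", "Sam"),
  Score          = c(4, 5, 6, 2, 3, 2, 1, 8)
)

# One row per participant: the mean is taken over however many
# coders scored that participant (one or two here).
avg <- aggregate(Score ~ Participant_ID, data = rating, FUN = mean)
print(avg)
```

No explicit loop is needed: aggregate() groups the rows by Participant_ID itself, so participants with one row and participants with two rows are handled the same way.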
JS Huang
2015-Feb-20 01:36 UTC
[R] Averaging column scores when participants vary in number of observations
Hi,

Another implementation:

> data1
  Observation Participant.ID Video.Coder Score
1           A              1      Donald     4
2           B              1       Tracy     5
3           C              2      Donald     6
4           D              3         Sam     2
5           E              3       Tracy     3
6           F              4      Donald     2
7           G              4       Tracy     1
8           H              5         Sam     8
> tapply(data1$Score,data1$Participant.ID,mean)
  1   2   3   4   5
4.5 6.0 2.5 1.5 8.0
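Since tapply() returns a named vector rather than a data frame, one possible way to get back to the two-column layout asked for in the original question is sketched below. The result object and its construction are illustrative assumptions, not part of the reply above; data1 is retyped from the printed output.

```r
# Rebuild data1 as printed in the reply above.
data1 <- data.frame(
  Observation    = LETTERS[1:8],
  Participant.ID = c(1, 1, 2, 3, 3, 4, 4, 5),
  Video.Coder    = c("Donald", "Tracy", "Donald", "Sam",
                     "Tracy", "Donald", "Tracy", "Sam"),
  Score          = c(4, 5, 6, 2, 3, 2, 1, 8)
)

# Named vector of per-participant means; the names are the participant IDs.
means <- tapply(data1$Score, data1$Participant.ID, mean)

# Assumed follow-up step: turn the named vector into a data frame
# with one row per participant.
result <- data.frame(
  Participant.ID = as.integer(names(means)),
  Score          = as.vector(means)
)
print(result)
```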