Thomas Levine
2008-Jan-24 18:42 UTC
[R] How should I organize data to compare differences in matched pairs?
I'm just learning how to use R right now, so I'm not sure what the most efficient way to organize these data is.

I had subjects perform the same task twice with slight changes between the rounds. I want to analyze differences between the rounds. All of the subjects also answered a questionnaire.

Putting all of one subject's information on one row seems sloppy.

I was thinking about making a three-dimensional array with subject number, round and measurement as axes, but then the differences would have to be the third column in the round axis, which also seemed messy. Also, I would have duplicates of all of the information from the questionnaire, which seems inefficient.

Or maybe I could just use a matrix where round is just another column among all of the measurements. This is similar to the previous arrangement, but I don't know which is better. It still has all of the duplicated information that the previous method has.

Anyway, I'm sure someone's done this before, so I'd like to see what other people have done for data like these.

Thomas Levine
John Kane
2008-Jan-24 19:03 UTC
[R] How should I organize data to compare differences in matched pairs?
Putting all the information in one row is going to be the easiest for data entry and for analysis. If it's easier, you can enter the task information in one data set and the questionnaire in another (with the same unique id) and merge the data sets later if you need to.

All you want is the data in a form that you can read into R, probably as a data.frame. Then do any data manipulations that you want.
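A minimal sketch of the one-row-per-subject layout described above, assuming made-up column names (`id`, `score_round1`, `score_round2`, `age`) and invented values; none of these come from the original post. It keeps the task data and the questionnaire in separate data frames with a shared id, merges them, and then takes the round difference as a simple column subtraction.

```r
# Hypothetical task scores: one row per subject, one column per round
task <- data.frame(
  id           = c(1, 2, 3),
  score_round1 = c(10, 12, 9),
  score_round2 = c(13, 11, 12)
)

# Hypothetical questionnaire answers, keyed by the same unique id
questionnaire <- data.frame(
  id  = c(1, 2, 3),
  age = c(21, 34, 28)
)

# merge() joins the two data sets on the shared "id" column
wide <- merge(task, questionnaire, by = "id")

# With one row per subject, the difference between rounds is a
# plain column subtraction
wide$diff <- wide$score_round2 - wide$score_round1
```

In this layout the questionnaire answers appear once per subject, so nothing is duplicated.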
Greg Snow
2008-Jan-24 19:05 UTC
[R] How should I organize data to compare differences in matched pairs?
Here is how I would do it (there are multiple ways you could do it, so there is no single "Right" answer):

Assign each person a unique identifier.

Put all the information from the questionnaire along with the identifier and anything else that does not change between rounds (age, sex, height, ...) into one data frame. This df will have as many rows as you have subjects.

The round information then goes into a second data frame with each round being a row (each subject has multiple rows) and include the unique identifier on each row for that person.

If you need information combined from both data frames, then use the merge function to merge the 2 data frames (or subsets of them) together.

Advantages of this method include:

Uses data frames which most of the analysis functions expect.
Each piece of data is only entered once (other than the id)

Disadvantage:

Data is split between 2 objects.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
(801) 408-8111
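The two-data-frame arrangement described above might look roughly like this. The column names (`id`, `age`, `sex`, `round`, `score`) and all values are hypothetical, chosen only for illustration.

```r
# One row per subject: questionnaire answers and anything else that
# does not change between rounds (hypothetical columns and values)
subjects <- data.frame(
  id  = c(1, 2, 3),
  age = c(21, 34, 28),
  sex = c("F", "M", "F")
)

# One row per subject per round, carrying the subject's id
rounds <- data.frame(
  id    = c(1, 1, 2, 2, 3, 3),
  round = c(1, 2, 1, 2, 1, 2),
  score = c(10, 13, 12, 11, 9, 12)
)

# Combine only when an analysis needs columns from both data frames;
# the subject-level columns are repeated on each round row of the result
combined <- merge(rounds, subjects, by = "id")
```

The repetition of subject-level values exists only in the merged result, not in the data as entered, which is the advantage the reply points to.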
Thomas Levine
2008-Jan-24 23:23 UTC
[R] How should I organize data to compare differences in matched pairs?
By accident, I didn't send this to the list.

On Thu, 2008-01-24 at 17:54 -0500, Thomas Levine wrote:
> Oh, right, I don't need the differences. I only needed to get the
> differences before because I was doing them sloppily in a spreadsheet
> and needed to do a t-test manually because the program didn't have a
> function for one type of t-test. I shall do it this way then.
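The message does not say which t-test was missing from the spreadsheet program. Assuming it was a paired t-test (the thread is about matched pairs), a small sketch with invented scores shows why the differences need not be stored: `t.test()` with `paired = TRUE` gives the same result as a one-sample test on the differences. The vectors `round1` and `round2` are hypothetical and must list the subjects in the same order.

```r
# Hypothetical scores for five subjects, in the same subject order
round1 <- c(10, 12, 9, 14, 11)
round2 <- c(13, 11, 12, 15, 14)

# Paired t-test computed directly from the two rounds
paired <- t.test(round2, round1, paired = TRUE)

# Equivalent one-sample t-test on the per-subject differences
by_hand <- t.test(round2 - round1)

# Both approaches report the same test statistic and p-value
paired$statistic
by_hand$statistic
```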