R. Mark Sharp
2009-Jan-27  15:36 UTC
[R] Creating list or numeric vectors out of selected columns of row oriented data
I am just assuming this can be done, but I have not gotten close to making it happen. I have a data file with about 1 million rows with 1470 unique subjects. Each row represents a small set of observations made on a specific date for a single subject. I would like to transform the data so that I have an R object with a single entry for each subject and start date and vectors for the observation dates and the observations.

The data are something like the following where for each subject the subject_id does not change and the start_date does not change, but the observation_date and the three different observations change between rows. (There is one row for each day for each subject over a three year period although some entered the study late):

'subject_id', 'start_date','observation_date','weight_obs', 'activity_obs','calories_obs'
1,'1/1/2005','1/1/2005',3.26,'a',93
1,'1/1/2005','1/2/2005',3.22,'o',85
1,'1/1/2005','1/3/2005',3.28,'o',91
...
1,'1/1/2005','12/31/2008',4.38,'h',102
2,'2/13/2005','2/13/2005',3.02,'l',80
2,'2/13/2005','2/14/2005',3.08,'j',85
...

Any guidance is appreciated.

R. Mark Sharp, Ph.D.
Director of Primate Records Database
Southwest National Primate Center
Southwest Foundation for Biomedical Research
P.O. Box 760549
San Antonio, TX 78245-0549
Telephone: (210)258-9476
e-mail: msharp@sfbr.org
jim holtman
2009-Jan-27  16:32 UTC
[R] Creating list or numeric vectors out of selected columns of row oriented data
Yes, the data can probably be transformed easily, but we would need an example of what the input looks like, and how variable it might be, to understand how the data would have to be parsed. Are the missing days supposed to be filled with NAs?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
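For readers following the thread, here is a minimal sketch of one way the transformation described in the question might be done in base R. It is not taken from either post. It assumes the file is comma-separated with single-quoted fields and m/d/Y dates, as in the sample rows; the inline `txt` sample, the `dat` and `by_subject` names, and the choice of a named list keyed by subject_id are all illustrative assumptions. Days with no row are simply absent here rather than filled with NAs.

```r
# Hypothetical sample in the layout shown in the question; a real run
# would read the data file instead of this inline text.
txt <- "'subject_id','start_date','observation_date','weight_obs','activity_obs','calories_obs'
1,'1/1/2005','1/1/2005',3.26,'a',93
1,'1/1/2005','1/2/2005',3.22,'o',85
1,'1/1/2005','1/3/2005',3.28,'o',91
2,'2/13/2005','2/13/2005',3.02,'l',80
2,'2/13/2005','2/14/2005',3.08,'j',85"

# quote = "'" tells the reader that fields are wrapped in single quotes.
dat <- read.csv(text = txt, quote = "'", stringsAsFactors = FALSE)

# Assumption: dates are month/day/year, as the sample rows suggest.
dat$start_date       <- as.Date(dat$start_date,       format = "%m/%d/%Y")
dat$observation_date <- as.Date(dat$observation_date, format = "%m/%d/%Y")

# One list entry per subject: scalar id and start date, plus vectors
# of observation dates and observations ordered by date.
by_subject <- lapply(split(dat, dat$subject_id), function(d) {
  d <- d[order(d$observation_date), ]
  list(
    subject_id       = d$subject_id[1],
    start_date       = d$start_date[1],
    observation_date = d$observation_date,
    weight_obs       = d$weight_obs,
    activity_obs     = d$activity_obs,
    calories_obs     = d$calories_obs
  )
})

# Example access: the weight vector for subject 2.
by_subject[["2"]]$weight_obs
```

With about a million rows, `split()` on a data frame should still be workable, but reading the file with explicit `colClasses` may help with load time. Whether a list of lists is the right target structure depends on what is done with the vectors afterwards.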