hi, I'm a total noob who is having to ramp up to full speed very quickly due to an unfortunate abrupt staffing change at my job :)

I have longitudinal data that looks like this:

               PID             OBSDATE DaysAgo CleanValue   NAME
1 1410164934000610  8/17/2004 13:03:38    1345        6.2 HGBA1C
2 1410164934000610 11/16/2004 10:39:51    1254        7.1 HGBA1C
...etc

I'd like to end up with a wide-format table like:

PID OBSDATE.1 DaysAgo.1 CleanValue.1 [...] OBSDATE.n DaysAgo.n CleanValue.n

The problem: Every patient's on a different schedule, so there's no natural timevar value to reshape with.

My solution involved creating another column for the timevar and looping through the dataframe to populate that column with correct values; however, it's a pretty big table and so the loop runs really slowly. I need to do this particular operation on quite a few datafiles on a regular basis, and my database keeps getting bigger, so I anticipate this solution won't be ideal in the future.

Can anyone recommend a more efficient solution?

--
View this message in context: http://www.nabble.com/reshaping-%22long-form%22-longitudinal-data-from-sql-query-tp16851308p16851308.html
Sent from the R help mailing list archive at Nabble.com.
Gabor Grothendieck
2008-Apr-24 19:37 UTC
[R] reshaping "long-form" longitudinal data from sql query
If DF is your data frame then create a time column like this:

    DF$time <- ave(DF$DaysAgo, DF$PID, FUN = seq_along)

and now use the reshape command on DF (or melt/cast from the reshape package).

On Thu, Apr 24, 2008 at 2:13 PM, Tubin <sredmonson at yahoo.com> wrote:
> The problem: Every patient's on a different schedule, so there's no natural
> timevar value to reshape with.
>
> Can anyone recommend a more efficient solution?
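
For anyone following along, here is a minimal sketch of how the suggestion above might look end to end in base R. It is not from the original reply. It assumes the column names from the question, stores PID as character, and uses a small toy data frame in which everything beyond the two rows quoted in the question is made up for illustration. Because ave() numbers rows in the order they appear within each PID, the sketch sorts by PID and descending DaysAgo first so that time 1 is the oldest observation.

```r
# Toy data in the long layout from the question. Only the first two rows
# come from the post; the second patient ("2") is invented for illustration.
DF <- data.frame(
  PID        = c("1410164934000610", "1410164934000610", "2", "2", "2"),
  OBSDATE    = c("8/17/2004 13:03:38", "11/16/2004 10:39:51",
                 "1/5/2005 09:00:00", "3/2/2005 09:30:00",
                 "6/9/2005 10:15:00"),
  DaysAgo    = c(1345, 1254, 1204, 1148, 1049),
  CleanValue = c(6.2, 7.1, 5.9, 6.4, 6.0),
  NAME       = "HGBA1C",
  stringsAsFactors = FALSE
)

# Sort so that, within each patient, the oldest observation comes first
# (largest DaysAgo = furthest in the past).
DF <- DF[order(DF$PID, -DF$DaysAgo), ]

# Per-patient running index 1, 2, 3, ... with no explicit loop.
DF$time <- ave(DF$DaysAgo, DF$PID, FUN = seq_along)

# Long -> wide: one row per PID, columns OBSDATE.1, DaysAgo.1, CleanValue.1, ...
# NAME is dropped here because the toy data holds a single test type.
wide <- reshape(
  DF[, c("PID", "time", "OBSDATE", "DaysAgo", "CleanValue")],
  idvar     = "PID",
  timevar   = "time",
  direction = "wide"
)

print(wide)
```

Patients with fewer observations than the maximum get NA in the trailing columns. If the file holds more than one test type, one option (again an assumption about the data, not something stated in the thread) is to number within both keys, e.g. ave(DF$DaysAgo, DF$PID, DF$NAME, FUN = seq_along), and pass idvar = c("PID", "NAME") to reshape.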