Eric Fail
2011-Dec-07 09:02 UTC
[R] plotting and coloring longitudinal data with three time points (ggplot2)
Dear list,

I have been struggling with this for some time now, and for the last hour I have been struggling to make a working example for the list. I hope someone out there has some experience with plotting longitudinal data that they will share.

My data is patient data with three different time stamps. First, the patients are identified at different times (time stamp 1). Second, they go through an assessment phase and begin their treatment (time stamp 2). Finally, they are admitted from the hospital at some point (time stamp 3).

I would like to make a spaghetti plot with the assessment phase in one color and the treatment phase in another color.

I used ggplot2, and with this example data and only two time points it works fine (I call it my working example):

library(ggplot2)
df <- data.frame(
  date = seq(Sys.Date(), len=104, by="1 day")[sample(104, 52)],
  patient = factor(rep(1:26, 2), labels = LETTERS)
)
df <- df[order(df$date), ]
dt <- qplot(date, patient, data=df, geom="line")
dt + scale_x_date()
df[ which(df$patient=='E'), c("patient", "date")]

But if I have three time points, R, for some reason I do not yet understand, adds the two second time points in some funny way.

Finally, when that is solved, how do I colorize the different parts of the line so the assessment phase gets one color and the treatment phase another?

I want to be able to show how long we have been in contact with our patients, how much of the contact time was assessment and how much was actual treatment.

Below is an example (I call it the not-working example):

df2 <- data.frame(
  date2 = seq(Sys.Date(), len= 156, by="2 day")[sample(156, 78)],
  patient2 = factor(rep(1:26, 3), labels = LETTERS)
)
df2 <- df2[order(df2$date2), ]
dt2 <- qplot(date2, patient2, data=df2, geom="line")
dt2 + scale_x_date(major="months", minor="weeks")
df2[ which(df2$patient2=='B'), c("patient2", "date2")]

If someone can point me in a direction, tell me what I am doing wrong, or knows of some amazing package for plotting longitudinal data, I would be very grateful.

Thanks,
Eric
Jim Lemon
2011-Dec-07 10:57 UTC
On 12/07/2011 08:02 PM, Eric Fail wrote:
> [...]
> If someone can point me in a direction or tell me what I am doing wrong or if there is some amazing package for plotting longitudinal data I would be very grateful.

Hi Eric,

Try this, I think it does more or less what you want. I tried to work this out with matplot, but couldn't.

library(plotrix)
# assumes base_dates, dates2 and dates3 are Date vectors of length 26 (one entry per patient)
df2 <- data.frame(dates = c(base_dates, dates2, dates3),
                  patients = rep(LETTERS, 3),
                  occasion = rep(c("Assessment", "Treatment", "Hospital"), each = 26))
plot(df2$dates, as.numeric(factor(df2$patients)),
     main = "Dates of treatment stages by patient",
     xlab = "Date", ylab = "Patient", axes = FALSE,
     pch = rep(c("A", "T", "H"), each = 26))
axis.dates <- c("2011-01-01", "2011-03-01", "2011-05-01", "2011-07-01",
                "2011-09-01", "2011-11-01")
axis(1, at = as.Date(axis.dates, "%Y-%m-%d"), labels = axis.dates)
staxlab(2, at = 1:26, labels = LETTERS)
box()
# rows i, i + 26 and i + 52 hold the three dates for patient i
for (i in 1:26) {
  lines(df2$dates[c(i, i + 26)], c(i, i), col = 2)       # first to second date
  lines(df2$dates[c(i + 26, i + 52)], c(i, i), col = 3)  # second to third date
}

Jim
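The snippet in this reply refers to base_dates, dates2 and dates3, which are not defined in the message as archived. The following is a minimal sketch, not Jim's original data: it assumes each of the three is a vector of 26 dates, one per patient and in visit order, and fills them with made-up values purely for illustration. It uses base graphics only (segments() in place of the loop over lines(), and axis() in place of plotrix's staxlab()), so it should run without extra packages.

```r
set.seed(1)  # illustrative data only; none of these dates come from the thread

# Hypothetical stand-ins for the three undefined date vectors
base_dates <- as.Date("2011-01-01") + sample(0:60, 26, replace = TRUE)  # time stamp 1
dates2     <- base_dates + sample(10:60, 26, replace = TRUE)            # time stamp 2
dates3     <- dates2 + sample(10:120, 26, replace = TRUE)               # time stamp 3

# Long layout as in the reply: rows 1-26, 27-52 and 53-78 hold the three occasions
df2 <- data.frame(
  dates    = c(base_dates, dates2, dates3),
  patients = rep(LETTERS, 3),
  occasion = rep(c("Assessment", "Treatment", "Hospital"), each = 26)
)

# Empty plot frame with one row per patient
plot(df2$dates, as.numeric(factor(df2$patients)), type = "n", yaxt = "n",
     main = "Dates of treatment stages by patient", xlab = "Date", ylab = "Patient")
axis(2, at = 1:26, labels = LETTERS, las = 1, cex.axis = 0.6)

# One horizontal segment per patient and phase, coloured by phase
segments(df2$dates[1:26],  1:26, df2$dates[27:52], 1:26, col = 2, lwd = 2)
segments(df2$dates[27:52], 1:26, df2$dates[53:78], 1:26, col = 3, lwd = 2)
legend("bottomright", legend = c("Assessment", "Treatment"), col = 2:3, lwd = 2)
```

Because the second and third vectors are built by adding positive offsets, every patient's dates are guaranteed to be in order here; real data would need that checked first.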
Hadley Wickham
2011-Dec-07 14:01 UTC
On Wed, Dec 7, 2011 at 4:02 AM, Eric Fail <eric.fail at gmx.us> wrote:
> [...]
> df2 <- df2[order(df2$date2), ]
> dt2 <- qplot(date2, patient2, data=df2, geom="line")
> dt2 + scale_x_date(major="months", minor="weeks")
> df2[ which(df2$patient2=='B'), c("patient2", "date2")]

Did you mean something like this?

library(ggplot2)
library(plyr)

df2 <- data.frame(
  date2 = seq(Sys.Date(), len = 156, by = "2 day")[sample(156, 78)],
  patient2 = factor(rep(1:26, 3), labels = LETTERS)
)

# number each patient's visits in date order (rank() gives each row's position)
df2 <- ddply(df2, "patient2", mutate, visit = rank(date2))

qplot(date2, patient2, data = df2, geom = "line") +
  geom_point(aes(colour = factor(visit)))

# or this?

library(ggplot2)
library(plyr)

df2 <- data.frame(
  date2 = seq(Sys.Date(), len = 156, by = "2 day")[sample(156, 78)],
  patient2 = factor(rep(1:26, 3), labels = LETTERS)
)

df2 <- ddply(df2, "patient2", mutate, visit = rank(date2))

qplot(date2, patient2, data = df2, geom = "line",
  colour = factor(visit), group = patient2)

# Obviously the lines are drawn between the observations, so you only see the first two visits.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
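One way to get a separate colour per phase, sketched here as an alternative and not taken from any reply in the thread, is to store each phase as its own row with a start and an end date and draw it with geom_segment(). The dates below are made up, the names t1, t2, t3 and phases are hypothetical, and the date_breaks/date_labels arguments assume ggplot2 2.0.0 or later (the major=/minor= arguments in the original post belong to the 2011 release).

```r
library(ggplot2)

set.seed(42)  # made-up dates, for illustration only
n  <- 26
t1 <- as.Date("2011-12-07") + sample(0:60, n, replace = TRUE)  # time stamp 1: identified
t2 <- t1 + sample(7:60, n, replace = TRUE)                     # time stamp 2: treatment begins
t3 <- t2 + sample(14:120, n, replace = TRUE)                   # time stamp 3

# One row per patient and phase, with explicit start and end dates
phases <- data.frame(
  patient = factor(rep(LETTERS, 2), levels = LETTERS),
  phase   = rep(c("Assessment", "Treatment"), each = n),
  start   = c(t1, t2),
  end     = c(t2, t3)
)
phases$days <- as.numeric(phases$end - phases$start)  # length of each phase in days

# Each phase is its own segment, so each can take its own colour
p <- ggplot(phases, aes(x = start, xend = end, y = patient, yend = patient,
                        colour = phase)) +
  geom_segment() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y") +
  labs(x = "Date", y = "Patient", colour = "Phase")
print(p)

# Total contact time per phase, summed over all patients
aggregate(days ~ phase, data = phases, FUN = sum)
```

With this layout the plot shows how long each patient has been in contact, and the days column splits that time into assessment and treatment without any further reshaping.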