Hello,

Hope that someone could help me with plotting the longitudinal data below:

    7213  3333330001  0.8300  13.05.09  1
    1     3333330001  0.8700  09.02.05  NULL
    4797  3333330001  0.7700  21.03.07  NULL
    2399  3333330001  0.7800  12.04.06  NULL
    2400  3333330002  NULL    27.03.06  NULL
    7230  3333330002  0.8200  14.05.09  0
    2     3333330002  0.8400  09.02.05  NULL
    4798  3333330002  0.8700  20.03.07  0
    4799  3333330003  0.9000  20.03.07  13
    2401  3333330003  0.9300  27.03.06  16
    3     3333330003  0.8400  10.02.05  NULL
    7233  3333330003  NULL    14.05.09  1
    4     3333330004  0.7200  10.02.05  NULL
    4800  3333330004  0.8900  19.03.07  22
    2402  3333330004  0.7300  29.03.06  27
    7258  3333330004  0.7700  18.05.09  1

The second column is a patient_id, the third is the value I want to plot against the fourth, which is the date.

First I 'aggregate' the patient_ids:

    id<-unique(dat$patient_id)

Then I try (and fail) to create a loop that is supposed to plot the data:

    for(i in 1:8480){patient_id==id[i]plot(date,value)}

What might be wrong?

And how could I plot only, e.g., quintiles for the ones that go down fastest?

Thanks,

Jukka
Hi:

There are several problems; see inline.

On Tue, Sep 7, 2010 at 9:27 AM, Jukka Koskela <jukka.koskela@helsinki.fi> wrote:

> Hello,
>
> Hope that someone could help me plotting longitudinal data below:

Firstly, you want to use NA in place of NULL as the missing value code. This is easy to change in a text editor.

> 7213 3333330001 0.8300 13.05.09 1
> 1 3333330001 0.8700 09.02.05 NULL
> 4797 3333330001 0.7700 21.03.07 NULL
> 2399 3333330001 0.7800 12.04.06 NULL
> 2400 3333330002 NULL 27.03.06 NULL
> 7230 3333330002 0.8200 14.05.09 0
> 2 3333330002 0.8400 09.02.05 NULL
> 4798 3333330002 0.8700 20.03.07 0
> 4799 3333330003 0.9000 20.03.07 13
> 2401 3333330003 0.9300 27.03.06 16
> 3 3333330003 0.8400 10.02.05 NULL
> 7233 3333330003 NULL 14.05.09 1
> 4 3333330004 0.7200 10.02.05 NULL
> 4800 3333330004 0.8900 19.03.07 22
> 2402 3333330004 0.7300 29.03.06 27
> 7258 3333330004 0.7700 18.05.09 1
>
> The second column is a patient_id, the third is the value I want to plot
> against the fourth which is the date.

This isn't hard to do. Below is one approach using package ggplot2, but there are other ways.

    mydat <- read.table(textConnection("
    7213 3333330001 0.8300 13.05.09 1
    1 3333330001 0.8700 09.02.05 NA
    <snip for brevity>
    7258 3333330004 0.7700 18.05.09 1"), header = FALSE)
    closeAllConnections()

    # Convert the date to a Date object
    mydat$date <- as.Date(mydat$V4, format = '%d.%m.%y')
    mydat
        V1         V2   V3       V4 V5       date
    1 7213 3333330001 0.83 13.05.09  1 2009-05-13
    2    1 3333330001 0.87 09.02.05 NA 2005-02-09
    3 4797 3333330001 0.77 21.03.07 NA 2007-03-21
    4 2399 3333330001 0.78 12.04.06 NA 2006-04-12
    5 2400 3333330002   NA 27.03.06 NA 2006-03-27
    ...

    library(ggplot2)
    # One line per patient (V2), coloured by patient, with the colour legend suppressed
    g <- ggplot(mydat, aes(x = date, y = V3, group = V2))
    g + geom_line(aes(colour = V2), size = 1) + scale_colour_gradient(guide = "none")

This gives you a plot of V3 by date for each patient; the lines are broken when NAs are present in either y or x.

> First I 'aggregate' the patient_ids:
>
> id<-unique(dat$patient_id)
>
> Then I try (and fail) to create a loop, that is supposed to plot the data:

The loop below isn't going to work. Are you trying to produce 8480 separate plots by patient, or are you trying to put all 8480 in the same panel? If the latter, I'd strongly encourage you to read about alpha transparency.

> for(i in 1:8480){patient_id==id[i]plot(date,value)}
>
> What might be wrong?

I don't understand what you're driving at in the statement below. Perhaps you could explain what you mean?

> And how could I only plot eg quintiles for the ones that go down fastest?

HTH,
Dennis

> Thanks,
>
> Jukka
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
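For reference, here is a minimal base-R sketch of what the loop in the question seems to be aiming for, under a few assumptions. The line `patient_id==id[i]plot(date,value)` puts two expressions on one line with no separator, which R rejects as a syntax error; even with a separator, the comparison result is thrown away and never used to subset the data, and `1:8480` hard-codes the number of patients. The sketch reads the sample rows with `na.strings = "NULL"` instead of editing the file by hand. The column names `row_id` and `flag` and the helper `plot_patients()` are made up for illustration; `patient_id`, `value` and `date` follow the description in the question.

```r
# The 16 sample rows from the question, unchanged ("NULL" marks a missing value).
txt <- "7213 3333330001 0.8300 13.05.09 1
1 3333330001 0.8700 09.02.05 NULL
4797 3333330001 0.7700 21.03.07 NULL
2399 3333330001 0.7800 12.04.06 NULL
2400 3333330002 NULL 27.03.06 NULL
7230 3333330002 0.8200 14.05.09 0
2 3333330002 0.8400 09.02.05 NULL
4798 3333330002 0.8700 20.03.07 0
4799 3333330003 0.9000 20.03.07 13
2401 3333330003 0.9300 27.03.06 16
3 3333330003 0.8400 10.02.05 NULL
7233 3333330003 NULL 14.05.09 1
4 3333330004 0.7200 10.02.05 NULL
4800 3333330004 0.8900 19.03.07 22
2402 3333330004 0.7300 29.03.06 27
7258 3333330004 0.7700 18.05.09 1"

# na.strings = "NULL" turns the NULL codes into NA while reading.
# row_id and flag are assumed names; the question does not name those columns.
dat <- read.table(text = txt, header = FALSE, na.strings = "NULL",
                  col.names = c("row_id", "patient_id", "value", "date", "flag"),
                  colClasses = c("integer", "character", "numeric",
                                 "character", "integer"))
dat$date <- as.Date(dat$date, format = "%d.%m.%y")

# Hypothetical helper: draw one line per patient on a shared set of axes.
plot_patients <- function(dat) {
  ids <- unique(dat$patient_id)
  # Empty frame spanning all patients, so every line fits on the same axes.
  plot(range(dat$date), range(dat$value, na.rm = TRUE), type = "n",
       xlab = "date", ylab = "value")
  for (i in seq_along(ids)) {
    # Subset to one patient, drop missing values, and sort by date.
    # Dropping NAs joins the neighbouring points instead of breaking the line.
    one <- dat[dat$patient_id == ids[i] & !is.na(dat$value), ]
    one <- one[order(one$date), ]
    lines(one$date, one$value, col = i)
  }
  invisible(length(ids))
}
```

Calling `plot_patients(dat)` draws all patients in one panel. Looping over `seq_along(ids)` rather than `1:8480` means the same code works for the four sample patients and for the full data set.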
Hi,

I read in your data and edited it slightly, so that patient_id is a factor and date is of class Date. Results of dput() are provided for others' benefit.

    dat <- structure(list(
      V1 = c("7213", "1", "4797", "2399", "2400", "7230", "2", "4798",
             "4799", "2401", "3", "7233", "4", "4800", "2402", "7258"),
      patient_id = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
                               3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L),
        .Label = c("3333330001", "3333330002", "3333330003", "3333330004"),
        class = "factor"),
      value = c(0.83, 0.87, 0.77, 0.78, NA, 0.82, 0.84, 0.87,
                0.9, 0.93, 0.84, NA, 0.72, 0.89, 0.73, 0.77),
      date = structure(c(14377, 12823, 13593, 13250, 13234, 14378, 12823, 13592,
                         13592, 13234, 12824, 14378, 12824, 13591, 13236, 14382),
        class = "Date"),
      V5 = c("1", "NULL", "NULL", "NULL", "NULL", "0", "NULL", "0",
             "13", "16", "NULL", "1", "NULL", "22", "27", "1")),
      .Names = c("V1", "patient_id", "value", "date", "V5"),
      row.names = c(NA, -16L), class = "data.frame")

Here are two simple little plots:

    library(ggplot2)
    # One coloured line per patient
    ggplot(data = dat, aes(x = date, y = value, colour = patient_id)) +
      geom_line()

    library(lattice)
    # The same plot with lattice; groups = draws one line per patient
    xyplot(value ~ date, data = dat, groups = patient_id, type = "l")

Cheers,

Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
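The last part of the question (plotting only the patients that "go down fastest") is not answered in the thread, and it is open to interpretation. One possible reading, sketched below purely as an assumption, is to fit a least-squares line of value on date for each patient, rank patients by slope, and keep those in the lowest fifth of slopes. The helper `slope_per_year()` and the variables `slopes`, `cutoff`, `fastest` and `steep` are illustrative names, not anything from the original posts.

```r
# Sample data from the thread (dates parsed from the dd.mm.yy strings).
dat <- data.frame(
  patient_id = rep(c("3333330001", "3333330002", "3333330003", "3333330004"),
                   each = 4),
  value = c(0.83, 0.87, 0.77, 0.78, NA, 0.82, 0.84, 0.87,
            0.90, 0.93, 0.84, NA, 0.72, 0.89, 0.73, 0.77),
  date = as.Date(c("13.05.09", "09.02.05", "21.03.07", "12.04.06",
                   "27.03.06", "14.05.09", "09.02.05", "20.03.07",
                   "20.03.07", "27.03.06", "10.02.05", "14.05.09",
                   "10.02.05", "19.03.07", "29.03.06", "18.05.09"),
                 format = "%d.%m.%y")
)

# Hypothetical helper: least-squares slope of value per year for one patient.
# Returns NA when fewer than two non-missing values are available.
slope_per_year <- function(d) {
  d <- d[!is.na(d$value), ]
  if (nrow(d) < 2) return(NA_real_)
  fit <- lm(value ~ as.numeric(date), data = d)  # slope is per day
  unname(coef(fit)[2]) * 365.25
}

# One slope per patient, named by patient_id.
slopes <- sapply(split(dat, dat$patient_id), slope_per_year)

# Assumption: "go down fastest" means a slope at or below the 20th percentile.
cutoff <- quantile(slopes, probs = 0.2, na.rm = TRUE)
fastest <- names(slopes)[!is.na(slopes) & slopes <= cutoff]

# Rows for the selected patients only; plot these with either approach above,
# e.g. lattice::xyplot(value ~ date, data = steep, groups = patient_id, type = "l")
steep <- dat[dat$patient_id %in% fastest, ]
```

With only four sample patients the quintile cutoff is crude and selects a single patient; on the full set of patients it would select roughly a fifth of them. A straight-line slope is only one possible definition of decline, so the choice is worth checking against what "fastest" is supposed to mean clinically.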