I have a dataset that I'm trying to rearrange for a repeated measures analysis.

It looks like:

patient basefev1 fev11h fev12h fev13h fev14h fev15h fev16h fev17h fev18h drug
201     2.46     2.68   2.76   2.50   2.30   2.14   2.40   2.33   2.20   a
202     3.50     3.95   3.65   2.93   2.53   3.04   3.37   3.14   2.62   a
203     1.96     2.28   2.34   2.29   2.43   2.06   2.18   2.28   2.29   a
204     3.44     4.08   3.87   3.79   3.30   3.80   3.24   2.98   2.91   a

And I want to make it look like:

Patient  FEV   time  drug
201      2.46  0     a
201      2.68  1     a
201      2.76  2     a
201      2.50  3     a

And so on. There would be 9 "time" values, and drug is a factor variable.

I know there is a way to do this in R but I cannot remember the function. I've looked at the transpose function in base, but that doesn't seem to be what I want. Can something like this be done easily with existing package functions, or would it require writing something custom? Another program would use something like the transpose procedure, but I'm trying to stay away from that program.

Thanks,

Patrick

R version 2.9.2 (2009-08-24)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] grDevices datasets tcltk splines graphics stats utils methods base

other attached packages:
[1] svSocket_0.9-43 svMisc_0.9.48 TinnR_1.0.3 R2HTML_1.59-1 Hmisc_3.6-1 survival_2.35-4

loaded via a namespace (and not attached):
[1] cluster_1.12.0 grid_2.9.2 lattice_0.17-25 tools_2.9.2
?reshape

On 28/08/2009, at 11:37 AM, Richardson, Patrick wrote:
> I have a dataset that I'm trying to rearrange for a repeated
> measures analysis:
> [...]
I suspect reshape() is the function you're looking for; there is also a reshape package that you might prefer. It's also quite easy to do this in base R using unlist() and some indexing with rep(), but that may be more than you care to deal with.

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Richardson, Patrick
Sent: Thursday, August 27, 2009 4:37 PM
To: r help
Subject: [R] Transform data for repeated measures

I have a dataset that I'm trying to rearrange for a repeated measures analysis:
[...]
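The unlist() and rep() route mentioned above might look roughly like the following. This is only a sketch under a few assumptions: the FEV columns are columns 2 to 10, the data are read with read.table(text = ...) (available in current versions of R), and the names fev.cols, n.pat, n.time and long.df are made up for illustration.

```r
resp.df <- read.table(text = "patient basefev1 fev11h fev12h fev13h fev14h fev15h fev16h fev17h fev18h drug
201 2.46 2.68 2.76 2.50 2.30 2.14 2.40 2.33 2.20 a
202 3.50 3.95 3.65 2.93 2.53 3.04 3.37 3.14 2.62 a
203 1.96 2.28 2.34 2.29 2.43 2.06 2.18 2.28 2.29 a
204 3.44 4.08 3.87 3.79 3.30 3.80 3.24 2.98 2.91 a",
  header = TRUE, stringsAsFactors = TRUE)

fev.cols <- 2:10              # assumed: basefev1 plus the eight hourly readings
n.pat    <- nrow(resp.df)
n.time   <- length(fev.cols)

# unlist() walks the columns in order, so all patients at time 0 come first,
# then all patients at time 1, and so on. rep() is set up to match that order.
long.df <- data.frame(
  patient = rep(resp.df$patient, times = n.time),
  FEV     = unlist(resp.df[, fev.cols], use.names = FALSE),
  time    = rep(0:(n.time - 1), each = n.pat),
  drug    = rep(resp.df$drug, times = n.time)
)

# Sort so that each patient's readings are contiguous, as in the requested layout.
long.df <- long.df[order(long.df$patient, long.df$time), ]
rownames(long.df) <- NULL

head(long.df, 4)
```

The key point is that `times =` repeats the whole patient vector once per time point, while `each =` repeats every time value once per patient, which keeps the three vectors aligned with the column-by-column order of unlist().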
Patrick;

You got two helpful suggestions for where to start learning how to reshape your data. I am going to admit that I have recurring difficulty using either reshape() or the functions in the reshape package. It undoubtedly reflects some sort of "constricted abstraction capability" on my part, but I have learned that sometimes I get what I need from a combination of rep() and the stack() function. The strategy is to rep() the recurring variables and stack() the repeated measures on the same subjects:

resp.df <- read.table(textConnection("patient basefev1 fev11h fev12h fev13h fev14h fev15h fev16h fev17h fev18h drug
201 2.46 2.68 2.76 2.50 2.30 2.14 2.40 2.33 2.20 a
202 3.50 3.95 3.65 2.93 2.53 3.04 3.37 3.14 2.62 a
203 1.96 2.28 2.34 2.29 2.43 2.06 2.18 2.28 2.29 a
204 3.44 4.08 3.87 3.79 3.30 3.80 3.24 2.98 2.91 a"), header=TRUE)
closeAllConnections()

# Columns 2:10 hold the nine FEV1 measurements; patient and drug are
# repeated once per measurement column to line up with the stacked values.
stk.resp <- data.frame(pt.id.=rep(resp.df$patient, 9),
                       FEV1 =stack(resp.df[,2:10]),
                       drug=rep(resp.df$drug, 9))
stk.resp

  pt.id. FEV1.values FEV1.ind drug
1    201        2.46 basefev1    a
2    202        3.50 basefev1    a
3    203        1.96 basefev1    a
4    204        3.44 basefev1    a
5    201        2.68   fev11h    a
6    202        3.95   fev11h    a
7    203        2.28   fev11h    a
8    204        4.08   fev11h    a

snipped further unneeded output

HTH;
David.

On Aug 27, 2009, at 7:37 PM, Richardson, Patrick wrote:
> I have a dataset that I'm trying to rearrange for a repeated
> measures analysis:
> [...]

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
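The stacked result above labels each row with the original column name rather than a numeric time. One possible way to get the 0 to 8 time variable Patrick asked for is to look up each label's position among the FEV columns. The sketch below assumes the columns are already in time order and uses read.table(text = ...) from current R; fev.names and stk are illustrative names, not part of the original post.

```r
resp.df <- read.table(text = "patient basefev1 fev11h fev12h fev13h fev14h fev15h fev16h fev17h fev18h drug
201 2.46 2.68 2.76 2.50 2.30 2.14 2.40 2.33 2.20 a
202 3.50 3.95 3.65 2.93 2.53 3.04 3.37 3.14 2.62 a
203 1.96 2.28 2.34 2.29 2.43 2.06 2.18 2.28 2.29 a
204 3.44 4.08 3.87 3.79 3.30 3.80 3.24 2.98 2.91 a",
  header = TRUE, stringsAsFactors = TRUE)

fev.names <- names(resp.df)[2:10]   # assumed to be in time order: baseline, 1h, ..., 8h
stk <- stack(resp.df[, fev.names])  # two columns: values and ind (the source column name)

stk.resp <- data.frame(
  patient = rep(resp.df$patient, times = length(fev.names)),
  FEV     = stk$values,
  # Position of each label among the FEV columns, shifted so baseline is time 0.
  time    = match(as.character(stk$ind), fev.names) - 1,
  drug    = rep(resp.df$drug, times = length(fev.names))
)

stk.resp[stk.resp$patient == 201, ]
```

Using match() on the names avoids depending on how stack() happens to order the factor levels of ind.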
Using reshape() you can try this:

reshape(resp.df, direction = "long", idvar = "patient",
        varying = list(grep("fev", names(resp.df))))

On Thu, Aug 27, 2009 at 8:37 PM, Richardson, Patrick <Patrick.Richardson@vai.org> wrote:
> I have a dataset that I'm trying to rearrange for a repeated measures
> analysis:
> [...]
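With only idvar and varying supplied, reshape() names the value column after the first wide column (basefev1) and numbers the times from 1. A sketch of the same call with the optional v.names, timevar and times arguments filled in to match the requested layout is below; the choice of "FEV", "time" and 0:8 comes from Patrick's description, and read.table(text = ...) assumes a current version of R.

```r
resp.df <- read.table(text = "patient basefev1 fev11h fev12h fev13h fev14h fev15h fev16h fev17h fev18h drug
201 2.46 2.68 2.76 2.50 2.30 2.14 2.40 2.33 2.20 a
202 3.50 3.95 3.65 2.93 2.53 3.04 3.37 3.14 2.62 a
203 1.96 2.28 2.34 2.29 2.43 2.06 2.18 2.28 2.29 a
204 3.44 4.08 3.87 3.79 3.30 3.80 3.24 2.98 2.91 a",
  header = TRUE, stringsAsFactors = TRUE)

long.df <- reshape(resp.df,
                   direction = "long",
                   idvar     = "patient",
                   varying   = list(grep("fev", names(resp.df), value = TRUE)),
                   v.names   = "FEV",    # name of the single long-format value column
                   timevar   = "time",   # name of the time column
                   times     = 0:8)      # baseline = 0, then hours 1 to 8

# reshape() returns rows grouped by time; reorder so each patient is contiguous.
long.df <- long.df[order(long.df$patient, long.df$time), ]

head(long.df, 4)
```

The drug column is carried along automatically because it is neither an id variable nor listed in varying.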
On Thu, Aug 27, 2009 at 6:37 PM, Richardson, Patrick <Patrick.Richardson at vai.org> wrote:
> I have a dataset that I'm trying to rearrange for a repeated measures analysis:
> [...]

To give a concrete example of what using the reshape package would look like (thanks to David for making your example easily reproducible):

resp.df <- read.table(textConnection("patient basefev1 fev11h fev12h fev13h fev14h fev15h fev16h fev17h fev18h drug
201 2.46 2.68 2.76 2.50 2.30 2.14 2.40 2.33 2.20 a
202 3.50 3.95 3.65 2.93 2.53 3.04 3.37 3.14 2.62 a
203 1.96 2.28 2.34 2.29 2.43 2.06 2.18 2.28 2.29 a
204 3.44 4.08 3.87 3.79 3.30 3.80 3.24 2.98 2.91 a"), header=TRUE)
closeAllConnections()

library(reshape)

# Identify the variables controlled by your experimental design:
stk.resp <- melt(resp.df, id = c("patient", "drug"))

# The measured columns are melted in their original order, so the factor
# codes 1..9 of "variable" correspond to times 0..8. (Stripping all
# non-digits from the names would give 1, 11, ..., 18, because "fev1"
# itself contains a digit.)
stk.resp$time <- as.numeric(stk.resp$variable) - 1

Hadley

--
http://had.co.nz/
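The hour can also be recovered from the column names themselves rather than from their position. The measure name FEV1 contains a digit, so names such as fev11h need the fev1 prefix removed before the hour is read off. A minimal base R sketch follows, assuming the names follow exactly the pattern in Patrick's data; wide.names, hours and time.lookup are hypothetical names used only for illustration.

```r
# The nine measurement columns as they appear in the wide data.
wide.names <- c("basefev1", paste0("fev1", 1:8, "h"))

# Baseline is hour 0; for the rest, drop the "fev1" prefix and the trailing "h".
hours <- ifelse(wide.names == "basefev1",
                "0",
                sub("^fev1([0-9]+)h$", "\\1", wide.names))

# Named numeric vector mapping each column name to its hour.
time.lookup <- setNames(as.numeric(hours), wide.names)
time.lookup
```

With a melted data frame like the one above, something along the lines of time.lookup[as.character(stk.resp$variable)] would then give the time for each row, independent of column order.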