Hi,

I hope that folks can give me some simple approaches to taking the data set below, which is stored in two columns called "long" and "group", and arranging the data in the "long" column into a data frame containing five variables: "Group 1", "Group 2", "Group 3", "Group 4", and "Group 5". I am hoping for a few different techniques which I can pass on to my students.

Thanks

David Arnold
College of the Redwoods

> dput(flies)
structure(list(long = c(40L, 37L, 44L, 47L, 47L, 47L, 68L, 47L,
54L, 61L, 71L, 75L, 89L, 58L, 59L, 62L, 79L, 96L, 58L, 62L, 70L,
72L, 74L, 96L, 75L, 46L, 42L, 65L, 46L, 58L, 42L, 48L, 58L, 50L,
80L, 63L, 65L, 70L, 70L, 72L, 97L, 46L, 56L, 70L, 70L, 72L, 76L,
90L, 76L, 92L, 21L, 40L, 44L, 54L, 36L, 40L, 56L, 60L, 48L, 53L,
60L, 60L, 65L, 68L, 60L, 81L, 81L, 48L, 48L, 56L, 68L, 75L, 81L,
48L, 68L, 35L, 37L, 49L, 46L, 63L, 39L, 46L, 56L, 63L, 65L, 56L,
65L, 70L, 63L, 65L, 70L, 77L, 81L, 86L, 70L, 70L, 77L, 77L, 81L,
77L, 16L, 19L, 19L, 32L, 33L, 33L, 30L, 42L, 42L, 33L, 26L, 30L,
40L, 54L, 34L, 34L, 47L, 47L, 42L, 47L, 54L, 54L, 56L, 60L, 44L
), group = structure(c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("Group 5", "Group 4", "Group 3", "Group 2",
"Group 1"), class = "factor")), .Names = c("long", "group"), row.names = c(NA,
-125L), class = "data.frame")

--
View this message in context: http://r.789695.n4.nabble.com/Arrange-two-columns-into-a-five-variable-dataframe-tp4636503.html
Sent from the R help mailing list archive at Nabble.com.
On 7/13/2012 8:37 PM, darnold wrote:
> Hi,
>
> I hope that folks can give me some simple approaches to taking the data set
> below, which is stored in two columns called "long" and "group", and
> arranging the data in the "long" column into a data frame containing five
> variables: "Group 1", "Group 2", "Group 3", "Group 4", and "Group 5". I am
> hoping for a few different techniques which I can pass on to my students.
>
> Thanks
>
> David Arnold
> College of the Redwoods

Generally I would recommend either the reshape function or the functions in the reshape2 package.
However, your data doesn't quite have what is needed to use those. You are implicitly assuming that the first occurring values in each group go together (should be in the same row), then the second ones, and so on. The reshape functions require an explicit indication of which values go together.

The unstack function will work for you and uses the same assumption.

> unstack(flies)
   Group.5 Group.4 Group.3 Group.2 Group.1
1       16      35      21      46      40
2       19      37      40      42      37
3       19      49      44      65      44
4       32      46      54      46      47
5       33      63      36      58      47
6       33      39      40      42      47
7       30      46      56      48      68
8       42      56      60      58      47
9       42      63      48      50      54
10      33      65      53      80      61
11      26      56      60      63      71
12      30      65      60      65      75
13      40      70      65      70      89
14      54      63      68      70      58
15      34      65      60      72      59
16      34      70      81      97      62
17      47      77      81      46      79
18      47      81      48      56      96
19      42      86      48      70      58
20      47      70      56      70      62
21      54      70      68      72      70
22      54      77      75      76      72
23      56      77      81      90      74
24      60      81      48      76      96
25      44      77      68      92      75

--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University
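To illustrate the point above about the missing row identifier, here is a minimal sketch on a small stand-in data set. The names toy, id, wide, and wide2 are made up for this example, and the six values are simply the first three observations of "Group 1" and "Group 2" from the posted data. It numbers the observations within each group so that reshape() has an explicit key, and then compares the result with unstack().

```r
# Toy stand-in for the flies data (hypothetical: 2 groups x 3 observations).
toy <- data.frame(
  long  = c(40L, 37L, 44L, 46L, 42L, 65L),
  group = factor(rep(c("Group 1", "Group 2"), each = 3))
)

# Make the implicit pairing explicit: number the observations within each group.
toy$id <- ave(seq_along(toy$long), toy$group, FUN = seq_along)

# With an explicit id, base R's reshape() can go from long to wide.
wide <- reshape(toy, idvar = "id", timevar = "group", direction = "wide")
wide$id <- NULL                                   # drop the helper column
names(wide) <- sub("^long\\.", "", names(wide))   # "long.Group 1" -> "Group 1"

# unstack() makes the same first-with-first assumption without needing an id.
wide2 <- unstack(toy, long ~ group)

wide
wide2
```

The ave() step is only needed because the data carry no row key of their own; if the observations were not already in matching order within each group, neither approach would pair them correctly.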
Hi,

You could use any one of these methods:

#Method 1:
#dat1: the data (the flies data frame from the original post)
list1<-split(dat1,dat1$group)
#list1[[5]] holds "Group 1", ..., list1[[1]] holds "Group 5"; [1] keeps only the "long" column
dat2<-data.frame(list1[[5]][1],list1[[4]][1],list1[[3]][1],list1[[2]][1],list1[[1]][1])
colnames(dat2)<-rev(levels(dat1$group))
head(dat2)
  Group 1 Group 2 Group 3 Group 4 Group 5
1      40      46      21      35      16
2      37      42      40      37      19
3      44      65      44      49      19
4      47      46      54      46      32
5      47      58      36      63      33
6      47      42      40      39      33

#Method 2:
#dat1: the data
#reshape() is the base R (stats) function, so no package needs to be loaded
#ID numbers the observations within each group (25 per group, 5 groups)
dat3<-data.frame(dat1,ID=rep(1:25,5))
dat4<-reshape(dat3,idvar="ID",timevar="group",direction="wide")
dat4<-dat4[,-1]
colnames(dat4)<-rev(levels(dat3$group))
head(dat4)
  Group 1 Group 2 Group 3 Group 4 Group 5
1      40      46      21      35      16
2      37      42      40      37      19
3      44      65      44      49      19
4      47      46      54      46      32
5      47      58      36      63      33
6      47      42      40      39      33

#Method 3:
#dat1: the data
dat3<-data.frame(dat1,ID=rep(1:25,5))
library(reshape2)
dat5<-dcast(melt(dat3,id.vars=c("ID","group")),ID~variable+group)
dat5<-dat5[,-1]
#columns come out in factor-level order ("Group 5" first), so rename and then reverse
colnames(dat5)<-levels(dat3$group)
dat5<-dat5[,c(5:1)]
head(dat5)
  Group 1 Group 2 Group 3 Group 4 Group 5
1      40      46      21      35      16
2      37      42      40      37      19
3      44      65      44      49      19
4      47      46      54      46      32
5      47      58      36      63      33
6      47      42      40      39      33

> identical(dat2,dat4)
[1] TRUE
> identical(dat2,dat5)
[1] TRUE

A.K.
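A more compact variant of Method 1, as a sketch only: split the "long" vector directly instead of the whole data frame, then bind the pieces as columns. It is shown on a small hypothetical stand-in for dat1 (2 groups of 3 rows, with the factor levels stored in reverse order as in the posted dput output); the names pieces and wide are made up for this example.

```r
# Toy stand-in for dat1 (hypothetical); levels are deliberately reversed,
# mirroring the original data where "Group 5" is the first level.
dat1 <- data.frame(
  long  = c(40L, 37L, 44L, 46L, 42L, 65L),
  group = factor(rep(c("Group 1", "Group 2"), each = 3),
                 levels = c("Group 2", "Group 1"))
)

# split() returns one vector per factor level, in level order.
pieces <- split(dat1$long, dat1$group)

# data.frame() binds the vectors as columns; check.names = FALSE keeps the
# space in "Group 1" instead of converting the name to "Group.1".
wide <- do.call(data.frame, c(pieces, list(check.names = FALSE)))

# Reorder the columns so that "Group 1" comes first.
wide <- wide[, rev(levels(dat1$group))]
wide
```

This relies on every group having the same number of observations, as the posted data do (25 per group); with unequal group sizes data.frame() would stop with an error.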
----- Original Message -----
From: darnold <dwarnold45 at suddenlink.net>
To: r-help at r-project.org
Sent: Friday, July 13, 2012 11:37 PM
Subject: [R] Arrange two columns into a five variable dataframe

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.