You can also combine the data frames into a single one and use xtabs:

ID <- names(mylist)
mylist <- Map(data.frame, mylist, dfn=ID)
mydf <- do.call(rbind, mylist)
mydf$Family <- factor(mydf$Family, levels=sort(levels(mydf$Family)))
xtabs(Hits~Family+dfn, mydf)
#       dfn
# Family  A  B  C
#      a  0  3  0
#      c  1  1  0
#      d  2  0  0
#      e  3  0  0
#      f  0  4  5
#      o  0  0  4
#      q  0  0 10

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Lemon
Sent: Thursday, February 23, 2017 6:00 PM
To: André Luis Neves <andrluis at ualberta.ca>; r-help mailing list <r-help at r-project.org>
Subject: Re: [R] Help with data management

Hi Andre,
As far as I am aware, merges can only be accomplished between two data
frames, so I think you would have to do it one by one. It is probably
possible to program this to operate on your list of data frames, but I
suspect that it would take as much time as a bit of copying and
pasting. If your data is being extracted from an external database, it
may be possible to perform the operation in SQL, I don't have the time
to work that out at the moment.

Jim

On Fri, Feb 24, 2017 at 10:53 AM, André Luis Neves <andrluis at ualberta.ca> wrote:
> Hi, Jim:
>
> Your code worked great, but I have 48 dataframes. After merging A and B in
> D, you merged C in D. In this case, do I need to add them one by one until
> getting the 48 dataframes merged in one?
>
> Thank you for your great help.
>
> Andre
>
> On Thu, Feb 23, 2017 at 4:24 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
>>
>> Hi Andre,
>> This might do it:
>>
>> A<-data.frame(c("c", "d", "e"),4.4:6.8,c(1,2,3))
>> colnames(A) <- c ("Family", "NormalizedCount", "Hits")
>> B<-data.frame(c("c", "f", "a"),c(3.2,6.4, 4.4), c(1,4,3))
>> colnames(B) <- c ("Family", "NormalizedCount", "Hits")
>> C<-data.frame(c("q", "o", "f"),c(7.2,9.4, 41.4), c(10,4,5))
>> colnames(C) <- c ("Family", "NormalizedCount", "Hits")
>> keepcols<-c("Family","Hits")
>> D<-merge(A[,keepcols],B[,keepcols],by="Family",all=TRUE)
>> D<-merge(D,C[,keepcols],by="Family",all=TRUE)
>> D[,2:4]<-sapply(D[,-1],function(x) { x[is.na(x)]<-0; x })
>> names(D)<-c("Family","A","B","C")
>>
>> Jim
>>
>>
>> On Fri, Feb 24, 2017 at 9:37 AM, André Luis Neves <andrluis at ualberta.ca>
>> wrote:
>> > Dear R users,
>> >
>> > I have the following dataframes (A, B, and C) stored in a list:
>> >
>> > A= data.frame(c("c", "d", "e"),4.4:6.8,c(1,2,3))
>> > colnames(A) <- c ("Family", "NormalizedCount", "Hits")
>> > A
>> >
>> > B= data.frame(c("c", "f", "a"),c(3.2,6.4, 4.4), c(1,4,3))
>> > colnames(B) <- c ("Family", "NormalizedCount", "Hits")
>> > B
>> >
>> > C= data.frame(c("q", "o", "f"),c(7.2,9.4, 41.4), c(10,4,5))
>> > colnames(C) <- c ("Family", "NormalizedCount", "Hits")
>> > C
>> >
>> > mylist <- list(A=A,B=B,C=C)
>> > mylist
>> >
>> > My idea is to merge the three dataframes into another dataframe (let's
>> > name it: 'D') with a structure in which the rows are the Families and
>> > columns the "Hits" of each family detected in the dataframes A, B, and C.
>> > If a given 'Family' does NOT have a 'Hit' in the dataframe we need to
>> > assign number 0 to it.
>> >
>> > The dataframe 'D' would need to be populated as follows:
>> >
>> > Family   A   B   C
>> > c        1   1   0
>> > d        2   0   0
>> > e        3   0   0
>> > f        0   4   5
>> > a        0   3   0
>> > q        0   0  10
>> > o        0   0   4
>> >
>> > Thank you very much for your great help,
>> >
>> > --
>> > Andre
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Andre
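
On the question of merging 48 data frames one by one: merge() does take only two data frames at a time, but Reduce() can apply it repeatedly across a list. The following is a minimal sketch rather than anything posted in the thread. It assumes the list is named and that every data frame has Family and Hits columns; the names hits_only and hit_cols are hypothetical, and the columns are named directly in data.frame() for brevity.

```r
# Example data from the original post (4.4:6.8 written out as 4.4, 5.4, 6.4)
A <- data.frame(Family = c("c", "d", "e"), NormalizedCount = c(4.4, 5.4, 6.4), Hits = c(1, 2, 3))
B <- data.frame(Family = c("c", "f", "a"), NormalizedCount = c(3.2, 6.4, 4.4), Hits = c(1, 4, 3))
C <- data.frame(Family = c("q", "o", "f"), NormalizedCount = c(7.2, 9.4, 41.4), Hits = c(10, 4, 5))
mylist <- list(A = A, B = B, C = C)

# Keep Family and Hits only, and rename Hits to the list element's name
# so the merged columns do not collide (hits_only is a hypothetical name)
hits_only <- Map(function(df, nm) setNames(df[, c("Family", "Hits")], c("Family", nm)),
                 mylist, names(mylist))

# Fold merge() over the list: an outer join on Family, two data frames at a time
D <- Reduce(function(x, y) merge(x, y, by = "Family", all = TRUE), hits_only)

# Families missing from a data frame come back as NA; the request was for 0
hit_cols <- names(mylist)
D[hit_cols][is.na(D[hit_cols])] <- 0
D
```

The same code should work unchanged for a list of 48 data frames, since nothing in it refers to A, B, or C by name after the list is built.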
Hi, David:

Thank you so much for your answer.

I just added some commands and got what I wanted.

The final command would be something like this:

A= data.frame(c("c", "d", "e"),4.4:6.8,c(1,2,3))
colnames(A) <- c ("Family", "NormalizedCount", "Hits")
A

B= data.frame(c("c", "f", "a"),c(3.2,6.4, 4.4), c(1,4,3))
colnames(B) <- c ("Family", "NormalizedCount", "Hits")
B

C= data.frame(c("q", "o", "f"),c(7.2,9.4, 41.4), c(10,4,5))
colnames(C) <- c ("Family", "NormalizedCount", "Hits")
C

mylist <- list(A=A,B=B,C=C)
mylist

ID <- names(mylist)
mylist <- Map(data.frame, mylist, dfn=ID)
mydf <- do.call(rbind, mylist)
mydf$Family <- factor(mydf$Family, levels=sort(levels(mydf$Family)))
z <- xtabs(Hits~Family+dfn, mydf)
x <- as.data.frame(z)
x

library(reshape2)
y <- dcast(x, Family ~ dfn, value.var = "Freq")
y

Thank you very much.

Andre
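
A note on portability of the script above: the line that re-levels Family relies on data.frame() having turned the strings into a factor, which was the default when this thread was written. Since R 4.0.0 data.frame() leaves strings as character by default, so levels(mydf$Family) returns NULL and that line presumably no longer does what was intended. The sketch below is one way to make the step independent of that default; it is an assumption about what a version-robust variant could look like, not code from the thread, and the name tagged is hypothetical.

```r
# Example data from the original post, with columns named in data.frame()
A <- data.frame(Family = c("c", "d", "e"), NormalizedCount = c(4.4, 5.4, 6.4), Hits = c(1, 2, 3))
B <- data.frame(Family = c("c", "f", "a"), NormalizedCount = c(3.2, 6.4, 4.4), Hits = c(1, 4, 3))
C <- data.frame(Family = c("q", "o", "f"), NormalizedCount = c(7.2, 9.4, 41.4), Hits = c(10, 4, 5))
mylist <- list(A = A, B = B, C = C)

# Tag every data frame with its list name, then stack them
tagged <- Map(data.frame, mylist, dfn = names(mylist))
mydf <- do.call(rbind, tagged)

# Build the levels from the values themselves, so this works whether
# Family arrived as character (R >= 4.0.0 default) or as a factor
mydf$Family <- factor(mydf$Family, levels = sort(unique(as.character(mydf$Family))))

# Cross-tabulate: rows are families, columns are source data frames,
# cells are summed Hits (0 where a family is absent)
z <- xtabs(Hits ~ Family + dfn, mydf)
z
```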
You can also get there without reshape2:

z <- xtabs(Hits~Family+dfn, mydf)
x <- as.data.frame.matrix(z) # Convert the table without changing the format
y <- data.frame(Family=dimnames(z)$Family, as.data.frame.matrix(z)) # Add Family column
rownames(y) <- NULL # Optional; replaces the row names with row numbers
str(y)
# 'data.frame': 7 obs. of 4 variables:
#  $ Family: Factor w/ 7 levels "a","c","d","e",..: 1 2 3 4 5 6 7
#  $ A     : num 0 1 2 3 0 0 0
#  $ B     : num 3 1 0 0 4 0 0
#  $ C     : num 0 0 0 0 5 4 10

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
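
The reason the two routes differ may be easier to see on a tiny table. as.data.frame() applied to a table gives the long layout (one row per Family/dfn pair, with the count in Freq), which is why the earlier script needed dcast() to spread it out again, while as.data.frame.matrix() keeps the wide layout of the table itself. A minimal sketch with made-up data; the names hits, long, and wide are hypothetical.

```r
# Small made-up example: two source data frames (A, B), three families
hits <- data.frame(Family = c("c", "d", "c", "f"),
                   dfn    = c("A", "A", "B", "B"),
                   Hits   = c(1, 2, 1, 4))
z <- xtabs(Hits ~ Family + dfn, hits)

# Long layout: one row per Family/dfn combination, counts in column Freq
long <- as.data.frame(z)
names(long)

# Wide layout: one row per Family (as row names), one column per dfn
wide <- as.data.frame.matrix(z)
dim(wide)
```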