Hi there,

I have a list of data frames (generated by reading multiple files), and all data frames are comparable in dimension and column names. They also have a common column, which I'd like to use for merging. To give a simple example of what I have:

df1 <- data.frame(c(LETTERS[1:5]), c(2,6,3,1,9))
names(df1) <- c("pos", "data")
df3 <- df2 <- df1
df2$data <- c(6,2,9,7,5)
df3$data <- c(9,3,6,2,1)
mylist <- list(df1,df2,df3)
names(mylist) <- c("df1","df2","df3")

> mylist
$df1
  pos data
1   A    2
2   B    6
3   C    3
4   D    1
5   E    9

$df2
  pos data
1   A    6
2   B    2
3   C    9
4   D    7
5   E    5

$df3
  pos data
1   A    9
2   B    3
3   C    6
4   D    2
5   E    1

If I use do.call("cbind"), I end up with something like this:

  pos data pos data pos data
1   A    2   A    6   A    9
2   B    6   B    2   B    3
3   C    3   C    9   C    6
4   D    1   D    7   D    2
5   E    9   E    5   E    1

But now I no longer know which data comes from which data frame, and I have the column "pos" multiple times.

Instead, I'd like to have it like this:

  pos df1 df2 df3
1   A   2   6   9
2   B   6   2   3
3   C   3   9   6
4   D   1   7   2
5   E   9   5   1

How can I achieve this? (The list I'm working with has more than the 3 data frames in my example, so I need to automate it.)

Antje
What version of R are you using? I get this:

> do.call(cbind, mylist)
  df1.pos df1.data df2.pos df2.data df3.pos df3.data
1       A        2       A        6       A        9
2       B        6       B        2       B        3
3       C        3       C        9       C        6
4       D        1       D        7       D        2
5       E        9       E        5       E        1
> R.version.string
[1] "R version 2.8.1 Patched (2008-12-26 r47350)"

In which case:

> ALL <- do.call(cbind, mylist)
> ALL <- ALL[regexpr("data", names(ALL)) > 0]
> names(ALL) <- sub("[.].*", "", names(ALL))
> ALL
  df1 df2 df3
1   2   6   9
2   6   2   3
3   3   9   6
4   1   7   2
5   9   5   1

On Wed, Jan 21, 2009 at 3:19 AM, Antje <niederlein-rstat at yahoo.de> wrote:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Try this also:

cbind(pos = mylist$df1$pos,
      data.frame(mylist)[grep("data", names(data.frame(mylist)))])

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
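Both cbind-based suggestions line rows up purely by position, so they only give the right answer if every data frame lists "pos" in the same order. Here is a minimal sketch that makes this assumption explicit and also produces the plain df1/df2/df3 column names asked for. The check and the names `same_pos` and `combined` are illustrative additions, not something from the replies above.

```r
# Example data from the original post
df1 <- data.frame(pos = LETTERS[1:5], data = c(2, 6, 3, 1, 9))
df2 <- df1
df3 <- df1
df2$data <- c(6, 2, 9, 7, 5)
df3$data <- c(9, 3, 6, 2, 1)
mylist <- list(df1 = df1, df2 = df2, df3 = df3)

# Assumption: all data frames have identical 'pos' columns in the same order.
# Stop early otherwise, because column-binding would silently misalign rows.
same_pos <- vapply(
  mylist,
  function(d) identical(as.character(d$pos), as.character(mylist[[1]]$pos)),
  logical(1)
)
stopifnot(all(same_pos))

# Pull the 'data' column out of each list element. lapply() keeps the list
# names, and data.frame() turns them into the column names of the result.
combined <- data.frame(pos = mylist[[1]]$pos,
                       lapply(mylist, function(d) d$data))
combined
```

Since this works on whatever is in `mylist`, it should scale to any number of data frames. One caveat: data.frame() may alter list names that are not syntactically valid column names (file names, for instance) unless check.names = FALSE is passed.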
Another possibility is Reduce:

> Reduce(function(x,y,by='pos')merge(x,y,by='pos'),mylist)
  pos data.x data.y data
1   A      2      6    9
2   B      6      2    3
3   C      3      9    6
4   D      1      7    2
5   E      9      5    1

- Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  spector at stat.berkeley.edu
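In the Reduce output, the value columns come out as data.x, data.y and data, so which data frame each one came from is still only implied by position. One possible refinement, offered as a sketch rather than as part of the answer above, is to rename each "data" column to the name of its list element before merging. The names `renamed` and `merged` are hypothetical.

```r
# Example data from the original post
df1 <- data.frame(pos = LETTERS[1:5], data = c(2, 6, 3, 1, 9))
df2 <- df1
df3 <- df1
df2$data <- c(6, 2, 9, 7, 5)
df3$data <- c(9, 3, 6, 2, 1)
mylist <- list(df1 = df1, df2 = df2, df3 = df3)

# Rename the 'data' column of each data frame to the name of its list element.
renamed <- Map(function(d, nm) {
  names(d)[names(d) == "data"] <- nm
  d
}, mylist, names(mylist))

# Merge successively on 'pos'. Unlike cbind, merge() matches rows by key,
# so the data frames do not need to share the same row order.
merged <- Reduce(function(x, y) merge(x, y, by = "pos"), renamed)
merged
```

Note that merge() keeps only the "pos" values present in both inputs by default; passing all = TRUE would keep unmatched rows and fill the gaps with NA.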