thr3ads.net - R help - [R] merging dataframes [Nov 2005]

If this information is useful, please help other people find it:
Share via:

Gavin Simpson

2005-Nov-03 13:08 UTC

[R] merging dataframes

Dear List,

I often have to merge two or more data frames containing unique row
names but with some columns (names) common to the two data frames and
some columns not common. This toy example will explain the kind of setup
I am talking about:

mat1 <- as.data.frame(matrix(rnorm(20), nrow = 5))
mat2 <- as.data.frame(matrix(rnorm(20), nrow = 4))
rownames(mat1) <- paste("site", 1:5, sep="")
rownames(mat2) <- paste("site", 6:9, sep="")
names(mat1) <- paste("species", c(1,3,5,7), sep="")
names(mat2) <- paste("species", c(2,3,4,7,9), sep="")
mat1
mat2

So sites (rows) are unique across both data frames, but there are only 7
unique species (columns):

unique(c(names(mat1), names(mat2)))

merge(mat1, mat2, all = TRUE)

gives almost what I want, but it drops or looses the rownames()
information from the two merged data frames, and it re-orders the rows
so that one simply cannot write back the correct row names.

How might I go about merging two data frames as I have described, but
preserve the row names and more importantly, keep the order of rows the
same, so that rows from mat1 come before the rows of mat2?

Many thanks,

Gavin
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                     [T] +44 (0)20 7679 5522
ENSIS Research Fellow             [F] +44 (0)20 7679 7565
ENSIS Ltd. & ECRC                 [E] gavin.simpsonATNOSPAMucl.ac.uk
UCL Department of Geography       [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way                    [W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Roger Bivand

2005-Nov-03 13:28 UTC

head link

[R] merging dataframes

On Thu, 3 Nov 2005, Gavin Simpson wrote:
> Dear List,
> 
> I often have to merge two or more data frames containing unique row
> names but with some columns (names) common to the two data frames and
> some columns not common. This toy example will explain the kind of setup
> I am talking about:
> 
> mat1 <- as.data.frame(matrix(rnorm(20), nrow = 5))
> mat2 <- as.data.frame(matrix(rnorm(20), nrow = 4))
> rownames(mat1) <- paste("site", 1:5, sep="")
> rownames(mat2) <- paste("site", 6:9, sep="")
> names(mat1) <- paste("species", c(1,3,5,7), sep="")
> names(mat2) <- paste("species", c(2,3,4,7,9),
sep="")
> mat1
> mat2
> 
> So sites (rows) are unique across both data frames, but there are only 7
> unique species (columns):
> 
> unique(c(names(mat1), names(mat2)))
> 
> merge(mat1, mat2, all = TRUE)
> 
> gives almost what I want, but it drops or looses the rownames()
> information from the two merged data frames, and it re-orders the rows
> so that one simply cannot write back the correct row names.
> 
> How might I go about merging two data frames as I have described, but
> preserve the row names and more importantly, keep the order of rows the
> same, so that rows from mat1 come before the rows of mat2?
merge(mat1, mat2, all = TRUE, sort=FALSE)

seems to fix the second question. The first is mentioned tangentially in 
the help page details if you are merging on row names, which you are not - 
maybe prepend to both a column called sites:

mat1a <- data.frame(sites=row.names(mat1), mat1)
mat2a <- data.frame(sites=row.names(mat2), mat2)
data.frame(merge(mat1a, mat2a, all = TRUE, sort=FALSE),
row.names="sites")

is a bit long-winded, but gets you there.

Roger
> 
> Many thanks,
> 
> Gavin
> 
-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

Dimitris Rizopoulos

2005-Nov-03 13:33 UTC

head link

[R] merging dataframes

you could use something like:

mat1$id1 <- 1:nrow(mat1)
mat2$id2 <- 1:nrow(mat2)

out <- merge(mat1, mat2, all = TRUE)
out[order(out$id1, out$id2), ]

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "Gavin Simpson" <gavin.simpson at ucl.ac.uk>
To: "R-help" <R-help at stat.math.ethz.ch>
Sent: Thursday, November 03, 2005 2:08 PM
Subject: [R] merging dataframes

> Dear List,
>
> I often have to merge two or more data frames containing unique row
> names but with some columns (names) common to the two data frames 
> and
> some columns not common. This toy example will explain the kind of 
> setup
> I am talking about:
>
> mat1 <- as.data.frame(matrix(rnorm(20), nrow = 5))
> mat2 <- as.data.frame(matrix(rnorm(20), nrow = 4))
> rownames(mat1) <- paste("site", 1:5, sep="")
> rownames(mat2) <- paste("site", 6:9, sep="")
> names(mat1) <- paste("species", c(1,3,5,7), sep="")
> names(mat2) <- paste("species", c(2,3,4,7,9),
sep="")
> mat1
> mat2
>
> So sites (rows) are unique across both data frames, but there are 
> only 7
> unique species (columns):
>
> unique(c(names(mat1), names(mat2)))
>
> merge(mat1, mat2, all = TRUE)
>
> gives almost what I want, but it drops or looses the rownames()
> information from the two merged data frames, and it re-orders the 
> rows
> so that one simply cannot write back the correct row names.
>
> How might I go about merging two data frames as I have described, 
> but
> preserve the row names and more importantly, keep the order of rows 
> the
> same, so that rows from mat1 come before the rows of mat2?
>
> Many thanks,
>
> Gavin
> -- 
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Gavin Simpson                     [T] +44 (0)20 7679 5522
> ENSIS Research Fellow             [F] +44 (0)20 7679 7565
> ENSIS Ltd. & ECRC                 [E] gavin.simpsonATNOSPAMucl.ac.uk
> UCL Department of Geography       [W] 
> http://www.ucl.ac.uk/~ucfagls/cv/
> 26 Bedford Way                    [W] http://www.ucl.ac.uk/~ucfagls/
> London.  WC1H 0AP.
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

Liaw, Andy

2005-Nov-03 13:33 UTC

head link

[R] merging dataframes

The `Value' section of ?merge does say that `... in all cases the result has
no special row names', so you're left to handle that on your own.  One
possibility is to use

  result <- merge(mat1, mat2, all=TRUE, sort=FALSE)

so that the sorting is not done, then you can just do

  rownames(result) <- c(rownames(mat1), rownames(mat2))

Cheers,
Andy

> From: Gavin Simpson
> 
> Dear List,
> 
> I often have to merge two or more data frames containing unique row
> names but with some columns (names) common to the two data frames and
> some columns not common. This toy example will explain the 
> kind of setup
> I am talking about:
> 
> mat1 <- as.data.frame(matrix(rnorm(20), nrow = 5))
> mat2 <- as.data.frame(matrix(rnorm(20), nrow = 4))
> rownames(mat1) <- paste("site", 1:5, sep="")
> rownames(mat2) <- paste("site", 6:9, sep="")
> names(mat1) <- paste("species", c(1,3,5,7), sep="")
> names(mat2) <- paste("species", c(2,3,4,7,9),
sep="")
> mat1
> mat2
> 
> So sites (rows) are unique across both data frames, but there 
> are only 7
> unique species (columns):
> 
> unique(c(names(mat1), names(mat2)))
> 
> merge(mat1, mat2, all = TRUE)
> 
> gives almost what I want, but it drops or looses the rownames()
> information from the two merged data frames, and it re-orders the rows
> so that one simply cannot write back the correct row names.
> 
> How might I go about merging two data frames as I have described, but
> preserve the row names and more importantly, keep the order 
> of rows the
> same, so that rows from mat1 come before the rows of mat2?
> 
> Many thanks,
> 
> Gavin
> -- 
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
> %~%~%~%~%
> Gavin Simpson                     [T] +44 (0)20 7679 5522
> ENSIS Research Fellow             [F] +44 (0)20 7679 7565
> ENSIS Ltd. & ECRC                 [E] gavin.simpsonATNOSPAMucl.ac.uk
> UCL Department of Geography       [W] 
> http://www.ucl.ac.uk/~ucfagls/cv/
> 26 Bedford Way              
>       [W] http://www.ucl.ac.uk/~ucfagls/
> London.  WC1H 0AP.
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
> %~%~%~%~%
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
>

ManuelPerera-Chang@fmc-ag.com

2005-Nov-03 13:44 UTC

head link

[R] merging dataframes

Hi,

what about padding both datasets with dummy missing records ... and then
play with cbind and rbind

... like e.g.
> species5<-c(NA,NA,NA,NA)
> modmat2<-cbind(mat2,species1,species5)
and then similarly with mat1 ...
e.g.
species2<-c(NA,NA,NA,NA,NA)
> modmad1<-cbind(mat1,species2,species4,species9)
> rbind(modmad1,modmat2)        species1    species3   species5   species7   species2   species4
species9
site1 -0.7190044 -0.52482580 -1.1813567 -1.5584831         NA         NA
NA
site2 -1.1782180  1.72337964  0.1652343 -0.9026087         NA         NA
NA
site3  0.3823015 -0.07226644 -1.2907470 -0.3692091         NA         NA
NA
site4 -1.3051131 -0.61107947  0.6264416  1.5259373         NA         NA
NA
site5  0.2028565 -1.28374638  1.6284780 -1.2975163         NA         NA
NA
site6         NA  1.19088414         NA  0.3159949 -0.1624538  0.5987733
0.2205512
site7         NA  0.75292176         NA  1.7524988  0.8335334 -0.7998774
-0.9788762
site8         NA -0.47803396         NA -1.3041628  1.7925165 -0.4153879
-0.4708165
site9         NA -0.20063523         NA  1.8119115  1.5351801 -1.3334419
0.5812675

it will need modifications of course if you are working with several
datasets

Saludos,

Manuel

Apparently Analagous Threads

Search for more reasonably related threads

R help - Nov 2005 - merging dataframes

[R] merging dataframes

[R] merging dataframes

[R] merging dataframes

[R] merging dataframes

[R] merging dataframes

Apparently Analagous Threads