Michael Lachmann
2010-Jun-30 19:46 UTC
[R] problem with rbind on data.frames that contain data.frames
It took me some time to find this bug in my code. Is this a feature of R? Am I doing something wrong?

> a=data.frame(x=1:10,y=1:10)
> b=data.frame(x=11:20,y=11:20)
> z=data.frame(1:10,11:20)
> a$z=z
> b$z=z
> rbind(a,b)
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "2", "3", "4", :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘1’, ‘10’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, ‘8’, ‘9’

Adding rownames to a and b doesn't help:

> rownames(a)=1:10
> rownames(b)=11:20
> rbind(a,b)
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "2", "3", "4", :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘1’, ‘10’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, ‘8’, ‘9’

The problem is with the rownames of a$z and b$z...

> rownames(a$z)=1:10
> rownames(b$z)=11:20
> rbind(a,b)
    x  y z.X1.10 z.X11.20
1   1  1       1       11
2   2  2       2       12
3   3  3       3       13
4   4  4       4       14
5   5  5       5       15
6   6  6       6       16
7   7  7       7       17
8   8  8       8       18
9   9  9       9       19
10 10 10      10       20
11 11 11       1       11
12 12 12       2       12
13 13 13       3       13
14 14 14       4       14
15 15 15       5       15
16 16 16       6       16
17 17 17       7       17
18 18 18       8       18
19 19 19       9       19
20 20 20      10       20

I was creating data.frames with data.frame members when I was doing computations on the data.frames. Something like:

> a=data.frame(x=1:10,y=1:10)
> b=data.frame(x=11:20,y=11:20)
> a$z=a*2
> b$z=b*2
> rbind(a,b)
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "2", "3", "4", :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘1’, ‘10’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, ‘8’, ‘9’

Thanks for listening,
Michael
Allan Engelhardt
2010-Jun-30 20:55 UTC
[R] problem with rbind on data.frames that contain data.frames
On 30/06/10 20:46, Michael Lachmann wrote:
> It took me some time to find this bug in my code. Is this a feature of
> R? Am I doing something wrong?
>
> > a=data.frame(x=1:10,y=1:10)
> > b=data.frame(x=11:20,y=11:20)
> > z=data.frame(1:10,11:20)
>
> > a$z=z

You are (kind of) assigning *two* columns from the data frame "z" to the name 'z' in "a" which is probably not going to work as you expect. R tries to be clever which may or may not be a Good Thing.

Try

a$z1 <- z[,1]
a$z2 <- z[,2]

or equivalent to keep the names straight. As you have it, a$z is a data.frame, not a column, so you'd need a$z[,1] to get the 1:10 back from the original assignment of z.

The default printing of a does not help: always check using str:

> str(a)
'data.frame':   10 obs. of  3 variables:
 $ x: int  1 2 3 4 5 6 7 8 9 10
 $ y: int  1 2 3 4 5 6 7 8 9 10
 $ z:'data.frame':      10 obs. of  2 variables:
  ..$ X1.10 : int  1 2 3 4 5 6 7 8 9 10
  ..$ X11.20: int  11 12 13 14 15 16 17 18 19 20

Hope this helps a little.

Allan
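
For anyone landing on this thread later, the advice above can be sketched roughly as follows. This is only an illustration built from the data in the original post: the names `nested`, `is_nested`, `ab`, `a2`, `b2` and `ab2` are made up here, and using `cbind()` as a second route is my own suggestion rather than something from the replies. The failing `rbind()` itself is deliberately left out, since whether it still errors may depend on the R version.

```r
# Rebuild the data from the original post.
a <- data.frame(x = 1:10, y = 1:10)
b <- data.frame(x = 11:20, y = 11:20)
z <- data.frame(1:10, 11:20)

# Assigning a whole data frame to a single name nests it as one column.
nested <- a
nested$z <- z

# Check each column: TRUE marks a column that is itself a data frame.
is_nested <- sapply(nested, is.data.frame)
print(is_nested)

# Option 1: assign each column of z under its own name (as suggested above).
a$z1 <- z[, 1]
a$z2 <- z[, 2]
b$z1 <- z[, 1]
b$z2 <- z[, 2]
ab <- rbind(a, b)

# Option 2 (an assumption, not from the thread): cbind() splices the
# columns of z in as ordinary top-level columns, keeping z's own names.
a2 <- cbind(data.frame(x = 1:10, y = 1:10), z)
b2 <- cbind(data.frame(x = 11:20, y = 11:20), z)
ab2 <- rbind(a2, b2)

# Both results are flat data frames, so str() shows no nested 'data.frame'.
str(ab)
str(ab2)
```

Either way the point is the same: keep every column an ordinary vector before calling rbind(), and use str() (or the sapply() check) to confirm that nothing is nested.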