rafalku at gmail.com
2006-May-16 19:50 UTC
[Rd] bug in rbind.data.frame with factors (PR#8868)
Full_Name: Rafal Kustra
Version: 2.1.1
OS: Linux, MacOS 10.3
Submission from: (NULL) (69.195.47.62)

When rbinding two data frames with factors, strange results occur (but no error) when the order of the variables differs between the two data frames:

> d1=as.data.frame(list(x=1:10,y=letters[1:10]))
> d2=as.data.frame(list(y=LETTERS[1:5],x=7:11))
> d2
  y  x
1 A  7
2 B  8
3 C  9
4 D 10
5 E 11
> rbind(d1,d2)
    x    y
1   1    a
2   2    b
3   3    c
4   4    d
5   5    e
6   6    f
7   7    g
8   8    h
9   9    i
10 10    j
11  7 <NA>
21  8 <NA>
31  9 <NA>
41 10 <NA>
51 11 <NA>
Warning message:
invalid factor level, NAs generated in: "[<-.factor"(`*tmp*`, ri, value = c("A",
"B", "C", "D", "E"))

Things work correctly when the order of variables is the same:

> d3=as.data.frame(list(x=7:11,y=LETTERS[1:5]))
> rbind(d1,d3)
    x y
1   1 a
2   2 b
3   3 c
4   4 d
5   5 e
6   6 f
7   7 g
8   8 h
9   9 i
10 10 j
11  7 A
21  8 B
31  9 C
41 10 D
51 11 E
How is this a bug? From the help page for cbind/rbind:

Description

  Take a sequence of vector, matrix or data frames arguments and
  combine by _columns_ or _rows_, respectively.

(emphasis added)

Note that it does _not_ say "combine by variable names".

Peter Ehlers

rafalku at gmail.com wrote:
> Full_Name: Rafal Kustra
> Version: 2.1.1
> OS: Linux, MacOS 10.3
> Submission from: (NULL) (69.195.47.62)
>
> When rbinding two data frames with factors, strange results occur (but no error)
> when the order of the variables differs between the two data frames:
>
> > d1=as.data.frame(list(x=1:10,y=letters[1:10]))
> > d2=as.data.frame(list(y=LETTERS[1:5],x=7:11))
> > d2
>   y  x
> 1 A  7
> 2 B  8
> 3 C  9
> 4 D 10
> 5 E 11
> > rbind(d1,d2)
>     x    y
> 1   1    a
> 2   2    b
> 3   3    c
> 4   4    d
> 5   5    e
> 6   6    f
> 7   7    g
> 8   8    h
> 9   9    i
> 10 10    j
> 11  7 <NA>
> 21  8 <NA>
> 31  9 <NA>
> 41 10 <NA>
> 51 11 <NA>
> Warning message:
> invalid factor level, NAs generated in: "[<-.factor"(`*tmp*`, ri, value = c("A",
> "B", "C", "D", "E"))
>
> Things work correctly when the order of variables is the same:
>
> > d3=as.data.frame(list(x=7:11,y=LETTERS[1:5]))
> > rbind(d1,d3)
>     x y
> 1   1 a
> 2   2 b
> 3   3 c
> 4   4 d
> 5   5 e
> 6   6 f
> 7   7 g
> 8   8 h
> 9   9 i
> 10 10 j
> 11  7 A
> 21  8 B
> 31  9 C
> 41 10 D
> 51 11 E
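
Whichever reading of the help page is right, one possible workaround is to reorder the second data frame's columns to match the first before calling rbind, so the result no longer depends on how rbind lines the columns up. The following is only a sketch, assuming both data frames carry exactly the same set of column names; the names d2_aligned and combined are illustrative and do not come from the report, and stringsAsFactors = TRUE is spelled out so that y is a factor, as in the report.

```r
# Rebuild the two data frames from the report; y is a factor in both.
d1 <- data.frame(x = 1:10, y = letters[1:10], stringsAsFactors = TRUE)
d2 <- data.frame(y = LETTERS[1:5], x = 7:11, stringsAsFactors = TRUE)

# Assumption: d1 and d2 have the same column names, only in a different order.
stopifnot(setequal(names(d1), names(d2)))

# Select d2's columns by name, in d1's order (d2_aligned is a hypothetical name).
d2_aligned <- d2[, names(d1)]

# The columns now agree by position as well as by name.
combined <- rbind(d1, d2_aligned)
print(combined)
```

With the columns aligned this way, the call has the same shape as the rbind(d1,d3) case in the report, which the submitter describes as working correctly.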