Hello All,
At the suggestion of commenters on a discussion at Stack Overflow (
http://stackoverflow.com/questions/1535021/whats-the-biggest-r-gotcha-youve-run-across/1535433#1535433
), I'm forwarding the following behavior report to this list.
R Session:
> a<-data.frame(c(1,2,3,4),c(4,3,2,1))
> a<-a[-3,]
> a
  c.1..2..3..4. c.4..3..2..1.
1             1             4
2             2             3
4             4             1
> a[4,1]<-1
> a
Error in data.frame(c.1..2..3..4. = c("1", "2", "4", "1"), c.4..3..2..1. = c(" 4", :
  duplicate row.names: 4
What's going on:
1. A four row data.frame is created, so the rownames are c(1,2,3,4)
2. The third row is deleted, so the rownames are c(1,2,4)
3. The assignment a[4,1]<-1 appends a row at position 4, and R automatically
sets the new row's name equal to its index, i.e. 4, so the row names are
c(1,2,4,4).
4. print.data.frame throws an error because it requires unique row names
It seems to me that either R should automatically generate unique row names,
or print.data.frame should accept duplicates. Looking at the manual (section
2.3.2), it is unclear whether row names are required to be unique, but the help
page for
data.frame states: "A data frame is a list of variables of the same number
of rows with unique row names,..." This implies that a[4,1]<-1 creates
an invalid data.frame object.
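In the meantime, one possible way to sidestep the problem is sketched below.
It is only an illustration, not a proposed fix: the column names x and y and
the helper has_dup_rownames are hypothetical names I made up for the example,
and the approach assumes the existing row names carry no meaning and can be
discarded.

```r
# Rebuild the data frame from the report (columns named x and y here
# purely for readability) and drop the third row.
a <- data.frame(x = c(1, 2, 3, 4), y = c(4, 3, 2, 1))
a <- a[-3, ]
rownames(a)   # "1" "2" "4"

# Hypothetical helper (not part of base R): TRUE if any row name repeats.
has_dup_rownames <- function(df) anyDuplicated(rownames(df)) > 0

# Reset the row names to the automatic sequence 1..n before assigning to a
# new row, so the name chosen for row 4 cannot collide with an existing one.
rownames(a) <- NULL
a[4, 1] <- 1

has_dup_rownames(a)   # FALSE
```

Resetting the row names throws away the record of which original row was
deleted, so this only makes sense when the row names are not being used as
identifiers.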
Cheers,
Ian Fellows