Hi, everybody.
This was an interesting discussion last time and it helped me a lot.

Could you please have a look at some feature and tell me
why it was designed this way
(my questions are under #########)

> x = c(1, 10)
> y = c(99, 55)
> d <- data.frame(x = x, y = y)
> d
   x  y
1  1 99
2 10 55
> add <- data.frame(x = 14, y = 99)
> add
   x  y
1 14 99
> d <- rbind(d, add)
> d
    x  y
1   1 99
2  10 55
11 14 99
######### it would be more natural to index the rows: 1,2,3 instead of
#1,2,11 ?!
>
> d[3,1]
[1] 14
> d[11,1]
[1] NA
######### especially if index '11' is not functioning...
>
> add1 <- data.frame(x = 10, y = 87)
> d <- rbind(d, add1)
######### now I would think that the next index should be 21, BUT:
> d
    x  y
1   1 99
2  10 55
11 14 99
12 10 87
######### so what is the intuition of such indexing?

--
Svetlana Eden        Biostatistician II            School of Medicine
                     Department of Biostatistics   Vanderbilt University

You are adding a row with name "1" each time. R just adds a suffix to make
it unique. What you call indices are the *row names* of the data frame.

Suppose the row name had been "Eden". Then "Eden1" and "Eden2" make more
sense than your suggestions.

On Wed, 18 Feb 2004, Svetlana Eden wrote:

> Hi, everybody.
> This was an interesting discussion last time and it helped me a lot.
>
> Could you please have a look at some feature and tell me
> why it was designed this way
> (my questions are under #########)
>
> > x = c(1, 10)
> > y = c(99, 55)
> > d <- data.frame(x = x, y = y)
> > d
>    x  y
> 1  1 99
> 2 10 55
> > add <- data.frame(x = 14, y = 99)
> > add
>    x  y
> 1 14 99
> > d <- rbind(d, add)
> > d
>     x  y
> 1   1 99
> 2  10 55
> 11 14 99
> ######### it would be more natural to index the rows: 1,2,3 instead of
> #1,2,11 ?!
> >
> > d[3,1]
> [1] 14
> > d[11,1]
> [1] NA
> ######### especially if index '11' is not functioning...

You need "11": it is a row name and not a row index.

> > add1 <- data.frame(x = 10, y = 87)
> > d <- rbind(d, add1)
> ######### now I would think that the next index should be 21, BUT:
> > d
>     x  y
> 1   1 99
> 2  10 55
> 11 14 99
> 12 10 87
> ######### so what is the intuition of such indexing?

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

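The "Eden" illustration above can be tried directly. This is a minimal sketch with made-up data: the row names "Eden" and "Ripley" are hypothetical, and explicit character row names are used on the assumption that a more recent R may simply renumber automatic row names instead of suffixing them.

```r
# Hypothetical data: both frames carry a row named "Eden".
d   <- data.frame(x = c(1, 10), y = c(99, 55),
                  row.names = c("Eden", "Ripley"))
add <- data.frame(x = 14, y = 99, row.names = "Eden")

# First rbind: the clashing name gets a numeric suffix.
d <- rbind(d, add)
rownames(d)   # "Eden" "Ripley" "Eden1"

# Second rbind: "Eden1" is already taken, so the next free suffix is used.
d <- rbind(d, add)
rownames(d)   # "Eden" "Ripley" "Eden1" "Eden2"
```

Read this way, "11" and "12" in the original post are just the name "1" with the suffixes 1 and 2.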
Svetlana Eden <svetlana.eden at vanderbilt.edu> writes:

> Hi, everybody.
> This was an interesting discussion last time and it helped me a lot.
>
> Could you please have a look at some feature and tell me
> why it was designed this way
> (my questions are under #########)
>
> > x = c(1, 10)
> > y = c(99, 55)
> > d <- data.frame(x = x, y = y)
> > d
>    x  y
> 1  1 99
> 2 10 55
> > add <- data.frame(x = 14, y = 99)
> > add
>    x  y
> 1 14 99
> > d <- rbind(d, add)
> > d
>     x  y
> 1   1 99
> 2  10 55
> 11 14 99
> ######### it would be more natural to index the rows: 1,2,3 instead of
> #1,2,11 ?!

It's not the number 11, it is the string "11". Row names are
character strings. In your original data frames the row names were
"1" and "2" for the first frame and "1" for the second. The rbind
function should not create a duplicate row name, so it prepended a "1"
to all the names in the second frame. That explains the "11" and "12"
in your last example: each is the added frame's row name "1", made
unique in turn.

> >
> > d[3,1]
> [1] 14
> > d[11,1]

Try d["11", 1]

> [1] NA
> ######### especially if index '11' is not functioning...
>
> > add1 <- data.frame(x = 10, y = 87)
> > d <- rbind(d, add1)
> ######### now I would think that the next index should be 21, BUT:
> > d
>     x  y
> 1   1 99
> 2  10 55
> 11 14 99
> 12 10 87
> ######### so what is the intuition of such indexing?

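The difference between a row index and a row name can be seen side by side. This is a small sketch that builds the three-row frame from the thread directly, with the row names given explicitly as strings (an assumption made so the example does not depend on how a particular R version names rows after rbind).

```r
# Same values as in the thread, with row names set explicitly as strings.
d <- data.frame(x = c(1, 10, 14), y = c(99, 55, 99),
                row.names = c("1", "2", "11"))

d[3, 1]      # numeric index: third row
d["11", 1]   # character index: the row *named* "11", i.e. the same row
d[11, 1]     # numeric index 11: there is no eleventh row, so NA
```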
Svetlana Eden wrote:

> Hi, everybody.
> This was an interesting discussion last time and it helped me a lot.
>
> Could you please have a look at some feature and tell me
> why it was designed this way

What you see in the first column are the row names. The index is
1,2,3,4 as usual.

Thomas P.

Interesting feature.

The logic seems to be simple. The first digit in the label indicates the
row position within the new matrix/vector that is being added. Thus the
first digit remains 1 if you go on adding single rows. The second digit is
then filled in with the lowest digit that makes the label distinct
from all previous labels, and likewise for the third, and so on.

Try repeatedly adding not just single rows but matrices of more than one
row, and the pattern will be clear.

Supratik.

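The experiment suggested above, adding more than one row at a time, might look like the sketch below. The letter row names "a" and "b" are an assumption, chosen so that the names are explicit character strings; with them, the leading part of each new label is the original row name and the trailing digit is the lowest suffix not yet in use.

```r
# Hypothetical two-row frames whose row names clash on every rbind.
d   <- data.frame(x = c(1, 10), y = c(99, 55), row.names = c("a", "b"))
add <- data.frame(x = c(14, 15), y = c(99, 98), row.names = c("a", "b"))

d <- rbind(d, add)
rownames(d)   # "a" "b" "a1" "b1"

d <- rbind(d, add)
rownames(d)   # "a" "b" "a1" "b1" "a2" "b2"
```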
Hi!

On Wed, Feb 18, 2004 at 10:45:23AM -0600, Svetlana Eden wrote:

> > d <- rbind(d, add)
> > d
>     x  y
> 1   1 99
> 2  10 55
> 11 14 99
> ######### it would be more natural to index the rows: 1,2,3 instead of
> #1,2,11 ?!

What you see in the first column are row names, not indexes. Since both
data frames had a row named '1' there was a conflict, which R is trying to
resolve by appending '1'.

> ######### especially if index '11' is not functioning...
...
> ######### now I would think that the next index should be 21, BUT:
...
> ######### so what is the intuition of such indexing?

I think it all becomes clear if you try this:

> a <- data.frame(foo=1:5, bar=10:6, row.names=LETTERS[1:5])
> a
  foo bar
A   1  10
B   2   9
C   3   8
D   4   7
E   5   6
> b <- data.frame(foo=c(9,10), bar=c(99,98), row.names=c('A','B'))
> b
  foo bar
A   9  99
B  10  98
> rbind(a,b)
   foo bar
A    1  10
B    2   9
C    3   8
D    4   7
E    5   6
A1   9  99
B1  10  98

cu
	Philipp

--
Dr. Philipp Pagel                            Tel.  +49-89-3187-3675
Institute for Bioinformatics / MIPS          Fax.  +49-89-3187-3585
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1
85764 Neuherberg, Germany
http://mips.gsf.de/~pagel

> From: Svetlana Eden
>
> Hi, everybody.
> This was an interesting discussion last time and it helped me a lot.
>
> Could you please have a look at some feature and tell me
> why it was designed this way
> (my questions are under #########)

I'll give it a shot...

> > x = c(1, 10)
> > y = c(99, 55)
> > d <- data.frame(x = x, y = y)
> > d
>    x  y
> 1  1 99
> 2 10 55
> > add <- data.frame(x = 14, y = 99)
> > add
>    x  y
> 1 14 99
> > d <- rbind(d, add)
> > d
>     x  y
> 1   1 99
> 2  10 55
> 11 14 99
> ######### it would be more natural to index the rows: 1,2,3 instead of
> #1,2,11 ?!

Row names for a data.frame need not be 1, 2, ..., and we need something
that's going to work regardless.

> > d[3,1]
> [1] 14
> > d[11,1]
> [1] NA
> ######### especially if index '11' is not functioning...

This one seems curious to me. Trying to access a non-existent column
results in an error:

> d[1,11]
Error in "[.data.frame"(d, 1, 11) : undefined columns selected

OTOH, try:

> d[11,1] <- 1
> dim(d)
[1] 11  2
> d
Error in data.frame(x = c(" 1", "10", "14", "NA", "NA", "NA", "NA", "NA",  :
        duplicate row.names: 11

so that destroys the integrity of the data frame!

> > add1 <- data.frame(x = 10, y = 87)
> > d <- rbind(d, add1)
> ######### now I would think that the next index should be 21, BUT:
> > d
>     x  y
> 1   1 99
> 2  10 55
> 11 14 99
> 12 10 87
> ######### so what is the intuition of such indexing?

I believe the rationale is explained in ?make.unique (which
rbind.data.frame() calls to create the rownames).

HTH,
Andy

> --
> Svetlana Eden        Biostatistician II            School of Medicine
>                      Department of Biostatistics   Vanderbilt University

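The labels in the thread can be reproduced with make.unique() on its own. This is a sketch under one assumption: that rbind.data.frame() calls it with an empty separator (sep = ""), which is consistent with labels such as "11" rather than "1.1" but is not stated in the thread.

```r
# Row names after the first rbind: "1", "2" from d and "1" from add.
make.unique(c("1", "2", "1"), sep = "")
# "1" "2" "11"

# Row names after the second rbind: "11" is taken, so the next suffix is used.
make.unique(c("1", "2", "11", "1"), sep = "")
# "1" "2" "11" "12"
```

This is why the fourth row is labelled "12" and not "21": the suffix counts duplicates of the name "1", not positions in the frame.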