Hi R Fans,

I stumbled across a strange (I think) bug in R 2.9.1. I have read in a data file with 5934 rows and 9 columns with the commands:

    daten = data.frame(read.table("C:/fussball.dat",header=TRUE))

Then I needed a subset of the data file:

    newd = daten[daten[,1]!=daten[,2],]

--> two values do not meet the logical specification and are dropped.

The strange thing about it: When I print the newd in the R Console, the output still shows 5934 rows. When I check the number of rows with NROW(newd), I get 5932 as output. When I print newd[5934, ], I get NAs. When I print newd[5932, ] I get the row that is listed in line 5934 when I just type in newd. This is totally crazy! Has anyone had the same problem? Thanks for a post.

Marc

Marc Jekel wrote:
> Hi R Fans,
>
> I stumbled across a strange (I think) bug in R 2.9.1. I have read in a
> data file with 5934 rows and 9 columns with the commands:
>
> daten = data.frame(read.table("C:/fussball.dat",header=TRUE))
>
> Then I needed a subset of the data file:
>
> newd = daten[daten[,1]!=daten[,2],]
>
> --> two values do not meet the logical specification and are dropped.
>
> The strange thing about it: When I print the newd in the R Console, the
> output still shows 5934 rows. When I check the number of rows with
> NROW(newd) , I get 5932 as output. When I print newd[5934, ], I get NAs.
> When I print newd[5932, ] I get the row that is listed in line 5934 when
> I just type in newd. This is totally crazy! Has anyone had the same
> problem? Thanks for a post.

You're confusing row names and row numbers. When you printed newd, did you actually count the number of lines? Thought so... It isn't any stranger than this:

    > data.frame(x=rnorm(6),y=rnorm(6))[-5,]
               x          y
    1  0.9457385 -1.1398275
    2 -1.1683732 -0.7269941
    3  0.9942821  0.9310146
    4 -2.0839580 -0.6261567
    6  1.7225233  0.2457897

-- 
Peter Dalgaard
Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918
(p.dalgaard at biostat.ku.dk)
FAX: (+45) 35327907
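
To make the row-name point reproducible without random numbers, here is a minimal sketch. The data frame and its column names (home, away) are made up for illustration and merely stand in for the original file; the subsetting expression is the one from the question.

```r
# Toy data standing in for the original file (hypothetical values).
daten <- data.frame(home = c("A", "B", "C", "D", "E", "F"),
                    away = c("B", "B", "A", "D", "C", "A"),
                    stringsAsFactors = FALSE)

# Same kind of subset as in the question: keep rows where columns 1 and 2 differ.
newd <- daten[daten[, 1] != daten[, 2], ]

# The labels printed on the left are the original row names, not a running count.
print(newd)

n_rows <- nrow(newd)      # number of rows actually present
labels <- rownames(newd)  # row names carried over from daten
```

Rows 2 and 4 are dropped here, so the printed labels skip those two numbers while nrow() reports the true count.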

Marc Jekel wrote:
> Hi R Fans,
>
> I stumbled across a strange (I think) bug in R 2.9.1. I have read in a
> data file with 5934 rows and 9 columns with the commands:
>
> daten = data.frame(read.table("C:/fussball.dat",header=TRUE))
>
> Then I needed a subset of the data file:
>
> newd = daten[daten[,1]!=daten[,2],]
>
> --> two values do not meet the logical specification and are dropped.
>
> The strange thing about it: When I print the newd in the R Console, the
> output still shows 5934 rows.

No, 5932 rows, but with the original row names (with 2 of them missing).

Uwe Ligges

> When I check the number of rows with
> NROW(newd) , I get 5932 as output. When I print newd[5934, ], I get NAs.
> When I print newd[5932, ] I get the row that is listed in line 5934 when
> I just type in newd. This is totally crazy! Has anyone had the same
> problem? Thanks for a post.
>
> Marc
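
The NA result from newd[5934, ] follows from the same distinction: a number inside the brackets selects by position, while a quoted string selects by row name. A small sketch on made-up data (the names d and sub are illustrative only), which also shows one way to renumber the rows if the gaps are unwanted:

```r
d <- data.frame(x = 11:16, y = 21:26)
sub <- d[-c(2, 4), ]        # 4 rows remain, labelled 1, 3, 5, 6

by_position  <- sub[4, ]    # fourth row by position (the one labelled "6")
by_name      <- sub["6", ]  # the same row, selected by its row name
out_of_range <- sub[6, ]    # there is no sixth row, so every column is NA

# Resetting the row names renumbers the rows consecutively.
rownames(sub) <- NULL
new_labels <- rownames(sub)
```

Whether to reset the row names is a judgment call: keeping them preserves the link back to the rows of the original data frame.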