Tried the following with R --vanilla on the R v2.4.0 release (see
details at the end). I think the script and its comments speak for
themselves, but the outcome is certainly not wanted.

  for (n in 58950:58970) {
    cat("n=", n, "\n", sep="");

    # Clean up first
    rm(names, x, y); gc();

    # Create a named vector of length n
    # Try with format "%5d" and it works
    names <- sprintf("%05d", 1:n);
    x <- seq(along=names);
    names(x) <- names;

    # Extract the first k elements
    k <- 36422;
    t0 <- system.time({
      y <- x[names[1:k]];
    })
    str(y);

    # But with one more it takes
    # for ever when n >= 58960
    k <- k + 1;
    t1 <- system.time({
      y <- x[names[1:k]];
    })
    # ...then t1/t0 ~= 300-500 and growing!
    print(t1/t0);
    str(y);
  }

The interesting thing is that if you replace

  y <- x[names[1:k]];

with

  idxs <- match(names[1:k], names(x));
  y <- x[idxs];

everything is fine.

(For those working with the Affy 100K SNP chips, the freaky thing is
that the problem occurs at n = 58960, which is exactly the number of
SNPs on the Xba array; that's how I found out about the bug/feature in
the first place.)

Tried this on two different systems:

> sessionInfo()
R version 2.4.0 (2006-10-03)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"

> sessionInfo()
R version 2.4.0 (2006-10-03)
x86_64-unknown-linux-gnu

locale:
C

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"

Cheers

/Henrik
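For anyone who wants to check the workaround from the report without waiting on the slow case, here is a minimal sketch. It assumes a much smaller n and k than in the report (so it does not reproduce the slowdown itself), and the variable names nms, y_direct and y_match are made up for illustration. It only shows that looking up positions with match() and then subsetting by integer index returns the same named vector as subsetting by name.

```r
# Small sizes chosen for a quick check (assumption; the report used
# n around 58960 and k around 36422, where the slowdown appears).
n <- 1000L
k <- 500L

# Zero-padded names, built the same way as in the report
nms <- sprintf("%05d", seq_len(n))
x <- seq_along(nms)
names(x) <- nms

# Direct subsetting by name (the path reported as slow for large n, k)
y_direct <- x[nms[1:k]]

# Workaround: find positions with match(), then subset by integer index
idxs <- match(nms[1:k], names(x))
y_match <- x[idxs]

# Both approaches should give the same values and the same names
stopifnot(identical(y_direct, y_match))
```

To compare timings on an affected R version, wrap each of the two subsetting steps in system.time() and raise n and k to the values from the report.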
Another example:

> avec <- 1:55000
> names(avec) <- as.character(avec)
> system.time(avec[names(avec)[1:39045]])
[1] 0.06 0.00 0.07   NA   NA
> system.time(avec[names(avec)[1:39046]])
[1] 23.89  0.00 23.94    NA    NA
> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          4.0
year           2006
month          10
day            03
svn rev        39566
language       R
version.string R version 2.4.0 (2006-10-03)

FWIW, this example shows similar behavior on R-2.2.0 Linux.

On Fri, 6 Oct 2006, Henrik Bengtsson wrote:

> Tried the following with R --vanilla on the R v2.4.0 release (see
> details at the end). I think the script and its comments speak for
> themselves, but the outcome is certainly not wanted.
>
[snip]

Charles C. Berry                        (858) 534-2098
                                        Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu        UC San Diego
http://biostat.ucsd.edu/~cberry/        La Jolla, San Diego 92093-0717
On 10/6/2006 6:20 PM, Henrik Bengtsson wrote:
> Tried the following with R --vanilla on the R v2.4.0 release (see
> details at the end). I think the script and its comments speak for
> themselves, but the outcome is certainly not wanted.

I think this is fixed now in R-devel and R-patched. Thanks for the
report, and the detailed script to reproduce the bug.

Duncan Murdoch

>
> for (n in 58950:58970) {
>   cat("n=", n, "\n", sep="");
>
>   # Clean up first
>   rm(names, x, y); gc();
>
>   # Create a named vector of length n
>   # Try with format "%5d" and it works
>   names <- sprintf("%05d", 1:n);
>   x <- seq(along=names);
>   names(x) <- names;
>
>   # Extract the first k elements
>   k <- 36422;
>   t0 <- system.time({
>     y <- x[names[1:k]];
>   })
>   str(y);
>
>   # But with one more it takes
>   # for ever when n >= 58960
>   k <- k + 1;
>   t1 <- system.time({
>     y <- x[names[1:k]];
>   })
>   # ...then t1/t0 ~= 300-500 and growing!
>   print(t1/t0);
>   str(y);
> }
>
>
> The interesting thing is that if you replace
>
>   y <- x[names[1:k]];
>
> with
>
>   idxs <- match(names[1:k], names(x));
>   y <- x[idxs];
>
> everything is fine.
>
> (For those working with the Affy 100K SNP chips, the freaky thing is
> that the problem occurs at n = 58960, which is exactly the number of
> SNPs on the Xba array; that's how I found out about the bug/feature in
> the first place.)
>
> Tried this on two different systems:
>
>> sessionInfo()
> R version 2.4.0 (2006-10-03)
> i386-pc-mingw32
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> attached base packages:
> [1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
> [7] "base"
>
>> sessionInfo()
> R version 2.4.0 (2006-10-03)
> x86_64-unknown-linux-gnu
> locale:
> C
> attached base packages:
> [1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
> [7] "base"
>
> Cheers
>
> /Henrik
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel