gregory_r_warnes@groton.pfizer.com
2002-Apr-09 22:52 UTC
[Rd] Problem handling NA indexes for character matrixes (PR#1447)
In a package I've been developing for manipulating genetic data I discovered
a problem when indexing into character arrays using NA's:

Create a character matrix and a numeric matrix

> cmat <- matrix( letters[1:4], ncol=2, nrow=2)
> nmat <- matrix( 1:4, ncol=2, nrow=2)

Create an index vector containing an NA value

> indvec <- c(1,2,NA)

Indexing works fine for both matrixes when we only pull off *one* column:

> cmat[ indvec, 1 ]
[1] "a"  "b"  "NA"
> nmat[ indvec, 1 ]
[1]  1  2 NA
> cmat[ indvec, 2 ]
[1] "c"  "d"  "NA"
> nmat[ indvec, 2 ]
[1]  3  4 NA

However, when we pull off both columns, we get "" where we should have NA
for cmat but it works properly for nmat:

> cmat[ indvec, ]
     [,1] [,2]
[1,] "a"  "c"
[2,] "b"  "d"
[3,] "NA" ""
> nmat[ indvec, ]
     [,1] [,2]
[1,]    1    3
[2,]    2    4
[3,]   NA   NA
> cmat[ indvec, c(1,2) ]
     [,1] [,2]
[1,] "a"  "c"
[2,] "b"  "d"
[3,] "NA" ""
> nmat[ indvec, c(1,2) ]
     [,1] [,2]
[1,]    1    3
[2,]    2    4
[3,]   NA   NA

The problem persists with other matrix sizes:

> cmat <- matrix( letters[1:6], ncol=3, nrow=2)
> nmat <- matrix( 1:6, ncol=3, nrow=2)
> cmat[ indvec, ]
     [,1] [,2] [,3]
[1,] "a"  "c"  "e"
[2,] "b"  "d"  "f"
[3,] "NA" ""   ""
> nmat[ indvec, ]
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
[3,]   NA   NA   NA

Also we get a strange result if the only index is NA:

> cmat[NA,]
     [,1] [,2] [,3]
[1,] "NA" ""   ""
[2,] "NA" ""   ""
> nmat[NA,]
     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   NA   NA

I would expect to get a single row for these cases rather than the whole
matrix replicated and (incorrectly for the cmat) filled with NA.

I'm reporting this now to get it in. I'll download the 1.5.0 source code now
and muck about to see if I can locate the problem.

-Greg

> version
         _
platform sparc-sun-solaris2.8
arch     sparc
os       solaris2.8
system   sparc, solaris2.8
status
major    1
minor    4.1
year     2002
month    01
day      30
language R

LEGAL NOTICE
Unless expressly stated otherwise, this message is confidential and may be
privileged. It is intended for the addressee(s) only. Access to this E-mail
by anyone else is unauthorized.
If you are not an addressee, any disclosure or copying of the contents of
this E-mail or any action taken (or not taken) in reliance on it is
unauthorized and may be unlawful. If you are not an addressee, please inform
the sender immediately.

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
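On the `nmat[NA,]` case in the report: a bare `NA` in R is a logical value, and logical subscripts are recycled to the length of the dimension being indexed. That is consistent with the result having as many rows as the matrix. The following is a minimal sketch, assuming a current version of R, of how an integer NA behaves as a single positional index instead. The name `one_row` is only illustrative.

```r
nmat <- matrix(1:6, ncol = 3, nrow = 2)

# A bare NA is logical, so it is recycled across both rows of the matrix.
all_rows <- nmat[NA, ]
dim(all_rows)

# An integer NA is a single positional index, so it selects one row.
# drop = FALSE keeps the result as a matrix rather than a plain vector.
one_row <- nmat[as.integer(NA), , drop = FALSE]
dim(one_row)
```

Under this reading, the replicated rows in the numeric case come from the indexing rules, while the `"NA"` and `""` entries in the character case are the actual defect being reported.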
Peter Dalgaard BSA
2002-Apr-10 09:08 UTC
[Rd] Problem handling NA indexes for character matrixes (PR#1447)
gregory_r_warnes@groton.pfizer.com writes:

> In a package I've been developing for manipulating genetic data I discovered
> a problem when indexing into character arrays using NA's:

> > cmat[ indvec, ]
>      [,1] [,2]
> [1,] "a"  "c"
> [2,] "b"  "d"
> [3,] "NA" ""
> > nmat[ indvec, ]
>      [,1] [,2]
> [1,]    1    3
> [2,]    2    4
> [3,]   NA   NA

This seems to have vanished in R-1.5.0pre, likely as a (fortunate, if
unexpected) side effect of the character NA changes:

> cmat[ indvec, ]
     [,1] [,2]
[1,] "a"  "c"
[2,] "b"  "d"
[3,] NA   NA
> nmat[ indvec, ]
     [,1] [,2]
[1,]    1    3
[2,]    2    4
[3,]   NA   NA

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
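To confirm on a given installation that the NA row of the character matrix holds real missing values rather than the strings `"NA"` and `""`, a check along these lines may help. It is a sketch only, assuming a version of R with the character NA changes mentioned above. The name `res` is illustrative.

```r
cmat <- matrix(letters[1:4], ncol = 2, nrow = 2)
indvec <- c(1, 2, NA)

res <- cmat[indvec, ]

# Both entries of the third row should be missing values.
is.na(res[3, ])

# Neither entry should be the literal string "NA" or the empty string.
# %in% is used because == would itself return NA for missing values.
any(res[3, ] %in% c("NA", ""))
```

Testing with `is.na()` rather than comparing printed output avoids confusing the string `"NA"` with a missing value, which is the distinction the original report turns on.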