ripley@stats.ox.ac.uk
1999-Jul-15 07:14 UTC
which() does not handle NAs in named vectors. (PR#226)
Version:
 platform = sparc-sun-solaris2.6
 arch = sparc
 os = solaris2.6
 system = sparc, solaris2.6
 status =
 status.rev = 0
 major = 0
 minor = 64.2
 year = 1999
 month = July
 day = 3
 language = R

-- It is unclear to me that the handling of NAs is desirable, and it has
   problems with names:

> z <- c(T,T,NA,F,T)
> names(z) <- letters[1:5]
> which(z)
Error: names attribute must be the same length as the vector

(Why do the vector and its names have different subscripts? And while you
are correcting this,

Arguments:

       x: a logical vector or array. `NA's are allowed an omitted.

has a typo, and the logic can be simplified: see below.)

On Thu, 15 Jul 1999, Martin Maechler wrote:

> >>>>> "BDR" == Prof Brian D Ripley <ripley@stats.ox.ac.uk> writes:
>
>     BDR> On Wed, 14 Jul 1999, Friedrich Leisch wrote:
>     >> >>>>> On Wed, 14 Jul 1999 04:09:21,
>     >> >>>>> Peter B Mandeville (PBM) wrote:
>     >>
>     PBM> I have a vector Pes with 600 elements some of which are NA's. How
>     PBM> can I form a vector of the indices of the NA's.
>     >>
>     PBM> for(i in 1:600) if(is.na(Pes[i])) print(i)
>     >>
>     PBM> prints the indices of the NA's but I can't figure out how to put
>     PBM> the results in a vector.
>
>     >> try this:
>     >>
>     >> x <- (1:length(Pes))[is.na(Pes)]
>
>     BDR> Tip: that sort of thing often fails for a length 0 vector. The
>     BDR> `approved' spell is
>
>     BDR> seq(along=Pes)[is.na(Pes)]
>
>     BDR> In this case it does not matter as the subscript is of length 0,
>     BDR> but it has floored enough library/package writers to be worth
>     BDR> thinking about.
>
> Good teaching about seq() vs. 1:n
>
> However, the solution I gave
>
>       which(is.na(Pes))
>
> is the one I still really recommend;
> it does deal with 0-length objects, and it keeps names when there are some,
> and it has an `arr.ind = FALSE' argument to return array indices instead of
> vector indices when so desired.

Yes, but

-- It is not in S (so causing difficulty in porting from R to S)

-- It looks a relatively expensive operation.

-- Internally which could be simplified by using seq(along=) as it is a
   wrapper for this construct, but actually the separate handling of
   n == 0 is unnecessary (as logic & !is.na(logic) will have length zero.)

Brian

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
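The tip about length 0 vectors in the message above can be tried directly. What follows is a minimal sketch, not code from the thread: the name `Pes0` and the two counters are made up for illustration, and the behaviour shown is that of a current R, which may differ in detail from the 0.64.2 release under discussion.

```r
# Hypothetical empty input; `Pes0` is an illustrative name, not from the thread.
Pes0 <- numeric(0)

1:length(Pes0)       # 1:0 counts down, so this is c(1, 0), not an empty vector
seq(along = Pes0)    # a length 0 index vector, as wanted

# A loop over 1:length() therefore runs twice on empty input ...
n_bad <- 0
for (i in 1:length(Pes0)) n_bad <- n_bad + 1

# ... while a loop over seq(along =) does not run at all.
n_good <- 0
for (i in seq(along = Pes0)) n_good <- n_good + 1

# With a logical subscript of length 0 both spellings happen to agree,
# which is the "in this case it does not matter" remark in the message.
idx_bad  <- (1:length(Pes0))[is.na(Pes0)]
idx_good <- seq(along = Pes0)[is.na(Pes0)]
```

The loop form is where the difference bites: the subscripted form is rescued by the empty logical index, the loop is not.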
Martin Maechler
1999-Jul-15 12:57 UTC
which() does not handle NAs in named vectors. (PR#226)
>>>>> On Thu, 15 Jul 1999 09:14, ripley@stats.ox.ac.uk (Brian D. Ripley) said:

Thank you for the bug report.

    BDR> -- It is unclear to me that the handling of NAs is desirable, and
    BDR> it has problems with names:

{function which in its present form very much evolved out of user wishes...}

    BDR> z <- c(T,T,NA,F,T)
    BDR> names(z) <- letters[1:5]
    BDR> which(z)
    BDR> Error: names attribute must be the same length as the vector

Fixed for release-patches [available in a day or two from CRAN src/devel/],
and hence for every new release.

    BDR> (Why do the vector and its names have different subscripts? And
    BDR> while you are correcting this,

    BDR> Arguments:

    BDR> x: a logical vector or array. `NA's are allowed an
    BDR> omitted.

is now

       x: a `logical' vector or array.  `NA's are allowed and omitted
          (treated as if `FALSE').

    BDR> has a typo, and the logic can be simplified: see below.)

    BDR> On Thu, 15 Jul 1999, Martin Maechler wrote:
    >> >>>>> "BDR" == Prof Brian D Ripley <ripley@stats.ox.ac.uk> writes:
    >>
    BDR> On Wed, 14 Jul 1999, Friedrich Leisch wrote:
    >> >> >>>>> On Wed, 14 Jul 1999 04:09:21,
    >> >> >>>>> Peter B Mandeville (PBM) wrote:
    >> >>
    PBM> I have a vector Pes with 600 elements some of which are NA's. How
    PBM> can I form a vector of the indices of the NA's.
    >> >>
    PBM> for(i in 1:600) if(is.na(Pes[i])) print(i)
    >> >>
    PBM> prints the indices of the NA's but I can't figure out how to put
    PBM> the results in a vector.
    >> >> try this:
    >> >>
    >> >> x <- (1:length(Pes))[is.na(Pes)]
    >>
    BDR> Tip: that sort of thing often fails for a length 0 vector. The
    BDR> `approved' spell is
    >>
    BDR> seq(along=Pes)[is.na(Pes)]

BTW, currently seq(along = x) returns "numeric" ("double") whereas
1:length(x) returns "integer". I'm about to fix this...

    BDR> In this case it does not matter as the subscript is of length 0,
    BDR> but it has floored enough library/package writers to be worth
    BDR> thinking about.

    >> Good teaching about seq() vs. 1:n
    >>
    >> However, the solution I gave
    >>
    >>       which(is.na(Pes))
    >>
    >> is the one I still really recommend; it does deal with 0-length
    >> objects, and it keeps names when there are some, and it has an
    >> `arr.ind = FALSE' argument to return array indices instead of vector
    >> indices when so desired.

    BDR> Yes, but

    BDR> -- It is not in S (so causing difficulty in porting from R to S)

Well, I know what you mean, and your point is well taken in the above
case... but anyway:

Our group here has been using this ("which" function) in S for quite a
while, and eventually someone will have to collect a library of things from
R that are missing in S-plus and easily implementable.

And then, for quite a few R users, S-plus backward compatibility is not the
big issue. Locally, in our collection of S-plus add-ons, we already have
quite a few of them.

And in other ways, R is so much nicer:
  - math annotation in graphics
  - color, line types  { plot(x,y, col="light blue", col.main = "blue") }
  - filled.contour
  - persp() with shading
  ..

If you want to live in both worlds, I recommend using

      if(is.R()) {
          ...R specific...
      } else { ## S-plus ---
          ...S-plus specific...
      }

anyway, even within user-written functions, and making sure
(via .First or S_FIRST or ...) that  is.R() |--> FALSE  in S-plus.

    BDR> -- It looks a relatively expensive operation.

I don't think it is expensive (for arr.ind=FALSE !) if you want to deal
with missings (NA) at all.
(Peter's example above is one of the few places where you are absolutely
sure there are no missings...)

Assume x has some NAs, e.g.

      x <- rnorm(1000); x[1000*runif(rpois(1,lam=50))] <- NA

Then

      which( x < -2 )

works how one would want;

      seq(along = x)[x < -2]

gives silly NA's (which make sense for the logical vector but not for the
extraction).

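The contrast just described is easiest to see on a vector small enough to check by eye. This is only an illustrative sketch: the six values below are made up and are not the random data of the message, and the output comments describe a current R.

```r
# Small hypothetical vector with missing values (not data from the thread).
x <- c(-2.5, NA, 0.3, -3.1, NA, 1.2)

cond <- x < -2            # TRUE NA FALSE TRUE NA FALSE

which(cond)               # 1 4: NAs are treated as FALSE and dropped
seq(along = x)[cond]      # 1 NA 4 NA: each NA in the subscript yields an NA

# The same result as which(), spelled out by hand:
seq(along = x)[cond & !is.na(cond)]

# Names, when present, are carried along by which():
names(x) <- letters[1:6]
which(x < -2)             # named result: a = 1, d = 4
```

The hand-written form in the middle is the construct that `which' wraps, so the cost of `which' over plain subscripting is essentially the extra `!is.na()' pass.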
    BDR> -- Internally which could be simplified by using seq(along=) as
    BDR> it is a wrapper for this construct, but actually the separate
    BDR> handling of n == 0 is unnecessary (as logic & !is.na(logic) will
    BDR> have length zero.)

You are right, and that's part of the fix for `which', which is currently

which <- function(logic, arr.ind = FALSE) {
    if(!is.logical(logic))
        stop("argument to \"which\" is not logical")
    wh <- seq(along=logic)[ll <- logic & !is.na(logic)]
    if ((m <- length(wh)) > 0) {
        dl <- dim(logic)
        if (is.null(dl) || !arr.ind) {
            names(wh) <- names(logic)[ll]
        } else { ##-- return a matrix  length(wh) x rank
            rank <- length(dl)
            wh1 <- wh - 1
            wh <- 1 + wh1 %% dl[1]
            wh <- matrix(wh, nrow = m, ncol = rank,
                         dimnames =
                         list(dimnames(logic)[[1]][wh],
                              if(rank == 2) c("row", "col")# for matrices
                              else paste("dim", 1:rank, sep="")))
            if(rank >= 2) {
                denom <- 1
                for (i in 2:rank) {
                    denom <- denom * dl[i-1]
                    nextd1 <- wh1 %/% denom# (next dim of elements) - 1
                    wh[,i] <- 1 + nextd1 %% dl[i]
                }
            }
            storage.mode(wh) <- "integer"
        }
    }
    wh
}
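The index arithmetic in the `arr.ind' branch can be checked by hand on a small matrix. The sketch below is an illustration under assumptions: the matrix `m' and the names `rows', `cols' and `ai' are made up, and the calls go to whatever `which' ships with the reader's R rather than to the listing above.

```r
# Hypothetical 2 x 3 logical matrix, filled column by column, with one NA.
m <- matrix(c(TRUE, FALSE, NA, TRUE, FALSE, TRUE), nrow = 2)

wh <- which(m)                    # vector indices of the TRUE cells: 1 4 6
ai <- which(m, arr.ind = TRUE)    # one row per TRUE cell: (1,1), (2,2), (2,3)

# The same row/column positions recovered from the vector indices,
# using the modular arithmetic of the arr.ind branch:
wh1  <- wh - 1
rows <- 1 + wh1 %% nrow(m)                 # 1 2 2
cols <- 1 + (wh1 %/% nrow(m)) %% ncol(m)   # 1 2 3
```

For arrays of higher rank the same step repeats, with the divisor growing by one dimension extent per pass, which is what the `denom' loop in the listing does.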