On Sat, 21 Oct 2000, Kaspar Pflugshaupt wrote:
> Hello,
>
> I have a question concerning the behaviour of the "scale" function in the
> base package. I'm using R 1.1.1 on Windows 95.
>
> If I take a matrix with NA values, such as
>
> > tm <- matrix(c(2,1,0,1,0,NA,NA,NA,0), nrow=3)
> > tm
>      [,1] [,2] [,3]
> [1,]    2    1   NA
> [2,]    1    0   NA
> [3,]    0   NA    0
>
> and scale it, the columns containing NAs come out all NA:
>
> > scale(tm)
>      [,1] [,2] [,3]
> [1,]    1   NA   NA
> [2,]    0   NA   NA
> [3,]   -1   NA   NA
>
> If I just center it, this does not happen:
>
> > scale(tm, scale=F)
>      [,1] [,2] [,3]
> [1,]    1  0.5   NA
> [2,]    0 -0.5   NA
> [3,]   -1   NA    0
>
> Is this difference in NA handling deliberate?
>
> From the help text and from looking at the code of "scale.default", I got
> the impression that the scaling part of it would filter out NAs:
>
> ...
>     if (scale) {
>         f <- function(v) {
>             nas <- is.na(v)
>             if (any(is.na(nas)))
>                 v <- v[!is.na(nas)]
>             sqrt(sum(v^2)/max(1, length(v) - 1))
>         }
>         x <- sweep(x, 2, apply(x, 2, f), "/")
> ...
>
>
> Is the mentioned behaviour thus a bug, or is there a reason for it?

It's a bug, and the prototype gets this right.

nas is a logical vector, so every element is TRUE or FALSE and is.na(nas)
is always FALSE: you always know whether something is an NA or not. The
test any(is.na(nas)) therefore never succeeds and no NAs are removed. All
that is needed is v <- v[!is.na(v)]. I doubt that the pre-test saves any
worthwhile time or space.

I've put a bug fix in.
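
For illustration, here is a minimal sketch of why the quoted filter does
nothing and what the one-line change amounts to. It is not necessarily the
exact code that was committed. The name rms_no_na is hypothetical, and
colMeans() assumes a reasonably current version of R.

```r
v <- c(1, 0, NA)

# is.na() returns a logical vector with no missing entries, so applying
# is.na() a second time gives FALSE everywhere and the filter never fires.
nas <- is.na(v)
never_fires <- any(is.na(nas))

# Hypothetical helper: the quoted f() with the filter applied to v itself.
rms_no_na <- function(v) {
    v <- v[!is.na(v)]                        # drop missing values directly
    sqrt(sum(v^2) / max(1, length(v) - 1))   # same divisor as the quoted code
}

tm <- matrix(c(2, 1, 0, 1, 0, NA, NA, NA, 0), nrow = 3)

# Centre each column ignoring NAs, then divide by the NA-aware scale.
centred <- sweep(tm, 2, colMeans(tm, na.rm = TRUE))
scaled  <- sweep(centred, 2, apply(centred, 2, rms_no_na), "/")
```

With this helper the second column is scaled from its two observed values
and only the originally missing entry stays NA. The third column has a
single observed value, which centres to 0 and has a zero divisor, so that
entry comes out as NaN rather than a usable number.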
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595