thr3ads.net - R help - [R] scale() and NA values [Oct 2000]

Kaspar Pflugshaupt

2000-Oct-21 10:24 UTC

[R] scale() and NA values

Hello,

I've a question concerning the behaviour of the "scale" function in the base
package. I'm using R 1.1.1 on Windows 95.

If I take a matrix with NA values, such as

> tm <- matrix(c(2,1,0,1,0,NA,NA,NA,0), nrow=3)
> tm
     [,1] [,2] [,3]
[1,]    2    1   NA
[2,]    1    0   NA
[3,]    0   NA    0

and scale it, the columns containing NAs come out all NA:

> scale(tm)
     [,1] [,2] [,3]
[1,]    1   NA   NA
[2,]    0   NA   NA
[3,]   -1   NA   NA

If I just center it, this does not happen:

> scale(tm, scale=F)
     [,1] [,2] [,3]
[1,]    1  0.5   NA
[2,]    0 -0.5   NA
[3,]   -1   NA    0

Is this difference in NA handling deliberate?

From the help text and from looking at the code of "scale.default", I got
the impression that the scaling part of it would filter out NAs:

...
        if (scale) {
            f <- function(v) {
                nas <- is.na(v)
                if (any(is.na(nas)))
                  v <- v[!is.na(nas)]
                sqrt(sum(v^2)/max(1, length(v) - 1))
            }
            x <- sweep(x, 2, apply(x, 2, f), "/")
...


Is the mentioned behaviour thus a bug, or is there a reason for it?

Cheers

Kaspar Pflugshaupt

-- 

Kaspar Pflugshaupt
Geobotanisches Institut
Zuerichbergstr. 38
CH-8044 Zuerich

Tel. ++41 1 632 43 19
Fax  ++41 1 632 12 15

mailto:pflugshaupt at geobot.umnw.ethz.ch
privat:pflugshaupt at mails.ch
http://www.geobot.umnw.ethz.ch

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Prof Brian D Ripley

2000-Oct-21 15:35 UTC

[R] scale() and NA values

On Sat, 21 Oct 2000, Kaspar Pflugshaupt wrote:
> Hello,
> 
> I've a question concerning the behaviour of the "scale" function in the base
> package. I'm using R 1.1.1 on Windows 95.
> 
> If I take a matrix with NA values, such as
> 
> > tm <- matrix(c(2,1,0,1,0,NA,NA,NA,0), nrow=3)
> > tm
>      [,1] [,2] [,3]
> [1,]    2    1   NA
> [2,]    1    0   NA
> [3,]    0   NA    0
> 
> and scale it, the columns containing NAs come out all NA:
> 
> > scale(tm)
>      [,1] [,2] [,3]
> [1,]    1   NA   NA
> [2,]    0   NA   NA
> [3,]   -1   NA   NA
> 
> If I just center it, this does not happen:
> 
> > scale(tm, scale=F)
>      [,1] [,2] [,3]
> [1,]    1  0.5   NA
> [2,]    0 -0.5   NA
> [3,]   -1   NA    0
> 
> Is this difference in NA handling deliberate?
> 
> From the help text and from looking at the code of "scale.default", I got
> the impression that the scaling part of it would filter out NAs:
> 
> ...
>         if (scale) {
>             f <- function(v) {
>                 nas <- is.na(v)
>                 if (any(is.na(nas)))
>                   v <- v[!is.na(nas)]
>                 sqrt(sum(v^2)/max(1, length(v) - 1))
>             }
>             x <- sweep(x, 2, apply(x, 2, f), "/")
> ...
> 
> 
> Is the mentioned behaviour thus a bug, or is there a reason for it?

It's a bug, and the prototype gets this right.

nas will be TRUE or FALSE.
is.na(nas) is always FALSE: you always know if something is an NA or not.

All that is needed is v <- v[!is.na(v)]. I doubt if the pre-test saves any
worthwhile time or space.

I've put a bug fix in.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
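
The point made in the reply can be checked directly. What follows is a minimal
sketch, assuming the helper exactly as quoted from "scale.default" earlier in
the thread; the names f_buggy and f_fixed are hypothetical labels introduced
here for illustration and do not appear in the R sources.

```r
# Helper as quoted in the thread: the NA filter never fires.
f_buggy <- function(v) {
    nas <- is.na(v)              # logical vector, contains only TRUE/FALSE
    if (any(is.na(nas)))         # is.na() of a TRUE/FALSE vector is all FALSE
        v <- v[!is.na(nas)]      # so this line is never reached
    sqrt(sum(v^2) / max(1, length(v) - 1))
}

# Same helper with the one-line change suggested in the reply.
f_fixed <- function(v) {
    v <- v[!is.na(v)]            # drop missing values before summing
    sqrt(sum(v^2) / max(1, length(v) - 1))
}

# Second column of tm after centring, as shown by scale(tm, scale=F).
v <- c(0.5, -0.5, NA)

any(is.na(is.na(v)))   # FALSE: the pre-test can never succeed
f_buggy(v)             # NA, because sum(v^2) still sees the NA
f_fixed(v)             # sqrt(0.5), about 0.7071068
```

With the filter working, the column divisor is a finite number, so only the
entries that were NA in the input stay NA after scaling.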
