David Croll
2014-Mar-07 11:17 UTC
[R] function doesn't change the variable in the right way
Dear R users and friends,
I would like to ask you about the weird behaviour of a function I just
wrote. This little function should take a vector, find the NAs,
replace them with the mean of the vector, and return the normalized
values of that vector.
I've tried both <- and <<- for changing the variables.
That's what I do:
# just a vector:
b <- c(1,1,1,NA,3,2)
# my function:
normalize <- function(x) {
  # copy x into new vector
  xn <- x
  # index of NAs
  nas <- which(is.na(xn) == TRUE)
  m <- mean(xn, na.rm = TRUE)
  # insert mean for NAs
  xn[nas] <- m
  # normalize
  return((xn - mean(xn)) / sd(xn))
}
# run...
normalize(b)
# here's what I get:
# [1] -0.75 -0.75 -0.75  0.00  1.75  0.50
The 4th value should be 1.6, but is 0.
I believe the answer to my problem is pretty obvious, but I can't see it...
Best regards,
David
Hello,

The function does exactly what you tell it to do. First you replace every NA with the mean; then you subtract the mean. For the former NA this means mean - mean = 0, and this is what you get. The problem is not the function but the z-score of the mean, which is always 0.

Best regards,
fabian

On 07-03-2014 12:17, David Croll wrote:
> The 4th value should be 1.6, but is 0.
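The steps described in this reply can be traced one at a time. The following is a minimal sketch using the vector b from the original post. The helper normalize_keep_na is a hypothetical name that does not appear in the thread; it only illustrates one possible alternative, assuming the NA should stay missing instead of being imputed.

```r
b <- c(1, 1, 1, NA, 3, 2)

# Step 1: mean of the observed (non-NA) values
m <- mean(b, na.rm = TRUE)      # 1.6

# Step 2: mean imputation, as done inside normalize()
xn <- b
xn[is.na(xn)] <- m

# Step 3: z-score; the imputed element sits at the mean,
# so its z-score is (numerically) zero
z <- (xn - mean(xn)) / sd(xn)
print(round(z, 2))

# Hypothetical alternative (not from the thread): standardize using
# the observed values only and leave the NA in place
normalize_keep_na <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}
print(round(normalize_keep_na(b), 2))
```

Whether to impute or to keep the NA depends on what the normalized vector is used for; the thread itself does not settle that question.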
Rui Barradas
2014-Mar-07 14:31 UTC
[R] function doesn't change the variable in the right way
Hello,

Inline.

On 07-03-2014 11:17, David Croll wrote:
> # here's what I get:
> # [1] -0.75 -0.75 -0.75 0.00 1.75 0.50
>
> The 4th value should be 1.6, but is 0.

No, it shouldn't. The value of the 4th element should be 0. It is equal to mean(xn, na.rm = TRUE) - mean(new xn), that is, the mean inserted for the NA minus the mean of the vector after the insertion, and this is zero.

Hope this helps,

Rui Barradas
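The 1.6 expected in the question is the raw value inserted for the NA, not its normalized value. Inserting the mean m into a vector with 5 observed values gives a new mean of (5*m + m)/6 = m, so the mean does not move. A short check of that arithmetic, as a sketch that reuses b from the original post and base R's replace():

```r
b <- c(1, 1, 1, NA, 3, 2)

m  <- mean(b, na.rm = TRUE)       # mean of the 5 observed values
xn <- replace(b, is.na(b), m)     # mean-imputed copy of b

# The imputed raw value is the mean of the observed values
print(xn[4])

# Imputing the mean leaves the mean of the vector where it was
print(isTRUE(all.equal(mean(xn), m)))

# Hence the numerator of the z-score at position 4 is zero
print(isTRUE(all.equal(xn[4] - mean(xn), 0)))
```

The comparisons use all.equal() rather than == because the two means can differ by floating-point rounding.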