David Croll
2014-Mar-07 11:17 UTC
[R] function doesn't change the variable in the right way
Dear R users and friends,

I would like to ask you about the weird behaviour of a function I just wrote. This little function should take a vector, find NAs and substitute them for the mean of the vector, and return the normalized value of that vector.

I've tried both <- and <<- for changing the variables.

That's what I do:

# just a vector:
b <- c(1,1,1,NA,3,2)

# my function:
normalize <- function(x) {

    # copy x into new vector
    xn <- x

    # index of NAs
    nas <- which(is.na(xn) ==TRUE)
    m <- mean(xn, na.rm=TRUE)

    # insert mean for NAs
    xn[nas] <- m

    # normalize
    return((xn - mean(xn))/sd(xn))

}

# run...
normalize(b)

# here's what I get:
# [1] -0.75 -0.75 -0.75 0.00 1.75 0.50

The 4th value should be 1.6, but is 0.

I believe the answer to my problem is pretty obvious, but I can't see it...

Best regards,

David
Hello,

The function does exactly what you tell it to do: first you substitute all NAs with the mean, then you subtract the mean. For the NA this means mean - mean = 0, and this is what you get.

The problem is not the function but the z-score of the mean, which is 0 by definition.

Best regards,
Fabian
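The arithmetic behind this reply can be checked step by step. The following is only a sketch that unrolls the body of normalize() on the vector b from the question; the intermediate names m_after, s and z are introduced here for illustration and do not appear in the original post.

```r
b <- c(1, 1, 1, NA, 3, 2)

# Mean of the five observed values: (1 + 1 + 1 + 3 + 2) / 5
m <- mean(b, na.rm = TRUE)

# Replace the NA with that mean, as normalize() does
xn <- b
xn[is.na(xn)] <- m

# Mean and standard deviation of the filled vector.
# Filling a gap with the mean does not move the mean.
m_after <- mean(xn)
s <- sd(xn)

# Standardize: the filled position has (m - m_after) in its numerator
z <- (xn - m_after) / s

print(c(m = m, m_after = m_after, sd = s))
print(z)
```

Both means come out at about 1.6, so the fourth element of z is (1.6 - 1.6) / sd, which is 0. The value 1.6 is what gets written into xn, not what comes out after standardizing.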
Rui Barradas
2014-Mar-07 14:31 UTC
[R] function doesn't change the variable in the right way
Hello,

Inline.

On 07-03-2014 11:17, David Croll wrote:
> The 4th value should be 1.6, but is 0.

No, it shouldn't. The value of the 4th element should be 0. Its numerator is equal to

mean(xn, na.rm = TRUE) - mean(new xn)

and this is zero, because filling the NA with the mean leaves the mean unchanged.

Hope this helps,

Rui Barradas
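If the expectation of 1.6 came from mixing up the filled-in raw value with the standardized value, it may help to keep the two steps apart. The sketch below assumes two helper functions, impute_mean() and zscore(); both names are hypothetical and are not part of the thread or of base R.

```r
# Hypothetical helper: fill NAs with the mean of the observed values
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Hypothetical helper: standardize to mean 0 and standard deviation 1
zscore <- function(x) (x - mean(x)) / sd(x)

b <- c(1, 1, 1, NA, 3, 2)

filled <- impute_mean(b)  # raw scale: the 4th element is the mean of b
z <- zscore(filled)       # standardized scale: the 4th element is 0

# The same pattern on a second, made-up vector: every filled position
# ends up at 0 after standardizing.
v <- c(10, NA, 4, 7, NA, 1)
z_v <- zscore(impute_mean(v))

print(filled[4])
print(z[4])
print(z_v[is.na(v)])
```

The choice between <- and <<- plays no part here: the result follows from the arithmetic alone, so any position filled with the mean standardizes to 0.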