I've been using R recently to analyze some data, but I'm having a problem using the mean() function.

I imported the original data set as a vector of integers, x, and then calculated an exponential moving average of the data, x_ema. This part worked fine.

Then I tried to find the mean squared error between the original series and the moving average, using mse = mean((x - x_ema)^2). This gives N/A as a result, which seems to come from the mean function. When I run mean() on x_ema, which is of data type double, it always returns N/A. I can find the mean of the original integer data just fine, as well as for a simple test vector of 10 doubles, but it never seems to return a usable result for the x_ema vector. Is this because the vector is too long? (Both x and x_ema contain around 40000 values.) Am I running into some memory limit? I can do other calculations with x_ema just fine; it's only the mean() function that doesn't seem to work. All the information in the help seems to indicate that this operation should work without a problem.

If it matters at all, I'm using R 2.9.2 running on Windows Vista.

I'd appreciate any help or suggestions as to what might be wrong.

Thank you,
Reuben Bellika
Have you tried:

mean(x)
mean(as.numeric(as.character(x)))
mean(x_ema)
mean(as.numeric(as.character(x_ema)))

What is the result of the following?

which(is.na(as.numeric(as.character(x_ema))))

It's a bit hard to say since you don't provide the data, but there may be an NA or character value that is causing the error.

Hope this helps a bit.

--- On Fri, 10/16/09, Reuben Bellika <rubenyi at gmail.com> wrote:

> From: Reuben Bellika <rubenyi at gmail.com>
> Subject: [R] Cannot calculate mean() for double vector
> To: r-help at r-project.org
> Date: Friday, October 16, 2009, 12:06 PM
>
> I've been using R recently to analyze some data, but I'm having a
> problem using the mean() function.
Reuben Bellika wrote:
>
> Then, I tried to find the mean squared error between the original
> series and the moving average, using mse = mean((x - x_ema)^2). This
> gives N/A as a result, which seems to be the result of the mean
> function. When I run mean() on x_ema, which is of data type double,
> it always returns N/A.
>
So, x_ema includes one (or more) NA (and not N/A) in it.

Test:

if (any(is.na(x_ema))) cat("Oops! NAs in x_ema\n")

To find which elements are NA:

which(is.na(x_ema))

Alberto Monteiro
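To make the point above concrete, here is a minimal sketch of how a single NA propagates through mean() and how to locate it. The vector x_ema_demo is a made-up stand-in for x_ema, not the poster's data.

```r
# Hypothetical stand-in for x_ema: the first three values are missing.
x_ema_demo <- c(NA, NA, NA, 10.2, 10.8, 11.1)

# mean() returns NA as soon as any element is NA, regardless of length.
m <- mean(x_ema_demo)
print(m)

# Check whether there are any NAs, where they are, and how many.
has_na   <- any(is.na(x_ema_demo))
na_index <- which(is.na(x_ema_demo))
na_count <- sum(is.na(x_ema_demo))

print(has_na)    # TRUE
print(na_index)  # 1 2 3
print(na_count)  # 3
```

The same checks apply to (x - x_ema)^2, since arithmetic on NA also yields NA.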
OK. It looks like I just have several NA values at the start of my array:

> which(is.na(x_ema))
[1] 1 2 3 4 5 6 7 8 9

That makes sense, because the moving average is not defined for those positions. I'll just have to set those values to zero:

> x_ema = replace(x_ema, which(is.na(x_ema)), 0)
> which(is.na(x_ema))
integer(0)

The mean() call works now and I can get on with my work. I'll have to remember to condition the data like this in the future.

Thanks for the help!

Reuben Bellika
It looks like I should read the documentation a bit more carefully. Simply ignoring the NA values when calculating the mean *is* a much better solution. (Although having 10 values out of a vector of 40000 or so set to zero is *not* going to bias the mean toward zero in any significant way.) Thanks to everyone for your help. Reuben
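For reference, a small sketch of the two approaches discussed in this thread, using invented toy vectors (x_demo and x_ema_demo are assumptions for illustration, not the original data). The na.rm argument of mean() drops missing values before averaging.

```r
# Hypothetical toy data: the moving average is undefined at the first two positions.
x_demo     <- c(10L, 12L, 11L, 13L, 12L, 14L)
x_ema_demo <- c(NA, NA, 11.0, 12.0, 12.0, 13.0)

# Approach 1: ignore the positions where the moving average is undefined.
mse_skip <- mean((x_demo - x_ema_demo)^2, na.rm = TRUE)

# Approach 2: replace the NAs with zero, as in the earlier message.
x_ema_zero <- replace(x_ema_demo, is.na(x_ema_demo), 0)
mse_zero   <- mean((x_demo - x_ema_zero)^2)

print(mse_skip)  # 0.5
print(mse_zero)  # 41
```

On a six-element toy vector the zero-filled positions dominate the squared error. With roughly 40000 values and only a handful of NAs the effect is far smaller, but na.rm = TRUE avoids introducing it at all.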