thr3ads.net - R help - [R] Role of na.rm inside mean() [Jul 2011]

Doran, Harold

2011-Jul-12 16:26 UTC

[R] Role of na.rm inside mean()

This is posed purely out of curiosity (not as a criticism per se). What is the
functional role of the na.rm argument inside the mean() function? If there are
missing values, mean() returns NA by default, as in the example below. But is
there ever a purpose in computing a mean only to receive NA as a result?

In 10 years of using R, I have always called mean() in order to get a result,
which is the opposite of its default behavior when there are NAs. Can anyone
suggest a reason why it is in fact desirable to get NA as the result of
computing mean()?

> x <- rnorm(100)
> x[1] <- NA
> mean(x)
[1] NA
> mean(x, na.rm=TRUE)
[1] 0.08136736

If the reason is to alert the user that the vector has missing values, I suppose
I could buy that. But I think other checks are better.

Harold



Jeff Newmiller

2011-Jul-12 16:35 UTC


In SQL, aggregate functions such as AVG() ignore NULL (roughly equivalent to NA
in R) by default.

However, it can be dangerous to fail to verify how much data was actually used
in an aggregation, so the logic behind the default na.rm setting may be one of
encouraging the user to take responsibility for missing data.
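
One way to act on this advice is to report how many values went into the mean
alongside the mean itself. The following is only a minimal sketch: the helper
name mean_with_counts is hypothetical (it is not part of base R), and the data
are made up for illustration.

```r
# Hypothetical helper (not part of base R): return the mean of the
# non-missing values together with how many values were used and dropped.
mean_with_counts <- function(x) {
  n_missing <- sum(is.na(x))
  c(mean      = mean(x, na.rm = TRUE),
    n_used    = length(x) - n_missing,
    n_missing = n_missing)
}

x <- c(2, 4, NA, 6)   # made-up data with one missing value
mean_with_counts(x)   # mean = 4, n_used = 3, n_missing = 1
```

Returning the counts with the result keeps the amount of missing data visible
at the point where the aggregate is used, instead of relying on a separate
check that may be forgotten.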

Duncan Murdoch

2011-Jul-12 16:38 UTC


On 12/07/2011 12:26 PM, Doran, Harold wrote:
> This is just posed out of curiosity, (not as a criticism per se). But what is
> the functional role of the argument na.rm inside the mean() function? If there
> are missing values, mean() will always return an NA as in the example below.
> But, is there ever a purpose in computing a mean only to receive NA as a result?

The general idea in R is that NA stands for "unknown".  If some of the
values in a vector are unknown, then the mean of the vector is also 
unknown.  NA is also used in other ways sometimes; then it makes sense 
to remove it and compute the mean of the other values.
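
The "unknown" reading can be seen in how NA propagates through ordinary
operations. This is a small illustrative sketch in base R with a made-up
vector; it is not taken from the original post.

```r
x <- c(1, 2, NA)

NA + 1                  # NA: an unknown value plus one is still unknown
sum(x)                  # NA
mean(x)                 # NA: the default is na.rm = FALSE
mean(x, na.rm = TRUE)   # 1.5: drops the NA and averages 1 and 2

# When the answer does not depend on the unknown value, R returns it:
NA & FALSE              # FALSE, whatever the unknown value is
NA | TRUE               # TRUE, whatever the unknown value is
```

The last two lines show that NA is not simply contagious: R returns NA only
when the result genuinely depends on the value that is missing.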

Duncan Murdoch

Joshua Wiley

2011-Jul-12 16:43 UTC


Hi Harold,

Many (most?) of the statistics functions have a similar argument.  I
suspect it is there partly to warn the user: you have to be explicit about
removing missing values, rather than the program silently dropping or
ignoring values that would not work in the function called.  I can think of
one example where I want a missing value returned.  In psychology we often
create scores on some construct (say optimism) by averaging
individuals' responses to several questions.  In certain cases, if a
subject does not respond to one question, their overall score should
be missing.  This is easily accomplished by leaving na.rm = FALSE.
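
The scoring case might look something like the sketch below. The item matrix
is made up for illustration, and rowMeans() is used here as one convenient way
to average each subject's responses; the post itself does not say which
function was used.

```r
# Made-up responses: 3 subjects (rows) by 4 items (columns), scored 1 to 5.
# Subject 2 skipped item 2.
items <- matrix(c(4,  5, 3, 4,
                  2, NA, 3, 1,
                  5,  5, 4, 4),
                nrow = 3, byrow = TRUE,
                dimnames = list(paste0("subj", 1:3), paste0("item", 1:4)))

# Default na.rm = FALSE: a subject with any skipped item gets a missing score
strict <- rowMeans(items)                  # subj1 = 4, subj2 = NA, subj3 = 4.5

# na.rm = TRUE: each score averages only the items that were answered
lenient <- rowMeans(items, na.rm = TRUE)   # subj1 = 4, subj2 = 2, subj3 = 4.5
```

Which version is appropriate depends on the scoring rules for the instrument;
the default simply makes the stricter rule the one you get without asking.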

Cheers,

Josh


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
https://joshuawiley.com/
