thr3ads.net - R devel - [Rd] Inconsistency between rowMeans documentation and reality? [Apr 2011]

Gavin Simpson

2011-Apr-05 11:33 UTC

[Rd] Inconsistency between rowMeans documentation and reality?

Dear List,

I'm not even sure this is an issue or not, but ?rowMeans has:

Value:

     A numeric or complex array of suitable size, or a vector if the
     result is one-dimensional.  The 'dimnames' (or 'names' for a
     vector result) are taken from the original array.

     If there are no values in a range to be summed over (after
     removing missing values with 'na.rm = TRUE'), that component of
     the output is set to '0' ('*Sums') or 'NA' ('*Means'), consistent
     with 'sum' and 'mean'.

However the output of mean() and rowMeans() is not exactly the same when
all supplied values are missing.

> mean(NA, na.rm = TRUE)
[1] NaN
> mean(rep(NA, 5), na.rm = TRUE)
[1] NaN
> rowMeans(matrix(rep(NA, 5), ncol = 5), na.rm = TRUE)
[1] NA

So in one sense, the outputs are not consistent:

> is.nan(mean(rep(NA, 5), na.rm = TRUE))
[1] TRUE
> is.nan(rowMeans(matrix(rep(NA, 5), ncol = 5), na.rm = TRUE))
[1] FALSE

but in another they are:

> is.na(mean(rep(NA, 5), na.rm = TRUE))
[1] TRUE
> is.na(rowMeans(matrix(rep(NA, 5), ncol = 5), na.rm = TRUE))
[1] TRUE

I'm not familiar enough with the details to know if this even matters,
but wonder if something in the documentation needs a change or tweak to
clarify what is returned. As I say, in one sense the outputs are not
consistent.
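
For code that only needs to detect the all-missing case, one option is to
test with is.na() rather than is.nan(), since is.na() is TRUE for both NA
and NaN. The following is a minimal sketch under that assumption, using
only base R; the variable names m, row_result and vec_result are
illustrative and not taken from the help page.

```r
# One row in which every value is missing (a logical NA matrix, as above).
m <- matrix(rep(NA, 5), ncol = 5)

row_result <- rowMeans(m, na.rm = TRUE)
vec_result <- mean(rep(NA, 5), na.rm = TRUE)

# is.na() is TRUE for both NA and NaN, so these checks do not depend on
# which of the two values a given function returns.
print(is.na(row_result))
print(is.na(vec_result))
```

Whether rowMeans() itself prints NA or NaN here should be treated as an
implementation detail and checked on the R version in use.
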

> sessionInfo()
R version 2.13.0 beta (2011-04-04 r55298)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C             
 [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8    
 [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8   
 [7] LC_PAPER=en_GB.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

loaded via a namespace (and not attached):
[1] tools_2.13.0

Thanks,

Gavin
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Prof Brian Ripley

2011-Apr-11 14:57 UTC

[Rd] Inconsistency between rowMeans documentation and reality?

I suspect you omitted some of the help page:

   As they are written for speed, they blur over some of the subtleties
   of 'NaN' and 'NA'.

So, given that (and that a real-valued NA is itself a specific NaN
value), I think it is perfectly reasonable to claim they are consistent
with mean.
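
To illustrate the relationship between the two values at the R level, here
is a small sketch using only base R. The helper classify_missing() is
hypothetical (it is not part of base R) and is included only to show how
is.na() and is.nan() can be combined when the distinction does matter.

```r
# Hypothetical helper (not part of base R): label each element of a
# numeric vector as "NaN", "NA", or "value".
classify_missing <- function(x) {
  ifelse(is.nan(x), "NaN", ifelse(is.na(x), "NA", "value"))
}

print(is.na(NaN))        # TRUE: is.na() counts NaN as missing
print(is.nan(NA_real_))  # FALSE: is.nan() is TRUE only for NaN
print(classify_missing(c(1, NA, NaN)))
```

The inner ifelse() is needed because is.na() alone cannot separate the two
cases; is.nan() has to be checked first.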

On Tue, 5 Apr 2011, Gavin Simpson wrote:
> Dear List,
>
> I'm not even sure this is an issue or not, but ?rowMeans has:
>
> Value:
>
>     A numeric or complex array of suitable size, or a vector if the
>     result is one-dimensional.  The 'dimnames' (or 'names' for a
>     vector result) are taken from the original array.
>
>     If there are no values in a range to be summed over (after
>     removing missing values with 'na.rm = TRUE'), that component of
>     the output is set to '0' ('*Sums') or 'NA' ('*Means'), consistent
>     with 'sum' and 'mean'.
>
> However the output of mean() and rowMeans() is not exactly the same when
> all supplied values are missing.
>
>> mean(NA, na.rm = TRUE)
> [1] NaN
>> mean(rep(NA, 5), na.rm = TRUE)
> [1] NaN
>> rowMeans(matrix(rep(NA, 5), ncol = 5), na.rm = TRUE)
> [1] NA
>
> So in one sense, the outputs are not consistent:
>
>> is.nan(mean(rep(NA, 5), na.rm = TRUE))
> [1] TRUE
>> is.nan(rowMeans(matrix(rep(NA, 5), ncol = 5), na.rm = TRUE))
> [1] FALSE
>
> but in another they are:
>
>> is.na(mean(rep(NA, 5), na.rm = TRUE))
> [1] TRUE
>> is.na(rowMeans(matrix(rep(NA, 5), ncol = 5), na.rm = TRUE))
> [1] TRUE
>
> I'm not familiar enough with the details to know if this even matters,
> but wonder if something in the documentation needs a change or tweak to
> clarify what is returned. As I say, in one sense the outputs are not
> consistent.
>
>> sessionInfo()
> R version 2.13.0 beta (2011-04-04 r55298)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
> [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
> [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
> [7] LC_PAPER=en_GB.utf8       LC_NAME=C
> [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods
> [7] base
>
> loaded via a namespace (and not attached):
> [1] tools_2.13.0
>
> Thanks,
>
> Gavin
> -- 
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
