thr3ads.net - R devel - [R] variance/mean [Mar 2009]


rkevinburton at charter.net

2009-Mar-22 08:17 UTC

[R] variance/mean

At the risk of appearing ignorant, why is the following true?

o <- cbind(rep(1,3),rep(2,3),rep(3,3))
var(o)
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0    0

and

mean(o)
[1] 2

How do I get mean to return an array similar to var? I would expect in the above
example a vector of length 3 {1,2,3}.

Thank you for your help.

Kevin

(Ted Harding)

2009-Mar-22 09:01 UTC


On 22-Mar-09 08:17:29, rkevinburton at charter.net wrote:
> At the risk of appearing ignorant why is the following true?
> 
> o <- cbind(rep(1,3),rep(2,3),rep(3,3))
> var(o)
>      [,1] [,2] [,3]
> [1,]    0    0    0
> [2,]    0    0    0
> [3,]    0    0    0
> 
> and
> 
> mean(o)
> [1] 2
> 
> How do I get mean to return an array similar to var? I would expect in
> the above example a vector of length 3 {1,2,3}.
> 
> Thank you for your help.
> Kevin

This is a consequence of (understandable) confusion about how var()
and mean() operate! It is not explicit, in "?var", that if you apply
var() to a matrix, as in your "var(o)", you get the covariance matrix
between the columns of 'o' -- except where it says (almost as an
aside) that "'var' is just another interface to 'cov'". Hence in
your example "var(o)" is equivalent to "cov(o)". Looked at in this
way, it is now straightforward to expect what you got.

This is, of course, different from what you would expect if you apply
var() to a vector, namely the variance of that series of numbers
(a single value).
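
To make the distinction concrete, here is a small sketch using the same 'o' as in the question. The names m_var, m_cov, v and v_var are only illustrative and do not come from the original post.

```r
o <- cbind(rep(1, 3), rep(2, 3), rep(3, 3))

# On a matrix, var() and cov() both return the covariance matrix
# between the columns (3 x 3 here, all zeros because each column is constant)
m_var <- var(o)
m_cov <- cov(o)
print(isTRUE(all.equal(m_var, m_cov)))

# On a plain vector, var() returns a single number
v <- c(1, 2, 3)
v_var <- var(v)
print(v_var)
```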

On the other hand, mean() works differently. According to "?mean":
  Arguments:
     x: An R object.  Currently there are methods for numeric
        data frames, numeric vectors and dates.
  [...]
  Value:
     For a data frame, a named vector with the appropriate method
     being applied column by column.

which may have been what you expected. But a matrix is not a data
frame. Instead, it is an array, which (in effect) is a vector with
an attached "dimensions" attribute which tells R how to chop it up
into columns etc. -- whereas a data frame has its "by-column"
structure built in to it.

Now: "?mean" says nothing about matrices. Nothing whatever.
So you have to find out the hard way that mean(o) treats the array
'o' as a vector, ignoring its "dimensions" attribute. Hence you
get a single number, which is the mean of all the values in the
matrix.
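
A short sketch of that behaviour, again with the 'o' from the question; 'flat' is just an illustrative name for the values with the "dimensions" attribute dropped.

```r
o <- cbind(rep(1, 3), rep(2, 3), rep(3, 3))

# The "dimensions" attribute that makes this vector a 3 x 3 matrix
print(dim(o))

# The underlying values, stored column by column, without the attribute
flat <- as.vector(o)
print(length(flat))

# mean() over the matrix is the mean over all of those values
print(mean(o) == mean(flat))
```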

In order to get what you are apparently looking for (the means of
the columns of 'o'), you could:

a) (the smooth way) use the apply() function, causing mean() to be
   applied to the second dimension (columns) of 'o':

   apply(o,2,mean)
   # [1] 1 2 3

b) (the heavy way) take a hint from "?mean" and feed it a data frame:

   mean(as.data.frame(o))
   # V1 V2 V3
   #  1  2  3 
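
Two further ways to get the column means, offered as a sketch rather than as part of the original reply. colMeans() works directly on the matrix, and sapply() applies mean() to each column of a data frame. The second avoids relying on a data-frame method for mean(), which, as far as I know, later versions of R no longer provide, so option (b) may return NA with a warning there. The names cm and sm are illustrative.

```r
o <- cbind(rep(1, 3), rep(2, 3), rep(3, 3))

# Column means computed directly from the matrix
cm <- colMeans(o)
print(cm)

# mean() applied to each column of the data frame in turn;
# the result is named after the columns (V1, V2, V3)
sm <- sapply(as.data.frame(o), mean)
print(sm)
```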

Hoping this helps to clarify things!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 22-Mar-09                                       Time: 09:01:40
------------------------------ XFMail ------------------------------

Wacek Kusnierczyk

2009-Mar-22 09:15 UTC


rkevinburton at charter.net wrote:
> At the risk of appearing ignorant why is the following true?
>
> o <- cbind(rep(1,3),rep(2,3),rep(3,3))
> var(o)
>      [,1] [,2] [,3]
> [1,]    0    0    0
> [2,]    0    0    0
> [3,]    0    0    0
>
> and
>
> mean(o)
> [1] 2
>
> How do I get mean to return an array similar to var? I would expect in the
> above example a vector of length 3 {1,2,3}.
>   
you may well be ignorant about how var works with matrices, but this
does not mean it's your fault.  the documentation is typically cryptic.

when you apply var to a single matrix, it will compute covariances
between its columns rather than the overall variance:

    set.seed(0)
    x = matrix(rnorm(4), 2, 2)
   
    var(x)
    #                [,1]     [,2]
    # [1,]  1.2629543 1.329799
    # [2,] -0.3262334 1.272429

    matrix(nrow=2, ncol=2, byrow=TRUE, c(
       cov(x[,1], x[,1]), cov(x[,1], x[,2]),
       cov(x[,2], x[,1]), cov(x[,2], x[,2])))
      
vQ

Wacek Kusnierczyk

2009-Mar-22 09:28 UTC


Wacek Kusnierczyk wrote:
>
> when you apply var to a single matrix, it will compute covariances
> between its columns rather than the overall variance:
>
>     set.seed(0)
>     x = matrix(rnorm(4), 2, 2)
>    
>     var(x)
>     #                [,1]     [,2]
>     # [1,]  1.2629543 1.329799
>     # [2,] -0.3262334 1.272429
>   
except for that i seem to have pasted wrong output.

    set.seed(0)
    x = matrix(rnorm(4), 2, 2)

    var(x)
    #           [,1]        [,2]
    # [1,] 1.2627587 0.045585801
    # [2,] 0.0455858 0.001645655

    matrix(nrow=2, ncol=2, byrow=TRUE, c(
        cov(x[,1], x[,1]), cov(x[,1], x[,2]),
        cov(x[,2], x[,1]), cov(x[,2], x[,2])))
    #           [,1]        [,2]
    # [1,] 1.2627587 0.045585801
    # [2,] 0.0455858 0.001645655
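
if what is wanted is one variance per column, or a single overall variance, rather than the covariance matrix, the following sketch shows both on the same x. the names col_var, diag_var and all_var are illustrative only.

```r
set.seed(0)
x <- matrix(rnorm(4), 2, 2)

# one variance per column, computed column by column
col_var <- apply(x, 2, var)

# the same numbers sit on the diagonal of the covariance matrix
diag_var <- diag(var(x))
print(isTRUE(all.equal(col_var, diag_var)))

# a single variance over all four values, ignoring the matrix shape
all_var <- var(as.vector(x))
print(all_var)
```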

vQ
