At the risk of appearing ignorant, why is the following true?

    o <- cbind(rep(1,3),rep(2,3),rep(3,3))
    var(o)
         [,1] [,2] [,3]
    [1,]    0    0    0
    [2,]    0    0    0
    [3,]    0    0    0

and

    mean(o)
    [1] 2

How do I get mean to return an array similar to var? I would expect in the above example a vector of length 3 {1,2,3}.

Thank you for your help.

Kevin
On 22-Mar-09 08:17:29, rkevinburton at charter.net wrote:
> At the risk of appearing ignorant why is the folowing true?
>
> o <- cbind(rep(1,3),rep(2,3),rep(3,3))
> var(o)
>      [,1] [,2] [,3]
> [1,]    0    0    0
> [2,]    0    0    0
> [3,]    0    0    0
>
> and
>
> mean(o)
> [1] 2
>
> How do I get mean to return an array similar to var? I would expect in
> the above example a vector of length 3 {1,2,3}.
>
> Thank you for your help.
> Kevin

This is a consequence of (understandable) confusion about how var() and mean() operate!

It is not explicit, in "?var", that if you apply var() to a matrix, as in your "var(o)", you get the covariance matrix between the columns of 'o' -- except where it says (almost as an aside) that "'var' is just another interface to 'cov'". Hence in your example "var(o)" is equivalent to "cov(o)". Looked at in this way, what you got is what one should expect: each column of 'o' is constant, so every variance and covariance is zero.

This is, of course, different from what you would expect if you apply var() to a vector, namely the variance of that series of numbers (a single value).

On the other hand, mean() works differently. According to "?mean":

    Arguments:
           x: An R object.  Currently there are methods for numeric data
              frames, numeric vectors and dates. [...]
    Value:
         For a data frame, a named vector with the appropriate method being
         applied column by column.

which may have been what you expected. But a matrix is not a data frame. Instead, it is an array, which (in effect) is a vector with an attached "dimensions" attribute which tells R how to chop it up into columns etc. -- whereas a data frame has its "by-column" structure built in to it.

Now: "?mean" says nothing about matrices. Nothing whatever. So you have to find out the hard way that mean(o) treats the array 'o' as a vector, ignoring its "dimensions" attribute. Hence you get a single number, which is the mean of all the values in the matrix.
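The two behaviours just described can be checked directly. The following is only a small sketch using the same 'o' as in the question and base R functions (cov(), as.vector(), all.equal()); the variable names are made up for illustration.

```r
# Same matrix as in the question: three constant columns.
o <- cbind(rep(1, 3), rep(2, 3), rep(3, 3))

# var() on a matrix gives the covariance matrix of its columns, as cov() does.
var_equals_cov <- isTRUE(all.equal(var(o), cov(o)))

# mean() ignores the "dim" attribute and averages all nine values.
mean_all  <- mean(o)
mean_flat <- mean(as.vector(o))

print(var_equals_cov)  # TRUE
print(mean_all)        # 2
print(mean_flat)       # 2
```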
In order to get what you are apparently looking for (the means of the columns of 'o'), you could:

a) (the smooth way) use the apply() function, causing mean() to be applied to the second dimension (columns) of 'o':

    apply(o,2,mean)
    # [1] 1 2 3

b) (the heavy way) take a hint from "?mean" and feed it a data frame:

    mean(as.data.frame(o))
    # V1 V2 V3
    #  1  2  3

Hoping this helps to clarify things!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 22-Mar-09                                       Time: 09:01:40
------------------------------ XFMail ------------------------------
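A note for anyone trying options (a) and (b) on a current R installation, offered as a sketch rather than as part of the reply above: base R also has colMeans() for exactly this task. As far as I know, mean() no longer has a data-frame method from R 3.0.0 onwards, so the data-frame route is written here with sapply() instead. The variable names are illustrative only.

```r
o <- cbind(rep(1, 3), rep(2, 3), rep(3, 3))

# Option (a): apply mean() over the second dimension (columns).
m_apply <- apply(o, 2, mean)

# colMeans() is the dedicated base R function for column means.
m_col <- colMeans(o)

# Data-frame route: call mean() on each column via sapply().
# The result is a named vector (V1, V2, V3), as in option (b).
m_df <- sapply(as.data.frame(o), mean)

print(m_apply)
print(m_col)
print(m_df)
```

All three give the column means 1, 2, 3; only the sapply() version carries the column names.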
rkevinburton at charter.net wrote:
> At the risk of appearing ignorant why is the folowing true?
>
> o <- cbind(rep(1,3),rep(2,3),rep(3,3))
> var(o)
>      [,1] [,2] [,3]
> [1,]    0    0    0
> [2,]    0    0    0
> [3,]    0    0    0
>
> and
>
> mean(o)
> [1] 2
>
> How do I get mean to return an array similar to var? I would expect in the above example a vector of length 3 {1,2,3}.
>

you may well be ignorant about how var works with matrices, but this does not mean it's your fault. the documentation is typically cryptic.

when you apply var to a single matrix, it will compute covariances between its columns rather than the overall variance:

    set.seed(0)
    x = matrix(rnorm(4), 2, 2)

    var(x)
    #            [,1]     [,2]
    # [1,]  1.2629543 1.329799
    # [2,] -0.3262334 1.272429

    matrix(nrow=2, ncol=2, byrow=TRUE, c(
        cov(x[,1], x[,1]), cov(x[,1], x[,2]),
        cov(x[,2], x[,1]), cov(x[,2], x[,2])))

vQ
Wacek Kusnierczyk wrote:
>
> when you apply var to a single matrix, it will compute covariances
> between its columns rather than the overall variance:
>
> set.seed(0)
> x = matrix(rnorm(4), 2, 2)
>
> var(x)
> #            [,1]     [,2]
> # [1,]  1.2629543 1.329799
> # [2,] -0.3262334 1.272429
>

except that i seem to have pasted the wrong output.

    set.seed(0)
    x = matrix(rnorm(4), 2, 2)

    var(x)
    #           [,1]        [,2]
    # [1,] 1.2627587 0.045585801
    # [2,] 0.0455858 0.001645655

    matrix(nrow=2, ncol=2, byrow=TRUE, c(
        cov(x[,1], x[,1]), cov(x[,1], x[,2]),
        cov(x[,2], x[,1]), cov(x[,2], x[,2])))
    #           [,1]        [,2]
    # [1,] 1.2627587 0.045585801
    # [2,] 0.0455858 0.001645655

vQ
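If what is wanted is one variance per column (the analogue of the column means asked about at the start), or the single overall variance, both can be obtained with base R. This is only a sketch on the same x; the names v_diag, v_cols and v_all are made up for the example.

```r
set.seed(0)
x <- matrix(rnorm(4), 2, 2)

# The diagonal of the covariance matrix holds the per-column variances.
v_diag <- diag(var(x))

# The same values, computed by calling var() on each column separately.
v_cols <- apply(x, 2, var)

# Overall variance of all four values, ignoring the matrix shape.
v_all <- var(as.vector(x))

# The two per-column results should agree.
stopifnot(isTRUE(all.equal(v_diag, v_cols)))

print(v_cols)
print(v_all)
```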