On Fri, 2007-01-19 at 11:36 -0500, bogdan romocea wrote:
> Hello, I don't understand the behavior of apply() on the data frame
> below.
>
> test <-
> structure(list(Date = structure(c(13361, 13361, 13361, 13361,
> 13361, 13361, 13361, 13361, 13362, 13362, 13362, 13362, 13362,
> 13362, 13362, 13362, 13363, 13363, 13363, 13363, 13363, 13363,
> 13363, 13363, 13364, 13364, 13364, 13364, 13364, 13364, 13364,
> 13364, 13365, 13365, 13365, 13365, 13365, 13365, 13365, 13365,
> 13366, 13366, 13366, 13366, 13366, 13366, 13366, 13366, 13367,
> 13367), class = "Date"), RANK = as.integer(c(19, 7, 5, 4, 6,
> 3, 3, 4, 18, 7, 6, 4, 6, 3, 3, 4, 19, 7, 6, 4, 6, 3, 3, 4, 18,
> 6, 7, 4, 6, 3, 3, 4, 18, 6, 7, 4, 6, 3, 3, 4, 18, 6, 7, 4, 6,
> 3, 3, 4, 18, 6))), .Names = c("Date", "RANK"),
> row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
> "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21",
> "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32",
> "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43",
> "44", "45", "46", "47", "48", "49", "50"), class = "data.frame")
>
> #---fine
> > summary(test)
> Date RANK
> Min. :2006-08-01 Min. : 3.00
> 1st Qu.:2006-08-02 1st Qu.: 4.00
> Median :2006-08-04 Median : 5.50
> Mean :2006-08-03 Mean : 6.62
> 3rd Qu.:2006-08-05 3rd Qu.: 6.75
> Max. :2006-08-07 Max. :19.00
>
> #---isn't this supposed to work?
> > apply(test,2,mean)
> Date RANK
> NA NA
> Warning messages:
> 1: argument is not numeric or logical: returning NA in:
> mean.default(newX[, i], ...)
> 2: argument is not numeric or logical: returning NA in:
> mean.default(newX[, i], ...)

Look at ?apply, in particular the Details section. Argument X of
apply() is supposed to be an array. Details says:

    If 'X' is not an array but has a dimension attribute, 'apply'
    attempts to coerce it to an array via 'as.matrix' if it is
    two-dimensional (e.g., data frames) or via 'as.array'.

So you should look at what as.matrix() does to your data frame:
str(as.matrix(test))
chr [1:50, 1:2] "2006-08-01" "2006-08-01" "2006-08-01" ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:50] "1" "2" "3" "4" ...
..$ : chr [1:2] "Date" "RANK"
Notice this is now a character matrix, not the mix of Date and integer
columns you started with. ?as.matrix explains why:
'as.matrix' is a generic function. The method for data frames will
convert any non-numeric/complex column into a character vector
using 'format' and so return a character matrix, except that
all-logical data frames will be coerced to a logical matrix. When
coercing a vector, it produces a one-column matrix, and promotes
the names (if any) of the vector to the rownames of the matrix.
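That explains what is happening: apply() hands mean() the columns of a
character matrix, and mean() returns NA with a warning for anything
that is not numeric or logical.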
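
A small sketch of the coercion rules quoted above, in case it helps. The three data frames ('mixed', 'nums', 'lgl') are made up for illustration and are not the poster's data; only as.matrix(), typeof() and apply() from base R are used.

```r
# Hypothetical toy data frames, one per case described in ?as.matrix
mixed <- data.frame(d = as.Date("2006-08-01") + 0:2, n = 1:3)
nums  <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
lgl   <- data.frame(p = c(TRUE, FALSE, TRUE), q = c(FALSE, FALSE, TRUE))

# A non-numeric column (here a Date) turns the whole matrix into character
type_mixed <- typeof(as.matrix(mixed))

# All-numeric and all-logical data frames keep a usable storage type
type_nums <- typeof(as.matrix(nums))
type_lgl  <- typeof(as.matrix(lgl))

# So apply() is safe on the all-numeric data frame
col_means <- apply(nums, 2, mean)

print(c(mixed = type_mixed, nums = type_nums, lgl = type_lgl))
print(col_means)
```

So apply(x, 2, mean) should only be expected to work when every column of x is numeric (or logical).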

Workaround: treat the data frame as a list of columns, so that no
coercion to a matrix takes place:

lapply(test, mean)
sapply(test, mean)

Both work.

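
To illustrate the difference between the two on a small stand-in for 'test' (the three-row data frame 'df' below is invented for this sketch, it is not the original data): lapply() returns a list and keeps each column's class, whereas sapply() simplifies to a plain numeric vector, so the Date mean comes back as a number of days since 1970-01-01.

```r
# Hypothetical three-row stand-in for the 'test' data frame
df <- data.frame(Date = as.Date(c("2006-08-01", "2006-08-02", "2006-08-03")),
                 RANK = c(19L, 7L, 4L))

# List result: each element keeps the class returned by mean()
res_list <- lapply(df, mean)

# Simplified result: a named numeric vector, the Date class is dropped
res_vec <- sapply(df, mean)

print(res_list$Date)   # a Date
print(res_list$RANK)   # a plain number
print(res_vec)         # both means as numbers

# If only the numeric columns matter, select them first
num_cols  <- sapply(df, is.numeric)
num_means <- colMeans(df[, num_cols, drop = FALSE])
print(num_means)
```

If the dates should stay dates, lapply() is probably the better choice of the two.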
HTH,
G
> Thank you,
> b.
>
> platform i386-pc-mingw32
> arch i386
> os mingw32
> system i386, mingw32
> status
> major 2
> minor 4.0
> year 2006
> month 10
> day 03
> svn rev 39566
> language R
> version.string R version 2.4.0 (2006-10-03)
>
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%