Hi,

I'm trying to use sapply to compute the min of several variables, each of them stored in data.frames, grouped as a list.
Is it normal that mean() and min() produce objects of different dimensions?

> str(dats)
List of 5
 $ log20:'data.frame': 83 obs. of 5 variables:
  ..$ DATE  : int [1:83] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
  ..$ logrho: num [1:83] 1.16 -1.30 -1.30 -1.30 -1.30 ...
  ..$ w2    : num [1:83] 1.01 1.27 1.24 1.31 1.28 ...
  ..$ rms   : num [1:83] 5.001 0.630 0.616 0.685 0.655 ...
  ..$ maxi  : num [1:83] 8.66 3.39 3.83 3.35 3.23 ...
 $ log30:'data.frame': 71 obs. of 5 variables:
  ..$ DATE  : int [1:71] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
  ..$ logrho: num [1:71] 1.16 -1.00 -1.00 -1.00 -1.00 ...
  ..$ w2    : num [1:71] 1.01 1.27 1.21 1.29 1.27 ...
  ..$ rms   : num [1:71] 5.001 0.851 0.802 0.877 0.864 ...
  ..$ maxi  : num [1:71] 8.66 4.57 4.62 4.27 4.47 ...
 $ log40:'data.frame': 14 obs. of 5 variables:
  ..$ DATE  : int [1:14] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
  ..$ logrho: num [1:14] 1.16 -0.50 -0.50 -0.50 -0.50 ...
  ..$ w2    : num [1:14] 1.01 1.27 1.18 1.23 1.17 ...
  ..$ rms   : num [1:14] 5.00 1.40 1.26 1.36 1.25 ...
  ..$ maxi  : num [1:14] 8.66 7.54 6.39 6.68 5.83 ...
 $ log50:'data.frame': 69 obs. of 5 variables:
  ..$ DATE  : int [1:69] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
  ..$ logrho: num [1:69] 1.16 -1.50 -1.50 -1.50 -1.50 ...
  ..$ w2    : num [1:69] 1.01 1.27 1.25 1.33 1.31 ...
  ..$ rms   : num [1:69] 5.001 0.516 0.516 0.577 0.554 ...
  ..$ maxi  : num [1:69] 8.66 2.77 3.44 2.92 3.25 ...
 $ log60:'data.frame': 66 obs. of 5 variables:
  ..$ DATE  : int [1:66] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
  ..$ logrho: num [1:66] 1.16 -2.00 -2.00 -2.00 -2.00 ...
  ..$ w2    : num [1:66] 1.01 1.27 1.27 1.34 1.34 ...
  ..$ rms   : num [1:66] 5.001 0.313 0.326 0.371 0.366 ...
  ..$ maxi  : num [1:66] 8.66 1.68 2.43 2.23 3.85 ...

> sapply(dats,mean)
               log20         log30         log40         log50         log60
DATE    2.001088e+09  2.001087e+09  2.001082e+09  2.001087e+09  2.001086e+09
logrho -1.270326e+00 -9.695748e-01 -3.816967e-01 -1.461383e+00 -1.951941e+00
w2      1.324907e+00  1.283293e+00  1.217808e+00  1.345297e+00  1.398435e+00
rms     7.752963e-01  9.623266e-01  1.636949e+00  6.788993e-01  4.810029e-01
maxi    4.016900e+00  4.466865e+00  6.205573e+00  3.672405e+00  2.898055e+00

> sapply(dats,min)
     log20      log30      log40      log50      log60
-1.3000708 -1.0000960 -0.5000176 -1.5001134 -2.0001350

Thanks for your help,
Víctor.

--
-----------------------------------------------------------------------
Víctor Homar Santaner
Grup de Meteorologia
Edif. Mateu Orfila                 Tel: +34 971 17 1376
Universitat de les Illes Balears   Fax: +34 971 17 3426
07122 Palma de Mallorca (SPAIN)    Email: Victor.Homar at uib.es

Knowledge is contagious. Infect truth.
-----------------------------------------------------------------------

On Wed, Jun 25, 2008 at 6:00 PM, Victor Homar <victor.homar@uib.cat> wrote:
> Hi, I'm trying to use sapply to compute the min of several variables, each
> of them stored in data.frames, grouped as a list:
> Is it normal that mean() and min() produce different objects dimensions?

It is exactly as documented in ?mean and ?min.

mean is a generic function which has a separate method for data frames:

> mean.data.frame
function (x, ...)
sapply(x, mean, ...)

min has no separate method for data frames (so you'll get just one number), but you can define one easily:

min.data.frame <- function(...) sapply(..., min)

k

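To make the "just one number" point concrete, here is a minimal sketch on a toy data frame. The object `df` and its values are made up for illustration and are not the poster's data; the sketch only contrasts calling min() on the whole data frame with applying it column by column.

```r
# Toy data frame standing in for one element of `dats`
# (hypothetical values, chosen only for illustration).
df <- data.frame(logrho = c(1.16, -1.30, -1.30),
                 w2     = c(1.01, 1.27, 1.24),
                 rms    = c(5.001, 0.630, 0.616))

# min() on the whole data frame returns a single number:
# the smallest value found in any of the columns.
overall_min <- min(df)

# Applying min() to each column returns one value per variable,
# as a named numeric vector.
col_mins <- sapply(df, min)

print(overall_min)
print(col_mins)
```

This would explain why sapply(dats, min) in the original post returns one value per list element: each data frame collapses to its overall minimum, which there happens to come from the logrho column.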
The answer to your question is ``yeah, sort of''.

The reason for the difference is that mean() is generic and has a method for data frames, according to which the mean of each column of the data frame is found in some ``appropriate'' manner. (Essentially the columns of the data frame must be either numeric or have some sort of date persuasion, else you get a warning and an NA for the column in question.)

The function min() is not generic, and so if you hit a data frame with min() it (apparently) treats that data frame as if it were an atomic vector of data and finds the minimum of that atomic vector. Given, of course, that doing so makes sense.

It would seem that you want min() to mimic the behaviour of mean(). To achieve this you can, in this instance at least (I think!), simply do

sapply(dats,function(x){sapply(x,min)})

HTH.

cheers,

Rolf Turner

On 26/06/2008, at 3:00 AM, Victor Homar wrote:
> Hi, I'm trying to use sapply to compute the min of several variables, each
> of them stored in data.frames, grouped as a list:
> Is it normal that mean() and min() produce different objects dimensions?

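The nested sapply suggested above can be tried on a small stand-in list. This is only a sketch: `dats_toy`, `col_min` and the numbers in them are hypothetical, not the poster's `dats`. It assumes every data frame in the list has the same columns, so the inner results all have the same length and the outer sapply can simplify them to a matrix.

```r
# Hypothetical stand-in for `dats`: a named list of data frames
# that share the same columns but differ in number of rows.
dats_toy <- list(
  log20 = data.frame(logrho = c(1.16, -1.30, -1.30),
                     w2     = c(1.01, 1.27, 1.24)),
  log30 = data.frame(logrho = c(1.16, -1.00),
                     w2     = c(1.01, 1.21))
)

# Inner step: the minimum of each column of one data frame.
col_min <- function(d) sapply(d, min)

# Outer step: repeat that for every data frame in the list.
# The equal-length results are bound together as a matrix with
# variables in rows and list elements in columns.
mins <- sapply(dats_toy, col_min)

print(mins)
```

The result has the same layout as the sapply(dats, mean) output in the original post. Spelling out the inner sapply also avoids depending on a data-frame method of the summary function, which may matter because, as far as I know, later versions of R no longer ship mean.data.frame.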