Christoph Jäckel
2011-May-26 21:22 UTC
[R] Different behavior of median and mean function - Why?
Hi all,

Below is a small example whose outcome I do not understand: the median function works fine on a data.frame without negative numbers, but doesn't work on a data.frame with one negative number. I'm sure there is a reasonable explanation for that, or rather that I'm doing something wrong, and that someone can show me how to solve it. I tried googling it, but couldn't find a solution:

> #Set up data frame
> df <- data.frame(V1=c(1,2,3,4),V2=c(2,3,4,5))
> #Both work fine
> mean(df)
 V1  V2
2.5 3.5
> median(df)
[1] 2.5 3.5
>
> #Now, I just make one number negative in the data.frame
> df <- data.frame(V1=c(1,2,3,-4),V2=c(2,3,4,5))
> mean(df) #Works fine
 V1  V2
0.5 3.5
> median(df) #Why do I get that error?
[1]  NA 0.5
Warning message:
In mean.default(X[[1L]], ...) :
  argument is not numeric or logical: returning NA
> #It works fine on both columns separately
> median(df$V1)
[1] 1.5
> median(df$V2)
[1] 3.5
>
> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] C

attached base packages:
[1] graphics  grDevices datasets  stats     utils     methods   base

other attached packages:
[1] urca_1.2-5      zoo_1.6-5       svSocket_0.9-51

loaded via a namespace (and not attached):
[1] grid_2.13.0     lattice_0.19-23 svMisc_0.9-61   tcltk_2.13.0    tools_2.13.0

Thanks
Christoph

--
Christoph Jäckel (Dipl.-Kfm.)
Research Assistant
Chair for Financial Management and Capital Markets | Lehrstuhl für Finanzmanagement und Kapitalmärkte
TUM School of Management | Technische Universität München
Arcisstr. 21 | D-80333 München | Germany
Mitchell Maltenfort
2011-May-26 21:31 UTC
summary(df) will also work.
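A minimal sketch of the summary() route, using the data frame from the original post. The variable name `meds` and the step that extracts only the medians are illustrative additions, not something proposed in the thread.

```r
# Data frame from the original post, with one negative value in V1
df <- data.frame(V1 = c(1, 2, 3, -4), V2 = c(2, 3, 4, 5))

# Prints a per-column table that includes Min., Median, Mean and Max.
print(summary(df))

# Pull only the medians out of the per-column summaries
# (meds is a hypothetical name used for illustration)
meds <- sapply(df, function(col) summary(col)[["Median"]])
print(meds)
```

summary() is convenient for a quick look at the data; when only the medians are needed, calling median() on each column is more direct.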
Sarah Goslee
2011-May-26 21:33 UTC
Christoph,

After a quick look at the code for median, I'm amazed that it gives a correct result for any data frame. median() isn't really intended for use with data frames; there's no data.frame method. The correct and safe approach is to use

sapply(df, median)

This was recently discussed on the R-devel list:
r.789695.n4.nabble.com/median-and-data-frames-td3478921.html

Sarah

--
Sarah Goslee
functionaldiversity.org
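The column-wise approach can be tried on the data frame from the original post. This is only a sketch: the vapply() variant and the names `col_medians`, `col_medians_strict`, `is_num` and `num_medians` are illustrative additions, not part of the thread.

```r
# Data frame from the original post, with one negative value in V1
df <- data.frame(V1 = c(1, 2, 3, -4), V2 = c(2, 3, 4, 5))

# Apply median() to each column; the result is a named numeric vector
col_medians <- sapply(df, median)
print(col_medians)

# vapply() does the same but insists on exactly one numeric value per column
col_medians_strict <- vapply(df, median, numeric(1))

# If the data frame may hold non-numeric columns, restrict to the numeric ones
# (is_num is a hypothetical helper name used for illustration)
is_num <- vapply(df, is.numeric, logical(1))
num_medians <- vapply(df[is_num], median, numeric(1))
```

Because each column is passed to median() as a plain vector, the result no longer depends on how median() happens to treat a whole data frame.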
Joshua Wiley
2011-May-26 21:36 UTC
Hi Christoph,

Use:

sapply(df, median)

median() does not have a method for data frames (read: it was never meant to be used directly on data frames, so do not expect sensible results).

Cheers,
Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
joshuawiley.com
Marc Schwartz
2011-May-26 21:37 UTC
This was actually just discussed late last month. See the thread here:
stat.ethz.ch/pipermail/r-devel/2011-April/060731.html

The bottom line is that median does not have a 'method' for data frames, whereas mean does.

HTH,
Marc Schwartz
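To illustrate what having a 'method' for data frames means, here is a rough sketch of S3 dispatch. The function `median.data.frame` below is hypothetical: it is not part of base R and is defined here only to show how a generic picks a method by class. It assumes every column is numeric.

```r
# Hypothetical S3 method, NOT part of base R: gives median() a data frame
# method that computes column-wise medians
median.data.frame <- function(x, na.rm = FALSE, ...) {
  # Each column is a plain vector, so median() dispatches to median.default
  sapply(x, median, na.rm = na.rm)
}

df <- data.frame(V1 = c(1, 2, 3, -4), V2 = c(2, 3, 4, 5))

# median() now dispatches on class "data.frame" to the method defined above
res <- median(df)
print(res)
```

Defining such a method in your own session changes what median(df) does for all later code in that session, so calling sapply(df, median) explicitly is usually the less surprising choice. As far as I know, later R releases also dropped the data frame method for mean, so colMeans(df) or sapply(df, mean) is the safer way to get column means.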