Hello,

the following functions do not work correctly with my data: median(), geo.mean().

My data files contain more than 10,000 lines and six columns from a flow cytometer measurement. I need the arithmetic mean, the geometric mean, and the median. To calculate the geometric mean I wrote the following function:

    fix(geo.mean)

    function(x)
    {
        n <- length(x)
        gm <- prod(x)^(1/n)
        return(gm)
    }

The function median() gives the error "need numeric data", but the data are numeric. The function geo.mean() returns "[1] NaN". What are the reasons, and what are the solutions?

I am a newbie and urgently need information.
Thanks.

Here is a short output with the results:

    9997  385.42  68.54   9.82  124.09  23.93  138.24
    9998  342.89  73.65 133.35 1134.19 345.99 1876.88
    9999  316.23  76.35  48.26  421.70 129.80  873.79
    10000 291.64 103.66   6.85  107.46  26.42  189.38
    10001   0.00   0.00   0.00    0.00   0.00    0.00
    > mean(data)
          FSC       SSC       FL1       FL2      FL32       FL4
    375.94880  73.76219  50.73413 434.42837 110.06393 637.34980
    > geo.mean(data)
    [1] NaN
    > median(data)
    Error in median(data) : need numeric data
    >
Kjetil Brinchmann Halvorsen
2005-Feb-19 12:49 UTC
[R] Warnings by functions mean(), median()
mailpuls at gmx.net wrote:

> For the calculation of the geometric mean i wrote
> following function:
>
> fix(geo.mean)
>
> function(x)
> {
> n<-length(x)
> gm<-prod(x)^(1/n)

This is probably what gives the NaN below. exp(mean(log(x))) would be more to the point.

Kjetil

> return(gm)
> }
>
> [...]
>
> > geo.mean(data)
> [1] NaN
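The difference between the two forms can be seen in a small sketch. The vector `x` below is a hypothetical stand-in for one measurement column (it is not the poster's data), chosen only so that the product is too large for double precision.

```r
# Hypothetical stand-in for one column: 10000 positive values between 50 and 150.
set.seed(1)
x <- runif(10000, min = 50, max = 150)

# Form from the original post: the product of all values overflows to Inf,
# and Inf raised to a small positive power is still Inf.
naive <- prod(x)^(1 / length(x))

# Log form suggested above: average the logarithms, then exponentiate.
stable <- exp(mean(log(x)))

print(naive)
print(stable)
```

Under these assumptions the product form returns Inf, while the log form returns a finite value inside the range of the data. The two forms are mathematically identical; only the intermediate magnitudes differ.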
On Sat, 19 Feb 2005 mailpuls at gmx.net wrote:

> following functions doesnt work correct with my data: median(), geo.mean().
>
> My datafiles contain more than 10.000 lines and six columns from a
> flow-cytometer-measurment. I need the arithmetic and geometric mean and
> median. For the calculation of the geometric mean i wrote following function:
>
> fix(geo.mean)
>
> function(x)
> {
> n<-length(x)
> gm<-prod(x)^(1/n)
> return(gm)
> }
>
> The function median() error tells me "need numeric data". The data are
> numeric. The function geo.mean() gave out "[1] NaN". What are the reasons and
> what are the solutions?
>
> I'am a newbie and need urgently information.

0) `data' is a bad choice of name as it masks an R system function.

1) `data' appears to be a data frame, not numeric data, as median says. Do you want a summary for each column or for the whole table? You need sapply(data, median) for the former or median(as.matrix(data)) for the latter.

2) Your function is trying to take a fractional power of 0, and what do you think that is? (0) However, it is liable to under/overflow (10000 numbers of size 100 have product 10^20000, way more than IEC 60559 arithmetic can represent, so you have (Inf*0)^(1/10001) = NaN). You want something like

    geo.mean <- function(x)
    {
        if(any(x < 0)) stop("need positive data")
        exp(mean(log(x)))
    }

which will even work for a data frame. But I can tell you the answer is 0 for the data you show.
For more information, see `An Introduction to R' or a good book on data manipulation with S/R, plus Numerical Analysis 101.

> Here is an short output with the results:
>
> 9997 385.42 68.54 9.82 124.09 23.93 138.24
> 9998 342.89 73.65 133.35 1134.19 345.99 1876.88
> 9999 316.23 76.35 48.26 421.70 129.80 873.79
> 10000 291.64 103.66 6.85 107.46 26.42 189.38
> 10001 0.00 0.00 0.00 0.00 0.00 0.00
> > mean(data)
> FSC SSC FL1 FL2 FL32 FL4
> 375.94880 73.76219 50.73413 434.42837 110.06393 637.34980
> > geo.mean(data)
> [1] NaN
> > median(data)
> Error in median(data) : need numeric data

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
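The per-column versus whole-table distinction, and the effect of the all-zero last row, can be sketched on a miniature table. The data frame `cyto` is hypothetical: it reuses a few FSC and SSC values from the output shown in the thread, and its name was picked only to avoid masking data(). Dropping zeros before taking the geometric mean is an assumption about what the zero row means, not something the thread confirms.

```r
# Hypothetical miniature of the posted table: two channels, last row all zeros.
cyto <- data.frame(FSC = c(385.42, 342.89, 316.23, 291.64, 0),
                   SSC = c(68.54, 73.65, 76.35, 103.66, 0))

# Geometric mean via logs, as suggested in the reply above.
geo.mean <- function(x) {
  if (any(x < 0)) stop("need positive data")
  exp(mean(log(x)))
}

col_medians  <- sapply(cyto, median)       # one median per column
table_median <- median(as.matrix(cyto))    # one median over all cells

# With the zero row, log(0) is -Inf, so each column's geometric mean is 0.
col_geo_means <- sapply(cyto, geo.mean)

# Assumption: treat zeros as non-measurements and drop them first.
col_geo_pos <- sapply(cyto, function(x) geo.mean(x[x > 0]))

print(col_medians)
print(table_median)
print(col_geo_means)
print(col_geo_pos)
```

Applying the function column by column with sapply() keeps the sketch independent of how mean() treats a whole data frame in a given R version.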