Hi,

When I was computing the variance of a simple vector, I found an unexpected result. I am not sure whether it is a bug.

    > var(c(1,2,3))
    [1] 1      # which should be 2/3
    > var(c(1,2,3,4,5))
    [1] 2.5    # which should be 10/5 = 2

It seems to me that the program uses (sample size - 1) instead of the sample size in the denominator. How can I rectify this?

Regards,
Tianhua

On 05.12.2005 09:53, Wang Tian Hua wrote:
> it seems to me that the program uses (sample size -1) instead of sample
> size at the denominator. how can i rectify this?

These results are expected, so it is not a bug. From the Details section of ?var:

    The denominator n - 1 is used which gives an unbiased estimator of the
    (co)variance for i.i.d. observations. (.....)

If you really want the biased variance, a workaround is:

    biasedVar <- function(x, ...){
      n <- length(x)
      (n-1) / n * var(x, ...)
    }

Romain

--
Romain FRANCOIS - http://francoisromain.free.fr
Doctorant INRIA Futurs / EDF
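
As a quick check of the workaround above, here is a minimal sketch that reproduces the numbers from the original question. The name popVar and its na.rm argument are illustrative assumptions, not part of base R. It drops missing values before averaging, which matters because rescaling by (n-1)/n with n <- length(x) would still count the NAs if na.rm = TRUE were passed through to var().

```r
# Hypothetical helper (not in base R): variance with denominator n.
popVar <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]   # drop NAs first so the count matches the data used
  mean((x - mean(x))^2)          # same value as (n - 1) / n * var(x)
}

popVar(c(1, 2, 3))         # 2/3, whereas var(c(1, 2, 3)) is 1
popVar(c(1, 2, 3, 4, 5))   # 2,   whereas var(c(1, 2, 3, 4, 5)) is 2.5
```

Writing the mean of squared deviations directly avoids the rescaling step altogether; whether that or the (n-1)/n form reads better is a matter of taste.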

Wang Tian Hua wrote:
> when i was computing the variance of a simple vector, i found unexpected
> result. not sure whether it is a bug.

Not a bug! From ?var:
"The denominator n - 1 is used which gives an unbiased estimator of the (co)variance for i.i.d. observations."

> it seems to me that the program uses (sample size -1) instead of sample
> size at the denominator. how can i rectify this?

Simply change it by:

    x <- c(1,2,3,4,5)
    n <- length(x)
    var(x)*(n-1)/n

if you really want it.

Uwe Ligges

Just redefine the variance as

    sum((x-mean(x))^2)/length(x)

or, more straightforwardly, use

    var(x)*(1-1/length(x))

As you already mentioned, var(x) is currently defined by

    sum((x-mean(x))^2)/(length(x)-1)

which is an *unbiased* estimator of the population variance (written COV below), while

    sum((x-mean(x))^2)/length(x)

is a *biased* estimator, with bias = -(1/n) COV.

Proof. Denote the population mean by M and the sample size by n. For i.i.d. observations X1, ..., Xn:

    E[sum (Xj-mean(X))^2] = E[sum Xj^2 - n mean(X)^2]
                          = sum E[Xj^2] - n E[mean(X)^2]
                          = sum (COV + M^2) - n (1/n COV + M^2)
                          = (n-1) COV

So dividing by n-1 gives expectation COV, while dividing by n gives (n-1)/n COV, which is COV - (1/n) COV.

Best regards,
Kristel

--
Kristel Joossens, Ph.D. Student
Research Center ORSTAT, K.U. Leuven
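
The bias result in the proof above can also be checked numerically. The following is only a rough simulation sketch: the seed, the sample size n = 5, the number of replications and the choice of a standard normal population (true variance 1) are all arbitrary assumptions made for illustration.

```r
set.seed(1)      # arbitrary seed, only so the run is reproducible
n    <- 5        # sample size
reps <- 20000    # number of simulated samples

# Each column is one sample of size n from N(0, 1), so the true variance is 1.
samples <- matrix(rnorm(n * reps), nrow = n)

unbiased <- apply(samples, 2, var)    # denominator n - 1, as in var()
biased   <- unbiased * (n - 1) / n    # denominator n

mean(unbiased)   # should come out close to 1
mean(biased)     # should come out close to (n - 1) / n = 0.8
```

The two averages differ by roughly 1/n of the true variance, in line with the bias of -(1/n) COV derived above.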