Dear friends, regression towards the mean is interesting in medical circles, and a very recent paper (The American Statistician November 2007;61:302-307 by Krause and Pinheiro) treats it at length. An initial example specifies (p 303): "Consider the following example: we draw 100 samples from a bivariate Normal distribution with X0~N(0,1), X1~N(0,1) and cov(X0,X1)=0.7, We then calculate the p value for the null hypothesis that the means of X0 and X1 are equal, using a paired Student's t test. The procedure is repeated 1000 times, producing 1000 simulated p values. Because X0 and X1 have identical marginal distributions, the simulated p values behave like independent Uniform(0,1) random variables." This I did not understand, and simulating like shown below produced far from uniform (0,1) p values - but I fail to see how it is wrong. I contacted the authors of the paper but they did not answer. So, please, doesn?t the code below specify a bivariate N(0,1) with covariance 0.7? I get p values = 1 all over - not interesting, but how wrong? Best wishes Troels library(MASS) Sigma <- matrix(c(1,0.7,0.7,1),2,2) Sigma res <- NULL for (i in 1:1000){ ff <-(mvrnorm(n=100, rep(0, 2), Sigma, empirical = TRUE)) res[i] <- t.test(ff[,1],ff[,2],paired=TRUE)$p.value} -- Troels Ring - - Department of nephrology - - Aalborg Hospital 9100 Aalborg, Denmark - - +45 99326629 - - tring at gvdnet.dk
Peter Dalgaard
2007-Dec-17 18:31 UTC
[R] regression towards the mean, AS paper November 2007
Troels Ring wrote:> Dear friends, regression towards the mean is interesting in medical > circles, and a very recent paper (The American Statistician November > 2007;61:302-307 by Krause and Pinheiro) treats it at length. An initial > example specifies (p 303): > "Consider the following example: we draw 100 samples from a bivariate > Normal distribution with X0~N(0,1), X1~N(0,1) and cov(X0,X1)=0.7, We > then calculate the p value for the null hypothesis that the means of X0 > and X1 are equal, using a paired Student's t test. The procedure is > repeated 1000 times, producing 1000 simulated p values. Because X0 and > X1 have identical marginal distributions, the simulated p values behave > like independent Uniform(0,1) random variables." This I did not > understand, and simulating like shown below produced far from uniform > (0,1) p values - but I fail to see how it is wrong. I contacted the > authors of the paper but they did not answer. So, please, doesn?t the > code below specify a bivariate N(0,1) with covariance 0.7? I get p > values = 1 all over - not interesting, but how wrong? > Best wishes > Troels > > library(MASS) > Sigma <- matrix(c(1,0.7,0.7,1),2,2) > Sigma > res <- NULL > for (i in 1:1000){ > ff <-(mvrnorm(n=100, rep(0, 2), Sigma, empirical = TRUE)) > res[i] <- t.test(ff[,1],ff[,2],paired=TRUE)$p.value} > >You do not want empirical=TRUE in the mvrnorm call. This pegs the empirical means to exactly (0,0) which are obviously never significantly different. BTW, there's a function called replicate()... -- O__ ---- Peter Dalgaard ?ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
Duncan Murdoch
2007-Dec-17 18:32 UTC
[R] regression towards the mean, AS paper November 2007
On 12/17/2007 1:21 PM, Troels Ring wrote:> Dear friends, regression towards the mean is interesting in medical > circles, and a very recent paper (The American Statistician November > 2007;61:302-307 by Krause and Pinheiro) treats it at length. An initial > example specifies (p 303): > "Consider the following example: we draw 100 samples from a bivariate > Normal distribution with X0~N(0,1), X1~N(0,1) and cov(X0,X1)=0.7, We > then calculate the p value for the null hypothesis that the means of X0 > and X1 are equal, using a paired Student's t test. The procedure is > repeated 1000 times, producing 1000 simulated p values. Because X0 and > X1 have identical marginal distributions, the simulated p values behave > like independent Uniform(0,1) random variables." This I did not > understand, and simulating like shown below produced far from uniform > (0,1) p values - but I fail to see how it is wrong. I contacted the > authors of the paper but they did not answer. So, please, doesn?t the > code below specify a bivariate N(0,1) with covariance 0.7? I get p > values = 1 all over - not interesting, but how wrong? > Best wishes > Troels > > library(MASS) > Sigma <- matrix(c(1,0.7,0.7,1),2,2) > Sigma > res <- NULL > for (i in 1:1000){ > ff <-(mvrnorm(n=100, rep(0, 2), Sigma, empirical = TRUE)) > res[i] <- t.test(ff[,1],ff[,2],paired=TRUE)$p.value}Specifying empirical=TRUE means that your sampled values are not independent, the means are guaranteed to match exactly, and the mean difference is exactly zero. Thus all of the t statistics are exactly zero, and the p-values are exactly 1. Set empirical=FALSE (the default), and you'll see more reasonable results. Duncan Murdoch