Pavlos Pavlidis
2013-Jan-09 18:17 UTC
[R] [solved] t-test behavior given that the null hypothesis is true
Hi Ted,
yes, this was the problem. Thank you very much.

best
idaios

On Wed, Jan 9, 2013 at 4:51 PM, Ted Harding <Ted.Harding@wlandres.net> wrote:

> Ah! You have assigned a parameter "equal.var=TRUE", and "equal.var"
> is not a listed parameter for t.test() -- see ?t.test :
>
>   t.test(x, y = NULL,
>          alternative = c("two.sided", "less", "greater"),
>          mu = 0, paired = FALSE, var.equal = FALSE,
>          conf.level = 0.95, ...)
>
> Try it instead with "var.equal=TRUE", i.e. in your code:
>
>   for(i in 1:k){
>     rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c],
>     ##equal.var=TRUE, alternative="two.sided")$p.value
>     var.equal=TRUE, alternative="two.sided")$p.value
>   }
>
> When I run your code with "equal.var", I indeed repeatedly see
> the deficient bin for the lowest P-values that you observed.
> When I run your code with "var.equal" I do not see it.
>
> The explanation is that, since "equal.var" is not a recognised
> parameter for t.test(), it has assumed the default value FALSE
> for var.equal, and has therefore (since it is a 2-sample test)
> adopted the Welch/Satterthwaite procedure:
>
>   var.equal: a logical variable indicating whether to treat
>           the two variances as being equal. If 'TRUE' then the
>           pooled variance is used to estimate the variance
>           otherwise the Welch (or Satterthwaite) approximation
>           to the degrees of freedom is used.
>
> This has the effect of somewhat adapting the test procedure to
> the data, so that extreme (i.e. small) values of P are even
> rarer than they should be.
>
> With best wishes,
> Ted.
>
> On 09-Jan-2013 13:24:59 Pavlos Pavlidis wrote:
> > Hi Ted,
> > thanks for the reply.
> > I use similar code, which you can see below:
> >
> > k <- 10000
> > c <- 6
> > rv <- array(NA, dim=c(k, c) )
> > for(i in 1:k){
> >   rv[i,] <- rnorm(c, mean=0, sd=1)
> > }
> >
> > rv.t.pvalues <- array(NA, k)
> >
> > for(i in 1:k){
> >   rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c],
> >                             equal.var=TRUE, alternative="two.sided")$p.value
> > }
> >
> > hist(rv.t.pvalues)
> >
> > The histogram is this one:
> > http://tinyurl.com/histogram-rt-pvalues-pdf
> >
> > all the best
> > idaios
> >
> > On Wed, Jan 9, 2013 at 12:29 PM, Ted Harding <Ted.Harding@wlandres.net> wrote:
> >
> >> On 09-Jan-2013 08:50:46 Pavlos Pavlidis wrote:
> >> > Dear all,
> >> > I observe a strange behavior of the p-values of the t-test under
> >> > the null hypothesis. Specifically, I obtain 2 samples of 3
> >> > individuals each from a normal distribution of mean 0 and variance 1.
> >> > Then, I calculate the p-value using the t-test (var.equal=TRUE,
> >> > samples are independent). When I make a histogram of p-values
> >> > I see that consistently the bin of the smallest p-values has a
> >> > lower frequency. Is this a known behavior of the t-test, or is it
> >> > a kind of bug/random number generation problem?
> >> >
> >> > kind regards,
> >> > idaios
> >>
> >> Using the following code, I did not observe the behaviour you describe.
> >> The histograms are consistent with a uniform distribution of the
> >> P-values, and the lowest bin for the P-values (when the code is
> >> run repeatedly) is not consistently lower (or higher, or anything
> >> else) than the other bins.
> >>
> >> ## My code:
> >> N <- 10000
> >> Ps <- numeric(N)
> >> for(i in (1:N)){
> >>   X1 <- rnorm(3,0,1) ; X2 <- rnorm(3,0,1)
> >>   Ps[i] <- t.test(X1,X2,var.equal=TRUE)$p.value
> >> }
> >> hist(Ps)
> >> ################################################
> >>
> >> If you would post the code you used, the reason why you are observing
> >> this may become more evident!
> >>
> >> Hoping this helps,
> >> Ted.
> >>
> >> -------------------------------------------------
> >> E-Mail: (Ted Harding) <Ted.Harding@wlandres.net>
> >> Date: 09-Jan-2013 Time: 10:29:21
> >> This message was sent by XFMail
> >> -------------------------------------------------
>
> -------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding@wlandres.net>
> Date: 09-Jan-2013 Time: 14:51:04
> This message was sent by XFMail
> -------------------------------------------------
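
A minimal sketch of the explanation given in the thread, for anyone who wants to check it directly. It is not code posted by either correspondent. It assumes that t.test() silently passes the unrecognised "equal.var" argument into `...` (as described in the thread), and the seed, the number of simulations and all variable names are arbitrary choices made for illustration. With two samples of equal size the Welch and pooled tests share the same t statistic, and the Welch degrees of freedom can be at most n1 + n2 - 2, so the Welch P-value should never fall below the pooled one; the last lines check this.

```r
# Illustrative sketch only; names, seed and n_sim are assumptions, not from the thread.
set.seed(1)  # arbitrary seed, used only for reproducibility

x <- rnorm(3)
y <- rnorm(3)

# "equal.var" is not a formal argument of t.test(); it is absorbed by `...`,
# so the default var.equal = FALSE (Welch) is what actually runs.
typo   <- t.test(x, y, equal.var = TRUE)
welch  <- t.test(x, y)
pooled <- t.test(x, y, var.equal = TRUE)

same_as_welch <- isTRUE(all.equal(typo$p.value, welch$p.value))
pooled_df     <- unname(pooled$parameter)  # n1 + n2 - 2
welch_df      <- unname(welch$parameter)   # Welch/Satterthwaite approximation

# Small simulation under the null: two samples of 3 from N(0, 1).
n_sim    <- 2000
p_welch  <- numeric(n_sim)
p_pooled <- numeric(n_sim)
for (i in seq_len(n_sim)) {
  x1 <- rnorm(3)
  x2 <- rnorm(3)
  p_welch[i]  <- t.test(x1, x2)$p.value
  p_pooled[i] <- t.test(x1, x2, var.equal = TRUE)$p.value
}

# Share of P-values in the lowest bin of interest for each procedure.
frac_welch  <- mean(p_welch  < 0.05)
frac_pooled <- mean(p_pooled < 0.05)

# With equal sample sizes the Welch P-value is never below the pooled one
# (small tolerance for floating-point rounding).
never_smaller <- all(p_welch >= p_pooled - 1e-12)
```

Comparing frac_welch with frac_pooled, or plotting hist(p_welch) next to hist(p_pooled), should reproduce the deficient lowest bin for the Welch case and a roughly flat histogram for the pooled case.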