Matthias Gondan
2008-Nov-25 14:52 UTC
[R] Statistical question: one-sample binomial test for clustered data
Dear list,

I hope the topic is of sufficient interest, because it is not R-related. I have N=100 yes/no-responses from a psychophysics paradigm (say Y Yes and 100-Y No-Responses). I want to see whether these yes-no-responses are in line with a model predicting a certain amount p of yes-responses. Standard procedure would be a one-sample binomial test for the observed proportion,

chi²(1 df) = (Y-Np)²/(Np) + [(100-Y)-N(1-p)]²/[N(1-p)]

Actually, this is the approximate chi²-test, but the sample size seems to be reasonably high for an asymptotic test.

The problem is that the experiment took quite a while, and the 100 responses are grouped into 20 blocks of 5 responses each. The responses within the blocks are clustered, ICC is about 0.13 or so.

Can anyone point me to some literature explaining a one-sample binomial test / or chi² test for correlated data? Most of the literature I found starts with more advanced stuff, e.g. 2x2 cross-tabulated data.

Best wishes,

Matthias
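For reference, the naive test described in the question (the one that ignores the block structure) can be sketched in R as follows. The values of `Y` and `p0` are made up for illustration and do not come from the post. The sketch evaluates the chi² formula directly and checks it against `prop.test()` without continuity correction, which should give the same statistic.

```r
# Hypothetical numbers for illustration only (not from the original post)
N  <- 100   # total number of responses
Y  <- 62    # observed number of yes responses (assumed)
p0 <- 0.5   # proportion of yes responses predicted by the model (assumed)

# Chi-square statistic as written in the question, 1 degree of freedom
chi2 <- (Y - N * p0)^2 / (N * p0) +
  ((N - Y) - N * (1 - p0))^2 / (N * (1 - p0))
p_value <- pchisq(chi2, df = 1, lower.tail = FALSE)

# The same asymptotic test via prop.test(), continuity correction switched off
pt <- prop.test(Y, N, p = p0, correct = FALSE)

print(c(chi2 = chi2, p_value = p_value))
print(pt)
```

Because this test treats all 100 responses as independent, it is only the baseline that the clustered-data approaches are meant to improve on.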
Greg Snow
2008-Nov-25 16:33 UTC
[R] Statistical question: one-sample binomial test for clustered data
I don't have a good reference for you, but here are a couple of things that you could try:

1. Do a bootstrap estimation of p by resampling the blocks of 5 (rather than the individual observations) and see if the hypothesized p is in the confidence interval.

2. Simulate data using the hypothesized p and the blocking structure that was actually used (find the amount of random error to add to the base p for each block that gives the desired ICC) then compute the sample proportion from the simulated data. Redo this a bunch of times to get the sampling distribution of the sample proportion. Compare the sample proportion from the real data to this distribution to get a p-value.

3. Use the glmer function from lme4 with the responses as the left side of the formula and an intercept as the right side with the groups forming the random variable, then look at the inferences on the intercept and how it compares to the hypothesized p. (this one is probably overkill for this problem, but should work in theory).

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
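Suggestions 1 and 2 could be sketched in R roughly as below. Everything in it is an assumption made for illustration: the "observed" data are simulated stand-ins, `p0 = 0.5` is a made-up hypothesized proportion, and the beta-binomial model is only one possible way to add block-level random error with a given ICC (for a beta-binomial, the intra-block correlation is 1/(a + b + 1)). The post does not prescribe any of these choices.

```r
set.seed(1)
n_blocks   <- 20
block_size <- 5
n_total    <- n_blocks * block_size
p0  <- 0.5    # hypothesized proportion (assumed for illustration)
icc <- 0.13   # intra-block correlation mentioned in the question

# Hypothetical stand-in for the real data: one row per block, 1 = yes, 0 = no
block_p <- rbeta(n_blocks, 3, 3)
resp <- matrix(rbinom(n_total, 1, rep(block_p, each = block_size)),
               nrow = n_blocks, byrow = TRUE)
block_yes <- rowSums(resp)            # number of yes responses per block
p_obs <- sum(block_yes) / n_total     # observed proportion

# Suggestion 1: bootstrap that resamples whole blocks, not single responses
B <- 2000
boot_p <- replicate(B, {
  idx <- sample(n_blocks, replace = TRUE)
  sum(block_yes[idx]) / n_total
})
ci <- quantile(boot_p, c(0.025, 0.975))   # percentile interval
p0_in_ci <- unname(ci[1] <= p0 && p0 <= ci[2])

# Suggestion 2: simulate under the hypothesized p with the same blocking.
# Block probabilities are drawn from Beta(a, b) with mean p0 and
# a + b chosen so that 1 / (a + b + 1) equals the target ICC.
a <- p0 * (1 - icc) / icc
b <- (1 - p0) * (1 - icc) / icc
sim_p <- replicate(B, {
  bp <- rbeta(n_blocks, a, b)
  sum(rbinom(n_blocks, block_size, bp)) / n_total
})
# Two-sided Monte Carlo p-value: how often is a simulated proportion
# at least as far from p0 as the observed one?
p_value <- mean(abs(sim_p - p0) >= abs(p_obs - p0))

print(ci)
print(p0_in_ci)
print(p_value)
```

With real data, `resp` would be replaced by the actual 20 x 5 response matrix; the percentile interval is the simplest bootstrap interval and other interval types may behave better with only 20 blocks.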
Thomas Lumley
2008-Nov-25 20:58 UTC
[R] Statistical question: one-sample binomial test for clustered data
The bootstrap that Greg Snow suggested is probably the best approach, but it is possible to estimate the variance of the proportion.

The total number Y of yes responses is the sum of the twenty block totals, and these are independent, so the variance of Y is 20 times the variance of these twenty numbers. The variance of the proportion is the variance of Y divided by 100^2.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle
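The variance calculation described above can be written out in a few lines of R. The block totals and `p0` below are invented for illustration, and the final z-test against the hypothesized proportion is an added step that the message itself does not spell out; it assumes the proportion is approximately normal with only 20 blocks.

```r
# Hypothetical block totals: number of yes responses in each of 20 blocks of 5
block_yes <- c(3, 4, 2, 5, 3, 1, 4, 3, 2, 5,
               4, 3, 0, 2, 4, 3, 5, 1, 3, 4)
n_blocks <- length(block_yes)   # 20
n_total  <- n_blocks * 5        # 100 responses
p0 <- 0.5                       # hypothesized proportion (assumed)

Y     <- sum(block_yes)         # total number of yes responses
p_hat <- Y / n_total            # observed proportion

# Y is a sum of 20 independent block totals, so
# Var(Y) is 20 times the variance of a block total
var_Y <- n_blocks * var(block_yes)
var_p <- var_Y / n_total^2      # variance of the proportion
se_p  <- sqrt(var_p)

# Added step (not in the message): approximate z-test of p_hat against p0
z <- (p_hat - p0) / se_p
p_value <- 2 * pnorm(-abs(z))

print(c(p_hat = p_hat, se = se_p, z = z, p_value = p_value))
```

Because the standard error is computed from the spread of the block totals, it absorbs whatever within-block correlation is present without needing an explicit ICC estimate.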