I'm interested in understanding why the standard error grows with respect to the square root of the sample size. For instance, using an honest coin and flipping it L times, the expected number of HEADS is L/2, and we may define the error (relative to the expected number) to be

  e = H - L/2,

where H is the number of heads that we actually obtained. The absolute value of e grows as L grows, but by how much? It seems statistical theory claims it grows on the order of the square root of L.

To try to make things clearer to myself, I decided to play a game. Players A and B compete to see who gets closer to the error in the number of HEADS in random samples produced by an honest coin. Both players know the error should be some multiple of the square root of L, but B guesses (1/3) sqrt(L) while A guesses (1/2) sqrt(L), and it seems A is usually better.

It seems statistical theory says the constant should be the standard deviation of the phenomenon. I may not have the proper terminology here. The standard deviation for the phenomenon of flipping an honest coin can be taken to be sqrt[((-1/2)^2 + (1/2)^2)/2] = 1/2, by defining TAILS to be zero and HEADS to be one, so that each outcome deviates from the mean of 1/2 by exactly 1/2. (So that's why A is doing better.)

That the standard deviation gives the best constant seems plausible because errors are normally distributed, and that is intuitive: the standard deviation measures how much samples vary, so we can use it to estimate how far a guess will be from the expected value.

But the standard deviation is only one such measure. I could use the absolute deviation too, couldn't I? The absolute deviation of an honest coin turns out to be 1/2 as well, so by luck that's the same answer. Maybe I'd need a different example to see which measure turns out to be the better one.

Anyhow, it's not clear to me why the standard deviation is really the best guess (if it is that at all) for the constant, and it's even less clear to me why the error grows with respect to the square root of the number of coin flips, that is, of the sample size.

I would like to have an intuitive understanding of this, but if that's too hard, I would at least like to see a mathematical argument in an interesting book, which you might point me to.

Thank you!

PS. Is this off-topic? I'm not aware of any newsgroup on statistics at the moment. Please point me to the appropriate place if that applies.
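A minimal R sketch of the setup above, assuming a fair coin (L = 100 and the number of simulated rounds are arbitrary illustrative choices, not from the original post): it draws many values of the error e = H - L/2 and puts their typical size next to the two constants from the game.

set.seed(1)                       # reproducibility
L      <- 100                     # flips per round (illustrative)
rounds <- 10000                   # simulated rounds (illustrative)

H <- rbinom(rounds, L, 0.5)       # heads obtained in each round
e <- H - L / 2                    # error relative to the expected count

sd(e)                             # typical size of the error
(1/2) * sqrt(L)                   # player A's constant
(1/3) * sqrt(L)                   # player B's constant

The sample standard deviation of e should land close to (1/2) sqrt(L), the standard-deviation constant the question mentions.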
stats.stackexchange.com

On August 21, 2020 1:25:06 PM PDT, Wayne Harris via R-help <r-help at r-project.org> wrote:
> [...]

-- Sent from my phone. Please excuse my brevity.
+ (in addition to Jeff's link) https://en.wikipedia.org/wiki/Binomial_distribution

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sat, Aug 22, 2020 at 6:50 AM Wayne Harris via R-help <r-help at r-project.org> wrote:
> [...]
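Following the binomial link above: if H is the number of heads in n flips of a coin with heads probability p, then Var(H) = n p (1 - p), so for a fair coin sd(H) = sqrt(n)/2, which is the (1/2) sqrt(L) constant from the question. A quick numerical check in R (n = 250 and p = 0.5 are arbitrary illustrative values):

n <- 250                          # illustrative number of flips
p <- 0.5                          # fair coin

sqrt(n * p * (1 - p))             # theoretical sd of the number of heads
sd(rbinom(1e6, n, p))             # simulated sd for comparison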
> The absolute value of e grows as L grows, but by how much? It seems
> statistical theory claims it grows on the order of the square root of L.

Assuming you want the standard deviation of the number of successes, given p = 0.5:

n <- 100                          # pick a sample size; 100 is just an example

## exact
0.5 * sqrt(n)

## numerical approximation
sd(rbinom(1e6, n, 0.5))

Note that the variance is linear in n, because the variances of independent flips add; the standard deviation is its square root, which is where the sqrt(n) growth comes from.
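One rough way to see the square-root growth empirically is to tabulate the simulated standard deviation of the head count against 0.5 * sqrt(n) for several sample sizes (the particular n values below are arbitrary):

ns <- c(10, 100, 1000, 10000)     # illustrative sample sizes

## simulated sd of the number of heads at each n, next to the exact value
sim_sd   <- sapply(ns, function(n) sd(rbinom(1e5, n, 0.5)))
exact_sd <- 0.5 * sqrt(ns)

data.frame(n = ns, simulated = sim_sd, exact = exact_sd)

Multiplying the sample size by 100 multiplies the spread of the error by only 10, which is the square-root growth in action.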