Andrew Robinson
2011-Jun-14 01:02 UTC
[R] Off-topic: (Simple?) Random Sampling when n is a random variable
Hi everyone,

I'm involved in a discussion with a colleague. He suggested a sample design for a finite-sized process that (to all intents and purposes) involves tossing a coin for each unit and examining the unit if the coin shows Heads.

I should emphasize that we're both approaching the problem from a design-based sampling theory point of view. So I have no argument about the appropriateness of the design as such.

Can this design be called 'Simple Random Sampling'? My intuition suggests that it cannot, because the sample size is a random variable, so the usual standard error equations for SRS will be inaccurate. But I can't find any citations to back me up. So maybe I'm wrong. My questions are:

1) does this design have a name, and

2) are the usual SRS formulas for e.g. the standard error of the mean exactly accurate? Or are they defensibly accurate approximations?

3) can anyone suggest some citations that provide guidance either way?

Thanks for any assistance!

Andrew

--
Andrew Robinson
Program Manager, ACERA
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011)
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009):
http://www.ms.unimelb.edu.au/spuRs/
Andrew Robinson
2011-Jun-14 10:04 UTC
[R] Off-topic: (Simple?) Random Sampling when n is a random variable
On Tue, Jun 14, 2011 at 11:02:52AM +1000, Andrew Robinson wrote:
> 1) does this design have a name, and

Bernoulli sampling.

> 2) are the usual SRS formula for e.g. the standard error of the mean
> exactly accurate? Or are they defensibly accurate approximations?

Not exact. Can be approximately ok. See 'Estimation of a Population Total Under a "Bernoulli Sampling" Procedure' Strand 1979 American Statistician 33 (2) 81-84. See also Sarndal et al 'Model Assisted Survey Sampling'.

> 3) can anyone suggest some citations that provide guidance either way?

As above!

Best wishes to all

Andrew
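To make the "not exact" answer concrete, here is a minimal simulation sketch of the coin-toss design. It is not taken from Strand or Sarndal et al.; it only uses the standard result that, when each of the units is included independently with probability pi, the Horvitz-Thompson estimate of the total is sum(y_sampled) / pi, with design variance (1/pi - 1) * sum(y_k^2) taken over the whole population. The function names, the seed, and the toy population are illustrative assumptions. If the population size is known, dividing the estimated total by it gives an estimate of the mean.

```python
import random


def bernoulli_sample(population, pi, rng):
    """Toss a coin for every unit; keep the unit with probability pi."""
    return [y for y in population if rng.random() < pi]


def ht_total(sample, pi):
    """Horvitz-Thompson estimate of the population total."""
    return sum(sample) / pi


def ht_total_variance(population, pi):
    """Design variance of the HT total under Bernoulli sampling."""
    return (1.0 / pi - 1.0) * sum(y * y for y in population)


def ht_total_variance_estimate(sample, pi):
    """Unbiased estimate of that variance, computed from the sample only."""
    return (1.0 / pi) * (1.0 / pi - 1.0) * sum(y * y for y in sample)


def simulate(population, pi, reps, seed=1):
    """Repeat the design and return the mean and variance of the HT totals."""
    rng = random.Random(seed)
    estimates = [
        ht_total(bernoulli_sample(population, pi, rng), pi) for _ in range(reps)
    ]
    mean_est = sum(estimates) / reps
    var_est = sum((e - mean_est) ** 2 for e in estimates) / (reps - 1)
    return mean_est, var_est


if __name__ == "__main__":
    # Toy population (an assumption for illustration): y_k = 1, 2, ..., 200.
    pop = [float(k) for k in range(1, 201)]
    mean_est, var_est = simulate(pop, pi=0.5, reps=4000)
    print("true total:          ", sum(pop))
    print("mean of HT estimates:", mean_est)
    print("theoretical variance:", ht_total_variance(pop, 0.5))
    print("simulated variance:  ", var_est)
```

The variance here depends on sum(y_k^2) rather than on the spread of y around its mean, which is one way to see that the fixed-n SRS formula does not carry over exactly. Conditional on the realised sample size, a Bernoulli sample is a simple random sample of that size, which is why the SRS formulas can still be approximately acceptable.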
Greg Snow
2011-Jun-14 22:13 UTC
[R] Off-topic: (Simple?) Random Sampling when n is a random variable
This sounds like what is called "domains" in survey sampling (possibly other names, but that is what I learned it as). The idea is that you take a random sample (or the population), then ask a question to determine which domain the subject is in, then ask the question of interest in the domain of interest.

For example, you want to know how long tourists plan to stay in the area, so you go to the airport and ask N people if they are tourists; if they answer 'yes' then you ask how long they will be staying. The sample size of tourists n (which is <=N) is random and not known ahead of time. This is the same idea as you flipping a coin instead of asking the 1st question.

And yes, the randomness of n does change the formulas needed. Consult a survey sampling text for details (I am looking at the one by Lohr, which has a section on this).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
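As a rough illustration of how a random domain size enters the formulas, the sketch below computes a domain mean and an approximate standard error using the usual ratio-estimator (linearization) approximation for a simple random sample. This is an assumption about which formula is meant, not a transcription of the section in Lohr. To avoid clashing with the N and n used in the message above, the number of people asked is called n_asked and the size of the whole population is called population_size; both names and the example numbers are hypothetical.

```python
import math


def domain_mean_se(domain_values, n_asked, population_size):
    """Domain mean and approximate SE under simple random sampling.

    domain_values   : responses from sampled units that fall in the domain
                      (e.g. planned length of stay for the tourists only)
    n_asked         : total number of units sampled, in or out of the domain
    population_size : size of the population the sample was drawn from
                      (pass float("inf") to drop the finite population correction)
    """
    n_d = len(domain_values)  # random: not fixed by the design
    if n_d < 2:
        raise ValueError("need at least two sampled units in the domain")
    mean_d = sum(domain_values) / n_d
    s2_d = sum((y - mean_d) ** 2 for y in domain_values) / (n_d - 1)
    fpc = 1.0 - n_asked / population_size
    # Ratio-estimator variance; for large samples this is close to
    # fpc * s2_d / n_d, i.e. the SRS formula with n_d in place of n.
    variance = fpc * (n_asked * (n_d - 1)) / ((n_asked - 1) * n_d ** 2) * s2_d
    return mean_d, math.sqrt(variance)


if __name__ == "__main__":
    # Five people asked out of a population of 100; three were tourists.
    mean_d, se = domain_mean_se([2.0, 4.0, 6.0], n_asked=5, population_size=100)
    print("mean stay among tourists:", mean_d)
    print("approximate SE:          ", se)
```

The factor n_asked * (n_d - 1) / ((n_asked - 1) * n_d) is what separates this from simply plugging n_d into the fixed-n formula; it tends to 1 as the sample grows, so the two agree closely unless the domain sample is small.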