Cheng, Yiling (CDC/CCHP/NCCDPHP)
2008-Aug-20 13:12 UTC
[R] Quantile regression with complex survey data
Dear all,

I am working with the NHANES survey data and want to apply quantile regression to these complex survey data. Does anyone know how to do this?

Thank you in advance,
Yiling Cheng

Yiling J. Cheng MD, PhD
Epidemiologist
CoCHP, Division of Diabetes Translation
Centers for Disease Control and Prevention
4770 Buford Highway, N.E. Mailstop K-10
Atlanta, GA 30341
Cheng, Yiling (CDC/CCHP/NCCDPHP)
2008-Aug-20 13:42 UTC
[R] Quantile regression with complex survey data
On Wed, Aug 20, 2008 at 8:12 AM, Cheng, Yiling (CDC/CCHP/NCCDPHP) <ycc1 at cdc.gov> wrote:
> I am working on the NHANES survey data, and want to apply quantile
> regression on these complex survey data. Does anyone know how to do
> this?

There are no references in the technical literature (I am thinking of Annals, JASA, JRSS B, Survey Methodology). Absolutely none. Zero. You might be able to apply the procedure mechanically and then adjust the standard errors, but God only knows what the population equivalent is of whatever that model estimates, if there is a population analogue at all.

In general, quantile regression is a heavily model-based concept: for each value of the explanatory variables, there is a well-defined distribution of the response, and quantile regression puts additional structure on it, namely linearity of the quantiles with respect to some explanatory variables. That does not mesh well with the design paradigm under which survey estimation is usually conducted. In the latter, the finite population and the characteristics of every unit are assumed fixed, and randomness comes only from the sampling procedure. Within that paradigm, you can define the marginal distribution of the response (or any other) variable, but the conditional distributions may simply be unavailable because there are no units in the population satisfying the conditions.

--
Stas Kolenikov, also found at stas.kolenikov.name
Small print: I use this email account for mailing lists only.
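
To make "apply the procedure mechanically" concrete, here is a minimal sketch of what the point estimate would be: the ordinary quantile regression objective (the check loss) with each observation multiplied by its sampling weight. It is plain Python on a made-up toy data set, not NHANES. The function names, the data, and the weights are all hypothetical. It fits only an intercept and one slope, by brute force, and it says nothing about standard errors, which would still need a design-based adjustment, nor about which population quantity the fitted line corresponds to.

```python
from itertools import combinations


def check_loss(u, tau):
    """Check function rho_tau(u): tau*u if u >= 0, (tau - 1)*u otherwise."""
    return u * (tau - (u < 0))


def weighted_rq_line(x, y, w, tau=0.5):
    """Fit y ~ a + b*x by minimizing sum_i w[i] * rho_tau(y[i] - a - b*x[i]).

    Brute force: this is a linear program, so some optimal line passes
    through two data points with distinct x. Try every such pair and keep
    the one with the smallest weighted loss. Assumes strictly positive
    weights; O(n^3) work, so only for tiny illustrative data sets.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # two points with equal x do not determine a line a + b*x
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        loss = sum(wk * check_loss(yk - a - b * xk, tau)
                   for xk, yk, wk in zip(x, y, w))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


# Hypothetical toy data: two groups (x = 0 and x = 1) with identical responses.
x = [0, 0, 0, 1, 1, 1]
y = [0, 1, 10, 0, 1, 10]

equal_w = [1, 1, 1, 1, 1, 1]    # unweighted median regression
design_w = [1, 1, 5, 1, 1, 5]   # made-up weights: y = 10 units represent more people

print(weighted_rq_line(x, y, equal_w))    # (1.0, 0.0)
print(weighted_rq_line(x, y, design_w))   # (10.0, 0.0)
```

The weights change the fitted median from 1 to 10 in both groups, which shows that ignoring them would give a different answer. What that weighted fit estimates in the finite population is exactly the open question raised in the reply above.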