knallgrau at gmx.com
2012-May-28 10:28 UTC
[R] stats q: multiple imputation and quantile regression
Dear list,

this is more of a statistics question than an R question, but perhaps someone could help me out anyway.

I'm doing sociological research and am currently familiarizing myself with the basic concepts of multiple imputation. Eventually, my goal is to perform quantile regression on a large data set in which one non-negative discrete predictor variable contains missing values, which I'm hoping to fill in using multiple imputation. The variable in question has between 5% and 20% missing values (depending on the sample I'm using).

Here's my question: Is it acceptable to use a linear-regression-based model to impute the values of my non-negative discrete predictor variable, even though the aim is to use quantile regression for the substantive analysis?

Section 2 (page 6) of Joseph L. Schafer's "Multiple Imputation: A primer" (Statistical Methods in Medical Research 1999, Vol 8, pp 3-15) gives me the impression that I might have a problem if the predictor's distribution is skewed and I'm mainly interested in conditional quantiles rather than means in my substantive analysis.

Any pointers you could give me would be greatly appreciated.

Best,
Irene P.