James Shaw
2011-Feb-28 23:50 UTC
[R] Robust variance estimation with rq (failure of the bootstrap?)
I am fitting quantile regression models using data collected from a sample of 124 patients. When modeling cross-sectional associations, I have noticed that nonparametric bootstrap estimates of the variances of parameter estimates are much greater in magnitude than the empirical Huber estimates derived using summary.rq's "nid" option. The outcome variable is severely skewed, and I am afraid that this may be affecting the consistency of the bootstrap variance estimates.

I have read that the m out of n bootstrap can be used to overcome this problem. However, this procedure requires both the original sample (n) and the subsample (m) sizes to be large. The version implemented in boot.rq does not appear to provide any improvement over the naive bootstrap.

Ultimately, I am interested in using median regression to model changes in the outcome variable over time. The robust variance estimator in summary.rq is not applicable to repeated-measures data. I question whether the block (cluster) bootstrap variance estimator, which can accommodate intraclass correlation, would perform well. Can anyone suggest alternatives for variance estimation in this situation?

Regards,
Jim

James W. Shaw, Ph.D., Pharm.D., M.P.H.
Assistant Professor
Department of Pharmacy Administration
College of Pharmacy
University of Illinois at Chicago
833 South Wood Street, M/C 871, Room 266
Chicago, IL 60612
Tel.: 312-355-5666
Fax: 312-996-0868
Mobile Tel.: 215-852-3045
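The comparison described in this post can be reproduced roughly as follows. This is only a sketch under stated assumptions: the data are simulated (they are not the patient data), the variable names x and y are made up for illustration, and the subsample size m = 60 is an arbitrary choice. It assumes the quantreg package is installed and that extra arguments to summary() are passed on to boot.rq.

```r
library(quantreg)

set.seed(1)
n <- 124                                # sample size quoted in the post
x <- rnorm(n)                           # hypothetical covariate
y <- exp(1 + 0.5 * x + rnorm(n))        # simulated, right-skewed outcome

fit <- rq(y ~ x, tau = 0.5)             # median regression

# Huber sandwich standard errors ("nid" option of summary.rq)
se_nid <- summary(fit, se = "nid")$coefficients[, "Std. Error"]

# Naive xy-pair bootstrap standard errors
se_boot <- summary(fit, se = "boot", bsmethod = "xy",
                   R = 500)$coefficients[, "Std. Error"]

# m out of n bootstrap; m = 60 is an arbitrary illustrative value
se_mofn <- summary(fit, se = "boot", bsmethod = "xy",
                   R = 500, mofn = 60)$coefficients[, "Std. Error"]

print(rbind(nid = se_nid, boot = se_boot, mofn = se_mofn))
```

Rerunning with different seeds and values of m gives a feel for how much the three estimates vary at this sample size.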
Matt Shotwell
2011-Mar-01 02:59 UTC
[R] Robust variance estimation with rq (failure of the bootstrap?)
Jim,

If repeated measurements on patients are correlated, then resampling all measurements independently induces an incorrect sampling distribution (and hence an incorrect variance) on a statistic of these data. One solution, as you mention, is the block or cluster bootstrap, which preserves the correlation among repeated observations in resamples. I don't immediately see why the cluster bootstrap is unsuitable. Beyond this, I would be concerned about *any* variance estimates that are blind to correlated observations.

The bootstrap variance estimate may be larger than the asymptotic variance estimate, but that alone isn't evidence to favor one over the other. Also, I can't justify (to myself) why skew would hamper the quality of bootstrap variance estimates. I wonder how it affects the sandwich variance estimate...

Best,
Matt
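A cluster bootstrap for median regression along these lines might look like the sketch below. Everything in it is an assumption made for illustration: the simulated data, the column names id, time and y, and the helper cluster_boot_rq, which is not part of quantreg. The key step is that whole patients are resampled with replacement, so each patient's repeated measurements stay together in every resample.

```r
library(quantreg)

set.seed(1)
n_id <- 124                              # patients
n_t  <- 3                                # visits per patient (made up)
d <- data.frame(id   = rep(seq_len(n_id), each = n_t),
                time = rep(0:(n_t - 1), times = n_id))
u <- rexp(n_id)                          # patient-level effect, induces correlation
d$y <- exp(0.2 * d$time + rnorm(nrow(d), sd = 0.5)) + u[d$id]

# Hypothetical helper: resample patients, refit, collect coefficients
cluster_boot_rq <- function(formula, data, id, tau = 0.5, R = 200) {
  rows_by_id <- split(seq_len(nrow(data)), data[[id]])
  t(replicate(R, {
    drawn <- sample(length(rows_by_id), replace = TRUE)
    idx   <- unlist(rows_by_id[drawn], use.names = FALSE)
    coef(rq(formula, tau = tau, data = data[idx, , drop = FALSE]))
  }))
}

B <- cluster_boot_rq(y ~ time, d, id = "id")
se_cluster <- apply(B, 2, sd)            # cluster bootstrap standard errors
print(se_cluster)
```

Comparing se_cluster with the standard errors from a bootstrap that resamples rows independently shows how much the within-patient correlation matters for a given data set.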