Geertje Van der Heijden
2007-Sep-04  17:55 UTC
[R] Robust linear models and unequal variance
Hi all,

I probably have a basic question, but I can't seem to find the answer in the literature or in the R archives. I would like to do a robust ANCOVA (using either rlm or lmRob, from the MASS and robust packages respectively) - my response variable deviates slightly from normal and I have some "outliers". The data consist of 2 factor variables and 3-5 covariates (depending on the model).

However, the variance between my groups is not equal, and I am not sure whether it is therefore appropriate to use a robust statistical method or whether a non-parametric analysis (i.e. ranked regression) might be better. If I can still use a robust statistical method, which estimator is best for dealing with unequal variance? And if it is better to use a non-parametric analysis, could anyone point me in the direction of the right non-parametric method to use (the relationship between my response variable and the covariates is linear)?

Any help on this would be greatly appreciated!

Many thanks,
Geertje

~~~~
Geertje van der Heijden
PhD student
Tropical Ecology
School of Geography
University of Leeds
Leeds LS2 9JT
Tel: (+44)(0)113 3433345
Email: g.m.f.vanderheijden04@leeds.ac.uk
Let me try a reply, although I wish others wiser than I had responded.

1. How do you know the variances are unequal?

2. If you somehow know what the variances are (or at least their relative sizes), you can use the "weights" arguments of the functions you mention to weight inversely proportional to variance (except not for the "MM" method in rlm(), according to the docs).

3. That "ranked regression" is robust is a myth. It also does not deal with the unequal variance situation. It is not a panacea for anything. If you need "robust" regression, use robust regression.

4. If group sizes are not too dissimilar, then whether you case weight or not may not make much difference, especially for estimation (alas, this is hard to tell a priori).

The fundamental issue is that "outliers" and "unequal variances" must be operationalized, otherwise they are confounded: "outlier" only has meaning compared to what is expected from a specified distribution. Outliers are no longer out when the variance is "large."

Also look at glm() with the "quasi" option if you wish to consider fitting a heterogeneous variance structure to initialize a robust method (which could, of course, be distorted by your "outliers").

Bert Gunter
Genentech Nonclinical Statistics
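As a rough illustration of point 2 in the reply, here is a minimal sketch of weighting inversely proportional to variance in rlm(). Everything in it is an assumption made for the example: the data are simulated, the names dat, g, x, y and w are hypothetical, and estimating the per-group scale from the MAD of the residuals of an initial unweighted robust fit is just one possible way to get relative variances (the reply only covers the case where they are somehow known). It uses the default M-estimator, not method = "MM".

```r
library(MASS)

set.seed(1)

# Simulated data (hypothetical): one factor with three groups whose
# error standard deviations differ, plus one linear covariate.
n    <- 60
g    <- factor(rep(c("a", "b", "c"), each = n / 3))
x    <- rnorm(n)
sd_g <- c(a = 1, b = 2, c = 4)
y    <- 1 + 2 * x + rnorm(n, sd = sd_g[as.character(g)])
y[c(5, 45)] <- y[c(5, 45)] + 25          # two gross "outliers"
dat  <- data.frame(y = y, x = x, g = g)

# Step 1: initial robust fit without weights (default M-estimation, Huber psi).
fit0 <- rlm(y ~ g + x, data = dat)

# Step 2: robust scale estimate per group from the residuals (assumption:
# MAD per group is an adequate stand-in for the group standard deviation).
s_g <- tapply(resid(fit0), dat$g, mad)

# Step 3: weights inversely proportional to the estimated group variance.
dat$w <- as.numeric(1 / s_g[as.character(dat$g)]^2)

# Step 4: refit, telling rlm() the weights are inverse variances
# rather than case weights.
fit1 <- rlm(y ~ g + x, data = dat, weights = w, wt.method = "inv.var")

print(s_g)
print(coef(fit1))
```

Comparing coef(fit0) with coef(fit1) gives some feel for point 4, i.e. whether the weighting changes the estimates much for a given set of group sizes. The caveat from the reply still applies: the estimated group scales can themselves be distorted by the "outliers".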