Johannes:
WARNING: I'm no expert. Caveat emptor!
There is a huge literature on robust estimation, as you'll find when you
Google it. One natural place to start might be the relevant sections of
V&R's MASS (Modern Applied Statistics with S) and the references therein.
An old classic, which may no longer be in print, is Hoaglin, Mosteller,
and Tukey, Understanding Robust and Exploratory Data Analysis (see the
chapter on robust estimation).
It is not clear to me that robust estimation will solve your problems with
lots of one-sided outliers -- sounds like a skewed distribution in there
somewhere.
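
To make the skewness worry concrete, here is a minimal sketch (in Python
rather than R, only to keep it dependency-free; in R, MASS's huber() is the
natural tool). Everything in it is my own assumption for illustration: the
"data" are lognormal quantiles rather than anyone's real measurements, the
Huber estimate uses a fixed MAD scale, and k = 1.345 is just the
conventional tuning constant. The point is only that with a skewed
distribution the mean, the median, and a Huber M-estimate settle on
different values, so no choice of weight function makes the one-sidedness
go away.

```python
import math
from statistics import NormalDist, mean, median


def mad_scale(xs):
    """Median absolute deviation, rescaled to match the SD at the normal."""
    m = median(xs)
    return 1.4826 * median([abs(x - m) for x in xs])


def huber_location(xs, k=1.345, tol=1e-9, max_iter=200):
    """Huber M-estimate of location via iteratively reweighted means.

    The scale is held fixed at the MAD (an assumption of this sketch).
    """
    mu = median(xs)
    s = mad_scale(xs)
    for _ in range(max_iter):
        # Full weight inside k*s of the current centre, downweighted outside.
        w = [1.0 if abs(x - mu) <= k * s else k * s / abs(x - mu) for x in xs]
        new_mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Hypothetical skewed "data": 200 evenly spaced quantiles of a standard
# lognormal, so the example is deterministic.
n = 200
std_normal = NormalDist()
xs = [math.exp(std_normal.inv_cdf((i + 0.5) / n)) for i in range(n)]

est_mean = mean(xs)
est_median = median(xs)
est_huber = huber_location(xs)

print("mean  :", round(est_mean, 3))
print("median:", round(est_median, 3))
print("huber :", round(est_huber, 3))
```

The three numbers differ because they estimate three different things, not
because two of them are "wrong". With skewed data the first question is
which summary you actually want.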
One thing to be careful about: there's "robustness of efficiency" and
"outlier resistance." The first is about maintaining estimation efficiency
in the face of "contamination" by a usually small percentage of "outliers"
(whatever THEY are); the second is about maintaining estimation accuracy in
the face of a possibly large proportion of outliers. The classic example of
the latter for estimating location is the median; an M-estimator (e.g.,
iterated biweight) is an exemplar of the former. As V&R and others make
clear, these are not mutually exclusive, but they do tend to pull in
somewhat different ways.
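
A rough sketch of the distinction, again in dependency-free Python and
again with everything assumed by me for illustration: the contamination
model (10% of points drawn with 10 times the standard deviation), the
sample size, the seed, the biweight tuning constant c = 4.685, and the
fixed MAD scale. The first part is a small simulation of "robustness of
efficiency": under mild contamination it compares the sampling variance of
the mean, the median, and an iterated biweight. The second part is "outlier
resistance": with 45% gross one-sided outliers the median still sits among
the good points.

```python
import random
from statistics import mean, median, pvariance


def biweight_location(xs, c=4.685, tol=1e-9, max_iter=100):
    """Iterated Tukey biweight location estimate with a fixed MAD scale."""
    mu = median(xs)
    s = 1.4826 * median([abs(x - mu) for x in xs])
    for _ in range(max_iter):
        # Weight (1 - u^2)^2 for |u| < 1 and zero beyond, with u = (x - mu)/(c*s).
        w = [max(0.0, 1.0 - ((x - mu) / (c * s)) ** 2) ** 2 for x in xs]
        new_mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Part 1: efficiency under mild, symmetric contamination (simulated).
rng = random.Random(2005)
n, reps = 40, 1000
means, medians, biweights = [], [], []
for _ in range(reps):
    # Each point is N(0, 1) with probability 0.9 and N(0, 10^2) otherwise.
    sample = [rng.gauss(0.0, 10.0 if rng.random() < 0.10 else 1.0)
              for _ in range(n)]
    means.append(mean(sample))
    medians.append(median(sample))
    biweights.append(biweight_location(sample))

var_mean = pvariance(means)
var_median = pvariance(medians)
var_biweight = pvariance(biweights)
print("sampling variance  mean:", var_mean)
print("sampling variance  median:", var_median)
print("sampling variance  biweight:", var_biweight)

# Part 2: resistance to a large proportion of gross outliers (deterministic).
clean = [9.0 + 0.1 * j for j in range(11)]   # 11 good values, 9.0 to 10.0
heavy = clean + [1000.0] * 9                 # 45% wild values on one side
med_heavy = median(heavy)
mean_heavy = mean(heavy)
print("median with 45% outliers:", med_heavy)
print("mean with 45% outliers  :", mean_heavy)
```

In this setup I would expect the biweight to show the smallest sampling
variance in the first part, while the second part shows the median staying
within the range of the good values. It is a toy comparison under my
assumed contamination model, not a general ranking of the estimators.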
Robust estimation seems to have lost its cachet these days, maybe because it
seems to be difficult to do in the nonlinear models that arise out of the
complex covariance structures people want to use these days (e.g., mixed
models, empirical Bayes). I continue to find it an essential tool in any
routine regression work that I do, however. It seems more in keeping with
entropy.
Cheers,
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
"The business of the statistician is to catalyze the scientific learning
process." - George E. P. Box
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Johannes Graumann
> Sent: Tuesday, August 23, 2005 2:33 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Robust M-Estimator Comparison
>
> Hello,
>
> I'm learning about robust M-estimators right now and had settled on the
> "Huber Proposal 2" as implemented in MASS, but further reading made clear
> that at least 2 further weighting functions (Hampel, Tukey bisquare) exist.
> In a post from B.D. Ripley going back to 1999 I found the following quote:
>
> >> 2) Would huber() give me results that are similar (i.e., close enough)?
> >
> > Not if you have lots of extreme outliers on just one side.
>
> Since this message seems to imply that the nature of the data described
> (and not just personal preference) should influence the choice among the
> above M-estimators, I've been scouting around for a direct comparison
> among them - to no avail.
>
> Can anybody here point me to such a comparison (novice-suitability would
> be more than welcome ;0)?
>
> Thanks for any hint,
>
> Joh
>
> --
> +----------------------------------------------------------------------+
> | Johannes Graumann, Dipl. Biol.                                       |
> |                                                                      |
> | Graduate Student                   Tel.: ++1 (626) 395 6602          |
> | Deshaies Lab                       Fax.: ++1 (626) 395 5739          |
> | Department of Biology                                                |
> | CALTECH, M/C 156-29                                                  |
> | 1200 E. California Blvd.                                             |
> | Pasadena, CA 91125                                                   |
> | USA                                                                  |
> +----------------------------------------------------------------------+
>