There are several aspects of that description that are worrying, including
the starting point (looks like a least-squares fit is done), and the
covariance matrix estimate.
For more comprehensive approaches to robust linear model fitting, see
rlm in MASS and lmRob in package robust. help.search("robuts linear")
would have got you there. But if you are looking for outliers, resistant
fitting (e.g. lqs in MASS) may be more informative.
There are examples in MASS (the book), including of the differences.
On Tue, 20 Nov 2007, Darren Weber wrote:
> Hi,
>
> I've been using the Matlab robustfit function for linear regressions
> where I suspect some data points are outliers. Is there an equivalent
> function in R?
>
> Take care, Darren
>
> PS, This is the Matlab help on robustfit:
>
>>> help robustfit
> ROBUSTFIT Robust linear regression
> B = ROBUSTFIT(X,Y) returns the vector B of regression coefficients,
> obtained by performing robust regression to estimate the linear model
> Y = Xb. X is an n-by-p matrix of predictor variables, and Y is an
> n-by-1 vector of observations. The algorithm uses iteratively
> reweighted least squares with the bisquare weighting function. By
> default, ROBUSTFIT adds a column of ones to X, corresponding to a
> constant term in the first element of B. Do not enter a column of ones
> directly into the X matrix.
>
> B = ROBUSTFIT(X,Y,'WFUN',TUNE) uses the weighting function
'WFUN' and
> tuning constant TUNE. 'WFUN' can be any of 'andrews',
'bisquare',
> 'cauchy', 'fair', 'huber', 'logistic',
'talwar', or 'welsch'.
> Alternatively 'WFUN' can be a function that takes a residual
vector as
> input and produces a weight vector as output. The residuals are scaled
> by the tuning constant and by an estimate of the error standard
> deviation before the weight function is called. 'WFUN' can be
> specified using @ (as in @myfun). TUNE is a tuning constant that is
> divided into the residual vector before computing the weights, and it
> is required if 'WFUN' is specified as a function.
>
> B = ROBUSTFIT(X,Y,'WFUN',TUNE,'CONST') controls whether
or not the
> model will include a constant term. 'CONST' is 'on'
(the default) to
> include the constant term, or 'off' to omit it.
>
> [B,STATS] = ROBUSTFIT(...) also returns a STATS structure
> containing the following fields:
> 'ols_s' sigma estimate (rmse) from least squares fit
> 'robust_s' robust estimate of sigma
> 'mad_s' MAD estimate of sigma; used for scaling
> residuals during the iterative fitting
> 's' final estimate of sigma, the larger of robust_s
> and a weighted average of ols_s and robust_s
> 'se' standard error of coefficient estimates
> 't' ratio of b to stats.se
> 'p' p-values for stats.t
> 'covb' estimated covariance matrix for coefficient
estimates
> 'coeffcorr' estimated correlation of coefficient estimates
> 'w' vector of weights for robust fit
> 'h' vector of leverage values for least squares fit
> 'dfe' degrees of freedom for error
> 'R' R factor in QR decomposition of X matrix
>
> The ROBUSTFIT function estimates the variance-covariance matrix of the
> coefficient estimates as V=inv(X'*X)*STATS.S^2. The standard errors
> and correlations are derived from V.
>
> ROBUSTFIT treats NaNs in X or Y as missing values, and removes them.
>
> Example:
> x = (1:10)';
> y = 10 - 2*x + randn(10,1); y(10) = 0;
> bls = regress(y,[ones(10,1) x])
> brob = robustfit(x,y)
> scatter(x,y)
> hold on
> plot(x,brob(1)+brob(2)*x,'r-',
x,bls(1)+bls(2)*x,'m:')
>
> See also REGRESS, ROBUSTDEMO.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595