thr3ads.net - R devel - [Rd] proposal: allowing alternative variance estimators in glm/lm [Dec 2006]

Thomas Lumley

2006-Dec-27 17:54 UTC

[Rd] proposal: allowing alternative variance estimators in glm/lm

There has been recent discussion about alternatives to the model-based 
standard error estimators for lm. While some people like the sandwich 
estimator and others don't, it is clear that neither estimator dominates 
the other for any sane loss function.  It is also worth noting that the 
sandwich estimator is the default for t.test().

I think it would be useful for models using other variance estimators to 
be able to inherit from lm and use summary.lm and predict.lm (and 
similarly for glm).  The main step in making this possible would be
moving the variance-covariance matrix computation that is currently 
duplicated in summary.lm and predict.lm into vcov.lm, and then having 
summary.lm and predict.lm call vcov().
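
To make the idea concrete, here is a minimal sketch in R. The function coef_table() is a hypothetical stand-in for the part of summary.lm that builds the coefficient table, not actual stats code. The only point is that the standard errors come from vcov(), which dispatches on the class of the object.

```r
# Sketch only: coef_table() is a hypothetical stand-in for the
# coefficient-table step of summary.lm; it is not part of stats.
coef_table <- function(object) {
  est  <- coef(object)
  se   <- sqrt(diag(vcov(object)))      # dispatches on class(object)
  tval <- est / se
  pval <- 2 * pt(abs(tval), df.residual(object), lower.tail = FALSE)
  cbind(Estimate = est, `Std. Error` = se,
        `t value` = tval, `Pr(>|t|)` = pval)
}

fit <- lm(dist ~ speed, data = cars)
tab <- coef_table(fit)
tab
```

For a plain lm fit this should agree with coef(summary(fit)), since vcov() then returns the usual model-based matrix.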

This allows a fitting function (whether lm() or another function) to 
produce objects that inherit usefully from lm and glm but have other 
standard error estimators, by supplying a new vcov method for the class. 
The initial discussion was about heteroscedasticity-consistent sandwich 
estimators, but from my point of view autocorrelation-consistent 
estimators and estimators that handle sampling weights are more 
interesting.
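
As an illustration of supplying a vcov method for a new class, the sketch below defines a hypothetical class "lm_hc0" whose vcov() returns the basic (HC0) heteroscedasticity-consistent sandwich estimator. The class name and the constructor lm_hc0() are assumptions made for this example; nothing in stats defines them.

```r
# Hypothetical fitting function: an lm fit with an extra leading class.
lm_hc0 <- function(formula, data) {
  fit <- lm(formula, data = data)
  class(fit) <- c("lm_hc0", class(fit))
  fit
}

# vcov method for the hypothetical class: HC0 sandwich estimator
# (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}, assuming a full-rank,
# unweighted fit.
vcov.lm_hc0 <- function(object, ...) {
  X <- model.matrix(object)
  e <- residuals(object)
  bread <- solve(crossprod(X))          # (X'X)^{-1}
  meat  <- crossprod(X * e)             # sum of e_i^2 * x_i x_i'
  bread %*% meat %*% bread
}

fit_hc <- lm_hc0(dist ~ speed, data = cars)
V_hc   <- vcov(fit_hc)                              # sandwich
V_mod  <- vcov(lm(dist ~ speed, data = cars))       # model-based
sqrt(diag(V_hc))
sqrt(diag(V_mod))
```

Everything else (coef, residuals, predict for point predictions) is inherited from lm unchanged; only code that goes through vcov() sees the different estimator.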

OOP purists might point out that the relationship involved is not, 
strictly speaking, inheritance.  They would be quite right. However, 
unless someone wants to rewrite glm and lm for S4 classes I think that 
battle is lost.


     -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

Prof Brian Ripley

2006-Dec-27 18:43 UTC

What concerns me about this is if people call the summary methods directly 
on objects not of the right class.  That used to be quite prevalent in R 
itself, but problems with residuals/weights mean it has now gone, I 
believe.

summary.lm and summary.glm are exported from stats, and this indicates 
that they were quite widely used (and a grep across CRAN suggests that 
they still are).

One fairly backwards-compatible option would seem to be to call the vcov 
generic only if the object inherits from [g]lm and has an earlier class.
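
A rough sketch of that check in R follows. The helper vcov_for_summary() and the class "lm_demo" are hypothetical names used only for illustration, and the fallback branch uses the exported summary.lm as a stand-in for the built-in model-based computation.

```r
# Hypothetical helper: choose the covariance matrix a summary method uses.
vcov_for_summary <- function(object) {
  # Position of "lm" in class(object); 0 if the object is not an lm.
  pos <- inherits(object, "lm", which = TRUE)
  if (pos > 1L) {
    # A more specific class precedes "lm": use its vcov method.
    vcov(object)
  } else {
    # Plain lm, or a direct call on some other object: keep the
    # model-based formula (summary.lm is a stand-in here).
    s <- summary.lm(object)
    s$sigma^2 * s$cov.unscaled
  }
}

fit <- lm(dist ~ speed, data = cars)

# Hypothetical subclass with a dummy vcov method, for illustration only.
robust <- fit
class(robust) <- c("lm_demo", class(fit))
vcov.lm_demo <- function(object, ...) diag(length(coef(object)))

v_plain  <- vcov_for_summary(fit)      # model-based matrix
v_robust <- vcov_for_summary(robust)   # whatever vcov.lm_demo returns
```

Note that a glm fit has class c("glm", "lm"), so under this rule it would also go through the vcov generic.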


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

ivo welch

2007-Jan-02 15:42 UTC

Dear Brian / Thomas:

May I suggest a "cheap" and amateurish solution, obviously without much
knowledge or intelligence about the subject?

As a non-statistician user of R, maybe a hook functionality at strategic
places could provide some flexibility without too much pain.   I think
replacing the standard output from summary.lm would be a bad idea (it
could easily create errors downstream, when idiots like myself ask "why
don't I get the s.e. that Stata produces?"  Duh---you loaded the
heteroskedasticity adjustment, but forgot about it).  But I think some
flexibility to add more information would be a very good thing.

Hooks that can be set by functions (perhaps cascades) would allow third
parties to create additional statistics, that could survive future
changes to the functions themselves, without requiring a full object
paradigm.   For example, summary.lm could provide two hooks that allow
programmers to chain their own objects to either ans$coefficients or
the ans object.  (I guess even one hook would do.)  Well-thought-out
hooks could also add to print methods, etc., without requiring complete
function rewrites, and would survive future changes in the real R code
itself.

From the perspective of a first-time amateurish end-user, an invocation
of "library(lm.addnormalized)" could then magically always add a
normalized coefficient to the coefficient output.  An invocation of
"library(lm.addheteroskedasticity)" could magically always add
heteroskedasticity se's and T-stats.  And so on.
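
A minimal sketch of what such a hook mechanism might look like is given below. Everything here is hypothetical: summary_hooks, add_summary_hook() and summary_with_hooks() are names invented for this example, and no such facility exists in stats. The example hook appends HC0 sandwich standard errors as an extra column and leaves the standard columns untouched.

```r
# Hypothetical hook registry: a list of functions applied in order.
summary_hooks <- list()

add_summary_hook <- function(name, fn) {
  summary_hooks[[name]] <<- fn
  invisible(NULL)
}

# Hypothetical wrapper: run summary(), then let each hook extend the result.
summary_with_hooks <- function(object, ...) {
  ans <- summary(object, ...)
  for (fn in summary_hooks) ans <- fn(ans, object)
  ans
}

# Example hook: add heteroskedasticity-consistent (HC0) standard errors
# as a fifth column of the coefficient table (assumes an unweighted,
# full-rank lm fit).
add_summary_hook("hc0", function(ans, object) {
  X <- model.matrix(object)
  e <- residuals(object)
  bread <- solve(crossprod(X))
  V <- bread %*% crossprod(X * e) %*% bread
  ans$coefficients <- cbind(ans$coefficients, `HC0 s.e.` = sqrt(diag(V)))
  ans
})

fit <- lm(dist ~ speed, data = cars)
s <- summary_with_hooks(fit)
s$coefficients
```

Because the hook only adds a column, the usual estimates and model-based standard errors stay where downstream code expects them.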


As I said, I don't know what I am talking about.  I am really a
non-statistician end-user, who is really a bit over his head with all of
this---I am using R not because it is sensible given my needs and
abilities, but because I am in awe of many of its capabilities (and
because I enjoy Brian berating me while he offers me usually desperately
needed help ;-) ).

Regards,

/ivo
