Thomas Lumley
2006-Dec-27 17:54 UTC
[Rd] proposal: allowing alternative variance estimators in glm/lm
There has been recent discussion about alternatives to the model-based standard error estimators for lm. While some people like the sandwich estimator and others don't, it is clear that neither estimator dominates the other for any sane loss function. It is also worth noting that the sandwich estimator is the default for t.test().

I think it would be useful for models using other variance estimators to be able to inherit from lm and use summary.lm and predict.lm (and similarly for glm). The main step in making this possible would be moving the variance-covariance matrix computation that is currently duplicated in summary.lm and predict.lm into vcov.lm, and then having summary.lm and predict.lm call vcov().

This allows a fitting function (whether lm() or another function) to produce objects that inherit usefully from lm and glm but have other standard error estimators, by supplying a new vcov method for the class. The initial discussion was about heteroscedasticity-consistent sandwich estimators, but from my point of view autocorrelation-consistent estimators and estimators that handle sampling weights are more interesting.

OOP purists might point out that the relationship involved is not, strictly speaking, inheritance. They would be quite right. However, unless someone wants to rewrite glm and lm for S4 classes I think that battle is lost.

-thomas

Thomas Lumley, Assoc. Professor, Biostatistics
University of Washington, Seattle
tlumley at u.washington.edu
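As a rough illustration of the proposal above, the following sketch defines a subclass of lm whose vcov method returns a heteroscedasticity-consistent (HC0) sandwich estimate. The class name `lm_hc0` and the constructor are hypothetical and appear nowhere in the thread. Whether summary.lm and predict.lm would pick up such a method depends on the proposed change being adopted, so the example only calls vcov() directly.

```r
# Hypothetical subclass "lm_hc0": an ordinary lm fit with an extra class
# placed ahead of "lm", so that vcov() dispatches to the method below.
lm_hc0 <- function(formula, data) {
  fit <- lm(formula, data = data)
  class(fit) <- c("lm_hc0", class(fit))
  fit
}

# HC0 sandwich estimator: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
vcov.lm_hc0 <- function(object, ...) {
  X <- model.matrix(object)
  e <- residuals(object)
  bread <- solve(crossprod(X))   # (X'X)^{-1}
  meat  <- crossprod(X * e)      # X' diag(e^2) X; row i of X is scaled by e[i]
  bread %*% meat %*% bread
}

fit <- lm_hc0(dist ~ speed, data = cars)

# Sandwich standard errors from the new method
sqrt(diag(vcov(fit)))

# Model-based standard errors from a plain lm fit, for comparison
sqrt(diag(vcov(lm(dist ~ speed, data = cars))))
```

Because the object still inherits from lm, methods such as coef(), residuals() and model.matrix() continue to work unchanged; only the variance estimate differs.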
Prof Brian Ripley
2006-Dec-27 18:43 UTC
[Rd] proposal: allowing alternative variance estimators in glm/lm
What concerns me about this is if people call the summary methods directly on objects not of the right class. That used to be quite prevalent in R itself, but problems with residuals/weights mean it has now gone, I believe. summary.lm and summary.glm are exported from stats, and this indicates that they were quite widely used (and a grep across CRAN suggests that they still are).

One fairly backwards-compatible option would seem to be to call the vcov generic only if the object inherits from [g]lm and has an earlier class.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
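The backwards-compatible rule suggested in this reply could be checked along the following lines. This is only a sketch: the helper name `has_earlier_class` is hypothetical, and how summary.lm would actually use such a test is not specified in the thread.

```r
# Hypothetical helper: TRUE only when the object inherits from `base`
# ("lm" or "glm") AND some other class comes before it in the class vector.
has_earlier_class <- function(object, base = "lm") {
  pos <- match(base, class(object))
  !is.na(pos) && pos > 1L
}

plain <- lm(dist ~ speed, data = cars)
has_earlier_class(plain)        # plain lm: "lm" is the first class

derived <- plain
class(derived) <- c("my_lm", class(derived))   # "my_lm" is a made-up class
has_earlier_class(derived)      # a class precedes "lm"

g <- glm(dist ~ speed, data = cars)
has_earlier_class(g, "glm")     # plain glm: "glm" is the first class

has_earlier_class(list())       # not an lm at all
```

Under this rule a plain lm or glm object, or an unrelated object passed directly to summary.lm, would keep the existing built-in computation, and only objects from a derived class would go through the vcov generic.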
ivo welch
2007-Jan-02 15:42 UTC
[Rd] proposal: allowing alternative variance estimators in glm/lm
Dear Brian / Thomas:

May I suggest a "cheap" and amateurish solution, obviously without much knowledge or intelligence about the subject? Speaking as a non-statistician user of R, I think a hook facility at strategic places could provide some flexibility without too much pain.

I think replacing the standard output from summary.lm would be a bad idea. It could easily create errors downstream, when idiots like myself ask "why don't I get the s.e. that stata produces?" (duh: you loaded a heteroskedasticity adjustment, but forgot about it). But I think some flexibility to add more information would be a very good thing.

Hooks that can be set by functions (perhaps in cascades) would allow third parties to create additional statistics that could survive future changes to the functions themselves, without requiring a full object paradigm. For example, summary.lm could provide two hooks that allow programmers to chain their own objects to either ans$coefficients or the ans object. (I guess even one hook would do.) Well-thought-out hooks could also add to print methods, etc., without requiring complete function rewrites, and would survive future changes in the real R code itself.

From the perspective of a first-time amateurish end-user, an invocation of "library(lm.addnormalized)" could then magically always add a normalized coefficient to the coefficient output. An invocation of "library(lm.addheteroskedasticity)" could magically always add heteroskedasticity se's and T-stats. And so on.

As I said, I don't know what I am talking about. I am really a non-statistician end-user, who is really a bit over his head with all of this. I am using R not because it is sensible given my needs and abilities, but because I am in awe of many of its capabilities (and because I enjoy Brian berating me while he offers me usually desperately needed help ;-) ).

Regards,

/ivo
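The hook idea described above might look roughly like the following. Everything here is hypothetical: R's summary.lm has no such hook mechanism, so the sketch wraps summary() in a separate function, and the names `add_summary_hook` and `summary_with_hooks` are invented for illustration.

```r
# Hypothetical hook registry; none of these names exist in R itself.
.summary_hooks <- new.env()

add_summary_hook <- function(name, fn) {
  assign(name, fn, envir = .summary_hooks)
}

# Wrapper standing in for a hook-aware summary.lm: compute the usual
# summary, then let each registered hook add to it in turn.
summary_with_hooks <- function(object, ...) {
  ans <- summary(object, ...)
  for (name in ls(.summary_hooks)) {
    hook <- get(name, envir = .summary_hooks)
    ans <- hook(ans, object)
  }
  ans
}

# Example hook: append standardized ("normalized") coefficients as an
# extra column, leaving the standard columns untouched.
# Assumes no aliased (NA) coefficients in the fit.
add_summary_hook("normalized", function(ans, object) {
  X <- model.matrix(object)
  y <- model.response(model.frame(object))
  beta_std <- coef(object) * apply(X, 2, sd) / sd(y)
  ans$coefficients <- cbind(ans$coefficients, "Std. Coef" = beta_std)
  ans
})

fit <- lm(dist ~ speed, data = cars)
summary_with_hooks(fit)$coefficients
```

The hook only appends a column, so the usual estimates, standard errors and t statistics are still reported as before, which matches the concern about not silently replacing the standard output.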