I was surprised by this (in R 2.0.1):

> a <- ordered(-1:1)
> a
[1] -1 0  1
Levels: -1 < 0 < 1

> model.matrix(~ a)
  (Intercept)           a.L        a.Q
1           1 -7.071068e-01  0.4082483
2           1 -9.073800e-17 -0.8164966
3           1  7.071068e-01  0.4082483
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.poly"

> model.matrix(~ -1 + a)
  a-1 a0 a1
1   1  0  0
2   0  1  0
3   0  0  1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.poly"

Without the intercept, treatment contrasts seem to have been used (this
despite the "contr.poly" in the "contrasts" attribute).

It's not restricted to ordered factors. For example, if Helmert contrasts
are used for nominal factors, the same sort of thing happens.

I suppose it is a deliberate feature (perhaps to protect the user from
accidentally fitting models that make no sense? or maybe some better
reason?) -- is it explained somewhere?

David
Prof Brian Ripley
2005-Feb-23 15:45 UTC
[R] model.matrix for a factor effect with no intercept
MASS4 p.150
White Book p.38

Those are the only two reasonably comprehensive accounts that I am aware
of (and they have only partial overlap).

The underlying motivation is to span the _additional_ vector space covered
by the term, the complement to what has gone before. Put another way, as
each term is added, only enough columns are added to the model matrix to
span the same space as if dummy coding had been used for that term and its
predecessors. So think of this as a way to produce a parsimonious (usually
full-rank) basis for the model space.

On Wed, 23 Feb 2005, David Firth wrote:

> I was surprised by this (in R 2.0.1):
>
> > a <- ordered(-1:1)
> > a
> [1] -1 0  1
> Levels: -1 < 0 < 1
>
> > model.matrix(~ a)
>   (Intercept)           a.L        a.Q
> 1           1 -7.071068e-01  0.4082483
> 2           1 -9.073800e-17 -0.8164966
> 3           1  7.071068e-01  0.4082483
> attr(,"assign")
> [1] 0 1 1
> attr(,"contrasts")
> attr(,"contrasts")$a
> [1] "contr.poly"
>
> > model.matrix(~ -1 + a)
>   a-1 a0 a1
> 1   1  0  0
> 2   0  1  0
> 3   0  0  1
> attr(,"assign")
> [1] 1 1 1
> attr(,"contrasts")
> attr(,"contrasts")$a
> [1] "contr.poly"
>
> Without the intercept, treatment contrasts seem to have been used (this
> despite the "contr.poly" in the "contrasts" attribute).
>
> It's not restricted to ordered factors. For example, if Helmert contrasts
> are used for nominal factors, the same sort of thing happens.
>
> I suppose it is a deliberate feature (perhaps to protect the user from
> accidentally fitting models that make no sense? or maybe some better
> reason?) -- is it explained somewhere?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
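
The "span the additional space" explanation in the reply can be checked numerically. The following is only a rough sketch, not taken from either cited book. It reuses the factor `a` from the original post; the data frame `d`, the second factor `b`, and the names `X_int`, `X_noint` and `X2` are made up for illustration. It compares the column spaces of the two model matrices via their QR ranks, then fits a two-factor formula without an intercept to see which term receives full indicator coding.

```r
# Ordered factor from the original post.
a <- ordered(-1:1)

X_int   <- model.matrix(~ a)       # intercept plus contr.poly columns
X_noint <- model.matrix(~ -1 + a)  # no intercept: one indicator column per level

# If both matrices have the same rank, and placing them side by side does
# not raise that rank, then their columns span the same space.
rank_int   <- qr(X_int)$rank
rank_noint <- qr(X_noint)$rank
rank_both  <- qr(cbind(X_int, X_noint))$rank
print(c(rank_int, rank_noint, rank_both))

# Hypothetical two-factor layout (not in the original thread): with the
# intercept removed, the first term is expected to get indicator columns,
# while the later term only adds contrast columns for the extra space.
d  <- expand.grid(a = factor(1:3), b = factor(1:2))
X2 <- model.matrix(~ -1 + a + b, data = d,
                   contrasts.arg = list(a = "contr.helmert",
                                        b = "contr.helmert"))
print(X2)
print(qr(X2)$rank)  # compare with ncol(X2)
```

If the three ranks in the first part agree, the two codings of `a` are different bases for the same model space, which is consistent with the "contrasts" attribute still reporting "contr.poly" even though indicator columns were produced. In the second part, the columns for `a` should be 0/1 indicators despite the Helmert request, and only `b` should show Helmert-style coding.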