thr3ads.net - R help - [R] Regression with factor having1 level [Mar 2016]


peter dalgaard

2016-Mar-11 22:07 UTC

[R] Regression with factor having1 level

> On 11 Mar 2016, at 17:56, David Winsemius <dwinsemius at comcast.net> wrote:
> 
>> 
>> On Mar 11, 2016, at 12:48 AM, peter dalgaard <pdalgd at gmail.com> wrote:
>> 
>> 
>>> On 11 Mar 2016, at 08:25, David Winsemius <dwinsemius at comcast.net> wrote:
>>>> 
>> ...
>>>>> dfrm <- data.frame(y=rnorm(10), x1=rnorm(10), x2=as.factor(TRUE), x3=rnorm(10))
>>>>> lm(y~x1+x2+x3, dfrm, na.action=na.exclude)
>>>> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>>>> contrasts can be applied
>>> 
>>> Yes, and the error appears to come from `model.matrix`:
>>> 
>>>> model.matrix(y~x1+factor(x2)+x3, dfrm)
>>> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>>> contrasts can be applied only to factors with 2 or more levels
>>> 
>> 
>> Actually not. The above is because you use an explicit factor(x2). The
>> actual smoking gun is this line in lm()
>> 
>> mf$drop.unused.levels <- TRUE
> 
> It's possible that modifying model.matrix to allow single level factors
> would then bump up against that check, but at the moment the traceback() from
> an error generated with data that has a single level factor and no call to
> factor in the formula still implicates code in model.matrix:

You're missing the point: model.matrix has a beef with 1-level factors, not
with 2-level factors of which one level happens to be absent, which is what this
thread was originally about. It is lm that, via model.frame with
drop.unused.levels=TRUE, converts the latter factors to the former.
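
The distinction drawn above can be reproduced with a small sketch. The data frame `d` and its values are assumptions made up for illustration, not data from the thread; the functions used (model.matrix, model.frame, lm) are the ones named in the discussion.

```r
# Hypothetical data: x2 is declared with levels A and B, but only A occurs.
set.seed(1)
d <- data.frame(
  y  = rnorm(4),
  x1 = rnorm(4),
  x2 = factor(rep("A", 4), levels = c("A", "B"))
)

# model.matrix() called on a formula builds its model frame without dropping
# unused levels, so x2 keeps both levels and yields an all-zero x2B column.
mm <- model.matrix(y ~ x1 + x2, data = d)
colnames(mm)
all(mm[, "x2B"] == 0)

# With drop.unused.levels = TRUE (the setting lm() passes on to model.frame),
# x2 is reduced to a single level.
mf <- model.frame(y ~ x1 + x2, data = d, drop.unused.levels = TRUE)
nlevels(mf$x2)

# lm() then fails in the contrasts step; capture the message instead of stopping.
msg <- tryCatch(
  {
    lm(y ~ x1 + x2, data = d)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
msg
```

In this sketch the same factor passes through model.matrix untouched but trips the contrasts check once lm has had its unused level dropped.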

-pd 

> 
>> dfrm <- data.frame(y=rnorm(10), x1=rnorm(10), x2=factor(TRUE), x3=rnorm(10))
>> lm(y~x1+x2+x3, dfrm)
> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
>  contrasts can be applied only to factors with 2 or more levels
>> traceback()
> 5: stop("contrasts can be applied only to factors with 2 or more levels")
> 4: `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]])
> 3: model.matrix.default(mt, mf, contrasts)
> 2: model.matrix(mt, mf, contrasts)
> 1: lm(y ~ x1 + x2 + x3, dfrm)
> 
> -- 
> David.
> 
>> 
>> which someone must have thought was a good idea at some point....
>> 
>> model.matrix itself is quite happy to leave factors alone and let
>> subsequent code sort out any singularities, e.g.
>> 
>>> model.matrix(y~x1+x2, data=df[1:2,])
>> (Intercept) x1 x2B
>> 1           1  1   0
>> 2           1  1   0
>> attr(,"assign")
>> [1] 0 1 2
>> attr(,"contrasts")
>> attr(,"contrasts")$x2
>> [1] "contr.treatment"
>> 
>> 
>> 
>>>> model.matrix(y~x1+x2+x3, dfrm)
>>> (Intercept)          x1 x2TRUE         x3
>>> 1            1  0.04887847      1 -0.4199628
>>> 2            1 -1.04786688      1  1.3947923
>>> 3            1 -0.34896007      1 -2.1873666
>>> 4            1 -0.08866061      1  0.1204129
>>> 5            1 -0.41111366      1 -1.6631057
>>> 6            1 -0.83449110      1  1.1631801
>>> 7            1 -0.67887823      1  0.3207544
>>> 8            1 -1.12206068      1  0.6012040
>>> 9            1  0.05116683      1  0.3598696
>>> 10           1  1.74413583      1  0.3608478
>>> attr(,"assign")
>>> [1] 0 1 2 3
>>> attr(,"contrasts")
>>> attr(,"contrasts")$x2
>>> [1] "contr.treatment"
>>> 
>>> -- 
>>> 
>>> David Winsemius
>>> Alameda, CA, USA
>>> 
>> 
>> -- 
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>> 
> 
> David Winsemius
> Alameda, CA, USA
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

David Winsemius

2016-Mar-11 22:48 UTC


[R] Regression with factor having1 level

> On Mar 11, 2016, at 2:07 PM, peter dalgaard <pdalgd at gmail.com> wrote:
> 
> 
> You're missing the point: model.matrix has a beef with 1-level factors,
> not with 2-level factors of which one level happens to be absent, which is what
> this thread was originally about. It is lm that via model.frame with
> drop.unused.levels=TRUE converts the latter factors to the former.
> 
I guess I did miss the point. Apologies for being obtuse. I thought that a
one-level factor would have been "aliased out" when model.matrix "realized"
that it was collinear with the intercept. (Further apologies for my projection
of cognitive capacities onto a machine.) Are you saying it remains desirable
that an error be thrown rather than reporting an NA for the coefficient and
issuing a warning?
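
For comparison, here is a sketch of the "aliased out" behaviour that lm() already gives a numeric column that is collinear with the intercept. The data and the column name `k` are assumptions for illustration only.

```r
# Hypothetical data with a constant numeric column k.
set.seed(42)
d <- data.frame(y = rnorm(10), x1 = rnorm(10))
d$k <- 1  # identical to the intercept column, hence perfectly collinear

# lm() fits the model anyway and marks the redundant coefficient as NA.
fit <- lm(y ~ x1 + k, data = d)
coef(fit)
is.na(coef(fit)[["k"]])
```

No error is raised in this numeric case; the redundant term simply has no estimate, which is the treatment being asked about for one-level factors.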

-- 
David.

David Winsemius
Alameda, CA, USA

peter dalgaard

2016-Mar-11 23:57 UTC


[R] Regression with factor having1 level

> On 11 Mar 2016, at 23:48, David Winsemius <dwinsemius at comcast.net> wrote:
> 
> 
> I guess I did miss the point. Apologies for being obtuse. I thought that a
> one-level factor would have been "aliased out" when model.matrix "realized"
> that it was collinear with the intercept. (Further apologies for my projection
> of cognitive capacities onto a machine.) Are you saying it remains desirable
> that an error be thrown rather than reporting an NA for the coefficient and
> issuing a warning?
> 
For the moment I was just analyzing where this came from. Intuitively I'd be
leaning in the opposite direction -- dropping factor levels automatically is
usually a bad thing.
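
Until the behaviour changes, one way to see the problem coming is to check the data before fitting. The helper below is a hypothetical sketch, not part of R or of this thread: `single_level_factors` is an invented name, and it only inspects the raw columns, so it will not catch levels that vanish after lm() removes rows with missing values.

```r
# Hypothetical helper: names of factor columns with fewer than two levels
# actually present in the data.
single_level_factors <- function(data) {
  is_bad <- vapply(
    data,
    function(col) is.factor(col) && nlevels(droplevels(col)) < 2L,
    logical(1)
  )
  names(data)[is_bad]
}

# Example with made-up data: f has levels A and B but only A occurs.
d <- data.frame(
  y = c(1.2, 0.7, 3.1, 2.4),
  f = factor(rep("A", 4), levels = c("A", "B")),
  g = factor(c("a", "b", "a", "b"))
)
bad <- single_level_factors(d)
bad
```

Any column it reports would hit the contrasts error in lm(), so it can be dropped from the formula or investigated first.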

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
