Steven McKinney
2007-May-17 18:41 UTC
[Rd] model.matrix bug? Nested factor yields singular design matrix.
Hi all, I believe this is a bug in the model.matrix function. I'd like a second opinion before filing a bug report. If I have a nested covariate B with multiple values for just one level of A, I can not get a non-singular design matrix out of model.matrix> df <- data.frame(A = factor(c("a", "a", "x", "x"), levels = c("x", "a")),+ B = factor(c("b", "x", "x", "x"), levels = c("x", "b")))> > dfA B 1 a b 2 a x 3 x x 4 x x So of course the full design matrix is singular, this is expected.> model.matrix(~ A * B, df)(Intercept) Aa Bb Aa:Bb 1 1 1 1 1 2 1 1 0 0 3 1 0 0 0 4 1 0 0 0 attr(,"assign") [1] 0 1 2 3 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" attr(,"contrasts")$B [1] "contr.treatment" I'd like to drop the B main effect column, but get the unexpected result of a column of zeroes.> model.matrix(~ A * B - B, df)(Intercept) Aa Ax:Bb Aa:Bb 1 1 1 0 1 2 1 1 0 0 3 1 0 0 0 4 1 0 0 0 attr(,"assign") [1] 0 1 2 2 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" attr(,"contrasts")$B [1] "contr.treatment"> >This does not happen in S-PLUS.> info()S info file C:\DOCUME~1\kilroy\LOCALS~1\Temp\S0107EB.tmp will be removed at session end $Sinfo: Enterprise Developer Version 7.0.6 for Microsoft Windows : 2005 SHOME: C:/PROGRAMFILES/INSIGHTFUL/splus70 prog.name: SPLUS.EXE load.date: Sun Dec 04 23:15:59 2005 date: Thu May 17 07:38:16 PDT 2007> options(contrasts = c("contr.treatment", "contr.poly")) > df <- data.frame(A = factor(c("a", "a", "x", "x"), levels = c("x", "a")),+ B = factor(c("b", "x", "x", "x"), levels = c("x", "b")))> model.matrix(~ A * B - B, df)(Intercept) A A:B 1 1 1 1 2 1 1 0 3 1 0 0 4 1 0 0>This is what I was expecting to get in R, but can not. Alternate specifications in R continue to yield a singular design matrix> > model.matrix(~ A/B, df)(Intercept) Aa Ax:Bb Aa:Bb 1 1 1 0 1 2 1 1 0 0 3 1 0 0 0 4 1 0 0 0 attr(,"assign") [1] 0 1 2 2 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" attr(,"contrasts")$B [1] "contr.treatment"> model.matrix(~ A + A:B, df)(Intercept) Aa Ax:Bb Aa:Bb 1 1 1 0 1 2 1 1 0 0 3 1 0 0 0 4 1 0 0 0 attr(,"assign") [1] 0 1 2 2 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" attr(,"contrasts")$B [1] "contr.treatment" Why is the Ax:Bb column being included? Have I missed a control parameter or some other way of specifying to model.matrix not to include this extra column? Any feedback appreciated. Best regards Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney at bccrc.ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada
Steven McKinney
2007-May-17 18:42 UTC
[Rd] model.matrix bug? Nested factor yields singular design matrix.
Apologies - I forgot the session info.> sessionInfo()R version 2.5.0 (2007-04-23) powerpc-apple-darwin8.9.1 locale: en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8 attached base packages: [1] "stats" "graphics" "grDevices" "utils" "datasets" "methods" "base" other attached packages: lme4 Matrix lattice "0.99875-0" "0.99875-1" "0.15-5" -----Original Message----- From: r-devel-bounces at r-project.org on behalf of Steven McKinney Sent: Thu 5/17/2007 11:41 AM To: r-devel at r-project.org Subject: [Rd] model.matrix bug? Nested factor yields singular design matrix. Hi all, I believe this is a bug in the model.matrix function. I'd like a second opinion before filing a bug report. If I have a nested covariate B with multiple values for just one level of A, I can not get a non-singular design matrix out of model.matrix> df <- data.frame(A = factor(c("a", "a", "x", "x"), levels = c("x", "a")),+ B = factor(c("b", "x", "x", "x"), levels = c("x", "b")))> > dfA B 1 a b 2 a x 3 x x 4 x x So of course the full design matrix is singular, this is expected.> model.matrix(~ A * B, df)(Intercept) Aa Bb Aa:Bb 1 1 1 1 1 2 1 1 0 0 3 1 0 0 0 4 1 0 0 0 attr(,"assign") [1] 0 1 2 3 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" attr(,"contrasts")$B [1] "contr.treatment" I'd like to drop the B main effect column, but get the unexpected result of a column of zeroes.> model.matrix(~ A * B - B, df)(Intercept) Aa Ax:Bb Aa:Bb 1 1 1 0 1 2 1 1 0 0 3 1 0 0 0 4 1 0 0 0 attr(,"assign") [1] 0 1 2 2 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" attr(,"contrasts")$B [1] "contr.treatment"> >This does not happen in S-PLUS.> info()S info file C:\DOCUME~1\kilroy\LOCALS~1\Temp\S0107EB.tmp will be removed at session end $Sinfo: Enterprise Developer Version 7.0.6 for Microsoft Windows : 2005 SHOME: C:/PROGRAMFILES/INSIGHTFUL/splus70 prog.name: SPLUS.EXE load.date: Sun Dec 04 23:15:59 2005 date: Thu May 17 07:38:16 PDT 2007> options(contrasts = c("contr.treatment", "contr.poly")) > df <- data.frame(A = factor(c("a", "a", "x", "x"), levels = c("x", "a")),+ B = factor(c("b", "x", "x", "x"), levels = c("x", "b")))> model.matrix(~ A * B - B, df)(Intercept) A A:B 1 1 1 1 2 1 1 0 3 1 0 0 4 1 0 0>This is what I was expecting to get in R, but can not. Alternate specifications in R continue to yield a singular design matrix> > model.matrix(~ A/B, df)(Intercept) Aa Ax:Bb Aa:Bb 1 1 1 0 1 2 1 1 0 0 3 1 0 0 0 4 1 0 0 0 attr(,"assign") [1] 0 1 2 2 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" attr(,"contrasts")$B [1] "contr.treatment"> model.matrix(~ A + A:B, df)(Intercept) Aa Ax:Bb Aa:Bb 1 1 1 0 1 2 1 1 0 0 3 1 0 0 0 4 1 0 0 0 attr(,"assign") [1] 0 1 2 2 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment" attr(,"contrasts")$B [1] "contr.treatment" Why is the Ax:Bb column being included? Have I missed a control parameter or some other way of specifying to model.matrix not to include this extra column? Any feedback appreciated. Best regards Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney at bccrc.ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada ______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel