Hello, All:

What's the simplest way to convert a data.frame into a model.matrix?

One way is given by the following example, modified from the examples in help(model.matrix):

    dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
    ab <- model.matrix(~ a + b, dd)
    ab0 <- model.matrix(~., dd)
    all.equal(ab, ab0)

What do you think about replacing "model.matrix(~ a + b, dd)" in the current help(model.matrix) with this 3-line expansion?

I suggest this, because I spent a few hours today trying to convert a data.frame into a model.matrix before finding this.

Also, what do you think about adding something like the following to the stats package:

    model.matrix.data.frame <- function(object, ...){
        model.matrix(~., object, ...)
    }

And then extend the above example as follows:

    ab. <- model.matrix(dd)
    all.equal(ab, ab.)

Thanks,
Spencer Graves
Dear Spencer,

I don't think that the problem of "converting a data frame into a model matrix" is well-defined, because there isn't a unique mapping from one to the other.

In your example, you build the model matrix for the additive formula ~ a + b from the data frame containing a and b, using "treatment" contrasts, but there are other possible formulas (e.g., ~ a*b) and contrasts [e.g., model.matrix(~ a + b, dd, contrasts=list(a=contr.sum, b=contr.helmert))].

So I think that the current approach is sensible -- to require both a data frame and a formula.

Best,
John

> -----Original Message-----
> From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of Spencer Graves
> Sent: October 3, 2016 7:59 PM
> To: r-devel at r-project.org
> Subject: [Rd] suggested addition to model.matrix
>
> [...]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
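John's point can be checked directly. The following is a minimal sketch, reusing dd from the original post, that builds three different model matrices from the same data frame. The names mm_add, mm_int and mm_sum are illustrative and do not appear in the thread, and the full argument name contrasts.arg is used here instead of the partially matched contrasts.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))

# Additive formula with the default treatment contrasts:
# intercept + 2 columns for a + 3 columns for b
mm_add <- model.matrix(~ a + b, dd)

# Interaction formula: the same data frame, but 2 * 3 extra columns
mm_int <- model.matrix(~ a * b, dd)

# Same additive formula, different contrasts for each factor
mm_sum <- model.matrix(~ a + b, dd,
                       contrasts.arg = list(a = contr.sum, b = contr.helmert))

dim(mm_add)                         # 12 rows, 6 columns
dim(mm_int)                         # 12 rows, 12 columns
isTRUE(all.equal(mm_add, mm_sum))   # FALSE: same formula, different coding
```

All three are legitimate model matrices for dd, which is why a data frame alone does not determine one.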
In addition, there is a formula method for data.frame that assumes the first column is the dependent variable.

    > z <- data.frame(X1=1:6,X2=letters[1:3],Y=log(1:6))
    > formula(z)
    X1 ~ X2 + Y
    > colnames(model.matrix(formula(z), z))
    [1] "(Intercept)" "X2b" "X2c" "Y"

Spencer's request is that the default formula given to model.matrix have no dependent variable.

    > colnames(model.matrix(~., z))
    [1] "(Intercept)" "X1" "X2b" "X2c" "Y"

In my opinion, formula.data.frame is a mistake, but we don't need two incompatible mistakes.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Oct 3, 2016 at 9:46 PM, Fox, John <jfox at mcmaster.ca> wrote:
> [...]
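The incompatibility Bill describes can be made explicit by comparing the two conventions on the same data frame. This is a small sketch using his z; the names mm_resp and mm_all are illustrative only.

```r
z <- data.frame(X1 = 1:6, X2 = letters[1:3], Y = log(1:6))

# Existing convention: formula.data.frame treats the first column as the response
mm_resp <- model.matrix(formula(z), z)

# Proposed convention: every column is treated as a predictor
mm_all <- model.matrix(~ ., z)

# Column that appears under the proposal but not under formula(z)
setdiff(colnames(mm_all), colnames(mm_resp))   # "X1"
```

Under formula(z) the first column is consumed as the response, so it never reaches the design matrix; under ~ . it becomes a predictor column.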