thr3ads.net - R help - [R] lm on matrix data [Oct 2012]

Baoqiang Cao

2012-Oct-10 14:35 UTC

[R] lm on matrix data

Hi,

I have a question about using lm on a matrix. I have to admit it is very
trivial, but I couldn't find the answer after searching the mailing
list and other online tutorials. It would be great if you could help.

I have a matrix "trainx" of 492 (rows) by 220 (columns) that is my x,
and trainy is 492 by 1. Also, I have the new data testx, which is 240
(rows) by 220 (columns). Here is what I got:

py <- predict(lm(trainy ~ trainx ), data.frame(testx))
Warning message:
'newdata' had 240 rows but variable(s) found have 492 rows

The fitting formula I intended is: trainy ~ trainx[,1] + trainx[,2] +
.. +trainx[,220].

Any help, please?

Best,
Baoqiang

R. Michael Weylandt

2012-Oct-10 22:33 UTC

On Wed, Oct 10, 2012 at 3:35 PM, Baoqiang Cao <bqcaomail at gmail.com> wrote:
> Hi,
>
> I have a question about using lm on a matrix. I have to admit it is very
> trivial, but I couldn't find the answer after searching the mailing
> list and other online tutorials. It would be great if you could help.
>
> I have a matrix "trainx" of 492 (rows) by 220 (columns) that is my x,
> and trainy is 492 by 1. Also, I have the new data testx, which is 240
> (rows) by 220 (columns). Here is what I got:
>
> py <- predict(lm(trainy ~ trainx ), data.frame(testx))
> Warning message:
> 'newdata' had 240 rows but variable(s) found have 492 rows
>
> The fitting formula I intended is: trainy ~ trainx[,1] + trainx[,2] +
> .. +trainx[,220].
>
I think you want a formula like

trainy ~ .

meaning "trainy" explained by everything else in the data frame you pass
as data=. (Admittedly, I think any model with 220 regressors is going to
be absolutely terrible, but that's a different email.)
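
To make the dot notation concrete, here is a minimal sketch on a made-up toy data frame. The data frame "d" and its column names a, b, c are assumptions for illustration only, not the real data from the question.

```r
set.seed(2)

# Hypothetical toy data: one response column and three predictor columns
d <- data.frame(trainy = rnorm(10), a = rnorm(10), b = rnorm(10), c = rnorm(10))

# The dot only has a meaning when a data= argument is supplied
fit <- lm(trainy ~ ., data = d)

# The stored terms show how the dot was expanded against the columns of d
term_labels <- attr(terms(fit), "term.labels")
coef_names  <- names(coef(fit))
print(term_labels)
print(coef_names)
```

The dot is expanded once, at fitting time, against whatever columns the data frame has besides the response.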

What I think is happening here is that predict() looks for "trainx" as a
column name in the new data you provide, can't find it, and then falls
back to the "trainx" matrix in your workspace, which has 492 rows rather
than the 240 you need. Take a look at ?formula for more on how to use
formula notation properly.
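
A small sketch of that lookup behavior, using made-up matrices of 20 and 8 rows as stand-ins for the real 492-row and 240-row data (the sizes and the seed are assumptions for illustration):

```r
set.seed(1)

# Hypothetical small stand-ins for the training and test matrices
trainx <- matrix(rnorm(20 * 3), ncol = 3)
trainy <- matrix(rnorm(20), ncol = 1)
testx  <- matrix(rnorm(8 * 3), ncol = 3)

fit <- lm(trainy ~ trainx)

# The model knows about one matrix-valued variable called "trainx"
model_vars <- all.vars(formula(fit))

# An unnamed matrix turned into a data frame gets columns X1, X2, X3,
# so nothing in the new data is called "trainx"
new_names <- names(data.frame(testx))

# predict() warns and silently reuses the training matrix instead
py <- suppressWarnings(predict(fit, data.frame(testx)))

print(model_vars)
print(new_names)
print(length(py))                                        # training rows, not test rows
print(all.equal(unname(py), unname(fitted(fit))))        # same as the fitted values
```

So the numbers that come back are just the fitted values for the training rows, not predictions for testx.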

Cheers,
Michael

Jean V Adams

2012-Oct-11 13:30 UTC

Baoqiang,

Here's an approach that should work:
(1) Make sure that the column names of trainx and testx are the same.
(2) Combine trainy and trainx into a data frame for fitting the model.
(3) Convert testx from matrix to data frame.
(4) Use the newdata= argument in the predict() function.

# some example data
# (names chosen so they do not mask base nrow(), ncol(), colnames())
n_train <- 5
n_test <- 8
n_pred <- 3
xnames <- paste0("x", seq_len(n_pred))
trainx <- matrix(rnorm(n_train * n_pred), ncol = n_pred,
                 dimnames = list(NULL, xnames))
trainy <- matrix(rnorm(n_train), ncol = 1, dimnames = list(NULL, "y"))
testx <- matrix(rnorm(n_test * n_pred), ncol = n_pred,
                dimnames = list(NULL, xnames))

# create data frames for model fitting and prediction
traindf <- data.frame(cbind(trainy, trainx))
testdf <- data.frame(testx)

# fit the model and make predictions for new data
fit <- lm(y ~ ., data = traindf)
py <- predict(fit, newdata = testdf)
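
If the real matrices carry no column names at all, step (1) could be handled by assigning one shared set of names before building the data frames. A rough sketch, where the sizes and the "x1", "x2", ... naming scheme are assumptions for illustration:

```r
set.seed(4)

# Hypothetical unnamed matrices standing in for the real data
trainx <- matrix(rnorm(20 * 3), ncol = 3)
testx  <- matrix(rnorm(8 * 3), ncol = 3)
trainy <- rnorm(20)

# Give both matrices the same column names and check that they agree
xnames <- paste0("x", seq_len(ncol(trainx)))
colnames(trainx) <- xnames
colnames(testx)  <- xnames
stopifnot(identical(colnames(trainx), colnames(testx)))

# Fit on the training data frame, predict on the test data frame
traindf <- data.frame(y = trainy, trainx)
fit <- lm(y ~ ., data = traindf)
py <- predict(fit, newdata = data.frame(testx))
print(length(py))   # one prediction per row of testx
```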

Note that the model you fit to the two matrices worked just fine
        lm(trainy ~ trainx)
but it stores the predictors as one matrix variable named trainx, with
coefficients named
        trainxx1, trainxx2, etc.
which makes it inconvenient to predict on new data.
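
For completeness, one possible alternative that keeps the matrix formula is to hand predict() a data frame whose single column is itself a matrix named trainx, wrapped in I(). This is only a sketch on made-up data (the sizes, seed, and the hand-computed check are assumptions added for illustration):

```r
set.seed(3)

# Hypothetical small matrices with shared column names
xnames <- c("x1", "x2", "x3")
trainx <- matrix(rnorm(20 * 3), ncol = 3, dimnames = list(NULL, xnames))
testx  <- matrix(rnorm(8 * 3), ncol = 3, dimnames = list(NULL, xnames))
trainy <- rnorm(20)

fit <- lm(trainy ~ trainx)
coef_names <- names(coef(fit))
print(coef_names)

# I() keeps testx as one matrix-valued column called "trainx",
# which is the variable name the fitted model looks for
newdf <- data.frame(trainx = I(testx))
py <- predict(fit, newdata = newdf)

# Cross-check against multiplying by the coefficients by hand
manual <- drop(cbind(1, testx) %*% coef(fit))
print(length(py))
print(all.equal(unname(py), unname(manual)))
```

The data frame approach above is still easier to read, but this shows that the matrix fit itself can be reused once the new data carries the right variable name.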

Jean

 

Baoqiang Cao <bqcaomail@gmail.com> wrote on 10/10/2012 09:35:47 AM:
> 
> Hi,
> 
> I have a question about using lm on a matrix. I have to admit it is very
> trivial, but I couldn't find the answer after searching the mailing
> list and other online tutorials. It would be great if you could help.
> 
> I have a matrix "trainx" of 492 (rows) by 220 (columns) that is my x,
> and trainy is 492 by 1. Also, I have the new data testx, which is 240
> (rows) by 220 (columns). Here is what I got:
> 
> py <- predict(lm(trainy ~ trainx ), data.frame(testx))
> Warning message:
> 'newdata' had 240 rows but variable(s) found have 492 rows
> 
> The fitting formula I intended is: trainy ~ trainx[,1] + trainx[,2] +
> .. +trainx[,220].
> 
> Any help, please?
> 
> Best,
> Baoqiang
