I want to perform a multiple regression in R and make predictions based on the trained model. Below is the example code I am using:

    price = c(10,18,18,11,17)
    predictors = cbind(c(5,6,3,4,5),c(2,1,8,5,6))
    predict(lm(price ~ predictors), data.frame(predictors=matrix(c(3,5),nrow=1)))

So, based on the 2-variate regression model trained on 5 samples, I want to make a prediction for the test data point where the first variate is 3 and the second variate is 5. But I get a warning from the above code saying that 'newdata' had 1 rows but variable(s) found have 5 rows. How can I correct the above code?

The code below works fine, where I give the variables separately in the model formula. But since I will have hundreds of variates, I have to give them in a matrix, since it would be infeasible to append hundreds of columns using the + sign.

    price = c(10,18,18,11,17)
    predictor1 = c(5,6,3,4,5)
    predictor2 = c(2,1,8,5,6)
    predict(lm(price ~ predictor1 + predictor2), data.frame(predictor1=3,predictor2=5))

Thanks in advance!

-- 
-safiye
Solved! Here is the solution in case it helps others:

The easiest way to get past the issue of matching up variable names from a matrix of covariates to the column names of the newdata data.frame is to put your input data into a data.frame as well. Try this:

    price = c(10,18,18,11,17)
    predictors = cbind(c(5,6,3,4,5),c(2,1,8,5,6))
    indata <- data.frame(price, predictors=predictors)
    predict(lm(price ~ ., indata), data.frame(predictors=matrix(c(3,5),nrow=1)))

Here we combine price and predictors into a data.frame so that its columns are named the same way as those of the newdata data.frame. We use the . in the formula to mean "all other columns", so we don't have to specify them explicitly.

On 29 May 2014 13:38, Safiye Celik <safisce@gmail.com> wrote:
> I want to perform a multiple regression in R and make predictions based
> on the trained model. [...]

-- 
-safiye
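For anyone checking why the names line up, here is a minimal sketch on the same toy data, using base R only. The object names fit, newdata and pred are introduced here for illustration. When a matrix without column names is passed as predictors=, data.frame() expands it into columns predictors.1 and predictors.2, so the training and prediction data frames end up with matching names.

```r
price <- c(10, 18, 18, 11, 17)
predictors <- cbind(c(5, 6, 3, 4, 5), c(2, 1, 8, 5, 6))

# Training data: the matrix is split into predictors.1 and predictors.2
indata <- data.frame(price, predictors = predictors)
names(indata)

# New data built the same way gets the same predictor column names
newdata <- data.frame(predictors = matrix(c(3, 5), nrow = 1))
names(newdata)

# "." stands for every column of indata other than the response
fit <- lm(price ~ ., data = indata)
pred <- predict(fit, newdata)
length(pred)  # one prediction for the single new row
```

Comparing names(indata) and names(newdata) before calling predict() is a cheap way to catch a mismatch early.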
Hi,

I'd do it like this, making use of data frames and the data argument to lm.

    traindata <- data.frame(price=price, predictor1=predictor1, predictor2=predictor2)
    testdata <- data.frame(predictor1=3, predictor2=5)
    predict(lm(price ~ ., data=traindata), testdata)

Note that you don't have to specify all the predictors individually: the . in the formula takes care of that.

Note also that ?predict.lm states that newdata should be a data frame; I suspect trying to use a matrix is the cause of your problem.

Sarah

On Thu, May 29, 2014 at 4:38 PM, Safiye Celik <safisce at gmail.com> wrote:
> I want to perform a multiple regression in R and make predictions based on
> the trained model. [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Sarah Goslee
http://www.functionaldiversity.org
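A note on the likely cause, offered as a sketch rather than a definitive diagnosis. In lm(price ~ predictors) the only term is the matrix predictors, while data.frame(predictors=matrix(...)) has columns predictors.1 and predictors.2 and no variable called predictors. predict() therefore falls back to the 5-row matrix visible from the formula's environment, which matches the warning. If the matrix form of the formula has to stay, wrapping the new matrix in I() keeps it as a single column named predictors. The names fit, bad, good and newdata are illustrative only.

```r
price <- c(10, 18, 18, 11, 17)
predictors <- cbind(c(5, 6, 3, 4, 5), c(2, 1, 8, 5, 6))
fit <- lm(price ~ predictors)

# Original call: newdata has no column named "predictors", so the
# 5-row training matrix is picked up instead (R warns about this).
bad <- suppressWarnings(
  predict(fit, data.frame(predictors = matrix(c(3, 5), nrow = 1)))
)
length(bad)

# I() stops data.frame() from splitting the matrix into separate columns,
# so newdata has a single matrix-valued column called "predictors".
newdata <- data.frame(predictors = I(matrix(c(3, 5), nrow = 1)))
names(newdata)

good <- predict(fit, newdata)
length(good)
```

The data-frame approach with price ~ . is usually simpler; the I() variant is only worth considering when the formula must refer to a matrix.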
Hello,

lm() is designed to work with data.frames, not with matrices. You can change your code to something like

    dat <- data.frame(price, pred1 = c(5,6,3,4,5), pred2 = c(2,1,8,5,6))
    fit <- lm(price ~ pred1 + pred2, data = dat)

and then use the fitted model to make predictions. You don't have to give the new values in a matrix; you can give them as vectors in a data.frame.

    predict(fit, data.frame(pred1 = 1:3, pred2 = 3:5))

Hope this helps,

Rui Barradas

On 29-05-2014 21:38, Safiye Celik wrote:
> I want to perform a multiple regression in R and make predictions based on
> the trained model. [...]
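Since the original question mentions hundreds of variates, here is a rough sketch of how the data.frame approach might scale. The data are simulated and purely illustrative, and the names X, newX, train and the pred1..pred100 column names are assumptions made for this example, not part of the thread. The idea is to give the matrix column names once, convert it to a data.frame, and let the . in the formula pick up every predictor.

```r
set.seed(1)                              # simulated data, for illustration only
n <- 300                                 # number of training samples
p <- 100                                 # number of predictors

X <- matrix(rnorm(n * p), nrow = n)
colnames(X) <- paste0("pred", seq_len(p))
price <- rnorm(n)

# One data.frame holding the response and all named predictor columns
train <- data.frame(price = price, X)
fit <- lm(price ~ ., data = train)       # no need to type pred1 + pred2 + ...

# New observations need the same column names as the training predictors
newX <- matrix(rnorm(2 * p), nrow = 2)
colnames(newX) <- colnames(X)
pred <- predict(fit, as.data.frame(newX))
length(pred)                             # one prediction per new row
```

Note that ordinary least squares needs more samples than predictors for all coefficients to be estimable, which is why n is larger than p here.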