Dear R-Users

My problem is quite simple: I need to use a fitted model to predict the next point (that is, just one single point in a curve).

The data was divided into two parts: identification (x and y - class matrix) and validation (xt and yt - class matrix). I don't use all values in x and y but only the 10 nearest points (x[b,] and y[b,]) for each regression (b is a vector with the indexes).

> fit.a=lm(y[b] ~ x[b,],method="qr")
> summary(fit.a)

Call:
lm(formula = y[b] ~ x[b, ], method = "qr")

Residuals:
         1          2          3          4          5          6          7
 6.939e-18 -2.393e-02 -3.912e-02  1.344e-02 -5.926e-02 -1.821e-02 -4.075e-02
         8          9         10
 4.075e-02  3.938e-02  8.769e-02

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.852793   0.842459   2.199    0.115
x[b, ]1     -0.086324   0.056841  -1.519    0.226
x[b, ]2      0.001114   0.001666   0.668    0.552
x[b, ]3      0.002501   0.004376   0.571    0.608
x[b, ]4     -0.003589   0.009041  -0.397    0.718
x[b, ]5     -0.276498   0.119545  -2.313    0.104
x[b, ]6     -0.003010   0.003574  -0.842    0.462

Residual standard error: 0.07893 on 3 degrees of freedom
Multiple R-squared: 0.7932, Adjusted R-squared: 0.3795
F-statistic: 1.917 on 6 and 3 DF, p-value: 0.3169

Once the fitted model is found (it does not matter how bad it is - the variables are poorly correlated), I need to predict the next value.

> predict(fit.a,xt[j,])
Error in eval(predvars, data, env) :
  numeric 'envir' arg not of length one

It seems that predict needs a dataframe class but even if I change xt[b,] to as.data.frame(xt[b,]) the result is not what I expect.

> predict(fit.a,as.data.frame(t(xt[j,])))
        1         2         3         4         5         6         7         8
0.7834919 0.8243357 0.7780093 0.7810451 0.8084342 0.8057823 1.0304123 0.9729126
        9        10
0.7708979 0.8464298
Warning message:
'newdata' had 1 rows but variable(s) found have 10 rows

My feeling is that I did not understand how to enter the formula in lm in the first place.

Many thanks

Ed
Eduardo;

I think you would be more successful if you put your data in a dataframe, offered it to lm with column names only in the formula, and then used the newdata argument with predict with the column names matching the column names in the original data.

--
David.

On Aug 15, 2011, at 5:46 AM, Eduardo M. A. M. Mendes wrote:
> Dear R-Users
>
> My problem is quite simple: I need to use a fitted model to predict
> the next point (that is, just one single point in a curve).
> [...]

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
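A minimal sketch of the data-frame approach described above, not the poster's actual code. The matrices are simulated stand-ins, and the column names x1..x6, the index vector b and the row j are assumptions made for illustration. The reason the original call misbehaves is that the formula y[b] ~ x[b,] refers to the objects x and b rather than to column names, so predict finds nothing to match in newdata and falls back to the 10 training rows, which is what the warning reports.

```r
# Simulated stand-ins for the poster's matrices (assumption: 6 predictors).
set.seed(1)
x  <- matrix(rnorm(50 * 6), ncol = 6)   # identification inputs
y  <- rnorm(50)                         # identification response
xt <- matrix(rnorm(20 * 6), ncol = 6)   # validation inputs
b  <- 1:10                              # hypothetical indexes of the 10 nearest points
j  <- 3                                 # hypothetical validation row to predict

# Name the columns so the formula can refer to them by name.
colnames(x)  <- paste0("x", 1:6)
colnames(xt) <- colnames(x)

# Build the local training set as a data frame and fit on column names only.
train <- data.frame(y = y[b], x[b, , drop = FALSE])
fit.a <- lm(y ~ ., data = train)

# drop = FALSE keeps xt[j, ] as a 1-row matrix, so it becomes a 1-row
# data frame whose column names match those used in the fit.
new_point <- as.data.frame(xt[j, , drop = FALSE])
pred <- predict(fit.a, newdata = new_point)
print(pred)
```

With matching column names, predict returns a single value for the single row in newdata instead of the ten fitted values.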
On 11-08-15 12:21 PM, David Winsemius wrote:
> Eduardo;
>
> I think you would be more successful if you put your data in a
> dataframe, offered it to lm with column names only in the formula,
> and then used the newdata argument with predict with the column names
> matching the column names in the original data.

I agree. You can use the "subset" argument to lm to choose your local window.

Duncan Murdoch
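A short sketch of the "subset" suggestion, under the same kind of assumptions: the data frames dat and dat_valid, the column names x1..x6, the index vector b and the row j are all hypothetical stand-ins. The idea is to keep the whole identification set in one data frame and let lm pick the local window, rather than indexing the matrices inside the formula.

```r
set.seed(1)
p <- 6
col_names <- paste0("x", seq_len(p))

# Full identification set in one data frame (simulated stand-in).
dat <- data.frame(
  y = rnorm(50),
  matrix(rnorm(50 * p), ncol = p, dimnames = list(NULL, col_names))
)

# Validation inputs with the same column names (simulated stand-in).
dat_valid <- data.frame(
  matrix(rnorm(20 * p), ncol = p, dimnames = list(NULL, col_names))
)

b <- c(2, 5, 7, 11, 13, 17, 19, 23, 29, 31)  # hypothetical nearest-point indexes
j <- 3                                       # hypothetical validation row

# subset = b restricts the fit to the local window without changing the formula.
fit.b <- lm(y ~ ., data = dat, subset = b)

# One row of newdata in, one prediction out.
pred_b <- predict(fit.b, newdata = dat_valid[j, , drop = FALSE])
print(pred_b)
```

Because the formula stays the same for every window, only b needs to change from one regression to the next.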
Hi there

Many thanks. I will try to follow what you two said. Meanwhile I use the dirty solution that I have just come across.

Cheers

Ed

-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: Monday, August 15, 2011 1:36 PM
To: David Winsemius
Cc: Eduardo M. A. M. Mendes; r-help at r-project.org
Subject: Re: [R] Help on how to use predict