cc super
2010-Jun-23 06:11 UTC
[R] Estimate of variance and prediction for multiple linear regression
Hi, everyone,

Night. I have three questions about multiple linear regression in R.

Q1:

y=rnorm(10,mean=5)
x1=rnorm(10,mean=2)
x2=rnorm(10)
lin=lm(y~x1+x2)
summary(lin)

## In the summary, 'Residual standard error: 1.017 on 7 degrees of freedom',
## is 1.017 the estimate of the constant variance?

Q2:

beta0=lin$coefficients[1]
beta1=lin$coefficients[2]
beta2=lin$coefficients[3]

y_hat=beta0+beta1*x1+beta2*x2

## Is there any built-in function in R to obtain y_hat directly?

Q3:

If I want to apply this regression result to another dataset, that is, new
x1 and x2, can the built-in function from Q2 still be used?

Thank you in advance!
Gavin Simpson
2010-Jun-23 07:57 UTC
[R] Estimate of variance and prediction for multiple linear regression
On Tue, 2010-06-22 at 23:11 -0700, cc super wrote:
> Hi, everyone,
>
> Night. I have three questions about multiple linear regression in R.
>
> Q1:
>
> y=rnorm(10,mean=5)
> x1=rnorm(10,mean=2)
> x2=rnorm(10)
> lin=lm(y~x1+x2)
> summary(lin)
>
> ## In the summary, 'Residual standard error: 1.017 on 7 degrees of freedom',
> ## is 1.017 the estimate of the constant variance?

Nearly: 1.017 is the estimate of sigma, the residual standard deviation.
The estimate of the constant variance is its square, 1.017^2 (about 1.034).

Just a note: for the above code to yield the same results as you quote,
it needs a call to set.seed() before the rnorm() calls to fix the
pseudo-random number generator.

> Q2:
>
> beta0=lin$coefficients[1]
> beta1=lin$coefficients[2]
> beta2=lin$coefficients[3]
>
> y_hat=beta0+beta1*x1+beta2*x2
>
> ## Is there any built-in function in R to obtain y_hat directly?

fitted(lin)

Note that there are quite a few standard extractor functions like
fitted() available for modelling functions in R. coef(), for example,
should be used to extract the coefficients, resid() will extract the
residuals, etc.

> Q3:
>
> If I want to apply this regression result to another dataset, that is, new
> x1 and x2, can the built-in function from Q2 still be used?

It is called predict() (although if you had called predict(lin) above
instead of fitted(lin), it would have produced the same answer: the
fitted values for the observations).

One gotcha that catches people out is that the variables in the new
dataset must have the same names as the variables used to fit the model.
So we could do:

pdat <- data.frame(x1 = rnorm(10, 2), x2 = rnorm(10))
predict(lin, pdat)

to get predictions at the new values of x1 and x2.

> Thank you in advance!

HTH

G

-- 
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
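
The three answers in this thread can be checked in one short, reproducible run. What follows is only a sketch: the seed value (42) and the data frame name `dat` are assumptions that do not appear in the thread, so the numbers will not match the 1.017 quoted above. It compares the residual standard error against a manual calculation, confirms that fitted() and predict() agree with the hand-built y_hat, and predicts at new values of x1 and x2.

```r
set.seed(42)  # arbitrary seed (an assumption), only to make the run repeatable

# Same simulation as in the question, collected in a data frame
dat <- data.frame(y  = rnorm(10, mean = 5),
                  x1 = rnorm(10, mean = 2),
                  x2 = rnorm(10))
lin <- lm(y ~ x1 + x2, data = dat)

# Q1: the residual standard error estimates sigma; its square estimates the variance
s        <- summary(lin)$sigma
s_manual <- sqrt(sum(resid(lin)^2) / df.residual(lin))  # sqrt(RSS / (n - p))
var_hat  <- s^2

# Q2: fitted() and predict() reproduce the hand-computed fitted values
b     <- coef(lin)
y_hat <- b[1] + b[2] * dat$x1 + b[3] * dat$x2
same_as_fitted  <- isTRUE(all.equal(unname(fitted(lin)), unname(y_hat)))
same_as_predict <- isTRUE(all.equal(fitted(lin), predict(lin)))

# Q3: predictions for new data; the column names must be x1 and x2
pdat <- data.frame(x1 = rnorm(10, 2), x2 = rnorm(10))
p    <- predict(lin, newdata = pdat)

print(c(sigma = s, sigma_manual = s_manual, variance = var_hat))
print(c(fitted_ok = same_as_fitted, predict_ok = same_as_predict))
print(p)
```

Fitting with data = dat rather than with loose vectors in the workspace makes the naming requirement for predict() explicit: the new data frame only has to supply columns named x1 and x2.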