thr3ads.net - R help - [R] How does predict() calculate prediction intervals? [Jan 2013]

If this information is useful, please help other people find it:
Share via:

Kurt Rinehart

2013-Jan-30 16:50 UTC

[R] How does predict() calculate prediction intervals?

For a given linear regression, I wish to find the 2-tailed t-dist
probability that Y-hat <= newly observed values. I generate prediction
intervals in predict() for plotting, but when I calculate my t-dist
probabilities, they don't agree. I have researched the issues with variance
of individual predictions and been advised to use the variance formula
below (in the code).

I presume my variance function differs from that used in predict(). Can
someone advise me as to why I cannot reproduce the results of predict()?

As a test, I calculated the prediction intervals about Y-hat by hand and
compared them to predict():

X <- log10(c(780,54,520,703,1356,1900,2500,741,1500,600))
Y <-
log10(c(0.037,0.036,0.026,0.0315,0.028,0.0099,0.00955,0.0405,0.019,0.0245))
dat <- data.frame(cbind(X, Y))
mod <- lm(Y ~ X, data = dat)
rm(X,Y)

## model predictions, y.hat, at the new X values
pred <- data.frame(predict.lm(mod,newdata = obs, interval =
"prediction",
level = 0.95))

# 3 subsequent observations
obs <- log10(data.frame(cbind(c(40, 200, 450), c(0.3, 0.06, .034))))
names(obs) <- c("X", "Y")

## Calculating t-dist, 2-tailed, 95% prediction intervals for new
observations
mse <- anova(mod)[2,3]
new.x <- obs$X - mean(dat$X)
sum.x2 <- sum((dat$X - mean(dat$X))^2)
y.hat <- pred$fit
var.y.hat <- mse*(1+new.x^2/sum.x2)

upr <- y.hat + qt(c(0.975), df = 8) * sqrt(var.y.hat)
lwr <- y.hat + qt(c(0.025), df = 8) * sqrt(var.y.hat)
hand <- data.frame(cbind(y.hat, lwr, upr))

#The limits are not the same
pred
hand

--
Thank you,
Kurt

	[[alternative HTML version deleted]]

Cade, Brian

2013-Jan-30 20:55 UTC

head link

[R] How does predict() calculate prediction intervals?

Kurt:  You missed including the term 1/n in your var.y.hat calculation.  Do
that and pred and hand  are the same.

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_cade@usgs.gov
tel:  970 226-9326



On Wed, Jan 30, 2013 at 9:50 AM, Kurt Rinehart
<kurt.rinehart@uvm.edu>wrote:
> For a given linear regression, I wish to find the 2-tailed t-dist
> probability that Y-hat <= newly observed values. I generate prediction
> intervals in predict() for plotting, but when I calculate my t-dist
> probabilities, they don't agree. I have researched the issues with
variance
> of individual predictions and been advised to use the variance formula
> below (in the code).
>
> I presume my variance function differs from that used in predict(). Can
> someone advise me as to why I cannot reproduce the results of predict()?
>
> As a test, I calculated the prediction intervals about Y-hat by hand and
> compared them to predict():
>
> X <- log10(c(780,54,520,703,1356,1900,2500,741,1500,600))
> Y <-
> log10(c(0.037,0.036,0.026,0.0315,0.028,0.0099,0.00955,0.0405,0.019,0.0245))
> dat <- data.frame(cbind(X, Y))
> mod <- lm(Y ~ X, data = dat)
> rm(X,Y)
>
> ## model predictions, y.hat, at the new X values
> pred <- data.frame(predict.lm(mod,newdata = obs, interval =
"prediction",
> level = 0.95))
>
> # 3 subsequent observations
> obs <- log10(data.frame(cbind(c(40, 200, 450), c(0.3, 0.06, .034))))
> names(obs) <- c("X", "Y")
>
> ## Calculating t-dist, 2-tailed, 95% prediction intervals for new
> observations
> mse <- anova(mod)[2,3]
> new.x <- obs$X - mean(dat$X)
> sum.x2 <- sum((dat$X - mean(dat$X))^2)
> y.hat <- pred$fit
> var.y.hat <- mse*(1+new.x^2/sum.x2)
>
> upr <- y.hat + qt(c(0.975), df = 8) * sqrt(var.y.hat)
> lwr <- y.hat + qt(c(0.025), df = 8) * sqrt(var.y.hat)
> hand <- data.frame(cbind(y.hat, lwr, upr))
>
> #The limits are not the same
> pred
> hand
>
> --
> Thank you,
> Kurt
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

Rolf Turner

2013-Jan-30 21:16 UTC

head link

[R] How does predict() calculate prediction intervals?

Just look at the code of predict.lm().  It is reasonably perspicuous.
In particular look at res.var.

     cheers,

         Rolf Turner

On 01/31/2013 05:50 AM, Kurt Rinehart wrote:> For a given linear regression, I wish to find the 2-tailed t-dist
> probability that Y-hat <= newly observed values. I generate prediction
> intervals in predict() for plotting, but when I calculate my t-dist
> probabilities, they don't agree. I have researched the issues with
variance
> of individual predictions and been advised to use the variance formula
> below (in the code).
>
> I presume my variance function differs from that used in predict(). Can
> someone advise me as to why I cannot reproduce the results of predict()?
>
> As a test, I calculated the prediction intervals about Y-hat by hand and
> compared them to predict():
>
> X <- log10(c(780,54,520,703,1356,1900,2500,741,1500,600))
> Y <-
> log10(c(0.037,0.036,0.026,0.0315,0.028,0.0099,0.00955,0.0405,0.019,0.0245))
> dat <- data.frame(cbind(X, Y))
> mod <- lm(Y ~ X, data = dat)
> rm(X,Y)
>
> ## model predictions, y.hat, at the new X values
> pred <- data.frame(predict.lm(mod,newdata = obs, interval =
"prediction",
> level = 0.95))
>
> # 3 subsequent observations
> obs <- log10(data.frame(cbind(c(40, 200, 450), c(0.3, 0.06, .034))))
> names(obs) <- c("X", "Y")
>
> ## Calculating t-dist, 2-tailed, 95% prediction intervals for new
> observations
> mse <- anova(mod)[2,3]
> new.x <- obs$X - mean(dat$X)
> sum.x2 <- sum((dat$X - mean(dat$X))^2)
> y.hat <- pred$fit
> var.y.hat <- mse*(1+new.x^2/sum.x2)
>
> upr <- y.hat + qt(c(0.975), df = 8) * sqrt(var.y.hat)
> lwr <- y.hat + qt(c(0.025), df = 8) * sqrt(var.y.hat)
> hand <- data.frame(cbind(y.hat, lwr, upr))
>
> #The limits are not the same
> pred
> hand
>
> --
> Thank you,
> Kurt
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>

Seemingly Similar Threads

Search for more reasonably related threads

R help - Jan 2013 - How does predict() calculate prediction intervals?

[R] How does predict() calculate prediction intervals?

[R] How does predict() calculate prediction intervals?

[R] How does predict() calculate prediction intervals?

Seemingly Similar Threads