baconbeach
2012-Apr-30 23:22 UTC
[R] 95% confidence interval of the coefficients from a bootstrap analysis
Hello,
I am doing a simple linear regression analysis that includes few variables.
I am using a bootstrap analysis to obtain the variation of my variables to
replacement.
I am trying to obtain the coefficients 95% confidence interval from the
bootstrap procedure.
Here is my script for the bootstrap:
N = length (data_Pb[,1])
B = 10000
stor.r2 = rep(0,B)
stor.r2 = rep(0,B)
stor.inter = rep(0,B)
stor.Ind5 = rep(0,B)
stor.LNPRI25 = rep(0,B)
stor.NPRI10 = rep(0,B)
stor.Mine = rep(0,B)
for (i in 1:B){
idx = sample(1:N, replace=T)
newdata = data_Pb[idx,]
L_NPRI_25k <- log(newdata$NPRI_25k+1)
data_Pb.boot = lm(newdata$Log_Level ~
newdata$Ind_5k + newdata$MineP_50k +
newdata$NPRI_10k + L_NPRI_25k )
stor.r2[i] = summary(data_Pb.boot)$r.squared
stor.inter[i] = summary(data_Pb.boot)$coefficients[1,1]
stor.Ind5[i] = summary(data_Pb.boot)$coefficients[2,1]
stor.LNPRI25 [i] = summary(data_Pb.boot)$coefficients[3,1]
stor.NPRI10[i] = summary(data_Pb.boot)$coefficients[4,1]
stor.Mine [i] = summary(data_Pb.boot)$coefficients[5,1]
}
hist(stor.r2, xlab="R-squared",main="Distribution of R-squared -
Lead
(log)")
hist(stor.inter, xlab="Intercept",main="Distribution of Intercept
- Lead
(log)")
hist(stor.Ind5, xlab="Industrial 5 km",main="Distribution of
Industrial 5 km
- Lead (log)")
hist(stor.LNPRI25, xlab="NPRI 25 km (log)",main="Distribution of
NPRI 25 km
- Lead (log)")
hist(stor.NPRI10, xlab="NPRI 10 km",main="Distribution of NPRI 10
km - Lead
(log)")
hist(stor.Mine, xlab="Past Mines 50 km",main="Distribution of
Past Mines 50
km - Lead (log)")
summary(stor.r2)
summary(stor.inter)
summary(stor.Ind5)
summary(stor.LNPRI25)
summary(stor.NPRI10)
summary(stor.Mine)
###Valid only for the last calculated model of the bootstrap analysis ???
confint(data_Pb.boot)
Can anybody show me the best way to get the 95% confidence interval of the
coefficients?
Thank you
Steeve
--
View this message in context:
http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692.html
Sent from the R help mailing list archive at Nabble.com.
Rui Barradas
2012-Apr-30 23:54 UTC
[R] 95% confidence interval of the coefficients from a bootstrap analysis
Hello, baconbeach wrote> > Hello, > > I am doing a simple linear regression analysis that includes few > variables. I am using a bootstrap analysis to obtain the variation of my > variables to replacement. > > I am trying to obtain the coefficients 95% confidence interval from the > bootstrap procedure. > > Here is my script for the bootstrap: > > > > > N = length (data_Pb[,1]) > B = 10000 > > > stor.r2 = rep(0,B) > stor.r2 = rep(0,B) > stor.inter = rep(0,B) > stor.Ind5 = rep(0,B) > stor.LNPRI25 = rep(0,B) > stor.NPRI10 = rep(0,B) > stor.Mine = rep(0,B) > > for (i in 1:B){ > idx = sample(1:N, replace=T) > newdata = data_Pb[idx,] > > > L_NPRI_25k <- log(newdata$NPRI_25k+1) > > data_Pb.boot = lm(newdata$Log_Level ~ > > newdata$Ind_5k + newdata$MineP_50k + > newdata$NPRI_10k + L_NPRI_25k ) > > stor.r2[i] = summary(data_Pb.boot)$r.squared > stor.inter[i] = summary(data_Pb.boot)$coefficients[1,1] > stor.Ind5[i] = summary(data_Pb.boot)$coefficients[2,1] > stor.LNPRI25 [i] = summary(data_Pb.boot)$coefficients[3,1] > stor.NPRI10[i] = summary(data_Pb.boot)$coefficients[4,1] > stor.Mine [i] = summary(data_Pb.boot)$coefficients[5,1] > } > > > hist(stor.r2, xlab="R-squared",main="Distribution of R-squared - Lead > (log)") > hist(stor.inter, xlab="Intercept",main="Distribution of Intercept - Lead > (log)") > hist(stor.Ind5, xlab="Industrial 5 km",main="Distribution of Industrial 5 > km - Lead (log)") > hist(stor.LNPRI25, xlab="NPRI 25 km (log)",main="Distribution of NPRI 25 > km - Lead (log)") > hist(stor.NPRI10, xlab="NPRI 10 km",main="Distribution of NPRI 10 km - > Lead (log)") > hist(stor.Mine, xlab="Past Mines 50 km",main="Distribution of Past Mines > 50 km - Lead (log)") > > > summary(stor.r2) > summary(stor.inter) > summary(stor.Ind5) > summary(stor.LNPRI25) > summary(stor.NPRI10) > summary(stor.Mine) > > > ###Valid only for the last calculated model of the bootstrap analysis ??? > > confint(data_Pb.boot) > > > Can anybody show me the best way to get the 95% confidence interval of the > coefficients? > > Thank you > > Steeve >I think you're complicating a bit, you could save r2 in a vector and the coefficients in a list (see (*) below), like the one that follows: After creating your stor vars, and before the bootstarp loop, put this stor.confint <- vector("list", B) Then, inside the loop, at it's end, stor.confint[[i]] <- confint(data_Pb.boot) This creates a list of matrices. (*) The same for the coefficients, create a list first then in the loop use function coef() stor.coeffs <- vector("list", B) stor.coeffs <- coef(data_Pb.boot) Hope this helps, Rui Barradas -- View this message in context: http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4599734.html Sent from the R help mailing list archive at Nabble.com.
baconbeach
2012-May-01 00:39 UTC
[R] 95% confidence interval of the coefficients from a bootstrap analysis
Hi Rui, Thanks for your help!! It works perfectly, but when I call stor.confint, I obtain the list of confidence interval (10,000 times). Is there an easy way to summarize the results and getting only one 2.5% and 97.5% values for each variable? Thanks again for your help Steeve -- View this message in context: http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4599813.html Sent from the R help mailing list archive at Nabble.com.
Rui Barradas
2012-May-01 00:47 UTC
[R] 95% confidence interval of the coefficients from a bootstrap analysis
Hello, baconbeach wrote> > Hi Rui, > > Thanks for your help!! > > It works perfectly, but when I call stor.confint, I obtain the list of > confidence interval (10,000 times). Is there an easy way to summarize the > results and getting only one 2.5% and 97.5% values for each variable? > > Thanks again for your help > > Steeve >Yes, there is. rowMeans(sapply(stor.confint, colMeans)) Rui Barradas -- View this message in context: http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4599823.html Sent from the R help mailing list archive at Nabble.com.