baconbeach
2012-Apr-30 23:22 UTC
[R] 95% confidence interval of the coefficients from a bootstrap analysis
Hello, I am doing a simple linear regression analysis that includes few variables. I am using a bootstrap analysis to obtain the variation of my variables to replacement. I am trying to obtain the coefficients 95% confidence interval from the bootstrap procedure. Here is my script for the bootstrap: N = length (data_Pb[,1]) B = 10000 stor.r2 = rep(0,B) stor.r2 = rep(0,B) stor.inter = rep(0,B) stor.Ind5 = rep(0,B) stor.LNPRI25 = rep(0,B) stor.NPRI10 = rep(0,B) stor.Mine = rep(0,B) for (i in 1:B){ idx = sample(1:N, replace=T) newdata = data_Pb[idx,] L_NPRI_25k <- log(newdata$NPRI_25k+1) data_Pb.boot = lm(newdata$Log_Level ~ newdata$Ind_5k + newdata$MineP_50k + newdata$NPRI_10k + L_NPRI_25k ) stor.r2[i] = summary(data_Pb.boot)$r.squared stor.inter[i] = summary(data_Pb.boot)$coefficients[1,1] stor.Ind5[i] = summary(data_Pb.boot)$coefficients[2,1] stor.LNPRI25 [i] = summary(data_Pb.boot)$coefficients[3,1] stor.NPRI10[i] = summary(data_Pb.boot)$coefficients[4,1] stor.Mine [i] = summary(data_Pb.boot)$coefficients[5,1] } hist(stor.r2, xlab="R-squared",main="Distribution of R-squared - Lead (log)") hist(stor.inter, xlab="Intercept",main="Distribution of Intercept - Lead (log)") hist(stor.Ind5, xlab="Industrial 5 km",main="Distribution of Industrial 5 km - Lead (log)") hist(stor.LNPRI25, xlab="NPRI 25 km (log)",main="Distribution of NPRI 25 km - Lead (log)") hist(stor.NPRI10, xlab="NPRI 10 km",main="Distribution of NPRI 10 km - Lead (log)") hist(stor.Mine, xlab="Past Mines 50 km",main="Distribution of Past Mines 50 km - Lead (log)") summary(stor.r2) summary(stor.inter) summary(stor.Ind5) summary(stor.LNPRI25) summary(stor.NPRI10) summary(stor.Mine) ###Valid only for the last calculated model of the bootstrap analysis ??? confint(data_Pb.boot) Can anybody show me the best way to get the 95% confidence interval of the coefficients? Thank you Steeve -- View this message in context: http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692.html Sent from the R help mailing list archive at Nabble.com.
Rui Barradas
2012-Apr-30 23:54 UTC
[R] 95% confidence interval of the coefficients from a bootstrap analysis
Hello, baconbeach wrote> > Hello, > > I am doing a simple linear regression analysis that includes few > variables. I am using a bootstrap analysis to obtain the variation of my > variables to replacement. > > I am trying to obtain the coefficients 95% confidence interval from the > bootstrap procedure. > > Here is my script for the bootstrap: > > > > > N = length (data_Pb[,1]) > B = 10000 > > > stor.r2 = rep(0,B) > stor.r2 = rep(0,B) > stor.inter = rep(0,B) > stor.Ind5 = rep(0,B) > stor.LNPRI25 = rep(0,B) > stor.NPRI10 = rep(0,B) > stor.Mine = rep(0,B) > > for (i in 1:B){ > idx = sample(1:N, replace=T) > newdata = data_Pb[idx,] > > > L_NPRI_25k <- log(newdata$NPRI_25k+1) > > data_Pb.boot = lm(newdata$Log_Level ~ > > newdata$Ind_5k + newdata$MineP_50k + > newdata$NPRI_10k + L_NPRI_25k ) > > stor.r2[i] = summary(data_Pb.boot)$r.squared > stor.inter[i] = summary(data_Pb.boot)$coefficients[1,1] > stor.Ind5[i] = summary(data_Pb.boot)$coefficients[2,1] > stor.LNPRI25 [i] = summary(data_Pb.boot)$coefficients[3,1] > stor.NPRI10[i] = summary(data_Pb.boot)$coefficients[4,1] > stor.Mine [i] = summary(data_Pb.boot)$coefficients[5,1] > } > > > hist(stor.r2, xlab="R-squared",main="Distribution of R-squared - Lead > (log)") > hist(stor.inter, xlab="Intercept",main="Distribution of Intercept - Lead > (log)") > hist(stor.Ind5, xlab="Industrial 5 km",main="Distribution of Industrial 5 > km - Lead (log)") > hist(stor.LNPRI25, xlab="NPRI 25 km (log)",main="Distribution of NPRI 25 > km - Lead (log)") > hist(stor.NPRI10, xlab="NPRI 10 km",main="Distribution of NPRI 10 km - > Lead (log)") > hist(stor.Mine, xlab="Past Mines 50 km",main="Distribution of Past Mines > 50 km - Lead (log)") > > > summary(stor.r2) > summary(stor.inter) > summary(stor.Ind5) > summary(stor.LNPRI25) > summary(stor.NPRI10) > summary(stor.Mine) > > > ###Valid only for the last calculated model of the bootstrap analysis ??? > > confint(data_Pb.boot) > > > Can anybody show me the best way to get the 95% confidence interval of the > coefficients? > > Thank you > > Steeve >I think you're complicating a bit, you could save r2 in a vector and the coefficients in a list (see (*) below), like the one that follows: After creating your stor vars, and before the bootstarp loop, put this stor.confint <- vector("list", B) Then, inside the loop, at it's end, stor.confint[[i]] <- confint(data_Pb.boot) This creates a list of matrices. (*) The same for the coefficients, create a list first then in the loop use function coef() stor.coeffs <- vector("list", B) stor.coeffs <- coef(data_Pb.boot) Hope this helps, Rui Barradas -- View this message in context: http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4599734.html Sent from the R help mailing list archive at Nabble.com.
baconbeach
2012-May-01 00:39 UTC
[R] 95% confidence interval of the coefficients from a bootstrap analysis
Hi Rui, Thanks for your help!! It works perfectly, but when I call stor.confint, I obtain the list of confidence interval (10,000 times). Is there an easy way to summarize the results and getting only one 2.5% and 97.5% values for each variable? Thanks again for your help Steeve -- View this message in context: http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4599813.html Sent from the R help mailing list archive at Nabble.com.
Rui Barradas
2012-May-01 00:47 UTC
[R] 95% confidence interval of the coefficients from a bootstrap analysis
Hello, baconbeach wrote> > Hi Rui, > > Thanks for your help!! > > It works perfectly, but when I call stor.confint, I obtain the list of > confidence interval (10,000 times). Is there an easy way to summarize the > results and getting only one 2.5% and 97.5% values for each variable? > > Thanks again for your help > > Steeve >Yes, there is. rowMeans(sapply(stor.confint, colMeans)) Rui Barradas -- View this message in context: http://r.789695.n4.nabble.com/95-confidence-interval-of-the-coefficients-from-a-bootstrap-analysis-tp4599692p4599823.html Sent from the R help mailing list archive at Nabble.com.