Hi folks, I'm trying to estimate bias-corrected percentile (BCP) confidence intervals on a vector from a simple for loop used for resampling. I am attempting to follow steps in Manly, B. 1998. Randomization, bootstrap and monte carlo methods in biology. 2nd edition., p. 48. PDF of the approach/steps should be available here: https://wyocoopunit.box.com/s/9vm4vgmbx5h7um809bvg6u7wr392v6i9 If this is too statsy for the R list, I can try stackexchange or crossval, but the R list provides all kinds of great help so thought I'd try it first. I aware of boot::boot but am hoping to avoid it for the current analysis I am working on (the boot function became quite challenging, for me, for a few reasons). I cannot figure out where I'm going wrong but the estimates from my attempt at the BCP CI are different enough from other methods that I assume I'm doing something wrong. require(boot) data("mtcars") # 1) Bootstrap 95% CI for R-Squared via boot::boot # statmethods.net/advstats/bootstrapping.html # Function for boot rsq <- function(formula, data, indices) { d <- data[indices,] fit <- lm(formula, data=d) return(summary(fit)$r.square) } # bootstrapping with 1000 replications results <- boot(data=mtcars, statistic=rsq, R=1000, formula = mpg ~ wt + disp) # get 95% confidence interval rsq.ci <- boot.ci(results, type="all") # 2) Bootstrap via for loop and estimate bias-corrected pecentile CI's # Fit LM with all the data lm.obs <- lm(mpg ~ wt + disp, mtcars) # Bootstrap with for loop iterations <- 1000 # # of iterations rsq.out <- vector() # to hold loop output for(i in 1:iterations){ boot.df <- mtcars[sample(nrow(mtcars), nrow(mtcars), replace = TRUE), ] boot.lm <- lm(mpg ~ wt + disp, boot.df) boot.rsq <- summary(boot.lm)$r.squared rsq.out <- c(rsq.out, boot.rsq) } hist(rsq.out) # Bias-corrected percentile CI # Manly, B. 1998. Randomization, bootstrap and monte carlo methods in biology. 2nd edition. # - p. 48 # Proportion of times that the boot estimate exceeds observed estimate mtcar.boot.rsq <- data.frame(boot.rsq = rsq.out, obs.rsq = summary(lm.obs)$r.squared) head(mtcar.boot.rsq) # If boot mean is > observed mean, code 1, otherwise 0 mtcar.boot.rsq$boot.high <- ifelse(mtcar.boot.rsq$boot.rsq > mtcar.boot.rsq$obs.rsq, 1, 0) # mean is the proportion of times boot mean > obs mean mean(mtcar.boot.rsq$boot.high) head(mtcar.boot.rsq) # Use proportion to get Z score, then use that to calculate the new bias-correct # Z score to look up the new proportion to use in quantile() rsq.bc <- quantile(mtcar.boot.rsq$boot.rsq, c(pnorm((2*pnorm(mean(mtcar.boot.rsq$boot.high))) - 1.96), pnorm((2*pnorm(mean(mtcar.boot.rsq$boot.high))) + 1.96))) # ESTIMATES # My attempt at BCP CI (ehhhh) rsq.bc 31.59952% -- 0.7747525 99.97103% -- 0.9034800 # Boot package rsq.ci Intervals : Level Normal Basic 95% ( 0.6690, 0.8659 ) ( 0.6818, 0.8824 ) Level Percentile BCa 95% ( 0.6795, 0.8800 ) ( 0.6044, 0.8511 ) # Simple percentile CI quantile(rsq.out, c(0.025, 0.975)) 2.5% 97.5% 0.6871754 0.8809823 boot.ci(results, type="perc") Level Percentile 95% ( 0.6795, 0.8800 ) I appreciate any advice! Thanks.

It was very statsy...I shoulda known better :) stats.stackexchange answer, for posterity: https://stats.stackexchange.com/questions/298506/bias-corrected-percentile-confidence-intervals Joe On Wed, Aug 16, 2017 at 1:30 PM, Joe Ceradini <joeceradini at gmail.com> wrote:> Hi folks, > > I'm trying to estimate bias-corrected percentile (BCP) confidence > intervals on a vector from a simple for loop used for resampling. I am > attempting to follow steps in Manly, B. 1998. Randomization, bootstrap > and monte carlo methods in biology. 2nd edition., p. 48. PDF of the > approach/steps should be available here: > https://wyocoopunit.box.com/s/9vm4vgmbx5h7um809bvg6u7wr392v6i9 > If this is too statsy for the R list, I can try stackexchange or > crossval, but the R list provides all kinds of great help so thought > I'd try it first. I aware of boot::boot but am hoping to avoid it for > the current analysis I am working on (the boot function became quite > challenging, for me, for a few reasons). > > I cannot figure out where I'm going wrong but the estimates from my > attempt at the BCP CI are different enough from other methods that I > assume I'm doing something wrong. > > require(boot) > data("mtcars") > > # 1) Bootstrap 95% CI for R-Squared via boot::boot > # statmethods.net/advstats/bootstrapping.html > > # Function for boot > rsq <- function(formula, data, indices) { > d <- data[indices,] > fit <- lm(formula, data=d) > return(summary(fit)$r.square) > } > > # bootstrapping with 1000 replications > results <- boot(data=mtcars, statistic=rsq, > R=1000, formula = mpg ~ wt + disp) > > # get 95% confidence interval > rsq.ci <- boot.ci(results, type="all") > > > > # 2) Bootstrap via for loop and estimate bias-corrected pecentile CI's > > # Fit LM with all the data > lm.obs <- lm(mpg ~ wt + disp, mtcars) > > # Bootstrap with for loop > iterations <- 1000 # # of iterations > rsq.out <- vector() # to hold loop output > > for(i in 1:iterations){ > boot.df <- mtcars[sample(nrow(mtcars), nrow(mtcars), replace = TRUE), ] > boot.lm <- lm(mpg ~ wt + disp, boot.df) > boot.rsq <- summary(boot.lm)$r.squared > rsq.out <- c(rsq.out, boot.rsq) > } > > hist(rsq.out) > > # Bias-corrected percentile CI > # Manly, B. 1998. Randomization, bootstrap and monte carlo methods in > biology. 2nd edition. > # - p. 48 > > # Proportion of times that the boot estimate exceeds observed estimate > mtcar.boot.rsq <- data.frame(boot.rsq = rsq.out, > obs.rsq = summary(lm.obs)$r.squared) > head(mtcar.boot.rsq) > > # If boot mean is > observed mean, code 1, otherwise 0 > mtcar.boot.rsq$boot.high <- ifelse(mtcar.boot.rsq$boot.rsq > > mtcar.boot.rsq$obs.rsq, 1, 0) > # mean is the proportion of times boot mean > obs mean > mean(mtcar.boot.rsq$boot.high) > head(mtcar.boot.rsq) > > # Use proportion to get Z score, then use that to calculate the new > bias-correct > # Z score to look up the new proportion to use in quantile() > rsq.bc <- quantile(mtcar.boot.rsq$boot.rsq, > c(pnorm((2*pnorm(mean(mtcar.boot.rsq$boot.high))) - > 1.96), > pnorm((2*pnorm(mean(mtcar.boot.rsq$boot.high))) + > 1.96))) > > # ESTIMATES > # My attempt at BCP CI (ehhhh) > rsq.bc > 31.59952% -- 0.7747525 > 99.97103% -- 0.9034800 > > # Boot package > rsq.ci > Intervals : > Level Normal Basic > 95% ( 0.6690, 0.8659 ) ( 0.6818, 0.8824 ) > > Level Percentile BCa > 95% ( 0.6795, 0.8800 ) ( 0.6044, 0.8511 ) > > # Simple percentile CI > quantile(rsq.out, c(0.025, 0.975)) > 2.5% 97.5% > 0.6871754 0.8809823 > > boot.ci(results, type="perc") > Level Percentile > 95% ( 0.6795, 0.8800 ) > > I appreciate any advice! > Thanks. >[[alternative HTML version deleted]]