Hi, I am trying to compute bootstrap confidence intervals for weighted means of paired differences with the boot package. Unfortunately, the weighted mean estimate lies out of the confidence bounds and hence I am obviously doing something wrong. Appreciate any help. Thanks. Here is a reproducible example: library(boot) set.seed(1111) x <- rnorm(50) y <- rnorm(50) weights <- runif(50) weights <- weights / sum(weights) dataset <- cbind(x,y,weights) vw_m_diff <- function(dataset,w, d) { differences <- dataset[d,1]-dataset[d,2] weights <- w[d] return(weighted.mean(x=differences, w=weights)) } res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3]) boot.ci(res_boot) *BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS* *Based on 1000 bootstrap replicates* *CALL : * *boot.ci <http://boot.ci>(boot.out = res_boot)* *Intervals : * *Level Normal Basic * *95% (-0.8365, -0.3463 ) (-0.8311, -0.3441 ) * *Level Percentile BCa * *95% (-0.3276, 0.1594 ) (-0.4781, -0.3477 ) * weighted.mean(x=dataset[,1]-dataset[,2], w=dataset[,3]) *[1] -0.07321734* [[alternative HTML version deleted]]
David Winsemius
2014-Nov-14 23:18 UTC
[R] Bootstrap CIs for weighted means of paired differences
On Nov 14, 2014, at 12:15 PM, ivan wrote:> Hi, > > I am trying to compute bootstrap confidence intervals for weighted means of > paired differences with the boot package. Unfortunately, the weighted mean > estimate lies out of the confidence bounds and hence I am obviously doing > something wrong. > > Appreciate any help. Thanks. Here is a reproducible example: > > > library(boot) > set.seed(1111) > x <- rnorm(50) > y <- rnorm(50) > weights <- runif(50) > weights <- weights / sum(weights) > dataset <- cbind(x,y,weights) > vw_m_diff <- function(dataset,w, d) {My understanding of the principle underlying the design of the bootstrapped function was that the data was the first argument and the index vector was the second. (I admit to not knowing what it would do with a third argument. So I would have guessed that you wanted: vw_m_diff <- function(dataset,w) { differences <- dataset[d,1]-dataset[d,2] weights <- dataset[w, "weights"] return(weighted.mean(x=differences, w=weights)) } I get what appears to me as a sensible set of estimates (since they seem centered on zero) although I further admit I do not know what the theoretic CI _should_ be for this problem:> res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3]) > boot.ci(res_boot)BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS Based on 1000 bootstrap replicates CALL : boot.ci(boot.out = res_boot) Intervals : Level Normal Basic 95% (-0.5657, 0.4962 ) (-0.5713, 0.5062 ) Level Percentile BCa 95% (-0.6527, 0.4249 ) (-0.5579, 0.5023 ) Calculations and Intervals on Original Scale> differences <- dataset[d,1]-dataset[d,2] > weights <- w[d] > return(weighted.mean(x=differences, w=weights)) > } > res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3]) > boot.ci(res_boot) > > *BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS* > *Based on 1000 bootstrap replicates* > > *CALL : * > *boot.ci <http://boot.ci>(boot.out = res_boot)* > > *Intervals : * > *Level Normal Basic * > *95% (-0.8365, -0.3463 ) (-0.8311, -0.3441 ) * > > *Level Percentile BCa * > *95% (-0.3276, 0.1594 ) (-0.4781, -0.3477 ) * > > weighted.mean(x=dataset[,1]-dataset[,2], w=dataset[,3]) > > *[1] -0.07321734* > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.David Winsemius Alameda, CA, USA