I'm teaching a class and using R for the first time. We're talking about the bootstrap, and I've been trying to get R to replicate some simple bootstrap programs with no success. I'd like to be able to use the boot.ci function to produce (non-parametric) confidence intervals for some simple statistics, and this requires first creating a "boot" object. The boot object, in turn, requires a statistic that takes two inputs: the data, and a vector of indices, frequencies, or weights which "define" the bootstrap sample.

My question, I guess, is which option do I choose to "define" an ordinary sample-with-replacement? And where does this vector come from? Suppose I want to bootstrap a confidence interval for the mean. Do I have to write my own "mean" function to provide for this defining vector of indices?

Just to clarify, I want to use the boot command to replicate this:

    bstraps <- c()
    for (i in 1:R) {
      bstraps <- c(mean(sample(data, replace = TRUE, size = length(data))), bstraps)
    }

thanks,
Rob

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
This doesn't answer Rob's general question, but here is some code that is very fast for running the nonparametric bootstrap percentile method for a mean, in S-Plus. If anyone knows the R internal function corresponding to sample.index (the special case where the "prob" argument is not needed), I would appreciate knowing it.

Thanks -Frank Harrell

    smean.cl.boot <- function(x, conf.int=.95, B=1000, na.rm=TRUE) {
      if(na.rm) x <- x[!is.na(x)]
      n <- length(x)
      xbar <- mean(x)
      # With fewer than two observations no interval can be formed
      if(n < 2) return(c(Mean=xbar, Lower=NA, Upper=NA))
      # B bootstrap means: sum each resample (S-Plus internal sampler), divide by n
      z <- unlist(lapply(1:B, function(i,x,N)
             sum(x[.Internal(sample.index(N, N, TRUE), "S_sample",TRUE,0)]),
             x=x, N=n)) / n
      # Percentile limits: the (1-conf.int)/2 and (1+conf.int)/2 quantiles
      quant <- quantile(z, c((1-conf.int)/2,(1+conf.int)/2))
      names(quant) <- NULL
      c(Mean=xbar, Lower=quant[1], Upper=quant[2])
    }

On Fri, 25 Jan 2002 10:48:13 -0800 Rob Gould (local) <rgould at stat.ucla.edu> wrote:

> My question, I guess, is which option do I choose to "define" an
> ordinary sample-with-replacement? And where does this vector come from?
> [...]

--
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
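On the question about an R counterpart to sample.index: one candidate in base R is sample.int(n, size, replace = TRUE), which draws integer indices directly. The following is only a sketch of the same percentile method in R under that assumption; boot_mean_ci is a hypothetical name chosen for illustration, not a function from any package, and no speed comparison with the S-Plus version is implied.

```r
# Sketch: percentile-bootstrap confidence limits for a mean in base R.
# boot_mean_ci is a hypothetical name used only for this illustration.
boot_mean_ci <- function(x, conf.int = 0.95, B = 1000, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  xbar <- mean(x)
  # With fewer than two observations no interval can be formed
  if (n < 2) return(c(Mean = xbar, Lower = NA, Upper = NA))
  # Each replicate: draw n indices with replacement and average those values
  z <- vapply(seq_len(B),
              function(b) mean(x[sample.int(n, n, replace = TRUE)]),
              numeric(1))
  # Percentile limits are the matching quantiles of the replicate means
  quant <- quantile(z, c((1 - conf.int) / 2, (1 + conf.int) / 2), names = FALSE)
  c(Mean = xbar, Lower = quant[1], Upper = quant[2])
}

set.seed(1)
x <- rnorm(50)
res <- boot_mean_ci(x, B = 2000)
print(res)
```

The result is a named vector with elements Mean, Lower and Upper, the same shape that the S-Plus function returns.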
On Fri, 25 Jan 2002, Rob Gould wrote:

> Suppose I want to bootstrap a confidence interval for the mean. Do i
> have to write my own "mean" function to provide for this defining
> vector of indices?
> [...]

Following the example in Venables & Ripley for the median:

    data.boot <- boot(data, function(x, i) mean(x[i]), R=R)
    boot.ci(data.boot, more options)

Pretty simple, I think, Frank?

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
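To make the "indices, frequencies, or weights" choice from the original question concrete, here is a rough sketch of the same mean written for each of the three stype settings of boot(). The data and the names mean_i, mean_f and mean_w are made up for the illustration. The vector itself is generated inside boot() and handed to the statistic as its second argument, so the only thing the user writes is a small wrapper like these.

```r
library(boot)  # recommended package shipped with standard R installations

set.seed(42)
x <- rexp(40)   # made-up data for the illustration
R <- 999

# stype = "i" (the default): second argument is a vector of resampled indices
mean_i <- function(d, i) mean(d[i])
# stype = "f": second argument counts how often each observation was drawn
mean_f <- function(d, f) sum(d * f) / sum(f)
# stype = "w": second argument holds resampling weights
mean_w <- function(d, w) weighted.mean(d, w)

# Resetting the seed gives each call the same resamples
set.seed(1); b_i <- boot(x, mean_i, R = R)
set.seed(1); b_f <- boot(x, mean_f, R = R, stype = "f")
set.seed(1); b_w <- boot(x, mean_w, R = R, stype = "w")

# Hand-written loop in the spirit of the original question
set.seed(1)
by_hand <- replicate(R, mean(sample(x, size = length(x), replace = TRUE)))

# The spread of the replicates should be similar either way
c(boot = sd(as.vector(b_i$t)), loop = sd(by_hand))
```

For an ordinary sample-with-replacement the default stype = "i" is the simplest choice; the other two forms describe the same resamples in a different encoding.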
Hi Rob, here's an example I worked out:

    > data <- rnorm(100)
    > f <- function(x, i) { median(x[i]) }
    > b <- boot(data, f, R = 1000)
    > b

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = data, statistic = f, R = 1000)

    Bootstrap Statistics :
           original       bias    std. error
    t1* -0.07986221 0.008121274  0.09311031

    > boot.ci(b, type="perc")
    BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    Based on 1000 bootstrap replicates

    CALL :
    boot.ci(boot.out = b, type = "perc")

    Intervals :
    Level     Percentile
    95%   (-0.2264,  0.1228 )
    Calculations and Intervals on Original Scale

-roger
_______________________________
UCLA Department of Statistics
rpeng at stat.ucla.edu
http://www.stat.ucla.edu/~rpeng
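For classroom use it can help to pull the interval limits out of the boot.ci result rather than reading them off the printout. The sketch below repeats the median example with an arbitrary seed (so the numbers will not match the session above) and assumes the documented layout of the "percent" component: confidence level, two order-statistic positions, then the lower and upper limits.

```r
library(boot)

set.seed(2002)          # arbitrary seed, chosen only for reproducibility
data <- rnorm(100)
f <- function(x, i) median(x[i])
b <- boot(data, f, R = 1000)
ci <- boot.ci(b, conf = 0.95, type = "perc")

# ci$percent is a one-row matrix: level, two order-statistic positions,
# lower limit, upper limit
limits <- ci$percent[1, 4:5]
names(limits) <- c("lower", "upper")
limits

# Raw quantiles of the replicates, for comparison; these need not agree
# exactly with boot.ci, which interpolates between order statistics
quantile(b$t, c(0.025, 0.975))
```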
Rob Gould <rgould at stat.ucla.edu> writes:

>I'm teaching a class and using R for the first time. We're talking
>about the bootstrap, and I've been trying to get R to replicate some
>simple bootstrap programs with no success.
>[...]

The bootstrap is surprisingly easy to do in R. I have a brief and simple section in my R notes (www.myatt.demon.co.uk) that I use as a teaching example.

Selvin:

    Selvin, S, "Modern Applied Biostatistical Methods Using S-Plus", Oxford University Press, New York, 1998

Covers this quite well.

Mark

--
Mark Myatt
Prof Brian D Ripley <ripley at stats.ox.ac.uk> writes:

>What, `I want to use the boot command to replicate this'? I don't
>think either of you cover it or `the boot.ci function to produce confidence
>intervals'. In my experience it is well worth using the code written by
>experts, for bootstrapping as well as in general.
>
>It is `surprisingly easy' to re-invent the wheel in R.

I like that last comment a lot and would like to add my tuppence. It is surprisingly easy to re-invent the wheel in R because it is surprisingly easy to overlook the existence of functions and, in my case, entire libraries of functions that are already there, debugged, and working well.

I am not sure how to overcome this problem (I understand that it is not a problem for you, Prof.) with R. Perhaps there is a documentation solution.

Mark

--
Mark Myatt
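As a partial answer to the problem of overlooking existing functions, R has a few built-in ways to search what is already installed. The lines below are a small sketch of that; the search terms are just examples, and what help.search() returns depends on which packages are installed locally.

```r
# Keyword search across the documentation of installed packages
hits <- help.search("bootstrap")

# Objects on the search path whose names match a pattern
sampling_fns <- apropos("^sample")
sampling_fns

# Everything exported by an attached package
library(boot)
boot_exports <- ls("package:boot")
head(boot_exports)
```

Printing hits at the prompt shows the matching help topics along with the package each one lives in.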
Prof Brian D Ripley <ripley at stats.ox.ac.uk> writes:

>What, `I want to use the boot command to replicate this'? I don't
>think either of you cover it or `the boot.ci function to produce confidence
>intervals'. In my experience it is well worth using the code written by
>experts, for bootstrapping as well as in general.
>
>It is `surprisingly easy' to re-invent the wheel in R.

As a learning device, and with the luxury of time, reinventing the wheel is great with R. I've been working through Clifford Lunneborg's "Data Analysis by Resampling", expressing his ideas, and sometimes the code from the text (written in Resampling Stats, SC and S-Plus), in R.

In reinventing the wheel, I've learned much about resampling techniques and R. Much less would have been learned had I relied upon the functions written by others. Not a bad investment of time, though the press of time on others might differ.

ANDREW

Andrew Criswell
Professor of Finance
Graduate School, Bangkok University