The only reason the boot package will take more memory for 2000
replications than 10 is that it needs to store the results. That is
not to say that on a 32-bit OS the fragmentation will not get worse,
but that is unlikely to be a significant factor.
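
As a rough illustration of how small the stored results are, here is a
minimal sketch. The toy data frame `d` and the statistic `mean_stat` are
hypothetical stand-ins, not the poster's data or function. It only shows
that the replicate matrix `t` in the returned object has one row per
replication and one column per element of the statistic.

```r
library(boot)

# Hypothetical toy data and statistic, used only to show how the stored
# results grow with R; this is not the poster's data or function.
set.seed(1)
d <- data.frame(x = rnorm(200))
mean_stat <- function(data, idx) mean(data$x[idx])

b10   <- boot(d, mean_stat, R = 10)
b2000 <- boot(d, mean_stat, R = 2000)

# The replicates are kept in the matrix 't': R rows, one column per
# element returned by the statistic.
dim(b10$t)
dim(b2000$t)

# Size of the stored replicates for R = 2000 (doubles, 8 bytes each,
# plus a small header).
print(object.size(b2000$t), units = "Kb")
```

For a scalar statistic, 2000 replicates amount to roughly 2000 * 8 bytes
of results.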

As for the methodology: 'boot' is support software for a book, so
please consult that book (and not secondary sources). From your brief
description it looks to me as if you should be using studentized CIs.

130,000 cases is a lot, and running the experiment on a 1% sample
may well show that asymptotic CIs are good enough.
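
A minimal sketch of that experiment follows, under loudly stated
assumptions: the data are simulated (only the row count matches the
post), the model `y ~ group + x` is hypothetical, and the statistic is
a single logistic regression coefficient rather than the poster's
difference in mean predicted probabilities. For a studentized interval
the statistic returns the estimate and an estimate of its variance;
`boot.ci()` then takes the second element as the variance by default.
For the poster's actual statistic a variance estimate would have to be
supplied some other way (for example by the delta method or a nested
bootstrap), which is not shown here.

```r
library(boot)

set.seed(42)

# Simulated stand-in for the full data set (assumption: not the real data).
n_full <- 130000
full <- data.frame(group = rbinom(n_full, 1, 0.5), x = rnorm(n_full))
full$y <- rbinom(n_full, 1, plogis(-0.5 + 0.4 * full$group + 0.8 * full$x))

# Draw a 1% subsample to keep the experiment cheap.
sub <- full[sample(n_full, n_full %/% 100), ]

# Statistic for boot(): estimate and its estimated variance, in that order.
coef_stat <- function(data, idx) {
  fit <- glm(y ~ group + x, family = binomial, data = data[idx, ])
  est <- unname(coef(fit)["group"])
  v   <- unname(diag(vcov(fit))["group"])
  c(est, v)
}

b <- boot(sub, coef_stat, R = 499)

# Bootstrap normal-approximation and studentized intervals.
ci <- boot.ci(b, type = c("norm", "stud"))
print(ci)

# Asymptotic (Wald) interval from the same subsample, for comparison.
fit0 <- glm(y ~ group + x, family = binomial, data = sub)
wald <- confint.default(fit0)["group", ]
print(wald)
```

If the studentized and asymptotic intervals agree closely on the
subsample, that is some evidence that the cheaper asymptotic interval
is adequate for the full data.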

On Thu, 5 May 2011, E Hofstadler wrote:
> hello,
>
> the following questions will without doubt reveal some fundamental
> ignorance, but hopefully you can still help me out.
>
> I'd like to bootstrap a coefficient gained on the basis of the
> coefficients in a logistic regression model (the mean differences in
> the predicted probabilities between two groups, where each predict()
> operation uses as the newdata-argument a dataframe of the same size as
> the original dataframe). I've got 130,000 rows and 7 columns in my
> dataframe. The glm-model uses all variables (as well as two 2-way
> interactions).
>
> System:
> - R-version: 2.12.2
> - OS: Windows XP Pro, 32-bit
> - 3.16 GHz Intel dual-core processor, 2.9 GB RAM
>
> I'm using the boot package to arrive at the standard errors for this
> difference, but even with only 10 replications, this takes quite a
> long time: 216 seconds (perhaps this is partly also due to my
> inefficiently programmed function underlying the boot-call, I'm also
> looking into that).
>
> I wanted to try out calculating a bca-bootstrapped confidence
> interval, which as I understand requires a lot more replications than
> normal-theory intervals. Drawing on John Fox's appendix to his "An R
> Companion to Applied Regression", I was thinking of trying out 2000
> replications -- but this will take several hours to compute on my
> system (which isn't in itself a major issue though).
>
> My Questions:
> - let's say I try bootstrapping with 2000 replications. Can I be
> certain that the memory available to R will be sufficient for this
> operation?
> - (this relates to statistics more generally): is it a good idea in
> your opinion to try bca-bootstrapping, or can it be assumed that a
> normal theory confidence interval will be a sufficiently good
> approximation (letting me get away with, say, 500 replications)?
>
>
> Best,
> Esther
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595