thr3ads.net - R help - [R] bootstrap vs. resampleing [Apr 2005]


array chip

2005-Apr-06 17:19 UTC

[R] bootstrap vs. resampleing

Hi,

I understand bootstrap can be used to estimate 95%
confidence interval for some statistics, e.g.
variance, median, etc. Someone has suggested that
by resampling a certain proportion of the total sample
(e.g. 80%) without replacement, we can also get an
estimate of confidence intervals. Here we have an
example of 1000 observations, and we would like to estimate
95% confidence intervals for the odds ratio of a
diagnostic test. Can I resample 80% of the
observations without replacement, instead of using the
bootstrap, to do this? If not, why is it wrong to do
it this way?

Thanks

Berton Gunter

2005-Apr-06 18:16 UTC

[R] bootstrap vs. resampleing

> I understand bootstrap can be used to estimate 95%
> confidence interval for some statistics, e.g.
                               ^^^^^^^^^^

There's no such thing. You can estimate 95% CI's on population
**parameters**, which is, I assume, what you mean. If you don't know what
the difference is, stop here and consult a local statistician, as you are
out of your depth.
-----------

If you make it to here, I think you are referring to cross-validation vs
resampling. 

Typically, X-validation is used to get an "honest" estimate of prediction
error rather than confidence limits for a parameter. The correctness of
bootstrapping for this purpose is based on asymptotic theory: loosely
speaking, the data distribution approximates the population distribution;
appropriate resampling (e.g. maybe stratified, moving blocks, ...) from the
data corresponds to iid sampling (or whatever is appropriate..) from the
population. It is actually a way to approximate the (itself approximate)
asymptotic sampling distribution.
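
To make the resampling idea above concrete, here is a minimal sketch of a percentile bootstrap interval for the odds ratio of a diagnostic test. It is an illustration only: the data are simulated, and the prevalence, sensitivity and specificity values, the seed, and all function names are assumptions that do not come from the thread. It uses plain iid resampling of rows and the simple percentile interval; a real analysis might need a stratified scheme or a different interval type.

```python
import math
import random


def odds_ratio(pairs):
    """Odds ratio of the 2x2 table built from (test_positive, diseased) pairs."""
    a = b = c = d = 0
    for test_pos, diseased in pairs:
        if test_pos and diseased:
            a += 1          # test+, disease+
        elif test_pos:
            b += 1          # test+, disease-
        elif diseased:
            c += 1          # test-, disease+
        else:
            d += 1          # test-, disease-
    return (a * d) / (b * c)


def simulate(n, rng):
    """Hypothetical data: 30% prevalence, sensitivity 0.8, specificity 0.7."""
    data = []
    for _ in range(n):
        diseased = rng.random() < 0.3
        p_pos = 0.8 if diseased else 0.3
        data.append((rng.random() < p_pos, diseased))
    return data


def percentile_bootstrap_ci(data, stat, n_boot, rng, level=0.95):
    """Resample n rows WITH replacement n_boot times; return percentile limits."""
    n = len(data)
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    alpha = (1.0 - level) / 2.0
    lo = reps[int(math.floor(alpha * (n_boot - 1)))]
    hi = reps[int(math.ceil((1.0 - alpha) * (n_boot - 1)))]
    return lo, hi


rng = random.Random(2005)          # fixed seed so the run is repeatable
data = simulate(1000, rng)         # 1000 observations, as in the question
or_hat = odds_ratio(data)
ci_low, ci_high = percentile_bootstrap_ci(data, odds_ratio, 1000, rng)
print(f"OR = {or_hat:.2f}, 95% bootstrap CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Each bootstrap sample has the same size as the original data, which is why the spread of the replicates can be read directly as an approximation to the sampling distribution.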

AFAIK (experts, please correct) no such asymptotic theory holds for
X-validation and so it would be problematic/wrong for CI's.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

Roger D. Peng

2005-Apr-06 18:18 UTC

[R] bootstrap vs. resampleing

What you're describing sounds like subsampling, about which John 
Hartigan has written a few papers.

-roger


Thomas Lumley

2005-Apr-06 18:31 UTC

[R] bootstrap vs. resampleing

On Wed, 6 Apr 2005, array chip wrote:
> Hi,
>
> I understand bootstrap can be used to estimate 95%
> confidence interval for some statistics, e.g.
> variance, median, etc. I have someone suggesting that
> by resampling certain proportion of the total samples
> (e.g. 80%) without replacement, we can also get the
> estimate of confidence intervals. Here we have an
> example of 1000 observations, we would like to estimate
> 95% confidence intervals for odds ratio for a
> diagnostic test, can I use resampling 80% of the
> observations without replacement, instead of
> bootstrap, to do this? If not, why is it wrong to do
> it this way?
>
You can, provided you rescale correctly for the fact that you are working 
with a smaller sample.  This is more like the jackknife, which also 
resamples a smaller number without replacement.

There is quite a bit of literature on this sort of jackknife/bootstrap 
variant.  One useful book is "The Jackknife and Bootstrap" by Shao and
Tu.
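
As a rough illustration of what "rescale correctly" can mean, the sketch below draws 80% subsamples without replacement and compares the raw spread of the log odds ratio with a rescaled version. The rescaling used here is an assumption on my part, not something stated in this message: it is the delete-d jackknife factor m/(n-m), where m is the subsample size, which is one standard choice for smooth statistics. The data are simulated and all names and parameter values are hypothetical. The usual large-sample (Woolf) standard error of the log odds ratio is printed for comparison.

```python
import math
import random


def table_counts(pairs):
    """Cell counts (a, b, c, d) of the 2x2 table from (test_positive, diseased)."""
    a = b = c = d = 0
    for test_pos, diseased in pairs:
        if test_pos and diseased:
            a += 1
        elif test_pos:
            b += 1
        elif diseased:
            c += 1
        else:
            d += 1
    return a, b, c, d


def log_odds_ratio(pairs):
    a, b, c, d = table_counts(pairs)
    return math.log((a * d) / (b * c))


def simulate(n, rng):
    """Hypothetical data: 30% prevalence, sensitivity 0.8, specificity 0.7."""
    data = []
    for _ in range(n):
        diseased = rng.random() < 0.3
        p_pos = 0.8 if diseased else 0.3
        data.append((rng.random() < p_pos, diseased))
    return data


def subsample_se(data, stat, m, n_sub, rng):
    """Draw m of the n rows WITHOUT replacement, n_sub times.

    Returns (raw_sd, rescaled_se): the plain standard deviation of the
    subsample statistics, and the same value scaled by sqrt(m / (n - m)),
    the delete-d jackknife factor (an assumed choice of rescaling).
    """
    n = len(data)
    reps = [stat(rng.sample(data, m)) for _ in range(n_sub)]
    mean = sum(reps) / n_sub
    raw_var = sum((r - mean) ** 2 for r in reps) / n_sub
    rescaled_var = raw_var * m / (n - m)
    return math.sqrt(raw_var), math.sqrt(rescaled_var)


rng = random.Random(2005)          # fixed seed so the run is repeatable
data = simulate(1000, rng)
raw_sd, rescaled_se = subsample_se(data, log_odds_ratio, 800, 1000, rng)

a, b, c, d = table_counts(data)
woolf_se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

print(f"raw SD over 80% subsamples : {raw_sd:.3f}")
print(f"rescaled by sqrt(m/(n-m))  : {rescaled_se:.3f}")
print(f"Woolf large-sample SE      : {woolf_se:.3f}")
```

With m = 800 and n = 1000 the factor is sqrt(800/200) = 2, so the rescaled standard error is exactly twice the raw spread. A 95% interval on the log scale could then be formed as the full-sample estimate plus or minus 1.96 times the rescaled standard error and exponentiated, under the usual normal approximation.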

 	-thomas

"Jens Oehlschlägel"

2005-Apr-06 18:39 UTC

[R] bootstrap vs. resampleing

Confidence intervals depend on the sample size - the bigger the sample the
smaller the interval. Subsampling (resampling without replacement) gives
smaller samples and underestimates confidence (overestimates confidence
interval size) of parameters calculated on the original sample. 

Best


Jens Oehlschlägel


P.S.: I guess signing a question with your name makes answers more likely


Huntsinger, Reid

2005-Apr-06 21:48 UTC

[R] bootstrap vs. resampleing

I may be misunderstanding the question, but I believe you want a pointwise
confidence band for the conditional odds function. The issue here is less
bootstrap versus some other resampling plan, and more how to do it at all.
For example, if no matter what "training" data you feed in, you always get
the same conditional odds estimate, no resampling will (by itself) reveal
this bias (and you will have a confidence band of width 0). You could
however use resampling together with nonparametric estimation in a variety
of ways to address this. 

If you assume your conditional odds estimation to be unbiased, you could
resample and look at the empirical distribution of conditional odds ratio
estimates at a given covariate or feature value. You have to figure out how
this is related to the population distribution; this is easiest with the
bootstrap since you have the same sample size. In this case the simplest
procedure is to treat the bootstrap distribution as the population
distribution, but there are many alternatives. See the book by Jun Shao and
Dongsheng Tu that Thomas Lumley recommended. They treat estimation of
regression functions in several places; those remarks are relevant for your
case as well. 

Reid Huntsinger

