thr3ads.net - R help - [R] bootstrapping vectors of unequal length [May 2005]

If this information is useful, please help other people find it:
Share via:

Matt Oliver

2005-May-20 19:46 UTC

[R] bootstrapping vectors of unequal length

Dear R Help List,

I have a vector of n and a vector of n-1 and I want to use boot() to
bootstrap the ratio of their respective medians. I want to eventually
use boot.ci() to generate confidence intervals. Because the vectors
are not equal in length, I tried a few things, but have yet to be
successful.

Try 1: 
> x <- runif(20)
> 
> y <- c(runif(19), NA)
> 
> median(x)
[1] 0.522284> 
> median(y[1:19])
[1] 0.488046> 
> median(x)/median((y)[1:19])
[1] 1.070153> 
> t <- as.data.frame(cbind(x, y))
> 
> ratio <- function(t, i) median(t$x[i])/median((t$y[1:19])[i])
> 
> boot(t, ratio, R = 1000)
ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = t, statistic = ratio, R = 1000)


Bootstrap Statistics :
    original  bias    std. error
t1*       NA      NA   0.4603294


I thought this might be successful because median(x)/median((y)[1:19])
gives a result, and not an NA.

I also tried to use a regular list (even though boot() technically
doesn't accept them) so I didn't have to use NA.

Try 2:
> x <- runif(20)
> 
> y <- runif(19)
> 
> median(x)
[1] 0.732906> 
> median(y)
[1] 0.5596225> 
> median(x)/median(y)
[1] 1.309644> 
> t <- list(x = x, y = y)
> 
> ratio <- function(t, i) median(t$x[i])/median(t$y[i])
> 
> boot(t, ratio, R = 1000)
ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = t, statistic = ratio, R = 1000)


Bootstrap Statistics :
    original       bias    std. error
t1* 1.153598 -0.004907764  0.08266257


At first glance this seemed to work, but median(x)/median(y) is not
equal to the "original" in the Bootstrap Statistics (which is a bit
odd to me, but it may be because boot() doesn't accept this kind of
list.)


Is there a way to do this type of bootstrap with the boot() function?

Thanks in advance

Matt Oliver

Uwe Ligges

2005-May-20 20:37 UTC

head link

[R] bootstrapping vectors of unequal length

Matt Oliver wrote:
> Dear R Help List,
> 
> I have a vector of n and a vector of n-1 and I want to use boot() to
> bootstrap the ratio of their respective medians. I want to eventually
> use boot.ci() to generate confidence intervals. Because the vectors
> are not equal in length, I tried a few things, but have yet to be
> successful.
> 
> Try 1: 
> 
> 
>>x <- runif(20)
>>
>>y <- c(runif(19), NA)
>>
>>median(x)
> 
> [1] 0.522284
> 
>>median(y[1:19])
> 
> [1] 0.488046
> 
>>median(x)/median((y)[1:19])
> 
> [1] 1.070153
> 
>>t <- as.data.frame(cbind(x, y))
>>
>>ratio <- function(t, i) median(t$x[i])/median((t$y[1:19])[i])
>>
>>boot(t, ratio, R = 1000)
> 
> 
> ORDINARY NONPARAMETRIC BOOTSTRAP
> 
> 
> Call:
> boot(data = t, statistic = ratio, R = 1000)
> 
> 
> Bootstrap Statistics :
>     original  bias    std. error
> t1*       NA      NA   0.4603294
> 
> 
> I thought this might be successful because median(x)/median((y)[1:19])
> gives a result, and not an NA.
> 
> I also tried to use a regular list (even though boot() technically
> doesn't accept them) so I didn't have to use NA.
> 
> Try 2:
> 
> 
>>x <- runif(20)
>>
>>y <- runif(19)
>>
>>median(x)
> 
> [1] 0.732906
> 
>>median(y)
> 
> [1] 0.5596225
> 
>>median(x)/median(y)
> 
> [1] 1.309644
> 
>>t <- list(x = x, y = y)
>>
>>ratio <- function(t, i) median(t$x[i])/median(t$y[i])
>>
>>boot(t, ratio, R = 1000)
> 
> 
> ORDINARY NONPARAMETRIC BOOTSTRAP
> 
> 
> Call:
> boot(data = t, statistic = ratio, R = 1000)
> 
> 
> Bootstrap Statistics :
>     original       bias    std. error
> t1* 1.153598 -0.004907764  0.08266257
> 
> 
> At first glance this seemed to work, but median(x)/median(y) is not
> equal to the "original" in the Bootstrap Statistics (which is a
bit
> odd to me, but it may be because boot() doesn't accept this kind of
> list.)
> 
> 
> Is there a way to do this type of bootstrap with the boot() function?
> 
> Thanks in advance
> 
> Matt Oliver
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

The question is always how to handle NA, obviously you do NOT want to 
throw away whole observations including NAs from your data.frame, but 
only throw away NAs for each vector separately.

Your idea specifying "median((t$y[1:19])[i])" does not work, because
the
statistic is evaluated on subsets of t, hence you get for the subset of 
observations 1:5: t[1:5,]$y[1:19], and you have even more NAs

Instead, you want to use median()'s na.rm argument as in:

ratio <- function(t, i)
     median(t$x[i], na.rm=TRUE) / median(t$y[i], na.rm=TRUE)

Uwe Ligges

Reasonably Related Threads

Search for more maybe matching threads

R help - May 2005 - bootstrapping vectors of unequal length

[R] bootstrapping vectors of unequal length

[R] bootstrapping vectors of unequal length

Reasonably Related Threads