thr3ads.net - R help - [R] Unexp. behavior from boot with multiple statistics [May 2011]


algorimancer

2011-May-03 19:15 UTC

[R] Unexp. behavior from boot with multiple statistics

I am attempting to use package boot to summarize and compare the performance
of three models.  I'm using R 2.13.0 in a Win32 environment.

My statistic function returns a vector of 6 values, 3 of which are error
rates for different models, and 3 are pairwise differences between those
error rates.  It looks like:

multiEst<-function(dat,i)
{
       ....
	c(E1,E2,E3,E2-E1,E3-E1,E3-E2);
}

then I call boot (using R=4 for simplicity of description) with:

multiBoot=boot(data,multiEst,R=4)

which gives reasonable results:

Bootstrap Statistics :
    original  bias    std. error
t1*     0.07  0.3775  0.04193249
t2*     0.08  0.3750  0.04654747
t3*     0.04  0.4200  0.05354126
t4*     0.01 -0.0025  0.00500000
t5*    -0.03  0.0425  0.01500000
t6*    -0.04  0.0450  0.01290994

and the resulting "t0" contains the expected estimates of the statistics:

> multiBoot$t0
[1]  0.07  0.08  0.04  0.01 -0.03 -0.04

However, "t", which is supposed to contain bootstrap replicates of the
statistic, doesn't.  It looks like this:

> multiBoot$t
     [,1] [,2] [,3] [,4] [,5]  [,6]
[1,] 0.46 0.47 0.46 0.01 0.00 -0.01
[2,] 0.39 0.39 0.39 0.00 0.00  0.00
[3,] 0.45 0.46 0.47 0.01 0.02  0.01
[4,] 0.49 0.50 0.52 0.01 0.03  0.02

It is not clear where these columns come from --- they clearly do not
resemble the estimates in "t0".

If I define a separate statistic function for each desired estimate, the
resulting "t" and "t0" are as expected.  However, it is important in this
case that the separate estimates derive from the same bootstrap replicates.

Any helpful suggestions? Or have I come upon a bug in the implementation?

Note: the documentation provides the following definitions for these
returned variables:

t0 	The observed value of statistic applied to data.
t 	A matrix with R rows each of which is a bootstrap replicate of statistic. 
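
For illustration, here is a minimal sketch of a statistic function that returns several values, all computed from the same resampled rows. The data frame `dat`, the function `multiStat`, and the use of column means in place of error rates are assumptions made up for this example; this is not the code from the post.

```r
library(boot)

set.seed(1)
# Hypothetical data: two numeric columns standing in for per-case results.
dat <- data.frame(x = rnorm(50), y = rnorm(50))

# boot() calls the statistic as statistic(data, indices).
# Every returned value must be computed from the resampled rows d[i, ].
multiStat <- function(d, i) {
  s  <- d[i, , drop = FALSE]
  m1 <- mean(s$x)
  m2 <- mean(s$y)
  c(m1, m2, m2 - m1)   # two estimates and their difference
}

b <- boot(dat, multiStat, R = 200)

b$t0        # statistic applied to the original data (indices 1:n)
dim(b$t)    # R rows, one column per returned statistic
```

Because the difference is computed inside the same call, each row of `b$t` holds values that come from one and the same bootstrap sample.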

--
View this message in context:
http://r.789695.n4.nabble.com/Unexp-behavior-from-boot-with-multiple-statistics-tp3493300p3493300.html
Sent from the R help mailing list archive at Nabble.com.

Andrew Robinson

2011-May-03 23:11 UTC


Your interpretation of what the output is supposed to look like is
actually correct: each row of t is one bootstrap replicate of the
statistic.  Take a look at the bias estimates in the Bootstrap
Statistics table.  You will see that each one equals the difference
between the corresponding column mean of t (colMeans of t) and t0.
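
This check can be reproduced from the numbers posted in the question, without rerunning boot. The values of t0 and t below are typed in from the output above; the names `t_mat`, `bias` and `se` are only for this sketch.

```r
# Values copied from the posted multiBoot$t0 and multiBoot$t
t0 <- c(0.07, 0.08, 0.04, 0.01, -0.03, -0.04)
t_mat <- matrix(c(0.46, 0.47, 0.46, 0.01, 0.00, -0.01,
                  0.39, 0.39, 0.39, 0.00, 0.00,  0.00,
                  0.45, 0.46, 0.47, 0.01, 0.02,  0.01,
                  0.49, 0.50, 0.52, 0.01, 0.03,  0.02),
                nrow = 4, byrow = TRUE)

# Bias: mean of the bootstrap replicates minus the original estimate
bias <- colMeans(t_mat) - t0
# Standard error: standard deviation of each column of replicates
se <- apply(t_mat, 2, sd)

bias   # 0.3775 0.3750 0.4200 -0.0025 0.0425 0.0450
```

These match the "bias" column in the printed table, so t is consistent with the summary that boot reports.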

I hope that this helps,

Andrew

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/
