I am attempting to use package boot to summarize and compare the performance
of three models. I'm using R 2.13.0 in a Win32 environment.
My statistic function returns a vector of 6 values, 3 of which are error
rates for different models, and 3 are pairwise differences between those
error rates. It looks like:
multiEst<-function(dat,i)
{
....
c(E1,E2,E3,E2-E1,E3-E1,E3-E2);
}
then I call boot (using R=4 for simplicity of description) with:
multiBoot=boot(data,multiEst,R=4)
which gives reasonable results:
Bootstrap Statistics :
     original     bias   std. error
t1*      0.07   0.3775   0.04193249
t2*      0.08   0.3750   0.04654747
t3*      0.04   0.4200   0.05354126
t4*      0.01  -0.0025   0.00500000
t5*     -0.03   0.0425   0.01500000
t6*     -0.04   0.0450   0.01290994
and the resulting "t0" contains the expected estimates of the statistics,
> multiBoot$t0
[1]  0.07  0.08  0.04  0.01 -0.03 -0.04
however "t", which is supposed to contain bootstrap replicates of the
statistic, doesn't. It looks like this:
> multiBoot$t
     [,1] [,2] [,3] [,4] [,5]  [,6]
[1,] 0.46 0.47 0.46 0.01 0.00 -0.01
[2,] 0.39 0.39 0.39 0.00 0.00  0.00
[3,] 0.45 0.46 0.47 0.01 0.02  0.01
[4,] 0.49 0.50 0.52 0.01 0.03  0.02
It is not clear where these columns come from --- they clearly do not
resemble the estimates in "t0".
If I define a separate statistic function for each desired estimate, the
resulting "t" and "t0" are as expected; however, it is important in this
case that the separate estimates derive from the same bootstrap replicates.
Any helpful suggestions? Or have I come upon a bug in the implementation?
Note: the documentation provides the following definitions for these
returned variables:
t0 The observed value of statistic applied to data.
t A matrix with R rows each of which is a bootstrap replicate of statistic.
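
To make these two definitions concrete, here is a minimal sketch of a
multi-statistic call. The data frame errs (0/1 error indicators for three
models on 100 cases) and the body of multiEst are hypothetical stand-ins,
since the original function body is elided above; only boot() and its
documented "t0" and "t" components come from the package.

```r
library(boot)  # recommended package, shipped with standard R installations

set.seed(1)
# Hypothetical data: 0/1 error indicators for three models on 100 cases.
errs <- data.frame(m1 = rbinom(100, 1, 0.07),
                   m2 = rbinom(100, 1, 0.08),
                   m3 = rbinom(100, 1, 0.04))

# Hypothetical statistic: three error rates plus their pairwise
# differences, all computed from the same resampled rows `i`.
multiEst <- function(dat, i) {
  e <- colMeans(dat[i, ])
  unname(c(e[1], e[2], e[3], e[2] - e[1], e[3] - e[1], e[3] - e[2]))
}

multiBoot <- boot(errs, multiEst, R = 200)

multiBoot$t0        # statistic evaluated on the original data (length 6)
dim(multiBoot$t)    # one row per replicate, one column per statistic

# Within each replicate, the differences agree with the rates from the
# same resample, so all six columns share the same bootstrap samples.
all.equal(multiBoot$t[, 4], multiBoot$t[, 2] - multiBoot$t[, 1])
```

Because a single call to boot() draws one set of indices per replicate and
passes it to the statistic once, every element of the returned vector is
computed from the same resample.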
Andrew Robinson
2011-May-03 23:11 UTC
[R] Unexp. behavior from boot with multiple statistics
Your interpretation of what the output is supposed to look like is actually
correct. Take a look at the estimates of the bias in the Bootstrap
Statistics table. You will see that they are the same as the difference
between the column means of t and t0, that is, colMeans(t) - t0.

I hope that this helps,

Andrew

On Tue, May 03, 2011 at 12:15:05PM -0700, algorimancer wrote:
> Any helpful suggestions? Or have I come upon a bug in the implementation?

--
Andrew Robinson
Program Manager, ACERA
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011)
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009):
http://www.ms.unimelb.edu.au/spuRs/
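
The relationship described in the reply can be checked by hand from the
numbers posted in the question. The sketch below uses base R only; the
names tmat, bias and se are chosen here for illustration, and the values
are copied from the printed "t0" and "t" (so they carry the rounding of
that printout).

```r
# Values copied from the printed output in the original post.
t0 <- c(0.07, 0.08, 0.04, 0.01, -0.03, -0.04)
tmat <- rbind(c(0.46, 0.47, 0.46, 0.01, 0.00, -0.01),
              c(0.39, 0.39, 0.39, 0.00, 0.00,  0.00),
              c(0.45, 0.46, 0.47, 0.01, 0.02,  0.01),
              c(0.49, 0.50, 0.52, 0.01, 0.03,  0.02))

# Bias column: mean of the replicates minus the original estimate.
bias <- colMeans(tmat) - t0
bias
# 0.3775  0.3750  0.4200 -0.0025  0.0425  0.0450

# Std. error column: standard deviation of each column of replicates.
se <- apply(tmat, 2, sd)
round(se, 8)
```

Both vectors agree with the "bias" and "std. error" columns of the
Bootstrap Statistics table, which suggests that "t" holds the replicates
that the summary was computed from.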