Hi R-help,
I am trying to implement a nonparametric bootstrap to find the
standard errors of a simple statistics - the ratio of two scalars. I
am having difficulty getting boot() to work correctly. I code a
function to create the ratio of the relevant scalars. theta(data, i).
When I call the function for my data and every observation appearing
once, theta(test, c(1)), I get the correct statistic for my original
data. However when I use boot(test, theta, 200) the original
statistic is incorrect.
My code and data are below. The code is very short.
Any help will be appreciated.
Cheers,
Alex
library(boot)
test <- read.csv("test.csv")
test.mean <- mean(test)
test.cov <- cov(test)*87/88 ## use the biased version as I am
reproducing a result from a book
test.eigen <- eigen(test.cov)
theta <- function(data, i) {
data <- data * i
data.cov <- cov(data)
data.eigen <- eigen(data.cov)
data.eigen$values[1]/sum(data.eigen$values)
}
test.boot <- boot(test, theta, 200)
test
mec vec alg ana sta
1 77 82 67 67 81
2 63 78 80 70 81
3 75 73 71 66 81
4 55 72 63 70 68
5 63 63 65 70 63
6 53 61 72 64 73
7 51 67 65 65 68
8 59 70 68 62 56
9 62 60 58 62 70
10 64 72 60 62 45
11 52 64 60 63 54
12 55 67 59 62 44
13 50 50 64 55 63
14 65 63 58 56 37
15 31 55 60 57 73
16 60 64 56 54 40
17 44 69 53 53 53
18 42 69 61 55 45
19 62 46 61 57 45
20 31 49 62 63 62
21 44 61 52 62 46
22 49 41 61 49 64
23 12 58 61 63 67
24 49 53 49 62 47
25 54 49 56 47 53
26 54 53 46 59 44
27 44 56 55 61 36
28 18 44 50 57 81
29 46 52 65 50 35
30 32 45 49 57 64
31 30 69 50 52 45
32 46 49 53 59 37
33 40 27 54 61 61
34 31 42 48 54 68
35 36 59 51 45 51
36 56 40 56 54 35
37 46 56 57 49 32
38 45 42 55 56 40
39 42 60 54 49 33
40 40 63 53 54 25
41 23 55 59 53 44
42 48 48 49 51 37
43 41 63 49 46 34
44 46 52 53 41 40
45 46 61 46 38 41
46 40 57 51 52 31
47 49 49 45 48 39
48 22 58 53 56 41
49 35 60 47 54 33
50 48 56 49 42 32
51 31 57 50 54 34
52 17 53 57 43 51
53 49 57 47 39 26
54 59 50 47 15 46
55 37 56 49 28 45
56 40 43 48 21 61
57 35 35 41 51 50
58 38 44 54 47 24
59 43 43 38 34 49
60 39 46 46 32 43
61 62 44 36 22 42
62 48 38 41 44 33
63 34 42 50 47 29
64 18 51 40 56 30
65 35 36 46 48 29
66 59 53 37 22 19
67 41 41 43 30 33
68 31 52 37 27 40
69 17 51 52 35 31
70 34 30 50 47 36
71 46 40 47 29 17
72 10 46 36 47 39
73 46 37 45 15 30
74 30 34 43 46 18
75 13 51 50 25 31
76 49 50 38 23 9
77 18 32 31 45 40
78 8 42 48 26 40
79 23 38 36 48 15
80 30 24 43 33 25
81 3 9 51 47 40
82 7 51 43 17 22
83 15 40 43 23 18
84 15 38 39 28 17
85 5 30 44 36 18
86 12 30 32 35 21
87 5 26 15 20 20
88 0 40 21 9 14
Hi,
It's kind of weird for me to not see a return() statement in the
function. Maybe try rolling your own, nonparametric bootstraps aren't all
that bad.
i.e.
reps=1000
stat.holder=rep(NA,reps) ###Rep of NA's so you can pick up errors easily
for(count in 1:reps)
{
bootsample=sample(data, n.data.to.compute.from,replace=T)
##Insert code here to compute statistic
stat.holder[count]=computed.statistic
}
It may be overtly simplistic, but you may find it helpful.
Thanks,
Ken
On Fri, Aug 12, 2011 at 10:10 AM, Alex Olssen <alex.olssen@gmail.com>
wrote:
> Hi R-help,
>
> I am trying to implement a nonparametric bootstrap to find the
> standard errors of a simple statistics - the ratio of two scalars. I
> am having difficulty getting boot() to work correctly. I code a
> function to create the ratio of the relevant scalars. theta(data, i).
> When I call the function for my data and every observation appearing
> once, theta(test, c(1)), I get the correct statistic for my original
> data. However when I use boot(test, theta, 200) the original
> statistic is incorrect.
>
> My code and data are below. The code is very short.
>
> Any help will be appreciated.
>
> Cheers,
> Alex
>
> library(boot)
> test <- read.csv("test.csv")
> test.mean <- mean(test)
> test.cov <- cov(test)*87/88 ## use the biased version as I am
> reproducing a result from a book
> test.eigen <- eigen(test.cov)
> theta <- function(data, i) {
> data <- data * i
> data.cov <- cov(data)
> data.eigen <- eigen(data.cov)
> data.eigen$values[1]/sum(data.eigen$values)
> }
> test.boot <- boot(test, theta, 200)
>
> test
> mec vec alg ana sta
> 1 77 82 67 67 81
> 2 63 78 80 70 81
> 3 75 73 71 66 81
> 4 55 72 63 70 68
> 5 63 63 65 70 63
> 6 53 61 72 64 73
> 7 51 67 65 65 68
> 8 59 70 68 62 56
> 9 62 60 58 62 70
> 10 64 72 60 62 45
> 11 52 64 60 63 54
> 12 55 67 59 62 44
> 13 50 50 64 55 63
> 14 65 63 58 56 37
> 15 31 55 60 57 73
> 16 60 64 56 54 40
> 17 44 69 53 53 53
> 18 42 69 61 55 45
> 19 62 46 61 57 45
> 20 31 49 62 63 62
> 21 44 61 52 62 46
> 22 49 41 61 49 64
> 23 12 58 61 63 67
> 24 49 53 49 62 47
> 25 54 49 56 47 53
> 26 54 53 46 59 44
> 27 44 56 55 61 36
> 28 18 44 50 57 81
> 29 46 52 65 50 35
> 30 32 45 49 57 64
> 31 30 69 50 52 45
> 32 46 49 53 59 37
> 33 40 27 54 61 61
> 34 31 42 48 54 68
> 35 36 59 51 45 51
> 36 56 40 56 54 35
> 37 46 56 57 49 32
> 38 45 42 55 56 40
> 39 42 60 54 49 33
> 40 40 63 53 54 25
> 41 23 55 59 53 44
> 42 48 48 49 51 37
> 43 41 63 49 46 34
> 44 46 52 53 41 40
> 45 46 61 46 38 41
> 46 40 57 51 52 31
> 47 49 49 45 48 39
> 48 22 58 53 56 41
> 49 35 60 47 54 33
> 50 48 56 49 42 32
> 51 31 57 50 54 34
> 52 17 53 57 43 51
> 53 49 57 47 39 26
> 54 59 50 47 15 46
> 55 37 56 49 28 45
> 56 40 43 48 21 61
> 57 35 35 41 51 50
> 58 38 44 54 47 24
> 59 43 43 38 34 49
> 60 39 46 46 32 43
> 61 62 44 36 22 42
> 62 48 38 41 44 33
> 63 34 42 50 47 29
> 64 18 51 40 56 30
> 65 35 36 46 48 29
> 66 59 53 37 22 19
> 67 41 41 43 30 33
> 68 31 52 37 27 40
> 69 17 51 52 35 31
> 70 34 30 50 47 36
> 71 46 40 47 29 17
> 72 10 46 36 47 39
> 73 46 37 45 15 30
> 74 30 34 43 46 18
> 75 13 51 50 25 31
> 76 49 50 38 23 9
> 77 18 32 31 45 40
> 78 8 42 48 26 40
> 79 23 38 36 48 15
> 80 30 24 43 33 25
> 81 3 9 51 47 40
> 82 7 51 43 17 22
> 83 15 40 43 23 18
> 84 15 38 39 28 17
> 85 5 30 44 36 18
> 86 12 30 32 35 21
> 87 5 26 15 20 20
> 88 0 40 21 9 14
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
[[alternative HTML version deleted]]
Shouldn't the "i" in your theta() function refer to the selected
rows (a
"vector of indices" as referred to in the help file for boot) of the
data
used by boot()?
theta <- function(data, i) {
data <- data[i, ]
data.cov <- cov(data)
data.eigen <- eigen(data.cov)
data.eigen$values[1]/sum(data.eigen$values)
}
Jean
r-help-bounces@r-project.org wrote on 08/12/2011 09:10:52 AM:
> [image removed]
>
> [R] Getting bootstrap statistic to work
>
> Alex Olssen
>
> to:
>
> r-help
>
> 08/12/2011 09:14 AM
>
> Sent by:
>
> r-help-bounces@r-project.org
>
> Hi R-help,
>
> I am trying to implement a nonparametric bootstrap to find the
> standard errors of a simple statistics - the ratio of two scalars. I
> am having difficulty getting boot() to work correctly. I code a
> function to create the ratio of the relevant scalars. theta(data, i).
> When I call the function for my data and every observation appearing
> once, theta(test, c(1)), I get the correct statistic for my original
> data. However when I use boot(test, theta, 200) the original
> statistic is incorrect.
>
> My code and data are below. The code is very short.
>
> Any help will be appreciated.
>
> Cheers,
> Alex
>
> library(boot)
> test <- read.csv("test.csv")
> test.mean <- mean(test)
> test.cov <- cov(test)*87/88 ## use the biased version as I am
> reproducing a result from a book
> test.eigen <- eigen(test.cov)
> theta <- function(data, i) {
> data <- data * i
> data.cov <- cov(data)
> data.eigen <- eigen(data.cov)
> data.eigen$values[1]/sum(data.eigen$values)
> }
> test.boot <- boot(test, theta, 200)
>
> test
> mec vec alg ana sta
> 1 77 82 67 67 81
> 2 63 78 80 70 81
> 3 75 73 71 66 81
> 4 55 72 63 70 68
> 5 63 63 65 70 63
> 6 53 61 72 64 73
> 7 51 67 65 65 68
> 8 59 70 68 62 56
> 9 62 60 58 62 70
> 10 64 72 60 62 45
> 11 52 64 60 63 54
> 12 55 67 59 62 44
> 13 50 50 64 55 63
> 14 65 63 58 56 37
> 15 31 55 60 57 73
> 16 60 64 56 54 40
> 17 44 69 53 53 53
> 18 42 69 61 55 45
> 19 62 46 61 57 45
> 20 31 49 62 63 62
> 21 44 61 52 62 46
> 22 49 41 61 49 64
> 23 12 58 61 63 67
> 24 49 53 49 62 47
> 25 54 49 56 47 53
> 26 54 53 46 59 44
> 27 44 56 55 61 36
> 28 18 44 50 57 81
> 29 46 52 65 50 35
> 30 32 45 49 57 64
> 31 30 69 50 52 45
> 32 46 49 53 59 37
> 33 40 27 54 61 61
> 34 31 42 48 54 68
> 35 36 59 51 45 51
> 36 56 40 56 54 35
> 37 46 56 57 49 32
> 38 45 42 55 56 40
> 39 42 60 54 49 33
> 40 40 63 53 54 25
> 41 23 55 59 53 44
> 42 48 48 49 51 37
> 43 41 63 49 46 34
> 44 46 52 53 41 40
> 45 46 61 46 38 41
> 46 40 57 51 52 31
> 47 49 49 45 48 39
> 48 22 58 53 56 41
> 49 35 60 47 54 33
> 50 48 56 49 42 32
> 51 31 57 50 54 34
> 52 17 53 57 43 51
> 53 49 57 47 39 26
> 54 59 50 47 15 46
> 55 37 56 49 28 45
> 56 40 43 48 21 61
> 57 35 35 41 51 50
> 58 38 44 54 47 24
> 59 43 43 38 34 49
> 60 39 46 46 32 43
> 61 62 44 36 22 42
> 62 48 38 41 44 33
> 63 34 42 50 47 29
> 64 18 51 40 56 30
> 65 35 36 46 48 29
> 66 59 53 37 22 19
> 67 41 41 43 30 33
> 68 31 52 37 27 40
> 69 17 51 52 35 31
> 70 34 30 50 47 36
> 71 46 40 47 29 17
> 72 10 46 36 47 39
> 73 46 37 45 15 30
> 74 30 34 43 46 18
> 75 13 51 50 25 31
> 76 49 50 38 23 9
> 77 18 32 31 45 40
> 78 8 42 48 26 40
> 79 23 38 36 48 15
> 80 30 24 43 33 25
> 81 3 9 51 47 40
> 82 7 51 43 17 22
> 83 15 40 43 23 18
> 84 15 38 39 28 17
> 85 5 30 44 36 18
> 86 12 30 32 35 21
> 87 5 26 15 20 20
> 88 0 40 21 9 14
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]]