Dear David,
Dear Rui,
Many thanks for your response. It works perfectly for the mean. Now I have a
problem with my R code for the median: I always get a coverage probability of
1 (100%), which is very strange. Indeed, an interval whose lower limit is the
smallest value in the sample and whose upper limit is the largest value has a
1/32 + 1/32 = 1/16 probability of non-coverage, so the confidence of such an
interval is 15/16 rather than 1 (100%). I therefore suspect that the confidence
interval I use for the median is not correctly defined for n = 5 observations
and likely contains all observations in the sample. What is wrong with my
R code?
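As a quick sanity check of the 15/16 figure, here is a minimal sketch (not part of the original code) that simulates the coverage of the [min, max] interval for samples of size 5. The names n, n_sim, true_median and covered are chosen for illustration only, and the samples are drawn directly from the gamma distribution, so the true median is taken from qgamma() rather than from a large simulated population.

```r
set.seed(1)
n <- 5          # sample size, as in the question
n_sim <- 20000  # number of simulated samples (illustrative choice)
true_median <- qgamma(0.5, shape = 2, rate = 5)

# for each sample, check whether (min, max) contains the true median
covered <- replicate(n_sim, {
  x <- rgamma(n, shape = 2, rate = 5)
  min(x) < true_median && true_median < max(x)
})

mean(covered)   # simulated coverage
1 - 2 * 0.5^n   # theoretical value: 1 - 1/16 = 0.9375
```

The interval misses the median only when all five observations fall on the same side of it, which is where the two 1/32 terms come from.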
########################################
library(boot)
s=rgamma(n=100000,shape=2,rate=5)
median(s)
N <- 100
out <- replicate(N, {
a<- sample(s,size=5)
median(a)
dat<-data.frame(a)
med<-function(d,i) {
temp<-d[i,]
median(temp)
}
  boot.out <- boot(data = dat, statistic = med, R = 10000)
  boot.ci(boot.out, type = "bca")$bca[, 4:5]
})
#coverage probability
median(out[1, ] < median(s) & median(s) < out[2, ])
########################################
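One detail in the last line may be worth checking: it summarises the logical vector of hits with median() rather than mean(), which was used in the code for the mean. The sketch below uses a made-up logical vector (hits is a hypothetical name, not taken from the code above) to show how the two summaries differ. Whether this fully accounts for the 100% figure is an assumption that should be verified against the real output.

```r
# hypothetical result: 90 intervals cover the true value, 10 do not
hits <- c(rep(TRUE, 90), rep(FALSE, 10))

mean(hits)    # proportion of TRUE values, i.e. the coverage
median(hits)  # middle of the sorted values, not a proportion
```

With more than half of the entries TRUE, median() reports 1 regardless of how many intervals actually miss, whereas mean() gives the proportion that cover.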
On Thursday, 23 December 2021, 14:10:36 UTC+1, Rui Barradas <ruipbarradas at
sapo.pt> wrote:
Hello,
The code is running very slowly because you are recreating the function
in the replicate() loop and because you are creating a data.frame also
in the loop.
It is also slow because the bootstrap statistic function med() computes
the variance of yet another resampling loop. This is probably statistically
wrong but, as David says, without a problem description it's hard to say.
Also, why compute variances if they are never used?
Here is complete code that runs in much less than 2 hours. Note that
it passes the vector a directly to med(), not a data.frame with just one
column.
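To illustrate the point about passing a vector instead of a one-column data.frame, here is a rough sketch that is not part of the original reply. It compares the two indexing styles on a single bootstrap index vector. The names dat, i and n_rep are illustrative, and the timings depend on the machine, so no figures are quoted.

```r
set.seed(2021)
a <- sample(178:798, 5)
dat <- data.frame(a)
i <- sample(5, replace = TRUE)  # one bootstrap index vector
n_rep <- 10000                  # repetitions for the rough timing

# both styles select the same resampled values
all(a[i] == dat[i, ])

# rough timing of each indexing style
system.time(for (k in seq_len(n_rep)) a[i])
system.time(for (k in seq_len(n_rep)) dat[i, ])
```

Since boot() calls the statistic once per resample, any per-call overhead of data.frame indexing is multiplied by R and then by N.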
library(boot)
set.seed(2021)
s <- sample(178:798, 100000, replace = TRUE)
mean(s)
med <- function(d, i) {
  temp <- d[i]
  f <- mean(temp)
  g <- var(temp)
  c(Mean = f, Var = g)
}
N <- 1000
out <- replicate(N, {
  a <- sample(s, size = 5)
  boot.out <- boot(data = a, statistic = med, R = 10000)
  boot.ci(boot.out, type = "stud")$stud[, 4:5]
})
mean(out[1, ] < mean(s) & mean(s) < out[2, ])
#[1] 0.952
Hope this helps,
Rui Barradas
At 11:45 on 19/12/21, varin sacha via R-help wrote:
> Dear R-experts,
>
> Below is my R code, which works but is really, really slow! It takes 2 hours
> on my computer to finally get an answer. Is there a way to improve my R code
> to speed it up? At least to save 1 hour ;=)
>
> Many thanks
>
> ########################################################
> library(boot)
>
> s<- sample(178:798, 100000, replace=TRUE)
> mean(s)
>
> N <- 1000
> out <- replicate(N, {
> a<- sample(s,size=5)
> mean(a)
> dat<-data.frame(a)
>
> med<-function(d,i) {
> temp<-d[i,]
> f<-mean(temp)
> g<-var(replicate(50,mean(sample(temp,replace=T))))
> return(c(f,g))
>
> }
>
>   boot.out <- boot(data = dat, statistic = med, R = 10000)
>   boot.ci(boot.out, type = "stud")$stud[, 4:5]
> })
> mean(out[1,] < mean(s) & mean(s) < out[2,])
> ########################################################