thr3ads.net - R help - [R] Bootstrap P-Value [Nov 2020]

AbouEl-Makarim Aboueissa

2020-Nov-06 16:43 UTC

[R] Bootstrap P-Value

*Dear All:*

*I am trying to compute the p-value of the bootstrap test; please see
below.*

*In example 1 the p-value agrees with the confidence interval.*
*BUT, in example 2  the p-value DOES NOT agree with the confidence
interval. In Example 2, the p-value should be zero or close to zero.*

*I am not sure what went wrong or whether I missed something.*

*Any help would be appreciated.*


*with many thanks*
*abou*



#####  Two - Sample Bootstrap

#####  Source:
http://www.ievbras.ru/ecostat/Kiril/R/Biblio_N/R_Eng/Chernick2011.pdf

#####  Example 1:
#####  ----------



set.seed(1)

n1 <- 29
n1
x1 <- rnorm(n1, 1.143, 0.164) #some random normal variates: mean1 = 1.143
x1

n2 <- 33
n2
x2 <- rnorm(n2, 1.175, 0.169) #2nd random sample: mean2 = 1.175
x2

obs.diff.theta <- mean(x1) - mean(x2)
obs.diff.theta

theta <- as.vector(NULL) #### vector to hold difference estimates

iterations <- 1000

for (i in seq_len(iterations)) {           #bootstrap resamples
 xx1 <- sample(x1, n1, replace = TRUE)
 xx2 <- sample(x2, n2, replace = TRUE)
 theta[i] <- mean(xx1) - mean(xx2)
 }



##### Confidence Interval:
##### --------------------


quantile(theta, probs = c(.025,0.975)) #Efron percentile CI on difference in means

#####       2.5%     97.5%
##### -0.1248539 0.0137601


##### P-Value
##### -------

p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)

#####  p.value <- (sum (theta >= obs.diff.theta) + 1)/ (iterations+1)

p.value



#### R OUTPUT

#### > quantile(theta, probs = c(.025,0.975))
####        2.5%       97.5%
#### -0.12647744  0.02099391

#### > p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)
#### > p.value
#### [1] 1

#####  Example 2:
#####  ----------


set.seed(5)

n1 <- 29
### n1
x1 <- rnorm(n1, 10.5, 0.15) ######   sample 1 with mean1 = 10.5
### x1

n2 <- 33
### n2
x2 <- rnorm(n2, 1.5, 0.155) #####  Sample 2 with mean2 = 1.5
### x2

obs.diff.theta <- mean(x1) - mean(x2)
obs.diff.theta

theta <- as.vector(NULL) #### vector to hold difference estimates

iterations <- 1000

#####   bootstrap resamples

for (i in seq_len(iterations)) {
 xx1 <- sample(x1, n1, replace = TRUE)
 xx2 <- sample(x2, n2, replace = TRUE)
 theta[i] <- mean(xx1) - mean(xx2)
 }



##### Confidence Interval:
##### --------------------


######  CI on difference in means

quantile(theta, probs = c(.025,0.975))



##### P-Value
##### -------

p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)

##### p.value <- (sum (theta >= obs.diff.theta) + 1)/ (iterations+1)

p.value

##### R OUTPUT

####   > ######  CI on difference in means
####   >
####   > quantile(theta, probs = c(.025,0.975))
####       2.5%    97.5%
####   8.908398 9.060601

####   > ##### P-Value
####   > p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)

####   > p.value
####   [1] 0.4835165

______________________


*AbouEl-Makarim Aboueissa, PhD*

*Professor, Statistics and Data Science*
*Graduate Coordinator*

*Department of Mathematics and Statistics*
*University of Southern Maine*

Greg Snow

2020-Nov-06 17:34 UTC

[R] Bootstrap P-Value

A p-value is for testing a specific null hypothesis, but you do not
state your null hypothesis anywhere.

It is the null value that needs to be subtracted from the bootstrap
differences, not the observed difference.  By subtracting the observed
difference you are setting up a situation where the p-value will always
be about 0.5 or about 1 (depending on one-tailed or two-tailed).  If
instead you subtract a null value (such as 0), then the p-values will
be closer to what you are expecting.
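
One possible reading of the advice above, offered as a minimal sketch and not
as the only valid method: compare the bootstrap differences with the null
value (0 here) and double the smaller tail proportion. The helper name
boot_p_null and its arguments are made up for illustration and are not part
of the thread; the data regenerate Example 2 from the original post. A
p-value built this way is small when the percentile interval excludes the
null value, so the two summaries should tell the same story.

```r
# Hypothetical helper (not from the thread): two-sided p-value obtained by
# locating the null value within the bootstrap distribution of the statistic.
boot_p_null <- function(theta, null = 0) {
  B <- length(theta)
  lower <- (sum(theta <= null) + 1) / (B + 1)  # share of resamples at or below the null
  upper <- (sum(theta >= null) + 1) / (B + 1)  # share of resamples at or above the null
  min(1, 2 * min(lower, upper))                # double the smaller tail, cap at 1
}

# Example 2 from the original post
set.seed(5)
n1 <- 29
x1 <- rnorm(n1, 10.5, 0.15)
n2 <- 33
x2 <- rnorm(n2, 1.5, 0.155)

iterations <- 1000
theta <- numeric(iterations)          # preallocated vector of bootstrap differences
for (i in seq_len(iterations)) {
  xx1 <- sample(x1, n1, replace = TRUE)
  xx2 <- sample(x2, n2, replace = TRUE)
  theta[i] <- mean(xx1) - mean(xx2)
}

ci <- quantile(theta, probs = c(0.025, 0.975))  # percentile interval
p.value <- boot_p_null(theta, null = 0)
ci
p.value
```

In Example 2 every bootstrap difference is close to 9, so no resample falls
at or below 0 and the p-value is 2/1001, about 0.002. The original expression
abs(theta) >= obs.diff.theta instead compares the resampled differences with
the observed difference they are centred on, which is why roughly half of
them pass in Example 2.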



-- 
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com

AbouEl-Makarim Aboueissa

2020-Nov-06 18:01 UTC

[R] Bootstrap P-Value

Dear Greg:

H0: Mean 1 - Mean 2 = 0
Ha: Mean 1 - Mean 2 != 0
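
With the hypotheses stated, one common way to get a bootstrap p-value is to
resample from data that satisfy H0. The sketch below shifts both samples to
the pooled mean, bootstraps the difference in means under that constraint,
and counts resamples at least as extreme as the observed difference. It is
an illustration under stated assumptions, not the method from the thread:
the function name boot_test_means and its arguments are hypothetical, and it
uses the plain mean difference where a studentized statistic could be used.

```r
# Hypothetical helper (not from the thread): bootstrap test of
# H0: mean1 - mean2 = 0 against Ha: mean1 - mean2 != 0.
boot_test_means <- function(x1, x2, B = 1000) {
  obs <- mean(x1) - mean(x2)
  pooled <- mean(c(x1, x2))
  # Shift each sample so both share the pooled mean; H0 then holds for the
  # shifted data while each sample keeps its own spread.
  z1 <- x1 - mean(x1) + pooled
  z2 <- x2 - mean(x2) + pooled
  # Bootstrap distribution of the mean difference when H0 is true
  null.diff <- replicate(B, {
    mean(sample(z1, replace = TRUE)) - mean(sample(z2, replace = TRUE))
  })
  # Two-sided: count null resamples at least as extreme as the observed difference
  p <- (sum(abs(null.diff) >= abs(obs)) + 1) / (B + 1)
  list(obs.diff = obs, p.value = p)
}

# Example 2 from the original post
set.seed(5)
x1 <- rnorm(29, 10.5, 0.15)
x2 <- rnorm(33, 1.5, 0.155)
res <- boot_test_means(x1, x2)
res$obs.diff
res$p.value
```

For Example 2 the observed difference is about 9 while the differences
resampled under H0 stay well inside (-1, 1), so none is as extreme and the
p-value is 1/1001, the smallest value 1000 resamples can give.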

with many thanks
abou
