thr3ads.net - R help - [R] bootstrapping in regression [Jan 2009]

Thomas Mang

2009-Jan-29 16:43 UTC

[R] bootstrapping in regression

Hi,

Please excuse me if my questions sound somewhat 'stupid' to the trained
and experienced statisticians among you. Also, I am not sure I used all 
terms correctly; if not, corrections are welcome.

I have asked myself the following question regarding bootstrapping in 
regression:
Say for whatever reason one does not want to take the p-values for 
regression coefficients from the established test-statistic 
distributions (t distribution for individual coefficients, F distribution 
for whole-model comparisons), but instead wants to apply a more robust 
approach by bootstrapping.

In the simple linear regression case, one possibility is to randomly 
rearrange the X/Y pairing (i.e. shuffle Y relative to X), estimate the 
model and take the beta1 coefficient. Do this many times, and so derive 
the null distribution for beta1. Finally, compare beta1 for the observed 
data against this null distribution.

What I now wonder is what the situation looks like in the multiple 
regression case. Assume there are two predictors, X1 and X2. Is it then 
possible to do the same, but rearranging only the values of one 
predictor (the one of interest) at a time? Say I again want to test 
beta1. Is it then valid to randomly rearrange the X1 data many times 
(keeping Y and X2 as observed), fit the model, take the beta1 
coefficient, and finally compare the beta1 of the observed data against 
the distribution of these beta1s?
For X2, do the same: randomly rearrange X2 each time while keeping Y 
and X1 as observed, etc.
Is this valid?

Second, if this is valid for the 'normal', fixed-effects-only 
regression, is it also valid to derive null distributions for the 
regression coefficients of the fixed effects in a mixed model this way? 
Or does the quite different parameter estimation procedure forbid 
this approach (forbid in the sense of producing bogus results)?

Thanks, Thomas

Chuck Cleland

2009-Jan-29 17:22 UTC

[R] bootstrapping in regression

  Have a look at the following document by John Fox:

http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-bootstrapping.pdf
-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

Tom Backer Johnsen

2009-Jan-29 21:56 UTC

[R] bootstrapping in regression

Tom Backer Johnsen wrote:
> There is a very basic difference between bootstrapping and random 
> permutations.  What you are suggesting is to shuffle values between 
> cases or rows in the frame.  That amounts to a variant of a permutation 
> test, not a bootstrap.
> 
> What you do in a bootstrap is different: you regard your sample as 
> a population and then sample from that population (with replacement), 
> normally by drawing a large number of random samples of the same size 
> as the original sample and doing the computations for whatever you are 
> interested in on each sample.
> 
> In other words, with bootstrapping, the pattern of values within each 
> case or row is unchanged, and you sample complete cases or rows.  With a 
> permutation test you keep the original sample of cases or rows, but 
> shuffle the observations on the same variable between cases or rows.
> 
> Have a look at the 'boot' package.
> 
> Tom
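
The distinction drawn above can be sketched in a few lines of base R. This is only an illustration on a made-up data frame (the names d, x and y are assumptions, not from the thread): a bootstrap resample draws whole rows with replacement, while a permutation keeps every row and shuffles one variable across rows.

```r
# Toy data frame; the column names x and y are made up for illustration.
set.seed(1)
d <- data.frame(x = 1:6, y = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2))
n <- nrow(d)

# Bootstrap: draw n row indices WITH replacement and keep each row intact,
# so every (x, y) pair in d_boot is a pair that occurs in d.
boot_idx <- sample(n, n, replace = TRUE)
d_boot   <- d[boot_idx, ]

# Permutation: keep all original cases, but shuffle one variable across rows.
# sample() without replacement on a vector is a random reordering.
d_perm   <- d
d_perm$y <- sample(d$y)
```

In d_boot some rows typically appear more than once and others not at all; in d_perm every original y value appears exactly once, just attached to a different x.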

-- 
+----------------------------------------------------------------+
| Tom Backer Johnsen, Psychometrics Unit,  Faculty of Psychology |
| University of Bergen, Christies gt. 12, N-5015 Bergen,  NORWAY |
| Tel : +47-5558-9185                        Fax : +47-5558-9879 |
| Email : backer at psych.uib.no    URL : http://www.galton.uib.no/ |
+----------------------------------------------------------------+

Greg Snow

2009-Jan-29 22:02 UTC

[R] bootstrapping in regression

What you are describing is actually a permutation test rather than a bootstrap
(the two are related concepts, with a subtle but important difference).

The way to do a permutation test with multiple x's is to fit the reduced
model (use all x's other than x1 if you want to test x1) on the original
data and store the fitted values and the residuals.

Then permute the residuals (randomize their order), add them back to the fitted
values, and fit the full model (including x1 this time) to the permuted data set.
Do this many times and it will give you the sampling distribution of the
slope on x1 (or whatever your set of interest is) under the null hypothesis that
it is 0 given the other variables in the model.
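
The procedure just described can be sketched in base R roughly as follows. This is a minimal sketch on simulated data, not a definitive implementation: the function name perm_test_x1, the column names y, x1 and x2, the default of 999 permutations, the two-sided comparison via absolute values, and the "+1" in the p-value (a common convention) are all assumptions added for illustration.

```r
# Simulated data (assumption): x1 is correlated with x2, and y does not
# depend on x1, so the null hypothesis for x1 is true here.
set.seed(42)
n   <- 100
x2  <- rnorm(n)
x1  <- 0.6 * x2 + rnorm(n)
dat <- data.frame(y = 1 + 0.5 * x2 + rnorm(n), x1 = x1, x2 = x2)

perm_test_x1 <- function(dat, n_perm = 999) {
  # Observed slope on x1 from the full model.
  obs <- coef(lm(y ~ x1 + x2, data = dat))[["x1"]]

  # Reduced model: everything except x1, fitted to the original data.
  reduced <- lm(y ~ x2, data = dat)
  fit <- fitted(reduced)
  res <- resid(reduced)

  null_slopes <- numeric(n_perm)
  for (i in seq_len(n_perm)) {
    dat_star   <- dat
    # Permuted residuals added back to the reduced-model fitted values.
    dat_star$y <- fit + sample(res)
    # Full model (including x1) fitted to the permuted response.
    null_slopes[i] <- coef(lm(y ~ x1 + x2, data = dat_star))[["x1"]]
  }

  # Two-sided p-value; the observed slope is counted as one permutation.
  p <- (sum(abs(null_slopes) >= abs(obs)) + 1) / (n_perm + 1)
  list(observed = obs, null = null_slopes, p_value = p)
}

out <- perm_test_x1(dat)
```

A call such as hist(out$null) followed by abline(v = out$observed) would show where the observed slope falls in the permutation distribution.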

Permuting just x1 only works if x1 is orthogonal to all the other predictors;
otherwise the permuting destroys the relationship with the other predictors and
does not do the test you want.
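
A quick way to see the problem, using simulated predictors (the coefficients and sample size below are arbitrary assumptions): shuffling x1 on its own wipes out its correlation with x2, so the permuted data sets no longer resemble the design you actually observed.

```r
set.seed(7)
n  <- 500
x2 <- rnorm(n)
x1 <- 0.8 * x2 + rnorm(n, sd = 0.6)   # x1 strongly correlated with x2

cor_before <- cor(x1, x2)             # correlation in the observed design
cor_after  <- cor(sample(x1), x2)     # correlation after permuting x1 alone
```

Here cor_before is large by construction, while cor_after should be close to zero apart from sampling noise.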

Bootstrapping depends on sampling with replacement, not permuting, and is used
more for confidence intervals than for tests (the reference by John Fox given to
you in another reply can help if that is the approach you want to take).
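
For the confidence-interval use, a case-resampling bootstrap can be sketched in base R as below. The simulated data, the 1000 resamples, and the choice of a simple percentile interval are assumptions made for illustration; the boot package (functions boot() and boot.ci()) offers this and more refined interval types.

```r
# Simulated data (assumption): the true slope on x1 is 2.
set.seed(123)
n   <- 80
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 1 * dat$x2 + rnorm(n)

n_boot      <- 1000
boot_slopes <- numeric(n_boot)
for (b in seq_len(n_boot)) {
  # Resample whole cases (rows) with replacement, same size as the original.
  idx <- sample(nrow(dat), replace = TRUE)
  boot_slopes[b] <- coef(lm(y ~ x1 + x2, data = dat[idx, ]))[["x1"]]
}

# Percentile interval: the 2.5% and 97.5% quantiles of the bootstrap slopes.
ci <- quantile(boot_slopes, c(0.025, 0.975))
```

Note that nothing here imposes a null hypothesis: the bootstrap slopes are centred near the observed estimate, which is why this approach lends itself to intervals rather than directly to tests.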

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
