I estimate two competing simple regression models, A and B, where the LHS is the same in both cases but the predictor is different (I handle the intercept issue based on other postings I have seen). I estimate the two models on a weekly basis over 24 weeks, so I end up with 24 RsquaredAs and 24 RsquaredBs, essentially two time series of Rsquareds. This doesn't necessarily have to be thought of as a time series problem, but is there a usual way, given the Rsquared data, to test

H0 : Rsquared B = Rsquared A  versus  H1 : Rsquared B > Rsquared A

so that I can map the 24 Rsquared numbers into one statistic? Maybe that's somehow equivalent to just running two big regressions over the whole 24 weeks and then calculating a statistic based on those regressions?

I broke things up into 24 weeks because I was thinking that the stability of the performance difference of the two models could be examined over time. Essentially these are simple time series regressions, X_t = B*X_{t-1} + epsilon, so I always need to consider whether any type of behavior is stable. But now I am thinking that, if I just want one overall number, then maybe I should be considering all the data simultaneously?

In a nutshell, I am looking for any suggestions on the best way to test whether Model B is better than Model A, where

Model A : X_t = Beta * X_{t-1} + epsilon
Model B : X_t = Betastar * Xstar_{t-1} + epsilonstar

Thanks for your help.
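For concreteness, the setup described above might look like the following minimal R sketch. The names x, xstar, and week are hypothetical placeholders: x holds the series being predicted, xstar the alternative predictor series, and week a 24-level factor marking the weekly blocks.

  # Hypothetical layout: x and xstar are numeric series of equal length,
  # week is a factor with 24 levels marking the weekly blocks.
  r2A <- r2B <- numeric(24)
  for (w in seq_len(24)) {
    idx <- which(week == levels(week)[w])
    idx <- idx[idx > 1]                    # drop the first obs, which has no lag
    fitA <- lm(x[idx] ~ x[idx - 1])        # Model A: X_t on X_{t-1}
    fitB <- lm(x[idx] ~ xstar[idx - 1])    # Model B: X_t on Xstar_{t-1}
    r2A[w] <- summary(fitA)$r.squared
    r2B[w] <- summary(fitB)$r.squared
  }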
On 9/13/2007 2:18 PM, Leeds, Mark (IED) wrote:
> [...] is there a usual way, given the Rsquared data, to test
>
> H0 : Rsquared B = Rsquared A versus H1 : Rsquared B > Rsquared A
>
> so that I can map the 24 Rsquared numbers into one statistic?

The question doesn't make sense if you're using standard notation. R^2 is a statistic, not a parameter, so one wouldn't test copies of it for equality. You can probably reframe the question in terms of E(R^2) so that the statement parses, but then it doesn't really make sense from a subject-matter point of view: unless model A is nested within model B, why would you ever expect the two fits to explain exactly the same amount of variation?

If model A really is a special case of model B, then you're back to the standard hypothesis testing situation, but repeated 24 times. There's a lot of literature on how to handle such multiple testing problems, depending on what sort of alternatives you want to detect. (E.g., do you think all 24 cases will be identical, or is it possible that 23 will match but one doesn't?)

Duncan Murdoch
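If the nested case applies, one standard way to handle the 24 repeated tests in R is to collect the per-week p-values and adjust them for multiplicity; a minimal sketch, where pvals stands in for the hypothetical vector of 24 p-values from the per-week nested-model (e.g. F) tests:

  # pvals: hypothetical vector of 24 per-week p-values from nested-model tests
  p.adjust(pvals, method = "holm")   # controls the familywise error rate
  p.adjust(pvals, method = "BH")     # controls the false discovery rate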
Is the data paired? That is, do you have an A and a B from week 1, then the same for each following week? If so, then you could probably do a simple sign test: within each week, check whether Rsquared B > Rsquared A. Under the null hypothesis that A and B are equivalent, the number of weeks in which B wins should be binomial with parameter 0.5. If you want something a little fancier, you could do some type of permutation test (of which the sign test is a special case).

Hope this helps,

Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
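In R, the sign test described above is a one-sided exact binomial test; a minimal sketch, assuming r2A and r2B are the two length-24 vectors of weekly Rsquareds from the setup sketch earlier in the thread:

  # Count the weeks in which model B beats model A, then test against p = 0.5.
  wins <- sum(r2B > r2A)
  binom.test(wins, n = 24, p = 0.5, alternative = "greater")

  # A simple sign-flip permutation version of the same idea:
  obs  <- mean(r2B - r2A)
  perm <- replicate(10000,
                    mean((r2B - r2A) * sample(c(-1, 1), 24, replace = TRUE)))
  mean(perm >= obs)   # one-sided permutation p-value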
Mark,

Estimates of correlation (r) values can be compared using Fisher's r-to-z transform. Perhaps this will do what you wish to do.

John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine
jsorkin at grecc.umaryland.edu
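Since each model here is a simple regression, its Rsquared is the square of a correlation, so Fisher's transform z = atanh(r) applies week by week. A hedged sketch for a single week follows; nA and nB are hypothetical per-week sample sizes, and the test assumes the two correlations come from independent samples, which is only an approximation here since both models share the same response:

  nA <- nB <- 100                            # hypothetical weekly sample sizes
  rA <- sqrt(r2A[1]); rB <- sqrt(r2B[1])     # slope signs ignored for brevity
  zA <- atanh(rA); zB <- atanh(rB)           # Fisher r-to-z transform
  se <- sqrt(1/(nA - 3) + 1/(nB - 3))        # approximate SE of zB - zA
  pnorm((zB - zA)/se, lower.tail = FALSE)    # one-sided p-value for H1: rB > rA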
I may be miles off base, but could this be treated as a random-effects model, with the regression predictors as random effects grouped by week? And if so, could each set form a single lme() model, allowing you to compare the models via AICs for 'quality' and anova() for significance of the difference...? (After reading Pinheiro and Bates, of course; and not all mixed-effects models can be compared directly, particularly when fitted using REML, if I read it correctly.)
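A hedged sketch of what that might look like with nlme, assuming a hypothetical long-format data frame dat with columns x (the response X_t), xlag and xstarlag (the two lagged predictors), and week (a 24-level factor); method = "ML" is used because likelihood-based comparisons of models with different fixed effects are not valid under the REML default:

  library(nlme)
  fitA <- lme(x ~ xlag,     random = ~ xlag     | week, data = dat, method = "ML")
  fitB <- lme(x ~ xstarlag, random = ~ xstarlag | week, data = dat, method = "ML")
  AIC(fitA, fitB)   # smaller AIC suggests the better fit; the two models are
                    # not nested, so a likelihood-ratio test would not apply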
Hello Mark,

In addition to, and complementing, the answers already provided: you may want to consider the J-test, too. For an outline and the pitfalls of this test, see:

http://citeseer.ist.psu.edu/cache/papers/cs/24954/http:zSzzSzwww.econ.queensu.cazSzfacultyzSzdavidsonzSzbj4-noam.pdf/bootstrap-j-tests-of.pdf

Best,
Bernhard
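For reference, the Davidson-MacKinnon J-test for non-nested regressions is available as jtest() in the lmtest package; a minimal sketch on the pooled sample, reusing the hypothetical data frame dat (columns x, xlag, xstarlag) from the earlier sketches:

  library(lmtest)
  # J-test comparing the two non-nested specifications on the pooled data.
  jtest(x ~ xlag, x ~ xstarlag, data = dat)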