I estimate two competing simple regression models, A and B, where the LHS is the same in both cases but the predictor is different (I handle the intercept issue based on other postings I have seen). I estimate the two models on a weekly basis over 24 weeks, so I end up with 24 RsquaredAs and 24 RsquaredBs, essentially two time series of Rsquareds. This doesn't necessarily have to be thought of as a time series problem, but is there a usual way, given the Rsquared data, to test

H0 : Rsquared B = Rsquared A  versus  H1 : Rsquared B > Rsquared A

so that I can map the 24 Rsquared numbers into one statistic? Maybe that's somehow equivalent to just running two big regressions over the whole 24 weeks and then calculating a statistic based on those regressions?

I broke things up into 24 weeks because I was thinking that the stability of the performance difference of the two models could be examined over time. Essentially these are simple time series regressions, X_t = B*X_{t-1} + epsilon, so I always need to consider whether any type of behavior is stable. But now I am thinking that, if I just want one overall number, then maybe I should be considering all the data simultaneously?

In a nutshell, I am looking for any suggestions on the best way to test whether Model B is better than Model A, where

Model A : X_t = Beta * X_{t-1} + epsilon
Model B : X_t = Betastar * Xstar_{t-1} + epsilonstar

Thanks for your help.
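For concreteness, the setup described above might look like the following minimal R sketch. The names x, xstar, and week are hypothetical placeholders: x holds the series being predicted, xstar the alternative predictor series, and week a 24-level factor marking the weekly blocks.

  # Hypothetical layout: x and xstar are numeric series of equal length,
  # week is a factor with 24 levels marking the weekly blocks.
  r2A <- r2B <- numeric(24)
  for (w in seq_len(24)) {
    idx <- which(week == levels(week)[w])
    idx <- idx[idx > 1]                    # drop the first obs, which has no lag
    fitA <- lm(x[idx] ~ x[idx - 1])        # Model A: X_t on X_{t-1}
    fitB <- lm(x[idx] ~ xstar[idx - 1])    # Model B: X_t on Xstar_{t-1}
    r2A[w] <- summary(fitA)$r.squared
    r2B[w] <- summary(fitB)$r.squared
  }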
On 9/13/2007 2:18 PM, Leeds, Mark (IED) wrote:
> [...] is there a usual way, given the Rsquared data, to test
>
> H0 : Rsquared B = Rsquared A versus H1 : Rsquared B > Rsquared A
>
> so that I can map the 24 Rsquared numbers into one statistic?

The question doesn't make sense if you're using standard notation. R^2 is a statistic, not a parameter, so one wouldn't test copies of it for equality. You can probably reframe the question in terms of E(R^2) so that the statement parses, but then it doesn't really make sense from a subject-matter point of view: unless model A is nested within model B, why would you ever expect the two fits to explain exactly the same amount of variation?

If model A really is a special case of model B, then you're back to the standard hypothesis testing situation, but repeated 24 times. There's a lot of literature on how to handle such multiple testing problems, depending on what sort of alternatives you want to detect. (E.g., do you think all 24 cases will be identical, or is it possible that 23 will match but one doesn't?)

Duncan Murdoch
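If the nested case applies, one standard way to handle the 24 repeated tests in R is to collect the per-week p-values and adjust them for multiplicity; a minimal sketch, where pvals stands in for the hypothetical vector of 24 p-values from the per-week nested-model (e.g. F) tests:

  # pvals: hypothetical vector of 24 per-week p-values from nested-model tests
  p.adjust(pvals, method = "holm")   # controls the familywise error rate
  p.adjust(pvals, method = "BH")     # controls the false discovery rate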
Is the data paired? That is, do you have an A and a B from week 1, then the same for each following week? If so, then you could probably do a simple sign test: within each week, check whether Rsquared B > Rsquared A. Under the null hypothesis that A and B are equivalent, the number of weeks in which B wins should be binomial with parameter 0.5. If you want something a little fancier, you could do some type of permutation test (of which the sign test is a special case).

Hope this helps,

Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
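In R, the sign test described above is a one-sided exact binomial test; a minimal sketch, assuming r2A and r2B are the two length-24 vectors of weekly Rsquareds from the setup sketch earlier in the thread:

  # Count the weeks in which model B beats model A, then test against p = 0.5.
  wins <- sum(r2B > r2A)
  binom.test(wins, n = 24, p = 0.5, alternative = "greater")

  # A simple sign-flip permutation version of the same idea:
  obs  <- mean(r2B - r2A)
  perm <- replicate(10000,
                    mean((r2B - r2A) * sample(c(-1, 1), 24, replace = TRUE)))
  mean(perm >= obs)   # one-sided permutation p-value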
Mark,

Estimates of correlation (r) values can be compared using Fisher's r-to-z transform. Perhaps this will do what you wish to do.

John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine
jsorkin at grecc.umaryland.edu
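Since each model here is a simple regression, its Rsquared is the square of a correlation, so Fisher's transform z = atanh(r) applies week by week. A hedged sketch for a single week follows; nA and nB are hypothetical per-week sample sizes, and the test assumes the two correlations come from independent samples, which is only an approximation here since both models share the same response:

  nA <- nB <- 100                            # hypothetical weekly sample sizes
  rA <- sqrt(r2A[1]); rB <- sqrt(r2B[1])     # slope signs ignored for brevity
  zA <- atanh(rA); zB <- atanh(rB)           # Fisher r-to-z transform
  se <- sqrt(1/(nA - 3) + 1/(nB - 3))        # approximate SE of zB - zA
  pnorm((zB - zA)/se, lower.tail = FALSE)    # one-sided p-value for H1: rB > rA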
I may be miles off base, but could this be treated as a random-effects model, with the regression predictors as random effects grouped by week? And if so, could each set form a single lme() model, allowing you to compare the models via AICs for 'quality' and anova() for significance of the difference...? (After reading Pinheiro and Bates, of course; and not all mixed-effects models can be compared directly, particularly when fitted using REML, if I read it correctly.)
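A hedged sketch of what that might look like with nlme, assuming a hypothetical long-format data frame dat with columns x (the response X_t), xlag and xstarlag (the two lagged predictors), and week (a 24-level factor); method = "ML" is used because likelihood-based comparisons of models with different fixed effects are not valid under the REML default:

  library(nlme)
  fitA <- lme(x ~ xlag,     random = ~ xlag     | week, data = dat, method = "ML")
  fitB <- lme(x ~ xstarlag, random = ~ xstarlag | week, data = dat, method = "ML")
  AIC(fitA, fitB)   # smaller AIC suggests the better fit; the two models are
                    # not nested, so a likelihood-ratio test would not apply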
Hello Mark,

In addition to, and complementing, the answers already provided: you may want to consider the J-test, too. For an outline and the pitfalls of this test, see:

http://citeseer.ist.psu.edu/cache/papers/cs/24954/http:zSzzSzwww.econ.queensu.cazSzfacultyzSzdavidsonzSzbj4-noam.pdf/bootstrap-j-tests-of.pdf

Best,
Bernhard
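For reference, the Davidson-MacKinnon J-test for non-nested regressions is available as jtest() in the lmtest package; a minimal sketch on the pooled sample, reusing the hypothetical data frame dat (columns x, xlag, xstarlag) from the earlier sketches:

  library(lmtest)
  # J-test comparing the two non-nested specifications on the pooled data.
  jtest(x ~ xlag, x ~ xstarlag, data = dat)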