Eric Jaeger
2013-Jul-18 09:50 UTC
[R] Test if 2 samples differ if they have autocorrelation
Dear all,

I have a question that I am struggling to find an answer to.

Let's assume I have two time series of daily PnL data over 2 years, coming from two different trading strategies. I want to find out whether strategy A is better than strategy B. The problem is that the two series are serially correlated, so I cannot just do a simple t-test.

I tried something like this:

1. Create the cumulative time series of PnL_A and PnL_B, called C_A and C_B.
2. Take the difference of the two: DiffPnL = C_A - C_B (to see how the difference evolves over time).
3. Run a regression: DiffPnL = beta * time + error (I thought that if beta is significantly different from 0, then the two time series are different).
4. Estimate beta not with plain OLS but with the Newey-West method (HAC estimator), which corrects the standard errors and test statistics for beta for heteroskedasticity and autocorrelation.

BUT: I read that these tests are biased when the time series are unit-root non-stationary (which mine are, because I take cumulative time series).

I am lost! This should be fairly simple: how do I test whether two samples differ when they are autocorrelated? Probably my approach above is completely wrong...

Thanks for your help.

Best regards,
Eric
Rolf Turner
2013-Jul-18 21:51 UTC
[R] Test if 2 samples differ if they have autocorrelation
I imagine that most readers of this list will put your question in the too-hard basket. That being so, here is my inexpert take on the question.

The issue is to estimate the uncertainty in the estimated difference of the means. This uncertainty depends on the nature of the serial dependence of the series. Therefore, in order to get anywhere, you need to *model* this dependence. Different models could yield very different values for the variance of the estimated difference of the means.

If the series are observed at the same times, I would suggest taking the pointwise difference of the two series: D_t = X_t - Y_t, say. Fit the best arima model that you can to D_t. Then the standard error of what is incorrectly labelled "intercept" (it is actually the estimate of the series *mean*) is the appropriate estimate of the uncertainty. The ratio of the "intercept" value to its standard error is the test statistic you are looking for.

If the series are *not* observed at the same times but can be assumed to be independent, then model *each* series as well as you can (different models for each series) and obtain the standard error of the "intercept" for each series. Your test statistic is then the difference of the "intercept" estimates divided by sqrt(se_X^2 + se_Y^2), in what I hope is an "obvious" notation.

If the series are not observed at the same times and cannot be assumed to be independent, then you probably haven't got sufficient information to answer the question that you wish to answer.

I hope that there is some value in the foregoing.

cheers,

Rolf Turner
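Rolf Turner's two suggestions (model the pointwise difference when the series share observation times, or model each series separately when they are independent) might be sketched in R roughly as follows. This is only an illustration under stated assumptions: the data are simulated stand-ins for the daily PnL series, the AR(1) order is a placeholder for "the best arima model that you can" fit, the helper name mean_and_se is made up for this example, and the p-value relies on a normal approximation that the thread itself does not mention.

```r
# Sketch only: simulated series stand in for two daily (non-cumulative)
# PnL series. All names and parameter values here are hypothetical.
set.seed(1)
n <- 500
pnl_a <- 0.05 + arima.sim(model = list(ar = 0.4), n = n)
pnl_b <- arima.sim(model = list(ar = 0.4), n = n)

# Hypothetical helper: fit an ARIMA model with a mean term and return the
# estimated series mean (which arima() labels "intercept") together with
# its standard error. The default order is an assumption; in practice it
# would be chosen by model comparison (e.g. AIC) and residual checks.
mean_and_se <- function(x, order = c(1, 0, 0)) {
  fit <- arima(x, order = order)          # include.mean = TRUE by default
  est <- coef(fit)[["intercept"]]
  se  <- sqrt(diag(vcov(fit)))[["intercept"]]
  c(mean = est, se = se)
}

# Case 1: series observed at the same times.
# Model the pointwise difference D_t = X_t - Y_t.
d <- pnl_a - pnl_b
res_d <- mean_and_se(d)
z_paired <- res_d[["mean"]] / res_d[["se"]]
p_paired <- 2 * pnorm(-abs(z_paired))     # normal approximation (assumption)

# Case 2: series assumed independent.
# Model each series separately and combine the two standard errors.
res_a <- mean_and_se(pnl_a)
res_b <- mean_and_se(pnl_b)
z_indep <- (res_a[["mean"]] - res_b[["mean"]]) /
  sqrt(res_a[["se"]]^2 + res_b[["se"]]^2)

print(c(z_paired = z_paired, p_paired = p_paired, z_indep = z_indep))
```

Note that the models are fitted to the daily PnL values rather than to the cumulative series, so the unit-root concern raised in the original question does not arise from cumulating. Whether the fitted models are adequate for real PnL data is a separate question that this sketch does not address.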