thr3ads.net - R help - [R] Comparison of two time series using R [Jul 2002]

Tim Churches

2002-Jul-30 00:08 UTC

[R] Comparison of two time series using R

We have two time series: the first is a series of weekly counts of 
isolates of RSV (respiratory syncytial virus) by pathology laboratories, 
and the second is a series of weekly counts of cases of bronchiolitis in 
young children presenting to hospital emergency departments. 
Bronchiolitis in young children is usually caused by RSV infection, and 
simple visual inspection reveals a very close correspondence between the 
two series, both of which show strong seasonality and also corresponding 
variation from year to year.

My question is how to approach the analysis of these data using R. Here 
is what we have done so far (guided by Diggle and MASS):

1) Create two time-series (ts) objects from the data, making sure the 
corresponding observations in the two ts are in fact contemporaneous.
2) Decompose each ts into seasonal, trend and remainder components using 
stl() and decompose().
3) Examine the cross-correlogram for the raw ts and the decomposed 
components using ccf() - this revealed that bronchiolitis cases were 
maximally cross-correlated with RSV isolates at a 3 week lag.
4) Examine periodograms of the raw ts and the pre-whitened data (the 
remainders) - most of the energy is in the week-to-week variation.
5) Calculate the cross-correlation between the remainders of the two 
series using a 3 week lag - it is about 0.55.
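
A minimal sketch of steps 1 to 5 on simulated data may make the workflow concrete. Everything here is an assumption for illustration: the simulated counts, the 3-week delay built into them, and the variable names (rsv, bronch, rsv_rem and so on) are not the real data or results.

```r
# Simulated weekly counts; every name and number below is an illustrative assumption.
set.seed(1)
n <- 52 * 5                                  # five years of weekly data
week <- seq_len(n)
season <- 1 + cos(2 * pi * week / 52)        # shared annual cycle
rsv <- rpois(n, lambda = 5 + 20 * season)
# Bronchiolitis is made to follow RSV three weeks later (true by construction here).
rsv_lag3 <- c(rep(rsv[1], 3), rsv[1:(n - 3)])
bronch <- rpois(n, lambda = 3 + 0.8 * rsv_lag3)

# 1) Two contemporaneous weekly ts objects
rsv_ts <- ts(rsv, frequency = 52)
bronch_ts <- ts(bronch, frequency = 52)

# 2) Seasonal / trend / remainder decomposition
rsv_rem <- stl(rsv_ts, s.window = "periodic")$time.series[, "remainder"]
bronch_rem <- stl(bronch_ts, s.window = "periodic")$time.series[, "remainder"]

# 3) and 5) Cross-correlogram of the remainders; plain vectors keep lags in weeks.
cc <- ccf(as.numeric(rsv_rem), as.numeric(bronch_rem), lag.max = 10, plot = FALSE)
best <- which.max(cc$acf)
best_lag <- cc$lag[best]     # a negative lag means the first series (RSV) leads
best_ccf <- cc$acf[best]

# 4) Raw periodogram of a remainder series
pg <- spec.pgram(rsv_rem, taper = 0.1, plot = FALSE)
```

Note that ccf(x, y) reports the correlation between x[t+k] and y[t], so a lead of RSV over bronchiolitis shows up at a negative lag when RSV is passed first.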

OK as far as it goes, but these results only obliquely shed light on the 
question we want to answer: "Can lab RSV isolate counts be used to 
predict the hospital bronchiolitis case-load a few weeks hence, and if 
so, how reliably?"

Stephen Morrell [Morrell S. Times Series (Box-Jenkins) Analysis. In: 
Kerr C, Taylor R, Heard G. Handbook of Public Health Methods, McGraw 
Hill, Sydney, 1998.] suggests the following approach (direct quote 
observing fair-use copyright provisions follows):

"In the first stage of analysis, the outcome and predictor series are 
pre-analysed to identify the form of the transfer function. In the 
second stage the transfer function is identified and its residuals 
computed. Finally, an ARIMA model is fitted to the residuals to assess 
the adequacy of the overall model. A ratio of U- and S-polynomials, 
U(B)/S(B), called impulse weights, is used to specify the effect of a 
unit change in the predictor series on the outcome series. These weights 
are initially estimated by a cross-correlation function (CCF), which 
assess the relationship between the de-trended predictor series on the 
de-trended outcome series (with autocorrelation influences removed, 
called prewhitening)."
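
The prewhitening step in that quote could be sketched roughly as follows. This covers only the prewhitened CCF, not a full transfer-function fit, and it rests on assumptions: the data are simulated, the predictor is taken to be AR(1), and the names x, y, phi and prewhiten() are hypothetical.

```r
# Simulated predictor (x) and outcome (y); y responds to x three steps later.
set.seed(2)
n <- 300
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
y <- 0.6 * c(rep(0, 3), x[1:(n - 3)]) + rnorm(n, sd = 0.5)

# Fit an AR(1) model to the predictor (the order is an assumption here).
fit_x <- arima(x, order = c(1, 0, 0))
phi <- unname(coef(fit_x)["ar1"])

# Apply the same filter (1 - phi * B) to both series; drop the leading NA.
prewhiten <- function(z, phi) {
  z <- z - mean(z)
  as.numeric(stats::filter(z, c(1, -phi), method = "convolution", sides = 1))[-1]
}
x_pw <- prewhiten(x, phi)
y_pw <- prewhiten(y, phi)

# CCF of the prewhitened series; a negative lag means x leads y.
cc <- ccf(x_pw, y_pw, lag.max = 10, plot = FALSE)
best_lag <- cc$lag[which.max(abs(cc$acf))]
```

Filtering both series with the model fitted to the predictor removes the predictor's autocorrelation, so spikes in the CCF are easier to read as lagged effects rather than shared serial dependence.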

Is this a reasonable approach to our question? Hints on how to proceed 
are most welcome, and/or references to papers or texts which might 
render us a bit less clueless wrt this problem.

Regards,

Tim C



-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Thomas Lumley

2002-Jul-30 15:11 UTC

[R] Comparison of two time series using R

On Tue, 30 Jul 2002, Tim Churches wrote:
> We have two time series: the first is a series of weekly counts of
> isolates of RSV (respiratory syncytial virus) by pathology laboratories,
> and the second is a series of weekly counts of cases of bronchiolitis in
> young children presenting to hospital emergency departments.
> Bronchiolitis in young children is usually caused by RSV infection, and
> simple visual inspection reveals a very close correspondence between the
> two series, both of which show strong seasonality and also corresponding
> variation from year to year.
>
<snip>
> Is this a reasonable approach to our question? Hints on how to proceed
> are most welcome, and/or references to papers or texts which might
> render us a bit less clueless wrt this problem.
This is similar to the air pollution epidemiology question of whether, e.g.,
particulate air pollution causes myocardial infarction. There are very
strong seasonal patterns, year-to-year variation, and weather effects, and a
very small (but non-zero) residual association.

I would fit a log-linear model with spline terms to remove the seasonal
effects, e.g.

library(splines)  # provides ns()
model <- glm(bronchiolitis ~ RSV + ns(week, df) + temperature, family = quasipoisson)

where df is chosen to remove seasonal-scale variability (e.g. 4-6 df per year
of data).

If there is substantial autocorrelation in the Pearson residuals

   acf(resid(model,"pearson"))

then you need some other sort of standard error calculation.  One method
(which was my PhD dissertation) is available for R at
   http://faculty.washington.edu/tlumley/weave.html

However, in air pollution epi there usually isn't much autocorrelation
after season and weather are removed.
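
A self-contained sketch of the model and the residual check, run on simulated data, might look like this. The simulated series, the choice of 5 df per year, and the names dat and df_season are assumptions for illustration only.

```r
library(splines)   # provides ns()

# Simulated weekly data; all values are illustrative assumptions.
set.seed(3)
years <- 5
n <- 52 * years
week <- seq_len(n)
temperature <- 15 + 8 * cos(2 * pi * (week - 26) / 52) + rnorm(n)
RSV <- rpois(n, lambda = 5 + 20 * (1 + cos(2 * pi * week / 52)))
bronchiolitis <- rpois(n, lambda = exp(2.5 + 0.02 * RSV))
dat <- data.frame(week, temperature, RSV, bronchiolitis)

# About 5 df per year of data for the seasonal spline (within the 4-6 range).
df_season <- 5 * years
model <- glm(bronchiolitis ~ RSV + ns(week, df = df_season) + temperature,
             family = quasipoisson, data = dat)

# Estimated log rate ratio per extra RSV isolate, with its standard error.
rsv_coef <- coef(summary(model))["RSV", ]

# Autocorrelation of the Pearson residuals (computed here without plotting).
res_acf <- acf(resid(model, "pearson"), lag.max = 20, plot = FALSE)
```

To look at prediction a few weeks ahead, a lagged copy of RSV could be used in place of the contemporaneous term; that variation is not shown here.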



This will tell you how well RSV prevalence predicts bronchiolitis over and
above seasonal variation.


You might also try adding a sine/cosine term to the model to represent the
predictable part of seasonal variation.  This would allow you to say how
much of the seasonal variation is stable from year to year, which would be
useful in assessing prediction.
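
One way to sketch the sine/cosine idea is shown below. For brevity it fits only the harmonic terms, leaving out RSV and the spline, and it uses simulated data; the deviance-based summary and the names s1, c1 and fit_harm are assumptions, not a prescribed method.

```r
# Simulated weekly counts with a stable annual cycle plus year-level shifts.
set.seed(4)
years <- 5
n <- 52 * years
week <- seq_len(n)
year_effect <- rnorm(years, sd = 0.15)[ceiling(week / 52)]
bronchiolitis <- rpois(n, lambda = exp(3 + 0.5 * cos(2 * pi * week / 52) + year_effect))

# One annual harmonic: the part of the season that repeats every year.
s1 <- sin(2 * pi * week / 52)
c1 <- cos(2 * pi * week / 52)
fit_harm <- glm(bronchiolitis ~ s1 + c1, family = quasipoisson)

# Amplitude of the annual cycle on the log scale.
amplitude <- unname(sqrt(coef(fit_harm)["s1"]^2 + coef(fit_harm)["c1"]^2))

# Share of the null deviance accounted for by the stable annual cycle.
dev_explained <- 1 - deviance(fit_harm) / fit_harm$null.deviance
```

Comparing this share with the deviance explained once the spline terms are added gives a rough sense of how much seasonal variation is not repeatable from year to year.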


	-thomas




