Dear Users, please help with the following DF test:

====
library(tseries)
library(timeSeries)

Y=c(3519,3803,4332,4251,4661,4811,4448,4451,4343,4067,4001,3934,3652,3768
,4082,4101,4628,4898,4476,4728,4458,4004,4095,4056,3641,3966,4417,4367
,4821,5190,4638,4904,4528,4383,4339,4327,3856,4072,4563,4561,4984,5316
,4843,5383,4889,4681,4466,4463,4217,4322,4779,4988,5383,5591,5322,5404
,5106,4871,4977,4706,4193,4460,4956,5022,5408,5565,5360,5490,5286,5257
,5002,4897,4577,4764,5052,5251,5558,5931,5476,5603,5425,5177,4792,4776
,4450,4659,5043,5233,5423,5814,5339,5474,5278,5184,4975,4751,4600,4718
,5218,5336,5665,5900,5330,5626,5512,5293,5143,4842,4627,4981,5321,5290
,6002,5811,5671,6102,5482,5429,5356,5167,4608,4889,5352,5441,5970,5750
,5670,5860,5449,5401,5240,5229,4770,5006,5518,5576,6160,6121,5900,5994
,5841,5832,5505,5573,5331,5355,6057,6055,6771,6669,6375,6666,6383,6118
,5927,5750,5122,5398,5817,6163,6763,6835,6678,6821,6421,6338,6265,6291
,5540,5822,6318,6268,7270,7096,6505,7039,6440,6446,6717,6320)

YY=as.timeSeries(Y)

adf.test(Y)
adf.test(YY)

======== Output ===

> adf.test(Y)

        Augmented Dickey-Fuller Test

data:  Y
Dickey-Fuller = -6.1661, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

Warning message:
In adf.test(Y) : p-value smaller than printed p-value

> adf.test(YY)

        Augmented Dickey-Fuller Test

data:  YY
Dickey-Fuller = 12.4944, Lag order = 5, p-value = 0.99
alternative hypothesis: stationary

Warning message:
In adf.test(YY) : p-value greater than printed p-value

=========================================
Question: Why are the two results different?

The help file says that the input series is either a numeric vector or a
time series object, but the results are completely opposite depending on
which type of argument is used. Thanks in advance.

--
View this message in context: http://r.789695.n4.nabble.com/Dickey-Fuller-Test-tp3018408p3018408.html
Sent from the R help mailing list archive at Nabble.com.
Hi:

Since the series is obviously nonstationary and periodic, it would seem
that one should embrace a wider window over which to evaluate the DF test.
I did the following:

y <- ts(Y, frequency = 12)   # looked like a monthly series with an annual cycle to me
plot(stl(y, 'periodic'))     # very informative!!
adf.test(y)

        Augmented Dickey-Fuller Test

data:  y
Dickey-Fuller = -6.1661, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

# Try a wider window of 12 lags:
adf.test(y, k = 12)

        Augmented Dickey-Fuller Test

data:  y
Dickey-Fuller = -1.061, Lag order = 12, p-value = 0.9261
alternative hypothesis: stationary

This certainly made me wonder, and since I don't know much about unit root
testing, it was time to play. What should k be? I decided to plot the value
of the ADF statistic and its p-value for lags k from 5 to 40 just to see
what would happen.

df <- data.frame(lag = 5:40,
                 ADF.statistic = sapply(5:40, function(x) adf.test(y, k = x)$statistic),
                 ADF.pvalue = sapply(5:40, function(x) adf.test(y, k = x)$p.value))

library(ggplot2)
library(reshape2)   # provides melt()
dfm <- melt(df, id.vars = 'lag')
# one could use geom_point() in place of geom_path()
ggplot(dfm, aes(x = lag, y = value)) +
    geom_path(size = 1) +
    facet_grid(variable ~ ., scales = 'free_y') +
    xlab('Lag number') + ylab("")

This shows pretty clearly that after about eight or nine lags, the ADF test
fails to reject the null hypothesis of a unit root. The appearance of
periodicity in both graphs indicates that the seasonality of the series has
some impact on the value of the augmented DF statistic. Since I'm not up on
the literature regarding this test, is it meant to be applied in the
presence of seasonality? If not, I would suggest deseasonalizing the series
before applying the unit root test.

I tried a few models, and the best-fitting one had differencing in both the
non-seasonal and seasonal parts. This isn't really supported by the stl()
graph, but there is an improvement in the fit if the seasonal part of the
series is differenced, too.
So if there is nonstationarity in both the non-seasonal and seasonal parts
of the series, is the ADF test sensitive enough to pick up on this, and does
it still do the right thing? From my limited web searching, I didn't find
any adjustments in the presence of seasonality, stationary or not, which
makes me wonder...

HTH,
Dennis

On Thu, Oct 28, 2010 at 8:47 PM, Cuckovic Paik <cuckovic.paik@gmail.com> wrote:
>
> Dear Users, please help with the following DF test:
> [...]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
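The "Lag order = 5" in the outputs above follows from the default in the adf.test() signature, k = trunc((length(x) - 1)^(1/3)), which gives 5 for 180 observations. The suggestion to deseasonalize before testing can be sketched as follows. This is only an illustration under stated assumptions: the series is simulated (it is not the posted data), seasonal differencing at lag 12 is just one possible way to deseasonalize, and the lag sweep runs only if the tseries package is installed.

```r
# Simulated monthly series with trend and seasonality. This is an assumption
# for illustration only; it is NOT the data from the original post.
set.seed(42)
n <- 180
t_idx <- seq_len(n)
y <- ts(4000 + 12 * t_idx + 500 * sin(2 * pi * t_idx / 12) + rnorm(n, sd = 150),
        frequency = 12)

# Default lag order in the adf.test() signature: trunc((n - 1)^(1/3)).
default_k <- trunc((length(y) - 1)^(1/3))

# Seasonal differencing at lag 12; the result is one full period shorter.
y_sdiff <- diff(y, lag = 12)

# Sweep over lag orders, fitting each test once and reusing the result for
# both the statistic and the p-value. Skipped if tseries is not installed.
if (requireNamespace("tseries", quietly = TRUE)) {
  lags <- 5:20
  sweep <- t(sapply(lags, function(k) {
    fit <- suppressWarnings(tseries::adf.test(y_sdiff, k = k))
    c(lag = k,
      statistic = unname(fit$statistic),
      p.value = unname(fit$p.value))
  }))
  print(round(sweep, 3))
}
```

Whether seasonal differencing is the appropriate treatment for a given series is a modelling decision; the sketch only shows the mechanics of testing the transformed series over a range of lag orders.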
Dear Cuckovic,

Although you have already received an answer that deals more with the time
series characteristics of your data, I will take up your initial question.
Basically, you got trapped by the phrase 'time series' in the documentation
for adf.test(). What is meant is an object of the informal class ts; hence:

YYY <- as.ts(Y)
adf.test(Y)
adf.test(YYY)

yield the same result. Now, what happens if an object of the formal class
timeSeries is passed? Well, have a look at adf.test directly:

adf.test

Here you will see that the series is differenced, but this operation is
applied differently for numeric/ts objects than for timeSeries objects;
check:

showMethods(diff)

and/or

diff(Y)
diff(YY)
diff(YYY)

Now, to rectify your results, use:

adf.test(series(YY))

instead. Here only the data part of your timeSeries object is extracted,
and hence the same method for diff() is used as in the case of numeric/ts
objects.

Best,
Bernhard

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Cuckovic Paik
Sent: Friday, 29 October 2010 05:48
To: r-help at r-project.org
Subject: [R] Dickey Fuller Test

Dear Users, please help with the following DF test:
[...]
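Bernhard's point about diff() being dispatched differently can be checked with a short sketch. It uses only the first six values of the posted series to stay brief. The base R part runs anywhere; the timeSeries part is guarded and runs only if that package is installed. No output is asserted for the timeSeries method here, since its behaviour (for example, whether it pads or trims the first observation) should be confirmed from the package documentation for the installed version.

```r
# First six values of the posted series, enough to see how diff() behaves.
Y <- c(3519, 3803, 4332, 4251, 4661, 4811)
YYY <- as.ts(Y)

# For numeric vectors and ts objects, base diff() drops one observation.
n_num <- length(diff(Y))
n_ts <- length(diff(YYY))
cat("length(diff(Y)):  ", n_num, "\n")
cat("length(diff(YYY)):", n_ts, "\n")

# For a timeSeries object, a different diff() method is used. Compare the
# number of rows it returns with the result on the extracted data part.
if (requireNamespace("timeSeries", quietly = TRUE)) {
  suppressPackageStartupMessages(library(timeSeries))
  YY <- as.timeSeries(Y)
  cat("nrow(diff(YY)):        ", nrow(diff(YY)), "\n")
  cat("nrow(diff(series(YY))):", nrow(diff(series(YY))), "\n")
}
```

If the two row counts in the guarded part differ, that is the discrepancy adf.test() runs into when it differences a timeSeries object directly, and it is why passing series(YY) restores the numeric/ts result.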