Dear R-help! I have encountered strange behaviour (that is, far-off filtering, smoothing and forecast distributions under certain conditions) in the `dlm` package by Giovanni Petris. Here is an example: I use the annual hotel bookings time series data, which I model using a second order polinomial DLM. First I perform the analysis with the data in logarithmic form and everything seems to be ok. I repeat the analysis with the original, non logarithmic data and the filtering, smoothing and forecast distributions are *very* far off the actual data. I again repeat the analysis, this time I also specify the expected value and variance of the pre-sample state vector (m0 and C0). The filtering and forecast distribution seem to be influenced by these values only and ignore the actual data. I would like to know what I would have to do in order to get plausible results from DLM with data in non-logarithmic form. Is there something that I am missing? Thanks in advance! Martin Zuba References: dlm introductory paper <http://cran.r-project.org/web/packages/dlm/vignettes/dlm.pdf> by Giovanni Petris (2009) dlm package help file <http://benz.nchu.edu.tw/~finmyc/dlm.pdf> This R code reproduces my analysis: # DLM anomaly rm(list=ls()) hotann <- ts(start=1973, data=c(13733531, 15445493, 16024342, 16715224, 16463957, 17896118, 18263114, 18727641, 19756293, 20352440, 22489054, 23442393, 24170736, 25308044, 25021560, 25724331, 25990336, 26602038, 27292250, 28891954, 29659700, 31533579, 32513666, 33628559, 34494451)) plot(hotann, ylab="Annual hotel bookings") # Analysis with log data tsdata <- log(hotann) buildfun <- function (x) { dlmModPoly(order = 2, dV = exp(x[1]), dW = c(0,exp(x[2]))) } fit <- dlmMLE(y=tsdata, parm=c(0,0), build=buildfun) # Warning: a numerically singular 'V' has been slightly perturbed to make it nonsingular fit$conv dlmTsdata <- buildfun(fit$par) tsdataFilter <- dlmFilter(tsdata, mod=dlmTsdata) tsdataSmooth <- dlmSmooth(tsdata, mod=dlmTsdata) plot(tsdata, lwd=2) for (i in 1:10) lines(lty=6, col="blue", dropFirst(dlmBSample(tsdataFilter))[,1]) # looks ok! tsdataForecast <- dlmForecast(tsdataFilter, nAhead=20) sqrtR <- sapply(tsdataForecast$R, function(x) sqrt(x[1,1])) pl <- tsdataForecast$a[,1] + qnorm(0.05, sd= sqrtR) pu <- tsdataForecast$a[,1] + qnorm(0.95, sd= sqrtR) x <- ts.union(tsdata,tsdataSmooth$s[,1],tsdataForecast$a[,1],pl,pu) plot(x, plot.type="single", type="l", col=c("black","blue","red","orange","orange"), ylab="TS data") # looks ok! # Repeat with non-log data. tsdata <- hotann buildfun <- function (x) { dlmModPoly(order = 2, dV = exp(x[1]), dW = c(0,exp(x[2]))) } fit <- dlmMLE(y=tsdata, parm=c(0,0), build=buildfun) fit$conv dlmTsdata <- buildfun(fit$par) tsdataFilter <- dlmFilter(tsdata, mod=dlmTsdata) tsdataSmooth <- dlmSmooth(tsdata, mod=dlmTsdata) plot(tsdata, lwd=2, ylim=c(0, 6e+07)) for (i in 1:10) lines(lty=6, col="blue", dropFirst(dlmBSample(tsdataFilter))[,1]) # where is the filtering distribution??? tsdataForecast <- dlmForecast(tsdataFilter, nAhead=20) sqrtR <- sapply(tsdataForecast$R, function(x) sqrt(x[1,1])) pl <- tsdataForecast$a[,1] + qnorm(0.05, sd= sqrtR) pu <- tsdataForecast$a[,1] + qnorm(0.95, sd= sqrtR) x <- ts.union(tsdata,tsdataSmooth$s[,1],tsdataForecast$a[,1],pl,pu) plot(x, plot.type="single", type="l", col=c("black","blue","red","orange","orange"), ylab="TS data") # where is the forecast??? # strong sensitivity towards expected value (m0) and variance (C0) of pre-sample state vectors # m0 is rep(0,order) in the default specification, this might be responsible for the results above (?) # repeat analysis, but specify values for m0 and C0 tsdata <- hotann buildfun <- function (x) { dlmModPoly(order = 2, dV = exp(x[1]), dW = c(0,exp(x[2])), m0=c(x[3],x[4]), C0 = exp(x[5]) * diag(nrow = 2)) } fit <- dlmMLE(y=tsdata, parm=c(0,0,2e+07,20000,20), build=buildfun) fit$conv dlmTsdata <- buildfun(fit$par) tsdataFilter <- dlmFilter(tsdata, mod=dlmTsdata) tsdataSmooth <- dlmSmooth(tsdata, mod=dlmTsdata) plot(tsdata, lwd=2, ylim=c(0, 6e+07)) for (i in 1:10) lines(lty=6, col="blue", dropFirst(dlmBSample(tsdataFilter))[,1]) # the distribution only follows m0, C0 and seems ignorant of the data tsdataForecast <- dlmForecast(tsdataFilter, nAhead=20) sqrtR <- sapply(tsdataForecast$R, function(x) sqrt(x[1,1])) pl <- tsdataForecast$a[,1] + qnorm(0.05, sd= sqrtR) pu <- tsdataForecast$a[,1] + qnorm(0.95, sd= sqrtR) x <- ts.union(tsdata,tsdataSmooth$s[,1],tsdataForecast$a[,1],pl,pu) plot(x, plot.type="single", type="l", col=c("black","blue","red","orange","orange"), ylab="TS data") # the forecast only follows m0, C0 and seems ignorant of the data -- View this message in context: http://r.789695.n4.nabble.com/Strange-behaviour-of-dlm-package-tp4683255.html Sent from the R help mailing list archive at Nabble.com.