Jouni Helske
2012-Apr-30 11:37 UTC
[Rd] The constant part of the log-likelihood in StructTS
Dear all,

I'd like to discuss a possible bug in the function StructTS of the stats package. It seems that the function returns the wrong value of the log-likelihood, because the constant added to the relevant part of the log-likelihood is misspecified. Here is a simple example:

> data(Nile)
> fit <- StructTS(Nile, type = "level")
> fit$loglik
[1] -367.5194

When computing the log-likelihood with other packages such as KFAS and FKF, the log-likelihood value is around -645.

For the local level model, the log-likelihood is defined by

-0.5*n*log(2*pi) - 0.5*sum(log(F_t) + v_t^2/F_t)

(see for example Durbin and Koopman (2001), page 30). But in StructTS, the log-likelihood is computed like this:

loglik <- -length(y) * res$value + length(y) * log(2 * pi)

where the first part coincides with the last part of the definition, but the constant part has the wrong sign and is not multiplied by 0.5. Also, in the case of missing observations, I think there should be sum(!is.na(y)) instead of length(y) in the constant term, as the likelihood is only computed for those y which are observed.

This does not affect the estimation of the model parameters, but it could have effects in model comparison or in some other cases.

Is there some reason for this kind of constant, or is it just a bug?

Best regards,

Jouni Helske
PhD student in Statistics
University of Jyväskylä
Finland
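The definition quoted above can be checked by hand with a few lines of base R. The following is only a minimal sketch of a local level Kalman filter, not the code used by StructTS, KFAS or FKF: the function name `local_level_loglik` and its arguments are hypothetical, and the diffuse prior is approximated by a large initial variance `P1`, so the value it returns will not match an exact diffuse likelihood. It uses v_t^2/F_t in the sum and counts only the observed values in the constant term.

```r
# Hypothetical helper for illustration only; not part of stats, KFAS or FKF.
# Gaussian log-likelihood of the local level model
#   y_t = mu_t + eps_t,  mu_{t+1} = mu_t + eta_t
# computed with the prediction error decomposition.
local_level_loglik <- function(y, var_eps, var_eta, a1 = 0, P1 = 1e7) {
  y <- as.numeric(y)
  a <- a1      # predicted state mean
  P <- P1      # predicted state variance (large P1 approximates a diffuse prior)
  core <- 0    # accumulates log(F_t) + v_t^2 / F_t
  n_obs <- 0   # number of non-missing observations
  for (t in seq_along(y)) {
    if (!is.na(y[t])) {
      v  <- y[t] - a        # one-step-ahead prediction error v_t
      Ft <- P + var_eps     # prediction error variance F_t
      core <- core + log(Ft) + v^2 / Ft
      K <- P / Ft           # Kalman gain
      a <- a + K * v        # filtered state mean
      P <- P * (1 - K)      # filtered state variance
      n_obs <- n_obs + 1
    }
    P <- P + var_eta        # predict the next state variance
  }
  # The constant uses the number of observed values, not length(y)
  -0.5 * n_obs * log(2 * pi) - 0.5 * core
}

# Usage: evaluate at the variances estimated by StructTS
fit <- StructTS(Nile, type = "level")
local_level_loglik(Nile,
                   var_eps = fit$coef[["epsilon"]],
                   var_eta = fit$coef[["level"]])
```

Because the first prediction error variance is dominated by `P1`, the result depends on that choice; packages that treat the diffuse initial state exactly handle this term differently.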
Ravi Varadhan
2012-May-01 13:51 UTC
[Rd] The constant part of the log-likelihood in StructTS
This is not a problem at all. The log likelihood function is a function of the model parameters and the data, but it is defined up to an additive arbitrary constant, i.e. L(\theta) and L(\theta) + k are completely equivalent, for any k. This does not affect model comparisons or hypothesis tests.

Ravi
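The equivalence described above can be written out explicitly. This is a sketch under the assumption that the constant k does not depend on the parameters and that the same k is used for every fit being compared; the symbols \tilde{\ell}, \theta_1 and \theta_2 are introduced here for illustration only.

```latex
\begin{align*}
\tilde{\ell}(\theta) &= \ell(\theta) + k, \\
\arg\max_{\theta} \tilde{\ell}(\theta) &= \arg\max_{\theta} \ell(\theta), \\
\tilde{\ell}(\theta_1) - \tilde{\ell}(\theta_2) &= \ell(\theta_1) - \ell(\theta_2).
\end{align*}
```

The cancellation in the last line relies on both values carrying the same k. Values computed under different conventions differ by the difference of their constants in addition to any difference in fit.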
Prof Brian Ripley
2012-May-17 11:12 UTC
[Rd] The constant part of the log-likelihood in StructTS
On 30/04/2012 12:37, Jouni Helske wrote:
> Is there some reason for this kind of constant, or is it just a bug?

I think you missed the following on the help page:

  loglik: the maximized log-likelihood. Note that as all these models
          are non-stationary this includes a diffuse prior for some
          observations and hence is not comparable with 'arima' nor
          different types of structural models.

It is explicitly not valid for almost all model comparisons, and those few that are valid will use the differences of the quoted values.

Yes, it was an error, but the constant in log-likelihoods is always arbitrary and here there is even more indeterminacy.
-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
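The size of the acknowledged error in the constant can be worked out from the two expressions quoted in the thread. This is a sketch that takes the thread's description at face value, namely that -length(y) * res$value equals the sum term of the textbook definition with v_t^2/F_t, and that uses n = 100 for the Nile series; the labels DK and StructTS are introduced here for illustration.

```latex
\begin{align*}
\ell_{\mathrm{DK}} &= -\tfrac{n}{2}\log(2\pi)
  - \tfrac{1}{2}\sum_{t=1}^{n}\left(\log F_t + \frac{v_t^2}{F_t}\right), \\
\ell_{\mathrm{StructTS}} &= n\log(2\pi)
  - \tfrac{1}{2}\sum_{t=1}^{n}\left(\log F_t + \frac{v_t^2}{F_t}\right), \\
\ell_{\mathrm{DK}} - \ell_{\mathrm{StructTS}} &= -\tfrac{3}{2}\,n\log(2\pi)
  \approx -275.68 \qquad (n = 100), \\
-367.52 - 275.68 &= -643.20 .
\end{align*}
```

The adjusted value is close to the figure of around -645 reported for KFAS and FKF. The remaining gap is not explained by the constant alone and may come from how each implementation treats the diffuse initial state, which is the indeterminacy the help page warns about.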