Hertzog Gladys
2013-Jun-10  20:01 UTC
[R] parameters estimation of a normal-lognormal multivariate model
Dear all,
I have to create a model which is a mixture of a normal and log-normal
distribution. To create it, I need to estimate the 2 covariance matrixes and the
mixing parameter (total =7 parameters) by maximizing the log-likelihood
function. This maximization has to be performed by the nlm routine.
As I use relative data, the means are known and equal to 1.
I?ve already tried to do it in 1 dimension (with 1 set of relative data) and it
works well. However, when I introduce the 2nd set of relative data I get
illogical results for the correlation and a lot of warnings messages (at all
25).
To estimates the parameters I defined first the log-likelihood function with the
2 commands dmvnorm and dlnorm.plus. Then I assign starting values of the
parameters and finally I use the nlm routine to estimate the parameters (see
script below).
# Importing and reading the grid files. Output are 2048x2048 matrixes
 
P <-
read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_P-3000.asc",
return.header= FALSE );
V <-
read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_V-3000.asc",
return.header= FALSE );
 
p <- c(P); # tranform matrix into a vector
v <- c(V);
 
p<- p[!is.na(p)] # removing NA values
v<- v[!is.na(v)]
 
p_rel <- p/mean(p) #Transforming the data to relative values
v_rel <- v/mean(v) 
PV <- cbind(p_rel, v_rel) # create a matrix of vectors
 
L <- function(par,p_rel,v_rel) {
 
return (-sum(log( (1- par[7])*dmvnorm(PV, mean=c(1,1), sigma= matrix(c(par[1]^2,
par[1]*par[2]*par[3],par[1]*par[2]*par[3], par[2]^2 ),nrow=2, ncol=2))+
par[7]*dlnorm.rplus(PV, meanlog=c(1,1), varlog=
matrix(c(par[4]^2,par[4]*par[5]*par[6],par[4]*par[5]*par[6],par[5]^2),
nrow=2,ncol=2))            )))
 
}
par.start<- c(0.74, 0.66 ,0.40, 1.4, 1.2, 0.4, 0.5) # log-likelihood
estimators
 
result<-nlm(L,par.start,v_rel=v_rel,p_rel=p_rel, hessian=TRUE, iterlim=200,
check.analyticals= TRUE)
Messages d'avis :
1: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
2: In sqrt(2 * pi * det(varlog)) : production de NaN
3: In nlm(L, par.start, p_rel = p_rel, v_rel = v_rel, hessian = TRUE) :
  NA/Inf replaced by maximum positive value
4: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
?. Until 25.
par.hat <- result$estimate
 
cat("sigN_p =", par[1],"\n","sigN_v =",
par[2],"\n","rhoN =", par[3],"\n","sigLN_p
=", par[4],"\n","sigLN_v =",
par[5],"\n","rhoLN =", par[6],"\n","mixing
parameter =", par[7],"\n")
 
sigN_p = 0.5403361 
 sigN_v = 0.6667375 
 rhoN = 0.6260181 
 sigLN_p = 1.705626 
 sigLN_v = 1.592832 
 rhoLN = 0.9735974 
 mixing parameter = 0.8113369
 
Does someone know what is wrong in my model or how should I do to find these
parameters in 2 dimensions?
Thank you very much for taking time to look at my questions.
Regards,
Gladys Hertzog
Master student in environmental engineering, ETH Zurich