Hertzog Gladys
2013-Jun-10 20:01 UTC
[R] parameters estimation of a normal-lognormal multivariate model
Dear all, I have to create a model which is a mixture of a normal and log-normal distribution. To create it, I need to estimate the 2 covariance matrixes and the mixing parameter (total =7 parameters) by maximizing the log-likelihood function. This maximization has to be performed by the nlm routine. As I use relative data, the means are known and equal to 1. I?ve already tried to do it in 1 dimension (with 1 set of relative data) and it works well. However, when I introduce the 2nd set of relative data I get illogical results for the correlation and a lot of warnings messages (at all 25). To estimates the parameters I defined first the log-likelihood function with the 2 commands dmvnorm and dlnorm.plus. Then I assign starting values of the parameters and finally I use the nlm routine to estimate the parameters (see script below). # Importing and reading the grid files. Output are 2048x2048 matrixes P <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_P-3000.asc", return.header= FALSE ); V <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_V-3000.asc", return.header= FALSE ); p <- c(P); # tranform matrix into a vector v <- c(V); p<- p[!is.na(p)] # removing NA values v<- v[!is.na(v)] p_rel <- p/mean(p) #Transforming the data to relative values v_rel <- v/mean(v) PV <- cbind(p_rel, v_rel) # create a matrix of vectors L <- function(par,p_rel,v_rel) { return (-sum(log( (1- par[7])*dmvnorm(PV, mean=c(1,1), sigma= matrix(c(par[1]^2, par[1]*par[2]*par[3],par[1]*par[2]*par[3], par[2]^2 ),nrow=2, ncol=2))+ par[7]*dlnorm.rplus(PV, meanlog=c(1,1), varlog= matrix(c(par[4]^2,par[4]*par[5]*par[6],par[4]*par[5]*par[6],par[5]^2), nrow=2,ncol=2)) ))) } par.start<- c(0.74, 0.66 ,0.40, 1.4, 1.2, 0.4, 0.5) # log-likelihood estimators result<-nlm(L,par.start,v_rel=v_rel,p_rel=p_rel, hessian=TRUE, iterlim=200, check.analyticals= TRUE) Messages d'avis : 1: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) : production de NaN 2: In sqrt(2 * pi * det(varlog)) : production de NaN 3: In nlm(L, par.start, p_rel = p_rel, v_rel = v_rel, hessian = TRUE) : NA/Inf replaced by maximum positive value 4: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) : production de NaN ?. Until 25. par.hat <- result$estimate cat("sigN_p =", par[1],"\n","sigN_v =", par[2],"\n","rhoN =", par[3],"\n","sigLN_p =", par[4],"\n","sigLN_v =", par[5],"\n","rhoLN =", par[6],"\n","mixing parameter =", par[7],"\n") sigN_p = 0.5403361 sigN_v = 0.6667375 rhoN = 0.6260181 sigLN_p = 1.705626 sigLN_v = 1.592832 rhoLN = 0.9735974 mixing parameter = 0.8113369 Does someone know what is wrong in my model or how should I do to find these parameters in 2 dimensions? Thank you very much for taking time to look at my questions. Regards, Gladys Hertzog Master student in environmental engineering, ETH Zurich