thr3ads.net - R help - [R] estimation of covariance matrix of a bivariate normal distribution using maximization of the log-likelihood [Jun 2013]

If this information is useful, please help other people find it:
Share via:

Hertzog Gladys

2013-Jun-07 12:00 UTC

[R] estimation of covariance matrix of a bivariate normal distribution using maximization of the log-likelihood

Dear all,
I?m new in R and I?m trying to estimate the covariance matrix of a bivariate
normal distribution by maximizing the log-likelihood. The maximization really
has to be performed with the non-linear minimization routine (nlm). The 2 means
of the distribution are known and equal to 1.
I?ve already tried 2 different ways to compute this covariance but for each of
them I obtained a lot of warnings and illogical values for the covariance
matrix.
 
In the first one, I defined the bivariate normal distribution with the command
dmvnorm:
 
x<-rnorm(6000, 2.4, 0.6)
x <- matrix(c(x), ncol=1)
y<-rlnorm(6000, 1.3,0.1)
y <- matrix(c(y), ncol=1)
XY <- cbind(x,y) 
 
L <- function(par,x,y) {
return (-sum(log(par[4]*dmvnorm(XY, mean=c(1,1), sigma= matrix(c(par[1],
par[1]*par[2]*par[3],par[1]*par[2]*par[3], par[2] ),nrow=2, ncol=2))           
)))
}
par.start<- c(0.5, 0.5 ,0.5 ,0.5)
result<-nlm(L,par.start,y=y,x=x, hessian=TRUE)
par.hat <- result$estimate
 
par.hatIl y a eu 32 avis (utilisez warnings() pour les
visionner)> par.hat <- result$estimate
> par.hat
[1] 5.149919e+01 2.520721e+02 8.734212e-03 3.996771e+02> warnings()Messages d'avis :
1: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
2: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
  NA/Inf replaced by maximum positive value
3: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
4: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
  NA/Inf replaced by maximum positive value
5: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
6: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
  NA/Inf replaced by maximum positive value
7: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
8: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
  NA/Inf replaced by maximum positive value
9: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
10: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
  NA/Inf replaced by maximum positive value
11: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
12: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
  NA/Inf replaced by maximum positive value
13: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
 
In the second one, I wrote step by step the bivariate normal distribution in
order to have each parameter separately (not in a matrix) but it didn?t work as
well:
 
x<-rnorm(6000, 2.4, 0.6)
y<-rlnorm(6000, 1.3,0.1)
L <- function(par,x,y) {
return (-sum(log((1-par[4])*( (1/(2*pi*par[1]*par[2]*sqrt(1-par[3])))*exp(  
(-1/2*(1-par[3]^2))* ((y-1)/par[2])^2 +((x-1)/par[1])^2 -
2*(y-1)*(x-1)/(par[2]*par[1])   )) )))
}
#par [1]= sigma_x , par [2]= sigma_y  par [3]= rho_xy  par[4] is a mixing
parameter. The final step of my calculation will be to have a mixture of
bivariate normal and log-normal distributions.
par.start<- c(0.5, 0.5 ,0.5 ,0.5)
result<-nlm(L,par.start,y=y,x=x, hessian=T)
par.hat <- result$estimate
par.ha
 
When I run this script, I get always 50 advices like those below:
Messages d'avis :
1: In sqrt(1 - par[3]) : production de NaN
2: In nlm(L, par.start, y = y, x = x, hessian = T) :
  NA/Inf replaced by maximum positive value
3: In sqrt(1 - par[3]) : production de NaN
4: In nlm(L, par.start, y = y, x = x, hessian = T) :
  NA/Inf replaced by maximum positive value
5: In sqrt(1 - par[3]) : production de NaN
6: In nlm(L, par.start, y = y, x = x, hessian = T) :
  NA/Inf replaced by maximum positive value
7: In log((1 - par[4]) * ((1/(2 * pi * par[1] * par[2] *  ... : production de
NaN
8: In nlm(L, par.start, y = y, x = x, hessian = T) :
  NA/Inf replaced by maximum positive value
9: In log((1 - par[4]) * ((1/(2 * pi * par[1] * par[2] *  ... : production de
NaN
10: In nlm(L, par.start, y = y, x = x, hessian = T) :
  NA/Inf replaced by maximum positive value
11: In log((1 - par[4]) * ((1/(2 * pi * par[1] * par[2] *  ... : production de
NaN
12: In nlm(L, par.start, y = y, x = x, hessian = T) :
  NA/Inf replaced by maximum positive value
??
Does one of you know how to use the nlm method to estimate the covariance matrix
(and mixing parameter ) of a bivariate normal distribution?
 
Thank you in advance for your help and answers.
 
Best regards,
 
Gladys Hertzog
Master student in environmental engineering at ETH Zurich
-------------- next part --------------
A non-text attachment was scrubbed...
Name: message_to_mailing_list.pdf
Type: application/pdf
Size: 200565 bytes
Desc: message_to_mailing_list.pdf
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20130607/3fcf8ef9/attachment.pdf>

Rolf Turner

2013-Jun-08 00:21 UTC

head link

[R] estimation of covariance matrix of a bivariate normal distribution using maximization of the log-likelihood

See in line below.

On 08/06/13 00:00, Hertzog Gladys wrote:> Dear all,
> I?m new in R and I?m trying to estimate the covariance matrix of a
bivariate normal distribution by maximizing the log-likelihood. The maximization
really has to be performed with the non-linear minimization routine (nlm).
Why?> The 2 means of the distribution are known and equal to 1.
But you simulate your data using means 2.4 and 1.3.> I?ve already tried 2 different ways to compute this covariance but for each
of them I obtained a lot of warnings and illogical values for the covariance
matrix.
>   
> In the first one, I defined the bivariate normal distribution with the
command dmvnorm:
>   
> x<-rnorm(6000, 2.4, 0.6)
> x <- matrix(c(x), ncol=1)Why do you use matrix() and c() here? Totally unnecessary
(and confusing).> y<-rlnorm(6000, 1.3,0.1)
> y <- matrix(c(y), ncol=1)
> XY <- cbind(x,y)To estimate the covariance matrix underlying the data XY you could
simply do:

M <- var(XY).

If you want to impose the constraint that the true mean vector is c(1,1)
you could do:

mu <- c(1,1)
xbar <- apply(XY,2,mean)
M <- (5999/6000)*var(XY) + (xbar-mu)%*%t(xbar-mu)

It would of course make more sense in terms of your simulated data
to use:

mu <- c(2.4,1.3)

I haven't looked at your code below any further. Too much of a mess.

cheers,

Rolf Turner>   
> L <- function(par,x,y) {
> return (-sum(log(par[4]*dmvnorm(XY, mean=c(1,1), sigma= matrix(c(par[1],
par[1]*par[2]*par[3],par[1]*par[2]*par[3], par[2] ),nrow=2, ncol=2))           
)))
> }
> par.start<- c(0.5, 0.5 ,0.5 ,0.5)
> result<-nlm(L,par.start,y=y,x=x, hessian=TRUE)
> par.hat <- result$estimate
>   
> par.hatIl y a eu 32 avis (utilisez warnings() pour les visionner)
>> par.hat <- result$estimate
>> par.hat
> [1] 5.149919e+01 2.520721e+02 8.734212e-03 3.996771e+02
>> warnings()
> Messages d'avis :
> 1: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>    production de NaN
> 2: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
>    NA/Inf replaced by maximum positive value
> 3: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>    production de NaN
> 4: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
>    NA/Inf replaced by maximum positive value
> 5: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>    production de NaN
> 6: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
>    NA/Inf replaced by maximum positive value
> 7: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>    production de NaN
> 8: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
>    NA/Inf replaced by maximum positive value
> 9: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>    production de NaN
> 10: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
>    NA/Inf replaced by maximum positive value
> 11: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>    production de NaN
> 12: In nlm(L, par.start, y = y, x = x, hessian = TRUE) :
>    NA/Inf replaced by maximum positive value
> 13: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>    production de NaN
>   
> In the second one, I wrote step by step the bivariate normal distribution
in order to have each parameter separately (not in a matrix) but it didn?t work
as well:
>   
> x<-rnorm(6000, 2.4, 0.6)
> y<-rlnorm(6000, 1.3,0.1)
> L <- function(par,x,y) {
> return (-sum(log((1-par[4])*( (1/(2*pi*par[1]*par[2]*sqrt(1-par[3])))*exp( 
(-1/2*(1-par[3]^2))* ((y-1)/par[2])^2 +((x-1)/par[1])^2 -
2*(y-1)*(x-1)/(par[2]*par[1])   )) )))
> }
> #par [1]= sigma_x , par [2]= sigma_y  par [3]= rho_xy  par[4] is a mixing
parameter. The final step of my calculation will be to have a mixture of
bivariate normal and log-normal distributions.
> par.start<- c(0.5, 0.5 ,0.5 ,0.5)
> result<-nlm(L,par.start,y=y,x=x, hessian=T)
> par.hat <- result$estimate
> par.ha
>   
> When I run this script, I get always 50 advices like those below:
> Messages d'avis :
> 1: In sqrt(1 - par[3]) : production de NaN
> 2: In nlm(L, par.start, y = y, x = x, hessian = T) :
>    NA/Inf replaced by maximum positive value
> 3: In sqrt(1 - par[3]) : production de NaN
> 4: In nlm(L, par.start, y = y, x = x, hessian = T) :
>    NA/Inf replaced by maximum positive value
> 5: In sqrt(1 - par[3]) : production de NaN
> 6: In nlm(L, par.start, y = y, x = x, hessian = T) :
>    NA/Inf replaced by maximum positive value
> 7: In log((1 - par[4]) * ((1/(2 * pi * par[1] * par[2] *  ... : production
de NaN
> 8: In nlm(L, par.start, y = y, x = x, hessian = T) :
>    NA/Inf replaced by maximum positive value
> 9: In log((1 - par[4]) * ((1/(2 * pi * par[1] * par[2] *  ... : production
de NaN
> 10: In nlm(L, par.start, y = y, x = x, hessian = T) :
>    NA/Inf replaced by maximum positive value
> 11: In log((1 - par[4]) * ((1/(2 * pi * par[1] * par[2] *  ... : production
de NaN
> 12: In nlm(L, par.start, y = y, x = x, hessian = T) :
>    NA/Inf replaced by maximum positive value
> ??
> Does one of you know how to use the nlm method to estimate the covariance
matrix (and mixing parameter ) of a bivariate normal distribution?
>   
> Thank you in advance for your help and answers.

R help - Jun 2013 - estimation of covariance matrix of a bivariate normal distribution using maximization of the log-likelihood

[R] estimation of covariance matrix of a bivariate normal distribution using maximization of the log-likelihood

[R] estimation of covariance matrix of a bivariate normal distribution using maximization of the log-likelihood