Anhai Jin
2013-Jun-12 14:43 UTC
[R] how to maximize the log-likelihood function with EM algorithm
Hi R users,
I am trying to fit a Mixed Markov Model and ran into trouble with maximizing the
log-likelihood function. I attached my R codes and the problem I have right now
is that the maximization may end in some local maximum by specifying different
start values. I am thinking if we can improve the maximization with EM
algorithm. Any help will be greatly appreciated!
purch_1 = c(0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1)
purch_2 = c(0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1)
purch_3 = c(0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1)
purch_4 = c(0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1)
purch_5 = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
freq =
c(352,44,26,14,20,5,7,12,14,5,5,5,5,3,3,7,21,3,4,3,5,2,3,5,9,2,0,6,5,5,7,24)
switch_1 = c(0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0)
switch_2 = c(0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0)
switch_3 = c(0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0)
switch_4 = c(0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0)
full =
cbind(purch_1,purch_2,purch_3,purch_4,purch_5,freq,switch_1,switch_2,switch_3,switch_4)
markovdata = data.frame(full)
library("maxLik")
set.seed(123)
loglikelihood = function(param) {
pi_1 =
param[1]
p0_1 =
param[2]
p00_1 =
param[3]
p11_1 =
param[4]
p0_2 =
param[5]
p00_2 = param[6]
p11_2 = param[7]
y =
markovdata$freq*
log(
pi_1*
(p0_1^(1-markovdata$purch_1)) *((1-p0_1)^markovdata$purch_1)
*
((p00_1^(1-markovdata$switch_1))*((1-p00_1)^markovdata$switch_1))
^ (1-markovdata$purch_1)
*
((p11_1^(1-markovdata$switch_1))*((1-p11_1)^markovdata$switch_1))
^ markovdata$purch_1
*
((p00_1^(1-markovdata$switch_2))*((1-p00_1)^markovdata$switch_2))
^ (1-markovdata$purch_2)
*
((p11_1^(1-markovdata$switch_2))*((1-p11_1)^markovdata$switch_2))
^ markovdata$purch_2
*
((p00_1^(1-markovdata$switch_3))*((1-p00_1)^markovdata$switch_3))
^ (1-markovdata$purch_3)
*
((p11_1^(1-markovdata$switch_3))*((1-p11_1)^markovdata$switch_3))
^ markovdata$purch_3
*
((p00_1^(1-markovdata$switch_4))*((1-p00_1)^markovdata$switch_4))
^ (1-markovdata$purch_4)
*
((p11_1^(1-markovdata$switch_4))*((1-p11_1)^markovdata$switch_4))
^ markovdata$purch_4
+
(1-pi_1)*
(p0_2^(1-markovdata$purch_1)) *((1-p0_2)^markovdata$purch_1)
*
((p00_2^(1-markovdata$switch_1))*((1-p00_2)^markovdata$switch_1))
^ (1-markovdata$purch_1)
*
((p11_2^(1-markovdata$switch_1))*((1-p11_2)^markovdata$switch_1))
^ markovdata$purch_1
*
((p00_2^(1-markovdata$switch_2))*((1-p00_2)^markovdata$switch_2))
^ (1-markovdata$purch_2)
*
((p11_2^(1-markovdata$switch_2))*((1-p11_2)^markovdata$switch_2))
^ markovdata$purch_2
*
((p00_2^(1-markovdata$switch_3))*((1-p00_2)^markovdata$switch_3))
^ (1-markovdata$purch_3)
*
((p11_2^(1-markovdata$switch_3))*((1-p11_2)^markovdata$switch_3))
^ markovdata$purch_3
*
((p00_2^(1-markovdata$switch_4))*((1-p00_2)^markovdata$switch_4))
^ (1-markovdata$purch_4)
*
((p11_2^(1-markovdata$switch_4))*((1-p11_2)^markovdata$switch_4))
^ markovdata$purch_4
)
y
}
mle = maxLik( logLik = loglikelihood,
start = c(pi_1 = 0.5, p0_1 =
0.3, p00_1 = 0.4, p11_1 = 0.5,
p0_2 = 0.6, p00_2 = 0.8, p11_2 = 0.5)
)
summary(mle)
The ideal solution should be
pi_1 =
0.19
p0_1 = 0.32
p00_1 = 0.50
p11_1 = 0.71
p0_2 = 0.87
p00_2 = 0.94
p11_2 = 0.21
Thanks,
Anhai
[[alternative HTML version deleted]]