hong qin
2006-Oct-21 14:21 UTC
[R] problem with mode of marginal distriubtion of rdirichlet{gtools}
Hi all, I have a problem using rdirichlet{gtools}. For Dir( a1, a2, ..., a_n), its mode can be found at $( a_i -1)/ ( \sum_{i}a_i - n)$; The means are $a_i / (\sum_{i} a_i ) $; I tried to study the above properties using rdirichlet from gtools. The code are: ############## library(gtools) alpha = c(1,3,9) #totoal=13 mean.expect = c(1/13, 3/13, 9/13) mode.expect = c(0, 2/10, 8/10) # this is for the overall mode. mode1 = numeric(3); mode2 = numeric(3); theta = data.frame( rdirichlet( 10000, alpha) ) m1 = mean(theta) for( i in 1:3) { h = hist(theta[,i], breaks=20); mode1[i]= h$mid[ h$density == max(h$density) ] } theta = data.frame( rdirichlet( 10000, alpha) ) m2 = mean(theta) for( i in 1:3) { h = hist(theta[,i], breaks=20); mode2[i]= h$mid[ h$density == max(h$density) ] } rbind(m1,m2,mean.expect) #the means are consistent rbind(mode1, mode2, mode.expect); #the marginal modes are quite different from the global mode. ############ An example output is:> rbind(m1,m2,mean.expect) #the means are consistentX1 X2 X3 m1 0.07609384 0.2301840 0.6937222 m2 0.07716923 0.2300336 0.6927971 mean.expect 0.07692308 0.2307692 0.6923077> rbind(mode1, mode2, mode.expect);[,1] [,2] [,3] mode1 0.025 0.175 0.725 mode2 0.010 0.175 0.725 mode.expect 0.000 0.200 0.800 So, what is the problem with the mode? Many thanks, HQ [[alternative HTML version deleted]]
Ben Bolker
2006-Oct-21 22:32 UTC
[R] problem with mode of marginal distriubtion of rdirichlet{gtools}
hong qin <alongway <at> gmail.com> writes:> > Hi all, > > I have a problem using rdirichlet{gtools}. > For Dir( a1, a2, ..., a_n), its mode can be found at $( a_i -1)/ ( > \sum_{i}a_i - n)$; > The means are $a_i / (\sum_{i} a_i ) $; >I believe the problem is that the marginal modes are not equal to the full posterior modes. (Think about the two-parameter normal distribution for example; for a sample {x_i}, the mode of the posterior variance is near sum((x_i-\bar x)^2)/n while the marginal mode is near sum((x_i-\bar x)^2)/(n-1) ...) [haven't checked this at all carefully.] Ben Bolker
Rob Carnell
2006-Oct-22 17:04 UTC
[R] problem with mode of marginal distriubtion of rdirichlet{gtools}
hong qin <alongway <at> gmail.com> writes:> > Hi all, > > I have a problem using rdirichlet{gtools}. > For Dir( a1, a2, ..., a_n), its mode can be found at $( a_i -1)/ ( > \sum_{i}a_i - n)$; > The means are $a_i / (\sum_{i} a_i ) $; > > I tried to study the above properties using rdirichlet from gtools. The code > are: > > ############## > library(gtools) > alpha = c(1,3,9) #totoal=13I agree with your assertions up to here. The proofs that I have seen for the mode of the dirichlet assert that a_i > 1. This is important. So I propose: alpha = c(2, 4, 7) # total = 13 And I will continue with the example, commenting out your example for comparison... # mean.expect = c(1/13, 3/13, 9/13) # mode.expect = c(0, 2/10, 8/10) # this is for the overall mode. mean.expect = c(2/13, 4/13, 7/13) mode.expect = c(1/10, 3/10, 6/10) You were also correct that the expected mode is for the joint distribution. For the marginal distributions, there is a slightly different expected mode. The marginal distributions of a dirichlet are beta distributions with an alpha parameter equal to a_i and a beta parameter equal to (\sum{j, j!=i} a_j). The mode of the beta distribution is (alpha - 1) / (alpha + beta - 2). The first marginal therefore has a mode of (2 - 1) / (2 + 11 -2) = 1/11 mode.expect.beta = c(1/11, 3/11, 6/11) I also propose that we look at the smoothed density estimate to find the mean instead of relying on histograms which have error in their mode estimate due to the bin size. mode1 = numeric(3); mode2 = numeric(3); theta = data.frame( rdirichlet( 100000, alpha) ) m1 = mean(theta) for( i in 1:3) { h2 <- density(theta[,i], n=1024) mode1[i] <- h2$x[h2$y==max(h2$y)] } theta = data.frame( rdirichlet( 100000, alpha) ) m2 = mean(theta) for( i in 1:3) { h2 <- density(theta[,i], n=1024) mode2[i] <- h2$x[h2$y==max(h2$y)] } The output then shows that the density follows the mode.expect.beta.> rbind(m1,m2,mean.expect)X1 X2 X3 m1 0.1539895 0.3078219 0.5381886 m2 0.1537157 0.3072688 0.5390156 mean.expect 0.1538462 0.3076923 0.5384615> rbind(mode1, mode2, mode.expect, mode.expect.beta);[,1] [,2] [,3] mode1 0.09231188 0.2687344 0.5426022 mode2 0.09039729 0.2757893 0.5608050 mode.expect 0.10000000 0.3000000 0.6000000 mode.expect.beta 0.09090909 0.2727273 0.5454545>Finally, we can show that the joint distribution mode is greater than the joint density at the values of the modes of the marginal densities.> ddirichlet(c(1/10, 3/10, 6/10), c(2,4,7)) # joint mode[1] 13.96769> ddirichlet(c(1/11, 3/11, 6/11), c(2,4,7)) # marginal modes[1] 5.385148>I hope this helps! Rob Carnell Battelle - Statistics and Information Analysis