On 02/01/2009 6:37 AM, Allan Clark wrote:> hello all
>
> happy new year and hope you r having a good holiday.
>
>
> i would like to calculate the expectation of a particular random variable
and would like to approximate it using a number of the functions contained in R.
decided to do some experimentation on a trivial example.
>
> example
> =======>
> suppose x(i)~N(0,s2) where s2 = the variance
> the prior for s2 = p(s2)~IG(a,b)
> so the posterior is IG(.5*n+a, b+.5*sum(x^2) )
> and the posterior expectation of s2 = (b+sum(x^2))/(.5*n+a-1)
>
> I want to calculate the value of this expectation using integrals.
>
> ie
>
> let L(X|s2) = the likelihood function
> so E(s2|X) = (integral(s2*L(X|s2)*p(s2)))/(integral(L(X|s2)*p(s2))) . the
bounds on both of the integrals is (0,Inf)
>
========================================================================>
>
> three methods are used to calculate E(s2|X) for different sample sizes:
>
> Method 1
> ========> (integral(s2*L(X|s2)*p(s2)))/(integral(L(X|s2)*p(s2))) where
the integrals are calculated using the integrate function
>
> Method 2
> ========> transform s2 to tau (from (0,Inf) to (-1,1))
> f(s2) = tau = 4*arctan(s2)/pi-1
>
> so
>
> E(s2|X) =
(integral(jac*finv(s2)*L(X|finv(s2))*p(finv(s2))))/(integral(L(X|finv(s2))*p(finv(s2))))
. the bounds on both integrals is (-1,1)
>
> jac = the jacobian and finv is the inverse function of f(s2)
>
> once again the integrals are calculated using the intergrate function
>
> Method 3
> ========> legendre gaussian quadriture is used to calculate each of the
integrals using Method 2.
>
>
> Some results and comments are below.
> method 3 seems to give the correct solution for n<=343
> differences between the different methods can be detected from as early as
n=3 for method 1
>
> IS THERE A WAY TO MODIFY THE PROBLEM SO THAT THE EXPECTATION CAN BE
CALCULATED FOR LARGE n without using MC or MCMC methods?
For large n, the density is highly peaked near its mode. The
integrate() function doesn't work well on functions like that, but you
can improve it by breaking the range up into three parts: the centre,
to the left, and to the right. If you define the centre so that almost
all of the mass will be there, then it doesn't matter if the other two
integrals are inaccurate. I'd suggest using the MLE plus or minus 5
standard errors. This should cover the mode unless the prior is very
informative.
You'll still have a problem of overflow when evaluating it. To fix
that, just rescale the density, so that its max is 1. (Since the
density occurs in both numerator and denominator, this won't affect the
final answer). In general it might be hard to find the max of the
density, but it's probably good enough to find the max of the
likelihood, or some other rough approximation, e.g. the density at the
MLE. The value 1 isn't important, all you need is an upper bound that
doesn't cause overflow.
Duncan Murdoch
>
> P(1)
> $correct
> [1] 2.201234
>
> $App.2
> [1] 2.201234
>
> $App.trans
> [1] 2.201234
> $App.gq
> [1] 2.201234
> #========================================>
> P(100)
> $correct
> [1] 18.85535
>
> $App.2
> [1] 15.65814
>
> $App.trans
> [1] 18.85028
> $App.gq
> [1] 18.85535
> #========================================>
>
> P(300)
> $correct
> [1] 22.59924
>
> $App.2
> [1] NaN
>
> $App.trans
> [1] 18.85405
> $App.gq
> [1] 22.59924
> #========================================>
> P(500)
> Error in integrate(num., 0, Inf) : non-finite function value
>
>
>
> The code is included below:
>
>
> P=function(n=1,howmany=500)
> {
> #you do need to have stamod
> #howmany is used to set the number of points used for the gaussian
quadriture
>
> set.seed(1)
> sigma=5
> x=rnorm(n)*sigma
> a=5
> b=5
>
> #the correct expectation
> #==========================================================>
correct=(b+sum(x^2)*.5)/(.5*n+a-1)
>
>
> #Method 1
> #==========================================================>
num.=function(s2)
> {
> L=((2*pi*s2)^(-n*.5))*exp(-.5*sum(x^2)/s2)
> P=(s2^(-(a+1)))*exp(-b/s2)*(b^a)/gamma(a)
> return(s2*L*P)
> }
> den.=function(s2)
> {
> L=((2*pi*s2)^(-n*.5))*exp(-.5*sum(x^2)/s2)
> P=(s2^(-(a+1)))*exp(-b/s2)*(b^a)/gamma(a)
> return(L*P)
> }
> App.2=integrate(num.,0,Inf)$value/integrate(den.,0,Inf)$value
>
>
> #Method 2
> #==========================================================>
num.t=function(s2.)
> {
> s2=tan(.25*pi*(1+s2.))
> jac=.25*pi*(((tan(.25*pi*(1+s2.)))^2)+1)
> num.tv=jac*exp(-(a+.5*n)*log(s2)-(b+sum(x^2)*.5)/s2)
> return(num.tv)
> }
> den.t=function(s2.)
> {
> s2=tan(.25*pi*(1+s2.))
> jac=.25*pi*(((tan(.25*pi*(1+s2.)))^2)+1)
> den.tv=jac*exp(-(a+.5*n+1)*log(s2)-(b+sum(x^2)*.5)/s2)
> return(den.tv)
> }
> App.trans=integrate(num.t,-1,1)$value/ integrate(den.t,-1,1)$value
>
>
> #Method 3
> #==========================================================>
library(statmod)
> ad.=gauss.quad(n=howmany,kind="legendre",alpha=-1,beta=-1)
> n.=ad.$nodes
> w.=ad.$weights
> App.gq=sum(w.*num.t(n.))/sum(w.*den.t(n.))
>
> list(correct=correct,App.2=App.2,App.trans=App.trans, App.gq=App.gq)
> }
>
>
> Allan Clark
> =======> Lecturer in Statistical Sciences Department
> University of Cape Town
> 7701 Rondebosch
> South Africa
> TEL (Office): +27-21-650-3228
> FAX: +27-21-650-4773
> http://web.uct.ac.za/depts/stats/aclark.htm
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.