I am computing a complex multidimensional integral (4 dimensions) using Gauss-Legendre quadrature and am looking to optimize one piece of code that is currently very expensive. The integral is evaluated for K individuals in my data, and once this piece is computed it can be stored and recycled over all K individuals. With this piece stored, the integral takes about 1.3 minutes in R for each individual.

Because the integral has multiple dimensions, the number of nodes grows exponentially, so with q nodes per dimension I require q^4 total nodes, and experimentation shows I need no fewer than 40 per dimension. I need to compute the density of the multivariate normal at all q^4 nodes and store it. This is very expensive, and I wonder whether there is a way to make it faster.

Below is a reproducible toy example (not using Legendre nodes) that illustrates the issue. The final line is the one I am asking about: can its computational time be improved?

Thank you
Harold

library(mvtnorm)

### Create parameters for the MVN
mu <- c(0, 0, 0, 0)
sigma <- matrix(0.2, nrow = 4, ncol = 4)
diag(sigma) <- 1

### Create nodes and expand to 4 dimensions for quadrature
dm  <- length(mu)
gh  <- seq(-4, 4, length.out = 10)
n   <- length(gh)
idx <- as.matrix(expand.grid(rep(list(1:n), dm)))

### Compute the density at every node; this is the expensive step
adjFactor <- sapply(1:nrow(idx),
                    function(i) dmvnorm(gh[idx[i, ]], mean = mu, sigma = sigma))
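For reference, the actual Gauss-Legendre nodes and weights the post alludes to (but does not show) could be generated with, e.g., statmod::gauss.quad(). This is an assumption about that unshown step, not the poster's code; gauss.quad() returns nodes and weights on [-1, 1], which are rescaled to the integration limits:

library(statmod)

## Hypothetical node setup; the post's real code for this step is not shown
q  <- 40
gl <- gauss.quad(q, kind = "legendre")

## Affine rescaling from [-1, 1] to example limits [a, b]
a <- -4; b <- 4
nodes   <- (b - a) / 2 * gl$nodes + (a + b) / 2
weights <- (b - a) / 2 * gl$weights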
On 07/06/2016 9:06 AM, Doran, Harold wrote:
> I am computing a complex multidimensional integral (4 dimensions) using Gauss-Legendre quadrature and am looking to optimize one piece of code that is currently very expensive. The integral is evaluated for K individuals in my data, and once this piece is computed it can be stored and recycled over all K individuals. With this piece stored, the integral takes about 1.3 minutes in R for each individual.
>
> Because the integral has multiple dimensions, the number of nodes grows exponentially, so with q nodes per dimension I require q^4 total nodes, and experimentation shows I need no fewer than 40 per dimension. I need to compute the density of the multivariate normal at all q^4 nodes and store it. This is very expensive, and I wonder whether there is a way to make it faster.
>
> Below is a reproducible toy example (not using Legendre nodes) that illustrates the issue. The final line is the one I am asking about: can its computational time be improved?
>
> [toy example code snipped]

I'd vectorize rather than using sapply (which is really a long loop). Try to put all the values into rows of a single matrix and just make one call to dmvnorm.

Duncan Murdoch
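For concreteness, here is a minimal sketch of the one-call approach Duncan describes, reusing the objects from the toy example above. It relies on dmvnorm() accepting a matrix of quantiles with one point per row, which mvtnorm's dmvnorm() supports:

library(mvtnorm)

## Toy-example setup from the original post
mu    <- c(0, 0, 0, 0)
sigma <- matrix(0.2, nrow = 4, ncol = 4)
diag(sigma) <- 1
dm    <- length(mu)
gh    <- seq(-4, 4, length.out = 10)

## Expand the 1-D nodes to the full 4-D grid directly:
## one quadrature point per row, q^dm rows in total
pts <- as.matrix(expand.grid(rep(list(gh), dm)))

## A single vectorized call replaces the sapply() loop over rows
adjFactor <- dmvnorm(pts, mean = mu, sigma = sigma)

With q = 40 per dimension this is one call on a 40^4 x 4 matrix rather than 2,560,000 separate calls, so the per-call overhead (argument checking, decomposing sigma, and so on) is paid once instead of q^4 times.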
Thanks, Duncan. Not sure I follow, however. The call to dmvnorm(), in this instance, takes in a 4 x 1 vector of nodes (in addition to the mean vector and covariance matrix), as in the example below, which uses the original sample code. That vector of nodes changes at each iteration of the loop inside the sapply().

> i <- 1
> dmvnorm(gh[idx[i, ]], mean = mu, sigma = sigma)
[1] 5.768404e-11
> i <- 2
> dmvnorm(gh[idx[i, ]], mean = mu, sigma = sigma)
[1] 3.455385e-10

I too would prefer to vectorize, but I'm not seeing how one call would work in this instance?

-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: Tuesday, June 07, 2016 10:44 AM
To: Doran, Harold <HDoran at air.org>; r-help at r-project.org
Subject: Re: [R] Faster Multivariate Normal

On 07/06/2016 9:06 AM, Doran, Harold wrote:
> [original question and toy example snipped]

I'd vectorize rather than using sapply (which is really a long loop). Try to put all the values into rows of a single matrix and just make one call to dmvnorm.

Duncan Murdoch
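To see that the single call Duncan suggests reproduces the loop above, here is a minimal sketch (assuming the toy example's objects mu, sigma, dm, gh, and idx are in the workspace). Indexing the node vector with the whole index matrix, gh[idx], returns the node coordinates as one vector in column-major order, so reshaping with matrix() rebuilds the points-in-rows matrix that dmvnorm() expects:

library(mvtnorm)

## Loop version from the original post
slow <- sapply(1:nrow(idx),
               function(i) dmvnorm(gh[idx[i, ]], mean = mu, sigma = sigma))

## Single call on the full grid of points (one row per node)
fast <- dmvnorm(matrix(gh[idx], ncol = dm), mean = mu, sigma = sigma)

all.equal(slow, fast)   # TRUE: same densities, one call instead of q^4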