Hoffman, Gabriel
2018-Nov-21 18:34 UTC
[R] Speed of RCppEigen Cholesky decomposition on sparse matrix
I am developing a statistical model and I have a prototype working in R code. I make extensive use of sparse matrices, so the R code is pretty fast, but hoped that using RCppEigen to evaluate the log-likelihood function could avoid a lot of memory copying and be substantially faster. However, in a simple example I am seeing that RCppEigen is 3-5x slower than standard R code for cholesky decomposition of a sparse matrix. This is the case on R 3.5.1 using RcppEigen_0.3.3.4.0 on both OS X and CentOS 6.9. Since this simple operation is so much slower it doesn?t seem like using RCppEigen is worth it in this case. Is this an issue with BLAS, some libraries or compiler options, or is R code really the fastest option? Here is my example: library(Matrix) library(inline) # construct sparse matrix ######################### # construct a matrix C that is N x X with S total entries N = 10000 S = 1000000 i = sample(1:1000, S, replace=TRUE) j = sample(1:1000, S, replace=TRUE) idx = i >= j values = runif(S, 0, .3) X = sparseMatrix(i=i, j=j, x = values, symmetric=FALSE ) C = as(crossprod(X), "dgCMatrix") # check sparsity fraction S / N^2 # define RCppEigen code CholeskyCppSparse<-' using Rcpp::as; using Eigen::Map; using Eigen::SparseMatrix; using Eigen::MappedSparseMatrix; using Eigen::SimplicialLLT; // get data into RcppEigen const MappedSparseMatrix<double> Sigma(as<MappedSparseMatrix<double> >(Sigma_in)); // compute Cholesky typedef SimplicialLLT<SparseMatrix<double> > SpChol; const SpChol Ch(Sigma); ' CholSparse <- cxxfunction(signature(Sigma_in = "dgCMatrix"), CholeskyCppSparse, plugin = "RcppEigen") # compare times system.time(replicate(10, chol( C ))) # output: # user system elapsed # 0.341 0.014 0.355 system.time(replicate(10, CholSparse( C ))) # output: # user system elapsed # 1.639 0.046 1.687> sessionInfo()R version 3.5.1 (2018-07-02) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS 10.14 Matrix products: default BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices datasets utils methods base other attached packages: [1] inline_0.3.15 Matrix_1.2-15 loaded via a namespace (and not attached): [1] compiler_3.5.1 RcppEigen_0.3.3.4.0 Rcpp_1.0.0 [4] grid_3.5.1 lattice_0.20-38 Changing the size of the matrix and the number of entries does not change the relative times Thanks, - Gabriel [[alternative HTML version deleted]]
Jeff Newmiller
2018-Nov-21 23:09 UTC
[R] Speed of RCppEigen Cholesky decomposition on sparse matrix
I believe you have the wrong list. (Read the Posting Guide... you seem to have R under control.) Try Rcpp-devel. FWIW You probably need to spend some time with a C++ profiler... any language can be unintentionally mis-used, and you first need to identify whether your calling code is inefficiently handling memory or invoking setup code repetitively before blaming BLAS. A reproducible example will probably help when you ask at Rcpp-devel. On November 21, 2018 10:34:33 AM PST, "Hoffman, Gabriel" <gabriel.hoffman at mssm.edu> wrote:>I am developing a statistical model and I have a prototype working in R >code. I make extensive use of sparse matrices, so the R code is pretty >fast, but hoped that using RCppEigen to evaluate the log-likelihood >function could avoid a lot of memory copying and be substantially >faster. However, in a simple example I am seeing that RCppEigen is >3-5x slower than standard R code for cholesky decomposition of a sparse >matrix. This is the case on R 3.5.1 using RcppEigen_0.3.3.4.0 on both >OS X and CentOS 6.9. > >Since this simple operation is so much slower it doesn?t seem like >using RCppEigen is worth it in this case. Is this an issue with BLAS, >some libraries or compiler options, or is R code really the fastest >option? > >Here is my example: > >library(Matrix) >library(inline) > ># construct sparse matrix >######################### > ># construct a matrix C that is N x X with S total entries >N = 10000 >S = 1000000 >i = sample(1:1000, S, replace=TRUE) >j = sample(1:1000, S, replace=TRUE) >idx = i >= j >values = runif(S, 0, .3) >X = sparseMatrix(i=i, j=j, x = values, symmetric=FALSE ) > >C = as(crossprod(X), "dgCMatrix") > ># check sparsity fraction >S / N^2 > ># define RCppEigen code >CholeskyCppSparse<-' >using Rcpp::as; >using Eigen::Map; >using Eigen::SparseMatrix; >using Eigen::MappedSparseMatrix; >using Eigen::SimplicialLLT; > >// get data into RcppEigen >const MappedSparseMatrix<double> Sigma(as<MappedSparseMatrix<double> >>(Sigma_in)); > >// compute Cholesky >typedef SimplicialLLT<SparseMatrix<double> > SpChol; >const SpChol Ch(Sigma); >' > >CholSparse <- cxxfunction(signature(Sigma_in = "dgCMatrix"), >CholeskyCppSparse, plugin = "RcppEigen") > ># compare times >system.time(replicate(10, chol( C ))) ># output: ># user system elapsed ># 0.341 0.014 0.355 > >system.time(replicate(10, CholSparse( C ))) ># output: ># user system elapsed ># 1.639 0.046 1.687 > >> sessionInfo() >R version 3.5.1 (2018-07-02) >Platform: x86_64-apple-darwin15.6.0 (64-bit) >Running under: macOS 10.14 > >Matrix products: default >BLAS: >/Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib >LAPACK: >/Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib > >locale: >[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > >attached base packages: >[1] stats graphics grDevices datasets utils methods base > >other attached packages: >[1] inline_0.3.15 Matrix_1.2-15 > >loaded via a namespace (and not attached): >[1] compiler_3.5.1 RcppEigen_0.3.3.4.0 Rcpp_1.0.0 >[4] grid_3.5.1 lattice_0.20-38 > >Changing the size of the matrix and the number of entries does not >change the relative times > >Thanks, >- Gabriel > > > > > [[alternative HTML version deleted]]-- Sent from my phone. Please excuse my brevity.