I'm trying to calculate Pearson correlation coefficients for a large matrix of size 18563 x 18563. The following function takes about XX minutes to complete, and I'd like to do this calculation about 15 times, so speed is somewhat of an issue.

Does anyone have any suggestions on ways to speed this up? I'd wondered if using C++ code to do the calculations might speed things up, but I've never written any C/C++ code or attempted to use any within R.

I've seen some C++ code here:
http://www.alglib.net/statistics/correlation.php

I wondered if anyone might be able to help me get this so it can run in R? I've tried the following:

1) Downloaded and unzipped http://www.alglib.net/translator/dl/statistics.correlation.cpp.zip
2) Moved the contents of the libs dir into the parent dir alongside correlation.cpp (didn't know how to tell R where to look for C libraries)
3) Tried: "R CMD SHLIB correlation.cpp" and got the following as output:

-- start output --
icpc -I/tools/R/2.7.1/lib/R/include -I/usr/local/include -mp -fpic -g -O2 -c correlation.cpp -o correlation.o
ap.h(163): warning #858: type qualifier on return type is meaningless
  const bool operator==(const complex& lhs, const complex& rhs);
  ^

ap.h(164): warning #858: type qualifier on return type is meaningless
  const bool operator!=(const complex& lhs, const complex& rhs);
  ^

ap.h(179): warning #858: type qualifier on return type is meaningless
  const double abscomplex(const complex &z);
  ^

icpc -shared -L/usr/local/lib -o correlation.so correlation.o
-- end output --

4) Now this doesn't look brilliant! Any thoughts? Also, I'm assuming I need to do some other work with the C++ code in order to allow me to use it from within my R scripts - any pointers on that?

Thanks for any input - I hope I just need a hand over the initial hurdles and then I can get onto that up-hill learning curve!!

Nathan

--
--------------------------------------------------------
Dr. Nathan S. Watson-Haigh
OCE Post Doctoral Fellow
CSIRO Livestock Industries
Queensland Bioscience Precinct
St Lucia, QLD 4067
Australia
Tel: +61 (0)7 3214 2922
Fax: +61 (0)7 3214 2900
Web: http://www.csiro.au/people/Nathan.Watson-Haigh.html
--------------------------------------------------------
Nathan S. Watson-Haigh
2008-Dec-15 06:02 UTC
[R] [ExternalEmail] Pearson Correlation Speed
Nathan S. Watson-Haigh wrote:
> I'm trying to calculate Pearson correlation coefficients for a large
> matrix of size 18563 x 18563. The following function takes about XX
> minutes to complete, and I'd like to do this calculation about 15 times,
> so speed is somewhat of an issue.

Sorry, meant to fill in the blanks! The following takes about 15 mins to complete:

corr <- abs(cor(dat, use="p"))

> Does anyone have any suggestions on ways to speed this up? I'd wondered
> if using C++ code to do the calculations might speed things up, but I've
> never written any C/C++ code or attempted to use any within R.
>
> I've seen some C++ code here:
> http://www.alglib.net/statistics/correlation.php
>
> I wondered if anyone might be able to help me get this so it can run in
> R? I've tried the following:
> 1) Downloaded and unzipped
> http://www.alglib.net/translator/dl/statistics.correlation.cpp.zip
> 2) Moved the contents of the libs dir into the parent dir alongside
> correlation.cpp (didn't know how to tell R where to look for C libraries)
> 3) Tried: "R CMD SHLIB correlation.cpp" and got the following as output:
> -- start output --
> icpc -I/tools/R/2.7.1/lib/R/include -I/usr/local/include -mp -fpic
> -g -O2 -c correlation.cpp -o correlation.o
> ap.h(163): warning #858: type qualifier on return type is meaningless
>   const bool operator==(const complex& lhs, const complex& rhs);
>   ^
>
> ap.h(164): warning #858: type qualifier on return type is meaningless
>   const bool operator!=(const complex& lhs, const complex& rhs);
>   ^
>
> ap.h(179): warning #858: type qualifier on return type is meaningless
>   const double abscomplex(const complex &z);
>   ^
>
> icpc -shared -L/usr/local/lib -o correlation.so correlation.o
> -- end output --
> 4) Now this doesn't look brilliant! Any thoughts? Also, I'm assuming I
> need to do some other work with the C++ code in order to allow me to use
> it from within my R scripts - any pointers on that?
>
> Thanks for any input - I hope I just need a hand over the initial
> hurdles and then I can get onto that up-hill learning curve!!
>
> Nathan

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
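
One possible direction for the cor(dat, use="p") call above, staying within R, is to compute the correlations as a single matrix cross-product on standardized columns. The sketch below is only an illustration under stated assumptions: fast_cor is a hypothetical helper name (not a base R function), the example data are made up, and the shortcut is only equivalent to cor() when dat has no missing values. With NAs present, use="p" (pairwise complete observations) computes each pair on a different subset of rows, which this sketch does not reproduce. Whether it is actually faster depends on the BLAS that R is linked against; no timings are claimed here.

```r
# Hypothetical helper (not part of base R): Pearson correlation between the
# columns of a matrix via one cross-product.
# Assumptions: `dat` is a numeric matrix with no NA values and no constant
# columns (a constant column has zero standard deviation).
fast_cor <- function(dat) {
  stopifnot(is.matrix(dat), is.numeric(dat), !anyNA(dat))
  n <- nrow(dat)
  # Centre each column and divide it by its sample standard deviation.
  z <- scale(dat, center = TRUE, scale = TRUE)
  # For standardized columns, t(z) %*% z / (n - 1) is the correlation matrix.
  crossprod(z) / (n - 1)
}

# Made-up example data, used only to compare against cor().
set.seed(1)
dat <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)

corr <- abs(fast_cor(dat))

# Compare with the built-in result, ignoring dimnames and other attributes.
all.equal(corr, abs(cor(dat)), check.attributes = FALSE)
```

If dat does contain NAs, the built-in cor(dat, use="p") remains the reference; the sketch would need a different approach to handle pairwise deletion.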