On Jan 8, 2008 12:34 AM, suman Duvvuru <duvvuru.suman at gmail.com> wrote:
> Hello,
> I have a dataset with 20,000 variables, and I would like to compute a
> Pearson correlation matrix, which will be 20000 x 20000. The cor()
> function doesn't work in this case due to a memory problem. If you have
> any ideas regarding a feasible way to compute correlations on such a
> huge dataset, please help me out.
Considering that a single copy of such a matrix, stored as a dense
double-precision matrix, is about 3 GB
> 20000^2 * 8 / (2^20)   # size in megabytes
[1] 3051.8
I'm not surprised that you run into memory problems.
Perhaps it is time to look at the forest instead of the trees. What
would you do with such a matrix if you were able to calculate and
store it?
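If you really do need all the pairwise correlations, one workaround is to compute the result in column blocks and stream each block to disk, so that only one slice of the 20000 x 20000 matrix is ever in memory. Here is a minimal sketch, assuming the raw data fit in memory as a numeric matrix `x` (observations in rows, variables in columns) even though the full correlation matrix does not; the function name, block size, and file name are illustrative choices, not anything built into R:

```r
# Sketch: compute cor(x) in column blocks, writing each block of the
# result to a binary file instead of holding the full matrix in memory.
block_cor <- function(x, block = 1000, file = "cormat.bin") {
  p <- ncol(x)
  con <- file(file, "wb")
  on.exit(close(con))
  for (start in seq(1, p, by = block)) {
    cols <- start:min(start + block - 1, p)
    # cor(x, y) computes only the p-by-length(cols) slice we need here
    writeBin(as.vector(cor(x, x[, cols])), con)
  }
  invisible(file)
}
```

Individual columns of the result can then be read back with readBin() as needed. For a more polished route, the bigmemory and ff packages provide file-backed matrix objects designed for exactly this kind of larger-than-RAM problem.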
> Please feel free to share your memory handling techniques in R.
>
> Thanks,
> Suman
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>