Paul Bailey
2008-Jun-17 01:31 UTC
[R] Quickly reading data into the Matrix package's sparse formats
I have a data set that I wish to solve with the Matrix package's sparse matrix functionality. The speed improvements that it has achieved are amazing, with my dense matrix solutions never really taking long enough to time in what I've been able to test so far. However, before I can solve my full linear model, I need to be able to read in all the data, and therein lies the rub.

There are two ways that I see to read it in:

(1) Generate a dense X matrix and then convert it to a sparse matrix, e.g.

R> require(Matrix)
R> Xsparse <- as(X, "dgCMatrix")

(2) Make a new sparse X matrix and then populate it:

R> require(Matrix)
R> Xsparse <- Matrix(0, nrow=n, ncol=m, sparse=TRUE)

then, for the relevant cells:

R> Xsparse[i,j] <- v

But both of these methods are painfully slow. Method 1 takes many times as long as the actual solving and, what's worse, ends up being only about 1/2 as time-consuming as sparse solvers when all is told. It also requires that a dense version of X approximately fit in memory. Method 2 is significantly slower still, taking more than a factor of 10 longer than the dense solver. For method 2 I tried both dgCMatrix and dgTMatrix, with little difference.

I've searched through the documentation on the Matrix package, and there is no mention of this problem or its potential cure. Is there some way that I can format the data that will allow for rapid read-in, or is there some other possible cure?

Cheers,
Paul Bailey
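For illustration, here is a minimal sketch of a third route that avoids both the dense intermediate of method 1 and the cell-by-cell assignment of method 2. It assumes the nonzero entries are already available as (row, column, value) triplets, and that the installed version of Matrix provides the `sparseMatrix()` constructor. The vectors `i`, `j`, `v` and the dimensions below are made-up example data, not values from the original problem.

```r
library(Matrix)

# Hypothetical example data: one entry per nonzero cell of X.
# In practice these three vectors might come from a three-column
# file read with read.table() or scan() (an assumption).
n <- 5                    # number of rows
m <- 4                    # number of columns
i <- c(1, 3, 5, 2)        # row indices (1-based)
j <- c(1, 2, 4, 3)        # column indices (1-based)
v <- c(2.5, -1, 4, 7)     # nonzero values

# Build the whole sparse matrix in one vectorised call instead of
# assigning Xsparse[i, j] <- v one cell at a time.
Xsparse <- sparseMatrix(i = i, j = j, x = v, dims = c(n, m))

print(class(Xsparse))
print(Xsparse)
```

Because all triplets are handed over in a single call, the column-compressed structure is assembled once rather than being modified on every assignment. Whether this is fast enough for the full model would still need to be timed on the real data.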