Avraham.Adler at guycarp.com
2009-May-13 22:21 UTC
[R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices
Hello.

I am trying to optimize a set of parameters using /optim/ in which the actual function to be minimized contains matrix multiplication and is of the form:

SUM ((A%*%X - B)^2)

where A is a matrix and X and B are vectors, with X as parameter vector.

This has worked well so far. Recently, I was given a data set A of size 360440 x 1173, which could not be handled as a normal matrix. I brought it into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix package), and the formulae and gradient work, but /optim/ returns an error of the form "no method for coercing this S4 class to a vector".

After briefly looking into methods and classes, I realize I am in way over my head. Is there any way I could use /optim/ or another optimization algorithm, on sparse matrices?

Thank you very much,

--Avraham Adler
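One plausible cause of the error described above (an assumption, since the original code is not shown) is that the gradient function returns an S4 Matrix object, while /optim/ expects a plain numeric vector. The following is a minimal sketch on hypothetical toy data: the names fn, gr and fit and the random matrix are illustrative only, and the only change from the naive formulation is wrapping the results in as.numeric().

```r
library(Matrix)

set.seed(2)
n <- 300; p <- 4
# Hypothetical toy data standing in for the real 360440 x 1173 matrix
A <- rsparsematrix(n, p, density = 0.4)               # a dgCMatrix
B <- as.numeric(A %*% c(2, -1, 0.5, 1)) + rnorm(n, sd = 0.01)

# Objective SUM((A %*% X - B)^2), returned as a plain numeric scalar
fn <- function(X) sum(as.numeric(A %*% X - B)^2)

# Gradient 2 * t(A) %*% (A %*% X - B); crossprod() returns a Matrix object,
# so it is coerced to a numeric vector before being handed back to optim
gr <- function(X) as.numeric(2 * crossprod(A, A %*% X - B))

fit <- optim(rep(0, p), fn, gr, method = "BFGS")
fit$par
```

The sparse matrix itself is never converted to dense form here; only the short result vectors are coerced.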
spencerg
2009-May-15 02:52 UTC
Have you considered the following:

solve(qr(A), B)

I have not tried this with a small toy example, but the "qr" documentation in the Matrix package seems to suggest it. This solves the optimization problem you mentioned, as noted in "http://en.wikipedia.org/wiki/Linear_least_squares".

Another alternative is the "biglm" function in the package of the same name. I have not tried this either, but it looks like it should work.

Hope this helps.

Spencer Graves
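A small toy check of the QR idea might look like the sketch below. It uses qr.coef() on the sparse QR factorization rather than the solve(qr(A), B) call suggested above, because qr.coef() is the form known to be available for sparse QR objects in the Matrix package; the data and the names X_qr and X_dense are hypothetical.

```r
library(Matrix)

set.seed(1)
n <- 200; p <- 5
# Hypothetical toy problem: sparse A and a response B built from known coefficients
A <- rsparsematrix(n, p, density = 0.3)               # a dgCMatrix
B <- as.numeric(A %*% c(1, -2, 0.5, 3, 0)) + rnorm(n, sd = 0.01)

# Least squares solution via the sparse QR decomposition of A
X_qr <- as.numeric(qr.coef(qr(A), B))

# Reference solution from the ordinary dense QR (feasible only for small A)
X_dense <- as.numeric(qr.coef(qr(as.matrix(A)), B))

all.equal(X_qr, X_dense)
```

On a matrix of the size in the original question the dense reference would not be practical; it is included here only to confirm that the sparse route gives the same coefficients.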
Douglas Bates
2009-May-15 15:57 UTC
On Wed, May 13, 2009 at 5:21 PM, <Avraham.Adler at guycarp.com> wrote:

> I am trying to optimize a set of parameters using /optim/ in which the
> actual function to be minimized contains matrix multiplication and is of
> the form:
>
> SUM ((A%*%X - B)^2)
>
> where A is a matrix and X and B are vectors, with X as parameter vector.

As Spencer Graves pointed out, what you are describing here is a linear least squares problem, which has a direct (i.e. non-iterative) solution. A comparison of the speed of various ways of solving such a system is given in one of the vignettes in the Matrix package.

> This has worked well so far. Recently, I was given a data set A of size
> 360440 x 1173, which could not be handled as a normal matrix. I brought it
> into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
> package), and the formulae and gradient work, but /optim/ returns an error
> of the form "no method for coercing this S4 class to a vector".

If you just want the least squares solution X then

X <- solve(crossprod(A), crossprod(A, B))

will likely be the fastest method where A is the sparse matrix.

I do feel obligated to point out that the least squares solution for such large systems is rarely a sensible solution to the underlying problem. If you have over 1000 columns in A and it is very sparse then likely at least parts of A are based on indicator columns for a categorical variable. In such situations a model with random effects for the category is often preferable to the fixed-effects model you are fitting.
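The normal-equations approach from this reply can be tried on a small example as follows. This is a sketch on hypothetical toy data, not the original 360440 x 1173 problem; the names X_ne and X_ref are illustrative, and the dense lm.fit() call is there only as a cross-check.

```r
library(Matrix)

set.seed(3)
n <- 500; p <- 6
# Hypothetical sparse design matrix and response
A <- rsparsematrix(n, p, density = 0.2)               # a dgCMatrix
B <- as.numeric(A %*% seq_len(p)) + rnorm(n, sd = 0.01)

# crossprod(A) is t(A) %*% A, kept as a sparse symmetric matrix;
# crossprod(A, B) is t(A) %*% B. Solving the system gives the
# least squares coefficients without forming a dense copy of A.
X_ne <- as.numeric(solve(crossprod(A), crossprod(A, B)))

# Cross-check against a dense fit (only practical for small problems)
X_ref <- unname(lm.fit(as.matrix(A), B)$coefficients)

all.equal(X_ne, X_ref)
```

Forming t(A) %*% A squares the condition number of the problem, so for badly conditioned A the QR route may be numerically safer even if it is slower.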