Avraham.Adler at guycarp.com
2009-May-13 22:21 UTC
[R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices
Hello.

I am trying to optimize a set of parameters using /optim/ in which the actual function to be minimized contains matrix multiplication and is of the form:

SUM ((A%*%X - B)^2)

where A is a matrix and X and B are vectors, with X as parameter vector.

This has worked well so far. Recently, I was given a data set A of size 360440 x 1173, which could not be handled as a normal matrix. I brought it into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix package), and the formulae and gradient work, but /optim/ returns an error of the form "no method for coercing this S4 class to a vector".

After briefly looking into methods and classes, I realize I am in way over my head. Is there any way I could use /optim/ or another optimization algorithm, on sparse matrices?

Thank you very much,

--Avraham Adler
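One possible cause of the coercion error, which the thread does not confirm, is that the objective or gradient handed to /optim/ returns an S4 Matrix object rather than a plain numeric vector. A minimal sketch under that assumption follows; the toy sizes and the names fn, gr and x_true are illustrative and do not come from the original post.

```r
library(Matrix)

set.seed(1)
# Small stand-in for the real 360440 x 1173 matrix (sizes are illustrative).
A <- rsparsematrix(200, 5, density = 0.3)
x_true <- c(1, -2, 0.5, 3, -1)
B <- as.numeric(A %*% x_true)

# Objective: sum of squared residuals, returned as a plain numeric scalar.
fn <- function(x) {
  r <- as.numeric(A %*% x) - B   # coerce the S4 result to a numeric vector
  sum(r^2)
}

# Gradient: 2 * t(A) %*% (A x - B), again coerced to a numeric vector.
gr <- function(x) {
  r <- as.numeric(A %*% x) - B
  2 * as.numeric(crossprod(A, r))
}

fit <- optim(rep(0, 5), fn, gr, method = "BFGS")
```

Wrapping every product involving A in as.numeric() means optim only ever sees base R vectors, while the multiplication itself still uses the sparse representation.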
spencerg
2009-May-15 02:52 UTC
[R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices
Have you considered the following:
solve(qr(A), B)
I have not tried this, even with a small toy example, but the "qr"
documentation in the Matrix package seems to suggest it. This solves
the optimization problem you mentioned, as noted in
"http://en.wikipedia.org/wiki/Linear_least_squares".
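A toy version of the QR suggestion might look like the sketch below. It is only a sketch: it uses qr.coef(), which the Matrix package provides for sparse QR factorizations, in place of solve() on the QR object, and the small random A and B are made-up stand-ins for the real data.

```r
library(Matrix)

set.seed(2)
# Illustrative sparse design matrix and noisy response.
A <- rsparsematrix(200, 5, density = 0.3)
B <- as.numeric(A %*% c(1, -2, 0.5, 3, -1)) + rnorm(200, sd = 0.1)

# Sparse QR factorization, then the least squares coefficients.
qrA  <- qr(A)
x_qr <- as.numeric(qr.coef(qrA, B))

# Dense reference solution, feasible only because this example is tiny.
x_dense <- qr.solve(as.matrix(A), B)
```

On a matrix this small the sparse and dense answers should agree to rounding error; the dense check would not be practical at 360440 x 1173.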
Another alternative is the "biglm" function in the package of the
same name. I have not tried this either, but it looks like it should
work.
Hope this helps.
Spencer Graves
Avraham.Adler at guycarp.com wrote:
> Hello.
>
> I am trying to optimize a set of parameters using /optim/ in which the
> actual function to be minimized contains matrix multiplication and is of
> the form:
>
> SUM ((A%*%X - B)^2)
>
> where A is a matrix and X and B are vectors, with X as parameter vector.
>
> This has worked well so far. Recently, I was given a data set A of size
> 360440 x 1173, which could not be handled as a normal matrix. I brought it
> into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
> package), and the formulae and gradient work, but /optim/ returns an error
> of the form "no method for coercing this S4 class to a vector".
>
> After briefly looking into methods and classes, I realize I am in way over
> my head. Is there any way I could use /optim/ or another optimization
> algorithm, on sparse matrices?
>
> Thank you very much,
>
> --Avraham Adler
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
Douglas Bates
2009-May-15 15:57 UTC
[R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices
On Wed, May 13, 2009 at 5:21 PM, <Avraham.Adler at guycarp.com> wrote:
> Hello.
>
> I am trying to optimize a set of parameters using /optim/ in which the
> actual function to be minimized contains matrix multiplication and is of
> the form:
>
> SUM ((A%*%X - B)^2)
>
> where A is a matrix and X and B are vectors, with X as parameter vector.

As Spencer Graves pointed out, what you are describing here is a linear least squares problem, which has a direct (i.e. non-iterative) solution. A comparison of the speed of various ways of solving such a system is given in one of the vignettes in the Matrix package.

> This has worked well so far. Recently, I was given a data set A of size
> 360440 x 1173, which could not be handled as a normal matrix. I brought it
> into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
> package), and the formulae and gradient work, but /optim/ returns an error
> of the form "no method for coercing this S4 class to a vector".

If you just want the least squares solution X then

X <- solve(crossprod(A), crossprod(A, B))

will likely be the fastest method where A is the sparse matrix.

I do feel obligated to point out that the least squares solution for such large systems is rarely a sensible solution to the underlying problem. If you have over 1000 columns in A and it is very sparse then likely at least parts of A are based on indicator columns for a categorical variable. In such situations a model with random effects for the category is often preferable to the fixed-effects model you are fitting.

> After briefly looking into methods and classes, I realize I am in way over
> my head. Is there any way I could use /optim/ or another optimization
> algorithm, on sparse matrices?
>
> Thank you very much,
>
> --Avraham Adler
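The normal-equations call suggested in this reply, solve(crossprod(A), crossprod(A, B)), can be tried on a small random sparse matrix as in the sketch below. The sizes, the seed and the gradient check are illustrative additions, not part of the original message.

```r
library(Matrix)

set.seed(42)
# Illustrative sparse matrix and response; the real A is 360440 x 1173.
A <- rsparsematrix(500, 8, density = 0.2)
B <- rnorm(500)

# Solve the normal equations t(A) A X = t(A) B.
# crossprod(A) stays sparse and symmetric, so a sparse factorization is used.
X <- as.numeric(solve(crossprod(A), crossprod(A, B)))

# At the least squares solution the gradient t(A) (A X - B) should vanish.
resid <- as.numeric(A %*% X) - B
g <- as.numeric(crossprod(A, resid))
```

Checking that g is numerically zero is a cheap way to confirm the solution without forming a dense copy of A.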