rdiaz@cnio.es
2003-Apr-22 20:07 UTC
[Rd] "LAPACK routine DGESDD gave error code -12" with Debian (PR#2822)
Dear All,

Under Debian GNU/Linux La.svd (with method = "dgesdd") sometimes gives the
error

"Error in La.svd(data, nu = 0, nv = min(nrow, ncol), method = "dgesdd") :
LAPACK routine DGESDD gave error code -12"

It seems not to depend on the data per se, but on the relationship between
numbers of rows and columns.

For example, if the number of columns is 100, La.svd will fail when the number
of rows is 56, but not if it is 55 or 57. It will not fail if we use
"dgesvd". If the number of columns is 51, La.svd fails when the number of
rows is between 29 and 50 if we use "dgesdd".

This happens if I use the latest deb packages (and thus ATLAS, etc). It does
not happen if I build R in this same machine with "--without-blas" (where
make check reports no errors). In case it matters, the bug does not show up
in a different machine with Windows 2000 and the Rblas.dll linked against
ATLAS provided in http://cran.r-project.org/bin/windows/contrib/ATLAS/P4.

I understand this is probably related to the issues mentioned in R-admin about
LAPACK 3.0 and some of the issues recently discussed in this list by M.
Burger, D. Bates and D. Eddelbuettel. Are there any workarounds (besides not
using ATLAS at all)?

Ramón

********************************
An example of failure:

> ## ncol = 100
> nrow <- 56
> ncol <- 100
> data <- matrix(1:(nrow * ncol), ncol = ncol)
> ## you get the errors if you use any other data
> ## such as data <- matrix(rnorm(nrow * ncol), ncol = ncol)
> svd(data) ## error
> La.svd(data, nu = 0, nv = min(nrow, ncol), method = "dgesdd") ## error
> La.svd(data, nu = 0, nv = min(nrow, ncol), method = "dgesvd") ## OK

> ## ncol = 51; it fails with nrow in [29, 50]
*************************

> ## version that crashes
> version
         _
platform i386-pc-linux-gnu
arch     i386
os       linux-gnu
system   i386, linux-gnu
status
major    1
minor    7.0
year     2003
month    04
day      16
language R
*****************************

> ## version "--without-blas" that does not crash
> version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    7.0
year     2003
month    04
day      16
language R

-----
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://bioinfo.cnio.es/~rdiaz

Kurt Hornik
2003-Apr-22 20:18 UTC
[Rd] "LAPACK routine DGESDD gave error code -12" with Debian (PR#2822)
>>>>> rdiaz writes:

> Dear All,
> Under Debian GNU/Linux La.svd (with method = "dgesdd") sometimes gives the
> error

> "Error in La.svd(data, nu = 0, nv = min(nrow, ncol), method = "dgesdd") :
> LAPACK routine DGESDD gave error code -12"

> It seems not to depend on the data per se, but on the relationship between
> numbers of rows and columns.

> For example, if the number of columns is 100, La.svd will fail when
> the number of rows is 56, but not if it is 55 or 57. It will not fail
> if we use "dgesvd". If the number of columns is 51, La.svd fails when
> the number of rows is between 29 and 50 if we use "dgesdd".

> This happens if I use the latest deb packages (and thus ATLAS,
> etc). It does not happen if I build R in this same machine with
> "--without-blas" (where make check reports no errors). In case it
> matters, the bug does not show up in a different machine with Windows
> 2000 and the Rblas.dll linked against ATLAS provided in
> http://cran.r-project.org/bin/windows/contrib/ATLAS/P4.

> I understand this is probably related to the issues mentioned in
> R-admin about LAPACK 3.0 and some of the issues recently discussed in
> this list by M. Burger, D. Bates and D. Eddelbuettel.
> Are there any workarounds (besides not using ATLAS at all?).

> Ramón

> ********************************
> An example of failure:

>> ## ncol = 100
>> nrow <- 56
>> ncol <- 100
>> data <- matrix(1:(nrow * ncol), ncol = ncol)
>> ## you get the errors if you use any other data
>> ## such as data <- matrix(rnorm(nrow * ncol), ncol = ncol)
>> svd(data) ## error
>> La.svd(data, nu = 0,
>        nv = min(nrow, ncol), method = "dgesdd") ## error
>> La.svd(data, nu = 0,
>        nv = min(nrow, ncol), method = "dgesvd") ## OK

>> ## ncol = 51; it fails with nrow in [29, 50]
> *************************

Confirmed on Debian GNU/Linux testing with atlas2-base-dev for both the
current 1.7.0 debs and 1.8.0 built from scratch using --with-lapack:

hornik@mithrandir:~/tmp$ ldd /usr/local/lib/R/modules/lapack.so
        libR.so => not found
        liblapack.so.2 => /usr/lib/atlas/liblapack.so.2 (0x40013000)

-k

Dirk Eddelbuettel
2003-Apr-23 02:33 UTC
[Rd] "LAPACK routine DGESDD gave error code -12" with Debian (PR#2822)
On Tue, Apr 22, 2003 at 08:07:36PM +0200, rdiaz@cnio.es wrote:
> Dear All,
>
> Under Debian GNU/Linux La.svd (with method = "dgesdd") sometimes gives the
> error
>
> "Error in La.svd(data, nu = 0, nv = min(nrow, ncol), method = "dgesdd") :
> LAPACK routine DGESDD gave error code -12"
>
> It seems not to depend on the data per se, but on the relationship between
> numbers of rows and columns.
>
> For example, if the number of columns is 100, La.svd will fail when the number
> of rows is 56, but not if it is 55 or 57. It will not fail if we use
> "dgesvd". If the number of columns is 51, La.svd fails when the number of
> rows is between 29 and 50 if we use "dgesdd".
>
> This happens if I use the latest deb packages (and thus ATLAS, etc). It does
> not happen if I build R in this same machine with "--without-blas" (where
> make check reports no errors). In case it matters, the bug does not show up
> in a different machine with Windows 2000 and the Rblas.dll linked against
> ATLAS provided in http://cran.r-project.org/bin/windows/contrib/ATLAS/P4.
>
> I understand this is probably related to the issues mentioned in R-admin about
> LAPACK 3.0 and some of the issues recently discussed in this list by M.
> Burger, D. Bates and D. Eddelbuettel. Are there any workarounds (besides not
> using ATLAS at all?).

We should probably talk to Camm, the Atlas maintainer. Note how on recent
upgrades he inserted the following note via debconf (cf
/var/lib/dpkg/info/atlas2-3dnow.templates on my Athlon system):

  Template: atlas2-3dnow/3dnow_warning
  Type: note
  Description: 3dnow arithmetic is not IEEE compliant
   Please note that 3dnow arithmetic does not furnish several results
   required by the IEEE standard, and may therefore cause errors in code
   which needs to trap NaN and Inf results, for example. The atlas2-3dnow
   binaries make heavy use of the 3dnow extensions. Please see the
   accompanying file /usr/share/doc/atlas2-3dnow/3DNow.txt for details.

I know Camm is on a sabbatical but will CC him nonetheless.

Dirk

--
Don't drink and derive. Alcohol and algebra don't mix.

bates@stat.wisc.edu
2003-Apr-24 19:03 UTC
[Rd] "LAPACK routine DGESDD gave error code -12" with Debian (PR#2822)
Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk> writes:

> Camm Maguire <camm@enhanced.com> writes:
>
> > 2) Given the error code, and the scaling behavior with matrix size,
> > I'd say the lwork parameter (size of the work array) passed to
> > dgesdd is not always large enough, i.e. is not scaling properly
> > with n,m. Please see 'man dgesdd' for interpretations of the error
> > code. It is the responsibility of the calling routine to allocate
> > and pass the work array to dgesdd. With most lapack routines, one
> > can make a 'workspace query' call first by setting lwork to -1, or
> > some such. check the man page for details. This of course would
> > have to be done with each change in n,m. Alternatively, you could
> > take the minimum workspace requirements from the manpage.
>
> Right, but that's actually what we do, use the workspace query. It's
> all very weird, because the -12 value indicates that the lwork
> parameter is wrong, but it is computed from an exactly identical call,
> except lwork=-1:
>
>     lwork = -1;
>
>     F77_CALL(dgesdd)(CHAR(STRING_ELT(jobu, 0)),
>                      &n, &p, xvals, &n, REAL(s),
>                      REAL(u), &ldu,
>                      REAL(v), &ldvt,
>                      &tmp, &lwork, iwork, &info);
>     lwork = (int) tmp;

It looks like a problem in ilaenv, the Lapack routine that returns the
tuning parameters, like the optimal temporary storage size, for
various Lapack routines. The value returned in tmp will depend upon
the results of several calls to ilaenv. These results can vary between
different implementations of the blas (or atlas).

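[Editorial illustration] On the reading of "-12" above: LAPACK routines report a negative INFO of -i when their i-th argument is rejected, and LWORK is the 12th argument of DGESDD. A minimal C sketch of that decoding follows. The helper name dgesdd_bad_argument is hypothetical; it is not part of R or LAPACK, and the argument list is taken from the documented DGESDD calling sequence.

```c
#include <stddef.h>

/* Argument names of DGESDD in calling order, as given in the LAPACK
 * documentation for the routine. Index 0 holds argument number 1. */
static const char *const dgesdd_args[] = {
    "JOBZ", "M", "N", "A", "LDA", "S", "U", "LDU",
    "VT", "LDVT", "WORK", "LWORK", "IWORK", "INFO"
};

/* Hypothetical helper (an assumption for illustration only).
 * Returns the name of the argument flagged by a negative INFO value,
 * or NULL when INFO is non-negative or out of range. */
const char *dgesdd_bad_argument(int info)
{
    const int nargs = (int)(sizeof dgesdd_args / sizeof dgesdd_args[0]);

    /* Compare against -nargs directly so that -info is never computed
     * for values such as INT_MIN. */
    if (info >= 0 || info < -nargs)
        return NULL;
    return dgesdd_args[-info - 1];
}
```

With this table, dgesdd_bad_argument(-12) yields "LWORK", which matches the interpretation in the thread that the work array size is what the routine objected to.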
camm@enhanced.com
2003-Apr-25 15:29 UTC
[Rd] "LAPACK routine DGESDD gave error code -12" with Debian (PR#2822)
Greetings!

Douglas Bates <bates@stat.wisc.edu> writes:

> Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk> writes:
>
> > Camm Maguire <camm@enhanced.com> writes:
> >
> > > 2) Given the error code, and the scaling behavior with matrix size,
> > > I'd say the lwork parameter (size of the work array) passed to
> > > dgesdd is not always large enough, i.e. is not scaling properly
> > > with n,m. Please see 'man dgesdd' for interpretations of the error
> > > code. It is the responsibility of the calling routine to allocate
> > > and pass the work array to dgesdd. With most lapack routines, one
> > > can make a 'workspace query' call first by setting lwork to -1, or
> > > some such. check the man page for details. This of course would
> > > have to be done with each change in n,m. Alternatively, you could
> > > take the minimum workspace requirements from the manpage.
> >
> > Right, but that's actually what we do, use the workspace query. It's
> > all very weird, because the -12 value indicates that the lwork
> > parameter is wrong, but it is computed from an exactly identical call,
> > except lwork=-1:
> >
> >     lwork = -1;
> >
> >     F77_CALL(dgesdd)(CHAR(STRING_ELT(jobu, 0)),
> >                      &n, &p, xvals, &n, REAL(s),
> >                      REAL(u), &ldu,
> >                      REAL(v), &ldvt,
> >                      &tmp, &lwork, iwork, &info);
> >     lwork = (int) tmp;
>
> It looks like a problem in ilaenv, the Lapack routine that returns the
> tuning parameters, like the optimal temporary storage size, for
> various Lapack routines. The value returned in tmp will depend upon
> the results of several calls to ilaenv. These results can vary between
> different implementations of the blas (or atlas).
>

Just a note that you should check the returned info value on the
workspace call to make sure tmp has been filled in. (The man page says
this is done only if the other parameters are valid.)

Does this problem vary with blas, and if so, how? You can run under
whatever blas you want, including reference, via the LD_LIBRARY_PATH
variable.
Assuming you've installed all the i386 atlas versions and the blas
package:

LD_LIBRARY_PATH                     used blas:
/usr/lib                            reference
/usr/lib/atlas                      base vanilla i386 atlas
/usr/lib/{sse,sse2,3dnow}/atlas     atlas with ISA extensions

ilaenv might possibly be an issue, but only realistically if the
problem blas is coming from the 3dnow atlas. When I put together the
lapack package, I read in the lapack notes how many of the ilaenv
constants can be hardwired, saving a certain amount of time on the
first call. I chose not to do this only because of the existence of
the 3dnow, non-ieee compliant blas option at runtime, as one of the
parameters pertains directly to ieee. So ilaenv is calculating its
parameters on first call at runtime on Debian. I'm dubious as to the
relevance of this, though, as this should only kick in for single
precision if at all, and should not affect integer values in any case.

Take care,

--
Camm Maguire                                        camm@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah
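
[Editorial illustration] The advice above, to check info after the workspace query before trusting tmp, could be sketched in C roughly as follows. This is only a sketch under stated assumptions: lwork_from_query and its min_lwork parameter are hypothetical names, not R or LAPACK code, and min_lwork stands for whatever lower bound the caller takes from the dgesdd man page, as suggested earlier in the thread.

```c
#include <limits.h>

/* Hypothetical helper (an assumption for illustration only).
 * Turns the result of a workspace query (a call made with lwork = -1)
 * into an integer LWORK for the real call.
 *
 *   tmp        value the query wrote into WORK(1)
 *   info       INFO returned by the query call
 *   min_lwork  caller-supplied lower bound, e.g. the documented minimum
 *
 * Returns the LWORK to use, or -1 if the query result cannot be trusted. */
int lwork_from_query(double tmp, int info, int min_lwork)
{
    int lwork;

    /* A non-zero INFO means the query itself failed, so tmp may never
     * have been filled in. */
    if (info != 0)
        return -1;

    /* Reject NaN, non-positive sizes, and values that do not fit an int.
     * The negated comparison is what catches NaN. */
    if (!(tmp >= 1.0) || tmp > (double)INT_MAX)
        return -1;

    /* Round up rather than truncate, so a fractional value never
     * produces a workspace that is one element too small. */
    lwork = (int)tmp;
    if ((double)lwork < tmp)
        lwork++;

    /* Never go below the caller's documented minimum. */
    if (lwork < min_lwork)
        lwork = min_lwork;

    return lwork;
}
```

A caller would run the query, pass tmp and info through this check, and allocate the work array only when the result is positive. Clamping to a documented minimum is one possible guard against a query that returns too small a value; whether that is what happens in the failing cases here is not established in this thread.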