Ritter, Christian C SRTCL-CTGAS
2000-Sep-29  07:33 UTC
[R] non-ideal behavior in princomp/ not a feature but a bug
... I checked and Brian and I are both right (see bottom for prior mail exchange). Let me explain: ============================================================1. Indeed, in principle, princomp allows data matrices with are wider than high. Example:> x1[,1] [,2] [,3] [,4] [1,] 1 1 2 2 [2,] 1 1 2 2> princomp(x1)Call: princomp(x = x1) Standard deviations: Comp.1 Comp.2 Comp.3 Comp.4 0 0 0 0 4 variables and 2 observations. So far so good ========================================================== 2. However, not all matrices work, such as the first one I tried yesterday. Here is an example of a matrix which does not work.> x[,1] [,2] [,3] [,4] [1,] -0.4 -1.6 -1.0 -1.8 [2,] -2.2 -0.3 1.4 0.2> princomp(x)Error in princomp(x) : covariance matrix is not non-negative definite>Seeing this behavior on the matrix I tried, I hastily concluded that princomp was not suitable for this type (as similar functions are in many other software packages which are written for psychologists but not for chemometricians). ======================================================However, the problem is not with the design of princomp in general, but with a little check it performs inside, which can have problems with roundoff: if (any(edc$values < 0)) stop("covariance matrix is not non-negative definite") As it happens, the covariance of x1 has eigenvalues of:> eigen(cov.wt(x1)$cov)$values[1] 0 0 0 0 and x has eigenvalues> eigen(cov.wt(x)$cov)$values[1] 7.345000e+00 1.110223e-16 0.000000e+00 -1.110223e-16 Therefore, the observed behavior is logical (but wrong). Oh, by the way: _ platform Windows arch x86 os Win32 system x86, Win32 status major 1 minor 1.0 year 2000 month June day 15 language R>-----Original Message----- From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] Sent: Thursday, September 28, 2000 2:16 PM To: r-help at stat.math.ethz.ch; Ritter, Christian C SRTCL-CTGAS Subject: Re: [R] non-ideal behavior in princomp> From: "Ritter, Christian C SRTCL-CTGAS" <Christian.C.Ritter at OPC.shell.com> > Date: Thu, 28 Sep 2000 12:59:25 +0200 > > This problem is not limited to R, but R is one of the packages in which it > arises.I'm sorry, but I must be missing something here. I think R (and S) do already do what you want here.> princomp is a nice function which creates an object for which inspection > methods have been written.But there is also prcomp. (What are `inspection functions'? Do you mean methods for what Bill Venables sometimes calls accessor functions?) princomp is written for more flexibility, for example to use robust methods of extracting PCs.> Unfortunately, princomp does not admit cases in which the x matrix iswider> than high (i. e. more variables than observations). Such cases are typicalReally?> in spectroscopy and related disciplines. It would be nice if the following > two features were added to princomp: > > 1. as a default behavior, princomp should digest wider than high matrices > (it should just compute the nontrivial principal compontents).prcomp does exactly that, although there are problems for which the trivial PCs are the interesting ones! What problems are you finding with princomp? If I give it a n x p matrix with n < p, I get (correctly) p PCs, with (effectively) 0 standard deviations for the last n-p. It's possible that this does not always work for numerical reasons, but can you describe the `problem' you feel it has? (Perhaps the criterion for non-negative-definiteness is too stringent?)> 2. as an optional behavior, princomp should only calculate (and return)the> first A principal components. This is useful, if x is very wide and very > short.Well, only min(n, p) PC's are defined, so I presume you want A < min(n, p)? Then you just extract the first A PCs. If A << min(n, p) there are faster algorithms, but they are not currently implemented in R, and I can think of very few problems where the speed difference would be noticeable.> Some might remark that all this can be done easily by svd or byprogramming> the NIPALS algorithm, but I would prefer to see it in the common version.It is, in prcomp which uses the svd. -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Prof Brian D Ripley
2000-Sep-29  08:32 UTC
[R] non-ideal behavior in princomp/ not a feature but a bug
On Fri, 29 Sep 2000, Ritter, Christian C SRTCL-CTGAS wrote:> ... I checked and Brian and I are both right (see bottom for prior mail > exchange).Um: you only appeared to be right because you use an obselete version of R (and I don't think you said so), whereas I used the current release. Moral: always upgrade before posting bugs.> Let me explain: > ============================================================> 1. Indeed, in principle, princomp allows data matrices with are wider than > high. > Example: > > x1 > [,1] [,2] [,3] [,4] > [1,] 1 1 2 2 > [2,] 1 1 2 2 > > princomp(x1) > Call: > princomp(x = x1) > > Standard deviations: > Comp.1 Comp.2 Comp.3 Comp.4 > 0 0 0 0 > > 4 variables and 2 observations. > > So far so good > ==========================================================> > 2. However, not all matrices work, such as the first one I tried yesterday. > Here is > an example of a matrix which does not work. > > x > [,1] [,2] [,3] [,4] > [1,] -0.4 -1.6 -1.0 -1.8 > [2,] -2.2 -0.3 1.4 0.2 > > princomp(x) > Error in princomp(x) : covariance matrix is not non-negative definite > > > > Seeing this behavior on the matrix I tried, I hastily concluded that > princomp was not > suitable for this type (as similar functions are in many other software > packages which > are written for psychologists but not for chemometricians). > > ======================================================> However, the problem is not with the design of princomp in general, but with > a little > check it performs inside, which can have problems with roundoff: > > if (any(edc$values < 0)) > stop("covariance matrix is not non-negative definite")Use the current version of R: the check is actually if (any(ev[neg] < -9 * .Machine$double.eps * ev[1])) Now, that may not be adequate (I don't know where the precise value comes from) and had I written it I would have used more like 100 than 9. But it covered all the examples I tried.> As it happens, the covariance of x1 has eigenvalues of: > > eigen(cov.wt(x1)$cov)$values > [1] 0 0 0 0 > and x has eigenvalues > > eigen(cov.wt(x)$cov)$values > [1] 7.345000e+00 1.110223e-16 0.000000e+00 -1.110223e-16 > Therefore, the observed behavior is logical (but wrong). > > Oh, by the way: > _ > platform Windows > arch x86 > os Win32 > system x86, Win32 > status > major 1 > minor 1.0That's not current. [...] -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Paul Gilbert
2000-Sep-29  13:37 UTC
[Rd] Re: [R] non-ideal behavior in princomp/ not a feature but a bug
>2. However, not all matrices work, such as the first one I tried yesterday. >Here is >an example of a matrix which does not work. >> x > [,1] [,2] [,3] [,4] >[1,] -0.4 -1.6 -1.0 -1.8 >[2,] -2.2 -0.3 1.4 0.2 >> princomp(x) >Error in princomp(x) : covariance matrix is not non-negative definiteThis does not happen with this example in Solaris, but I have had similar problems with eigen before and had to use something like eigen( (cv + t(cv))/2 ) to avoid it. It seems to me that if eigen is called with symmetric = TRUE, as is done in princomp, then eigen should return an error message about a non-symmetric matrix, rather than return a possibly spurious answer from which negative eigenvalues are checked to test for a non symmetric matrix. Paul Gilbert -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._