brech at delphioutpost.com
2007-Apr-19 15:03 UTC
[Rd] qr.coef: permutes dimnames; inserts NA; promises minimum-length (PR#9623)
Full_Name: Christian Brechbuehler Version: 2.4.1 Patched (2007-03-25 r40917) OS: Linux 2.6.15-27-adm64-xeon; Ubuntu 6.06.1 LTS Submission from: (NULL) (24.61.47.236) Splus and R have different ideas about what qr.coef(qr()) should return, which is fine... but I believe that R has a bug in that it is not internally consistent, and another separate bug in the documentation. In particular, on this problem, Splus generates a permuted solution vector: Splus 6.2.1: > A <- matrix(c(0,0,0, 1,1,1), nrow=3,dimnames=list(NULL, c("zero", "one"))) > A zero one [1,] 0 1 [2,] 0 1 [3,] 0 1 > y <- matrix(c(6,7,8), nrow=3, dimnames=list(NULL, "y")) > y y [1,] 6 [2,] 7 [3,] 8 > x <- qr.coef(qr(A), y) > x y one 7 zero 0 > To my taste, the answer from Splus is unexpected; I was looking for this: > x[qr(A)$pivot,,drop=F] y zero 0 one 7 But at least the answer is internally consistent: the values and the row names were scrambled in corresponding ways. However, R seems to helpfully un-permute the data values, but forgets to un-permute the dimnames, which seems broken. R version 2.4.1 Patched (2007-03-25 r40917) > A <- matrix(c(0,0,0, 1,1,1), nrow=3,dimnames=list(NULL, c("zero", "one"))) > y <- matrix(c(6,7,8), nrow=3, dimnames=list(NULL, "y")) > x <- qr.coef(qr(A), y) > x y one NA zero 7 I think the answer from qr.coef() should look more like this: > x.wanted y zero NA one 7 > In addition, R uses NA to fill in rather than zero. NA for this problem above might mean "undefined": any value will do. But given the application of this function, where the solution might be multiplied by the matrix, any NA will cause that to turn into an NA result... in math, zero times anything is zero, but in R, NA times anything (even zero) is NA. This seems somewhat inconvenient. So I'd really like R to return this: > x y zero 0 one 7 More on the scrambling; I see in dqrsl.f that: c if pivoting was requested in dqrdc, the j-th c component of b will be associated with column jpvt(j) c of the original matrix x that was input into dqrdc.) which I think is referring to the helpful un-permuting, but I think qr.coef() in R needs to correspondingly un-permute the dimnames as well? Code from qr.coef: if (!is.null(nam <- colnames(qr$qr))) rownames(coef) <- nam Proposed fix: That second line could be changed to rownames(coef)[qr$pivot] <- nam Another issue (either with the documentation or the code) is as follows. In R, help(qr) promises: 'solve.qr' is the method for 'solve' for 'qr' objects. 'qr.solve' solves systems of equations via the QR decomposition: if 'a' is a QR decomposition it is the same as 'solve.qr', but if 'a' is a rectangular matrix the QR decomposition is computed first. Either will handle over- and under-determined systems, providing a minimal-length solution or a least-squares fit if appropriate. So unlike Splus, R promises a minimal-length solution, but I don't think it delivers that. For example, let's get the minimal-length solution for the following under-determined system: x1 + 2*x2 == 5 qr.solve would fail due to the singular matrix, so use qr.coef: > qr.coef(qr(t(1:2)),c(5)) R: [1] 5 NA Splus: [1] 5 0 1*5 + 2*0 does equal 5, so modulo NA, this is a solution, but it is NOT minimal-length. OTOH, svd gives the right answer in both Splus and R: > d <- svd(t(1:2)); c(5) %*% d$u %*% diag(1/d$d, length(d$d)) %*% t(d$v) [,1] [,2] [1,] 1 2 This *is* minimum length, and 1*1 + 2*2 == 5. So either qr.coef(qr()) should deliver that, or the documentation should clarify when a minimal-length solution is delivered? And revisiting the NA issue... in this problem, for the qr.coef() result to be a solution, that 2nd value MUST be zero, no other values will work. So the NA seems wrong here, more clearly than in the other example. Finally, a simpler test case that shows the same issues with a 2x1 example instead of 2x3; any fix to qr.coef() should handle this as well:> qr.coef(qr(matrix(0:1,1,dimnames=list(NULL, c("zero","one")))),5)one zero NA 5 I think that should return: zero one 0 5 or at least: zero one NA 5
Possibly Parallel Threads
- (PR#9623) qr.coef: permutes dimnames; inserts NA; promises
- typo or stale info in qr man
- bug in qr.coef() and (therefore) in qr.solve (PR#8476)
- Bug on qr.coef when qr is created by a zero matrix with colnames and all y equals zero
- qr.coef and complex numbers - still busted for non-square case? (PR#13305)