(PR#9623) qr.coef: permutes dimnames; inserts NA; promises
From: Prof Brian Ripley <ripley at stats.ox.ac.uk>
Date: Tue, 1 May 2007 15:01:51 +0100 (BST)

On Thu, 19 Apr 2007, brech at delphioutpost.com wrote:

Full_Name: Christian Brechbuehler
Version: 2.4.1 Patched (2007-03-25 r40917)
OS: Linux 2.6.15-27-adm64-xeon; Ubuntu 6.06.1 LTS
Submission from: (NULL) (24.61.47.236)

I believe that R has a bug in that it is not internally consistent, and another separate bug in the documentation.
I agree with the bug in the dimnames, and have corrected that in 2.5.0 patched.
I see it in svn:
     if(!is.null(nam <- colnames(qr$qr)))
-        rownames(coef) <- nam
+        if(k < p) rownames(coef)[qr$pivot] <- nam
+        else rownames(coef) <- nam
+
Thank you!
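
The effect of the corrected branch can be sketched on a small rank-deficient matrix with named columns. The matrix `x`, the response `y`, and the column names are made up for illustration, and the comments describe what one would expect from an R version that includes the fix quoted above.

```r
# Hypothetical data: column "b" is an exact multiple of column "a",
# so the QR routine should pivot it to the end.
x <- cbind(a = c(1, 1, 1), b = c(2, 2, 2), c = c(1, 2, 3))
y <- c(1, 2, 3)            # equals column "c"

q <- qr(x)
q$rank                     # rank found by the decomposition
q$pivot                    # column order used internally

coefs <- qr.coef(q, y)
coefs                      # with the fix, the NA should sit under "b"
```

Without the `qr$pivot` indexing, the names would follow the pivoted column order while the values follow the original order, which is the mismatch reported in the subject line.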
But I think the rest stems from the following misunderstanding:
in math, zero times anything is zero, but in R, NA times anything (even zero) is NA. This seems somewhat inconvenient.
That just is not true.
OK, I accept that.
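
For reference, the arithmetic under discussion can be checked directly at the prompt. The three expressions below are chosen here for illustration and are not taken from the thread.

```r
r1 <- 0 * NA     # a missing value stays missing
r2 <- 0 * Inf    # not zero either: the result is NaN
r3 <- NA^0       # one of the few cases where NA does not propagate

is.na(r1)
is.nan(r2)
r3
```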
Stemming from this, what R is reporting is that certain columns are not used in the calculation.
OK, I understand qr.coef indicates with NA that dqrdc2 decided to exclude the corresponding columns (because of linear dependency). I still think the documentation is misleading -- trimming to the essence:
help(qr): 'solve.qr' is the method for 'solve' for 'qr' objects. 'qr.solve' solves systems of equations via the QR decomposition: if 'a' is a QR decomposition it is the same as 'solve.qr', but if 'a' is a rectangular matrix the QR decomposition is computed first. Either will handle over- and under-determined systems, providing a minimal-length solution or a least-squares fit if appropriate.
(A)
'qr.solve' and 'solve.qr' will NOT handle under-determined systems.
They both perform this check:
    if (a$rank != nc)
        stop("singular matrix 'a' in solve")
But 'qr.coef', which they call when all is well, does.
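
The difference between the two code paths can be sketched as follows. A singular 2 x 2 matrix is used here (an assumption for illustration, not the example from this report) so that the rank check fires regardless of how a given R version treats non-square input.

```r
# Hypothetical singular 2 x 2 matrix: second column is twice the first.
a <- matrix(c(1, 2, 2, 4), nrow = 2)
b <- c(1, 2)

qa <- qr(a)
qa$rank                    # less than ncol(a)

# qr.solve() refuses because of the rank check; capture the message.
res <- tryCatch(qr.solve(a, b), error = function(e) conditionMessage(e))
res

# qr.coef() returns an answer, marking the dropped column with NA.
cf <- qr.coef(qa, b)
cf
```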
(B)
Help promises a minimal-length solution, but QR does not deliver that.
> qr.coef(qr(t(1:2)), 5)
[1] 5 NA
1*5 + 2*0 does equal 5, so this is a solution, but it is NOT minimal-length. OTOH
> d <- svd(t(1:2)); 5 %*% d$u %*% (d$d^-1 * t(d$v))
     [,1] [,2]
[1,]    1    2
This *is* minimum length, and 1*1 + 2*2 == 5.
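
The two solutions can be compared side by side. This is a minimal sketch: replacing the NA by zero and writing the SVD solution as V (U'b / d) are choices made here for illustration, and the variable names are hypothetical.

```r
a <- t(1:2)   # the 1 x 2 system from the example above
b <- 5

# QR route: treat the NA (dropped column) as zero.
x_qr <- qr.coef(qr(a), b)
x_qr[is.na(x_qr)] <- 0

# SVD route: pseudoinverse applied to b.
d <- svd(a)
x_svd <- drop(d$v %*% ((t(d$u) %*% b) / d$d))

drop(a %*% x_qr)          # both reproduce b
drop(a %*% x_svd)
sqrt(sum(x_qr^2))         # Euclidean length of each solution
sqrt(sum(x_svd^2))
```

Both vectors solve the system, but only the SVD one has the smaller Euclidean length.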
The documentation should clarify when a minimal-length solution is
provided.
Maybe the phrase "a minimal-length solution or" should be removed?

/Christian Brechbuehler