I'm interested in writing some model selection functions (for linear regression models, as a start) that incorporate the PRESS criterion, since to my knowledge it is not currently implemented in any available model selection procedure.

I thought it would be simplest to build on existing functions such as regsubsets in package leaps. It's easy enough to calculate the PRESS criterion for a fitted lm object, but I'm having trouble deciphering the structure of the regsubsets objects that leaps works with. Is there any way to calculate PRESS from a regsubsets object? Or, to put it another way, can I get the residual vector and the diagonal entries of the hat matrix from a regsubsets object? In fact, if the hat matrix is never calculated explicitly, the columns of Q from the QR factorization would suffice.

Thanks,
Andrew Smith
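For readers following along, the lm case mentioned above can be sketched as follows. This is a minimal illustration, not code from the thread: the helper name press_lm and the mtcars model are assumptions chosen for the example. It relies on the standard identity that the leave-one-out residual equals e_i / (1 - h_i), and checks the result against a brute-force refit.

```r
# PRESS for a fitted lm object: sum of squared leave-one-out residuals,
# computed from the ordinary residuals e and leverages h as e / (1 - h).
# press_lm is a hypothetical helper name, not part of any package.
press_lm <- function(fit) {
  e <- residuals(fit)
  h <- hatvalues(fit)
  sum((e / (1 - h))^2)
}

# Illustrative model on a built-in data set (an assumption for the example).
fit <- lm(mpg ~ wt + hp, data = mtcars)
press_value <- press_lm(fit)

# Brute-force check: refit without row i and predict the held-out response.
loo_resid <- sapply(seq_len(nrow(mtcars)), function(i) {
  fit_i <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
  mtcars$mpg[i] - predict(fit_i, newdata = mtcars[i, , drop = FALSE])
})
press_loo <- sum(loo_resid^2)
```

The shortcut needs only one fit, which is why having the residuals and leverages for each candidate subset would be enough.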
On Fri, 11 May 2007, Andrew Smith wrote:

> I thought it would be simplest to build on already existing functions like
> regsubsets in package leaps. It's easy enough to calculate the PRESS
> criterion for a fitted lm object, but I'm having trouble deciphering the
> structure of the regsubsets objects that leaps works with. Is there any way
> to calculate press from a regsubsets? Or, to put it another way, can I get
> the residual vector and the diagonal entries of the hat matrix from a
> regsubsets object? In fact, if the hat matrix is never calculated
> explicitly, the columns of Q from the QR factorization would suffice.
>

Not only is the hat matrix never calculated explicitly, the Q matrix isn't calculated either. The code forms R and Q^T Y directly (the same code is used in the biglm package to provide bounded-space linear regression).

-thomas
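To illustrate why Q is not strictly needed, here is a rough sketch of getting leverages and PRESS from R and Q^T Y plus the rows of the design matrix. It does not touch regsubsets internals; base R's qr() stands in for whatever produces R and Q^T Y, and the mtcars model is an assumption for the example. Note that, unlike a bounded-space fit, it still needs one pass over the rows of X. The leverage is h_i = x_i^T (R^T R)^{-1} x_i, which is the squared length of the solution z_i of R^T z_i = x_i.

```r
# Illustrative design matrix and response (assumptions for the example).
X <- model.matrix(mpg ~ wt + hp, data = mtcars)
y <- mtcars$mpg
p <- ncol(X)

qrx <- qr(X)
R   <- qr.R(qrx)                    # p x p upper triangular factor
qty <- qr.qty(qrx, y)[seq_len(p)]   # first p entries of Q^T y; Q is never formed

# R corresponds to the (possibly pivoted) column order X[, qrx$pivot].
Xp <- X[, qrx$pivot, drop = FALSE]

# Coefficients from R and Q^T y alone: solve R beta = Q^T y.
beta <- backsolve(R, qty)

# Leverages without Q: solve R^T Z = X^T column by column, then h_i = ||z_i||^2.
Z <- forwardsolve(t(R), t(Xp))
h <- colSums(Z^2)

# Residuals and PRESS.
e <- as.vector(y - Xp %*% beta)
press_from_R <- sum((e / (1 - h))^2)
```

Each row needs only a triangular solve of size p, so the cost per observation does not grow with n.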
The main reason to construct the Q matrix explicitly is the pedagogical value of seeing it. As Thomas points out, if you want to actually use Q in a calculation, there will almost always be a much more efficient way of reaching the real goal of the calculation. For help with that, you can use the qr.Q and qr.R functions to see what Q and R look like. This will help you confirm in tiny cases that you have done the more efficient calculation correctly.

Rich
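A tiny check of the kind described above might look like this. The 4 x 3 matrix is made up for the example; the point is only to look at Q and R and confirm that the row sums of squares of Q match the diagonal of the hat matrix computed the slow way.

```r
# Small made-up design matrix: intercept plus two columns.
X_small <- cbind(1, c(1, 2, 3, 4), c(2, 1, 4, 3))

qr_small <- qr(X_small)
Q_small  <- qr.Q(qr_small)   # 4 x 3, orthonormal columns
R_small  <- qr.R(qr_small)   # 3 x 3, upper triangular

# Q %*% R reproduces X (in pivoted column order, if any pivoting occurred).
X_rebuilt <- Q_small %*% R_small

# Leverages two ways: from explicit Q, and from the textbook hat matrix.
h_from_Q   <- rowSums(Q_small^2)
h_from_hat <- diag(X_small %*% solve(crossprod(X_small)) %*% t(X_small))
```

Once the two agree on a case this small, the explicit Q can be dropped in favour of the cheaper calculation.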