Matthieu Stigler
2013-Jan-23 12:52 UTC
[Rd] na.omit option in prcomp: formula interface only
Dear r-devel list, dear Ben,

I came across a post by Ben Bolker from Feb 2012 (see below) on handling NA values in prcomp(). As I faced the same issue and found Ben's suggestions interesting, I was wondering whether it led to any further discussion that I might have missed.

I understand that handling NA values is far from trivial, but would it be possible to add a warning to the documentation, and/or to issue one whenever na.action is used with prcomp() on a data frame (suggesting the formula interface instead)?

Thanks!

Matthieu Stigler

This is a wishlist/request for discussion about the behaviour of the na.action option in prcomp, specifically the fact that it only applies to the formula interface.

I had a question from a friend (who is smart and careful and generally R's TFM, although like all of us he misses things sometimes) asking why the na.action= argument didn't seem to be doing anything in prcomp (i.e. one gets an "Error in svd(x, nu=0): infinite or missing values in 'x'"). Some poking later, I realized that na.action only applies to the formula interface (so I told him to try prcomp(~.,data=x,...) instead).

Sufficiently careful reading of the help page, with hindsight, revealed that na.action only appears in the arguments for the formula method, not the default method (on the other hand, 'scale.' only appears in the default method, but it *does* work with prcomp.formula as well, because prcomp.formula passes ... through to prcomp.default ...)

Would it be reasonable to (at least) add a sentence to the documentation saying that na.action applies only to the formula interface, or (possibly) to add some NA-processing machinery to prcomp.default to allow it to handle na.action as well? (I can appreciate from looking at stats:::prcomp.formula that the NA-processing is not completely trivial ...)

Ben Bolker
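
For readers who want to reproduce the behaviour described above, here is a minimal sketch. The data frame x and its columns are made up for illustration and do not come from the original posts. It also assumes a version of R in which prcomp.default does not process na.action, as described in the message; depending on the version, the default method may additionally warn about the unused argument.

```r
# Made-up data with a single missing value (illustration only)
set.seed(1)
x <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
x$a[3] <- NA

# Default method: na.action is not one of its arguments, so the incomplete
# row is kept and the call fails. try() captures the error so the script
# can continue; suppressWarnings() hides any "unused argument" style warning.
res_default <- suppressWarnings(
  try(prcomp(x, na.action = na.omit), silent = TRUE)
)
inherits(res_default, "try-error")

# Formula method: na.action is honoured, and scale. is passed on through '...'
res_formula <- prcomp(~ ., data = x, na.action = na.omit, scale. = TRUE)
nrow(res_formula$x)  # number of rows actually used in the analysis

# Workaround for the default method: drop incomplete rows before the call
res_manual <- prcomp(na.omit(x), scale. = TRUE)

# Compare the component standard deviations from the two routes
isTRUE(all.equal(res_formula$sdev, res_manual$sdev))
```

Calling na.omit(x) by hand is the simplest workaround when the formula interface is inconvenient. Unlike na.action = na.exclude with the formula method, it does not pad the scores with NA rows for the dropped observations.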