Berwin A Turlach
2006-Mar-25 12:12 UTC
[Rd] Suggest patch for princomp.formula and prcomp.formula
Dear all,

perhaps I am using princomp.formula and prcomp.formula in a way that is not documented to work, but then the documentation just says:

    formula: a formula with no response variable.

Thus, to avoid a lot of typing, it would be nice if one could use '.' and '-' in the formula, e.g.

> library(DAAG)
> res <- prcomp(~ . - case - site - Pop - sex, possum)
Error in prcomp.formula(~. - case - site - Pop - sex, possum) :
        PCA applies only to numerical variables
> res <- princomp(~ . - case - site - Pop - sex, possum)
Error in princomp.formula(~. - case - site - Pop - sex, possum) :
        PCA applies only to numerical variables

Unfortunately, as the examples above show, this is currently not possible, since both functions test whether any term mentioned in the formula is non numeric or a factor, instead of just testing those that enter the analysis.

The attached patch should allow the use of '.' and '-', while still producing an error when a factor or a non-numeric variable is specified to enter the analysis:

> library(DAAG)
> res <- prcomp(~ . - case - site - Pop - sex, possum)
> res <- princomp(~ . - case - site - Pop - sex, possum)
> res <- prcomp(~ . - case - site - Pop, possum)
Error in prcomp.formula(~. - case - site - Pop, possum) :
        PCA applies only to numerical variables
> res <- princomp(~ . - case - site - Pop, possum)
Error in princomp.formula(~. - case - site - Pop, possum) :
        PCA applies only to numerical variables

On my machine, `make check FORCE=FORCE' succeeds with this patch and, as far as I can tell, no modification of the help pages would be necessary.

Cheers,

        Berwin

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: R-patch
Url: https://stat.ethz.ch/pipermail/r-devel/attachments/20060325/5a5d4a95/attachment.pl
Prof Brian Ripley
2006-Mar-26 20:29 UTC
[Rd] Suggest patch for princomp.formula and prcomp.formula
I would argue this is a bug in model.frame(), but it seems that it is S-compatible. That is, variables excluded by - appear in the model frame even though they do not really appear in the simplified formula. (I suppose the rationale is that - need not always be interpreted as deletion.)

The proposed fix regrettably will not work, since one can do things like

library(MASS)
prcomp(~ dist + dist:climb, hills)

I'll see if I can dream up a general solution. (One way forward is to simplify the formula and call terms again, but simplification is rather clumsy code.)

On Sat, 25 Mar 2006, Berwin A Turlach wrote:

> Dear all,
>
> perhaps I am using princomp.formula and prcomp.formula in a way that
> is not documented to work, but then the documentation just says:
>
>     formula: a formula with no response variable.
>
> Thus, to avoid a lot of typing, it would be nice if one could use '.'
> and '-' in the formula, e.g.
>
>> library(DAAG)
>> res <- prcomp(~ . - case - site - Pop - sex, possum)
> Error in prcomp.formula(~. - case - site - Pop - sex, possum) :
>         PCA applies only to numerical variables
>> res <- princomp(~ . - case - site - Pop - sex, possum)
> Error in princomp.formula(~. - case - site - Pop - sex, possum) :
>         PCA applies only to numerical variables
>
> Unfortunately, as the examples above show, this is currently not
> possible, since both functions test whether any term mentioned in the
> formula is non numeric or a factor, instead of just testing those that
> enter the analysis.
>
> The attached patch should allow the use of '.' and '-', while still
> producing an error when a factor or a non-numeric variable is
> specified to enter the analysis:
>
>> library(DAAG)
>> res <- prcomp(~ . - case - site - Pop - sex, possum)
>> res <- princomp(~ . - case - site - Pop - sex, possum)
>> res <- prcomp(~ . - case - site - Pop, possum)
> Error in prcomp.formula(~. - case - site - Pop, possum) :
>         PCA applies only to numerical variables
>> res <- princomp(~ . - case - site - Pop, possum)
> Error in princomp.formula(~. - case - site - Pop, possum) :
>         PCA applies only to numerical variables
>
> On my machine, `make check FORCE=FORCE' succeeds with this patch and,
> as far as I can tell, no modification of the help pages would be
> necessary.
>
> Cheers,
>
>         Berwin
>

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
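
[Editorial illustration, not part of the original thread.] The two points raised above (only the variables that enter the analysis should be tested, and a term such as dist:climb is not itself a column of the model frame) can be illustrated with a small sketch. It is not the attached patch and not necessarily the fix that was eventually adopted. The helper name all_used_numeric is hypothetical, and the built-in iris and mtcars data sets stand in for DAAG's possum and MASS's hills so that no extra packages are needed. The idea, assumed here, is to read the "factors" attribute of the terms object, whose rows are variables and whose columns are terms, and to test only the rows that contribute to at least one term.

```r
# Hypothetical helper (not part of R): TRUE if every variable that actually
# enters some term of a one-sided formula is numeric.
all_used_numeric <- function(formula, data) {
  mf <- model.frame(formula, data)          # expands '.' against 'data'
  f  <- attr(attr(mf, "terms"), "factors")  # variables (rows) x terms (columns)
  if (length(f) == 0L) return(TRUE)         # no terms at all, e.g. ~ 1
  # A variable is "used" if its row has a non-zero entry in some term column;
  # variables removed with '-' have no such entry.
  used <- rownames(f)[rowSums(f > 0) > 0]
  all(vapply(used, function(v) is.numeric(mf[[v]]), logical(1)))
}

# Species is a factor but is removed with '-', so only numeric columns are used.
all_used_numeric(~ . - Species, iris)

# Here Species enters the analysis, so the check fails.
all_used_numeric(~ ., iris)

# An interaction term is looked up by its constituent variables (mpg, wt),
# not by the label "mpg:wt".
all_used_numeric(~ mpg + mpg:wt, mtcars)
```

Working from the variables-by-terms matrix avoids having to simplify the formula and call terms() a second time; the trade-off is that the check relies on the structure of the "factors" attribute rather than on the term labels.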