1. Probably not, depending on what you expect to gain from this. R's
numerical procedures can almost certainly handle the correlations.
2. Search on "R package for principal components regression" instead
of rolling your own. There are several (e.g. "chemometrics", "pls",
etc.).
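
If you do want to see the do-it-yourself route, here is a minimal sketch
in base R. It is an illustration under stated assumptions, not a
recommendation over the packages above: the built-in mtcars data stands in
for your data, mpg plays the role of y (your post does not define y), and
the choice of two components is arbitrary. The rotated coordinates are the
scores in the x component of the prcomp() result, and those columns are
what goes on the right-hand side of lm().

```r
# Assumption: mtcars stands in for the poster's data, and mpg is a
# placeholder response (the original post never defines y).
X <- mtcars[, c("disp", "hp", "wt", "drat")]
y <- mtcars$mpg

# Rotate the predictors. scale. = TRUE is a choice made here because
# these columns are on very different scales.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# pca$x holds the scores: the observations in the rotated coordinates,
# with columns named PC1, PC2, ...
scores <- as.data.frame(pca$x)
dat <- cbind(y = y, scores)

# Linear model on the first two rotated axes (two is an arbitrary choice)
fit_pc2 <- lm(y ~ PC1 + PC2, data = dat)

# For comparison: regressing on ALL components spans the same column
# space as the original predictors, so the fitted values match lm()
# on the raw data.
fit_all <- lm(y ~ ., data = dat)
fit_raw <- lm(mpg ~ disp + hp + wt + drat, data = mtcars)
same_fit <- all.equal(unname(fitted(fit_all)), unname(fitted(fit_raw)))
```

The last comparison is the reason behind point 1: rotating and keeping
every component changes nothing about the fit, so any gain has to come
from dropping components, which is what the packages in point 2 help you
do with proper validation.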
-- Bert
On Fri, Nov 22, 2013 at 8:47 AM, Chris Wilkinson <kinsham at verizon.net> wrote:
> My data has correlations between predictors so I think it would be
> advantageous to rotate the axes with prcomp().
>
>> census <- read.table(paste("http://www.stat.wisc.edu/~rich/JWMULT02dat","T8-5.DAT",sep="/"),header=FALSE)
>> census
>       V1   V2    V3   V4   V5
> 1  5.935 14.2 2.265 2.27 2.91
> 2  1.523 13.1 0.597 0.75 2.62
> 3  2.599 12.7 1.237 1.11 1.72
> 4  4.009 15.2 1.649 0.81 3.02
> 5  4.687 14.7 2.312 2.50 2.22
> 6  8.044 15.6 3.641 4.51 2.36
> 7  2.766 13.3 1.244 1.03 1.97
> 8  6.538 17.0 2.618 2.39 1.85
> 9  6.451 12.9 3.147 5.52 2.01
> 10 3.314 12.2 1.606 2.18 1.82
> 11 3.777 13.0 2.119 2.83 1.80
> 12 1.530 13.8 0.798 0.84 4.25
> 13 2.768 13.6 1.336 1.75 2.64
> 14 6.585 14.9 2.763 1.91 3.17
>
>> pca1 <- prcomp(census)
>> summary(pca1)
> Importance of components:
>                           PC1    PC2     PC3     PC4     PC5
> Standard deviation     2.6327 1.3361 0.62422 0.47909 0.11897
> Proportion of Variance 0.7413 0.1909 0.04168 0.02455 0.00151
> Cumulative Proportion  0.7413 0.9323 0.97394 0.99849 1.00000
>
>> pca1$rotation # eigenvectors
>            PC1         PC2          PC3         PC4          PC5
> V1 -0.78120807  0.07087183 -0.003656607  0.54171007  0.302039670
> V2 -0.30564856  0.76387277  0.161817438 -0.54479937  0.009279632
> V3 -0.33444840 -0.08290788 -0.014841008  0.05101636 -0.937255367
> V4 -0.42600795 -0.57945799 -0.220453468 -0.63601254  0.172145212
> V5  0.05435431  0.26235528 -0.961759720  0.05127599 -0.024583093
>
> I'd like to create a linear model based on the rotated axes.
>
>> linmod <- lm(y~a+b+....)
>
> Could someone be kind enough to suggest how to code a, b...?
>
> Chris
--
Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374