thr3ads.net - R help - [R] question about Principal Component Analysis in R? [Feb 2006]

Michael

2006-Feb-27 09:00 UTC

[R] question about Principal Component Analysis in R?

Hi all,

Suppose I run a principal component analysis on a training data set in R
and obtain the rotation matrix via:
> pca=prcomp(training_data, center=TRUE, scale=FALSE, retx=TRUE);
Then I want to rotate the test data set using that rotation matrix, which I tried in two ways:
> d1=scale(test_data, center=TRUE, scale=FALSE) %*% pca$rotation;
> d2=predict(pca, test_data, center=TRUE, scale=FALSE);
These two results are different:
> min(d2-d1)
[1] -1.976152
> max(d2-d1)
[1] 1.535222

However, if I do these on the training data:
> d1=scale(training_data, center=TRUE, scale=FALSE) %*% pca$rotation;
> d2=predict(pca, training_data, center=TRUE, scale=FALSE);
> d3=pca$x;
Then d1, d2, and d3 are all the same.

------------------------------------

So now I am confused: why do the two approaches give different rotated
matrices for the test data?

Thanks a lot!


Bjørn-Helge Mevik

2006-Feb-28 08:52 UTC

Michael wrote:
>> pca=prcomp(training_data, center=TRUE, scale=FALSE, retx=TRUE);
>
> Then I want to rotate the test data set using the
>
>> d1=scale(test_data, center=TRUE, scale=FALSE) %*% pca$rotation;
>> d2=predict(pca, test_data, center=TRUE, scale=FALSE);
>
> these two values are different
>
>> min(d2-d1)
> [1] -1.976152
>> max(d2-d1)
> [1] 1.535222
This is because you have subtracted a different mean vector: scale(test_data,
center=TRUE, ...) centers with the column means of the test data. You
should use the column means of the training data instead (as predict does;
see the last line of stats:::predict.prcomp):

d1=scale(test_data, center=pca$center, scale=FALSE) %*% pca$rotation;
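
The fix above can be checked with a small self-contained sketch. The simulated matrices, their sizes, and the names d1_own and d1_fix are assumptions made for illustration only; they are not the original poster's data. The sketch also relies on predict for prcomp objects taking its centering from the fitted object, so the center= and scale= arguments passed to predict in the question should have no effect.

```r
# Simulated stand-ins for the poster's data (sizes and values are assumptions).
set.seed(1)
training_data <- matrix(rnorm(200), nrow = 50, ncol = 4)
test_data     <- matrix(rnorm(80, mean = 1), nrow = 20, ncol = 4)

pca <- prcomp(training_data, center = TRUE, scale. = FALSE, retx = TRUE)

# Original attempt: the test data is centered with its OWN column means.
d1_own <- scale(test_data, center = TRUE, scale = FALSE) %*% pca$rotation

# Suggested fix: center with the training means stored in pca$center.
d1_fix <- scale(test_data, center = pca$center, scale = FALSE) %*% pca$rotation

# predict() applies the centering stored in the fitted object.
d2 <- predict(pca, test_data)

# Does the fixed version agree with predict()?
print(isTRUE(all.equal(d1_fix, d2, check.attributes = FALSE)))

# How far off is the original attempt?
print(max(abs(d1_own - d2)))

# Every row is off by the same vector: the difference between the test
# and training means, rotated into the principal component basis.
offset <- (colMeans(test_data) - pca$center) %*% pca$rotation
print(offset)
```

On the training data the two versions coincide because colMeans(training_data) is exactly what prcomp stored in pca$center, which is why d1, d2 and d3 agreed there.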


-- 
Bjørn-Helge Mevik
