thr3ads.net - R help - [R] Confused by SVD and Eigenvector Decomposition in PCA [Feb 2003]

Feng Zhang

2003-Feb-06 19:04 UTC

[R] Confused by SVD and Eigenvector Decomposition in PCA

Hey, All

In principal component analysis (PCA), we want to know what percentage of
the total variance in the data the first principal component explains.

Assume the data matrix X is zero-meaned. I used the following procedure:
C = cov(X);                 % calculate the covariance matrix
[EVector,EValues] = eig(C);
L = diag(EValues);          % L is a column vector with the eigenvalues as its elements
percent = max(L)/sum(L);    % eig does not guarantee descending order, so take the largest


Others argue for using the singular value decomposition (SVD) to
calculate the same quantity, as:
[U,S,V] = svd(X);
L = diag(S);                % singular values, returned in descending order
L = L.^2;
percent = L(1)/sum(L);


So which is the correct method to calculate the percentage of variance
explained by the first principal component?

Thanks for your advice on this.

Fred

Liaw, Andy

2003-Feb-06 20:14 UTC

If I'm not mistaken, the eigenvalues of a positive semi-definite
cross-product matrix such as X'X are equal to the squared singular values
of X, so you should get the same answer either way.
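
The claim above can be written out in a few lines. This is only a sketch, and it assumes that X (n rows, p columns) has already been column-centred and that the covariance matrix uses the usual n - 1 divisor; the symbols U, \Sigma, V, \lambda_i and \sigma_i are notation introduced here, not taken from the original post.

```latex
% SVD of the centred data matrix
X = U \Sigma V^{\top}

% Covariance matrix expressed through the SVD
C = \frac{1}{n-1} X^{\top} X = V \, \frac{\Sigma^{2}}{n-1} \, V^{\top}

% Hence the eigenvalues of C are rescaled squared singular values of X
\lambda_i = \frac{\sigma_i^{2}}{n-1}

% The common factor cancels in the proportion of variance
\frac{\lambda_1}{\sum_{j} \lambda_j} = \frac{\sigma_1^{2}}{\sum_{j} \sigma_j^{2}}
```

The cancellation in the last line only holds when the SVD and the covariance matrix are computed from the same centred X.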

The code you've shown is definitely not R (looks like Matlab), so why are you
posting to R-help?

Andy

------------------------------------------------------------------------------

antonio rodriguez

2003-Feb-07 09:32 UTC

Hi Feng,

AFAIK, SVD analysis provides a one-step method for computing all the
components of the eigenvalue problem, without the need to compute and
store big covariance matrices. The resulting decomposition is also
computationally more stable and robust.
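
A minimal R sketch of the point above, under stated assumptions: the data matrix X, the seed and the variable names are made up for illustration. It compares the eigenvalues of the covariance matrix with the squared singular values of the centred data divided by n - 1, and with the variances reported by prcomp(), which according to its help page also works from an SVD of the centred data rather than from the covariance matrix.

```r
set.seed(42)                         # arbitrary seed, only for reproducibility
n <- 200
X <- matrix(rnorm(n * 3), n, 3)      # hypothetical 200 x 3 data matrix

# Route 1: form the covariance matrix, then take its eigenvalues
ev_cov <- eigen(cov(X), symmetric = TRUE)$values

# Route 2: centre the columns and take the SVD; no covariance matrix is formed
Xc <- scale(X, center = TRUE, scale = FALSE)
ev_svd <- svd(Xc)$d^2 / (n - 1)

# Route 3: prcomp() centres by default and reports standard deviations
ev_prcomp <- prcomp(X)$sdev^2

# All three should agree up to floating-point error
print(all.equal(ev_cov, ev_svd))
print(all.equal(ev_cov, ev_prcomp))

# Proportion of variance explained by the first component
print(ev_svd[1] / sum(ev_svd))
```

For a 200 x 3 matrix the saving is irrelevant; the SVD route matters more when the number of columns is large and the p x p covariance matrix is expensive to form and store.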

Cheers,

Antonio Rodriguez



---

Feng Zhang

2003-Feb-07 18:08 UTC

Thanks for those replies.

But I tested several cases, and found that the two
percentages from SVD and EVD are not
the same.
How can the difference be explained, and which
one is the right one to use
in PCA?



Feng Zhang

2003-Feb-08 04:17 UTC

I used Matlab to do this case study.

>> X = randn(200,3);      % generate a 200x3 Gaussian matrix
>> [a,b,c] = svd(X);      % SVD of the raw (uncentered) X
>> S = diag(b)
   S = [15.6765   14.8674   13.4016]'
>> S(1)^2/sum(S.^2)
   0.3802
>> ZeroedX = X - repmat(mean(X),200,1);   % ZeroedX is now zero-centered data
>> C = cov(ZeroedX);      % covariance matrix of ZeroedX
>> [U,L] = eig(C);        % eigendecomposition of C
>> SE = diag(L)
   SE = [0.8918    1.1098    1.2337]'
>> max(SE)/sum(SE)        % eigenvalues are in ascending order here; 0.3813 is the largest one's share
   0.3813

This is the case that I was confused by.

Fred
----- Original Message -----
From: "Liaw, Andy" <andy_liaw at merck.com>
To: "'Feng Zhang'" <f0z6305 at labs.tamu.edu>
Sent: Friday, February 07, 2003 6:25 PM
Subject: RE: [R] Confused by SVD and Eigenvector Decomposition in PCA

> I've already shown you one example.  If that's not enough, here's another
> one:
>
> > set.seed(1)
> > x <- matrix(runif(1e3), 50, 20)
> > La.eigen(crossprod(x))$value
>  [1] 258.5242317   9.3638224   8.7213839   7.7425270   6.5057190   6.2719056
>  [7]   5.6582657   4.5002047   4.2289555   3.9098726   3.7172642   3.2826449
> [13]   2.8758329   2.6907474   2.3300505   1.9700120   1.3191512   1.0228788
> [19]   0.8883083   0.5883287
> > La.svd(x)$d^2
>  [1] 258.5242317   9.3638224   8.7213839   7.7425270   6.5057190   6.2719056
>  [7]   5.6582657   4.5002047   4.2289555   3.9098726   3.7172642   3.2826449
> [13]   2.8758329   2.6907474   2.3300505   1.9700120   1.3191512   1.0228788
> [19]   0.8883083   0.5883287
>
> Where's your example of this not working?
>
> Andy
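
For anyone re-running the quoted example today, here is a hedged equivalent using eigen() and svd(). As far as I know, La.eigen() is no longer available in current R releases, so this is a substitute rather than the code as posted. The numbers printed may also differ from the 2003 output, because R's default random number generator may not be the same.

```r
set.seed(1)
x <- matrix(runif(1e3), 50, 20)      # same construction as in the quoted example

# Eigenvalues of the cross-product matrix X'X (not a covariance matrix)
ev <- eigen(crossprod(x), symmetric = TRUE)$values

# Squared singular values of the same, uncentred x
sv2 <- svd(x)$d^2

# Both are returned in decreasing order and should agree up to rounding
print(all.equal(ev, sv2))
```

Note that both sides here use the same uncentred x: crossprod(x) is X'X, so the comparison is like for like. That is the difference from comparing svd(X) on raw data with eig(cov(X)), where the covariance matrix centres the data implicitly.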
>
>

Stephane Dray

2003-Feb-08 10:28 UTC

You must also apply the SVD to your centred table (i.e. ZeroedX), not to the raw X.
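
A small R sketch of why centring matters, assuming a made-up data matrix whose columns have a mean far from zero so that the effect is easy to see. The helper frac_first() and all variable names are hypothetical, introduced only for this illustration.

```r
set.seed(123)                                # arbitrary seed
n <- 200
X <- matrix(rnorm(n * 3, mean = 5), n, 3)    # columns with a clearly nonzero mean

# Share of the total taken by the first (largest) value
frac_first <- function(v) v[1] / sum(v)

# Eigenvalue route: cov() centres the data implicitly
frac_eig <- frac_first(eigen(cov(X), symmetric = TRUE)$values)

# SVD on the raw data: the column means leak into the first singular value
frac_raw <- frac_first(svd(X)$d^2)

# SVD on the centred data
Xc <- sweep(X, 2, colMeans(X))               # subtract each column's mean
frac_centred <- frac_first(svd(Xc)$d^2)

print(c(eig = frac_eig, svd_raw = frac_raw, svd_centred = frac_centred))
print(isTRUE(all.equal(frac_eig, frac_centred)))   # should be TRUE
print(isTRUE(all.equal(frac_eig, frac_raw)))       # should be FALSE for this data
```

With randn data, as in the Matlab example, the sample means are close to zero but not exactly zero, which would explain why the two reported percentages (0.3802 and 0.3813) are close without being equal.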

Liaw, Andy

2003-Feb-08 19:47 UTC

There *is* a Matlab newsgroup for you to ask Matlab questions.  From the
latest Matlab digest:

MATLAB Usenet Group Celebrates Its 10th Anniversary
The MATLAB Usenet group, comp.soft-sys.matlab (CSSM), celebrated its 10th
anniversary this month. CSSM is a collaboration space where thousands of
MATLAB users discuss MATLAB-related topics or post questions to the
community. In 2002, CSSM featured more than 33,800 posts. Use our online
newsreader to communicate with the MATLAB community at:
www.mathworks.com/matlabcentral

Andy
