thr3ads.net - R help - [R] Data reconstruction following PCA using Eigen function [May 2010]

If this information is useful, please help other people find it:
Share via:

Julia El-Sayed Moustafa

2010-May-21 18:25 UTC

[R] Data reconstruction following PCA using Eigen function

Hi all,

As a molecular biologist by training, I'm fairly new to R (and statistics!),
and was hoping for some advice. First of all, I'd like to apologise if my
question is more methodological rather than relating to a specific R
function. I've done my best to search both in the forum and elsewhere but
can't seem to find an answer which works in practice. 

I am carrying out principal component analysis in R using the Eigen
function, and have applied this to my dataset and can identify patterns in
my data which result from batch effects. What I'd ultimately like to do is
use PCA as a noise filter to remove the batch effects from my data, and the
way I would like to try and do this is by reconstructing the data, minus the
components (or that part of them) which correlate with batch. 

As I have an extremely large dataset, as a first step I am attempting simply
to apply the Eigen function to a dummy dataset consisting of only 10
individuals and ten variables derived from my originial data, then
reconstruct the original dummy data matrix, just to see if I can get the
basics of the method to work first. Here is the dummy matrix (adjusted by
variable mean-subtraction and division by the variable standard deviation
applied to each data point) I am using:

adjusted_dummyset
               [,1]        [,2]        [,3]        [,4]       [,5]      
[,6]        [,7]       [,8]       [,9]      [,10]
Sample1   0.7329881  0.76912670  2.45906143 -0.06411602  1.2427801 
0.3785717  2.34508664  1.1043552 -0.1883830  0.6503095
Sample2  -2.0446131  1.72783245 -0.40965941 -0.21713655 -0.2386781
-0.1944390 -0.25181541  0.8882743  0.8404783  0.2531209
Sample3   1.7575174  0.03851425 -0.87537424  1.82811160  0.8342636
-0.8155942 -0.90893068 -0.5529098 -1.6586626  0.1761717
Sample4   0.2731749  0.24045167 -0.39913821 -0.48525909  0.1448994
-2.0173360 -0.01073639 -0.3219478 -1.1536431 -0.2521545
Sample5  -0.2014241 -1.04646151  0.28101160  0.74348390  0.1738312
-0.8431262 -0.08842512  1.2909658  1.2013136  0.6706926
Sample6   0.1743534 -1.70657357 -0.09170187 -0.55605031 -0.2940946 
1.4525891 -0.39068509 -0.3373913 -0.0533732  0.9658389
Sample7  -0.8533191 -0.34438091 -1.23890437 -0.77360636  0.5926479 
0.7742632 -1.12515017 -0.5720099  0.2243808  0.5420693
Sample8   0.4176988 -0.35906123 -0.07190644  0.90045123 -1.0621902 
0.2693762 -0.38033715  0.6267548  0.4767652  0.3012347
Sample9   0.1088066 -0.32197951  0.46665158 -1.72560781  0.7375796 
0.2794331  1.00171777 -0.1087306  1.2519195 -0.8848459
Sample10 -0.3651829  1.00253167 -0.12004007  0.34972942 -2.1310389 
0.7162622 -0.19072440 -2.0173607 -0.9407956 -2.4224372
 
I first multiply the matrix by the transposed matrix:
> xxt=adjusted_dummyset %*% t(adjusted_dummyset)
> xxt           Sample1    Sample2    Sample3    Sample4    Sample5    Sample6   
Sample7    Sample8    Sample9   Sample10
Sample1  16.045163 -1.1367278 -2.5390033 -1.4762231  1.0158736 -1.8408484
-5.8176751 -1.5163240  3.5304834 -6.2647180
Sample2  -1.136728  9.0985066 -5.2175085 -0.8332460  0.7980575 -3.3607995 
1.6342076 -0.3098849 -0.3462389 -0.3263661
Sample3  -2.539003 -5.2175085 12.4738749  3.7747189 -0.9562913 -1.3252885
-0.9175020  0.5849437 -6.0796087  0.2016647
Sample4  -1.476223 -0.8332460  3.7747189  6.1161097 -1.0232149 -3.0984138
-1.1213965 -1.9014901 -1.0503243  0.6134802
Sample5   1.015874  0.7980575 -0.9562913 -1.0232149  6.0758629  0.2183710
-0.9466710  2.1466445 -0.2626433 -7.0659890
Sample6  -1.840848 -3.3607995 -1.3252885 -3.0984138  0.2183710  6.6590617 
3.0772441  1.0977948  0.3980588 -1.8251802
Sample7  -5.817675  1.6342076 -0.9175020 -1.1213965 -0.9466710  3.0772441 
5.8681617 -0.9215294  0.1646922 -1.0195316
Sample8  -1.516324 -0.3098849  0.5849437 -1.9014901  2.1466445  1.0977948
-0.9215294  3.1757171 -2.2533118 -0.1025599
Sample9   3.530483 -0.3462389 -6.0796087 -1.0503243 -0.2626433  0.3980588 
0.1646922 -2.2533118  7.2986178 -1.3997251
Sample10 -6.264718 -0.3263661  0.2016647  0.6134802 -7.0659890 -1.8251802
-1.0195316 -0.1025599 -1.3997251 17.1889249

Then I apply the Eigen function and save the eigenvectors and eigenvalues
eigen(xxt,TRUE)
$values
 [1]  2.693535e+01  1.898845e+01  1.668814e+01  1.184849e+01  8.136377e+00 
4.893302e+00  2.007293e+00  4.994285e-01  3.173378e-03 -1.652399e-15

$vectors
             [,1]        [,2]        [,3]        [,4]        [,5]       
[,6]         [,7]        [,8]        [,9]      [,10]
 [1,]  0.60510924  0.13494101 -0.54092789 -0.11000076  0.11721749 
0.37004560 -0.106993925 -0.10503855  0.19436075 -0.3162278
 [2,]  0.05581480 -0.39453663  0.09160465  0.67089969  0.08088382 
0.32755098 -0.009124933  0.11721316 -0.39379401 -0.3162278
 [3,] -0.28666669  0.71121151 -0.01000162  0.03014755 -0.06261445 
0.22739798  0.422156670  0.02330019 -0.27677054 -0.3162278
 [4,] -0.13469379  0.23709870 -0.15611654  0.27760975 -0.50201617
-0.31821931 -0.584748137  0.12405010  0.11661762 -0.3162278
 [5,]  0.26143053  0.12386029  0.29769330  0.19250271  0.27617639
-0.49754797  0.084826842 -0.59908097  0.02670441 -0.3162278
 [6,]  0.01918933 -0.04796651  0.35560411 -0.57607725  0.07402744 
0.14338749 -0.455866694 -0.01271614 -0.45276432 -0.3162278
 [7,] -0.11887886 -0.18220751  0.42748177 -0.07360955 -0.34327294 
0.38589551  0.142875991 -0.16600492  0.59142740 -0.3162278
 [8,] -0.02725610  0.06425762  0.15301980 -0.01332073  0.53079711
-0.18890512  0.012099700  0.65880817  0.34630946 -0.3162278
 [9,]  0.24706215 -0.31886828 -0.13294196 -0.25401667 -0.42308063
-0.38135353  0.482282448  0.24225089 -0.19843310 -0.3162278
[10,] -0.62111060 -0.32779019 -0.48541562 -0.14413474  0.25188193
-0.06825162  0.012492037 -0.28278192  0.04634233 -0.3162278

I also calculate:

xtx=t(adjusted_dummyset) %*% adjusted_dummyset > xtx            [,1]       [,2]       [,3]       [,4]       [,5]        [,6]      
[,7]       [,8]       [,9]       [,10]
 [1,]  9.0000000 -3.1796310  2.0417037  3.9514142  2.7275520 -1.66571377 
1.5629736 -0.9104748 -4.8506341  0.68480373
 [2,] -3.1796310  9.0000000  1.0981249  0.5494749 -1.2662064 -1.89318349 
2.1005566 -0.5052055 -1.7947356 -3.90496189
 [3,]  2.0417037  1.0981249  9.0000000 -1.1689542  2.3836861  1.22540724 
8.5924385  4.2130304  1.8322104  0.72643120
 [4,]  3.9514142  0.5494749 -1.1689542  9.0000000 -1.7132579 -2.51679155
-2.8679239  0.5181511 -3.9536139  0.84095707
 [5,]  2.7275520 -1.2662064  2.3836861 -1.7132579  9.0000000 -2.17714376 
3.1966738  4.1903147  0.7937017  5.20170437
 [6,] -1.6657138 -1.8931835  1.2254072 -2.5167916 -2.1771438  9.00000000 
0.3764605 -1.9821441  2.3330854 -0.08204933
 [7,]  1.5629736  2.1005566  8.5924385 -2.8679239  3.1966738  0.37646053 
9.0000000  3.5708620  1.7809104 -0.28160124
 [8,] -0.9104748 -0.5052055  4.2130304  0.5181511  4.1903147 -1.98214408 
3.5708620  9.0000000  5.3281685  6.32863320
 [9,] -4.8506341 -1.7947356  1.8322104 -3.9536139  0.7937017  2.33308543 
1.7809104  5.3281685  9.0000000  2.27959505
[10,]  0.6848037 -3.9049619  0.7264312  0.8409571  5.2017044 -0.08204933
-0.2816012  6.3286332  2.2795951  9.00000000> and apply the eigen function to this:
> eigen(xtx,TRUE)$values
 [1] 2.693535e+01 1.898845e+01 1.668814e+01 1.184849e+01 8.136377e+00
4.893302e+00 2.007293e+00 4.994285e-01 3.173378e-03 6.009067e-16

$vectors
             [,1]        [,2]         [,3]        [,4]         [,5]       
[,6]        [,7]        [,8]        [,9]       [,10]
 [1,] -0.01604003  0.56323526 -0.190946669 -0.40105857 -0.002543387 
0.17417910  0.25192284 -0.59870554  0.18090811 -0.01463338
 [2,]  0.08466679 -0.16695878 -0.455330886  0.54953906 -0.002740213
-0.47379253  0.19091718 -0.36818489  0.23997046  0.03943000
 [3,] -0.42016626 -0.01672674 -0.438168969 -0.16986221 -0.259624942 
0.03774746 -0.20059192  0.33694834  0.50122476 -0.35847979
 [4,]  0.17380688  0.46248616 -0.009158497  0.19699882 -0.641525880
-0.08643216  0.32002024  0.38196270 -0.08822512  0.20466364
 [5,] -0.38230998  0.27820625  0.061021059  0.01793892  0.556873562
-0.31199982  0.48447946  0.33729239 -0.05542533 -0.11569124
 [6,] -0.01078626 -0.35623684  0.086516360 -0.57830505 -0.317806785
-0.56648878  0.22620425 -0.09749859 -0.18851088 -0.11373558
 [7,] -0.41357826 -0.06924232 -0.495909637 -0.11281244  0.008420391 
0.07292844 -0.09886884  0.01075021 -0.44335225  0.59469674
 [8,] -0.48705511  0.07929178  0.158922061  0.33799785 -0.279059362 
0.07881225 -0.04227138 -0.30875079 -0.45445853 -0.47881039
 [9,] -0.33349167 -0.40387477  0.287225004  0.07874711 -0.163416914 
0.37364320  0.52438989 -0.09014761  0.33696991  0.27201980
[10,] -0.34649980  0.24943468  0.446371805  0.03960183 -0.072567359
-0.40852699 -0.43011632 -0.14042978  0.30891048  0.38025985


I create a diagonal matrix containing the square  roots of the eigenvalues
along the diagonal:
> diagonalmatrix_eigenvaluesqrts     V1   V2   V3    V4    V5    V6    V7    V8    V9    V10
1  5.61 0.00 0.00 0.000 0.000 0.000 0.000 0.000 0.000 0.0000
2  0.00 1.41 0.00 0.000 0.000 0.000 0.000 0.000 0.000 0.0000
3  0.00 0.00 1.12 0.000 0.000 0.000 0.000 0.000 0.000 0.0000
4  0.00 0.00 0.00 0.941 0.000 0.000 0.000 0.000 0.000 0.0000
5  0.00 0.00 0.00 0.000 0.727 0.000 0.000 0.000 0.000 0.0000
6  0.00 0.00 0.00 0.000 0.000 0.471 0.000 0.000 0.000 0.0000
7  0.00 0.00 0.00 0.000 0.000 0.000 0.296 0.000 0.000 0.0000
8  0.00 0.00 0.00 0.000 0.000 0.000 0.000 0.246 0.000 0.0000
9  0.00 0.00 0.00 0.000 0.000 0.000 0.000 0.000 0.159 0.0000
10 0.00 0.00 0.00 0.000 0.000 0.000 0.000 0.000 0.000 0.0127

I have been operating under the assumption that the original data matrix
could be reconstructed by multiplying the eigenvectors from xxt X a diagonal
matrix with the square roots of the eigenvalues along the diagonal X the
eigenvectors from xtx as follows:

 Reconstructed_dummyset=eigenvectors_xxt %*%
as.matrix(diagonalmatrix_eigenvaluesqrts) %*%
t(eigenvectors_xtx)> Reconstructed_dummyset              [,1]         [,2]        [,3]       [,4]        [,5]       
[,6]        [,7]        [,8]         [,9]      [,10]
 [1,]  0.253193770  0.402535986 -1.14743281  0.5698820 -1.31593686
-0.23278020 -1.10482209 -1.78241789 -1.344859283 -1.4554367
 [2,] -0.593623956  0.320035080 -0.30357878 -0.1156343 -0.25992539
-0.25866729 -0.17538542  0.05041073  0.221261845 -0.2675080
 [3,]  0.620290857 -0.322491313  0.63113668  0.2551444  0.90025477
-0.36701269  0.61029139  0.90661927  0.226875492  0.6933749
 [4,]  0.045394154  0.135401863  0.48897212  0.2772398  0.14530592
-0.12141324  0.33943264  0.53647586  0.002607645  0.4395889
 [5,] -0.007416873  0.213143606 -0.90810973  0.2103079 -0.34091303
-0.06529543 -0.82817426 -0.61633082 -0.543192842 -0.2177753
 [6,]  0.068080740 -0.533029067 -0.14658587 -0.2022178 -0.09846926 
0.29932974 -0.12798021 -0.14611192 -0.016267532  0.1053571
 [7,] -0.113178997 -0.309552826  0.18447270 -0.1195559  0.01775153 
0.15282100  0.02644529  0.41110216  0.623380590  0.3368958
 [8,] -0.076800910 -0.117830736 -0.03085105 -0.1712106  0.39036065
-0.10570034 -0.05526268 -0.08333122 -0.028936234  0.1511139
 [9,] -0.187036903  0.212383109 -0.41884937  0.2704928 -0.69211148 
0.50295414 -0.45885438 -0.75166730 -0.301317053 -0.6536756
[10,] -0.008314191 -0.002179248  1.66522298 -0.9826679  1.25832941 
0.20033204  1.75042615  1.49448058  1.149522805  0.8527934

However as you can see I've clearly got my wires crossed somewhere! 

I'd really appreciate any advice you could offer... Thanks!

Julia
-- 
View this message in context:
http://r.789695.n4.nabble.com/Data-reconstruction-following-PCA-using-Eigen-function-tp2226535p2226535.html
Sent from the R help mailing list archive at Nabble.com.

Thomas Stewart

2010-May-22 18:51 UTC

head link

[R] Data reconstruction following PCA using Eigen function

Looks like you have some numerical precision issues.  Why not use the svd
function directly?  (See below.)

-tgs

x <- read.table(
textConnection(
"Sample1   0.7329881  0.76912670  2.45906143 -0.06411602  1.2427801
0.3785717  2.34508664  1.1043552 -0.1883830  0.6503095
Sample2  -2.0446131  1.72783245 -0.40965941 -0.21713655 -0.2386781
-0.1944390 -0.25181541  0.8882743  0.8404783  0.2531209
Sample3   1.7575174  0.03851425 -0.87537424  1.82811160  0.8342636
-0.8155942 -0.90893068 -0.5529098 -1.6586626  0.1761717
Sample4   0.2731749  0.24045167 -0.39913821 -0.48525909  0.1448994
-2.0173360 -0.01073639 -0.3219478 -1.1536431 -0.2521545
Sample5  -0.2014241 -1.04646151  0.28101160  0.74348390  0.1738312
-0.8431262 -0.08842512  1.2909658  1.2013136  0.6706926
Sample6   0.1743534 -1.70657357 -0.09170187 -0.55605031 -0.2940946 1.4525891
-0.39068509 -0.3373913 -0.0533732  0.9658389
Sample7  -0.8533191 -0.34438091 -1.23890437 -0.77360636  0.5926479 0.7742632
-1.12515017 -0.5720099  0.2243808  0.5420693
Sample8   0.4176988 -0.35906123 -0.07190644  0.90045123 -1.0621902 0.2693762
-0.38033715  0.6267548  0.4767652  0.3012347
Sample9   0.1088066 -0.32197951  0.46665158 -1.72560781  0.7375796 0.2794331
 1.00171777 -0.1087306  1.2519195 -0.8848459
Sample10 -0.3651829  1.00253167 -0.12004007  0.34972942 -2.1310389 0.7162622
-0.19072440 -2.0173607 -0.9407956 -2.4224372"),
 header=F, as.is=TRUE)

X<-as.matrix(x[,-1])

V <- t( eigen( t(X) %*% X , T )$vectors )
U <- eigen( X %*% t( X ) , T )$vectors
D <- diag( sqrt( eigen( X %*% t( X ) , T )$values ) )

X - U %*% D %*% t(V)

svd(X)$v - V
svd(X)$u - U
svd(X)$d - sqrt( eigen( X %*% t( X ) , T )$values )

X-svd(X)$u %*% diag( svd(X)$d ) %*% t( svd(X)$v )






On Fri, May 21, 2010 at 2:25 PM, Julia El-Sayed Moustafa <
jelsayed@imperial.ac.uk> wrote:
>
> Hi all,
>
> As a molecular biologist by training, I'm fairly new to R (and
> statistics!),
> and was hoping for some advice. First of all, I'd like to apologise if
my
> question is more methodological rather than relating to a specific R
> function. I've done my best to search both in the forum and elsewhere
but
> can't seem to find an answer which works in practice.
>
> I am carrying out principal component analysis in R using the Eigen
> function, and have applied this to my dataset and can identify patterns in
> my data which result from batch effects. What I'd ultimately like to do
is
> use PCA as a noise filter to remove the batch effects from my data, and the
> way I would like to try and do this is by reconstructing the data, minus
> the
> components (or that part of them) which correlate with batch.
>
> As I have an extremely large dataset, as a first step I am attempting
> simply
> to apply the Eigen function to a dummy dataset consisting of only 10
> individuals and ten variables derived from my originial data, then
> reconstruct the original dummy data matrix, just to see if I can get the
> basics of the method to work first. Here is the dummy matrix (adjusted by
> variable mean-subtraction and division by the variable standard deviation
> applied to each data point) I am using:
>
> adjusted_dummyset
>               [,1]        [,2]        [,3]        [,4]       [,5]
> [,6]        [,7]       [,8]       [,9]      [,10]
> Sample1   0.7329881  0.76912670  2.45906143 -0.06411602  1.2427801
> 0.3785717  2.34508664  1.1043552 -0.1883830  0.6503095
> Sample2  -2.0446131  1.72783245 -0.40965941 -0.21713655 -0.2386781
> -0.1944390 -0.25181541  0.8882743  0.8404783  0.2531209
> Sample3   1.7575174  0.03851425 -0.87537424  1.82811160  0.8342636
> -0.8155942 -0.90893068 -0.5529098 -1.6586626  0.1761717
> Sample4   0.2731749  0.24045167 -0.39913821 -0.48525909  0.1448994
> -2.0173360 -0.01073639 -0.3219478 -1.1536431 -0.2521545
> Sample5  -0.2014241 -1.04646151  0.28101160  0.74348390  0.1738312
> -0.8431262 -0.08842512  1.2909658  1.2013136  0.6706926
> Sample6   0.1743534 -1.70657357 -0.09170187 -0.55605031 -0.2940946
> 1.4525891 -0.39068509 -0.3373913 -0.0533732  0.9658389
> Sample7  -0.8533191 -0.34438091 -1.23890437 -0.77360636  0.5926479
> 0.7742632 -1.12515017 -0.5720099  0.2243808  0.5420693
> Sample8   0.4176988 -0.35906123 -0.07190644  0.90045123 -1.0621902
> 0.2693762 -0.38033715  0.6267548  0.4767652  0.3012347
> Sample9   0.1088066 -0.32197951  0.46665158 -1.72560781  0.7375796
> 0.2794331  1.00171777 -0.1087306  1.2519195 -0.8848459
> Sample10 -0.3651829  1.00253167 -0.12004007  0.34972942 -2.1310389
> 0.7162622 -0.19072440 -2.0173607 -0.9407956 -2.4224372
>
> I first multiply the matrix by the transposed matrix:
>
> > xxt=adjusted_dummyset %*% t(adjusted_dummyset)
> > xxt
>           Sample1    Sample2    Sample3    Sample4    Sample5    Sample6
> Sample7    Sample8    Sample9   Sample10
> Sample1  16.045163 -1.1367278 -2.5390033 -1.4762231  1.0158736 -1.8408484
> -5.8176751 -1.5163240  3.5304834 -6.2647180
> Sample2  -1.136728  9.0985066 -5.2175085 -0.8332460  0.7980575 -3.3607995
> 1.6342076 -0.3098849 -0.3462389 -0.3263661
> Sample3  -2.539003 -5.2175085 12.4738749  3.7747189 -0.9562913 -1.3252885
> -0.9175020  0.5849437 -6.0796087  0.2016647
> Sample4  -1.476223 -0.8332460  3.7747189  6.1161097 -1.0232149 -3.0984138
> -1.1213965 -1.9014901 -1.0503243  0.6134802
> Sample5   1.015874  0.7980575 -0.9562913 -1.0232149  6.0758629  0.2183710
> -0.9466710  2.1466445 -0.2626433 -7.0659890
> Sample6  -1.840848 -3.3607995 -1.3252885 -3.0984138  0.2183710  6.6590617
> 3.0772441  1.0977948  0.3980588 -1.8251802
> Sample7  -5.817675  1.6342076 -0.9175020 -1.1213965 -0.9466710  3.0772441
> 5.8681617 -0.9215294  0.1646922 -1.0195316
> Sample8  -1.516324 -0.3098849  0.5849437 -1.9014901  2.1466445  1.0977948
> -0.9215294  3.1757171 -2.2533118 -0.1025599
> Sample9   3.530483 -0.3462389 -6.0796087 -1.0503243 -0.2626433  0.3980588
> 0.1646922 -2.2533118  7.2986178 -1.3997251
> Sample10 -6.264718 -0.3263661  0.2016647  0.6134802 -7.0659890 -1.8251802
> -1.0195316 -0.1025599 -1.3997251 17.1889249
>
> Then I apply the Eigen function and save the eigenvectors and eigenvalues
> eigen(xxt,TRUE)
> $values
>  [1]  2.693535e+01  1.898845e+01  1.668814e+01  1.184849e+01  8.136377e+00
> 4.893302e+00  2.007293e+00  4.994285e-01  3.173378e-03 -1.652399e-15
>
> $vectors
>             [,1]        [,2]        [,3]        [,4]        [,5]
> [,6]         [,7]        [,8]        [,9]      [,10]
>  [1,]  0.60510924  0.13494101 -0.54092789 -0.11000076  0.11721749
> 0.37004560 -0.106993925 -0.10503855  0.19436075 -0.3162278
>  [2,]  0.05581480 -0.39453663  0.09160465  0.67089969  0.08088382
> 0.32755098 -0.009124933  0.11721316 -0.39379401 -0.3162278
>  [3,] -0.28666669  0.71121151 -0.01000162  0.03014755 -0.06261445
> 0.22739798  0.422156670  0.02330019 -0.27677054 -0.3162278
>  [4,] -0.13469379  0.23709870 -0.15611654  0.27760975 -0.50201617
> -0.31821931 -0.584748137  0.12405010  0.11661762 -0.3162278
>  [5,]  0.26143053  0.12386029  0.29769330  0.19250271  0.27617639
> -0.49754797  0.084826842 -0.59908097  0.02670441 -0.3162278
>  [6,]  0.01918933 -0.04796651  0.35560411 -0.57607725  0.07402744
> 0.14338749 -0.455866694 -0.01271614 -0.45276432 -0.3162278
>  [7,] -0.11887886 -0.18220751  0.42748177 -0.07360955 -0.34327294
> 0.38589551  0.142875991 -0.16600492  0.59142740 -0.3162278
>  [8,] -0.02725610  0.06425762  0.15301980 -0.01332073  0.53079711
> -0.18890512  0.012099700  0.65880817  0.34630946 -0.3162278
>  [9,]  0.24706215 -0.31886828 -0.13294196 -0.25401667 -0.42308063
> -0.38135353  0.482282448  0.24225089 -0.19843310 -0.3162278
> [10,] -0.62111060 -0.32779019 -0.48541562 -0.14413474  0.25188193
> -0.06825162  0.012492037 -0.28278192  0.04634233 -0.3162278
>
> I also calculate:
>
> xtx=t(adjusted_dummyset) %*% adjusted_dummyset
> > xtx
>            [,1]       [,2]       [,3]       [,4]       [,5]        [,6]
> [,7]       [,8]       [,9]       [,10]
>  [1,]  9.0000000 -3.1796310  2.0417037  3.9514142  2.7275520 -1.66571377
> 1.5629736 -0.9104748 -4.8506341  0.68480373
>  [2,] -3.1796310  9.0000000  1.0981249  0.5494749 -1.2662064 -1.89318349
> 2.1005566 -0.5052055 -1.7947356 -3.90496189
>  [3,]  2.0417037  1.0981249  9.0000000 -1.1689542  2.3836861  1.22540724
> 8.5924385  4.2130304  1.8322104  0.72643120
>  [4,]  3.9514142  0.5494749 -1.1689542  9.0000000 -1.7132579 -2.51679155
> -2.8679239  0.5181511 -3.9536139  0.84095707
>  [5,]  2.7275520 -1.2662064  2.3836861 -1.7132579  9.0000000 -2.17714376
> 3.1966738  4.1903147  0.7937017  5.20170437
>  [6,] -1.6657138 -1.8931835  1.2254072 -2.5167916 -2.1771438  9.00000000
> 0.3764605 -1.9821441  2.3330854 -0.08204933
>  [7,]  1.5629736  2.1005566  8.5924385 -2.8679239  3.1966738  0.37646053
> 9.0000000  3.5708620  1.7809104 -0.28160124
>  [8,] -0.9104748 -0.5052055  4.2130304  0.5181511  4.1903147 -1.98214408
> 3.5708620  9.0000000  5.3281685  6.32863320
>  [9,] -4.8506341 -1.7947356  1.8322104 -3.9536139  0.7937017  2.33308543
> 1.7809104  5.3281685  9.0000000  2.27959505
> [10,]  0.6848037 -3.9049619  0.7264312  0.8409571  5.2017044 -0.08204933
> -0.2816012  6.3286332  2.2795951  9.00000000
> >
> and apply the eigen function to this:
>
> > eigen(xtx,TRUE)
> $values
>  [1] 2.693535e+01 1.898845e+01 1.668814e+01 1.184849e+01 8.136377e+00
> 4.893302e+00 2.007293e+00 4.994285e-01 3.173378e-03 6.009067e-16
>
> $vectors
>             [,1]        [,2]         [,3]        [,4]         [,5]
> [,6]        [,7]        [,8]        [,9]       [,10]
>  [1,] -0.01604003  0.56323526 -0.190946669 -0.40105857 -0.002543387
> 0.17417910  0.25192284 -0.59870554  0.18090811 -0.01463338
>  [2,]  0.08466679 -0.16695878 -0.455330886  0.54953906 -0.002740213
> -0.47379253  0.19091718 -0.36818489  0.23997046  0.03943000
>  [3,] -0.42016626 -0.01672674 -0.438168969 -0.16986221 -0.259624942
> 0.03774746 -0.20059192  0.33694834  0.50122476 -0.35847979
>  [4,]  0.17380688  0.46248616 -0.009158497  0.19699882 -0.641525880
> -0.08643216  0.32002024  0.38196270 -0.08822512  0.20466364
>  [5,] -0.38230998  0.27820625  0.061021059  0.01793892  0.556873562
> -0.31199982  0.48447946  0.33729239 -0.05542533 -0.11569124
>  [6,] -0.01078626 -0.35623684  0.086516360 -0.57830505 -0.317806785
> -0.56648878  0.22620425 -0.09749859 -0.18851088 -0.11373558
>  [7,] -0.41357826 -0.06924232 -0.495909637 -0.11281244  0.008420391
> 0.07292844 -0.09886884  0.01075021 -0.44335225  0.59469674
>  [8,] -0.48705511  0.07929178  0.158922061  0.33799785 -0.279059362
> 0.07881225 -0.04227138 -0.30875079 -0.45445853 -0.47881039
>  [9,] -0.33349167 -0.40387477  0.287225004  0.07874711 -0.163416914
> 0.37364320  0.52438989 -0.09014761  0.33696991  0.27201980
> [10,] -0.34649980  0.24943468  0.446371805  0.03960183 -0.072567359
> -0.40852699 -0.43011632 -0.14042978  0.30891048  0.38025985
>
>
> I create a diagonal matrix containing the square  roots of the eigenvalues
> along the diagonal:
>
> > diagonalmatrix_eigenvaluesqrts
>     V1   V2   V3    V4    V5    V6    V7    V8    V9    V10
> 1  5.61 0.00 0.00 0.000 0.000 0.000 0.000 0.000 0.000 0.0000
> 2  0.00 1.41 0.00 0.000 0.000 0.000 0.000 0.000 0.000 0.0000
> 3  0.00 0.00 1.12 0.000 0.000 0.000 0.000 0.000 0.000 0.0000
> 4  0.00 0.00 0.00 0.941 0.000 0.000 0.000 0.000 0.000 0.0000
> 5  0.00 0.00 0.00 0.000 0.727 0.000 0.000 0.000 0.000 0.0000
> 6  0.00 0.00 0.00 0.000 0.000 0.471 0.000 0.000 0.000 0.0000
> 7  0.00 0.00 0.00 0.000 0.000 0.000 0.296 0.000 0.000 0.0000
> 8  0.00 0.00 0.00 0.000 0.000 0.000 0.000 0.246 0.000 0.0000
> 9  0.00 0.00 0.00 0.000 0.000 0.000 0.000 0.000 0.159 0.0000
> 10 0.00 0.00 0.00 0.000 0.000 0.000 0.000 0.000 0.000 0.0127
>
> I have been operating under the assumption that the original data matrix
> could be reconstructed by multiplying the eigenvectors from xxt X a
> diagonal
> matrix with the square roots of the eigenvalues along the diagonal X the
> eigenvectors from xtx as follows:
>
>  Reconstructed_dummyset=eigenvectors_xxt %*%
> as.matrix(diagonalmatrix_eigenvaluesqrts) %*% t(eigenvectors_xtx)
> > Reconstructed_dummyset
>              [,1]         [,2]        [,3]       [,4]        [,5]
> [,6]        [,7]        [,8]         [,9]      [,10]
>  [1,]  0.253193770  0.402535986 -1.14743281  0.5698820 -1.31593686
> -0.23278020 -1.10482209 -1.78241789 -1.344859283 -1.4554367
>  [2,] -0.593623956  0.320035080 -0.30357878 -0.1156343 -0.25992539
> -0.25866729 -0.17538542  0.05041073  0.221261845 -0.2675080
>  [3,]  0.620290857 -0.322491313  0.63113668  0.2551444  0.90025477
> -0.36701269  0.61029139  0.90661927  0.226875492  0.6933749
>  [4,]  0.045394154  0.135401863  0.48897212  0.2772398  0.14530592
> -0.12141324  0.33943264  0.53647586  0.002607645  0.4395889
>  [5,] -0.007416873  0.213143606 -0.90810973  0.2103079 -0.34091303
> -0.06529543 -0.82817426 -0.61633082 -0.543192842 -0.2177753
>  [6,]  0.068080740 -0.533029067 -0.14658587 -0.2022178 -0.09846926
> 0.29932974 -0.12798021 -0.14611192 -0.016267532  0.1053571
>  [7,] -0.113178997 -0.309552826  0.18447270 -0.1195559  0.01775153
> 0.15282100  0.02644529  0.41110216  0.623380590  0.3368958
>  [8,] -0.076800910 -0.117830736 -0.03085105 -0.1712106  0.39036065
> -0.10570034 -0.05526268 -0.08333122 -0.028936234  0.1511139
>  [9,] -0.187036903  0.212383109 -0.41884937  0.2704928 -0.69211148
> 0.50295414 -0.45885438 -0.75166730 -0.301317053 -0.6536756
> [10,] -0.008314191 -0.002179248  1.66522298 -0.9826679  1.25832941
> 0.20033204  1.75042615  1.49448058  1.149522805  0.8527934
>
> However as you can see I've clearly got my wires crossed somewhere!
>
> I'd really appreciate any advice you could offer... Thanks!
>
> Julia
> --
> View this message in context:
>
http://r.789695.n4.nabble.com/Data-reconstruction-following-PCA-using-Eigen-function-tp2226535p2226535.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

Julia El-Sayed Moustafa

2010-May-24 21:03 UTC

head link

[R] Data reconstruction following PCA using Eigen function

Hi Thomas,

Thanks very much for your reply. I used svd and it worked perfectly for my
purposes!

Thanks again,
Julia
-- 
View this message in context:
http://r.789695.n4.nabble.com/Data-reconstruction-following-PCA-using-Eigen-function-tp2226535p2229191.html
Sent from the R help mailing list archive at Nabble.com.

Maybe Matching Threads

Search for more maybe matching threads

R help - May 2010 - Data reconstruction following PCA using Eigen function

[R] Data reconstruction following PCA using Eigen function

[R] Data reconstruction following PCA using Eigen function

[R] Data reconstruction following PCA using Eigen function

Maybe Matching Threads