Bill.Venables@csiro.au
2005-Apr-11 04:15 UTC
[R] plotting Principal components vs individual variables.
At the cost of breaking the thread I'm going to change your subject and replace 'Principle' by 'Principal'. I just can't stand it any longer...

OK, here is how I would solve your other problems. First put

> wh <- c("USA", "New Zealand", "Dominican Republic", "Western Samoa", "Cook Islands")
> ind <- match(wh, row.names(running))

Now 'ind' has the indices of the special set as numbers. You need this because although your data frame is indexed by the country names, your 'principal' [NB] component scores are not.

Plot the scores against the first variable:

> plot(running$X100m, running.pca$scores[, 1])  # you got this far

Finally, pick out the special set:

> points(running$X100m[ind], running.pca$scores[ind, 1], pch = 4, col = "red", cex = 2)
> text(running$X100m[ind], running.pca$scores[ind, 1], pos = 3, cex = 0.7)  # optional

and you can forget all about the subset data frame running2.

Bill Venables.

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Brett Stansfield
Sent: Monday, 11 April 2005 10:04 AM
To: R help (E-mail)
Subject: [R] plotting Principle components vs individual variables.

Dear R,

I'm trying to plot the first principle component of an analysis vs the first variable but am having trouble. I have no trouble doing the initial plot but have difficulty thereafter.
First I want to highlight some points of the following data set

list(running)
[[1]]
                   X100m X200m X400m X800m X1500m   X5K  X10K Marathon
Argentina          10.39 20.81 46.84  1.81   3.70 14.04 29.36   137.72
Australia          10.31 20.06 44.84  1.74   3.57 13.28 27.66   128.30
Austria            10.44 20.81 46.82  1.79   3.60 13.26 27.72   135.90
Belgium            10.34 20.68 45.04  1.73   3.60 13.22 27.45   129.95
Bermuda            10.28 20.58 45.91  1.80   3.75 14.68 30.55   146.62
Brazil             10.22 20.43 45.21  1.73   3.66 13.62 28.62   133.13
Burma              10.64 21.52 48.30  1.80   3.85 14.45 30.28   139.95
Canada             10.17 20.22 45.68  1.76   3.63 13.55 28.09   130.15
Chile              10.34 20.80 46.20  1.79   3.71 13.61 29.30   134.03
China              10.51 21.04 47.30  1.81   3.73 13.90 29.13   133.53
Columbia           10.43 21.05 46.10  1.82   3.74 13.49 27.88   131.35
Cook Islands       12.18 23.20 52.94  2.02   4.24 16.70 35.38   164.70
Costa Rica         10.94 21.90 48.66  1.87   3.84 14.03 28.81   136.58
Czechoslovakia     10.35 20.65 45.64  1.76   3.58 13.42 28.19   134.32
Denmark            10.56 20.52 45.89  1.78   3.61 13.50 28.11   130.78
Dominican Republic 10.14 20.65 46.80  1.82   3.82 14.91 31.45   154.12
Finland            10.43 20.69 45.49  1.74   3.61 13.27 27.52   130.87
France             10.11 20.38 45.28  1.73   3.57 13.34 27.97   132.30
East Germany       10.12 20.33 44.87  1.73   3.56 13.17 27.42   129.92
West Germany       10.16 20.37 44.50  1.73   3.53 13.21 27.61   132.23
United Kingdom     10.11 20.21 44.93  1.70   3.51 13.01 27.51   129.13
Greece             10.22 20.71 46.56  1.78   3.64 14.59 28.45   134.60
Guatemala          10.98 21.82 48.40  1.89   3.80 14.16 30.11   139.33
Hungary            10.26 20.62 46.02  1.77   3.62 13.49 28.44   132.58
India              10.60 21.42 45.73  1.76   3.73 13.77 28.81   131.98
Indonesia          10.59 21.49 47.80  1.84   3.92 14.73 30.79   148.83
Ireland            10.61 20.96 46.30  1.79   3.56 13.32 27.81   132.35
Israel             10.71 21.00 47.80  1.77   3.72 13.66 28.93   137.55
Italy              10.01 19.72 45.26  1.73   3.60 13.23 27.52   131.08
Japan              10.34 20.81 45.86  1.79   3.64 13.41 27.72   128.63
Kenya              10.46 20.66 44.92  1.73   3.55 13.10 27.38   129.75
South Korea        10.34 20.89 46.90  1.79   3.77 13.96 29.23   136.25
North Korea        10.91 21.94 47.30  1.85   3.77 14.13 29.67   130.87
Luxembourg         10.35 20.77 47.40  1.82   3.67 13.64 29.08   141.27
Malaysia           10.40 20.92 46.30  1.82   3.80 14.64 31.01   154.10
Mauritius          11.19 22.45 47.70  1.88   3.83 15.06 31.77   152.23
Mexico             10.42 21.30 46.10  1.80   3.65 13.46 27.95   129.20
Netherlands        10.52 20.95 45.10  1.74   3.62 13.36 27.61   129.02
New Zealand        10.51 20.88 46.10  1.74   3.54 13.21 27.70   128.98
Norway             10.55 21.16 46.71  1.76   3.62 13.34 27.69   131.48
Papua New Guinea   10.96 21.78 47.90  1.90   4.01 14.72 31.36   148.22
Philippines        10.78 21.64 46.24  1.81   3.83 14.74 30.64   145.27
Poland             10.16 20.24 45.36  1.76   3.60 13.29 27.89   131.58
Portugal           10.53 21.17 46.70  1.79   3.62 13.13 27.38   128.65
Rumania            10.41 20.98 45.87  1.76   3.64 13.25 27.67   132.50
Singapore          10.38 21.28 47.40  1.88   3.89 15.11 31.32   157.77
Spain              10.42 20.77 45.98  1.76   3.55 13.31 27.73   131.57
Sweden             10.25 20.61 45.63  1.77   3.61 13.29 27.94   130.63
Switzerland        10.37 20.46 45.78  1.78   3.55 13.22 27.91   131.20
Taiwan             10.59 21.29 46.80  1.79   3.77 14.07 30.07   139.27
Thailand           10.39 21.09 47.91  1.83   3.84 15.23 32.56   149.90
Turkey             10.71 21.43 47.60  1.79   3.67 13.56 28.58   131.50
USA                 9.93 19.75 43.86  1.73   3.53 13.20 27.43   128.22
USSR               10.07 20.00 44.60  1.75   3.59 13.20 27.53   130.55
Western Samoa      10.82 21.86 49.00  2.02   4.24 16.28 34.71   161.83

So I do the following

running2 <- running[c("USA","New Zealand", "Dominican Republic", "Western Samoa", "Cook Islands"),]

I check running2 and it shows as this

list(running2)
[[1]]
                   X100m X200m X400m X800m X1500m   X5K  X10K Marathon
USA                 9.93 19.75 43.86  1.73   3.53 13.20 27.43   128.22
New Zealand        10.51 20.88 46.10  1.74   3.54 13.21 27.70   128.98
Dominican Republic 10.14 20.65 46.80  1.82   3.82 14.91 31.45   154.12
Western Samoa      10.82 21.86 49.00  2.02   4.24 16.28 34.71   161.83
Cook Islands       12.18 23.20 52.94  2.02   4.24 16.70 35.38   164.70

I then ask to plot the first component vs X100m as follows:

plot(running$X100m, running.pca$scores[,1])

It does this no problems but when I ask it to highlight the running2 points I get the following

points(running2$X100m, running.pca$scores[,1], col="red")
Error in xy.coords(x, y) : x and y lengths differ

How can I get the programme to highlight the 5 countries in red with the remainder being black??

I have checked the pca$scores data

                        Comp.1       Comp.2        Comp.3       Comp.4
Argentina          -0.04924775 -0.465091996 -0.1569462564 -0.005810845
Australia           1.90192176  0.101049166 -0.0120464104  0.651816682
Austria             0.04010907 -0.163884583 -0.3055014673  0.016136753
Belgium             1.37253647  0.587803868  0.1488158699 -0.019595422
Bermuda             0.69426608 -0.493030587  0.1593950774  0.120074355
Brazil              1.68418949  0.214898184 -0.0733240991 -0.003162787
Burma              -1.40707421  0.188937600 -0.6623424285 -0.527215158
Canada              1.52735698 -0.404836611 -0.2142964494  0.176168719
Chile               0.40862535 -0.212618765  0.0408667861 -0.078681885
China              -0.56552586 -0.223429359 -0.2928094868 -0.086596928
Columbia           -0.11564898 -0.226977793  0.4589401052 -0.048022658
Cook Islands       -8.27371262  0.384947623 -0.7357421902  0.801461946
Costa Rica         -2.80544713  0.066593276 -0.1383029607 -0.147987785
Czechoslovakia      0.94084659  0.146421855  0.0317629603  0.063414890
Denmark             0.50127535  0.141213477 -0.0481367885  0.642995929
Dominican Republic  0.37381694 -1.061799496 -0.1213552291 -0.266094008
Finland             1.00082548  0.544083222 -0.0258548880  0.117235365
France              1.85734404 -0.004344636 -0.1282195202 -0.168095452
East Germany        2.02630668  0.059699745  0.0679807810 -0.039807436
West Germany        2.06659223  0.217657596  0.2898976618  0.043722768
United Kingdom      2.34631901  0.287007920 -0.2631249772 -0.052175961
Greece              0.60350129 -0.417937938 -0.2722470809 -0.297147859
Guatemala          -2.86229181 -0.084782032  0.1092307154  0.124678037
Hungary             0.88380469 -0.196275223 -0.1074953145 -0.089263972
India              -0.06219546  0.987010331  0.3912982329 -0.297843161
Indonesia          -1.44463975 -0.248819611 -0.0905110184 -0.370562155
Ireland            -0.14120447  0.296136604  0.0471511543  0.249279625
Israel             -0.68753510  0.413206648 -0.9205920726  0.116497574
Italy               2.53306911 -0.552500188 -0.4802520770  0.350811975
Japan               0.51968949 -0.142260162  0.2336252451 -0.043102377
Kenya               1.25823679  0.791008473  0.1904824736  0.246021878
South Korea         0.09196164 -0.291847998 -0.2933387263 -0.270244999
North Korea        -2.16401990  0.517221912  0.4818608282 -0.138369205
Luxembourg         -0.23310298 -0.767690670 -0.4063507475 -0.077338384
Malaysia           -0.03915668 -0.390104541  0.2784969875  0.006861838
Mauritius          -3.34291107  0.866944770  0.7514511384 -0.093101352
Mexico             -0.14612335  0.122630574  0.4475266976 -0.410199703
Netherlands         0.80128457  0.916525709  0.3262696094  0.063077066
New Zealand         0.52127592  0.669474301 -0.2625746371 -0.017136425
Norway             -0.12659992  0.566884546 -0.2898334704 -0.248028150
Papua New Guinea   -2.70378288 -0.154472681  0.4409443971  0.235409736
Philippines        -1.05941793  0.766237007  0.6010607266 -0.071473736
Poland              1.63781408 -0.348361281 -0.0257919744  0.179337045
Portugal           -0.33352564  0.216740757 -0.0452993297 -0.181755129
Rumania             0.51161806  0.394841278  0.0856325451 -0.206992718
Singapore          -1.14434377 -1.068790135  0.3417011592 -0.338799596
Spain               0.62586977  0.265405133 -0.0949525813  0.021691027
Sweden              1.04263923 -0.144339064  0.1025328504 -0.044472795
Switzerland         0.86021420 -0.178080230 -0.0009573433  0.361336836
Taiwan             -0.55013518  0.365172050 -0.0388626904 -0.209802871
Thailand           -0.80074552 -0.718955765 -0.4329859758 -0.375276203
Turkey             -1.11381623  0.489001441 -0.4128057565 -0.240581646
USA                 3.11410100 -0.397644307  0.3158182132  0.357347990
USSR                2.30107089 -0.382390458  0.1891648810  0.330762873
Western Samoa      -3.87627808 -1.843488955  0.8209468511  0.188597852

I think what is happening is that running2 only has 5 rows while pca$scores has 55

Can anyone help here?

Brett Stansfield

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
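
For anyone who wants to try the indexing approach from this exchange without the full data set, here is a minimal sketch. It rests on some assumptions: the thread does not show how running.pca was created, so a correlation-based princomp() fit is used purely for illustration, and only three events for eight of the 55 countries are copied from the posted table so that the example runs on its own. The labels = wh argument to text() is also an addition, so that the points are labelled with country names rather than index numbers.

```r
# Excerpt of the 'running' data posted in the thread (3 events, 8 countries).
running <- data.frame(
  X100m = c(10.39, 10.31, 12.18, 10.14, 10.51, 10.25,  9.93, 10.82),
  X200m = c(20.81, 20.06, 23.20, 20.65, 20.88, 20.61, 19.75, 21.86),
  X400m = c(46.84, 44.84, 52.94, 46.80, 46.10, 45.63, 43.86, 49.00),
  row.names = c("Argentina", "Australia", "Cook Islands", "Dominican Republic",
                "New Zealand", "Sweden", "USA", "Western Samoa")
)

# Assumption: the original fit is not shown; this call is illustrative only.
running.pca <- princomp(running, cor = TRUE)

# Countries to highlight, and their row positions in the full data frame.
wh  <- c("USA", "New Zealand", "Dominican Republic", "Western Samoa", "Cook Islands")
ind <- match(wh, row.names(running))

# Both vectors are now subset by the same index, so their lengths agree.
stopifnot(length(running$X100m[ind]) == length(running.pca$scores[ind, 1]))

# All countries in black, then the special set overplotted in red.
plot(running$X100m, running.pca$scores[, 1],
     xlab = "X100m", ylab = "Comp.1 score")
points(running$X100m[ind], running.pca$scores[ind, 1],
       pch = 4, col = "red", cex = 2)
text(running$X100m[ind], running.pca$scores[ind, 1],
     labels = wh, pos = 3, cex = 0.7)
```

The original points() call failed because its x argument came from the 5-row running2 while its y argument still held all 55 scores. Subsetting both from the full objects with the same ind keeps the two vectors the same length and in the same order.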
Martin Maechler
2005-Apr-11 07:12 UTC
[R] plotting Principal components vs individual variables.
>>>>> "BillV" == <Bill.Venables at csiro.au>
>>>>>     on Mon, 11 Apr 2005 14:15:08 +1000 writes:

    BillV> At the cost of breaking the thread I'm going to
    BillV> change your subject and replace 'Principle' by
    BillV> 'Principal'. I just can't stand it any longer...

I understand - having had similar feelings.

I'm starting to contemplate adding a filter to the mailing list software to do this automatically at least if this appears in a subject line...

Martin