I have a decent-sized matrix (36 x 11,000) that I have performed a PCA on with prcomp(), but due to the large number of variables I can't plot the result with biplot(). How else can I plot the PCA output?

I tried posting this before, but got no responses, so I'm trying again. Surely this is a common problem, but I can't find a solution with Google.

The University of Dundee is a registered Scottish Charity, No: SC015096
That depends on what you want to plot. Basically, you can just use plot() on pcaResult$x, though you may need to specify which PCs you want to plot:

    pcaResult <- prcomp(iris[, 1:4])
    plot(pcaResult$x)          # gives the first 2 PCs
    plot(pcaResult$x[, 2:3])   # gives the 2nd vs. the 3rd PC

Or, if you want to see more, you can use pairs():

    pairs(pcaResult$x)

If you want things colored, there's the col parameter, which works for both functions:

    pairs(pcaResult$x, col = iris[, 5])

Does this help?

On 07.05.2012 at 12:22, Christian Cole wrote:
> [...] How else can I plot the PCA output?
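For a matrix shaped like the one in the question, the same calls can be tried on simulated data first. This is only a sketch: the random matrix, the three sample groups A/B/C, and the shift applied to group B are all made up for illustration and are not the poster's data.

```r
# Simulated stand-in for a 36 x 11,000 matrix (samples in rows, variables in columns).
set.seed(1)
n_samples <- 36
n_vars <- 11000
group <- factor(rep(c("A", "B", "C"), each = 12))   # hypothetical sample groups
x <- matrix(rnorm(n_samples * n_vars), nrow = n_samples)

# Shift the first 500 variables for group B so the scores show some structure.
x[group == "B", 1:500] <- x[group == "B", 1:500] + 1

pca_result <- prcomp(x)
scores <- pca_result$x   # one row per sample, one column per PC

# First two PCs, coloured by group, with a legend.
plot(scores[, 1:2], col = as.integer(group), pch = 19)
legend("topright", legend = levels(group),
       col = seq_along(levels(group)), pch = 19)

# Scatterplot matrix of the first four PCs only; all 36 would be unreadable.
pairs(scores[, 1:4], col = as.integer(group), pch = 19)
```

With only 36 samples, the scores matrix has at most 36 columns no matter how many variables there are, so plotting scores stays cheap even when biplot() struggles with 11,000 loading arrows.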
Christian, is that 36 samples x 11K variables? Sounds like it. Is this spectroscopic data?

In any case, the scores are in the list element $x:

    answer <- prcomp(your_matrix)

answer$x contains the scores, with the columns holding the PCs in order, so if you want to plot the first 2 PCs, you could do

    plot(answer$x[, 1], answer$x[, 2])

[I see Jessica just answered...]

If you want the loading plot, it's going to be interesting with all those variables, but this will do it:

    plot(1:11000, answer$rotation[, 1], type = "l")   # loadings of the 1st PC

Depending upon what kind of data this is, the 1:11000 could be replaced by something more sensible. If it is spectroscopic data, replace it with your frequency values.

By the way, plot(answer) will give you the scree plot, which helps in deciding how many PCs are worth keeping.

Good luck. Bryan

***********
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University

On May 7, 2012, at 6:22 AM, Christian Cole wrote:
> [...] How else can I plot the PCA output?
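A possible extension of the plots above is to put the share of variance on the axis labels and to use a meaningful x-axis for the loadings. This is a sketch under assumptions: the data are simulated, and the freq vector (400 to 4000) is an invented placeholder for whatever variable axis the real data have.

```r
# Simulated 36 x 11,000 matrix standing in for the real data.
set.seed(2)
x <- matrix(rnorm(36 * 11000), nrow = 36)
answer <- prcomp(x)

# Proportion of total variance carried by each PC.
var_explained <- answer$sdev^2 / sum(answer$sdev^2)
pct <- round(100 * var_explained, 1)

# Score plot with the variance share in the axis labels.
plot(answer$x[, 1], answer$x[, 2],
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"))

# Loadings of the 1st PC against a hypothetical variable axis.
freq <- seq(400, 4000, length.out = ncol(x))   # placeholder values, not real frequencies
plot(freq, answer$rotation[, 1], type = "l",
     xlab = "Variable axis (hypothetical)", ylab = "PC1 loading")

# Scree plot limited to the first 10 PCs, drawn as a line.
screeplot(answer, npcs = 10, type = "lines")
```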
I think the question on your mind should be: 'what do I want to do with this plot?' Just producing output from the PCA is easy; plotting output$sdev is probably quite informative.

From the sounds of it, though, you want to do clustering with the PCA component loadings? (That's mostly what the biplot accomplishes using the first two PCs.)

The first thing to note is that you might not want to plot all 36 PCs! Once you go beyond the first few, the components typically explain little variance and mostly reflect noise, so plots of them can be misleading in ways that are not obvious. A biplot with PCs 1 & 2, or 2 & 3, for example, could easily be sufficient.

If you still want to plot many PCs, from an exploratory point of view, something like a parallel coordinates plot might be helpful. Alternatively, you could look at rgl for general plotting of 3d points (so you can do a 3d version of the biplot), or apply more systematic clustering algorithms.

Zhou
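The sdev plot and the parallel coordinates idea can be sketched as follows. The assumptions here are simulated data, made-up groups A/B/C, an arbitrary 80% cutoff for cumulative variance, and parcoord() from the MASS package as one of several ways to draw parallel coordinates.

```r
library(MASS)   # provides parcoord()

# Simulated 36 x 11,000 matrix with three hypothetical groups.
set.seed(3)
group <- factor(rep(c("A", "B", "C"), each = 12))
x <- matrix(rnorm(36 * 11000), nrow = 36)
x[group == "C", 1:500] <- x[group == "C", 1:500] + 1   # give group C some signal

output <- prcomp(x)

# Standard deviation of each PC, in order.
plot(output$sdev, type = "b",
     xlab = "Principal component", ylab = "Standard deviation")

# Cumulative share of variance, and the first PC at which it reaches 80%.
cum_var <- cumsum(output$sdev^2) / sum(output$sdev^2)
n_pc <- which(cum_var >= 0.8)[1]

# Parallel coordinates of the first five PCs, one line per sample.
parcoord(output$x[, 1:5], col = as.integer(group), var.label = TRUE)
```

Each sample becomes one line across the PC axes, so groups that separate on any of the plotted PCs show up as bundles of lines that diverge at that axis.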