Bert Gunter
2011-Jun-18 14:26 UTC
[R] "Justify" PCA? -- was: Bartlett's Test of Sphericity
Apologies for the obvious, but just to clarify: there is no reason to "justify" a PCA -- it's just an eigen decomposition of a matrix and is therefore "justified" by linear algebra.

If one wants to determine whether some subset of the eigenvectors (principal components) suffices to "represent" the data in some sense, then that is where distributional considerations would come into play. But that is another (often unsatisfactory) story, typically irrelevant in the exploratory context where PCA is often used.

-- Bert

On Sat, Jun 18, 2011 at 5:08 AM, David Cross <d.cross at tcu.edu> wrote:
> Yes, Bartlett's is not a good way to "justify" a PCA.
>
> David Cross
> d.cross at tcu.edu
> www.davidcross.us
>
> On Jun 18, 2011, at 1:47 AM, Jeremy Miles wrote:
>
>> cortest.bartlett() in the psych package.
>>
>> I've never seen a non-significant Bartlett's test.
>>
>> Jeremy
>>
>> On 17 June 2011 12:43, thibault grava <thibault.grava at gmail.com> wrote:
>>> Hello Dear R user,
>>>
>>> I want to conduct a Principal components analysis and I need to run two
>>> tests to check whether I can do it or not. I found how to run the KMO
>>> test, however I cannot find an R function for the Bartlett's test of
>>> sphericity. Does somebody know if it exists?
>>>
>>> Thanks for your help!
>>>
>>> Thibault
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.

--
"Men by nature long to get on to the ultimate truths, and will often be
impatient with elementary studies or fight shy of them. If it were possible
to reach the ultimate truths without the elementary studies usually prefixed
to them, these would not be preparatory studies but superfluous diversions."
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
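The point that PCA is "just an eigen decomposition" can be checked directly in base R. The following is a minimal sketch, assuming simulated data; the variable names (`z`, `X`, `pc`, `eig`) are illustrative and not from the thread. It compares `prcomp()` with `eigen()` applied to the covariance and correlation matrices.

```r
# Sketch: PCA as an eigen decomposition (simulated data, illustrative names).
set.seed(1)
n <- 200
z <- rnorm(n)                      # shared component to induce correlation
X <- cbind(a = z + rnorm(n),
           b = 2 * z + rnorm(n),
           c = rnorm(n))

pc  <- prcomp(X)                   # centers by default; no scaling
eig <- eigen(cov(X))               # eigen decomposition of the covariance matrix

# Component variances are the eigenvalues.
same_values <- isTRUE(all.equal(pc$sdev^2, eig$values))

# Loadings are the eigenvectors, up to an arbitrary sign per column.
same_vectors <- isTRUE(all.equal(abs(unname(pc$rotation)), abs(eig$vectors)))

# With scale. = TRUE the same holds for the correlation matrix.
pc_cor   <- prcomp(X, scale. = TRUE)
same_cor <- isTRUE(all.equal(pc_cor$sdev^2, eigen(cor(X))$values))

print(c(values = same_values, vectors = same_vectors, correlation = same_cor))
```

`prcomp()` computes the decomposition through a singular value decomposition of the centered data rather than by calling `eigen()`, so agreement holds to numerical tolerance, and the comparison of loadings assumes distinct eigenvalues.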
peter dalgaard
2011-Jun-18 17:43 UTC
[R] "Justify" PCA? -- was: Bartlett's Test of Sphericity
On Jun 18, 2011, at 16:26, Bert Gunter wrote:

> Apologies for the obvious, but just to clarify: there is no reason to
> "justify" a PCA -- it's just an eigen decomposition of a matrix and is
> therefore "justified" by linear algebra.
>
> If one wants to determine whether some subset of the eigenvectors
> (principal components) suffices to "represent" the data in some sense,
> then that is where distributional considerations would come into play.
> But that is another (often unsatisfactory) story, typically irrelevant
> in the exploratory context where PCA is often used.

Yes, I was wondering about that too. PCA on independent variables just sorts them by variance. PCA on their correlation matrix is essentially a random orthogonal rotation. So PCA is nonsensical if there is no correlation, but it can be pretty useless even if there is.

Apparently the KMO/Bartlett "justification" comes out of SPSS usage, where a subculture has emerged in which it is conventional to cite those two quantities. If you google for "KMO", you'll find oodles of papers using the statistic, but precious few pages actually discussing or even defining it. Shame; the "adequate sampling" notion underlying the KMO measure could do with a qualified discussion.

(Within such subcultures there often arises an ideology that software is somehow flawed if it does not provide their favorite quantities, relevant or not. What is really at work is classical group dynamics, as in "you can't go to the opera if you don't own a tuxedo". See also "bandwagon effect".)

-pd

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
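The claims about independent variables can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are simulated independent normals, all names are illustrative, and `bartlett_sphericity()` is a hypothetical helper written here from the textbook chi-square form of Bartlett's test, not the `cortest.bartlett()` function from the psych package.

```r
# Sketch: PCA on independent variables (simulated; all names are illustrative).
set.seed(42)
n   <- 2000
sds <- c(x1 = 1, x2 = 2, x3 = 3, x4 = 4)
X   <- sapply(sds, function(s) rnorm(n, sd = s))   # independent columns

# Covariance PCA: component variances track the sorted column variances,
# and each component loads almost entirely on one variable.
pc_cov      <- prcomp(X)
sorted_vars <- sort(apply(X, 2, var), decreasing = TRUE)
print(rbind(pc_variance = pc_cov$sdev^2, column_variance = unname(sorted_vars)))
print(round(pc_cov$rotation, 2))

# Correlation PCA: eigenvalues all sit near 1, so the ordering and the
# directions of the components are driven by sampling noise.
pc_cor <- prcomp(X, scale. = TRUE)
print(round(pc_cor$sdev^2, 3))

# Hypothetical helper: Bartlett's test of sphericity in its textbook form
# (H0: the population correlation matrix is the identity).
bartlett_sphericity <- function(R, n) {
  p     <- ncol(R)
  chisq <- -((n - 1) - (2 * p + 5) / 6) * log(det(R))
  df    <- p * (p - 1) / 2
  c(chisq = chisq, df = df, p.value = pchisq(chisq, df, lower.tail = FALSE))
}
bart <- bartlett_sphericity(cor(X), n)
print(bart)
```

On truly independent data the test will usually, though not always, fail to reject. With real data almost any correlation structure leads to rejection at moderate sample sizes, which fits the remark earlier in the thread about never seeing a non-significant Bartlett's test.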