Allyson Combes
2014-Apr-14 04:00 UTC
[R] Comparing initial eigenvalues to broken stick results
I am trying to create a function that will determine the number of components to retain based on the broken stick criterion. To do so, I know I need to compare the initial eigenvalues to the broken stick values. The first initial eigenvalue that falls below its broken stick value is the cutoff point, so the number of components to retain is the number of eigenvalues before that cutoff. So far this is the syntax I have and what I get.

ev <- eigen(cor(EFAexample))
ev
bstick(24, tot.var=24)

> ev$values
 [1] 7.2819381 2.3951299 1.8190170 1.6056666 0.9862474 0.9092235 0.8269931
 [8] 0.7861644 0.6978157 0.6824547 0.6333925 0.5997783 0.5737571 0.5347758
[15] 0.4976710 0.4849214 0.4502025 0.4223273 0.3819080 0.3599697 0.3353226
[22] 0.3184069 0.2146300 0.2022866

> bstick(24, tot.var=24)
    Stick1     Stick2     Stick3     Stick4     Stick5     Stick6     Stick7
3.77595818 2.77595818 2.27595818 1.94262484 1.69262484 1.49262484 1.32595818
    Stick8     Stick9    Stick10    Stick11    Stick12    Stick13    Stick14
1.18310103 1.05810103 0.94698992 0.84698992 0.75608083 0.67274750 0.59582442
   Stick15    Stick16    Stick17    Stick18    Stick19    Stick20    Stick21
0.52439585 0.45772918 0.39522918 0.33640566 0.28085010 0.22821852 0.17821852
   Stick22    Stick23    Stick24
0.13059947 0.08514493 0.04166667

In this case the cutoff would be at stick 2, so we would retain only 1 component. What syntax can I use to make this comparison automatically instead of doing it manually each time?

Also, am I using bstick correctly? From what I understand, I should just have to enter the number of components and the total variance, which will be the total number of components.

Any help would be greatly appreciated.

Thanks!
Allyson
dcarlson at tamu.edu
2014-Apr-14 15:23 UTC
[R] Comparing initial eigenvalues to broken stick results
It helps a great deal if you provide a small data set using dput() and indicate which packages need to be loaded for the functions you are using. This example uses random data, so no eigenvalues exceed the initial broken stick values:

> set.seed(42)
> x <- matrix(rnorm(200), 20, 10)
> require(vegan)
> bs <- rle(eigen(cor(x))$values > bstick(10, tot.var=10))
> as.vector(ifelse(bs$values[1]==TRUE, bs$lengths[1], 0))
[1] 0

rle() encodes the comparison as runs of TRUE and FALSE. If the first run is TRUE, its length is the number of components to retain; otherwise the result is 0.

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
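For anyone who wants this comparison wrapped in a reusable function, here is a minimal base-R sketch. The names broken_stick() and n_retain() are hypothetical helpers made up for this illustration, not part of vegan. broken_stick() is assumed to reproduce the values of bstick(n, tot.var) shown in the original post, and the eigenvalues are assumed to be sorted in decreasing order, as eigen() returns them.

```r
# Broken stick expectations: stick k = (tot.var / n) * sum(1/j for j = k..n).
# Hypothetical helper; intended to match the bstick() output quoted above.
broken_stick <- function(n, tot.var = n) {
  (tot.var / n) * rev(cumsum(1 / (n:1)))
}

# Number of leading eigenvalues that exceed their broken stick value.
# Hypothetical helper; ev must be sorted in decreasing order.
n_retain <- function(ev, tot.var = sum(ev)) {
  above <- ev > broken_stick(length(ev), tot.var)
  # Length of the leading run of TRUE: position of the first FALSE, minus one.
  if (all(above)) length(ev) else which(!above)[1] - 1
}

# Eigenvalues copied from the original post (24 variables, correlation matrix).
ev <- c(7.2819381, 2.3951299, 1.8190170, 1.6056666, 0.9862474, 0.9092235,
        0.8269931, 0.7861644, 0.6978157, 0.6824547, 0.6333925, 0.5997783,
        0.5737571, 0.5347758, 0.4976710, 0.4849214, 0.4502025, 0.4223273,
        0.3819080, 0.3599697, 0.3353226, 0.3184069, 0.2146300, 0.2022866)

print(n_retain(ev))  # 1
```

With real data the call would be n_retain(eigen(cor(EFAexample))$values). Only the leading run is counted, rather than sum(above), because later eigenvalues can rise above the stick again (here the 16th, 0.4849214, exceeds Stick16, 0.45772918) without counting as retained components.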