Frank Harrell
2013-Apr-03 16:31 UTC
[R] Model Selection based on individual t-values with the largest possible number of variables in regression
To say that these strategies represent bad statistical practice is to put it mildly.

Frank

mister_O wrote
> Dear R-Community,
>
> While writing my master's thesis, I ran into a difficult issue. In analyzing
> the determinants of capital structure, I have one dependent variable (total
> debt ratio = TD) and 15 independent ones. In the first stage I normalized my
> data by deleting outliers from each variable (pairwise deletion), and as a
> result every variable has a different length. Now, when selecting relevant
> variables for the "best" model, neither stepwise nor forward nor backward
> procedures work well, since there are a number of other combinations of
> variables which also have high t-values. Thus, whichever model I pick, I
> never know whether it is trustworthy. I tried to calculate all possible
> combinations of independent variables, but since I have 15 of them, there
> are thousands of such combinations and R simply refuses to calculate them
> (the computer crashes). I wonder if you can help me write code in R to find
> the model which includes as many variables as possible with significant
> t-values?
>
> cheers,
> Oleg

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University