Hello All,

I am trying to carry out variable reduction. I do not have information about the dependent variable, and have only the X variables, as it were.

In selecting the variables I wish to keep, I have considered the following criteria:
1) Percentage of missing values in each column/variable
2) Variance of each variable, with a cut-off value

I recently came across Weka and found that there is an RWeka package which would allow me to use Weka through R. Weka provides a "Genetic search" variable reduction method, but I could not find its R implementation in the RWeka PDF manual on CRAN.

I looked for other R packages that allow me to do variable reduction without considering a dependent variable. I came across the 'dprep' package, but it does not have a Windows implementation.

Moreover, my dataset contains both continuous and categorical variables, with some categorical variables having 3 levels, 10 levels and so on, up to a maximum of 50 levels (e.g. states in the USA).

Any suggestions in this regard will be much appreciated.

Thank you

Harsh Singhal
Decision Systems,
Mu Sigma, Inc.
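The two screening criteria listed in the post (share of missing values and a variance cut-off) can be sketched in base R as follows. This is only a minimal illustration: the toy data frame, the column names, and both cut-off values (`max_missing`, `min_variance`) are assumptions made for the example and do not come from the thread.

```r
# Toy data standing in for the real X variables (all names are illustrative).
set.seed(1)
X <- data.frame(
  income = c(rnorm(95, mean = 50, sd = 10), rep(NA, 5)),   # 5% missing
  sparse = c(rnorm(30), rep(NA, 70)),                      # 70% missing
  flat   = rep(3, 100),                                    # zero variance
  state  = factor(sample(state.abb, 100, replace = TRUE))  # up to 50 levels
)

max_missing  <- 0.40  # assumed cut-off: drop columns with more than 40% missing
min_variance <- 1e-8  # assumed cut-off: drop numeric columns that are (almost) constant

# Criterion 1: share of missing values in each column
missing_share <- colMeans(is.na(X))

# Criterion 2: variance of each numeric column (factors are not screened on variance)
is_num <- vapply(X, is.numeric, logical(1))
col_var <- rep(NA_real_, ncol(X))
names(col_var) <- names(X)
col_var[is_num] <- vapply(X[is_num], var, numeric(1), na.rm = TRUE)

# Keep a column if it passes the missing-value screen and, when numeric,
# the variance screen as well
keep <- missing_share <= max_missing & (!is_num | col_var > min_variance)
X_reduced <- X[, keep, drop = FALSE]
names(X_reduced)
```

Note that a raw variance cut-off depends on the units of each variable, so in practice the threshold would probably need to be set per variable or applied after rescaling. Categorical columns are passed through untouched here; screening them would need a different measure, such as the share of the most frequent level.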
Harsh <singhalblr <at> gmail.com> writes:
> I looked for other R packages that allow me to do variable reduction
> without considering a dependent variable. I came across 'dprep'
> package but it does not have a Windows implementation.

I doubt that you will find what you are looking for, but: there is a Windows version available from the "Homepage of the dprep package" at <http://math.uprm.edu/~edgar/dprep.html>. This version 2.0 can be loaded without errors into R 2.8.0, though it appears not to be fully compliant with the tests on CRAN.
Hi Harsh,

>> I looked for other R packages that allow me to do variable reduction
>> without considering a dependent variable.

Have a look at the subselect package. It has an implementation of a genetic algorithm, along with some other methods. It should do what you want.

Regards,
Mark.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
View this message in context: http://www.nabble.com/Pre-model-Variable-Reduction-tp20912229p20914146.html
Sent from the R help mailing list archive at Nabble.com.
See:

?prcomp
?princomp

On Tue, Dec 9, 2008 at 5:34 AM, Harsh <singhalblr at gmail.com> wrote:
> I am trying to carry out variable reduction. I do not have information
> about the dependent variable, and have only the X variables as it
> were.
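As a rough illustration of the prcomp suggestion, the sketch below runs a principal component analysis on purely numeric toy data and keeps enough components to reach a chosen share of the total variance. The data, the variable names, and the 95% threshold are assumptions for the example only.

```r
# Illustrative numeric data; a_copy is deliberately a near-duplicate of a.
set.seed(42)
n <- 200
a <- rnorm(n)
X_num <- data.frame(
  a      = a,
  a_copy = a + rnorm(n, sd = 0.05),
  b      = rnorm(n),
  c      = rnorm(n)
)

# Centre and scale so that variables measured in different units are comparable
pca <- prcomp(X_num, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component, and its running total
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_explained <- cumsum(var_explained)

# Assumed rule of thumb: keep the first components that together explain 95%
n_keep <- which(cum_explained >= 0.95)[1]
scores <- pca$x[, seq_len(n_keep), drop = FALSE]
round(cum_explained, 3)
```

Two caveats apply to the situation described in the original post. PCA returns linear combinations of the variables rather than a subset of them, and prcomp expects complete numeric input, so missing values and the categorical columns (such as the 50-level state variable) would need to be handled separately before this step.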
Harsh wrote:
> I looked for other R packages that allow me to do variable reduction
> without considering a dependent variable.

Take a look at the redun function in the Hmisc package, which does redundancy analysis.

Frank

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University