RAJ
2011-Oct-25 23:54 UTC
[R] Logistic Regression - Variable Selection Methods With Prediction
Hello,

I am pretty new to R; I have always used SAS and SAS products. My target variable is binary ('Y' and 'N') and I have about 14 predictor variables. My goal is to compare different variable selection methods such as forward, backward, and all possible subsets. I am using the misclassification rate to pick the winning method.

This is what I have as of now:

    Reg <- glm(Graduation ~ ., DFtrain, family = binomial(link = "logit"))
    step <- extractAIC(Reg, direction = "forward")
    pred <- predict(Reg, DFtest, type = "response")
    mis <- mean({pred > 0.5} != {DFtest[, "Graduation"] == "Y"})

This program actually runs, but I wanted to check that I am doing this right. Also, I am getting the same misclassification rate for all the different methods.

I also tried to use

    Reg <- leaps(Graduation ~ ., DFtrain)
    pred <- predict(Reg, DFtest, type = "response")
    mis <- mean({pred > 0.5} != {DFtest[, "Graduation"] == "Y"})
    #print(summary(mis))

which doesn't work, and

    Reg <- regsubsets(Graduation ~ ., DFtrain)
    pred <- predict(Reg, DFtest, type = "response")
    mis <- mean({pred > 0.5} != {DFtest[, "Graduation"] == "Y"})
    #print(summary(mis))

The regsubsets call works, but the 'predict' function does not work with it. Is there any other way to do predictions when using regsubsets?

Any help is appreciated.

Thanks,
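Editorial note: a likely reason for the identical misclassification rates is that extractAIC() only reports the equivalent degrees of freedom and the AIC of the model it is given; it does not perform any selection, so predict(Reg, ...) always uses the full model. The stats function step() is the one that carries out AIC-based forward or backward selection and returns the selected model. The following is a minimal sketch under stated assumptions, not a definitive recipe: the data are simulated stand-ins for DFtrain and DFtest, and the column names x1..x4 and the helper misclass() are hypothetical.

```r
# Simulated stand-in for the poster's DFtrain / DFtest (hypothetical data).
set.seed(1)
n  <- 400
DF <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
p  <- plogis(-0.5 + 1.5 * DF$x1 - 1.0 * DF$x2)
DF$Graduation <- factor(ifelse(runif(n) < p, "Y", "N"), levels = c("N", "Y"))
DFtrain <- DF[1:300, ]
DFtest  <- DF[301:400, ]

# Full model (all predictors) and null model (intercept only).
full <- glm(Graduation ~ ., data = DFtrain, family = binomial(link = "logit"))
null <- glm(Graduation ~ 1, data = DFtrain, family = binomial(link = "logit"))

# Forward selection: start from the null model, allow terms up to the full model.
fwd <- step(null,
            scope = list(lower = ~ 1, upper = formula(full)),
            direction = "forward", trace = 0)

# Backward elimination: start from the full model and drop terms.
bwd <- step(full, direction = "backward", trace = 0)

# Hypothetical helper: misclassification rate at a 0.5 cutoff.
misclass <- function(model, newdata) {
  pred <- predict(model, newdata = newdata, type = "response")  # P(Graduation == "Y")
  mean((pred > 0.5) != (newdata$Graduation == "Y"))
}

rates <- c(forward  = misclass(fwd, DFtest),
           backward = misclass(bwd, DFtest),
           full     = misclass(full, DFtest))
print(rates)
```

The key difference from the posted code is that predictions come from the objects returned by step() (fwd, bwd) rather than from the original full fit. Forward and backward selection can still end at the same model, in which case the rates will legitimately match.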
RAJ
2011-Oct-26 16:35 UTC
[R] Logistic Regression - Variable Selection Methods With Prediction
Can I at least get help with which package to use for logistic regression with all possible models, and how to do prediction with it? I know I can use regsubsets, but I am not sure whether it has any prediction function to go with it.

Thanks
Steve_Friedman at nps.gov
2011-Oct-26 17:31 UTC
[R] Logistic Regression - Variable Selection Methods With Prediction
Try the glm package.

Steve Friedman Ph. D.
Ecologist / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
Steve_Friedman at nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147
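Editorial note: glm() is a function in the stats package that ships with R, so nothing extra needs to be installed to use it. The leaps package (leaps(), regsubsets()) searches subsets for linear least-squares models rather than logistic ones, which is one reason predict() has nothing useful to offer there. One way to get "all possible subsets" for a logistic model using only base R is to enumerate the subsets and fit each with glm(). The sketch below assumes simulated data with hypothetical columns x1..x4 standing in for DFtrain and DFtest, and uses AIC on the training data as the selection criterion; that criterion is an assumption, not something the thread specifies.

```r
# Simulated stand-in for the poster's DFtrain / DFtest (hypothetical data).
set.seed(2)
n  <- 400
DF <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
p  <- plogis(0.3 + 1.2 * DF$x1 - 0.8 * DF$x3)
DF$Graduation <- factor(ifelse(runif(n) < p, "Y", "N"), levels = c("N", "Y"))
DFtrain <- DF[1:300, ]
DFtest  <- DF[301:400, ]

predictors <- setdiff(names(DFtrain), "Graduation")

# Every non-empty subset of the predictors, as a list of character vectors.
subsets <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit one logistic regression per subset.
fits <- lapply(subsets, function(vars) {
  glm(reformulate(vars, response = "Graduation"),
      data = DFtrain, family = binomial(link = "logit"))
})

# Pick the subset with the smallest AIC on the training data (assumed criterion).
aics <- vapply(fits, AIC, numeric(1))
best <- fits[[which.min(aics)]]

# Because 'best' is an ordinary glm object, predict() works as usual.
pred <- predict(best, newdata = DFtest, type = "response")
mis  <- mean((pred > 0.5) != (DFtest$Graduation == "Y"))

print(formula(best))
print(mis)
```

With 4 predictors this fits 15 models; with 14 predictors it would be 2^14 - 1 = 16383 fits, which is feasible but noticeably slower, so a dedicated package may be preferable at that size.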