Maggie Wang
2009-Mar-26 07:57 UTC
[R] Extreme AIC in glm(), perfect separation, svm() tuning
Dear List,

With regard to the question I previously raised, here is the result I have obtained so far. brglm() does help, but there are two situations:

1) Classifiers with extremely high AIC (over 200), no perfect separation, and converging coefficients. In this case, using brglm() does help! It stabilizes the AIC, and the classification power is better.

Code and output (needs package brglm):

matrix <- read.table("http://ihome.ust.hk/~haitian/sample.txt")  # note: 'matrix' shadows base::matrix
names(matrix) <- c("g0","g761","g2809","g3106","g4373","g4583")
fo <- as.formula(g0 ~ g761 * g2809 * g3106 * g4373 * g4583)
library(MASS)
library(brglm)
lr <- brglm(formula = fo, family = binomial(link = logit), data = matrix)
summary(lr)

Coefficients:
                              Estimate Std. Error z value Pr(>|z|)
(Intercept)                     1.2829     0.8281   1.549   0.1214
g761                            4.0619     5.2519   0.773   0.4393
g2809                          -2.2775     4.7237  -0.482   0.6297
g3106                          -2.4431     3.8504  -0.635   0.5258
g4373                           1.2095     2.7312   0.443   0.6579
g4583                           1.0475     6.3020   0.166   0.8680
g761:g2809                    -11.8279    22.0052  -0.538   0.5909
g761:g3106                    -57.7909    35.6418  -1.621   0.1049
...... (omitted) ......
g761:g2809:g3106:g4373:g4583 -864.0858  2879.2579  -0.300   0.7641
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 78.708 on 86 degrees of freedom
Residual deviance: 56.600 on 55 degrees of freedom
Penalized deviance: 261.7148
AIC: 120.6

2) Classifiers with perfect separation and a too-small AIC (around 50): the coefficients do NOT converge, yet glm()'s prediction error on the training data itself is 0! brglm() is no better than glm() in this case.

Code and output of glm():

matrix2 <- read.table("http://ihome.ust.hk/~haitian/sample2.txt")
names(matrix2) <- c("g0","g28","g1334","g1871","g3639","g4295")
library(MASS)
fo2 <- as.formula(g0 ~ g28 * g1334 * g1871 * g3639 * g4295)
lr2 <- glm(fo2, family = binomial(link = logit), data = matrix2)
summary(lr2)

Deviance Residuals:
       Min         1Q     Median         3Q        Max
-4.527e-05 -2.107e-08 -2.107e-08  2.107e-08  5.802e-05

Coefficients:
                              Estimate Std. Error   z value Pr(>|z|)
(Intercept)                  6.028e+01  1.006e+07  5.99e-06        1
g28                          4.569e+04  8.566e+07     0.001        1
g1334                        1.733e+04  3.568e+07  4.86e-04        1
g1871                        2.917e+02  7.194e+06  4.05e-05        1
g3639                        1.936e+02  1.159e+08  1.67e-06        1
g4295                       -3.642e+02  8.580e+06 -4.24e-05        1
g28:g1334                    2.643e+05  3.732e+08     0.001        1
.... (omitted) ....
g28:g1334:g1871:g3639:g4295 -1.084e+06  2.209e+09 -4.91e-04        1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.2032e+02 on 86 degrees of freedom
Residual deviance: 1.8272e-08 on 55 degrees of freedom
AIC: 64

Number of Fisher Scoring iterations: 25
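The boundary fit in case (2) can be checked directly from the fitted probabilities; a minimal sketch in base R, assuming the lr2 object from above (the 1e-6 tolerance is an arbitrary choice):

## With (quasi-)separation, Fisher scoring pushes the fitted
## probabilities to machine 0 or 1; their range makes that visible.
range(fitted(lr2))
## Count observations fitted within 1e-6 of the {0, 1} boundary:
sum(fitted(lr2) < 1e-6 | fitted(lr2) > 1 - 1e-6)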
Now another question arises: if a perfect separating plane exists for (2), then a support vector machine should do a perfect job. But right now that is not the case: when I tried svm(), the error is over 0.3. Do you know if this is a tuning problem? Does svm() automatically consider all the interaction terms? (I actually tried entering the interaction terms manually, and the result was still the same.)

(cont'd; needs package e1071)

library(e1071)
library(rpart)
train.x <- matrix2[, -1]
train.y <- matrix2[, 1]
## Grid-search gamma and cost. Note: per the e1071 docs, validation.x
## and validation.y are used only with tune.control(sampling = "fix");
## otherwise tune() cross-validates on the training data.
svm.tune <- tune(svm, train.x, train.y,
                 validation.x = train.x, validation.y = train.y,
                 ranges = list(gamma = 2^(-10:5), cost = 2^(-10:4)))
Cost  <- svm.tune$best.parameters$cost
Gamma <- svm.tune$best.parameters$gamma
## svm()'s argument is lower-case 'gamma'; a 'Gamma=' argument would be
## silently ignored and the default used instead.
svm.model <- svm(x = train.x, y = train.y, kernel = "polynomial",
                 cost = Cost, gamma = Gamma,
                 na.action = na.fail, probability = TRUE)
svm.pre <- predict(svm.model, train.x, probability = TRUE)
## train.y is numeric 0/1, so svm() runs regression; threshold at 0.5:
vote <- ifelse(svm.pre > 0.5, 1, 0)
err.indicator <- ifelse(vote == train.y, 0, 1)
error <- sum(err.indicator) / length(train.y)
error

I'm really sorry for such a long mail! And for my limited knowledge, too! Would you please advise whether there is a better way of tuning svm(), or what I should do to obtain reasonable coefficients for case (2)? Thank you so much!!

Best Regards,
Maggie
-----------------------------------
Haitian Wang
PhD Student in Statistics
ISOM Department, HKUST, Hong Kong

On Fri, Mar 20, 2009 at 4:44 PM, Gavin Simpson <gavin.simpson@ucl.ac.uk> wrote:
> On Fri, 2009-03-20 at 12:39 +1100, Gad Abraham wrote:
> > Maggie Wang wrote:
> > > Hi, Dieter, Gad, and all,
> > >
> > > Thank you very much for your reply!
> > >
> > > So here is my data, you can copy it into a file named "sample.txt"
> >
> > Hi Maggie,
> >
> > With this data (allowing for more iterations) I get:
> >
> > > lr <- glm(fo, family=binomial(link=logit), data=matrix,
> > >           control=glm.control(maxit=100))
> > Warning message:
> > In glm.fit(x = X, y = Y, weights = weights, start = start,
> >   etastart = etastart, :
> >   fitted probabilities numerically 0 or 1 occurred
> >
> > which indicates, as Thomas has said, perfect separation, which occurs
> > because you're trying to fit too many variables with not enough data.
>
> It is worth mentioning that, in and of itself, that warning does not
> necessarily indicate a separation issue - something I was unsure about
> recently. You can get that warning (and I did for several data sets in a
> recent problem I enquired on the list about) where the fitted values
> really do become numerically 0 or 1 without separation.
>
> For example, see this response to my original question on the list:
>
> http://article.gmane.org/gmane.comp.lang.r.general/134472/
>
> There Ioannis Kosmidis presents a number of ways to investigate the
> results of a logit model fit for such issues.
>
> G
>
> > Cheers,
> > Gad
>
> --
> Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
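Returning to the svm() question above: if the separating plane for (2) is linear in the same interaction-expanded features that glm() used, a large-cost linear SVM on those features should reach (near-)zero training error. A minimal sketch, assuming matrix2 as loaded above (the model.matrix expansion and the cost value are illustrative choices, not from the original post):

library(e1071)
## Build the same interaction basis the logistic model used, dropping
## the intercept column, and fit a near-hard-margin linear SVM.
X <- model.matrix(~ g28 * g1334 * g1871 * g3639 * g4295, data = matrix2)[, -1]
y <- as.factor(matrix2$g0)   # factor response => C-classification
svm.lin <- svm(x = X, y = y, kernel = "linear", cost = 1e4, scale = TRUE)
## If the expanded data really are separable, training error should be ~0:
mean(as.character(predict(svm.lin, X)) != as.character(y))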
ratna ghosal
2009-Mar-26 08:49 UTC
[R] Extreme AIC in glm(), perfect separation, svm() tuning
Hi all,

I am very new to R and thus have some basic doubts regarding the output of the analysis. I cannot understand how the degrees of freedom get calculated in a mixed-effects model. Suppose I use the following command:

mod1 <- lme(D. ~ S., data = dat, random = ~ S. | Female)

Then the output is as follows:

            StdDev        Corr
(Intercept) 1.331293e-05  (Intr)
Ser.scale   2.925450e-05  0
Residual    8.782149e-01

Fixed effects: D. ~ S.
                Value Std.Error DF  t-value p-value
(Intercept) 0.0000000     0.142 31 0.000000  1.0000
S.          0.3637731     0.155 31 2.343175  0.0257
 Correlation:
          (Intr)
Ser.scale 0

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-1.9275740 -0.7093811 -0.2111987  0.4978089  2.2598701

Number of Observations: 38
Number of Groups: 6

Now I cannot understand how the DF is calculated as 31. I only have Crawley with me, and it does not explain the calculation of DF very elaborately. Hoping to get some insights regarding this.

Thanks,
Ratna

Ratna Ghosal
Research Scholar
Centre for Ecological Sciences
Indian Institute of Science
Bangalore-12
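For reference, the 31 appears to come from nlme's "containment" rule for denominator degrees of freedom (Pinheiro & Bates 2000, sec. 2.4.2): observations minus groups minus the number of non-intercept fixed-effect terms that vary within groups. A minimal sketch of that bookkeeping, using the counts from the output above (the variable names are illustrative):

## Containment rule: denDF = N_obs - N_groups - N_within_terms
n.obs    <- 38  # "Number of Observations" in the lme output
n.groups <- 6   # "Number of Groups" (levels of Female)
p.within <- 1   # one non-intercept term varying within groups (S.)
n.obs - n.groups - p.within  # 31, matching the DF column above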