Maggie Wang
2009-Mar-26 07:57 UTC
[R] Extreme AIC in glm(), perfect separation, svm() tuning
Dear List,
With regard to the question I raised previously, here are the results I
have obtained so far. brglm() does help, but there are two situations:
1) Classifiers with an extremely high AIC (over 200), no perfect separation,
and coefficients that converge. In this case brglm() does help: it stabilizes
the AIC, and the classification power is better.
Code and output (requires the brglm package):
matrix <- read.table("http://ihome.ust.hk/~haitian/sample.txt")
names(matrix)<-
c("g0","g761","g2809","g3106","g4373","g4583")
fo <- as.formula(g0 ~ g761 * g2809 * g3106 * g4373 * g4583)
library(MASS)
library(brglm)
lr <- brglm(formula= fo, family=binomial(link=logit), data=matrix)
summary(lr)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.2829 0.8281 1.549 0.1214
g761 4.0619 5.2519 0.773 0.4393
g2809 -2.2775 4.7237 -0.482 0.6297
g3106 -2.4431 3.8504 -0.635 0.5258
g4373 1.2095 2.7312 0.443 0.6579
g4583 1.0475 6.3020 0.166 0.8680
g761:g2809 -11.8279 22.0052 -0.538 0.5909
g761:g3106 -57.7909 35.6418 -1.621 0.1049
...... (omitted)......
g761:g2809:g3106:g4373:g4583 -864.0858 2879.2579 -0.300 0.7641
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 78.708 on 86 degrees of freedom
Residual deviance: 56.600 on 55 degrees of freedom
Penalized deviance: 261.7148
AIC: 120.6
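As a sanity check on the reported AIC (my own arithmetic, not part of the
summary output), the pieces above are consistent: 86 null degrees of freedom
means 87 observations, so 87 - 55 = 32 coefficients were estimated, and for a
binomial fit AIC = residual deviance + 2 * (number of coefficients):

56.600 + 2 * (87 - 55)   # = 120.6, matching the AIC reported above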
2) Classifiers with perfect separation and an implausibly small AIC (around
50); the coefficients do not converge. The resubstitution prediction error
from glm() is 0! brglm() is no better than glm() in this case. Code and
output of glm():
matrix2 <- read.table("http://ihome.ust.hk/~haitian/sample2.txt")
names(matrix2)<-
c("g0","g28","g1334","g1871","g3639","g4295")
library(MASS)
fo2 <- as.formula(g0 ~ g28 * g1334 * g1871 * g3639 * g4295)
lr2 <- glm(fo2, family=binomial(link=logit), data=matrix2)
summary(lr2)
Deviance Residuals:
Min 1Q Median 3Q Max
-4.527e-05 -2.107e-08 -2.107e-08 2.107e-08 5.802e-05
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 6.028e+01 1.006e+07 5.99e-06 1
g28 4.569e+04 8.566e+07 0.001 1
g1334 1.733e+04 3.568e+07 4.86e-04 1
g1871 2.917e+02 7.194e+06 4.05e-05 1
g3639 1.936e+02 1.159e+08 1.67e-06 1
g4295 -3.642e+02 8.580e+06 -4.24e-05 1
g28:g1334 2.643e+05 3.732e+08 0.001 1
....(omitted) ....
g28:g1334:g1871:g3639:g4295 -1.084e+06 2.209e+09 -4.91e-04 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1.2032e+02 on 86 degrees of freedom
Residual deviance: 1.8272e-08 on 55 degrees of freedom
AIC: 64
Number of Fisher Scoring iterations: 25
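Two quick checks along these lines (a sketch of my own, reusing the lr2 fit
above) for whether this is genuine separation rather than merely fitted
values that happen to be numerically 0 or 1:

range(fitted(lr2))    # probabilities pinned at numerical 0 and 1
max(abs(coef(lr2)))   # enormous estimates, with even larger standard errors

# Under true separation the estimates keep growing as the iteration limit
# rises; a legitimately converged fit would stabilise instead.
sapply(c(10, 25, 50, 100), function(it)
  max(abs(coef(glm(fo2, family = binomial(link = logit), data = matrix2,
                   control = glm.control(maxit = it))))))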
Now another question arises: if a perfectly separating plane exists for (2),
then a support vector machine should do a perfect job. But right now that is
not the case: when I tried svm(), the error was over 0.3. Do you know if
this is a tuning problem?
Does svm() automatically consider all the interaction terms? (I actually
tried to input the interaction terms manually, and the result was still the
same.)
(Continued; requires the e1071 package:)
library(e1071)
library(rpart)
train.x <- matrix2[, -1]
train.y <- matrix2[, 1]
# Note: validation.x/validation.y are only honoured when
# tune.control(sampling = "fix") is given; by default tune() cross-validates.
svm.tune <- tune(svm, train.x, train.y,
                 validation.x = train.x,
                 validation.y = train.y,   # was "validation.ytrain.y": the "=" was missing
                 ranges = list(gamma = 2^(-10:5), cost = 2^(-10:4)))
Cost  <- svm.tune$best.parameters$cost
Gamma <- svm.tune$best.parameters$gamma
# Note: tune() above used svm()'s default radial kernel, so the tuned values
# may not carry over well to a polynomial kernel.
svm.model <- svm(x = train.x, y = train.y, kernel = "polynomial",
                 cost = Cost, gamma = Gamma,   # the argument is "gamma", lower case
                 na.action = na.fail, probability = TRUE)
svm.pre <- predict(svm.model, train.x, probability = TRUE)
vote <- ifelse(svm.pre > 0.5, 1, 0)   # numeric y means svm() ran eps-regression
err.indicator <- ifelse(vote == train.y, 0, 1)
error <- sum(err.indicator) / length(train.y)
error
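For completeness, here is a sketch of two things that may explain the 0.3
error (my own illustration, reusing matrix2 and fo2 from above; the variable
names X.int, y.fac, and m are hypothetical): with a numeric y, svm() silently
performs eps-regression rather than classification, and svm() does not expand
interaction terms from a plain x matrix, although model.matrix() can build
them explicitly.

library(e1071)
X.int <- model.matrix(fo2, data = matrix2)[, -1]  # all interaction columns, intercept dropped
y.fac <- factor(matrix2$g0)                       # factor response => C-classification
m <- svm(x = X.int, y = y.fac, kernel = "linear", cost = 100)
mean(predict(m, X.int) != y.fac)  # training error; should approach 0 if the
                                  # expanded data really are linearly separable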
I'm really sorry for such a long mail, and for my limited knowledge too!
Could you please advise whether there is a better way of tuning svm(), or
what I should do to obtain reasonable coefficients for case (2)? Thank you
so much!
Best Regards,
Maggie
-----------------------------------
Haitian Wang
PhD Student in Statistics
ISOM Department, HKUST, Hong Kong
On Fri, Mar 20, 2009 at 4:44 PM, Gavin Simpson
<gavin.simpson@ucl.ac.uk> wrote:
> On Fri, 2009-03-20 at 12:39 +1100, Gad Abraham wrote:
> > Maggie Wang wrote:
> > > Hi, Dieter, Gad, and all,
> > >
> > > Thank you very much for your reply!
> > >
> > > So here is my data; you can copy it into a file named "sample.txt".
> >
> > Hi Maggie,
> >
> > With this data (allowing for more iterations) I get:
> >
> > > lr <- glm(fo, family=binomial(link=logit), data=matrix,
> > control=glm.control(maxit=100))
> > Warning message:
> > In glm.fit(x = X, y = Y, weights = weights, start = start,
> > etastart = etastart, :
> > fitted probabilities numerically 0 or 1 occurred
> >
> > which indicates, as Thomas has said, perfect separation, which occurs
> > because you're trying to fit too many variables with not enough data.
>
> It is worth mentioning that, in and of itself, that warning does not
> necessarily indicate a separation issue - something I was unsure about
> recently. You can get that warning (and I did for several data sets in a
> recent problem I enquired on the list about) where the fitted values
> really do become numerically 0 or 1 without separation.
>
> For example, see this response to my original question on the list:
>
> http://article.gmane.org/gmane.comp.lang.r.general/134472/
>
> There Ioannis Kosmidis presents a number of ways to investigate the
> results of a logit model fit for such issues.
>
> G
>
> >
> > Cheers,
> > Gad
> >
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Dr. Gavin Simpson [t] +44 (0)20 7679 0522
> ECRC, UCL Geography, [f] +44 (0)20 7679 0565
> Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
>
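A minimal check along those lines (a sketch of my own, reusing the lr2 fit
from above; not one of Kosmidis' methods specifically) is to ask whether the
fitted linear predictor separates the two classes exactly, which is what
makes the maximum likelihood estimates diverge:

eta <- predict(lr2, type = "link")
table(sign(eta), matrix2$g0)   # zeros off the diagonal => perfect separation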
ratna ghosal
2009-Mar-26 08:49 UTC
[R] Extreme AIC in glm(), perfect separation, svm() tuning
Hi all,
I am very new to R, and so have a basic question about the output of an
analysis: I cannot understand how the degrees of freedom are calculated in a
mixed-effects model. Suppose I use the following command:
mod1 <- lme(D. ~ S., data = dat, random = ~ S. | Female)
Then the output is as follows:
Random effects:
             StdDev        Corr
(Intercept)  1.331293e-05  (Intr)
Ser.scale    2.925450e-05  0
Residual     8.782149e-01

Fixed effects: D. ~ S.
                 Value  Std.Error  DF   t-value  p-value
(Intercept)  0.0000000      0.142  31  0.000000   1.0000
S.           0.3637731      0.155  31  2.343175   0.0257
 Correlation:
           (Intr)
Ser.scale  0

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-1.9275740 -0.7093811 -0.2111987  0.4978089  2.2598701

Number of Observations: 38
Number of Groups: 6
Now I cannot understand how the DF comes out as 31. I only have Crawley's
book with me, and it does not explain the calculation of the DF in much
detail. Hoping to get some insights on this.
Thanks
Ratna
Ratna Ghosal
Research Scholar
Centre for Ecological Sciences
Indian Institute of Science
Bangalore-12
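For reference, the arithmetic behind DF = 31 under nlme's "containment" rule
(a sketch of my reading of Pinheiro & Bates 2000, Sec. 2.4.2, assuming S.
varies within Female): a fixed term estimated at the observation level gets
DF = (number of observations) - (number of groups) - (number of fixed terms
varying within groups), and the intercept is counted at the innermost level
too, which is why both rows of the table show the same DF.

n.obs    <- 38   # Number of Observations in the output above
n.groups <- 6    # Number of Groups
p.within <- 1    # fixed terms varying within groups: S.
n.obs - n.groups - p.within   # = 31, matching the DF column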