Maggie Wang
2009-Mar-26 07:57 UTC
[R] Extreme AIC in glm(), perfect separation, svm() tuning
Dear List,

With regard to the question I previously raised, here is the result I have obtained so far. brglm() does help, but there are two situations:

1) Classifiers with extremely high AIC (over 200), no perfect separation, and converging coefficients. In this case, using brglm() does help! It stabilizes the AIC, and the classification power is better.

Code and output (needs package brglm):

matrix <- read.table("http://ihome.ust.hk/~haitian/sample.txt")  # note: 'matrix' shadows base::matrix
names(matrix) <- c("g0","g761","g2809","g3106","g4373","g4583")
fo <- as.formula(g0 ~ g761 * g2809 * g3106 * g4373 * g4583)
library(MASS)
library(brglm)
lr <- brglm(formula = fo, family = binomial(link = logit), data = matrix)
summary(lr)

Coefficients:
                              Estimate Std. Error z value Pr(>|z|)
(Intercept)                     1.2829     0.8281   1.549   0.1214
g761                            4.0619     5.2519   0.773   0.4393
g2809                          -2.2775     4.7237  -0.482   0.6297
g3106                          -2.4431     3.8504  -0.635   0.5258
g4373                           1.2095     2.7312   0.443   0.6579
g4583                           1.0475     6.3020   0.166   0.8680
g761:g2809                    -11.8279    22.0052  -0.538   0.5909
g761:g3106                    -57.7909    35.6418  -1.621   0.1049
...... (omitted) ......
g761:g2809:g3106:g4373:g4583 -864.0858  2879.2579  -0.300   0.7641
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 78.708 on 86 degrees of freedom
Residual deviance: 56.600 on 55 degrees of freedom
Penalized deviance: 261.7148
AIC: 120.6

2) Classifiers with perfect separation and a too-small AIC (around 50): the coefficients do NOT converge, yet glm()'s prediction error on the training data itself is 0! brglm() is no better than glm() in this case.

Code and output of glm():

matrix2 <- read.table("http://ihome.ust.hk/~haitian/sample2.txt")
names(matrix2) <- c("g0","g28","g1334","g1871","g3639","g4295")
library(MASS)
fo2 <- as.formula(g0 ~ g28 * g1334 * g1871 * g3639 * g4295)
lr2 <- glm(fo2, family = binomial(link = logit), data = matrix2)
summary(lr2)

Deviance Residuals:
       Min         1Q     Median         3Q        Max
-4.527e-05 -2.107e-08 -2.107e-08  2.107e-08  5.802e-05

Coefficients:
                              Estimate Std. Error   z value Pr(>|z|)
(Intercept)                  6.028e+01  1.006e+07  5.99e-06        1
g28                          4.569e+04  8.566e+07     0.001        1
g1334                        1.733e+04  3.568e+07  4.86e-04        1
g1871                        2.917e+02  7.194e+06  4.05e-05        1
g3639                        1.936e+02  1.159e+08  1.67e-06        1
g4295                       -3.642e+02  8.580e+06 -4.24e-05        1
g28:g1334                    2.643e+05  3.732e+08     0.001        1
.... (omitted) ....
g28:g1334:g1871:g3639:g4295 -1.084e+06  2.209e+09 -4.91e-04        1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.2032e+02 on 86 degrees of freedom
Residual deviance: 1.8272e-08 on 55 degrees of freedom
AIC: 64

Number of Fisher Scoring iterations: 25
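The boundary fit in case (2) can be checked directly from the fitted probabilities; a minimal sketch in base R, assuming the lr2 object from above (the 1e-6 tolerance is an arbitrary choice):

## With (quasi-)separation, Fisher scoring pushes the fitted
## probabilities to machine 0 or 1; their range makes that visible.
range(fitted(lr2))
## Count observations fitted within 1e-6 of the {0, 1} boundary:
sum(fitted(lr2) < 1e-6 | fitted(lr2) > 1 - 1e-6)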
Now another question arises: if a perfect separating plane exists for (2), then a support vector machine should do a perfect job. But right now that is not the case: when I tried svm(), the error is over 0.3. Do you know if this is a tuning problem? Does svm() automatically consider all the interaction terms? (I actually tried entering the interaction terms manually, and the result was still the same.)

(cont'd; needs package e1071)

library(e1071)
library(rpart)
train.x <- matrix2[, -1]
train.y <- matrix2[, 1]
## Grid-search gamma and cost. Note: per the e1071 docs, validation.x
## and validation.y are used only with tune.control(sampling = "fix");
## otherwise tune() cross-validates on the training data.
svm.tune <- tune(svm, train.x, train.y,
                 validation.x = train.x, validation.y = train.y,
                 ranges = list(gamma = 2^(-10:5), cost = 2^(-10:4)))
Cost  <- svm.tune$best.parameters$cost
Gamma <- svm.tune$best.parameters$gamma
## svm()'s argument is lower-case 'gamma'; a 'Gamma=' argument would be
## silently ignored and the default used instead.
svm.model <- svm(x = train.x, y = train.y, kernel = "polynomial",
                 cost = Cost, gamma = Gamma,
                 na.action = na.fail, probability = TRUE)
svm.pre <- predict(svm.model, train.x, probability = TRUE)
## train.y is numeric 0/1, so svm() runs regression; threshold at 0.5:
vote <- ifelse(svm.pre > 0.5, 1, 0)
err.indicator <- ifelse(vote == train.y, 0, 1)
error <- sum(err.indicator) / length(train.y)
error

I'm really sorry for such a long mail! And for my limited knowledge, too! Would you please advise whether there is a better way of tuning svm(), or what I should do to obtain reasonable coefficients for case (2)? Thank you so much!!

Best Regards,
Maggie
-----------------------------------
Haitian Wang
PhD Student in Statistics
ISOM Department, HKUST, Hong Kong

On Fri, Mar 20, 2009 at 4:44 PM, Gavin Simpson <gavin.simpson@ucl.ac.uk> wrote:
> On Fri, 2009-03-20 at 12:39 +1100, Gad Abraham wrote:
> > Maggie Wang wrote:
> > > Hi, Dieter, Gad, and all,
> > >
> > > Thank you very much for your reply!
> > >
> > > So here is my data, you can copy it into a file named "sample.txt"
> >
> > Hi Maggie,
> >
> > With this data (allowing for more iterations) I get:
> >
> > > lr <- glm(fo, family=binomial(link=logit), data=matrix,
> > >           control=glm.control(maxit=100))
> > Warning message:
> > In glm.fit(x = X, y = Y, weights = weights, start = start,
> >   etastart = etastart, :
> >   fitted probabilities numerically 0 or 1 occurred
> >
> > which indicates, as Thomas has said, perfect separation, which occurs
> > because you're trying to fit too many variables with not enough data.
>
> It is worth mentioning that, in and of itself, that warning does not
> necessarily indicate a separation issue - something I was unsure about
> recently. You can get that warning (and I did for several data sets in a
> recent problem I enquired on the list about) where the fitted values
> really do become numerically 0 or 1 without separation.
>
> For example, see this response to my original question on the list:
>
> http://article.gmane.org/gmane.comp.lang.r.general/134472/
>
> There Ioannis Kosmidis presents a number of ways to investigate the
> results of a logit model fit for such issues.
>
> G
>
> > Cheers,
> > Gad
>
> --
> Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
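Returning to the svm() question above: if the separating plane for (2) is linear in the same interaction-expanded features that glm() used, a large-cost linear SVM on those features should reach (near-)zero training error. A minimal sketch, assuming matrix2 as loaded above (the model.matrix expansion and the cost value are illustrative choices, not from the original post):

library(e1071)
## Build the same interaction basis the logistic model used, dropping
## the intercept column, and fit a near-hard-margin linear SVM.
X <- model.matrix(~ g28 * g1334 * g1871 * g3639 * g4295, data = matrix2)[, -1]
y <- as.factor(matrix2$g0)   # factor response => C-classification
svm.lin <- svm(x = X, y = y, kernel = "linear", cost = 1e4, scale = TRUE)
## If the expanded data really are separable, training error should be ~0:
mean(as.character(predict(svm.lin, X)) != as.character(y))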
ratna ghosal
2009-Mar-26 08:49 UTC
[R] Extreme AIC in glm(), perfect separation, svm() tuning
Hi all,

I am very new to R and thus have some basic doubts regarding the output of the analysis. I cannot understand how the degrees of freedom get calculated in a mixed-effects model. Suppose I use the following command:

mod1 <- lme(D. ~ S., data = dat, random = ~ S. | Female)

Then the output is as follows:

            StdDev        Corr
(Intercept) 1.331293e-05  (Intr)
Ser.scale   2.925450e-05  0
Residual    8.782149e-01

Fixed effects: D. ~ S.
                Value Std.Error DF  t-value p-value
(Intercept) 0.0000000     0.142 31 0.000000  1.0000
S.          0.3637731     0.155 31 2.343175  0.0257
 Correlation:
          (Intr)
Ser.scale 0

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-1.9275740 -0.7093811 -0.2111987  0.4978089  2.2598701

Number of Observations: 38
Number of Groups: 6

Now I cannot understand how the DF is calculated as 31. I only have Crawley with me, and it does not explain the calculation of DF very elaborately. Hoping to get some insights regarding this.

Thanks,
Ratna

Ratna Ghosal
Research Scholar
Centre for Ecological Sciences
Indian Institute of Science
Bangalore-12
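For reference, the 31 appears to come from nlme's "containment" rule for denominator degrees of freedom (Pinheiro & Bates 2000, sec. 2.4.2): observations minus groups minus the number of non-intercept fixed-effect terms that vary within groups. A minimal sketch of that bookkeeping, using the counts from the output above (the variable names are illustrative):

## Containment rule: denDF = N_obs - N_groups - N_within_terms
n.obs    <- 38  # "Number of Observations" in the lme output
n.groups <- 6   # "Number of Groups" (levels of Female)
p.within <- 1   # one non-intercept term varying within groups (S.)
n.obs - n.groups - p.within  # 31, matching the DF column above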