R1.2.0 with Linux RH5.2

I do not believe that the problems below are new to 1.2, but I only
cover this sort of thing once a year in my course, and some of that
happened to be last Friday, so it was too late to report for 1.2. I see
that one or two things that I was going to report have been corrected.
I like the fact that interactions now show : instead of .

Here is some output with comments inserted.

R : Copyright 2000, The R Development Core Team
Version 1.2.0 (2000-12-15)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type `license()' or `licence()' for distribution details.

R is a collaborative project with many contributors.
Type `contributors()' for more information.

Type `demo()' for some demos, `help()' for on-line help, or
`help.start()' for a HTML browser interface to help.
Type `q()' to quit R.

Notice the contrasts set below:

> options(contrasts=c("contr.treatment","contr.poly"))
> data <- read.table("tmp.dat",skip=6,head=T)
> data
   Freq Reg66 Reg71
1   118    CC    CC
2    14    LY    CC
3     8    WM    CC
4    12    GL    CC
5    12    CC    LY
6  2127    LY    LY
7    69    WM    LY
8   110    GL    LY
9     7    CC    WM
10   86    LY    WM
11 2548    WM    WM
12   88    GL    WM
13   23    CC    GL
14  130    LY    GL
15  107    WM    GL
16 7712    GL    GL
> attach(data)
> wt <- codes(Reg66)!=codes(Reg71)
> print(z <- glm(Freq~Reg66+Reg71,family=poisson))

Call:  glm(formula = Freq ~ Reg66 + Reg71, family = poisson)

Coefficients:
(Intercept)      Reg66GL      Reg66LY      Reg66WM      Reg71GL      Reg71LY
     0.6134       3.9022       2.6900       2.8376       3.9597       2.7245
    Reg71WM
     2.8877

Degrees of Freedom: 15 Total (i.e. Null);  9 Residual
Null Deviance:      40740
Residual Deviance: 19880        AIC: 20000

> print(z2 <- glm(Freq~Reg66+Reg71,family=poisson,weight=wt))

Call:  glm(formula = Freq ~ Reg66 + Reg71, family = poisson, weights = wt)

Coefficients:
(Intercept)      Reg66GL      Reg66LY      Reg66WM      Reg71GL      Reg71LY
     0.4615       2.1242       2.0096       1.7236       2.4555       2.0847
    Reg71WM
     1.9140

Degrees of Freedom: 11 Total (i.e. Null);  5 Residual
Null Deviance:      486.2
Residual Deviance: 4.367        AIC: 82.7

> z2$fit
          1           2           3           4           5           6
 118.100000   11.836037    8.891541   13.272421   12.758205 2127.100000
          7           8           9          10          11          12
  71.505458  106.736339   10.756661   80.252100 2548.100000   89.991240
         13          14          15          16
  18.485138  137.911862  103.603001 7712.100000

First problem: with diagonal values weighted out in a transition matrix
(mover-stayer model), glm wrongly estimates the diagonals to be the
observed values plus 0.1. This was correct in 90.1 a year ago but was
already wrong in 1.0.1. I do not have any versions in between installed.

> # correct values from GLIM4
> #  unit   observed     fitted   residual
> #   (1)        118      1.586      0.000
> #    2          14     11.836      0.629
> #    3           8      8.892     -0.299
> #    4          12     13.272     -0.349
> #    5          12     12.758     -0.212
> #   (6)       2127     95.185      0.000
> #    7          69     71.505     -0.296
> #    8         110    106.736      0.316
> #    9           7     10.757     -1.145
> #   10          86     80.252      0.642
> #  (11)       2548     60.287      0.000
> #   12          88     89.991     -0.210
> #   13          23     18.485      1.050
> #   14         130    137.912     -0.674
> #   15         107    103.603      0.334
> #  (16)       7712    154.648      0.000
> # and from R90.1 (wrong in R1.0.1)
> #> z2$fit
> #         1          2          3          4          5          6          7
> #  1.586454  11.836037   8.891541  13.272421  12.758205  95.184992  71.505458
> #         8          9         10         11         12         13         14
> #106.736339  10.756661  80.252100  60.287479  89.991240  18.485138 137.911862
> #        15         16
> #103.603001 154.648406
>

Second problem: the man page for factors states that the levels are
sorted, but gl does not do this. It keeps them in the order of the
labels. This is inconsistent with everything else, confusing for
students, and liable to induce errors in data analysis. In the example
below, it is not too important because the baseline category is the
same in the two cases, but that will not generally be the case.

As an addition comment, I would strongly recommend that, if labels can
be coerced to numeric, they be ordered, not sorted.
This is because the order is very confusing if there are more than 9
levels: for example, the sorted 10 comes near the beginning and not at
the end.

> reg66 <- gl(4,1,16,labels=c("CC","LY","WM","GL"))
> z

Call:  glm(formula = Freq ~ Reg66 + Reg71, family = poisson)

Coefficients:
(Intercept)      Reg66GL      Reg66LY      Reg66WM      Reg71GL      Reg71LY
     0.6134       3.9022       2.6900       2.8376       3.9597       2.7245
    Reg71WM
     2.8877

Degrees of Freedom: 15 Total (i.e. Null);  9 Residual
Null Deviance:      40740
Residual Deviance: 19880        AIC: 20000

> print(za <- glm(Freq~reg66+Reg71,family=poisson))

Call:  glm(formula = Freq ~ reg66 + Reg71, family = poisson)

Coefficients:
(Intercept)      reg66LY      reg66WM      reg66GL      Reg71GL      Reg71LY
     0.6134       2.6900       2.8376       3.9022       3.9597       2.7245
    Reg71WM
     2.8877

Degrees of Freedom: 15 Total (i.e. Null);  9 Residual
Null Deviance:      40740
Residual Deviance: 19880        AIC: 20000

Third problem: if mean value contrasts are used, the level names are
not attached to the variable name, but simply the number of the
category. This rightly drives students mad and is simply
uninterpretable if the levels have been sorted. It takes a great deal
of care to find out to what the numbers refer.

> options(contrasts=c("contr.sum","contr.poly"))
>
> z

Call:  glm(formula = Freq ~ Reg66 + Reg71, family = poisson)

Coefficients:
(Intercept)      Reg66GL      Reg66LY      Reg66WM      Reg71GL      Reg71LY
     0.6134       3.9022       2.6900       2.8376       3.9597       2.7245
    Reg71WM
     2.8877

Degrees of Freedom: 15 Total (i.e. Null);  9 Residual
Null Deviance:      40740
Residual Deviance: 19880        AIC: 20000

> print(zb <- glm(Freq~Reg66+Reg71,family=poisson))

Call:  glm(formula = Freq ~ Reg66 + Reg71, family = poisson)

Coefficients:
(Intercept)       Reg661       Reg662       Reg663       Reg711       Reg712
     5.3638      -2.3575       1.5448       0.3325      -2.3930       1.5667
     Reg713
     0.3315

Degrees of Freedom: 15 Total (i.e. Null);  9 Residual
Null Deviance:      40740
Residual Deviance: 19880        AIC: 20000
>

Additional comment: I recommend that the aic functions for the poisson
and binomial families have the explicit calculation of the log density
replaced by the corresponding d function with the log option. This will
avoid the production of NAs in certain cases of zeroes in tables.

  Jim

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
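[To make the second problem concrete, here is a minimal sketch, not
part of the original report: the default levels produced by factor()
are the sorted labels, while gl() keeps them in the order supplied.]

```r
# factor() sorts the labels to form the default levels;
# gl() keeps them in the order given.
labs <- c("CC", "LY", "WM", "GL")
f <- factor(rep(labs, 4))          # levels sorted: CC GL LY WM
g <- gl(4, 1, 16, labels = labs)   # levels as given: CC LY WM GL
levels(f)
levels(g)
```

[So the baseline (first) level differs between the two constructions,
which is the trap described above.]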
[We do have a TooMuchAtOnce category, but it is easier to cope with
short separate reports with informative subject lines for, e.g., the
BUGS list.]

On Mon, 18 Dec 2000 james.lindsey@luc.ac.be wrote:

> R1.2.0 with Linux RH5.2
>
> I do not believe that the problems below are new to 1.2 but I only
> cover this sort of thing once a year in my course and some of that
> happened to be last Friday so too late to report for 1.2. I see that
> one or two things that I was going to report have been corrected.
> I like the fact that interactions now show : instead of .
>
> Here is some output with comments inserted.

[...]

> > wt <- codes(Reg66)!=codes(Reg71)

Why not just wt <- Reg66 != Reg71 ?

[...]

> First problem: with diagonal values weighted out in a transition matrix
> (mover-stayer model), glm wrongly estimates the diagonals to be the
> observed plus 0.1. This was correct in 90.1 a year ago but already
> wrong in 1.0.1. I don't have any versions in between installed.

I think this is the same as

    [Rd] glm gives incorrect results for zero-weight cases (PR#780)

in which case it is already fixed in R-patched. That does look to have
solved this one too. Note that glm has *not* `estimated the diagonals'
at all. These are predictions (the cases are not in the model), and
predict.glm did get them right.

predict(z, data)
      1       2       3       4       5       6       7       8
0.61337 3.30334 3.45098 4.51559 3.33786 6.02783 6.17548 7.24009
      9      10      11      12      13      14      15      16
3.50109 6.19106 6.33871 7.40332 4.57309 7.26306 7.41071 8.47532

[...]

> Second problem: the man page for factors states that the levels are
> sorted, but gl does not do this. It keeps them in the order of the

There is a help page for factor(), not for factors, and I don't believe
that does state so. It says that *factor* sorts them by default, which
is true. Where is the `man page for factors'?

> labels. This is inconsistent with everything else, confusing for
> students, and liable to induce errors in data analysis. In the example
> below, it is not too important because the baseline category is the
> same in the two cases but that will not generally be the case.
>
> As an addition comment, I would strongly recommend that, if labels can
> be coerced to numeric, they be ordered, not sorted. This is because
> the order is very confusing if there are more than 9: for example, the
> sorted 10 comes near the beginning and not at the end.

You can do that, of course, by specifying the order of the levels.
However, we teach students that the details of the coding of linear
models are details, and that they should regard coefficients as
secondary and predictions as primary. aov() has the right idea in
suppressing the individual coefficients. If you want to interpret
coefficients you need to understand codings. Period.

I don't see the value of changing the default for factor: it would not
be backwards-compatible, and some people have thought thought what the
default would be and accepted it.

> > reg66 <- gl(4,1,16,labels=c("CC","LY","WM","GL"))
> > z
>
> Call:  glm(formula = Freq ~ Reg66 + Reg71, family = poisson)
>
> Coefficients:
> (Intercept)      Reg66GL      Reg66LY      Reg66WM      Reg71GL      Reg71LY
>      0.6134       3.9022       2.6900       2.8376       3.9597       2.7245
>     Reg71WM
>      2.8877
>
> Degrees of Freedom: 15 Total (i.e. Null);  9 Residual
> Null Deviance:      40740
> Residual Deviance: 19880        AIC: 20000
> > print(za <- glm(Freq~reg66+Reg71,family=poisson))
>
> Call:  glm(formula = Freq ~ reg66 + Reg71, family = poisson)
>
> Coefficients:
> (Intercept)      reg66LY      reg66WM      reg66GL      Reg71GL      Reg71LY
>      0.6134       2.6900       2.8376       3.9022       3.9597       2.7245
>     Reg71WM
>      2.8877
>
> Degrees of Freedom: 15 Total (i.e. Null);  9 Residual
> Null Deviance:      40740
> Residual Deviance: 19880        AIC: 20000

> Third problem: if mean value contrasts are used, the level names are
> not attached to the variable name, but simply the number of the
> category. This rightly drives students mad and is simply
> uninterpretable if the levels have been sorted. It takes a great deal
> of extreme care to try to find out to what the numbers refer.

Are we supposed to guess that `mean value contrasts' means contr.sum?
In which case the numbers refer to the number of the *contrast*: they
do not refer to levels, nor are single levels relevant. As in

> contr.sum(levels(Reg66))
   [,1] [,2] [,3]
CC    1    0    0
GL    0    1    0
LY    0    0    1
WM   -1   -1   -1
> contr.treatment(levels(Reg66))
   GL LY WM
CC  0  0  0
GL  1  0  0
LY  0  1  0
WM  0  0  1

Now, what should the column labels be in the first case? And in what
sense are these `mean value contrasts'? Do you really want labels like
"CCvWM"? They could get very cumbersome.

(One could argue that contr.treatment is already wrong, but as they
are not even contrasts ....)

[...]

> Additional comment: I recommend that the aic functions for poisson and
> binomial have the explicit calculation of the log density replaced by
> the corresponding d function with the log option. This will avoid the
> production of NAs in certain cases of zeroes in tables.

Good idea, but could you supply examples so we can put them in the
regression tests?

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
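[The "coefficients as secondary, predictions as primary" point can be
illustrated with a small sketch of my own, with toy data not from the
thread: the same one-factor Poisson fit under treatment and sum
contrasts gives different coefficients but identical fitted values.]

```r
# Toy one-factor Poisson fit: coefficients depend on the coding,
# fitted values do not.
y <- c(10, 12, 9, 30, 28, 31)
g <- gl(2, 3, labels = c("a", "b"))
fit.tr  <- glm(y ~ g, family = poisson,
               contrasts = list(g = "contr.treatment"))
fit.sum <- glm(y ~ g, family = poisson,
               contrasts = list(g = "contr.sum"))
coef(fit.tr)                                 # (Intercept), gb
coef(fit.sum)                                # (Intercept), g1
all.equal(fitted(fit.tr), fitted(fit.sum))   # TRUE
```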
> [We do have a TooMuchAtOnce category, but it is easier to cope with
> short separate reports with informative subject lines for, e.g. the
> BUGS list.]
>
> On Mon, 18 Dec 2000 james.lindsey@luc.ac.be wrote:
>
> > R1.2.0 with Linux RH5.2
> >
> > I do not believe that the problems below are new to 1.2 but I only
> > cover this sort of thing once a year in my course and some of that
> > happened to be last Friday so too late to report for 1.2. I see that
> > one or two things that I was going to report have been corrected.
> > I like the fact that interactions now show : instead of .
> >
> > Here is some output with comments inserted.
>
> [...]
>
> > > wt <- codes(Reg66)!=codes(Reg71)
>
> Why not just wt <- Reg66 != Reg71 ?

Simple: I do not trust R factor levels (among others, for reasons
outlined below). For any serious work involving factor variables, I
use GLIM4.

> [...]
>
> > First problem: with diagonal values weighted out in a transition matrix
> > (mover-stayer model), glm wrongly estimates the diagonals to be the
> > observed plus 0.1. This was correct in 90.1 a year ago but already
> > wrong in 1.0.1. I don't have any versions in between installed.
>
> I think this is the same as
>
>     [Rd] glm gives incorrect results for zero-weight cases (PR#780)

Good. That patch was posted after I submitted my bug report.

> in which case it is already fixed in R-patched. That does look to have
> solved this one too. Note that glm has *not* `estimated the diagonals'
> at all. These are predictions (the cases are not in the model), and
> predict.glm did get them right.

It *has* estimated the diagonals under the fitted mover-stayer model
(that is, if it were doing it right). They are estimates of certain
parameters in this model. (Another way to fit the same model is to use
a 4-level factor variable on the diagonal instead of weighting it out,
but that does not supply the desired estimates.) R incorrectly calls
them fitted values in its objects and printouts.
GLIM4 clearly prints them out in a distinct way so that there is no
confusion.

> predict(z, data)
>       1       2       3       4       5       6       7       8
> 0.61337 3.30334 3.45098 4.51559 3.33786 6.02783 6.17548 7.24009
>       9      10      11      12      13      14      15      16
> 3.50109 6.19106 6.33871 7.40332 4.57309 7.26306 7.41071 8.47532
>
> [...]
>
> > Second problem: the man page for factors states that the levels are
> > sorted, but gl does not do this. It keeps them in the order of the
>
> There is a help page for factor(), not for factors, and I don't believe
> that does state so. It says that *factor* sorts them by default, which is
> true. Where is the `man page for factors'?

Sorry. Typo. Here is what the docs say:

    The levels of a factor are by default sorted, but the sort order
    may well depend on the locale at the time of creation, and should
    not be assumed to be ASCII.

Because this is the only documentation on factor variables and it does
not mention the factor function explicitly, I take it to be a general
statement.

> > labels. This is inconsistent with everything else, confusing for
> > students, and liable to induce errors in data analysis. In the example
> > below, it is not too important because the baseline category is the
> > same in the two cases but that will not generally be the case.
> >
> > As an addition comment, I would strongly recommend that, if labels can
> > be coerced to numeric, they be ordered, not sorted. This is because
> > the order is very confusing if there are more than 9: for example, the
> > sorted 10 comes near the beginning and not at the end.
>
> You can do that, of course, by specifying the order of the levels.
> However, we teach students that the details of the coding of linear models
> are details, and that they should regard coefficients as secondary and
> predictions as primary. aov() has the right idea in suppressing the
> individual coefficients. If you want to interpret coefficients you need to
> understand codings. Period.

I do not see what aov has to do with it when discussing glm's. It is
for lm and, as far as I know, produces the old-fashioned material for
tests that more and more scientific journals are rightly refusing.

In, say, a standard high-dimensional contingency table (or a factorial
design), the first thing that scientists and industry people want to
know is which levels of the factor variables yield scientifically
important effects. Only then do they want measures of statistical
significance. What are required are first the appropriate coefficients
and then the corresponding intervals.

The idea about predictions may be theoretically fine, at least in
simple models, but it is practically useless in realistically complex
cases involving multidimensional contingency tables (or multifactorial
designs, from which factor variables take their name, for that
matter), where one wants to extract the relative effects of the
various factor levels and interactions. The predictions are virtually
useless for disentangling such effects.

I teach that all the different constraint sets are equivalent because
they give the same predictions, so one should choose the set that is
most appropriate or most easily interpretable for the problem at hand.
If one level of a factor is of particular interest, use baseline
constraints (so-called treatment contrasts); if the levels are rather
symmetric, use mean value constraints (so-called sum contrasts). In
other cases, special contrasts need to be set up. Contrasts are a
property of a variable, not of a design, so it would be nice to have
an automatic way of setting them for each factor variable rather than
a global contrasts option. But I know of no software that allows this.

R makes it extremely difficult to understand codings. They are very
obscure and easily lead to errors of interpretation. Everything seems
to be done to discourage their use (as opposed simply to looking at
tests) when their correct interpretation is essential.
That is my whole point.

> I don't see the value of changing the default for factor: it would not be
> backwards-compatible, and some people have thought thought what the default
                                             ^^^^^^^^^^^^^^^

Does this mean that they thought twice about it, or is it a typo like
my "factors" above?

> would be and accepted it.
>
> > > reg66 <- gl(4,1,16,labels=c("CC","LY","WM","GL"))
> > > z
> >
> > Call:  glm(formula = Freq ~ Reg66 + Reg71, family = poisson)
> >
> > Coefficients:
> > (Intercept)      Reg66GL      Reg66LY      Reg66WM      Reg71GL      Reg71LY
> >      0.6134       3.9022       2.6900       2.8376       3.9597       2.7245
> >     Reg71WM
> >      2.8877
> >
> > Degrees of Freedom: 15 Total (i.e. Null);  9 Residual
> > Null Deviance:      40740
> > Residual Deviance: 19880        AIC: 20000
> > > print(za <- glm(Freq~reg66+Reg71,family=poisson))
> >
> > Call:  glm(formula = Freq ~ reg66 + Reg71, family = poisson)
> >
> > Coefficients:
> > (Intercept)      reg66LY      reg66WM      reg66GL      Reg71GL      Reg71LY
> >      0.6134       2.6900       2.8376       3.9022       3.9597       2.7245
> >     Reg71WM
> >      2.8877
> >
> > Degrees of Freedom: 15 Total (i.e. Null);  9 Residual
> > Null Deviance:      40740
> > Residual Deviance: 19880        AIC: 20000
>
> > Third problem: if mean value contrasts are used, the level names are
> > not attached to the variable name, but simply the number of the
> > category. This rightly drives students mad and is simply
> > uninterpretable if the levels have been sorted. It takes a great deal
> > of extreme care to try to find out to what the numbers refer.
>
> Are we supposed to guess that `mean value contrasts' means contr.sum? In
> which case the numbers refer to the number of the *contrast*: they do not
> refer to levels, nor are single levels relevant. As in

I stated above, in a part that you cut, that one should pay attention
to the contrasts options that I set. "Mean value constraints" is
standard terminology in the fields where I have worked, for the simple
reason that the coefficients give differences from the mean value.
"Conventional constraints" is also commonly used but is less informative. What does the "number of the *contrast*" mean and how do you easily find out what it is after the levels have been sorted around? In a log linear model (or complex factorial design), they do refer to differences ("contrasts") of each level from the mean value. Single levels are at least as relevant here as for any other contrast set.> > > contr.sum(levels(Reg66)) > [,1] [,2] [,3] > CC 1 0 0 > GL 0 1 0 > LY 0 0 1 > WM -1 -1 -1 > > contr.treatment(levels(Reg66)) > GL LY WM > CC 0 0 0 > GL 1 0 0 > LY 0 1 0 > WM 0 0 1 > > Now, what should the column labels be in the first case? And in what sense > are these `mean value contrasts'?They should be CC GL LY. The coefficient for WM is minus the sum of the other three. The coefficients will be differences from the mean value (on the log scale for log linear models) and are symmetric in all factor levels (in the sense that no particular category is chosen as special). Some statistical software packages very nicely print out the coefficients for all levels instead of arbitrarily dropping the last one.> > Do you really want labels like "CCvWM"? They could get very cumbersome. >You have lost me here. The answer is, of course, No. What a suggestion! It does not mean that at all. It compares CC to the mean not to WM. To me, that looks like the appropriate label for baseline constraints with the last category as baseline (what R calls treatment but is more usually placebo!).> (One could argue that contr.treatment is already wrong, but as they > are not even contrasts ....) > > [...] > > > Additional comment: I recommend that the aic functions for poisson and > > binomial have the explicit calculation of the log density replaced by > > the corresponding d function with the log option. This will avoid the > > production of NAs in certain cases of zeroes in tables. 
> Good idea, but could you supply examples so we can put them in the
> regression tests?

Here is a simple and unrealistic (as no one would ever fit the model
to these data) example that illustrates what can happen in more
complex tables:

y <- matrix(c(2,0,7,3,0,9), ncol=2)
x <- gl(3,1)
glm(y~x, family=binomial)

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
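[A hedged illustration of my own, not from the thread, of how the
d-function route avoids the problem: when a binomial cell is fitted at
probability exactly 1 (or 0), the written-out log density produces
0 * log(0), which is NaN in R, while dbinom(..., log = TRUE) handles
the boundary correctly.]

```r
# Explicit binomial log-density vs dbinom() at a boundary probability.
y <- 3; n <- 3; p <- 1
lchoose(n, y) + y * log(p) + (n - y) * log(1 - p)   # 0 * -Inf => NaN
dbinom(y, n, p, log = TRUE)                         # 0
```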
[Moved back to R-devel. That's where we discuss bug reports.]

On Wed, 3 Jan 2001, Jim Lindsey wrote:

> > > Additional comment: I recommend that the aic functions for poisson and
> > > binomial have the explicit calculation of the log density replaced by
> > > the corresponding d function with the log option. This will avoid the
> > > production of NAs in certain cases of zeroes in tables.
> >
> > Good idea, but could you supply examples so we can put them in the
> > regression tests?
>
> Here is a simple and unrealistic (as no one would ever fit the model
> to these data) example that illustrates what can happen in more
> complex tables:
>
> y <- matrix(c(2,0,7,3,0,9), ncol=2)
> x <- gl(3,1)
> glm(y~x, family=binomial)

That's not one. It fails because wt/n = NaN, and produces NaN, not NA.
Now, could you please supply an example of what you reported, so we
can put it in the regression tests? The BUGS section in the FAQ does
ask for one, with good reason.

I've done as you suggested, and *not* fixed the actual bug in this
example.

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
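[For completeness, a Poisson analogue of the zero-table case, again my
own sketch rather than part of the thread: a zero count with a fitted
mean of zero makes the explicit log-density formula NaN, while
dpois(..., log = TRUE) returns 0.]

```r
# Explicit Poisson log-density vs dpois() when y = 0 and mu = 0.
y <- 0; mu <- 0
y * log(mu) - mu - lgamma(y + 1)   # 0 * -Inf => NaN
dpois(y, mu, log = TRUE)           # 0
```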