thr3ads.net - R help - [R] glm(family=binomial(link=logit)) [Jul 2005]

If this information is useful, please help other people find it:
Share via:

Robin Hankin

2005-Jul-15 15:00 UTC

[R] glm(family=binomial(link=logit))

Hi

I am trying to make glm() work to analyze a toy logit system.

I have a dataframe with x and y independent variables.  I have

L=1+x-y          (ie coefficients 1,1,-1)

then if I have a logit relation with L=log(p/(1-p)),
p=1/(1+exp(L)).

If I interpret "p" as the probability of  success in a Bernouilli
trial, and I can observe the result (0 for "no", 1 for
"yes")
how do I retrieve the coefficients c(1,1,-1)
from the data?

n <- 300
des <- data.frame(x=(1:n)/n,y=sample(n)/n)   # experimental design
des <- cbind(des,L=1+des$x-des$y)            # L=1+x-y
des <- cbind(des,p=1/(1+exp(des$L)))         # p=1/(1+e^L)
des <- cbind(des,obs=rbinom(n,1,des$p))      # observation: prob of  
success = p.


My attempt is:

glm(obs~x+y,data=des,family=binomial(link="logit"))

But it does not retrieve the correct coefficients of c(1,1,-1) ;
I would expect a reasonably close answer with so much data.

What is the correct glm() call to perform my logit analysis?





--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

Prof Brian Ripley

2005-Jul-15 15:22 UTC

head link

[R] glm(family=binomial(link=logit))

On Fri, 15 Jul 2005, Robin Hankin wrote:
> I am trying to make glm() work to analyze a toy logit system.
>
> I have a dataframe with x and y independent variables.  I have
>
> L=1+x-y          (ie coefficients 1,1,-1)
>
> then if I have a logit relation with L=log(p/(1-p)),
> p=1/(1+exp(L)).
Not quite, see below.
> If I interpret "p" as the probability of  success in a Bernouilli
> trial, and I can observe the result (0 for "no", 1 for
"yes")
> how do I retrieve the coefficients c(1,1,-1)
> from the data?
>
> n <- 300
> des <- data.frame(x=(1:n)/n,y=sample(n)/n)   # experimental design
> des <- cbind(des,L=1+des$x-des$y)            # L=1+x-y
> des <- cbind(des,p=1/(1+exp(des$L)))         # p=1/(1+e^L)
A logit would be p = e^L/(1+e^L), so your signs for L are reversed.
> des <- cbind(des,obs=rbinom(n,1,des$p))      # observation: prob of
> success = p.
>
>
> My attempt is:
>
> glm(obs~x+y,data=des,family=binomial(link="logit"))
>
> But it does not retrieve the correct coefficients of c(1,1,-1) ;
> I would expect a reasonably close answer with so much data.
You actually have so little data.
> What is the correct glm() call to perform my logit analysis?
The call is correct, the expectation is not.  A single bernoulli 
observation provides far less information than you seem to suppose.

I got

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.4747     0.3670  -4.019 5.85e-05 ***
x            -0.5549     0.4672  -1.188  0.23494
y             1.2963     0.4731   2.740  0.00614 **

and note how large the standard errors are.  With 10000 examples you will 
get closer.  Having fixed your sign change, I got

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.98711    0.06024   16.39   <2e-16 ***
x            1.00896    0.08052   12.53   <2e-16 ***
y           -0.87798    0.08031  -10.93   <2e-16 ***


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Peter Dalgaard

2005-Jul-15 15:58 UTC

head link

[R] glm(family=binomial(link=logit))

Robin Hankin <r.hankin at noc.soton.ac.uk> writes:
> Hi
> 
> I am trying to make glm() work to analyze a toy logit system.
> 
> I have a dataframe with x and y independent variables.  I have
> 
> L=1+x-y          (ie coefficients 1,1,-1)
> 
> then if I have a logit relation with L=log(p/(1-p)),
> p=1/(1+exp(L)).
> 
> If I interpret "p" as the probability of  success in a Bernouilli
> trial, and I can observe the result (0 for "no", 1 for
"yes")
> how do I retrieve the coefficients c(1,1,-1)
> from the data?
> 
> n <- 300
> des <- data.frame(x=(1:n)/n,y=sample(n)/n)   # experimental design
> des <- cbind(des,L=1+des$x-des$y)            # L=1+x-y
> des <- cbind(des,p=1/(1+exp(des$L)))         # p=1/(1+e^L)
> des <- cbind(des,obs=rbinom(n,1,des$p))      # observation: prob of  
> success = p.
> 
> 
> My attempt is:
> 
> glm(obs~x+y,data=des,family=binomial(link="logit"))
> 
> But it does not retrieve the correct coefficients of c(1,1,-1) ;
> I would expect a reasonably close answer with so much data.
> 
> What is the correct glm() call to perform my logit analysis?
Apart from a sign error, the only fault seems to be that you think
that 300 is a large number... Try upping n to 30000 and you'll see.

(The sign error is of course that log(p/(1-p)) = L implies that
p = exp(L)/(1+exp(L)) = 1/(1+exp(-L))).  

-- 
   O__  ---- Peter Dalgaard             ??ster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907

Reasonably Related Threads

Search for more maybe matching threads

R help - Jul 2005 - glm(family=binomial(link=logit))

[R] glm(family=binomial(link=logit))

[R] glm(family=binomial(link=logit))

[R] glm(family=binomial(link=logit))

Reasonably Related Threads