thr3ads.net - R help - [R] Strange parametrization in polr [Jan 2004]

If this information is useful, please help other people find it:
Share via:

BXC (Bendix Carstensen)

2004-Jan-08 14:50 UTC

[R] Strange parametrization in polr

In Venables \& Ripley 3rd edition (p. 231) the proportional odds model
is described as:

logit(p<=k) = zeta_k + eta

but polr apparently thinks there is a minus in front of eta,
as is apprent below.

Is this a bug og a feature I have overlooked?


Here is the naked code for reproduction, below the results.
------------------------------------------------------------------------
---
version
library( MASS )
data( housing )
hnames <- lapply( housing[,-5], levels )
house.plr <- polr( Sat ~ Infl + Type + Cont, data=housing, weights=Freq
)
summary( house.plr )
newdat <- expand.grid( hnames[-1] )[1:5,]
cbind( newdat, predict( house.plr, newdat, type="probs" ) )
# Baseline probs:
diff( c(0,tigol( c(-0.4961,0.6907) ), 1) )
# First level of Infl:
diff( c(0,tigol( c(-0.4961,0.6907) + 0.5663922 ), 1) )
# But the change of sign for eta is needed to reproduce the fitted
values:
# Line 2:
diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 ), 1) )
# Line 5:
diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 + 0.5723552 ), 1) )
------------------------------------------------------------------------
---

Here is the resulting output:
------------------------------------------------------------------------
---> version         _              
platform i386-pc-mingw32
arch     i386           
os       mingw32        
system   i386, mingw32  
status                  
major    1              
minor    8.1            
year     2003           
month    11             
day      21             
language R              > library( MASS )
> data( housing )
> hnames <- lapply( housing[,-5], levels )
> house.plr <- polr( Sat ~ Infl + Type + Cont, data=housing,
weights=Freq )> summary( house.plr )
Re-fitting to get Hessian

Call:
polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)

Coefficients:
                   Value Std. Error   t value
InflMedium     0.5663922 0.10465276  5.412109
InflHigh       1.2888137 0.12715609 10.135682
TypeApartment -0.5723552 0.11923800 -4.800107
TypeAtrium    -0.3661908 0.15517331 -2.359882
TypeTerrace   -1.0910073 0.15148595 -7.202036
ContHigh       0.3602803 0.09553577  3.771156

Intercepts:
            Value   Std. Error t value
Low|Medium  -0.4961  0.1248    -3.9740
Medium|High  0.6907  0.1255     5.5049

Residual Deviance: 3479.149 
AIC: 3495.149 > newdat <- expand.grid( hnames[-1] )[1:5,]
> cbind( newdat, predict( house.plr, newdat, type="probs" ) )    Infl      Type Cont       Low    Medium      High
1    Low     Tower  Low 0.3784485 0.2876755 0.3338760
2 Medium     Tower  Low 0.2568261 0.2742125 0.4689614
3   High     Tower  Low 0.1436927 0.2110841 0.6452232
4    Low Apartment  Low 0.5190450 0.2605077 0.2204473
5 Medium Apartment  Low 0.3798522 0.2875967 0.3325511> # Baseline probs:
> diff( c(0,tigol( c(-0.4961,0.6907) ), 1) )
[1] 0.3784576 0.2876650 0.3338774> # First level of Infl:
> diff( c(0,tigol( c(-0.4961,0.6907) + 0.5663922 ), 1) )
[1] 0.5175658 0.2609593 0.2214749> # But the change of sign for eta is needed to reproduce the fitted
values:> # Line 2:
> diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 ), 1) )
[1] 0.2568335 0.2742035 0.4689630> # Line 5:
> diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 + 0.5723552 ), 1) )
[1] 0.3798613 0.2875862 0.3325525> ------------------------------------------------------------------------
-----

----------------------
Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2
DK-2820 Gentofte
Denmark
tel: +45 44 43 87 38
mob: +45 30 75 87 38
fax: +45 44 43 07 06
bxc at steno.dk
www.biostat.ku.dk/~bxc

Thomas Lumley

2004-Jan-08 15:00 UTC

head link

[R] Strange parametrization in polr

On Thu, 8 Jan 2004, BXC (Bendix Carstensen) wrote:
> In Venables \& Ripley 3rd edition (p. 231) the proportional odds model
> is described as:
>
> logit(p<=k) = zeta_k + eta
>
> but polr apparently thinks there is a minus in front of eta,
> as is apprent below.
>
> Is this a bug og a feature I have overlooked?
If there is really a bug I would guess that it was in the book rather than
the code. This is not an unusual parametrisation for this model.  It is
the parametrisation that reduces to logistic regression for binary data,
and makes the regression coefficients positive when the association is
positive.

	-thomas
>
> Here is the naked code for reproduction, below the results.
> ------------------------------------------------------------------------
> ---
> version
> library( MASS )
> data( housing )
> hnames <- lapply( housing[,-5], levels )
> house.plr <- polr( Sat ~ Infl + Type + Cont, data=housing, weights=Freq
> )
> summary( house.plr )
> newdat <- expand.grid( hnames[-1] )[1:5,]
> cbind( newdat, predict( house.plr, newdat, type="probs" ) )
> # Baseline probs:
> diff( c(0,tigol( c(-0.4961,0.6907) ), 1) )
> # First level of Infl:
> diff( c(0,tigol( c(-0.4961,0.6907) + 0.5663922 ), 1) )
> # But the change of sign for eta is needed to reproduce the fitted
> values:
> # Line 2:
> diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 ), 1) )
> # Line 5:
> diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 + 0.5723552 ), 1) )
> ------------------------------------------------------------------------
> ---
>
> Here is the resulting output:
> ------------------------------------------------------------------------
> ---
> > version
>          _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    1
> minor    8.1
> year     2003
> month    11
> day      21
> language R
> > library( MASS )
> > data( housing )
> > hnames <- lapply( housing[,-5], levels )
> > house.plr <- polr( Sat ~ Infl + Type + Cont, data=housing,
> weights=Freq )
> > summary( house.plr )
>
> Re-fitting to get Hessian
>
> Call:
> polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
>
> Coefficients:
>                    Value Std. Error   t value
> InflMedium     0.5663922 0.10465276  5.412109
> InflHigh       1.2888137 0.12715609 10.135682
> TypeApartment -0.5723552 0.11923800 -4.800107
> TypeAtrium    -0.3661908 0.15517331 -2.359882
> TypeTerrace   -1.0910073 0.15148595 -7.202036
> ContHigh       0.3602803 0.09553577  3.771156
>
> Intercepts:
>             Value   Std. Error t value
> Low|Medium  -0.4961  0.1248    -3.9740
> Medium|High  0.6907  0.1255     5.5049
>
> Residual Deviance: 3479.149
> AIC: 3495.149
> > newdat <- expand.grid( hnames[-1] )[1:5,]
> > cbind( newdat, predict( house.plr, newdat, type="probs" ) )
>     Infl      Type Cont       Low    Medium      High
> 1    Low     Tower  Low 0.3784485 0.2876755 0.3338760
> 2 Medium     Tower  Low 0.2568261 0.2742125 0.4689614
> 3   High     Tower  Low 0.1436927 0.2110841 0.6452232
> 4    Low Apartment  Low 0.5190450 0.2605077 0.2204473
> 5 Medium Apartment  Low 0.3798522 0.2875967 0.3325511
> > # Baseline probs:
> > diff( c(0,tigol( c(-0.4961,0.6907) ), 1) )
> [1] 0.3784576 0.2876650 0.3338774
> > # First level of Infl:
> > diff( c(0,tigol( c(-0.4961,0.6907) + 0.5663922 ), 1) )
> [1] 0.5175658 0.2609593 0.2214749
> > # But the change of sign for eta is needed to reproduce the fitted
> values:
> > # Line 2:
> > diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 ), 1) )
> [1] 0.2568335 0.2742035 0.4689630
> > # Line 5:
> > diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 + 0.5723552 ), 1) )
> [1] 0.3798613 0.2875862 0.3325525
> >
> ------------------------------------------------------------------------
> -----
>
> ----------------------
> Bendix Carstensen
> Senior Statistician
> Steno Diabetes Center
> Niels Steensens Vej 2
> DK-2820 Gentofte
> Denmark
> tel: +45 44 43 87 38
> mob: +45 30 75 87 38
> fax: +45 44 43 07 06
> bxc at steno.dk
> www.biostat.ku.dk/~bxc
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
>
Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

Peter Dalgaard

2004-Jan-08 15:43 UTC

head link

[R] Strange parametrization in polr

"BXC (Bendix Carstensen)" <bxc at steno.dk> writes:
> Here is the naked code for reproduction, below the results.
That would be easier to reproduce if you had remembered to define

logit <- function(p)log(p/(1-p))
tigol <- function(x)exp(x)/(1+exp(x)) #inverse logit

first... Also, beware the line-breaking Jabberwock, my friend: The
line with "values:" is syntactically incomplete.

        -p

> ------------------------------------------------------------------------
> ---
> version
> library( MASS )
> data( housing )
> hnames <- lapply( housing[,-5], levels )
> house.plr <- polr( Sat ~ Infl + Type + Cont, data=housing, weights=Freq
> )
> summary( house.plr )
> newdat <- expand.grid( hnames[-1] )[1:5,]
> cbind( newdat, predict( house.plr, newdat, type="probs" ) )
> # Baseline probs:
> diff( c(0,tigol( c(-0.4961,0.6907) ), 1) )
> # First level of Infl:
> diff( c(0,tigol( c(-0.4961,0.6907) + 0.5663922 ), 1) )
> # But the change of sign for eta is needed to reproduce the fitted
> values:
> # Line 2:
> diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 ), 1) )
> # Line 5:
> diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 + 0.5723552 ), 1) )
> ------------------------------------------------------------------------
> ---
> 
> Here is the resulting output:
> ------------------------------------------------------------------------
> ---
> > version
>          _              
> platform i386-pc-mingw32
> arch     i386           
> os       mingw32        
> system   i386, mingw32  
> status                  
> major    1              
> minor    8.1            
> year     2003           
> month    11             
> day      21             
> language R              
> > library( MASS )
> > data( housing )
> > hnames <- lapply( housing[,-5], levels )
> > house.plr <- polr( Sat ~ Infl + Type + Cont, data=housing,
> weights=Freq )
> > summary( house.plr )
> 
> Re-fitting to get Hessian
> 
> Call:
> polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
> 
> Coefficients:
>                    Value Std. Error   t value
> InflMedium     0.5663922 0.10465276  5.412109
> InflHigh       1.2888137 0.12715609 10.135682
> TypeApartment -0.5723552 0.11923800 -4.800107
> TypeAtrium    -0.3661908 0.15517331 -2.359882
> TypeTerrace   -1.0910073 0.15148595 -7.202036
> ContHigh       0.3602803 0.09553577  3.771156
> 
> Intercepts:
>             Value   Std. Error t value
> Low|Medium  -0.4961  0.1248    -3.9740
> Medium|High  0.6907  0.1255     5.5049
> 
> Residual Deviance: 3479.149 
> AIC: 3495.149 
> > newdat <- expand.grid( hnames[-1] )[1:5,]
> > cbind( newdat, predict( house.plr, newdat, type="probs" ) )
>     Infl      Type Cont       Low    Medium      High
> 1    Low     Tower  Low 0.3784485 0.2876755 0.3338760
> 2 Medium     Tower  Low 0.2568261 0.2742125 0.4689614
> 3   High     Tower  Low 0.1436927 0.2110841 0.6452232
> 4    Low Apartment  Low 0.5190450 0.2605077 0.2204473
> 5 Medium Apartment  Low 0.3798522 0.2875967 0.3325511
> > # Baseline probs:
> > diff( c(0,tigol( c(-0.4961,0.6907) ), 1) )
> [1] 0.3784576 0.2876650 0.3338774
> > # First level of Infl:
> > diff( c(0,tigol( c(-0.4961,0.6907) + 0.5663922 ), 1) )
> [1] 0.5175658 0.2609593 0.2214749
> > # But the change of sign for eta is needed to reproduce the fitted
> values:
> > # Line 2:
> > diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 ), 1) )
> [1] 0.2568335 0.2742035 0.4689630
> > # Line 5:
> > diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 + 0.5723552 ), 1) )
> [1] 0.3798613 0.2875862 0.3325525
> > 
> ------------------------------------------------------------------------
> -----
> 
> ----------------------
> Bendix Carstensen
> Senior Statistician
> Steno Diabetes Center
> Niels Steensens Vej 2
> DK-2820 Gentofte
> Denmark
> tel: +45 44 43 87 38
> mob: +45 30 75 87 38
> fax: +45 44 43 07 06
> bxc at steno.dk
> www.biostat.ku.dk/~bxc
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
> 
-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907

Prof Brian Ripley

2004-Jan-08 15:45 UTC

head link

[R] Strange parametrization in polr

The problem is not in the book nor the code, but in Mr Carstensen not
looking up the actual reference given in ?polr.

There was a + in early printings of MASS3, and that difference is in the
on-line Errata.  But both the DESCRIPTION file and ?polr are explicitly to
the fourth edition.  As Thomas says, the minus seemed a more natural
parametrization and so we changed to it.

On Thu, 8 Jan 2004, Thomas Lumley wrote:
> On Thu, 8 Jan 2004, BXC (Bendix Carstensen) wrote:
> 
> > In Venables \& Ripley 3rd edition (p. 231) the proportional odds
model
> > is described as:
> >
> > logit(p<=k) = zeta_k + eta
> >
> > but polr apparently thinks there is a minus in front of eta,
> > as is apprent below.
> >
> > Is this a bug og a feature I have overlooked?
> 
> If there is really a bug I would guess that it was in the book rather than
> the code. This is not an unusual parametrisation for this model.  It is
> the parametrisation that reduces to logistic regression for binary data,
> and makes the regression coefficients positive when the association is
> positive.
-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Seemingly Similar Threads

Search for more apparently analagous threads

R help - Jan 2004 - Strange parametrization in polr

[R] Strange parametrization in polr

[R] Strange parametrization in polr

[R] Strange parametrization in polr

[R] Strange parametrization in polr

Seemingly Similar Threads