Benjamin Hofner
2015-Mar-20 15:06 UTC
[R] mboost: Proportional odds boosting model - how to specify the offset?
Dear Madlene,
the problem that you observed was twofold.
First, mboost expects the offset to be a scalar or a vector with length
equal to the number of observations. However, fitted(p.iris) is a
matrix. In PropOdds(), the linear or additive predictor is shared among
all outcome categories and the thresholds are treated as nuisance
parameters. What you need to supply as offset is the result of the linear
or additive predictor (i.e., x'beta) instead of the fitted class
probabilities.
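As a minimal sketch of that difference (an illustration only, not part of the fix): for a polr fit, fitted() returns a matrix with one column per outcome category, whereas the linear predictor is a plain vector with one value per observation. The sketch assumes the `lp` component of the polr object, which MASS documents as the linear predictor; it does not contain the thresholds.

```r
library(MASS)

data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)
p.iris <- polr(Species ~ Sepal.Length, data = iris)

## fitted class probabilities: a matrix, one column per outcome category
probs <- fitted(p.iris)
dim(probs)

## linear predictor x'beta: a vector, one value per observation
lp <- p.iris$lp
length(lp)

## with a single covariate, the linear predictor is coefficient * covariate
all.equal(unname(lp),
          unname(coef(p.iris)["Sepal.Length"] * iris$Sepal.Length))
```

Only an object shaped like `lp` (or a scalar) has the form that mboost accepts as offset.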
Second, there was a bug in mboost, which I have fixed on R-forge [1]. Once
the package has been built successfully, use
install.packages("mboost",
repos="http://R-Forge.R-project.org")
to install it. Alternatively, email me off-list and I will send you
the package sources directly.
Your nuisance parameters (which represent the class thresholds) can be
extracted via nuisance(mlp). More details are given in the example below.
Best,
Benjamin
[1] http://r-forge.r-project.org/projects/mboost/
---- Example code ----
library(MASS)
library(mboost)
data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)
p.iris <- polr(Species ~ Sepal.Length, data = iris)
p.iris
lm.iris <- glmboost(Species ~ Sepal.Length, data = iris,
                    family = PropOdds(nuirange = c(-0.5, 3)))
## set the number of boosting iterations (mstop) to 1000
lm.iris[1000]
## thresholds:
nuisance(lm.iris)
## to make these comparable to p.iris use
nuisance(lm.iris) - coef(lm.iris)["(Intercept)"] -
    attr(coef(lm.iris), "offset")
## now use linear predictor as offset:
mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
                data = iris, family = PropOdds(nuirange = c(0, 1)),
                offset = fitted(lm.iris))
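The threshold adjustment in the example can be understood as follows: if the model has the form P(Y <= k | x) = plogis(theta_k - f(x)), then adding the same constant to the predictor f and to all thresholds leaves the class probabilities unchanged, so the intercept and offset can be moved from f into the thresholds. Below is a small check in base R. The numbers in theta, f and shift are made up for illustration, cumprob() is a hypothetical helper, and the parameterization is assumed to match the one used by polr.

```r
## cumulative probabilities P(Y <= k) for thresholds theta and predictor f
## (rows: observations, columns: thresholds)
cumprob <- function(theta, f) {
    outer(f, theta, function(f, th) plogis(th - f))
}

theta <- c(-1, 1.5)          # illustrative thresholds
f     <- c(-0.3, 0.2, 2.1)   # illustrative predictor values
shift <- 0.7                 # constant, e.g. intercept + offset

p1 <- cumprob(theta, f)
p2 <- cumprob(theta + shift, f + shift)

## shifting thresholds and predictor by the same constant changes nothing
all.equal(p1, p2)
```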
Nussbaum Madlene wrote:
> Dear R team
>
> The package mboost allows for boosting of proportional odds models.
> However, I would like to include an offset for every observation.
> This produces an error - no matter how I put the offset (as response
> probabilities or as response link).
>
> Fitting gamboost models with an offset works satisfactorily with family
> Gaussian() or Multinomial().
>
> Questions: 1) How do I need to specify the offset with family
> PropOdds()?
>
> 2) Where in the mboost-object do I find the Theta's (response
> category dependent intercept)?
>
>
>
> # --- minimal example with iris data ---
>
> library(MASS)
> library(mboost)
>
> data(iris)
> iris$Species <- factor(iris$Species, ordered = T)
> p.iris <- polr(Species ~ Sepal.Length, data = iris)
> mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
> data = iris, family = PropOdds(),
> offset = fitted(p.iris) )
>
> Error in tmp[[i]] : subscript out of bounds
>
>
> Thank you
> M. Nussbaum
>
> --
>
> ETH Zürich
> Madlene Nussbaum
> Institut für Terrestrische Ökosysteme
> Boden- und Terrestrische Umweltphysik
> CHN E 37.2
> Universitätstrasse 16
> CH-8092 Zürich
>
> Telefon + 44 632 73 21
> Mobile + 79 761 34 66
> madlene.nussbaum at .ethz
> www.step.ethz.ch
Nussbaum Madlene
2015-Mar-23 16:07 UTC
[R] mboost: Proportional odds boosting model - how to specify the offset?
Dear Benjamin
This solved the problem.
On 03/20/2015 04:06 PM, Benjamin Hofner wrote:
> First, mboost expects the offset to be a scalar or a vector with length
> equal to the number of observations.
> (i.e., x'beta)
OK, I could not figure out how to get the correct link vector from the
polr. Using glmboost is certainly a good idea.
> Second, there was a bug in mboost.
Thanks for fixing!
> install.packages("mboost", repos="http://R-Forge.R-project.org")
The changes were somehow not included in the newest build. I downloaded
the function from the SCM Repository directly. It works fine now.
Thank you for your help!
Madlene