Benjamin Hofner
2015-Mar-20 15:06 UTC
[R] mboost: Proportional odds boosting model - how to specify the offset?
Dear Madlene,

the problem that you observed was twofold.

First, mboost expects the offset to be a scalar or a vector with length equal to the number of observations. However, fitted(p.iris) is a matrix of fitted class probabilities. In PropOdds(), the linear or additive predictor is shared among all outcome categories, and the thresholds are treated as nuisance parameters. What you need to supply as offset is therefore the value of the linear or additive predictor (i.e., x'beta), not the fitted class probabilities.

Second, there was a bug in mboost, which I fixed on R-Forge [1]. Once the package has been built successfully, use

install.packages("mboost", repos="http://R-Forge.R-project.org")

to install it. You can also email me off-list, and I will send you the package sources directly.

The nuisance parameters (which represent the class thresholds) can be extracted via nuisance(mlp). More details are given in the example below.

Best,
Benjamin

[1] http://r-forge.r-project.org/projects/mboost/

---- Example code ----

library(MASS)
library(mboost)

data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)
p.iris <- polr(Species ~ Sepal.Length, data = iris)
p.iris

lm.iris <- glmboost(Species ~ Sepal.Length, data = iris,
                    family = PropOdds(nuirange = c(-0.5, 3)))
## set the number of boosting iterations to 1000
lm.iris[1000]

## thresholds:
nuisance(lm.iris)
## to make these comparable to p.iris use
nuisance(lm.iris) - coef(lm.iris)["(Intercept)"] - attr(coef(lm.iris), "offset")

## now use linear predictor as offset:
mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
                data = iris, family = PropOdds(nuirange = c(0, 1)),
                offset = fitted(lm.iris))

Nussbaum Madlene wrote:
> Dear R team
>
> The package mboost allows for boosting of proportional odds models.
> However, I would like to include an offset for every observation.
> This produces an error - no matter how I put the offset (as response
> probabilities or as response link).
>
> Fitting gamboost-models with offset works satisfactory with family
> Gaussian() or Multinomial().
>
> Questions: 1) How do I need to specify the offset with family
> PropOdds()?
>
> 2) Where in the mboost-object do I find the Theta's (response
> category dependent intercept)?
>
> # --- minimal example with iris data ---
>
> library(MASS)
> library(mboost)
>
> data(iris)
> iris$Species <- factor(iris$Species, ordered = T)
> p.iris <- polr(Species ~ Sepal.Length, data = iris)
> mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
>     data = iris, family = PropOdds(),
>     offset = fitted(p.iris) )
>
> Error in tmp[[i]] : subscript out of bounds
>
> Thank you
> M. Nussbaum
>
> --
>
> ETH Zürich
> Madlene Nussbaum
> Institut für Terrestrische Ökosysteme
> Boden- und Terrestrische Umweltphysik
> CHN E 37.2
> Universitätstrasse 16
> CH-8092 Zürich
>
> Telefon + 44 632 73 21
> Mobile + 79 761 34 66
> madlene.nussbaum at .ethz
> www.step.ethz.ch
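
To make the distinction between class probabilities and the linear predictor concrete, here is a minimal sketch that uses only MASS::polr() on the iris data from the thread. It shows that fitted() on the polr fit is a 150 x 3 matrix, whereas x'beta is a single vector of length 150, and that the probabilities can be rebuilt from that vector plus the thresholds. The variable names (probs, eta, zeta, cum, rebuilt) are illustrative and do not come from the original messages.

```r
library(MASS)  # provides polr()

data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)
p.iris <- polr(Species ~ Sepal.Length, data = iris)

## fitted() returns class probabilities: one row per observation,
## one column per outcome category, so it is not a valid offset
probs <- fitted(p.iris)
dim(probs)

## the linear predictor x'beta is one value per observation,
## shared by all outcome categories
eta <- unname(coef(p.iris)["Sepal.Length"]) * iris$Sepal.Length
length(eta)

## the thresholds (zeta in polr) play the role of the nuisance parameters
zeta <- p.iris$zeta

## proportional odds model: P(Y <= k) = plogis(zeta_k - eta)
lin <- outer(-eta, zeta, "+")
cum <- plogis(lin)
dim(cum) <- dim(lin)

## class probabilities are differences of adjacent cumulative probabilities
rebuilt <- cbind(cum, 1) - cbind(0, cum)
all.equal(rebuilt, probs, check.attributes = FALSE)
```

The point of the sketch is only the shape of the objects: an offset for PropOdds() has to look like eta (one number per observation), never like probs.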
Nussbaum Madlene
2015-Mar-23 16:07 UTC
[R] mboost: Proportional odds boosting model - how to specify the offset?
Dear Benjamin

This solved the problem.

On 03/20/2015 04:06 PM, Benjamin Hofner wrote:
> First, mboost expects the offset to be a scalar or a vector with length
> equal to the number of observations.
> (i.e., x'beta)

OK, I could not figure out how to get the correct link vector from the polr. Using glmboost is certainly a good idea.

> Second, there was a bug in mboost.

Thanks for fixing!

> install.packages("mboost", repos="http://R-Forge.R-project.org")

The changes were somehow not included in the newest build. I downloaded the function from the SCM Repository directly. It works fine now.

Thank you for your help!
Madlene
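
On the question of getting the link vector out of a polr fit: as a hedged sketch, the object returned by MASS::polr() carries the linear predictor in its lp component, and it can also be computed by hand from the coefficients. The names lp and manual below are illustrative. Whether this vector behaves sensibly as an offset in gamboost() with PropOdds() was not tested in this thread, so treat that use as an assumption.

```r
library(MASS)  # provides polr()

data(iris)
iris$Species <- factor(iris$Species, ordered = TRUE)
p.iris <- polr(Species ~ Sepal.Length, data = iris)

## linear predictor stored in the fitted polr object (no intercept;
## the thresholds are kept separately in p.iris$zeta)
lp <- unname(p.iris$lp)

## the same quantity computed by hand; this form also works for new data,
## since predict() for polr only returns classes or probabilities
manual <- unname(coef(p.iris)["Sepal.Length"]) * iris$Sepal.Length

## both are plain vectors with one entry per observation
length(lp)
all.equal(lp, manual)
```

A vector like lp has the shape that the offset argument expects. Because polr and mboost place the constant term differently (see the threshold comparison in the example code earlier in the thread), the thresholds estimated on top of such an offset may be shifted by a constant.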