Kevin Coombes
2010-May-06 15:42 UTC
[R] cannot update polr model if I specify "start" parameters
Hi,

I am trying to build an ordinal regression model using polr (from the MASS package). In order to construct an initial model (without an error aborting it) in my setting, I must pass in a "start" parameter. I would then like to use the "step" function to remove unnecessary variables from the model. However, this fails with the error message:

> mod1 <- step(model)
Start:  AIC=42
PathCR ~ Cluster + [[stuff omitted]]
Error in polr(formula = PathCR ~ [[stuff omitted]] :
  'start' is not of the correct length

The underlying problem appears to be that "step" calls "drop1", which calls "update" on the formula with an omitted term. The "update" fails with the same error message:

> update(model, ~ . - Cluster)
Error in polr(formula = PathCR ~ [[stuff omitted]] :
  'start' is not of the correct length

Since "update" extracts the initial function call from the model, it apparently passes the "start" parameters along to "polr" to refit the model. Since one variable has been dropped, there are now too many parameters in the "start" parameter and the "update" fails.

Does anyone have a way around this difficulty? (For an individual update, I could probably hack the "call" object inside the existing model, but I really don't see how to do that using "step".)

Thanks,
Kevin

> sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] MASS_7.3-4

loaded via a namespace (and not attached):
[1] tools_2.10.0
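A minimal sketch of the failure described above, and of the "hack the call object" idea, may help. The poster's data (PathCR, Cluster) are not available, so this assumes the housing data set that ships with MASS as a stand-in; the object names fit, fit_nostart, smaller and reduced are hypothetical. It is an illustration under those assumptions, not a statement about the poster's actual model.

```r
library(MASS)

# Assumed stand-in data: 'housing' from MASS (not the poster's data).
# Infl, Type and Cont expand to 6 coefficients; Sat has 3 levels, so there
# are 2 cut-points. 'start' is c(coefficients, zeta): 6 + 2 = 8 values.
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            start = c(rep(0, 6), -1, 1))

# update() re-evaluates the stored call, which still carries all 8 start
# values, although the reduced model needs only 7. Capture the error message.
err <- tryCatch(update(fit, . ~ . - Cont),
                error = function(e) conditionMessage(e))
print(err)

# Workaround: remove 'start' from the stored call, so that every refit
# made by update(), drop1() or step() computes its own starting values.
fit_nostart <- fit
fit_nostart$call$start <- NULL

smaller <- update(fit_nostart, . ~ . - Cont)  # single update now refits
reduced <- step(fit_nostart, trace = 0)       # step() reuses the edited call
```

This only helps when the reduced models can be fitted without user-supplied starting values; if they cannot, the refits fail for the original reason instead.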
Joris Meys
2010-May-07 10:08 UTC
[R] cannot update polr model if I specify "start" parameters
Hi Kevin,

The obvious workaround is to start with a model that can be fitted without supplying start values. If you have to specify them, that usually means there are too many parameters, or too much dependence in your data, for the algorithm to converge. In that case the outcome should not be trusted, and even if step() worked, it would be highly unlikely to give you a result that actually makes sense.

If you still want to go ahead, you can step in the other direction, starting from the null model:

    library(MASS)

    # senseless data
    x <- factor(rep(1:3, 10), levels = c(1, 2, 3), ordered = TRUE)
    a <- rnorm(30)
    b <- runif(30, 1, 5)
    c <- rpois(30, 5)
    d <- rbeta(30, 2, 3)
    e <- rbinom(30, 6, 0.7)

    mod <- polr(x ~ 1)
    step(mod, scope = x ~ a*b*c*d*e, direction = "both")

In update() you can specify the new start values. They are given in the order c(coefficients, zeta), and the zeta cut-points have to be increasing:

    # 2 coefficients (a, b) + 2 cut-points
    update(mod, . ~ . + a + b, start = c(1, 1, 2, 3))

    # 31 coefficients + 2 cut-points; with 30 observations this model is
    # deliberately overparameterized
    mod2 <- polr(x ~ a*b*c*d*e, start = c(rep(1, 7), rep(0, 24), -1, 1))

    # 26 interaction coefficients + 2 cut-points
    update(mod2, . ~ . - a - b - c - d - e, start = c(1, 1, rep(0, 24), -1, 1))

But the best way is to start from a model that makes sense and go from there. Using the diagnostics lets you adapt the model so that it makes sense and behaves well. Relying on step() for this is like giving a dictionary to a monkey and hoping he turns it into a bestseller.

Cheers
Joris

fortune(241)

--
Joris Meys
Statistical Consultant
Ghent University, Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
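The suggestion to hand update() a new set of start values can be sketched as follows. This assumes the housing data set from MASS as a stand-in for the original data, and the names full, keep, new_start and reduced are hypothetical. The idea, offered as one possible approach rather than an established recipe, is to reuse the estimates of the larger fit for the terms that survive, followed by its cut-points, so that the start vector has the length the reduced model expects.

```r
library(MASS)

# Assumed stand-in data: 'housing' from MASS. The full model has
# 6 coefficients and 2 cut-points, so 'start' holds 8 values.
full <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
             start = c(rep(0, 6), -1, 1))

# Start values for the model without Cont: the coefficients of the
# remaining terms, then the cut-points, in the order c(coefficients, zeta).
keep <- !grepl("^Cont", names(coef(full)))
new_start <- unname(c(coef(full)[keep], full$zeta))

# An argument given to update() replaces the one stored in the call,
# so the old 8-value 'start' is not passed on to polr().
reduced <- update(full, . ~ . - Cont, start = new_start)
print(coef(reduced))
```

This covers a single update() only. step() builds its own update() calls and offers no place to supply a different start vector for each candidate model.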