thr3ads.net - R help - [R] optim() for ordered logit model with parallel regression assumption [Aug 2012]

If this information is useful, please help other people find it:
Share via:

Xu Jun

2012-Aug-01 00:57 UTC

[R] optim() for ordered logit model with parallel regression assumption

Dear R listers,

I am learning the MLE utility optim() in R to program ordered logit
models just as an exercise. See below I have three independent
variables, x1, x2, and x3. Y is coded as ordinal from 1 to 4. Y is not
yet a factor variable here. The ordered logit model satisfies the
parallel regression assumption. The following codes can run through,
but results were totally different from what I got using the polr
function from the MASS package. I think it might be due to the way how
the p is constructed in the ologit.lf function. I am relatively new to
R, and here I would guess probably something related to the class type
(numeric vs. matrix) or something along that line among those if
conditions. Thanks in advance for any suggestion.

Jun Xu, PhD
Assistant Professor
Department of Sociology
Ball State University
Muncie, IN 47306


####################################################################

library(foreign)
readin <- read.dta("ordfile.dta", convert.factors=FALSE)
myvars <- c("depvar", "x1", "x2",
"x3")
mydta <- readin[myvars]
# remove all missings
mydta <- na.omit(mydta)

# theta is the parameter vector
ologit.lf <- function(theta, y, X) {
  n <- nrow(X)
  k <- ncol(X)
# b is the coefficient vector for independent variables
  b <- theta[1:k]
# tau1 is cut-point 1
  tau1 <- theta [k+1]
# tau2 is cut-point 2
  tau2 <- theta [k+2]
# tau3 is cut-point 1
  tau3 <- theta [k+3]

  if (y == 1){
    p <- (1/(1+exp( - tau1 + X %*% b)))
  }
  if (y == 2) {
    p <- (1/(1+exp( - tau2 + X %*% b))) - (1/(1+exp( - tau1 + X %*% b)))
  }
  if (y == 3) {
    p <- (1/(1+exp( - tau3 + X %*% b))) - (1/(1+exp( - tau2 + X %*% b)))
  }
  if (y == 4) {
    p <- 1 - (1/(1+exp( - tau3 + X %*% b)))
  }
  - sum(p)
}

y <- as.numeric(mydta$depvar)
X <- cbind (mydta$x1, mydta$x2, mydta$x3)
runopt <- optim(rep(0, ncol(X)+4), ologit.lf, method="BFGS",
hessian=T, y=y, X=X)


There were 50 or more warnings (use warnings() to see the first
50)> warnings()
1: In if (y == 1) { ... :
  the condition has length > 1 and only the first element will be used
2: In if (y == 2) { ... :
  the condition has length > 1 and only the first element will be used

R. Michael Weylandt

2012-Aug-01 02:07 UTC

head link

[R] optim() for ordered logit model with parallel regression assumption

On Tue, Jul 31, 2012 at 7:57 PM, Xu Jun <junxu.r at gmail.com>
wrote:> Dear R listers,
>
> I am learning the MLE utility optim() in R to program ordered logit
> models just as an exercise. See below I have three independent
> variables, x1, x2, and x3. Y is coded as ordinal from 1 to 4. Y is not
> yet a factor variable here. The ordered logit model satisfies the
> parallel regression assumption. The following codes can run through,
> but results were totally different from what I got using the polr
> function from the MASS package. I think it might be due to the way how
> the p is constructed in the ologit.lf function. I am relatively new to
> R, and here I would guess probably something related to the class type
> (numeric vs. matrix) or something along that line among those if
> conditions. Thanks in advance for any suggestion.
>
> Jun Xu, PhD
> Assistant Professor
> Department of Sociology
> Ball State University
> Muncie, IN 47306
>
>
> ####################################################################
>
> library(foreign)
> readin <- read.dta("ordfile.dta", convert.factors=FALSE)
> myvars <- c("depvar", "x1", "x2",
"x3")
> mydta <- readin[myvars]
> # remove all missings
> mydta <- na.omit(mydta)
>
> # theta is the parameter vector
> ologit.lf <- function(theta, y, X) {
>   n <- nrow(X)
>   k <- ncol(X)
> # b is the coefficient vector for independent variables
>   b <- theta[1:k]
> # tau1 is cut-point 1
>   tau1 <- theta [k+1]
> # tau2 is cut-point 2
>   tau2 <- theta [k+2]
> # tau3 is cut-point 1
>   tau3 <- theta [k+3]
>
>   if (y == 1){
>     p <- (1/(1+exp( - tau1 + X %*% b)))
>   }
>   if (y == 2) {
>     p <- (1/(1+exp( - tau2 + X %*% b))) - (1/(1+exp( - tau1 + X %*% b)))
>   }
>   if (y == 3) {
>     p <- (1/(1+exp( - tau3 + X %*% b))) - (1/(1+exp( - tau2 + X %*% b)))
>   }
>   if (y == 4) {
>     p <- 1 - (1/(1+exp( - tau3 + X %*% b)))
>   }
>   - sum(p)
> }
>
> y <- as.numeric(mydta$depvar)
> X <- cbind (mydta$x1, mydta$x2, mydta$x3)
> runopt <- optim(rep(0, ncol(X)+4), ologit.lf, method="BFGS",
> hessian=T, y=y, X=X)
>
>
> There were 50 or more warnings (use warnings() to see the first 50)
>> warnings()
>
> 1: In if (y == 1) { ... :
>   the condition has length > 1 and only the first element will be used
> 2: In if (y == 2) { ... :
>   the condition has length > 1 and only the first element will be used
It looks like you've got a fundamental problem in your if/else
statements. if and else are control structures and so they operate on
the whole program flow -- I think you want the ifelse() function here
instead.

Take a look at this example:

x <- c(1, 5, 9)

if(x < 3) {y <- x^2} else {y <- 2}

z <- ifelse(x < 3, x^2, 2)

print(x)
print(y)
print(z)

See also ?ifelse

Best,
Michael
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Seemingly Similar Threads

Search for more maybe matching threads

R help - Aug 2012 - optim() for ordered logit model with parallel regression assumption

[R] optim() for ordered logit model with parallel regression assumption

[R] optim() for ordered logit model with parallel regression assumption

Seemingly Similar Threads