Christofer Bogaso
2025-Apr-21 18:25 UTC
[R] Estimating regression with constraints in model coefficients
Hi Gregg,

I am sincerely thankful for this workout. Could you please suggest any textbook on how to construct the log-likelihood for an ordinal model like this? Most of my online searches point me directly to some R function or other, but a theoretical discussion of the subject would be really helpful for constructing it myself.

Thanks and regards,

On Mon, Apr 21, 2025 at 9:55 PM Gregg Powell <g.a.powell at protonmail.com> wrote:
>
> Christofer,
>
> Given the constraints you mentioned (bounded parameters, no intercept, and a sum constraint), you're outside the capabilities of most off-the-shelf ordinal logistic regression functions in R like vglm or polr.
>
> The most flexible recommendation at this point is to implement custom likelihood optimization using constrOptim() or nloptr::nloptr() with a manually coded log-likelihood function for the cumulative logit model.
>
> Since you need:
>
> - coefficient bounds (e.g., lb ≤ β ≤ ub),
> - no intercept,
> - and a constraint like sum(β) = C,
>
> ...you'll need to code your own objective function. Here's a high-level outline of the approach:
>
> A. Model Setup
> Let your design matrix X be n x p, and let Y take ordered values 1, 2, ..., K.
>
> B. Parameters
> Assume the thresholds θ_k are fixed (or removed entirely), and you're estimating only the coefficient vector β. Your constraints are:
>
> - Box constraints: lb ≤ β ≤ ub
> - Equality constraint: sum(β) = C
>
> C. Likelihood
> The probability of category k is given by:
>
> P(Y = k) = logistic(θ_k - Xβ) - logistic(θ_{k-1} - Xβ)
>
> With fixed thresholds c_k in place of estimated intercepts, the cumulative probability is:
>
> P(Y ≤ k) = 1 / (1 + exp(-(c_k - Xβ)))
>
> ...where c_k are fixed thresholds.
>
> Implementation using nloptr
> This is a rough sketch using nloptr, which allows both equality and bound constraints:
>
> library(nloptr)
>
> # Custom negative log-likelihood function.
> # cutpoints must have length K + 1, with cutpoints[1] = -Inf and
> # cutpoints[K + 1] = Inf, so that each category's probability is a
> # difference of two CDF values.
> negLL <- function(beta, X, y, K, cutpoints) {
>   eta <- X %*% beta
>   loglik <- 0
>   for (k in 1:K) {
>     pk <- plogis(cutpoints[k + 1] - eta) - plogis(cutpoints[k] - eta)
>     loglik <- loglik + sum(log(pk[y == k]))
>   }
>   return(-loglik)
> }
>
> # Constraint: sum(beta) = C
> sum_constraint <- function(beta, C) {
>   sum(beta) - C
> }
>
> # Objective and constraint wrappers (X, y, K, cutpoints, lower_bounds,
> # and upper_bounds are assumed to be defined already)
> objective <- function(beta) negLL(beta, X, y, K, cutpoints)
> eq_constraint <- function(beta) sum_constraint(beta, C = 2)  # example C
>
> # Run the optimization. NLOPT_LD_SLSQP is gradient-based, so supply
> # (numerical) derivatives for the objective and the equality constraint.
> res <- nloptr(
>   x0 = rep(0, ncol(X)),
>   eval_f = objective,
>   eval_grad_f = function(beta) nl.grad(beta, objective),
>   lb = lower_bounds,
>   ub = upper_bounds,
>   eval_g_eq = eq_constraint,
>   eval_jac_g_eq = function(beta) rep(1, length(beta)),
>   opts = list(algorithm = "NLOPT_LD_SLSQP", xtol_rel = 1e-8)
> )
>
> The next step would be writing the actual log-likelihood for your specific problem or verifying the constraint implementation.
>
> Manually coding the log-likelihood for an ordinal model is nontrivial... so a bit of a challenge :)
>
> All the best,
> gregg powell
> Sierra Vista, AZ
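To make the sketch above concrete, here is a self-contained run on simulated data. The simulated X and y, the cutpoint values, the bounds, and C = 2 are all hypothetical choices for illustration, not part of the original message:

```r
library(nloptr)

set.seed(1)
n <- 500; p <- 3; K <- 4
X <- matrix(rnorm(n * p), n, p)
beta_true <- c(0.5, 1.0, 0.5)          # satisfies the sum constraint: sum = 2
cutpoints <- c(-Inf, -1, 0, 1, Inf)    # K + 1 fixed thresholds
eta <- X %*% beta_true

# Draw y from the cumulative logit model: P(Y = k) is a difference of CDFs
pmat <- sapply(1:K, function(k) plogis(cutpoints[k + 1] - eta) - plogis(cutpoints[k] - eta))
y <- apply(pmat, 1, function(pr) sample(1:K, 1, prob = pr))

# Negative log-likelihood (X, y, K, cutpoints captured from the environment)
negLL <- function(beta) {
  eta <- X %*% beta
  ll <- 0
  for (k in 1:K) {
    pk <- plogis(cutpoints[k + 1] - eta) - plogis(cutpoints[k] - eta)
    ll <- ll + sum(log(pk[y == k]))
  }
  -ll
}

res <- nloptr(
  x0 = rep(0.5, p),                              # starting guess
  eval_f = negLL,
  eval_grad_f = function(b) nl.grad(b, negLL),   # numerical gradient helper from nloptr
  lb = rep(-2, p), ub = rep(2, p),               # box constraints
  eval_g_eq = function(b) sum(b) - 2,            # sum(beta) = 2
  eval_jac_g_eq = function(b) rep(1, p),
  opts = list(algorithm = "NLOPT_LD_SLSQP", xtol_rel = 1e-8, maxeval = 1000)
)

round(res$solution, 2)   # estimates near beta_true, summing to 2
```

The recovered coefficients should land close to beta_true and satisfy the equality constraint to numerical precision; tightening the bounds or changing C shows how the constraints pull the solution away from the unconstrained MLE.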
Ben Bolker
2025-Apr-21 18:29 UTC
[R] Estimating regression with constraints in model coefficients
Section 2 of the vignette for the ordinal package, https://cran.r-project.org/web/packages/ordinal/vignettes/clm_article.pdf, gives a reasonably complete, if short, definition and discussion of the log-likelihood framework for ordinal models. It's probably also discussed in Venables and Ripley (Modern Applied Statistics with S), since the polr function is in the MASS package ...

On 2025-04-21 2:25 p.m., Christofer Bogaso wrote:
> Hi Gregg,
>
> I am sincerely thankful for this workout.
>
> Could you please suggest any text book on how to create log-likelihood
> for an ordinal model like this?
> [remainder of quoted thread trimmed]

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
E-mail is sent at my convenience; I don't expect replies outside of working hours.
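Since polr comes up as a reference point, a quick unconstrained fit with MASS::polr makes a useful baseline against which to check a hand-coded likelihood. The simulated data and parameter values below are hypothetical, purely for illustration:

```r
library(MASS)

set.seed(42)
n <- 300
x1 <- rnorm(n); x2 <- rnorm(n)
eta <- 0.8 * x1 - 0.5 * x2

# Four ordered categories from a cumulative logit model with
# cutpoints -1, 0, 1: P(Y <= k) = plogis(c_k - eta)
u <- runif(n)
cum <- cbind(plogis(-1 - eta), plogis(0 - eta), plogis(1 - eta))
y <- ordered(1 + (u > cum[, 1]) + (u > cum[, 2]) + (u > cum[, 3]))

d <- data.frame(y, x1, x2)
fit <- polr(y ~ x1 + x2, data = d, method = "logistic")

coef(fit)   # slope estimates, near 0.8 and -0.5
fit$zeta    # estimated cutpoints, near -1, 0, 1
```

polr uses the same parameterization as the thread's formulas, logit P(Y <= k) = zeta_k - eta, so a custom likelihood fit on the same data (without the constraints) should reproduce these estimates.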
Jeff Newmiller
2025-Apr-21 19:32 UTC
[R] Estimating regression with constraints in model coefficients
Logarithms are being referred to in two contexts here: the log of the likelihood, and the log of domain variables. I just wanted to highlight that the latter has a surprisingly simple theoretical basis in optimization.

If you have a function that you want to find an optimum of, but an input variable needs to be blocked from exceeding a limit, then let the optimization engine guess any unconstrained value in (-Inf, Inf), and add lines at the beginning of your function that apply the antilog (exp()) to the number the optimizer guesses; your algorithm then never sees numbers on the other side of that limit. The optimizer engine can guess all over the place, but inside the function the value will just creep closer to or further from the limit. (The limit for log is zero, but you can use subtraction before the log, and addition after the exp, to move it wherever you need it to be.) When the optimizer returns its answer, just remember to apply the same transformation as in the optimized function.

There can be numerical issues if you are searching near 1, so analysts will often use the expm1 and log1p functions instead of exp and log... but you should be careful to apply those only if the nature of your objective function lends itself to them and you have scaled your calculations appropriately; if not, they may create biases in your results. [1] is a cautionary discussion on this point.

[1] https://gist.github.com/jdnewmil/99301a88de702ad2fcbaef33326b08b4

On April 21, 2025 11:25:38 AM PDT, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
> Hi Gregg,
>
> I am sincerely thankful for this workout.
> [remainder of quoted thread trimmed]

-- 
Sent from my phone. Please excuse my brevity.
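Jeff's reparameterization trick can be sketched in a few lines. The toy objective, the limit of 2, and the target of 5 below are all made-up illustration values:

```r
# Minimize f(s) = (s - 5)^2 subject to s > 2, without a constrained solver:
# the optimizer searches an unconstrained t, and the objective maps it to
# s = 2 + exp(t), so the search can never cross the limit at 2.
f <- function(s) (s - 5)^2

res <- optim(
  par = 0,                            # unconstrained starting guess for t
  fn  = function(t) f(2 + exp(t)),    # apply the antilog inside the function
  method = "BFGS"
)

# Apply the same transformation to recover the answer on the original scale,
# as Jeff notes.
s_hat <- 2 + exp(res$par)
s_hat   # close to the unconstrained optimum 5, since the limit is not binding
```

Had the unconstrained optimum been below 2, s_hat would instead creep toward (but never cross) the limit as the optimizer pushed t toward -Inf.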
Gregg Powell
2025-Apr-23 16:24 UTC
[R] Estimating regression with constraints in model coefficients
Hello again Christofer,

Thanks for your thoughtful note; I'm glad the outline was helpful. Apologies for the long delay getting back to you. Had to do a small bit of research...

Recommended Text on Ordinal Log-Likelihoods:

You're right that most online sources jump straight to code or canned functions. For a solid theoretical treatment of ordinal models and their likelihoods, consider:

"Categorical Data Analysis" by Alan Agresti
- Especially Chapters 7 and 8 on ordinal logistic regression.
- Covers proportional odds models, cumulative logits, adjacent-category logits, and the derivation of likelihood functions.
- Provides not only equations but also intuition behind the model structure.

It's a standard reference in the field and explains how to build log-likelihoods from first principles, including how the cumulative probabilities arise and why the difference of CDFs represents a category-specific probability.

Also helpful: "An Introduction to Categorical Data Analysis" by Agresti (2nd ed.), which is a bit more accessible and still covers the basics of ordinal models.

If You Want to Proceed Practically:

In parallel with the theory, if you'd like a working R example coded from scratch, with:

- a custom likelihood for an ordinal (cumulative logit) model,
- fixed thresholds / no intercept,
- coefficient bounds,
- and a sum constraint on β,

I'd be happy to attempt that using nloptr() or constrOptim(). You'd be able to walk through it and tweak it as necessary (no guarantee that I will get it right!). Just let me know:

1. Whether you want the cutpoints fixed (and to what values), or estimated separately (with identifiability managed some other way);
2. What your bounds on the coefficients are (lower/upper vectors);
3. What value the sum of coefficients should equal (e.g., sum(β) = 1, or something else);
4. And whether you're working with the IDRE example data, or something else.
I can use the UCLA ologit.dta dataset as a basis if that's easiest to demo on; or if there's another dataset you'd prefer, again, let me know.

All the best!
gregg

On Monday, April 21st, 2025 at 11:25 AM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
> Hi Gregg,
>
> I am sincerely thankful for this workout.
> [remainder of quoted thread trimmed]