thr3ads.net - R help - [R] How to find inverse of glm model? [Apr 2025]

If this information is useful, please help other people find it:
Share via:

Gregg Powell

2025-Apr-23 16:43 UTC

[R] How to find inverse of glm model?

Hello Luigi,

Great follow-up ? Looks like you?re on the right track using nls() for nonlinear
regression. You're fitting a logistic-like sigmoidal model (as in Rutledge?s
paper), I think both errors you?re encountering are signs of bad starting values
or a maybe model formulation issue.

Breakdown of Your Model Attempt
Your model formula:> Fluo ~ (MaxFluo / (1 + exp(-(Cycle - (HalfFluoCycle / Slope)))) + 1)
This expression has perhaps two issues:

1.   Parameter placement: In the paper, the term inside exp() should look like
-(Cycle - HalfFluoCycle)/Slope ? not Cycle - (HalfFluoCycle / Slope).

2.   +1 additive constant: You called it backgroundSignal, but it?s hard-coded
as +1. That constant should be a parameter (e.g., Bkg), otherwise the model fit
has no way to adjust it.

Alternative Model Form:
Here?s a likely corrected version of the model based on Rutledge?s sigmoidal
fit:
> mod1 <- nls(Fluo ~ MaxFluo / (1 + exp(-(Cycle - HalfFluoCycle)/Slope))
> + Bkg,
>            data = df,
>            start = list(MaxFluo = max(df$Fluo),
>                         HalfFluoCycle = 20,
>                         Slope = 2,
>                         Bkg = min(df$Fluo)))
Notes on nls() Convergence
nls() is very sensitive to starting values. Always inspect your plot and roughly
estimate parameters:

MaxFluo: Use max(Fluo)

Bkg: Use min(Fluo)

HalfFluoCycle: Approximate where the curve inflects

Slope: Try values like 1, 2, 5 depending on steepness

You can also add trace=TRUE to watch convergence progress:
> mod1 <- nls(..., trace=TRUE)
About SSmicmen>From what I've gathered - this is meant for fitting the Michaelis-Menten
model (enzyme kinetics), not sigmoidal growth. That's why mod2 fails ? wrong
self-starting model function.

Visualizing the Fit
After fitting:
> plot(Fluo ~ Cycle, data = df)
> lines(df$Cycle, predict(mod1), col = "red", lwd = 2)
You might also want to compute the inverse (e.g., solving for Cycle given a
fluorescence value) ? that would involve solving the nonlinear equation manually
or using uniroot().

All the best,
gregg




On Wednesday, April 23rd, 2025 at 8:23 AM, Luigi Marongiu <marongiu.luigi at
gmail.com> wrote:
> 
> 
> Further on this, I have found a formula from a paper from Rutledge
> (DOI: 10.1093/nar/gnh177), which I rendered as
> `MaxFluo / (1+ exp(-(Cycle-(HalfFluoCycle/Slope)))) + backgroundSignal`
> I then see that one can use `nls` to fit non-linear regression, so I tried:
> `df <- data.frame(Cycle=1:35, Fluo=c(15908.54, 16246.92, 16722.43,
17104.29, 16794.93, 17031.44, 17299.05, 17185.49, 17362.28, 17127.43, 17368.96,
17513.19, 17593.95, 17626.37, 18308.29, 18768.12, 19955.26, 22493.85, 27088.12,
36539.44, 53694.18, 84663.18, 138835.64, 223331.89, 336409.69, 457729.88,
561233.12, 637536.31, 688985.88, 723158.56, 746575.62, 766274.75, 776087.75,
785144.81, 791573.81) ) plot(Fluo~Cycle, df) mod1 = nls(Fluo~(MaxFluo / (1+
exp(-(Cycle-(HalfFluoCycle/Slope)))) + 1), data=df,
start=list(MaxFluo=max(df$Fluo), HalfFluoCycle=15, Slope=0.1)) mod2 =
nls(Fluo~SSmicmen(Fluo, Cycle), data=df)`
> but I got errors in both cases:
> ```
> 
> > mod1
> 
> Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff,
> nDcentral = nDcntr) :
> singular gradient matrix at initial parameter estimates
> 
> > mod2
> 
> Error in qr.qty(QR.rhs, .swts * ddot(attr(rhs, "gradient"), lin))
:
> NA/NaN/Inf in foreign function call (arg 5)
> ```
> How can I properly set this regression model?
> Thank you
> 
> 
> On Wed, Apr 16, 2025 at 7:08?AM Luigi Marongiu marongiu.luigi at gmail.com
wrote:
> 
> > Thank you. This topic is more complicated than anticipated. Best
regards, Luigi
> > 
> > On Tue, Apr 15, 2025 at 11:09?PM Andrew Robinson apro at
unimelb.edu.au wrote:
> > 
> > > A statistical (off-topic!) point to consider: when the GLM was
fitted, you conditioned on x and let y be the random variable. Therefore the
model supports predictions of y conditional on x. You?re now seeking to make
predictions of x conditional on y. That?s not the same thing, even in OLS.
> > > 
> > > It might not matter for your application but it?s probably worth
thinking about. Simulation experiments might shed some light on that.
> > > 
> > > Cheers, Andrew
> > > 
> > > --
> > > Andrew Robinson
> > > Chief Executive Officer, CEBRA and Professor of Biosecurity,
> > > School/s of BioSciences and Mathematics & Statistics
> > > University of Melbourne, VIC 3010 Australia
> > > Tel: (+61) 0403 138 955
> > > Email: apro at unimelb.edu.au
> > > Website: https://researchers.ms.unimelb.edu.au/~apro at unimelb/
> > > 
> > > I acknowledge the Traditional Owners of the land I inhabit, and
pay my respects to their Elders.
> > > 
> > > On 16 Apr 2025 at 1:01?AM +1000, Gregg Powell via R-help r-help
at r-project.org, wrote:
> > > 
> > > Take a look at this Luigi...
> > > 
> > > # The model is: logit(p) = ?? + ??*Cycles
> > > # Where p is the probability (or normalized value in your case)
> > > 
> > > # The inverse function would be: Cycles = (logit??(p) - ??)/??
> > > # Where logit?? is the inverse logit function (also called the
expit >function)
> > > 
> > > # Extract coefficients from your model
> > > intercept <- coef(b_model)[1]
> > > slope <- coef(b_model)[2]
> > > 
> > > # Define the inverse function
> > > inverse_predict <- function(p) {
> > > # p is the probability or normalized value you want to find the
>cycles for
> > > # Inverse logit: log(p/(1-p)) which is the logit function
> > > logit_p <- log(p/(1-p))
> > > 
> > > # Solve for Cycles: (logit(p) - intercept)/slope
> > > cycles <- (logit_p - intercept)/slope
> > > 
> > > return(cycles)
> > > }
> > > 
> > > # Example: What cycle would give a normalized value of 0.5?
> > > inverse_predict(0.5)
> > > 
> > > This function takes a probability (normalized value between 0 and
1) and returns the cycle value that would produce this probability according to
your model.
> > > Also:
> > > This works because GLM with binomial family uses the logit link
function by default
> > > The inverse function will return values outside your original
data range if needed
> > > This won't work for p=0 or p=1 exactly (you'd get -Inf or
Inf), so you might want to add checks
> > > 
> > > All the best,
> > > Gregg
> > > 
> > > On Tuesday, April 15th, 2025 at 5:57 AM, Luigi Marongiu
marongiu.luigi at gmail.com wrote:
> > > 
> > > I have fitted a glm model to some data; how can I find the
inverse
> > > function of this model? Since I don't know the y=f(x)
implemented by
> > > glm (this works under the hood), I can't define a f??(y).
> > > Is there an R function that can find the inverse of a glm model?
> > > Thank you.
> > > 
> > > The working example is:
> > > `V = c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16,
-17.39, -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 15.01,
75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 5059.94, 6071.93,
6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19, 10452.02, 10751.81,
11017.49, 11240.37, 11427.47, 11570.07, 11684.96, 11781.77, 11863.35, 11927.44,
11980.81, 12021.88, 12058.35, 12100.63, 12133.57, 12148.89, 12137.09) df =
data.frame(Cycles = 1:35, Values = V[1:35]) M = max(df$Values) df$Norm =
df$Values/M df$Norm[df$Norm<0] = 0 b_model = glm(Norm ~ Cycles, data=df,
family=binomial) plot(Norm ~ Cycles, df, main="Normalized view",
xlab=expression(bold("Time")), ylab=expression(bold("Signal
(normalized)")), type="p", col="cyan")
lines(b_model$fitted.values ~ df$Cycles, col="blue", lwd=2)`
> > > 
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
https://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible
code.
> > 
> > --
> > Best regards,
> > Luigi
> 
> 
> 
> 
> --
> Best regards,
> Luigi-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 603 bytes
Desc: OpenPGP digital signature
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20250423/a1857a25/attachment.sig>

Luigi Marongiu

2025-Apr-24 19:27 UTC

head link

[R] How to find inverse of glm model?

Thank you. Following your tips I tried to guess the starting values
using an approached that determined (1) the background level (the
fluorescence before the take over signal), (2) the slope between this
point and the maximum:
```
df <- data.frame(Cycle=1:35,
              Fluo=c(15908.54,  16246.92,  16722.43,  17104.29,
16794.93,  17031.44,  17299.05,  17185.49,
                     17362.28,  17127.43,  17368.96,  17513.19,
17593.95,  17626.37,  18308.29,  18768.12,
                     19955.26,  22493.85,  27088.12,  36539.44,
53694.18,  84663.18, 138835.64, 223331.89,
                     336409.69, 457729.88, 561233.12, 637536.31,
688985.88, 723158.56, 746575.62, 766274.75,
                     776087.75, 785144.81, 791573.81)
              )

# estimate starting values
S = sd(df$Fluo[1:3])*10
for (j in 4:nrow(df)) {
  x = df$Fluo[1:j]
  s = sd(x)
  q = df$Fluo[j]
  cat("SD = ", S, "Fluo = ", q, "\n")
  if(q<S) {
    break
  } else {
    S = sd(df$Fluo[1:j])*10
  }
}
MIN_FLUO = q
MIN_IDX = which (df$Fluo == MIN_FLUO)
MIN_CYC = df$Cycle[MIN_IDX]
MAX_FLUO = max(df$Fluo)
MAX_IDX = which (df$Fluo == MAX_FLUO)
MAX_CYC = df$Cycle[MAX_IDX]
HLF_CYC = MIN_CYC + (MAX_CYC - MIN_CYC)/2
q_mod = lm(Fluo~Cycle, df[MIN_IDX:MAX_IDX,])
SLP = q_mod$coefficients[2]

# plot
plot(Fluo~Cycle, df)
points(MIN_FLUO~MIN_CYC, col="blue", pch=16)
points(MAX_FLUO~MAX_CYC, col="red", pch=16)
abline(h=MAX_FLUO, col="red")
abline(q_mod, col="green")
abline(v=HLF_CYC, col="gold")
abline(h=S, col="purple")
legend("topleft", legend=c("HalfFluoCycle",
"MaxFluo", "Slope", "Noise"),
       lty=1, col=c("gold", "red", "green",
"purple"),
       title="Estimated parameters", lwd=2)
legend("left", legend=c("Min", "Max"),
title="Selected interval",
       pch=16, col=c("blue", "red"))

# regression
mod1 = nls(Fluo ~ (MaxFluo / (1+ exp(-(Cycle - HalfFluoCycle)/Slope)) + Noise),
          data=df, start=list(MaxFluo=MAX_FLUO,
                              HalfFluoCycle=HLF_CYC,
                              Slope=SLP,
                              Noise=S))
mod2 = nls(Fluo ~ (MaxFluo / (1+ exp(-(Cycle - HalfFluoCycle)/Slope)) + Noise),
          data=df, start=list(MaxFluo=MAX_FLUO,
                              HalfFluoCycle=HLF_CYC,
                              Slope=2,
                              Noise=S))
lines(df$Cycle, predict(mod2), col = "orange", lwd = 2)
```
Model 1 failed probably because SLP = 57724.29 , which is a weird
slope value; model 2 instead worked, using your suggested value of 2.
If I could hard-code the slope to 2, unless there is a better way to
determine such a slope, then I think I could try to extend the
approach to other data...
Best regards
Luigi

On Wed, Apr 23, 2025 at 6:43?PM Gregg Powell <g.a.powell at
protonmail.com> wrote:>
> Hello Luigi,
>
> Great follow-up ? Looks like you?re on the right track using nls() for
nonlinear regression. You're fitting a logistic-like sigmoidal model (as in
Rutledge?s paper), I think both errors you?re encountering are signs of bad
starting values or a maybe model formulation issue.
>
> Breakdown of Your Model Attempt
> Your model formula:
> > Fluo ~ (MaxFluo / (1 + exp(-(Cycle - (HalfFluoCycle / Slope)))) + 1)
>
> This expression has perhaps two issues:
>
> 1.   Parameter placement: In the paper, the term inside exp() should look
like -(Cycle - HalfFluoCycle)/Slope ? not Cycle - (HalfFluoCycle / Slope).
>
> 2.   +1 additive constant: You called it backgroundSignal, but it?s
hard-coded as +1. That constant should be a parameter (e.g., Bkg), otherwise the
model fit has no way to adjust it.
>
> Alternative Model Form:
> Here?s a likely corrected version of the model based on Rutledge?s
sigmoidal fit:
>
> > mod1 <- nls(Fluo ~ MaxFluo / (1 + exp(-(Cycle -
HalfFluoCycle)/Slope)) > + Bkg,
> >            data = df,
> >            start = list(MaxFluo = max(df$Fluo),
> >                         HalfFluoCycle = 20,
> >                         Slope = 2,
> >                         Bkg = min(df$Fluo)))
>
> Notes on nls() Convergence
> nls() is very sensitive to starting values. Always inspect your plot and
roughly estimate parameters:
>
> MaxFluo: Use max(Fluo)
>
> Bkg: Use min(Fluo)
>
> HalfFluoCycle: Approximate where the curve inflects
>
> Slope: Try values like 1, 2, 5 depending on steepness
>
> You can also add trace=TRUE to watch convergence progress:
>
> > mod1 <- nls(..., trace=TRUE)
>
> About SSmicmen
> From what I've gathered - this is meant for fitting the
Michaelis-Menten model (enzyme kinetics), not sigmoidal growth. That's why
mod2 fails ? wrong self-starting model function.
>
>
> Visualizing the Fit
> After fitting:
>
> > plot(Fluo ~ Cycle, data = df)
> > lines(df$Cycle, predict(mod1), col = "red", lwd = 2)
>
> You might also want to compute the inverse (e.g., solving for Cycle given a
fluorescence value) ? that would involve solving the nonlinear equation manually
or using uniroot().
>
> All the best,
> gregg
>
>
>
>
> On Wednesday, April 23rd, 2025 at 8:23 AM, Luigi Marongiu
<marongiu.luigi at gmail.com> wrote:
>
> >
>
> >
>
> > Further on this, I have found a formula from a paper from Rutledge
> > (DOI: 10.1093/nar/gnh177), which I rendered as
> > `MaxFluo / (1+ exp(-(Cycle-(HalfFluoCycle/Slope)))) +
backgroundSignal`
> > I then see that one can use `nls` to fit non-linear regression, so I
tried:
> > `df <- data.frame(Cycle=1:35, Fluo=c(15908.54, 16246.92, 16722.43,
17104.29, 16794.93, 17031.44, 17299.05, 17185.49, 17362.28, 17127.43, 17368.96,
17513.19, 17593.95, 17626.37, 18308.29, 18768.12, 19955.26, 22493.85, 27088.12,
36539.44, 53694.18, 84663.18, 138835.64, 223331.89, 336409.69, 457729.88,
561233.12, 637536.31, 688985.88, 723158.56, 746575.62, 766274.75, 776087.75,
785144.81, 791573.81) ) plot(Fluo~Cycle, df) mod1 = nls(Fluo~(MaxFluo / (1+
exp(-(Cycle-(HalfFluoCycle/Slope)))) + 1), data=df,
start=list(MaxFluo=max(df$Fluo), HalfFluoCycle=15, Slope=0.1)) mod2 =
nls(Fluo~SSmicmen(Fluo, Cycle), data=df)`
> > but I got errors in both cases:
> > ```
> >
>
> > > mod1
> >
>
> > Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff,
> > nDcentral = nDcntr) :
> > singular gradient matrix at initial parameter estimates
> >
>
> > > mod2
> >
>
> > Error in qr.qty(QR.rhs, .swts * ddot(attr(rhs, "gradient"),
lin)) :
> > NA/NaN/Inf in foreign function call (arg 5)
> > ```
> > How can I properly set this regression model?
> > Thank you
> >
>
> >
>
> > On Wed, Apr 16, 2025 at 7:08?AM Luigi Marongiu marongiu.luigi at
gmail.com wrote:
> >
>
> > > Thank you. This topic is more complicated than anticipated. Best
regards, Luigi
> > >
>
> > > On Tue, Apr 15, 2025 at 11:09?PM Andrew Robinson apro at
unimelb.edu.au wrote:
> > >
>
> > > > A statistical (off-topic!) point to consider: when the GLM
was fitted, you conditioned on x and let y be the random variable. Therefore the
model supports predictions of y conditional on x. You?re now seeking to make
predictions of x conditional on y. That?s not the same thing, even in OLS.
> > > >
>
> > > > It might not matter for your application but it?s probably
worth thinking about. Simulation experiments might shed some light on that.
> > > >
>
> > > > Cheers, Andrew
> > > >
>
> > > > --
> > > > Andrew Robinson
> > > > Chief Executive Officer, CEBRA and Professor of Biosecurity,
> > > > School/s of BioSciences and Mathematics & Statistics
> > > > University of Melbourne, VIC 3010 Australia
> > > > Tel: (+61) 0403 138 955
> > > > Email: apro at unimelb.edu.au
> > > > Website: https://researchers.ms.unimelb.edu.au/~apro at
unimelb/
> > > >
>
> > > > I acknowledge the Traditional Owners of the land I inhabit,
and pay my respects to their Elders.
> > > >
>
> > > > On 16 Apr 2025 at 1:01?AM +1000, Gregg Powell via R-help
r-help at r-project.org, wrote:
> > > >
>
> > > > Take a look at this Luigi...
> > > >
>
> > > > # The model is: logit(p) = ?? + ??*Cycles
> > > > # Where p is the probability (or normalized value in your
case)
> > > >
>
> > > > # The inverse function would be: Cycles = (logit??(p) -
??)/??
> > > > # Where logit?? is the inverse logit function (also called
the expit >function)
> > > >
>
> > > > # Extract coefficients from your model
> > > > intercept <- coef(b_model)[1]
> > > > slope <- coef(b_model)[2]
> > > >
>
> > > > # Define the inverse function
> > > > inverse_predict <- function(p) {
> > > > # p is the probability or normalized value you want to find
the >cycles for
> > > > # Inverse logit: log(p/(1-p)) which is the logit function
> > > > logit_p <- log(p/(1-p))
> > > >
>
> > > > # Solve for Cycles: (logit(p) - intercept)/slope
> > > > cycles <- (logit_p - intercept)/slope
> > > >
>
> > > > return(cycles)
> > > > }
> > > >
>
> > > > # Example: What cycle would give a normalized value of 0.5?
> > > > inverse_predict(0.5)
> > > >
>
> > > > This function takes a probability (normalized value between
0 and 1) and returns the cycle value that would produce this probability
according to your model.
> > > > Also:
> > > > This works because GLM with binomial family uses the logit
link function by default
> > > > The inverse function will return values outside your
original data range if needed
> > > > This won't work for p=0 or p=1 exactly (you'd get
-Inf or Inf), so you might want to add checks
> > > >
>
> > > > All the best,
> > > > Gregg
> > > >
>
> > > > On Tuesday, April 15th, 2025 at 5:57 AM, Luigi Marongiu
marongiu.luigi at gmail.com wrote:
> > > >
>
> > > > I have fitted a glm model to some data; how can I find the
inverse
> > > > function of this model? Since I don't know the y=f(x)
implemented by
> > > > glm (this works under the hood), I can't define a
f??(y).
> > > > Is there an R function that can find the inverse of a glm
model?
> > > > Thank you.
> > > >
>
> > > > The working example is:
> > > > `V = c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16,
-17.39, -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 15.01,
75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 5059.94, 6071.93,
6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19, 10452.02, 10751.81,
11017.49, 11240.37, 11427.47, 11570.07, 11684.96, 11781.77, 11863.35, 11927.44,
11980.81, 12021.88, 12058.35, 12100.63, 12133.57, 12148.89, 12137.09) df =
data.frame(Cycles = 1:35, Values = V[1:35]) M = max(df$Values) df$Norm =
df$Values/M df$Norm[df$Norm<0] = 0 b_model = glm(Norm ~ Cycles, data=df,
family=binomial) plot(Norm ~ Cycles, df, main="Normalized view",
xlab=expression(bold("Time")), ylab=expression(bold("Signal
(normalized)")), type="p", col="cyan")
lines(b_model$fitted.values ~ df$Cycles, col="blue", lwd=2)`
> > > >
>
> > > > ______________________________________________
> > > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and
more, see
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide
https://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible
code.
> > >
>
> > > --
> > > Best regards,
> > > Luigi
> >
>
> >
>
> >
>
> >
>
> > --
> > Best regards,
> > Luigi


-- 
Best regards,
Luigi

R help - Apr 2025 - How to find inverse of glm model?

[R] How to find inverse of glm model?

[R] How to find inverse of glm model?