I'm drawing a fitted normal distribution over a histogram. The use case is trivial (fitting normal distributions on densities) but I want to extend it to other fitting scenarios. What has stumped me so far is how to take the list that is returned by nls() and use it for curve(). I realize that I can easily do all of this with a few intermediate steps for any specific case. But I had expected that it should be possible to get a parametrized(!) function that computes predictions as one of the returned objects. I.e. I want a function that works with the model of nls() like abline() works() for lm(). I know that I can just pass parameters from coef(fit). But in the general case, I don't know how many parameters there are. I thought since nls() is able to put together the parametrized function internally, that function would be passed into the results object. But that doesn't seem to be the case. And though I could hack this together with parsing fit$m$formula to get() the formula from the environment and then paste() the parameter list in there - that sounds really, really awkward. So I hope that there's an obvious way that I have overlooked. Here's sample code to illustrate. : fitNorm <- function(x, y) { # fit a normal distribution # Param: x domain # y densities # Value: the fit object F <- function(x, a, mu, sig) { # some parametrized function return( ( a / (sig*sqrt(2*pi)) ) * exp( (-1/2)*((x-mu)/sig)^2 ) ) } mu <- weighted.mean(x, y) # estimate starting values sig <- sd(sample(x, 1000, prob = y, replace = TRUE)) a <- 1 fit <- nls(y ~ F(x, a, mu, sig), start = c(a = a, mu = mu, sig = sig)) # starting values return(fit) } # Fit and plot ...: # Values x <- c(rnorm(5000, 3, 5), rnorm(2000, -5, 7)) # Two normal distributions ... # Histogram h <- hist(x, freq = FALSE, breaks = seq(min(x)-2, max(x)+2, by = 1), col = "#cfd7fa", main = "", ylab = "density", xlab = "x") # Fit myFit <- fitNorm(h$mids, h$density) # Now: what I can do is, patch together the model function ... mF <- function(x, a = coef(myFit)["a"], mu = coef(myFit)["mu"], sig = coef(myFit)["sig"]){ a / (sig*sqrt(2*pi)) * exp( (-1/2)*((x-mu)/sig)^2 ) } # ... and add the curve: curve(mF(x), from = par("usr")[1], to = par("usr")[2], col = "#FF000055", lwd = 2, add = TRUE) # But what I would like to do is something much more general, like: curve(myFit$func, from = myFit$from, to = myFit$to, col = "#FF000055", lwd = 2, add = TRUE)) Thanks for any ideas and strategies. Cheers, Boris -- Boris Steipe MD, PhD University of Toronto Associate Professor, Department of Biochemistry and Department of Molecular Genetics
I haven't followed your example closely, but can't you use the predict() method for this? To draw a curve, the function that will be used in curve() sets up a newdata dataframe and passes it to predict(fit, newdata= ...) to get predictions at those locations. Duncan Murdoch On 17/10/2020 5:27 a.m., Boris Steipe wrote:> I'm drawing a fitted normal distribution over a histogram. The use case is trivial (fitting normal distributions on densities) but I want to extend it to other fitting scenarios. What has stumped me so far is how to take the list that is returned by nls() and use it for curve(). I realize that I can easily do all of this with a few intermediate steps for any specific case. But I had expected that it should be possible to get a parametrized(!) function that computes predictions as one of the returned objects. > > I.e. I want a function that works with the model of nls() like abline() works() for lm(). > > I know that I can just pass parameters from coef(fit). But in the general case, I don't know how many parameters there are. I thought since nls() is able to put together the parametrized function internally, that function would be passed into the results object. But that doesn't seem to be the case. And though I could hack this together with parsing fit$m$formula to get() the formula from the environment and then paste() the parameter list in there - that sounds really, really awkward. > > So I hope that there's an obvious way that I have overlooked. > > Here's sample code to illustrate. : > > fitNorm <- function(x, y) { > # fit a normal distribution > # Param: x domain > # y densities > # Value: the fit object > F <- function(x, a, mu, sig) { # some parametrized function > return( ( a / (sig*sqrt(2*pi)) ) * exp( (-1/2)*((x-mu)/sig)^2 ) ) > } > > mu <- weighted.mean(x, y) # estimate starting values > sig <- sd(sample(x, 1000, prob = y, replace = TRUE)) > a <- 1 > > fit <- nls(y ~ F(x, a, mu, sig), > start = c(a = a, mu = mu, sig = sig)) # starting values > return(fit) > } > > # Fit and plot ...: > > # Values > x <- c(rnorm(5000, 3, 5), rnorm(2000, -5, 7)) # Two normal distributions ... > > # Histogram > h <- hist(x, freq = FALSE, > breaks = seq(min(x)-2, max(x)+2, by = 1), > col = "#cfd7fa", > main = "", ylab = "density", xlab = "x") > # Fit > myFit <- fitNorm(h$mids, h$density) > > # Now: what I can do is, patch together the model function ... > mF <- function(x, > a = coef(myFit)["a"], > mu = coef(myFit)["mu"], > sig = coef(myFit)["sig"]){ > a / (sig*sqrt(2*pi)) * exp( (-1/2)*((x-mu)/sig)^2 ) > } > # ... and add the curve: > curve(mF(x), > from = par("usr")[1], to = par("usr")[2], > col = "#FF000055", lwd = 2, add = TRUE) > > > # But what I would like to do is something much more general, like: > curve(myFit$func, from = myFit$from, to = myFit$to, > col = "#FF000055", lwd = 2, add = TRUE)) > > > Thanks for any ideas and strategies. > > Cheers, > Boris > > > > > > > -- > Boris Steipe MD, PhD > University of Toronto > Associate Professor, Department of Biochemistry and > Department of Molecular Genetics > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
I had tried predict() but overlooked that it is picky about needing the data in a list-like object. When I passed a vector it just returned the predictions on the original values. Passing a list (as in the help example... of course) does the trick. This works - and does exactly what I had hoped for: curve(predict(myFit, list(x = x)), col = "#FF000055", lwd = 2, add = TRUE) Very grateful! Boris> On 2020-10-17, at 20:26, Duncan Murdoch <murdoch.duncan at gmail.com> wrote: > > > I haven't followed your example closely, but can't you use the predict() > method for this? To draw a curve, the function that will be used in > curve() sets up a newdata dataframe and passes it to predict(fit, > newdata= ...) to get predictions at those locations. > > Duncan Murdoch > > On 17/10/2020 5:27 a.m., Boris Steipe wrote: >> I'm drawing a fitted normal distribution over a histogram. The use case is trivial (fitting normal distributions on densities) but I want to extend it to other fitting scenarios. What has stumped me so far is how to take the list that is returned by nls() and use it for curve(). I realize that I can easily do all of this with a few intermediate steps for any specific case. But I had expected that it should be possible to get a parametrized(!) function that computes predictions as one of the returned objects. >> >> I.e. I want a function that works with the model of nls() like abline() works() for lm(). >> >> I know that I can just pass parameters from coef(fit). But in the general case, I don't know how many parameters there are. I thought since nls() is able to put together the parametrized function internally, that function would be passed into the results object. But that doesn't seem to be the case. And though I could hack this together with parsing fit$m$formula to get() the formula from the environment and then paste() the parameter list in there - that sounds really, really awkward. >> >> So I hope that there's an obvious way that I have overlooked. >> >> Here's sample code to illustrate. : >> >> fitNorm <- function(x, y) { >> # fit a normal distribution >> # Param: x domain >> # y densities >> # Value: the fit object >> F <- function(x, a, mu, sig) { # some parametrized function >> return( ( a / (sig*sqrt(2*pi)) ) * exp( (-1/2)*((x-mu)/sig)^2 ) ) >> } >> >> mu <- weighted.mean(x, y) # estimate starting values >> sig <- sd(sample(x, 1000, prob = y, replace = TRUE)) >> a <- 1 >> >> fit <- nls(y ~ F(x, a, mu, sig), >> start = c(a = a, mu = mu, sig = sig)) # starting values >> return(fit) >> } >> >> # Fit and plot ...: >> >> # Values >> x <- c(rnorm(5000, 3, 5), rnorm(2000, -5, 7)) # Two normal distributions ... >> >> # Histogram >> h <- hist(x, freq = FALSE, >> breaks = seq(min(x)-2, max(x)+2, by = 1), >> col = "#cfd7fa", >> main = "", ylab = "density", xlab = "x") >> # Fit >> myFit <- fitNorm(h$mids, h$density) >> >> # Now: what I can do is, patch together the model function ... >> mF <- function(x, >> a = coef(myFit)["a"], >> mu = coef(myFit)["mu"], >> sig = coef(myFit)["sig"]){ >> a / (sig*sqrt(2*pi)) * exp( (-1/2)*((x-mu)/sig)^2 ) >> } >> # ... and add the curve: >> curve(mF(x), >> from = par("usr")[1], to = par("usr")[2], >> col = "#FF000055", lwd = 2, add = TRUE) >> >> >> # But what I would like to do is something much more general, like: >> curve(myFit$func, from = myFit$from, to = myFit$to, >> col = "#FF000055", lwd = 2, add = TRUE)) >> >> >> Thanks for any ideas and strategies. >> >> Cheers, >> Boris >> >> >> >> >> >> >> -- >> Boris Steipe MD, PhD >> University of Toronto >> Associate Professor, Department of Biochemistry and >> Department of Molecular Genetics >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >