Conti, David
2010-Apr-06 21:33 UTC
[R] estimating the starting value within a ODE using nls and lsoda
All- I am interested in estimating a parameter that is the starting value for an ODE model. That is, in the typical combined fitting procedure using nls and lsoda (alternatively rk4), I first defined the ODE model: minmod <- function(t, y, parms) { G <- y[1] X <- y[2] with(as.list(parms),{ I_t <- approx(time, I.input, t)$y dG <- -1*(p1 + X)*G +p1*G_b dX <- -1*p2*X + p3*(I_t-I_b) list(c(dG, dX)) }) } Then I estimated the parameters of the model using nls: fit.rk4 <- nls(noisy ~ rk4(p4, time, minmod, parms=c(p1,p2,p3)) However, my goal is to not only estimate p1, p2 and p3, but also to estimate the starting value p4 from the data. I am currently using the following procedure using optim to accomplish this: step 1 (define a DevFun to optimize): DevFunc <- function(p4) { fit.rk4 <- nls(noisy ~ rk4(p4, time, ODE.model, params) r <- deviance(fit.rk4) r } step 2 (optimize with optim): fit <- optim(inits, DevFunc, method="L-BFGS-B", hessian=T, lower=lower, upper=upper) I have explored this via simulation and with real data and while the procedure works for most cases, there are a few cases that cause difficulty. I think it has to do with estimating p1, p2 and p3 conditional on p4 as opposed to estimating all parameters jointly. Does anyone know of a way in [R] to estimate the starting value jointly with the parameters within a ODE model? I have included more detailed code below to "spin a perfect cycle" to give a better idea of what I am trying to accomplish. Thanks in advance, Dave ########################################### require(odesolve) ### ODE model minmod <- function(t, y, parms) { G <- y[1] X <- y[2] with(as.list(parms),{ I_t <- approx(time, I.input, t)$y dG <- -1*(p1 + X)*G +p1*G_b dX <- -1*p2*X + p3*(I_t-I_b) list(c(dG, dX)) }) } ##### simulate data I.input <- c(5.8,7.8,11.5,10.1,9.9,9.8,12.1,10.7,11.1,11.5,11.2,10.3,171.0,179.0,145.6,147.0,105.0,20.2,15.3,14.7,12.3,10.4,7.7,7.0,7.5,4.9,5.6,5.8) time <- c(0,1,3,4,6,7,8,10,12,14,16,19,22,23,24,25,27,40,50,60,70,80,90,100,120,140,160,180) G_b <- 100 I_b <- 5.8 parms <- c(p1=0.012584, p2=0.037708, p3=0.000012524) xstart <- c(G=273.92,X=0) sim.data <- as.data.frame(rk4(y=xstart, time=time, func=minmod, parms=parms)) # note: i use rk4 because the standard software for estimating this model uses a Runge-Kutta approach sim.data$noisy <- sim.data$G + rnorm(nrow(sim.data), mean=0, sd=0.01*sim.data$G) ###### estimation Weight <- c(0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) xstart.est <- c(G=mean(sim.data$noisy[1:4]), X=0) # inital fit of nls fit.rk4 <- nls(noisy ~ rk4(xstart.est, time, minmod, c(p1=p1, p2=p2, p3=p3))[,2], data=sim.data, start=list(p1=0.025, p2=0.02, p3=0.00004), trace=F, control=list(tol=1e-4, minFactor=1/1024/1024, warnOnly=T), weight=(1/(G*0.015))*Weight ) # define deviance function to optimize DevFunc <- function(p4) { fit.rk4 <- nls(noisy ~ rk4(c(G=p4, X=0), time, minmod, c(p1=p1, p2=p2, p3=p3))[,2], data=sim.data, start=list(p1=summary(fit.rk4)$coef[1,1], p2=summary(fit.rk4)$coef[2,1], p3=summary(fit.rk4)$coef[3,1]), trace=F, control=list(tol=1e-4, minFactor=1/1024/1024, warnOnly=T), weight=(1/(G*0.015))*Weight ) r <- deviance(fit.rk4) r } ##### estimation via optim inits=c(mean(sim.data$noisy[1:5])) lower=G_b upper=500 fit <- optim(inits, DevFunc, method="L-BFGS-B", hessian=T, lower=lower, upper=upper) #extract p1, p2, and p3 conditional on estimated p4: fit.rk4 <- nls(noisy ~ rk4(c(G=fit$par, X=0), time, minmod, c(p1=p1, p2=p2, p3=p3))[,2], data=sim.data, start=list(p1=summary(fit.rk4)$coef[1,1], p2=summary(fit.rk4)$coef[2,1], p3=summary(fit.rk4)$coef[3,1]), trace=F, control=list(tol=1e-4, minFactor=1/1024/1024, warnOnly=T), weight=(1/(G*0.015))*Weight ) # get all four parameters: p <- list(p1=summary(fit.rk4)$coef[1,1], p2=summary(fit.rk4)$coef[2,1], p3=summary(fit.rk4)$coef[3,1], p4=fit$par) p David V. Conti, Ph.D. Associate Professor University of Southern California Zilkha Neurogenetics Institute Keck School of Medicine Department of Preventive Medicine Division of Biostatistics Mailing Address: University of Southern California 1501 San Pablo Street ZNI 101, MC2821 Los Angeles, CA 90089-2821 Physical Address: USC/Harlyne Norris Research Tower 1450 Biggy Street NRT 2509L Los Angeles, CA 90033 Tel: 323-442-3140 Fax: 323-442-2448 Email: dconti at usc.edu
Thomas Petzoldt
2010-Apr-08 20:17 UTC
[R] estimating the starting value within a ODE using nls and lsoda
Hi Dave, first of all, fitting starting values of a dynamic model the same way like its parameters is indeed the usual method. In that case parameters *and* some or all initial value(s) of the dynamic model are both in fact 'parameters' for the statistical model fitting problem. Fitting a nonlinear model can be easy or problematic or even impossible, depending on the data and the model structure. In such cases one speaks about "identifiability" and there are several methods that can help to find parameter combinations that can be identified simultaneously. It is not important whether such a statistical parameter was originally a 'parameter' or an 'initial value' of the dynamic model, it simply depends on collinearity of the problem: http://en.wikipedia.org/wiki/Multicollinearity Several methods for identifiability analysis are provided in the CRAN package "FME" (Flexible Modelling Environment), which comes with extensive documentation (package vignettes as pdf files) and examples and there is a recent paper in the Journal of Statistical Software: http://www.jstatsoft.org/v33/i03 Look for function 'collin' that implements a collinearity index. In addition, function 'modFit', that is also in this package provides an interface to use different optimizers of R in a unique way, namely nls, nlminb, optim, nls.lm (from package minpack), and pseudoOptim, a pseudo-random search algorithm. Another hint, because you are still using "odesolve" (from Woodrow Setzer). This package has now a direct and compatible successor "deSolve" (Soetaert, Petzoldt, Setzer), see: http://www.jstatsoft.org/v33/i09 deSolve is even more stable than the former version and contains many many more solvers, not only lsoda and rk4, can handle other types of differential equations too and has much more documentation. Hope it helps! Thomas Petzoldt PS: There is also a dedicated mailing list for such questions: https://stat.ethz.ch/mailman/listinfo/r-sig-dynamic-models -- Thomas Petzoldt Technische Universitaet Dresden Institut fuer Hydrobiologie 01062 Dresden GERMANY http://tu-dresden.de/Members/thomas.petzoldt