Dimitri Liakhovitski <dimitri.liakhovitski <at> gmail.com> writes:
> I have 2 variables - predictor "pred" and response variable
"DV":
>
> pred<-c(439635.053, 222925.718, 668434.755, 194242.330, 5786.321,
115537.344,
> 100835.368, 7133.206, 159058.286, 4079991.629, 3380078.060, 2661279.136,
> 2698324.478, 1245213.965, 1901815.503, 1517019.451, 1396857.736,
1034030.988,
> 432249.574, 342329.325, 1831335.792, 2209578.859, 1641709.754, 1329308.669,
> 1251794.367, 731368.430, 1705626.983, 673535.171, 242519.280, 57251.998,
> 5728.821, 2054514.244, 301954.819, 773955.355, 735497.506, 347355.976,
> 1678175.153, 133082.395, 591326.289, 30866.182, 27235.846, 118372.342,
> 71590.969, 84813.299, 366146.153, 1391725.205, 763199.746, 1216661.202,
> 263878.157, 930832.769, 261270.130, 589303.561, 455137.946,
> 954655.201, 873434.054)
> (pred)
> DV<-c(0.55351297,0.27616943,0.58134926,0.33887159,0.03092546,0.14928061,
> 0.11836759,0.01719463,0.03258188,1.81205587,2.86657699,2.49491195,
> 3.09727230,1.95648776,2.28106268,1.78978179,1.74003678,1.22520393,
> 0.54245878,0.41483039,1.08731493,2.19581289,1.60516129,1.30723431,
> 1.41822649,1.31530539,2.02406576,1.22211412,0.52055790,0.12975522,
> 0.01416903,0.61043485,0.44141748,0.64327070,0.53607039,0.32603820,
> 1.77261016,0.42035756,0.37853917,0.12342486,0.06607710,0.02383682,
> 0.08421590,0.09255332,0.23644909,1.67921092,1.26864432,1.38654574,
>
1.29833020,1.76873555,0.93363677,1.01857658,0.81359775,2.14758239,2.41583852)
> (DV)
>
> Both "pred" and "DV" above are time series (observed
across 55
> months). The relationship between them is pre-specified. In this
> relationship, the (predicted) "DV" at time t is a specific
function of
> itself at time t-1, of "pred" at time t, and of 2 scalars - a
and b.
> I have to find optimal a and b that would ensure the best fit between
> the observed DV and the predicted DV. Below is the function I have to
> optimize:
>
> my.function <- function(param){
> a<-param[1]
> b<-param[2]
> DV_pred <- rep(0,length(pred))
> for(i in 2:length(pred)){
> DV_pred[i] <- 1 - ( (1 - DV_pred[i-1] * a) / (exp(pred[i] * b)) )
> }
> DV_pred[1]<-DV[1]
> correl <- cor(DV,DV_pred)
> return(correl)
> }
>
> a has to be between 0.001 and 0.75
> b has to be positive.
Rather than worry about optimization routines, I think
you need to think more carefully about what your objective
function is actually doing. I played around with this a bit
and got something reasonable. You only have two parameters, so it shouldn't
be too hard to figure out what's going on by appropriate exploration.
matplot(cbind(pred,DV),log="y")
## split objective function into two parts, one that computes
## the predicted value ...
calc_DV_pred <- function(a,b) {
DV_pred <- rep(0,length(pred))
DV_pred[1]<-DV[1]
for(i in 2:length(pred)){
DV_pred[i] <- 1 - ( (1 - DV_pred[i-1] * a) / (exp(pred[i] * b)) )
}
DV_pred
}
## ... and the other (wrapper) to compute the correlation
## I switched to estimating b on a logarithmic scale
my.function <- function(param){
a<-param[1]
b<-exp(param[2])
correl <- cor(DV,calc_DV_pred(a,b))
return(correl)
}
## try out the function for various values
my.function(c(0.5,-5))
my.function(c(0.5,-6))
my.function(c(0.5,-9))
my.function(c(0.5,1.1))
my.function(c(0.5,1.2))
## try to fit
opt1 <- optim(fn=my.function,par=c(a=0.5,b=-9),
method="L-BFGS-B",
lower=c(0.001,-17),
upper=c(0.75,Inf),
control=list(fnscale=-1))
## look at objective function surface
library(emdbook)
cc <- curve3d(my.function(c(x,y)),xlim=c(0.001,0.75),ylim=c(-18,1),
n=c(50,50),sys3d="contour")
cc2 <- curve3d(my.function(c(x,y)),xlim=c(0.001,0.75),ylim=c(-16,-12),
n=c(50,50),sys3d="contour")
points(opt1$par[1],opt1$par[2])
DV_pred <- calc_DV_pred(opt1$par[1],exp(opt1$par[2]))
matplot(cbind(pred,DV,DV_pred),log="y",type="l",col=c(1,2,4))
In hindsight, my initial difficulty (and possibly yours as well)
was starting with a value of b that was much too large.