ivo welch
2010-May-09 20:14 UTC
[R] non-linear estimation with many firm-specific parameters
Dear R experts,

I doubt that someone has already solved my problem, but I thought I would ask quickly, just in case someone has.

Let's say I start with a (flattened panel) model that says

    y[i] = x[i] + b*(T-x[i])

easy enough: this is just a linear model. I could also make this a fixed-effects model if I change T to T[fmid], where fmid is the firm's id. I know I can do this faster, but logically, what I want to estimate is lm( y ~ as.factor(fmid) + x ). I have about 100,000 observations, and about 10,000 firm ids.

now, let me move to a world in which b is a function of the distance between T and x, b = a+c*(T-x[i])^2:

    y[i] = x[i] + b(T,x[i]) * (T-x[i])
         = x[i] + (a+c*(T-x[i])^2) * (T-x[i])

R solves this nicely with the nls() function in about 5 seconds. The results are estimates for a, c, and T.

here comes the hard part. I want to make the T again a function of each firm, i.e., T[fmid]. in a sense, I want

    y[i] = x[i] + (a+c*(T[fmid]-x[i])^2) * (T[fmid] - x[i])

where the firm-specific constants are supposed to be the same in the two terms (i.e., not the permutative set).

the usual trick to speed up fixed-effects estimations (i.e., subtracting out the means) does not work here, because the problem is non-linear.

I am thinking about expanding the dummies into an appropriate matrix, then coding my problem into an objective function, and letting R optimize over my, ahem, 10,000 or so T[i], a, and c. I fear that this would not only overwhelm my CPU (taking a few days, which would be ok), but overwhelm my memory, too. maybe it is just plain infeasible.

has anyone seen someone else work on such a problem?

sincerely,

/ivo welch

----
Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)
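
For concreteness, here is a minimal sketch of the objective-function idea described above, on small simulated data. Everything in it is an assumption made for illustration: the sample sizes, the "true" parameter values, the starting values, the helper names (resid_fn, sse, sse_grad), the coding of fmid as integers 1..n_firms, and the choice of optim() with method "L-BFGS-B". It is not a tested recipe for the full 100,000 x 10,000 problem. The point it tries to show is that T[fmid] can be obtained by indexing the parameter vector, so no dummy matrix is needed (a dense 100,000 x 10,000 matrix of doubles would take roughly 8 GB), and that the gradient with respect to the firm-specific T can be accumulated per firm with rowsum().

```r
# Simulated data; all sizes and "true" values are made up for illustration.
set.seed(1)
n_firms <- 50
n_per   <- 10
fmid    <- rep(seq_len(n_firms), each = n_per)  # firm id, assumed coded 1..n_firms
T_true  <- rnorm(n_firms, mean = 1, sd = 0.5)
x       <- rnorm(n_firms * n_per)
d_true  <- T_true[fmid] - x
y       <- x + (0.5 + 0.1 * d_true^2) * d_true + rnorm(length(x), sd = 0.1)

# Parameter vector layout: par = c(a, c, T[1], ..., T[n_firms]).
resid_fn <- function(par) {
  d <- par[-(1:2)][fmid] - x                    # T[fmid] - x[i], by indexing
  list(r = y - x - (par[1] + par[2] * d^2) * d, # residuals
       d = d)
}

# Objective: sum of squared residuals.
sse <- function(par) sum(resid_fn(par)$r^2)

# Analytic gradient of the objective with respect to (a, c, T[1..n_firms]).
sse_grad <- function(par) {
  z   <- resid_fn(par)
  # d(fitted)/dT = a + 3*c*d^2; sum the contributions within each firm.
  g_T <- rowsum(z$r * (par[1] + 3 * par[2] * z$d^2), fmid)
  -2 * c(sum(z$r * z$d), sum(z$r * z$d^3), as.vector(g_T))
}

# Starting values: with a = 1 and c = 0 the model reduces to y = T[fmid],
# so firm means of y are a natural first guess for T.
start <- c(1, 0, as.vector(tapply(y, fmid, mean)))

fit <- optim(start, sse, sse_grad, method = "L-BFGS-B",
             control = list(maxit = 500))

a_hat <- fit$par[1]
c_hat <- fit$par[2]
T_hat <- fit$par[-(1:2)]
```

On the design choice: "L-BFGS-B" keeps only a few vectors of past updates, whereas method "BFGS" maintains a dense approximation to the inverse Hessian, which for about 10,000 parameters would itself be a 10,000 x 10,000 matrix. Whether this converges in reasonable time on the real data, and how sensitive it is to the starting values, would need to be checked.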