dear giovanni---
thanks for answering me on r-help as well as privately. I very much
appreciate your responses. I read the plm vignette. I don't have the book,
so I can't consult it. :-(. I am going to post this message now (rather
than just emailing it privately), because other amateurs may have similar
questions in the future and may find this message and your answers via google.
Real statisticians: please don't waste your time.
so here is my amateur interpretation of GMM in general and Arellano-Bond
and Blundell-Bond specifically. I will do an example with T=4. The model is
x(i,t) = a*x(i,t-1) + u(i,t)
ie
x(i,2) = a*x(i,1) + u(i,2)
x(i,3) = a*x(i,2) + u(i,3)
x(i,4) = a*x(i,3) + u(i,4)
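to make this concrete, here is a small simulation of that data-generating
process. the function name and the iid normal shocks are just my own choices
for illustration:
simulate.panel <- function( NF=7, NT=4, a=0.5 ) {
  x <- matrix( NA, nrow=NF, ncol=NT )
  x[,1] <- rnorm(NF)                                ## initial x(i,1)
  for (t in 2:NT) x[,t] <- a*x[,t-1] + rnorm(NF)    ## x(i,t) = a*x(i,t-1) + u(i,t)
  data.frame( firm=rep(1:NF, each=NT), year=rep(1:NT, NF), x=as.vector(t(x)) )
}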
I view u(i,t) as a function of a: u(i,t)[a] = x(i,t) - a*x(i,t-1). the
Arellano-Bond method then claims that u(i,3) should be uncorrelated with
x(i,1), and that u(i,4) should be uncorrelated with x(i,1) and also with
x(i,2). Blundell-Bond adds the further condition that u(i,4) should be
uncorrelated with x(i,2)-x(i,1). so I think of having four sums, each over
all firms i. let me write cross-sectional summing as sumi. the penalty
function to minimize is then built from
sumi u(i,3)[a]*x(i,1) , sumi u(i,4)[a]*x(i,1) , sumi u(i,4)[a]*x(i,2) , and
sumi u(i,4)[a]*(x(i,2)-x(i,1))
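in R, with the data in wide form (x1..x4 holding x(i,1)..x(i,4)), these sums
would look something like the sketch below. the names are mine, and I use
plain identity weighting (the sum of squared moments) only to show the
mechanics:
moments <- function( a, x1, x2, x3, x4 ) {
  u3 <- x3 - a*x2                  ## u(i,3)[a]
  u4 <- x4 - a*x3                  ## u(i,4)[a]
  c( sum(u3*x1),                   ## sumi u(i,3)[a]*x(i,1)
     sum(u4*x1),                   ## sumi u(i,4)[a]*x(i,1)
     sum(u4*x2),                   ## sumi u(i,4)[a]*x(i,2)
     sum(u4*(x2-x1)) )             ## sumi u(i,4)[a]*(x(i,2)-x(i,1))
}
penalty <- function( a, x1, x2, x3, x4 ) sum( moments(a, x1, x2, x3, x4)^2 )
## e.g. optimize( penalty, interval=c(-1,1), x1=x1, x2=x2, x3=x3, x4=x4 )$minimum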
I am missing the correct H weights on these terms, which is some GMM magic
that I do not understand (though I can copy it from their article). for this
post, the exact moment weights are not conceptually important. now, for this
penalty to be well-defined, I should not need very many observations at all.
even with, say, N=7 firms, there should be no problem finding an a that
minimizes it. (to me, it seems that the more moment conditions I have, the
merrier.) I was a little more encouraged to make such daring statements
because stata seemed capable of running this and producing output.
On the other hand, the exact value of NF at which pgmm() dies does suggest
that you are right. here is my test function:
library(plm)
tryout <- function( NF=7, NT=4 ) {
  d <- data.frame( firm=rep(1:NF, each=NT), year=rep(1:NT, NF),
                   x=rnorm(NF*NT) )
  lagformula <- dynformula( x ~ 1, list(1) )
  v <- pgmm( lagformula, data=d, gmm.inst=~x, model="onestep", effect=NULL,
             lag.gmm=c(1,99), transformation="ld" )
  v
}
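I then call it like this (the seed is arbitrary):
set.seed(1)
tryout( NF=8 )
tryout( NF=7 )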
with NF=8, it works; with NF=7, it dies. with NF=7, I have 28 data points in
levels and 21 data points in differences, which are used to estimate only
one autoregressive coefficient via 4 moment conditions. (is this correct?)
my best guess now is that even though one can get the GMM estimates with 7
firms, one cannot use the two-step method to learn how best to weight the
different moment conditions; the only thing that may work is the one-step
weighting matrix. of course, all of this is about conceptual tryouts, not
about real data. these methods only work well when NF is very large.
now, for the plm package: the non-descriptive error messages are also what
creates confusion when amateurs like me want to build simple examples [not
real data] to understand how to provide proper inputs. if one needs a
minimum N, then may I strongly suggest that you trap this with a descriptive
error message at the outset? similarly, I would add an error message if the
formula provided to pgmm is not a dynformula but a plain formula: just die
with "please use dynformula instead".
there is also a small bug in the documentation. the vignette says that NULL
is a possible input to effect, while the standard docs mention
only "individual" or "twoways".
I also emailed Yves to say that it would be great if you could provide a
wrapper around your more general function that does the simple estimation
that 99% of all end users would ever want. this would have the following
inputs:
[a] method = arellano-bond or blundell-bond
[b] fixed effects or not
[c] a set of totally exogenous variables
[d] the number of lags of the dependent variable, defaults to 1
this version would omit the GMM-instrument vs. non-GMM-instrument lingo
(though after reading the vignette I have more of an inkling that all I need
to do is leave the exogenous variables in the model and not tell the
function anything more about them), and it would know that the dependent
variable is dynamic by assumption, so no gmm.inst specification would be
required. yes, it is great to have the implementation built on heavier
artillery that statisticians can use for more flexible estimations; but for
end users, having this simplified function would really be terrific. (it
would presumably default to the two-step method, which has more intelligent
standard errors.) with such a wrapper, using these dynamic panel methods
would become really easy. just a suggestion... a rough sketch of the
interface I have in mind follows below.
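none of the names below exist in plm; this is purely a hypothetical sketch
that repackages the pgmm() call from my example above, and it glosses over
how the lag list maps onto exogenous regressors:
dpd <- function( x, data, exog=~1,
                 method=c("blundell-bond", "arellano-bond"),
                 fixed.effects=TRUE, lags=1 ) {
  method <- match.arg(method)
  trans  <- if (method == "blundell-bond") "ld" else "d"   ## system vs. difference GMM
  eff    <- if (fixed.effects) "individual" else NULL
  f      <- dynformula( as.formula( paste(x, "~", deparse(exog[[2]])) ), list(lags) )
  pgmm( f, data=data, gmm.inst=as.formula(paste("~", x)),
        lag.gmm=c(lags, 99), effect=eff, model="onestep",   ## I'd want two-step weights here
        transformation=trans )
}
## e.g. dpd( "x", data=simulate.panel(NF=100), method="arellano-bond" )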
May I end by saying that writing such a general plm seems like a Herculean
task, and by expressing my thanks on behalf of the many R users who will
benefit from it.
regards,
/ivo