Dear Denis,
I'm not in a position to comment on the spatial and time-series aspects of
the data, though I'd be concerned about treating the observations as
independent. (I understand that you have some way of accounting for spatial
and temporal dependence.)
Except for the allowance that you've made for cross-equation correlated
errors, this model could be fit by equation-by-equation OLS regression, and
in its current form could be fit by equation-by-equation 2SLS. That FIML has
run into numerical problems is a signal, however, that the combination of
model and data are ill-conditioned in some way.
The output is a bit hard to follow because, for some reason, the table of
parameter estimates, standard errors, etc., is ravelled in your email. Some
of the parameter estimates are apparently very small numbers (I'm not sure
why you're not seeing more significant digits in some values), and you might
be able to make the computations more stable by changing the units of some
of the variables. Eliminating the intercepts and working with the covariance
matrix might also make the problem better conditioned.
"NaN" stands for "not a number," and is produced here when
the summary
method for sem objects tries to take the square-root of a negative variance;
consequently, the standard error, z-statistic, and p-value are all
undefined.
BTW, I count 15, not 13, exogenous variables (including the intercept).
I hope this helps,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Denis Fomchenko
> Sent: Monday, July 17, 2006 1:40 PM
> To: John Fox; 'Spencer Graves'
> Cc: r-help at stat.math.ethz.ch
> Subject: [R] sem: negative parameter variances
>
> Dear Spencer and Prof. Fox,
>
>
>
> Thank you for your replies. I'll very appreciate, if you have
> any ideas concerning the problem described below.
>
>
>
> First, I'd like to describe the model in brief.
>
> In general I consider a model with three equations.
>
> First one is for annual GRP growth - in general it looks like:
>
>
>
> 1) GRP growth per capita = G(investment, migration, initial
> GRP per capita, spatial lag on GRP growth + some additional
> explanatory variables to control for regional disparities),
>
>
>
> where spatial lag on GRP growth in year t is simply spatial
> weights matrix W(n*n) (weights - are inverse square distances
> between regional centers; with zeros on the main diagonal)
> times GRP growth in year t, (n*1). I compute it manually for
> every year.
>
>
>
> Two others are for migration (labor supply) and investment
> (capital supply):
>
>
>
> 2) Migration = M(factors, explaining migration)
>
> 3) Investment = I(factors, explaining investment)
>
>
>
> I consider GRP growth, migration and investment to be
> endogenous and all others vars to be exogenous to the model.
> The data are over 77 regions (n=77) and 8 years (t =
> 1997..2004), so that I have total 616 observations.
>
>
>
> Actually, the final goal of the study is to estimate a model
> 1) simultaneously by FIML 2) with Mundlak (1978, 1981)
> specification of panels to capture for fixed and between
> effects 3) with spatial lag on GRP growth. As for spatial lag
> - the model needs to be estimated by maximum likelihood (as
> it is shown in Anselin 1988, for example)
>
> The main problems with Mundlak are increasing number of
> variables: as well as dimension of a model (additional
> between equations). So to begin with, I try to estimate pool,
> which is obviously easier, and proceed as follows:
>
>
>
> Consider the following specification:
>
>
>
> model.3eq.33 <- specify.model( )
>
> ln.inv.p0.pc ->
> ln.grp.phvi.pc , beta13 , NA
>
> ln.migr.new ->
> ln.grp.phvi.pc , beta12 , NA
>
> ln.grp.corr.pc97 ->
> ln.grp.phvi.pc , gamma11 , NA
>
> ln.sh.ind.rawwide ->
> ln.grp.phvi.pc , gamma12 , NA
>
> port ->
> ln.grp.phvi.pc , gamma13 , NA
>
> t12 ->
> ln.grp.phvi.pc , gamma14 , NA
>
> wd.ln.grp.phvi.pc ->
> ln.grp.phvi.pc , rho , NA
>
> (Intercept) ->
> ln.grp.phvi.pc , alpha1 , NA
>
> ln.income.pc.fcb97 -> ln.migr.new
> , gamma21 , NA
>
> ln.unempl.level97 -> ln.migr.new
> , gamma26 , NA
>
> avertemp.jan.cs -> ln.migr.new
> , gamma22 , NA
>
> ln.pop.gr.89.26mbe -> ln.migr.new
> , gamma23 , NA
>
> ln.pass.railway.percap -> ln.migr.new
> , gamma24 , NA
>
> ln.city ->
> ln.migr.new , gamma25 , NA
>
> (Intercept) ->
> ln.migr.new , alpha2 , NA
>
> ln.grp.corr.pc97 -> ln.inv.p0.pc
> , gamma31 , NA
>
> avertemp.jan.cs -> ln.inv.p0.pc
> , gamma32 , NA
>
> permafrost ->
> ln.inv.p0.pc , gamma33 , NA
>
> ln.indoutput.fuel.p0.pc -> ln.inv.p0.pc
> , gamma34 , NA
>
> ln.phone1995 -> ln.inv.p0.pc
> , gamma35 , NA
>
> (Intercept) ->
> ln.inv.p0.pc , alpha3 , NA
>
> ln.grp.phvi.pc <->
> ln.grp.phvi.pc , sigma11 , NA
>
> ln.grp.phvi.pc <-> ln.migr.new
> , sigma12 , NA
>
> ln.grp.phvi.pc <->
> ln.inv.p0.pc , sigma13 , NA
>
> ln.migr.new <-> ln.migr.new
> , sigma22 , NA
>
> ln.migr.new <->
> ln.inv.p0.pc , sigma23 , NA
>
> ln.inv.p0.pc <->
> ln.inv.p0.pc , sigma33 , NA
>
>
>
> So I consider a model with 3 equations, 3 endogenous and 13
> exogenous variables and 6 double-arrow relations for error covariance.
>
>
>
> Then, I use raw moments to capture (Intercept) in each equation:
>
>
>
> raw.3eq.33 <- raw.moments(~ ln.grp.phvi.pc + ln.inv.p0.pc +
> ln.migr.new + ln.grp.corr.pc97 + ln.sh.ind.rawwide + port +
> t12 + wd.ln.grp.phvi.pc + ln.income.pc.fcb97 +
> ln.unempl.level97 + avertemp.jan.cs + ln.pop.gr.89.26mbe +
> ln.pass.railway.percap + ln.city + permafrost +
> ln.indoutput.fuel.p0.pc + ln.phone1995, data=pool.spat)
>
>
>
> Next, I estimate model by sem-function (here, I declare fixed
> exogenous variables, including spatial lag
> 'wd.ln.grp.phvi.pc' ... though I am not sure about that)
>
>
>
> model.3eq.33.estim <- sem(model.3eq.33, raw.3eq.33, 616,
> fixed.x=c('ln.grp.corr.pc97', 'ln.sh.ind.rawwide',
'port',
> 't12', 'wd.ln.grp.phvi.pc', 'ln.income.pc.fcb97',
> 'ln.unempl.level97', 'avertemp.jan.cs',
'ln.pop.gr.89.26mbe',
> 'ln.pass.railway.percap', 'ln.city', 'permafrost',
> 'ln.indoutput.fuel.p0.pc', 'ln.phone1995',
'(Intercept)'), raw=TRUE)
>
>
>
> Then, R returns:
>
>
>
> Warning message:
> Negative parameter variances.
> Model is probably underidentified.
> in: sem.default(ram = ram, S = S, N = N, param.names = pars,
> var.names = vars, :
>
>
>
> Obtained summary:
>
>
>
> summary(model.3eq.33.estim)
>
>
>
> Model fit to raw moment matrix.
>
> Model Chisquare = 398.81 Df = 24 Pr(>Chisq) = 0
>
> Goodness-of-fit index = 0.944
>
> Adjusted goodness-of-fit index = 0.60099
>
> RMSEA index = 0.15923 90 % CI: (0.14569, 0.17316)
>
> BIC = 175.29
>
> Normalized Residuals
>
> Min. 1st Qu. Median Mean 3rd Qu. Max.
>
> -0.9120 0.0000 0.0000 0.0268 0.0000 1.5600
>
>
>
> Parameter
> Estimate
> Std Error
> z value
> Pr(>|z|)
>
>
>
>
> beta13
> 0.0151
> 0.0031
> 4.9157
> 0.0000
> ln.grp.phvi.pc
> <---
> ln.inv.p0.pc
>
> beta12
> 0.9983
> 0.4831
> 2.0663
> 0.0388
> ln.grp.phvi.pc
> <---
> ln.migr.new
>
> gamma11
> -0.0268
> 0.0074
> -3.6420
> 0.0003
> ln.grp.phvi.pc
> <---
> ln.grp.corr.pc97
>
> gamma12
> 0.0120
> 0.0174
> 0.6867
> 0.4923
> ln.grp.phvi.pc
> <---
> ln.sh.ind.rawwide
>
> gamma13
> 0.0116
> 0.0062
> 1.8803
> 0.0601
> ln.grp.phvi.pc
> <---
> port
>
> gamma14
> -0.0063
> 0.0058
> -1.0996
> 0.2715
> ln.grp.phvi.pc
> <---
> t12
>
> rho
> 0.7542
> 0.0479
> 15.7449
> 0.0000
> ln.grp.phvi.pc
> <---
> wd.ln.grp.phvi.pc
>
> alpha1
> 0.1488
> 0.0667
> 2.2323
> 0.0256
> ln.grp.phvi.pc
> <---
> (Intercept)
>
> gamma25
> 0.0027
> 0.0013
> 2.1277
> 0.0334
> ln.migr.new
> <---
> ln.income.pc.fcb97
>
> gamma26
> 0.0038
> 0.0012
> 3.1503
> 0.0016
> ln.migr.new
> <---
> ln.unempl.level97
>
> gamma27
> 0.0003
> 0.0000
> 9.7510
> 0.0000
> ln.migr.new
> <---
> avertemp.jan.cs
>
> gamma28
> -0.0028
> 0.0003
> -9.6679
> 0.0000
> ln.migr.new
> <---
> ln.pop.gr.89.26mbe
>
> gamma29
> 0.0020
> 0.0009
> 2.2828
> 0.0224
> ln.migr.new
> <---
> ln.pass.railway.percap
>
> gamma210
> 0.0030
> 0.0004
> 8.2058
> 0.0000
> ln.migr.new
> <---
> ln.city
>
> alpha2
> -0.0244
> 0.0041
> -5.9237
> 0.0000
> ln.migr.new
> <---
> (Intercept)
>
> gamma31
> 0.7727
> 0.1070
> 7.2237
> 0.0000
> ln.inv.p0.pc
> <---
> ln.grp.corr.pc97
>
> gamma37
> 0.0123
> 0.0063
> 1.9596
> 0.0500
> ln.inv.p0.pc
> <---
> avertemp.jan.cs
>
> gamma311
> 0.1208
> 0.1118
> 1.0799
> 0.2802
> ln.inv.p0.pc
> <---
> permafrost
>
> gamma312
> 0.1805
> 0.0246
> 7.3497
> 0.0000
> ln.inv.p0.pc
> <---
> ln.indoutput.fuel.p0.pc
>
> gamma313
> 0.3888
> 0.1360
> 2.8591
> 0.0042
> ln.inv.p0.pc
> <---
> ln.phone1995
>
> alpha3
> -1.1867
> 0.8601
> -1.3798
> 0.1677
> ln.inv.p0.pc
> <---
> (Intercept)
>
> sigma11
> 0.0025
> 0.0001
> 17.3196
> 0.0000
> ln.grp.phvi.pc
> <-->
> ln.grp.phvi.pc
>
> sigma12
> 0.0000
> 0.0000
> -1.3206
> 0.1867
> ln.migr.new
> <-->
> ln.grp.phvi.pc
>
> sigma13
> 0.0001
> NaN
> NaN
> NaN
> ln.inv.p0.pc
> <-->
> ln.grp.phvi.pc
>
> sigma22
> 0.0000
> 0.0000
> 17.5513
> 0.0000
> ln.migr.new
> <-->
> ln.migr.new
>
> sigma23
> -0.0001
> 0.0001
> -0.5696
> 0.5690
> ln.inv.p0.pc
> <-->
> ln.migr.new
>
> sigma33
> 0.5859
> 0.0334
> 17.5470
> 0.0000
> ln.inv.p0.pc
> <-->
> ln.inv.p0.pc
>
>
>
>
> Iterations = 205
>
> Aliased parameters: sigma13
>
> Warning message:
>
> NaN in: sqrt(diag(object$cov))
>
>
>
> So R estimates the model with NaN for sigma13: I'm not sure
> what it means... As you can see, maximum is occurred at a
> negative value for sigma13. At the same time one can check
> for rank condition - if I'm not mistaken, all three equations
> are exact.
>
>
>
> After that, I estimate another specification, for example,
> including 'ln.postgrad.students.pc.be' (post-graduate
> students) instead of 't12' (dummy for poor and depressed
> regions), so that these two specifications differ only in one
> exogenous variable. I proceed in the same manner as before
> (including 'ln.postgrad.students.pc.be' instead of 't12' in
> specifying the model, computing raw moments and declaring new
> var to be fixed exogenous). So, I do not provide sem's code
> for the sake of space saving.
>
> Now R estimates the system "correctly", without any warning
messages:
>
>
>
> Model fit to raw moment matrix.
>
> Model Chisquare = 405.73 Df = 24 Pr(>Chisq) = 0
>
> Goodness-of-fit index = 0.94308
>
> Adjusted goodness-of-fit index = 0.59441
>
> RMSEA index = 0.16069 90 % CI: (0.14715, 0.17462)
>
> BIC = 182.21
>
> Normalized Residuals
>
> Min. 1st Qu. Median Mean 3rd Qu. Max.
>
> -0.9210 0.0000 0.0000 0.0286 0.0000 1.6000
>
>
>
> Parameter
> Estimate
> Std Error
> z value
> Pr(>|z|)
>
>
>
>
> beta13
> 0.0157
> 0.0036
> 4.3162
> 0.0000
> ln.grp.phvi.pc
> <---
> ln.inv.p0.pc
>
> beta12
> 0.9143
> 0.5337
> 1.7132
> 0.0867
> ln.grp.phvi.pc
> <---
> ln.migr.new
>
> gamma11
> -0.0268
> 0.0068
> -3.9642
> 0.0001
> ln.grp.phvi.pc
> <---
> ln.grp.corr.pc97
>
> gamma12
> 0.0221
> 0.0171
> 1.2919
> 0.1964
> ln.grp.phvi.pc
> <---
> ln.sh.ind.rawwide
>
> gamma13
> 0.0141
> 0.0069
> 2.0499
> 0.0404
> ln.grp.phvi.pc
> <---
> port
>
> gamma14
> 0.0061
> 0.0032
> 1.8769
> 0.0605
> ln.grp.phvi.pc
> <---
> ln.postgrad.students.pc.be
>
> rho
> 0.7483
> 0.0480
> 15.5823
> 0.0000
> ln.grp.phvi.pc
> <---
> wd.ln.grp.phvi.pc
>
> alpha1
> 0.1433
> 0.0591
> 2.4256
> 0.0153
> ln.grp.phvi.pc
> <---
> (Intercept)
>
> gamma25
> 0.0026
> 0.0014
> 1.8682
> 0.0617
> ln.migr.new
> <---
> ln.income.pc.fcb97
>
> gamma26
> 0.0038
> 0.0013
> 2.8600
> 0.0042
> ln.migr.new
> <---
> ln.unempl.level97
>
> gamma27
> 0.0003
> 0.0000
> 9.3817
> 0.0000
> ln.migr.new
> <---
> avertemp.jan.cs
>
> gamma28
> -0.0028
> 0.0003
> -9.5709
> 0.0000
> ln.migr.new
> <---
> ln.pop.gr.89.26mbe
>
> gamma29
> 0.0021
> 0.0010
> 2.1621
> 0.0306
> ln.migr.new
> <---
> ln.pass.railway.percap
>
> gamma210
> 0.0030
> 0.0004
> 7.9666
> 0.0000
> ln.migr.new
> <---
> ln.city
>
> alpha2
> -0.0242
> 0.0042
> -5.7363
> 0.0000
> ln.migr.new
> <---
> (Intercept)
>
> gamma31
> 0.7733
> 0.1096
> 7.0574
> 0.0000
> ln.inv.p0.pc
> <---
> ln.grp.corr.pc97
>
> gamma37
> 0.0123
> 0.0067
> 1.8285
> 0.0675
> ln.inv.p0.pc
> <---
> avertemp.jan.cs
>
> gamma311
> 0.1209
> 0.1201
> 1.0065
> 0.3142
> ln.inv.p0.pc
> <---
> permafrost
>
> gamma312
> 0.1806
> 0.0250
> 7.2323
> 0.0000
> ln.inv.p0.pc
> <---
> ln.indoutput.fuel.p0.pc
>
> gamma313
> 0.3879
> 0.1379
> 2.8129
> 0.0049
> ln.inv.p0.pc
> <---
> ln.phone1995
>
> alpha3
> -1.1881
> 0.8754
> -1.3572
> 0.1747
> ln.inv.p0.pc
> <---
> (Intercept)
>
> sigma11
> 0.0025
> 0.0001
> 17.2039
> 0.0000
> ln.grp.phvi.pc
> <-->
> ln.grp.phvi.pc
>
> sigma12
> 0.0000
> 0.0000
> -1.1470
> 0.2514
> ln.migr.new
> <-->
> ln.grp.phvi.pc
>
> sigma13
> -0.0001
> 0.0010
> -0.0820
> 0.9346
> ln.inv.p0.pc
> <-->
> ln.grp.phvi.pc
>
> sigma22
> 0.0000
> 0.0000
> 17.5416
> 0.0000
> ln.migr.new
> <-->
> ln.migr.new
>
> sigma23
> -0.0001
> 0.0002
> -0.4366
> 0.6624
> ln.inv.p0.pc
> <-->
> ln.migr.new
>
> sigma33
> 0.5859
> 0.0334
> 17.5370
> 0.0000
> ln.inv.p0.pc
> <-->
> ln.inv.p0.pc
>
>
> Iterations = 137
>
>
>
> In fact, I've estimated about 30 specifications trying to
> experiment with different additional explanatory variables in
> the first equation, and about two thirds of 30 are turned out
> to be estimated with that warning message (Negative parameter
> variances with NaN for sigma's variances). I've not been
> clarified yet with the final specification... though I am not
> sure that's the case. Perhaps, I do something wrong
> concerning specification of error covariance structure.
>
>
>
> Do you have any idea? I'll be pleased by any suggestion.
>
>
>
> Sorry to trouble you.
>
> Thanks in advance,
>
>
>
> Denis Fomchenko
>
> Research fellow
>
> Department for Economic Development Problems
>
> Institute for the Economy in Transition
>
> 5, Gazetny lane, Moscow 125993, Russia
>
> e-mail: fomchenko at iet.ru
>
> http://www.iet.ru
>
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html