I am a student doing empirical work for my thesis and trying to switch to R. I am familiar with Stata, and at the moment I am trying to replicate some of my previous work. I have a large unbalanced panel data set with observations for different countries between 1970 and 2007. My dependent variable is an overdispersed count. So far I have used fixed-effects negative binomial regression, i.e. assuming constant within-group dispersion. The command in Stata is xtnbreg, fe. How could I replicate this in R?

I have found the package pglm and tried the following:

    pglm(T_total ~ Lgdpqt_2 + Lgdpqt_3 + Lgdpqt_4 + lpop + yrsconflict +
           past_T_total + Lpolcat_2 + Lpolcat_3 + Lpolcat_4 + Lgdpgr +
           mob_fixed + wdi_urbanpop + Lopen + Ldurable + factor(year),
         data = df, family = negbin, model = "within",
         index = c("code", "year"))

This takes ages, and then returns the following:

    Maximum Likelihood estimation
    Newton-Raphson maximisation, 3 iterations
    Return code 3: Last step could not find a value above the current.
    Boundary of parameter space?
    Consider switching to a more robust optimisation method temporarily.
    Log-Likelihood: 112720.7
    46 free parameters
    Estimates:
                    Estimate Std. error t value   Pr(> t)
    (Intercept)  -177.015528  70.277178 -2.5188   0.01177 *
    Lgdpqt_2      -34.386693         NA      NA        NA
    Lgdpqt_3      -26.709422         NA      NA        NA
    Lgdpqt_4      -53.875809         NA      NA        NA
    lpop           34.821642         NA      NA        NA
    yrsconflict    -8.693849         NA      NA        NA
    past_T_total   -9.558045         NA      NA        NA
    Lpolcat_2     -11.601625         NA      NA        NA
    Lpolcat_3       2.397754   0.374797  6.3975 1.580e-10 ***
    Lpolcat_4     -11.661048         NA      NA        NA
    ..........

and several warnings.

If I drop the year dummies (is factor(year) more appropriate than a list of variables?), the results are the same as in Stata, but it still takes quite long and the warnings persist. I think the problem lies somehow with figuring out the estimation sample.
Stata automatically drops groups with all-zero outcomes and groups with only one observation, as well as any year dummies that become redundant (I run the same regression for different dependent variables). The documentation for pglm mentions that there might be problems with unbalanced panels. How could I go about doing this? Did I make a mistake using pglm, or is it simply unsuited to my task?

I think this could possibly be formulated as a mixed model. I looked into nlme, which as far as I know doesn't support the negative binomial family.

I hope this is a relevant issue. I could not find anything else on the web or on this list, and a similar question on Stack Overflow was left without a suitable answer.

On a side note, I'm used to using underscores in variable names, but I have read that this is not good practice in R and that dots should be used instead. What's the reason behind that?

Thanks very much for your help,
hl
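
For reference, here is a minimal sketch of how the sample restriction described above (dropping groups with all-zero outcomes and groups with a single observation) might be reproduced in base R before calling pglm. The toy data frame and the names n_obs, sum_out and est are made up for illustration; only df, code, year and T_total come from the model call above. It assumes there are no missing values, so with real data the restriction would have to be applied after removing incomplete rows. It is not claimed to reproduce Stata's sample exactly.

```r
# Toy panel standing in for the real data (hypothetical values).
df <- data.frame(
  code    = c("A", "A", "A", "B", "B", "C", "D", "D"),
  year    = c(1970, 1971, 1972, 1970, 1971, 1970, 1970, 1971),
  T_total = c(0, 2, 5, 0, 0, 3, 1, 0)
)

# Per-country number of observations and per-country sum of the outcome,
# each repeated on every row of the corresponding country.
n_obs   <- ave(df$T_total, df$code, FUN = length)
sum_out <- ave(df$T_total, df$code, FUN = sum)

# Keep countries with more than one observation and at least one
# nonzero count; drop unused factor levels if code is a factor.
est <- droplevels(df[n_obs > 1 & sum_out > 0, ])

# Countries that remain in the estimation sample.
unique(est$code)
```

In this toy example country B is dropped because all its outcomes are zero, and country C because it has a single observation. The restricted data frame est could then be passed to pglm via data = est. Year dummies that are constant within the restricted sample would still need to be checked separately.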