Beale, Colin
2005-Jul-13 09:41 UTC
[R] nlme, MASS and geoRglm for spatial autocorrelation?
Hi.

I'm trying to perform what should be a reasonably basic analysis of some spatial presence/absence data, but I am somewhat overwhelmed by the options available and could do with a helpful pointer. My research so far indicates that if my data were normal, I would simply use gls() (in nlme) with one of the various corSpatial functions (e.g. corSpher(), to be analogous to a similar analysis in SAS) with form = ~ x + y (and a nugget if appropriate). However, my data are binomial, so I need a different approach.

Using various packages I could define a mixed model (e.g. using glmmPQL() in MASS) with a similar correlation structure, but I seem to need to define a random effect to use glmmPQL(), and I don't have any. Could this requirement be switched off while still using the mixed model approach? Alternatively, it may be possible to define the variance appropriately in gls() and use logits directly, but I'm not quite sure how, and I suspect there's a more straightforward alternative. Looking at geoRglm suggests there may be solutions there, but it seems like it might be overkill for what is, at first appearance at least, not such a difficult problem.

Maybe I'm just being statistically naive, but I think I'm looking for a function somewhere between gls() and glmmPQL() and would be grateful for any pointers.

Thanks very much,
Colin Beale
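For reference, the Gaussian case described above can be sketched as follows. This is only an illustration under stated assumptions: the poster's data are not available, so the Wheat2 data set that ships with nlme (yields with latitude and longitude plot coordinates) stands in for them, and the starting values passed to corSpher() are illustrative guesses for the range and nugget.

```r
library(nlme)

# Stand-in data (assumption): Wheat2 from nlme, with plot coordinates
# in the columns latitude and longitude.
# Baseline fit with independent residuals.
fit_iid <- gls(yield ~ variety - 1, data = Wheat2)

# Same mean model, but residuals follow a spherical spatial correlation.
# c(28, 0.2) are starting values for the range and the nugget.
fit_sph <- gls(yield ~ variety - 1, data = Wheat2,
               correlation = corSpher(c(28, 0.2),
                                      form = ~ latitude + longitude,
                                      nugget = TRUE))

# Estimated range and nugget, then a comparison of the two fits.
print(fit_sph$modelStruct$corStruct)
print(anova(fit_iid, fit_sph))
```

gls() has no family argument, which is why this approach does not carry over directly to presence/absence data.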
Prof Brian Ripley
2005-Jul-13 10:29 UTC
[R] nlme, MASS and geoRglm for spatial autocorrelation?
You seem to want to model spatially correlated Bernoulli variables. That's a difficult task, especially as these are Bernoulli and not binomial(n>1). With a much fuller description of the problem we may be able to help, but I at least have no idea of the aims of the analysis.

glmmPQL is designed for independent observations conditional on the random effects.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford
Beale, Colin
2005-Jul-13 11:14 UTC
[R] nlme, MASS and geoRglm for spatial autocorrelation?
My data are indeed Bernoulli and not binomial, as I indicated. The dataset consists of points (grid refs) that are either locations of events (animals) or random points (with no animal present). For each point I have a suite of environmental covariates describing the habitat at that point. I was anticipating some sort of function that could run:

    function(present ~ env1 + env2 + env3 + x + y, correlation = corSpher(form = ~ x + y), family = binomial)

where env1 to env3 are the habitat covariates and x and y the grid refs. If my data were normal, I understand I would use gls() with exactly this, but drop the family argument. As my data are Bernoulli this is clearly not possible, but I was hoping the analysis might be analogous?

The eventual aim is firstly to understand which environmental covariates are important in determining presence, and then to use habitat maps to identify the areas expected to be most important.

Colin
This could be done with geoRglm. I did something similar last week, but without covariates, only the spatial coordinates (i.e. my spatial process had expectation equal to a constant).

If you are willing to sacrifice some spatial resolution, you can divide your study area into cells (say 100 m x 100 m) and, in each cell, count the number of successes in observing your spatial process and the number of trials. This becomes a binomial problem, and it seems to me to be the spatial equivalent of a logistic regression in which a continuous predictor is grouped into bins and events are counted within those bins.

You can move to the R-sig-geo list if you have questions about geoRglm:
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Btw, this can also be done in SAS using the glimmix macro.

Ruben
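The cell-counting step suggested above can be sketched in base R. This is a minimal illustration, not the poster's analysis: the points are simulated, and the column names (x, y, present), the 1000 m extent and the 100 m cell size are all assumptions made for the example. The final glm() call is a non-spatial baseline that only shows the shape of the binomial response; it does not model spatial correlation, and the geoRglm fit itself is not shown.

```r
set.seed(42)

# Simulated stand-in data (assumption): 500 points on a 1000 m x 1000 m
# area, each either a presence (1) or a random point with no animal (0).
n <- 500
pts <- data.frame(x = runif(n, 0, 1000),
                  y = runif(n, 0, 1000),
                  present = rbinom(n, 1, 0.4))

# Assign each point to a 100 m x 100 m cell, labelled by the cell centre.
cell <- 100
pts$cx <- floor(pts$x / cell) * cell + cell / 2
pts$cy <- floor(pts$y / cell) * cell + cell / 2
pts$trial <- 1L

# Per cell: number of presences (successes) and number of points (trials).
cells <- aggregate(pts[c("present", "trial")],
                   by = pts[c("cx", "cy")], FUN = sum)
names(cells)[3:4] <- c("successes", "trials")
head(cells)

# Non-spatial baseline only: binomial counts as a two-column response.
base_fit <- glm(cbind(successes, trials - successes) ~ 1,
                family = binomial, data = cells)
```

Larger cells give more trials per cell but coarser spatial resolution, which is the trade-off mentioned above.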