Hi, Does anyone have an example of a Markov Random Field smoother (MRF) in MGCV where they have specified the neighbourhood directly, rather than supplying polygons? Does anyone understand how the rules should be? Based on the columb example, I have setup my data set and neighbourhood like so:> head(nb.l)$`10/10` [1] 135 155 153 $`10/2` [1] 27 8 6 $`10/3` [1] 48 7 28 26 $`10/4` [1] 69 27 49 47 $`10/5` [1] 48 70 68 $`10/7` [1] 115 95 93> head(obs)x y truth x.idx y.idx xy.idx 24 1.4835147 0.8026673 2.3605204 13 10 13/10 26 1.0452111 0.4673685 1.8316741 11 8 11/8 43 2.1514977 -0.2640058 -2.8812026 17 5 17/5 46 2.8473951 0.5445714 3.6347799 20 9 20/9 53 1.7983253 -0.6905912 -2.5473984 15 3 15/3 86 -0.1839814 -0.7824026 -0.5776616 5 2 5/2>but get the following error:> mdl <- gam(truth ~s(xy.idx,bs="mrf",xt=list(nb=nb.l)),data=obs,method="REML") Error in smooth.construct.mrf.smooth.spec(object, dk$data, dk$knots) : mismatch between nb/polys supplied area names and data area names However, there is a perfect match between the nb list names and the data area names:> all(levels(obs$xy.idx) %in% names(nb.l))[1] TRUE>Any suggestions where to start? Mark [[alternative HTML version deleted]]
Mark Payne <markpayneatwork <at> gmail.com> writes:> > Hi, > > Does anyone have an example of a Markov Random Field smoother (MRF) in MGCV > where they have specified the neighbourhood directly, rather than supplying > polygons? Does anyone understand how the rules should be? Based on the > columb example, I have setup my data set and neighbourhood like so: > > > head(nb.l) > $`10/10` > [1] 135 155 153 >...> > > head(obs) > x y truth x.idx y.idx xy.idx > 24 1.4835147 0.8026673 2.3605204 13 10 13/10...> > but get the following error: > > > mdl <- gam(truth ~ > s(xy.idx,bs="mrf",xt=list(nb=nb.l)),data=obs,method="REML") > Error in smooth.construct.mrf.smooth.spec(object, dk$data, dk$knots) : > mismatch between nb/polys supplied area names and data area namesI could reconstruct your problem with: library(mgcv) library(spdep) example(columbus) nb <- poly2nb(columbus) names(nb) <- attr(nb, "region.id") columbus$ID <- names(nb) bnb <- gam(CRIME ~ s(ID, bs="mrf", xt=list(nb=nb)), data=columbus, method="REML") #Error in ExtractData(object, data, knots) : # 'names' attribute [1] must be the same length as the vector [0] It appears that the ID variable is happier as integer: columbus$ID <- as.integer(names(nb)) bnb <- gam(CRIME ~ s(ID, bs="mrf", xt=list(nb=nb)), data=columbus, method="REML") summary(bnb) Using another representation: nb <- poly2nb(columbus, row.names=columbus$NEIGNO) names(nb) <- attr(nb, "region.id") bnb <- gam(CRIME ~ s(NEIGNO, bs="mrf", xt=list(nb=nb)), data=columbus, method="REML") also works, but NEIGNO is numeric. It appears that the ID variable should be numeric (or integer), but that the names of the list of neighbour vector should be character, but otherwise match. The call tree is not simple; the error seems to occur in mgcv:::ExtractData. Hope this helps, Roger> > However, there is a perfect match between the nb list names and the data > area names: > > all(levels(obs$xy.idx) %in% names(nb.l)) > [1] TRUE > > > > Any suggestions where to start? > > Mark > > [[alternative HTML version deleted]] > >
Hi Mark, I'm not sure what is happening here - there is no chance that nb.l contains a neighbourhood not in the levels of obs$xy.idx, I suppose? i.e. is all(names(nb.l)%in%levels(obs$xy.idx)) also TRUE? Here is some code illustrating what nb should look like (and in response to Roger Bivand's suggestion I also tried this replacing all the labels with things like "x/y", but it still works). ## example mrf fit using polygons.... library(mgcv) ## Load Columbus Ohio crime data (see ?columbus for details and credits) data(columb) ## data frame data(columb.polys) ## district shapes list xt <- list(polys=columb.polys) ## neighbourhood structure info for MRF par(mfrow=c(2,2)) ## First a full rank MRF... b0 <- gam(crime ~ s(district,bs="mrf",xt=xt),data=columb,method="REML") ## same fit based on direct neighbour spec... nb <- mgcv:::pol2nb(columb.polys)$nb xt <- list(nb=nb) b <- gam(crime ~ s(district,bs="mrf",xt=xt),data=columb,method="REML") best, Simon On 08/05/14 01:58, Mark Payne wrote:> Hi, > > Does anyone have an example of a Markov Random Field smoother (MRF) in MGCV > where they have specified the neighbourhood directly, rather than supplying > polygons? Does anyone understand how the rules should be? Based on the > columb example, I have setup my data set and neighbourhood like so: > >> head(nb.l) > $`10/10` > [1] 135 155 153 > > $`10/2` > [1] 27 8 6 > > $`10/3` > [1] 48 7 28 26 > > $`10/4` > [1] 69 27 49 47 > > $`10/5` > [1] 48 70 68 > > $`10/7` > [1] 115 95 93 > >> head(obs) > x y truth x.idx y.idx xy.idx > 24 1.4835147 0.8026673 2.3605204 13 10 13/10 > 26 1.0452111 0.4673685 1.8316741 11 8 11/8 > 43 2.1514977 -0.2640058 -2.8812026 17 5 17/5 > 46 2.8473951 0.5445714 3.6347799 20 9 20/9 > 53 1.7983253 -0.6905912 -2.5473984 15 3 15/3 > 86 -0.1839814 -0.7824026 -0.5776616 5 2 5/2 >> > > but get the following error: > >> mdl <- gam(truth ~ > s(xy.idx,bs="mrf",xt=list(nb=nb.l)),data=obs,method="REML") > Error in smooth.construct.mrf.smooth.spec(object, dk$data, dk$knots) : > mismatch between nb/polys supplied area names and data area names > > However, there is a perfect match between the nb list names and the data > area names: >> all(levels(obs$xy.idx) %in% names(nb.l)) > [1] TRUE >> > > Any suggestions where to start? > > Mark > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Simon Wood, Mathematical Science, University of Bath BA2 7AY UK +44 (0)1225 386603 http://people.bath.ac.uk/sw283