Dear R,

I have a covariates matrix with 10 observations, e.g.

> X <- matrix(rnorm(50), 10, 5)
> X
             [,1]        [,2]        [,3]        [,4]       [,5]
 [1,]  0.24857135  0.30880745 -1.44118657  1.10229027  1.0526010
 [2,]  1.24316806  0.36275370 -0.40096866 -0.24387888 -1.5324384
 [3,] -0.33504014  0.42996246  0.03902479 -0.84778875 -2.4754644
 [4,]  0.06710229  1.01950917 -0.09325091 -0.03222811  0.4127816
 [5,] -0.13619141  1.33143821 -0.79958805  2.08274102  0.6901768
 [6,] -0.45060357  0.19348831 -1.23793647 -0.72440163  0.5057326
 [7,] -1.20740516  0.20231086  1.15584485  0.81777770 -1.2719855
 [8,] -1.81166284 -0.07913113 -0.91080581 -0.34774436  0.9552182
 [9,]  0.19131383  0.14980569 -0.37458224 -0.09371273 -1.7667203
[10,] -0.85159276 -0.66679528  1.63019340  0.56920196 -2.4049600

And I define a boundary of X: the smallest "ball" that nests all the
observations of X. I wish to check if a particular point x_i

> x_i <- matrix(rnorm(5), 1, 5)
> x_i
           [,1]      [,2]       [,3]      [,4]      [,5]
[1,] -0.1525543 0.4606419 -0.1011011 -1.557225 -1.035694

is inside the boundary of X or not. I know it's easy to do it in 1-D or
2-D, but I don't know how to manage it when the dimension is large.

Can someone give a hint? Thanks in advance!

Feng

-- 
Feng Li
Department of Statistics
Stockholm University
106 91 Stockholm, Sweden
http://feng.li/
Hello,

Convex hulls in large numbers of dimensions are hard. For your problem,
though, one can tell whether a given point is inside or outside by using
linear programming:

> library(boot)   # provides simplex()
> X <- matrix(rnorm(50), 10, 5)
> x_i <- matrix(rnorm(5), 1, 5)
> isin.chull <- function(candidate, p, plot=FALSE, give.answers=FALSE, ...){
    if(plot){
      # only meaningful in two dimensions
      plot(p, ...)
      points(candidate[1], candidate[2], pch=16)
    }
    n <- nrow(p)   # number of points
    d <- ncol(p)   # number of dimensions
    # shift so the candidate sits at the origin; columns are now points
    p <- t(sweep(p, 2, candidate))
    # equality constraints: weighted sum of shifted points is 0, weights sum to 1
    jj <- simplex(a=rep(1,n), A3=rbind(p,1), b3=c(0*candidate,1))
    if(give.answers){
      return(jj)
    } else {
      # per ?simplex, solved is -1 when no feasible solution exists
      return((jj$solved >= 0) & all(jj$soln<1))
    }
  }
> isin.chull(x_i, X)
[1] FALSE
>

(we can discuss offline; I'll summarize)

HTH

rksh

On 24/09/10 10:44, Feng Li wrote:
> Dear R,
>
> I have a covariates matrix with 10 observations [...]

-- 
Robin K. S. Hankin
Uncertainty Analyst
University of Cambridge
19 Silver Street
Cambridge CB3 9EP
01223-764877
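
For readers following the linear-programming argument in the reply above, the feasibility problem handed to simplex() can be written out explicitly. This is only a restatement of the equality constraints passed as A3 and b3. The symbols are notation introduced here rather than taken from the message: p_j is the j-th row of X, x is the candidate point, and lambda_j are the weights that simplex() solves for (it treats them as non-negative).

```latex
x \in \operatorname{conv}\{p_1,\dots,p_n\}
\iff
\exists\, \lambda \in \mathbb{R}^n :\quad
\sum_{j=1}^{n} \lambda_j\,(p_j - x) = 0,\qquad
\sum_{j=1}^{n} \lambda_j = 1,\qquad
\lambda_j \ge 0 \ \ (j = 1,\dots,n).
```

Because the weights are constrained to sum to 1, the objective a = rep(1, n) is constant over the feasible set, so only feasibility matters: the candidate is in the hull exactly when some such lambda exists.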
Hello,

If an N-dimensional convex hull fits your idea of a "smallest ball" then
you could try the convhulln function in the geometry package.

For testing if a new point is inside a previously derived hull, one brute
force approach is to rbind the new point to your data, generate a new hull
and see if it is the same as the previous one.

I've only used convhulln in low dimensions so I don't know how efficient
it is when N is large.

Hope this helps.

Michael

On 24 September 2010 19:44, Feng Li <feng.li at stat.su.se> wrote:
> Dear R,
>
> I have a covariates matrix with 10 observations [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
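
The brute-force idea in Michael's reply (add the new point, rebuild the hull, see whether the hull changed) can be sketched in two dimensions with base R alone. This is an illustration under assumptions, not the convhulln code Michael has in mind: it uses chull() from grDevices, which only handles two columns, and inside_hull_2d and X2 are hypothetical names made up for the example. A point lying exactly on the boundary may be reported either way.

```r
# Brute-force hull test, 2-D only (base R's chull() takes two columns).
# inside_hull_2d() is a hypothetical helper name, not from the thread.
inside_hull_2d <- function(x_new, X) {
  stopifnot(ncol(X) == 2, length(x_new) == 2)
  n <- nrow(X)
  # chull() returns the row indices of the points that are hull vertices.
  hull_idx <- grDevices::chull(rbind(X, as.numeric(x_new)))
  # Row n + 1 is the new point; if it is not a vertex, the hull is unchanged,
  # so the new point lies inside the hull of the original data.
  !((n + 1) %in% hull_idx)
}

# Toy data: the unit square plus one interior observation.
X2 <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0.3, 0.6))
inside_hull_2d(c(0.5, 0.5), X2)  # centre of the square
inside_hull_2d(c(2, 2), X2)      # well outside the square
```

In N dimensions the same comparison would be made on the output of convhulln from the geometry package, as suggested above; that variant is not shown or tested here.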
You did not originally define "ball". The other respondents have discussed
using a convex hull, but here is another approach: use "ball" to mean
sphere (technically, hypersphere) and find the sphere with the smallest
radius that contains all the points. optim or another optimizer could be
programmed to do this; an approximation that may be good enough is to use
the means as the center and the distance to the furthest point as the
radius. Then finding whether a new point is within the sphere is just a
matter of computing the Euclidean distance from the new point to the
center and comparing that to the radius.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Feng Li
> Sent: Friday, September 24, 2010 3:44 AM
> To: r-help at r-project.org
> Subject: [R] boundary check
>
> Dear R,
>
> I have a covariates matrix with 10 observations [...]
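
The sphere approach described in Greg's reply can be sketched in a few lines of base R. This is a minimal sketch under assumptions, not Greg's own code: the function names (enclosing_ball_approx, enclosing_ball_optim, in_ball) and the tolerance argument are hypothetical, and the optim() call uses the default Nelder-Mead method on a non-smooth objective, so it may improve on the mean-centred ball without reaching the true minimum radius.

```r
# Distance from every row of X to a given centre.
dist_to <- function(X, centre) sqrt(rowSums(sweep(X, 2, centre)^2))

# Approximation from the thread: centre = column means,
# radius = distance to the furthest observation.
enclosing_ball_approx <- function(X) {
  centre <- colMeans(X)
  list(centre = centre, radius = max(dist_to(X, centre)))
}

# Refinement (assumption: default Nelder-Mead is good enough here):
# move the centre to minimise the largest distance to any observation.
enclosing_ball_optim <- function(X) {
  fit <- optim(colMeans(X), function(centre) max(dist_to(X, centre)))
  list(centre = fit$par, radius = fit$value)
}

# A point is inside if its distance to the centre does not exceed the radius.
in_ball <- function(x_new, ball, tol = 1e-8) {
  sqrt(sum((as.numeric(x_new) - ball$centre)^2)) <= ball$radius + tol
}

set.seed(1)                          # arbitrary seed, for reproducibility only
X   <- matrix(rnorm(50), 10, 5)
x_i <- matrix(rnorm(5), 1, 5)
in_ball(x_i, enclosing_ball_approx(X))
in_ball(x_i, enclosing_ball_optim(X))
```

Note that this answers a different question from the convex-hull tests earlier in the thread: a ball containing all observations also contains regions far from any data, so a point can be inside the ball yet outside the hull.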