thr3ads.net - R help - [R] a question about box counting [Apr 2005]

Rajarshi Guha

2005-Apr-04 18:22 UTC

[R] a question about box counting

Hi,
  I have a set of x,y data points and each data point lies between (0,0)
and (1,1). Of this set I have selected all those that lie in the lower
triangle (of the plot of these points).

What I would like to do is to divide the region (0,0) to (1,1) into
cells of say, side = 0.01 and then count the number of cells that
contain a point.

My first approach is to generate the coordinates of these cells and then
loop over the point list to see whether a point lies in a cell or not.

However this seems to be very inefficient, especially since I will have
1000's of points.

Has anybody dealt with this type of problem and are there routines to
handle it?


-------------------------------------------------------------------
Rajarshi Guha <rxg218 at psu.edu> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
Alone, adj.: In bad company.
-- Ambrose Bierce, "The Devil's Dictionary"

Deepayan Sarkar

2005-Apr-04 18:52 UTC

[R] a question about box counting

On Monday 04 April 2005 13:22, Rajarshi Guha wrote:
> Hi,
>   I have a set of x,y data points and each data point lies between
> (0,0) and (1,1). Of this set I have selected all those that lie in
> the lower triangle (of the plot of these points).
>
> What I would like to do is to divide the region (0,0) to (1,1) into
> cells of say, side = 0.01 and then count the number of cells that
> contain a point.
>
> My first approach is to generate the coordinates of these cells and
> then loop over the point list to see whether a point lies in a cell
> or not.
>
> However this seems to be very inefficient, especially since I will
> have 1000's of points.
>
> Has anybody dealt with this type of problem and are there routines to
> handle it?
A combination of cut and table/xtabs should do it, e.g.:


x <- runif(3000)
y <- runif(3000)

fx <- cut(x, breaks = seq(0, 1, length = 101))
fy <- cut(y, breaks = seq(0, 1, length = 101))

txy <- xtabs(~ fx + fy)
image(txy > 0)
sum(txy > 0)
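
One caveat with the cut-based approach: cut() builds right-closed intervals by default, so a value of exactly 0 falls outside the first interval and becomes NA unless include.lowest = TRUE is set. A minimal sketch of how this might be wrapped up, where the helper name count_cells is hypothetical and "lower triangle" is assumed to mean points below the diagonal (y < x):

```r
# Hypothetical helper: count occupied cells on an n-by-n grid over [0,1] x [0,1].
count_cells <- function(x, y, n = 100) {
  breaks <- seq(0, 1, length.out = n + 1)
  # include.lowest = TRUE keeps points lying exactly at 0 in the first cell
  fx <- cut(x, breaks = breaks, include.lowest = TRUE)
  fy <- cut(y, breaks = breaks, include.lowest = TRUE)
  sum(table(fx, fy) > 0)
}

set.seed(1)
x <- runif(3000)
y <- runif(3000)
lower <- y < x   # assumption: lower triangle = points below the diagonal
count_cells(x[lower], y[lower])
```

Because fx and fy are factors with all n levels, the table always has n-by-n entries, so empty cells show up as zeros rather than being dropped.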


Deepayan

Ray Brownrigg

2005-Apr-04 21:27 UTC

[R] a question about box counting

> From: Deepayan Sarkar <deepayan at stat.wisc.edu> Mon, 4 Apr 2005 13:52:48 -0500
> 
> On Monday 04 April 2005 13:22, Rajarshi Guha wrote:
> > Hi,
> >   I have a set of x,y data points and each data point lies between
> > (0,0) and (1,1). Of this set I have selected all those that lie in
> > the lower triangle (of the plot of these points).
> >
> > What I would like to do is to divide the region (0,0) to (1,1) into
> > cells of say, side = 0.01 and then count the number of cells that
> > contain a point.
> >
> > My first approach is to generate the coordinates of these cells and
> > then loop over the point list to see whether a point lies in a cell
> > or not.
> >
> > However this seems to be very inefficient, especially since I will
> > have 1000's of points.
> >
> > Has anybody dealt with this type of problem and are there routines to
> > handle it?
> 
> A combination of cut and table/xtabs should do it, e.g.:
> 
> 
> x <- runif(3000)
> y <- runif(3000)
> 
> fx <- cut(x, breaks = seq(0, 1, length = 101))
> fy <- cut(y, breaks = seq(0, 1, length = 101))
> 
> txy <- xtabs(~ fx + fy)
> :
Another significantly faster way (but not generating row/column names)
is:
x <- runif(3000)
y <- runif(3000)
ints <- 100
myfun <- function(x, y, ints) {
  fx <- x %/% (1/ints)
  fy <- y %/% (1/ints)
  txy <- hist(fx + ints*fy+ 1, breaks=0:(ints*ints), plot=FALSE)$counts
  dim(fxy) <- c(ints, ints)
  return(txy)
}
myfun(x, y, ints)

Hope this helps,
Ray Brownrigg

Rajarshi Guha

2005-Apr-04 21:46 UTC

[R] a question about box counting

On Mon, 2005-04-04 at 14:22 -0400, Rajarshi Guha wrote:
> Hi,
>   I have a set of x,y data points and each data point lies between (0,0)
> and (1,1). Of this set I have selected all those that lie in the lower
> triangle (of the plot of these points).
> 
> What I would like to do is to divide the region (0,0) to (1,1) into
> cells of say, side = 0.01 and then count the number of cells that
> contain a point.
Thanks very much to Deepayan Sarkar, James Holtman and Ray Brownrigg for
very efficient (and elegant) solutions. I've summarized them below:

Deepayan Sarkar

A combination of cut and table/xtabs should do it, e.g.:


x <- runif(3000)
y <- runif(3000)

fx <- cut(x, breaks = seq(0, 1, length = 101))
fy <- cut(y, breaks = seq(0, 1, length = 101))

txy <- xtabs(~ fx + fy)
image(txy > 0)
sum(txy > 0)

---------------------------------------------------------
James Holtman

Here is a start.  This creates a dataframe and then divides the data up
into 10 segments (you wanted 100, so extend it) and then counts the
number in each cell.

> df <- data.frame(x=runif(100), y=runif(100))  # create data
> breaks <- seq(0,1,.1)  # define breaks; you would use 0.01
> # use 'cut' to partition and then 'table' to count
> table(cut(df$x, breaks=breaks, labels=FALSE), cut(df$y, breaks=breaks, labels=FALSE))

     1 2 3 4 5 6 7 8 9 10
  1  0 2 0 1 0 3 0 1 0 0
  2  0 1 0 0 0 2 1 2 0 0
  3  0 1 0 0 3 0 2 2 1 2
  4  0 0 1 2 3 3 1 2 2 0
  5  3 1 2 2 1 2 1 1 1 0
  6  2 0 2 0 0 0 0 1 0 0
  7  0 1 1 1 2 1 1 1 2 1
  8  0 3 2 1 1 2 2 2 1 1
  9  0 0 2 2 0 1 2 0 2 2
  10 0 2 1 0 0 0 0 0 0 3
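
To get from a table of counts like this to the number originally asked for (cells containing at least one point), the table can be compared against zero and summed. A minimal sketch on the 0.01 grid; the variable names counts and occupied are illustrative, not from the original post:

```r
set.seed(42)
df <- data.frame(x = runif(1000), y = runif(1000))
breaks <- seq(0, 1, by = 0.01)   # 100 cells per axis

# labels = FALSE makes cut() return integer cell indices
counts <- table(cut(df$x, breaks = breaks, labels = FALSE),
                cut(df$y, breaks = breaks, labels = FALSE))

occupied <- sum(counts > 0)      # cells with at least one point
occupied
```

With labels = FALSE the table only has rows and columns for indices that actually occur, but the count of non-zero entries is unaffected by that.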

-----------------------------------------------------------------
Ray Brownrigg

Another significantly faster way (but not generating row/column names)
is:
x <- runif(3000)
y <- runif(3000)
ints <- 100
myfun <- function(x, y, ints) {
  fx <- x %/% (1/ints)
  fy <- y %/% (1/ints)
  txy <- hist(fx + ints*fy+ 1, breaks=0:(ints*ints), plot=FALSE)$counts
  dim(txy) <- c(ints, ints)
  return(txy)
}
myfun(x, y, ints)


-------------------------------------------------------------------
Rajarshi Guha <rxg218 at psu.edu> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
Q: Why did the mathematician name his dog "Cauchy"?
A: Because he left a residue at every pole.

Ray Brownrigg

2005-Apr-04 21:59 UTC

[R] a question about box counting

I said:
> myfun <- function(x, y, ints) {
>   fx <- x %/% (1/ints)
>   fy <- y %/% (1/ints)
>   txy <- hist(fx + ints*fy+ 1, breaks=0:(ints*ints), plot=FALSE)$counts
>   dim(fxy) <- c(ints, ints)
        ^^^
>   return(txy)
> }

Of course it should be:
  dim(txy) <- c(ints, ints)
      ^^^

Sorry about that,
Ray
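
With the correction applied, the function can be written out in full. The sketch below makes two substitutions that are not part of the original post: tabulate() in place of hist(), and floor(x * ints) in place of x %/% (1/ints). The name box_counts is hypothetical, and all coordinates are assumed to lie in [0, 1).

```r
# Hypothetical name; assumes 0 <= x, y < 1
box_counts <- function(x, y, ints = 100) {
  fx <- floor(x * ints)   # cell index along x, 0 .. ints-1
  fy <- floor(y * ints)   # cell index along y, 0 .. ints-1
  # combine both indices into one bin number in 1 .. ints^2 and count
  txy <- tabulate(fx + ints * fy + 1, nbins = ints * ints)
  dim(txy) <- c(ints, ints)
  txy
}

set.seed(1)
x <- runif(3000)
y <- runif(3000)
txy <- box_counts(x, y, 100)
sum(txy > 0)   # number of occupied cells
```

A point with a coordinate of exactly 1 would get index ints and land outside the grid, so such values would need to be clamped first.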

Ben Fairbank

2005-Apr-04 22:34 UTC

[R] a question about box counting

Perhaps the following, substituting your vectors of x and y for
runif(10000):

> x<-trunc(100*runif(10000))
> y<-trunc(100*runif(10000))/100
> length(unique(x+y))
[1] 6390
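
Applied to real data rather than runif(), the same idea is to truncate each coordinate to its cell index and count the distinct pairs. A sketch, assuming coordinates in [0, 1); the names n and cell_id are illustrative. Combining the two indices into a single integer key avoids relying on floating-point sums being distinct:

```r
set.seed(1)
x <- runif(10000)   # stand-in for the real x coordinates
y <- runif(10000)   # stand-in for the real y coordinates
n <- 100            # cells per axis (side = 1/n)

# one integer key per cell: x index times n, plus y index
cell_id <- trunc(n * x) * n + trunc(n * y)
length(unique(cell_id))   # number of occupied cells
```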

Ben Fairbank
