thr3ads.net - R help - SV: [R] sample from contingency table [Sep 2000]

If this information is useful, please help other people find it:
Share via:

Regin Reinert

2000-Sep-20 06:53 UTC

SV: [R] sample from contingency table

I have had the same problem and I wrote this function

rmulti <- function(n, size, p)
{
  NrDim <- length(p)
  if(NrDim<2) stop("The simulated variabel has to be at least
2-dimensional")
  res <- matrix(data=NA, nrow=n, ncol=NrDim)
  p <- p/sum(p)
  TempSize <- size
  for(i in 1:NrDim)
  {
    TempP <- p[i]/sum(p[i:NrDim])
    TempBin <- rbinom(n=n, size=TempSize, prob=TempP)
    TempSize <- TempSize-TempBin
    res[,i] <- TempBin
  }
  return(res)
}

# Then you can draw 10 samples like this, whith
# each row representing a contingency table

x <- as.matrix(1:4, nrow=2, ncol=2)
rmulti(10, sum(x), x)


Regin


-----Oprindelig meddelelse-----
Fra: Dirk F. Raetzel [mailto:raetzel at Mathematik.Uni-Marburg.de]
Sendt: 19. september 2000 18:48
Til: R-Help Mailing List
Emne: [R] sample from contingency table


Hello,

I have a multivariate (dim >= 3) discrete distribution
given by a contingency table from which I want to draw independent
random samples. The result should be a data.frame (or array) with each
column representing a dimension.

Before starting to hack some search tree with approbiate
transformations: Is there any built-in function I
have overseen or did anybody program such a function already?

Dirk



-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.
-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._.
_._
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Bill Venables

2000-Sep-20 08:29 UTC

head link

SV: [R] sample from contingency table

Regin Reinert writes:

> I have had the same problem and I wrote this function
> 
> rmulti <- function(n, size, p)
> {
>   NrDim <- length(p)
>   if(NrDim<2) stop("The simulated variabel has to be at least
> 2-dimensional")
>   res <- matrix(data=NA, nrow=n, ncol=NrDim)
>   p <- p/sum(p)
>   TempSize <- size
>   for(i in 1:NrDim)
>   {
>     TempP <- p[i]/sum(p[i:NrDim])
>     TempBin <- rbinom(n=n, size=TempSize, prob=TempP)
>     TempSize <- TempSize-TempBin
>     res[,i] <- TempBin
>   }
>   return(res)
> }
> 
> # Then you can draw 10 samples like this, whith
> # each row representing a contingency table
> 
> x <- as.matrix(1:4, nrow=2, ncol=2)
> rmulti(10, sum(x), x)
> 
> 
> Regin
Hey, hang on...  If I have understood the original question properly
what you have to do is to sample from the cells of a contingency table
with probabilities proportional to the frequencies in those cells.
Here is the original question:

> -----Oprindelig meddelelse-----
> Fra: Dirk F. Raetzel [mailto:raetzel at Mathematik.Uni-Marburg.de]
> Sendt: 19. september 2000 18:48
> Til: R-Help Mailing List
> Emne: [R] sample from contingency table
> 
> 
> Hello,
> 
> I have a multivariate (dim >= 3) discrete distribution
> given by a contingency table from which I want to draw independent
> random samples. The result should be a data.frame (or array) with each
> column representing a dimension.
> 
> Before starting to hack some search tree with approbiate
> transformations: Is there any built-in function I
> have overseen or did anybody program such a function already?
> 
> Dirk
I can't see why you would need a "search tree" for this problem
either.  Here is (what I think is) a very simple solution:

sampct <- function(n, Fr) {
# sample with replacement from a multivariate distribution
# defined by a contingency table
  if(!is.null(dfr <- dimnames(Fr)) &&
     prod(sapply(dfr, length)) == length(Fr))
    dfr <- expand.grid(dfr)
  else
    dfr <- expand.grid(lapply(dim(Fr), seq))
  dfr[sample(1:nrow(dfr), n, prob = Fr, rep = T), ]
}

Here n is the sample size and Fr the array of frequencies in the
table.  The result is an n x k data frame of cell indices where k is
the dimension of Fr.

It uses the fact that sample() can sample with replacement and with
probabilities proportional to the entries in a (non-negative) vector.
As usual there are a few little obscurities there to make it
interesting...

-- 
Bill Venables,      Statistician,     CMIS Environmetrics Project
CSIRO Marine Labs, PO Box 120, Cleveland, Qld,  AUSTRALIA.   4163
Tel: +61 7 3826 7251           Email: Bill.Venables at cmis.csiro.au    
Fax: +61 7 3826 7304      http://www.cmis.csiro.au/bill.venables/



-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Reasonably Related Threads

Search for more seemingly similar threads

R help - Sep 2000 - SV: sample from contingency table

SV: [R] sample from contingency table

SV: [R] sample from contingency table

Reasonably Related Threads