thr3ads.net - R help - [R] rarefy a matrix of counts [Oct 2006]

Brian Frappier

2006-Oct-10 21:40 UTC

[R] rarefy a matrix of counts

Hi all,

I have a matrix of counts for objects (rows) by samples (columns).  I aimed
for about 500 counts in each sample (I have about 80 samples) and would now
like to rarefy these down to 100 counts in each sample using simple random
sampling without replacement.  I plan on rarefying several times for each
sample.  I could do the tedious looping task of making a list of all objects
(with its associated identifier) in each sample and then use the wonderful
"sampling" package to select a sub-sample of 100 for each sample and thereby
get a logical vector of inclusions.  I would then regroup the resulting
logical vector into a vector of counts by object, rinse and repeat several
times for each sample.

Alternately, using the same list, I could create a random index of integers
between 1 and the number of objects for a sample (without repeats) and then
select those objects from the list.  Again, rinse and repeat several times
for each sample.

Is there a way to directly rarefy a matrix of counts without having to
create a list of objects first?  I am trying to switch to R from Matlab and
am trying to pick up good programming habits from the start.

Much appreciation!


Petr Pikal

2006-Oct-11 05:57 UTC

[R] rarefy a matrix of counts

Hi

I am not experienced in Matlab, and from your explanation I do not
understand exactly what you want. It seems that you want to randomly
choose a sample of 100 rows from your matrix, which can be achieved with
sample().

DF<-data.frame(rnorm(100), 1:100, 101:200, 201:300)
DF[sample(1:100, 10),]

If you want to do this several times, you need to save your results,
and then it depends on what you want to do next. One suitable form is a
list of matrices, another is an array, and you can use a for loop to
fill it.

HTH
Petr


Petr Pikal
petr.pikal at precheza.cz
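
The suggestion above (draw rows with sample, then keep the repeated draws in a list or an array) could be sketched as follows. This is only an illustration: the column names, the number of repeats (5) and the use of replicate() in place of an explicit for loop are assumptions, not part of the original message.

```r
# Example data frame as in the message above; column names are made up.
DF <- data.frame(x = rnorm(100), a = 1:100, b = 101:200, c = 201:300)

# Draw 10 rows without replacement, five times; keep the draws in a list.
draws <- replicate(5, DF[sample(nrow(DF), 10), ], simplify = FALSE)

# The same draws stacked into a 10 x 4 x 5 array.
arr <- simplify2array(lapply(draws, as.matrix))
dim(arr)
```

Note that this samples whole rows, which turned out not to be what the original question was asking for.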

Brian Frappier

2006-Oct-13 14:35 UTC

[R] rarefy a matrix of counts

Thank you, Alex!  That's exactly what I was looking to do.  I'm going to
remove the loops and use your apply function approach.  Best regards and
much thanks,  brian

On 10/13/06, Alex Brown <alex@transitive.com> wrote:
>
> I thought at first that you could use a weighted sample (the prob
> argument of the sample function), but you can't, since the weights do
> not take proper account of sampling without replacement.
>
> You can use the list approach, but through the power of R, you don't
> need a lot of loops to do it...
>
> I can't speak for the efficiency of this approach in terms of CPU
> cycles.
>
> In short:
>
> apply(z2,2,function(x)sample(rep(names(x),x),100))
>
> In long:
>
> #let's load the data:
>
> z = scan(,"",sep="\n")
>                 sample.1         sample.2         sample.3
> red.candy       400                 300               2500
> green.candy    100                    0                  200
> black.candy     300                1000                500
>
> #and turn into a table
>
>   z2 = read.table(textConnection(z), header=TRUE, row.names=1)
>
> # let's create a function to expand a sample column into individuals:
>
> expand <- function(x) rep(names(x), x)
>
> # test it on a smaller set:
>
> ex <- expand( c( red = 2, blue = 3) )
>
> ex
> [1] "red"  "red"  "blue" "blue" "blue"
>
> # and sample 2 things from that:
>
> sample( ex, 2 )
>
> # combine the two
>
> samplex <- function( x, size ) sample(expand(x), size )
>
> samplex( c( red = 2, blue = 3), size = 2 )
>
> # ok, now we use the apply function to apply this to each column
>
> apply(z2, 2, samplex, size = 2 )
>
> # you wanted 100?
>
> apply(z2, 2, samplex, size = 100 )
>
> # all done.
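
The calls above return the names of the sampled individuals. The original question asked for a vector of counts by object, repeated several times, so one possible follow-up is to tabulate each draw. The sketch below is an assumption about how that might be done: rarefy_column is a hypothetical helper, the matrix simply retypes the example data, and the number of repeats (5) is arbitrary.

```r
# Example counts from the thread (objects in rows, samples in columns).
z2 <- matrix(c(400, 100, 300,
               300,   0, 1000,
               2500, 200, 500),
             nrow = 3,
             dimnames = list(c("red.candy", "green.candy", "black.candy"),
                             c("sample.1", "sample.2", "sample.3")))

# Hypothetical helper: draw `size` individuals from one column without
# replacement and return counts per object, keeping objects with zero count.
rarefy_column <- function(x, size) {
  drawn <- sample(rep(names(x), x), size)
  tab <- table(factor(drawn, levels = names(x)))
  setNames(as.integer(tab), names(x))
}

# One rarefied matrix with the same shape as z2.
r1 <- apply(z2, 2, rarefy_column, size = 100)

# Several rarefactions, stored as a list of matrices.
reps <- replicate(5, apply(z2, 2, rarefy_column, size = 100),
                  simplify = FALSE)
```

Using a factor with fixed levels keeps every object as a row even when none of its individuals are drawn, so each rarefied matrix lines up with the original.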
>
> # You should note that if there are fewer than 100 (the sample size)
> # candies in any given sample, this function will fail.
> # e.g.:
>
> apply(z2, 2, samplex, size = 2000 )
>
> Error in sample(length(x), size, replace, prob) :
>         cannot take a sample larger than the population when 'replace = FALSE'
>
> -Alex
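
As an aside on the original question (rarefying without first building the list of individuals), one alternative that is not from this thread is to draw each object's count from a hypergeometric distribution, conditional on the draws already allocated. This is a sketch only: rarefy_direct is a hypothetical helper, and returning NA for samples that are too small (rather than stopping with an error) is an assumed design choice.

```r
# Example counts from the thread (objects in rows, samples in columns).
z2 <- matrix(c(400, 100, 300,
               300,   0, 1000,
               2500, 200, 500),
             nrow = 3,
             dimnames = list(c("red.candy", "green.candy", "black.candy"),
                             c("sample.1", "sample.2", "sample.3")))

# Hypothetical helper: rarefy one vector of counts to `size` individuals
# without expanding it into a list of individuals.
rarefy_direct <- function(x, size) {
  # Assumed behaviour: samples with too few individuals yield NA.
  if (sum(x) < size) {
    return(setNames(rep(NA_integer_, length(x)), names(x)))
  }
  out <- setNames(integer(length(x)), names(x))
  remaining_pop <- sum(x)    # individuals in objects not yet processed
  remaining_draws <- size    # draws not yet allocated to an object
  for (i in seq_along(x)) {
    remaining_pop <- remaining_pop - x[[i]]
    # x[i] individuals of this object vs. all individuals of later objects
    out[i] <- rhyper(1, x[[i]], remaining_pop, remaining_draws)
    remaining_draws <- remaining_draws - out[[i]]
  }
  out
}

r_direct <- apply(z2, 2, rarefy_direct, size = 100)
r_big <- apply(z2, 2, rarefy_direct, size = 2000)  # only sample.3 is large enough
```

Drawing the counts one object at a time this way is a standard construction of the multivariate hypergeometric distribution, which is the distribution of counts under simple random sampling without replacement.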
>
> On 11 Oct 2006, at 15:10, Brian Frappier wrote:
>
> > Hi Petr,
> >
> > Thanks for your response.  I have data that looks like the following:
> >
> >                sample 1         sample 2         sample 3  ....
> > red candy        400                 300               2500
> > green candy    100                    0                  200
> > black candy     300                1000                500
> >
> > I don't want to randomly select either the samples (columns) or the
> > "candy" types (rows), which sample as you state would allow me.
> > Instead, I want to randomly sample 100 candies from each sample and
> > retain info on their associated type.  I could make a list of all the
> > candies in each sample:
> >
> > sample 1
> > red
> > red
> > red
> > red
> > green
> > green
> > black
> > red
> > black
> > ...
> >
> > and then randomly sample those rows.  Repeat for each sample.  But I
> > am not sure how to do that without a lot of loops, and am wondering
> > if there is an easier way in R.  Thanks!  I should have laid this out
> > in the first email...sorry.