thr3ads.net - R help - [R] vectorized approach to cumulative sampling [Apr 2005]

Daniel E. Bunker

2005-Apr-07 21:19 UTC

[R] vectorized approach to cumulative sampling

Hi All,

I need to sample a vector ("old"), with replacement, up to the point 
where my vector of samples ("new") sums to a predefined value 
("target"), shortening the last sample if necessary so that the total 
sum ("newsum") of the samples matches the predefined value.

While I can easily do this with a "while" loop (see below for example 
code), because the length of both "old" and "new" may be > 20,000, a
vectorized approach will save me lots of CPU time.

Any suggestions would be greatly appreciated.

Thanks, Dan

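# loop approach
old=c(1:10)
p=runif(10)   # sampling weights, one per element of "old"
target=20

newsum=0
new=NULL
# draw one value at a time until the running total reaches the target
while (newsum<target) {
    i=sample(old, size=1, prob=p);
    new[length(new)+1]=i;
    newsum=sum(new)
    }
new
newsum
target
# shorten the last sample if the total overshot the target
if(newsum>target){new[length(new)]=target-sum(new[-length(new)])}
new
newsum=sum(new); newsum
target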

-- 

Daniel E. Bunker
Associate Coordinator - BioMERGE
Post-Doctoral Research Scientist
Columbia University
Department of Ecology, Evolution and Environmental Biology
1020 Schermerhorn Extension
1200 Amsterdam Avenue
New York, NY 10027-5557

212-854-9881
212-854-8188 fax
deb37ATcolumbiaDOTedu

(Ted Harding)

2005-Apr-07 21:46 UTC

Hi Dan,
You should be able to adapt the following vectorised approach
to your specific needs:

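  old<-0.001*(1:1000)
  p<-runif(length(old))   # one sampling weight per element of old
  new<-sample(old,10000,replace=TRUE,prob=p)
  target<-200
  min(which(cumsum(new)>target))   # first draw at which the running total exceeds target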

## [1] 385

This took only a fraction of a second on my medium-speed machine.
If you get an "Inf" as result, then 'new' doesn't add up to
'target', so you have to extend it.

Hoping this helps,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 07-Apr-05                                       Time: 22:46:12
------------------------------ XFMail ------------------------------
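
The "extend it" step mentioned above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name sample_until and its chunk argument are hypothetical, and it assumes that every element of 'old' is positive and that 'target' is greater than zero (otherwise the loop might never finish). It keeps appending batches of draws until the running total reaches 'target', so the "Inf" case does not arise.

```r
# Hypothetical helper (not from the thread): draw in chunks until the
# running total reaches 'target'. Assumes all(old > 0) and target > 0.
sample_until <- function(old, p, target, chunk = 1000L) {
  new <- numeric(0)
  while (sum(new) < target) {
    # append another batch of draws, with replacement, weighted by p
    idx <- sample.int(length(old), chunk, replace = TRUE, prob = p)
    new <- c(new, old[idx])
  }
  n <- which(cumsum(new) >= target)[1]   # first draw that reaches the target
  new[seq_len(n)]
}

set.seed(1)
old <- 0.001 * (1:1000)
p <- runif(length(old))
new <- sample_until(old, p, target = 200)
length(new)   # number of draws needed
sum(new)      # at least 'target'; the last draw is not yet shortened
```

Indexing with sample.int() rather than calling sample(old, ...) directly avoids the special case in which sample() treats a length-one numeric 'old' as 1:old.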

Rich FitzJohn

2005-Apr-07 21:47 UTC

Hi,

sample() takes a "replace" argument, so you can take large samples,
with replacement, like this. (In the sample() call, the
50*target/mean(old) should make it draw about 50 times as many values
as are likely to be needed, so the while loop will probably be
executed only once.  This could be tuned easily, and there may be
better ways of guessing how much to take.)

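old <- 1:2000
p <- runif(length(old))   # sampling weights, one per element of old
target <- 4000
new <- 0

# redraw a generous batch until its total reaches the target
while ( sum(new) < target )
  new <- sample(old, ceiling(50*target/mean(old)), TRUE, p)

i <- which(cumsum(new) >= target)[1]   # first draw that reaches the target
new <- new[seq_len(i)]
new[i] <- new[i] - (sum(new)-target)   # shorten the last draw to hit target exactly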

Cheers,
Rich

-- 
Rich FitzJohn
rich.fitzjohn <at> gmail.com   |   
http://homepages.paradise.net.nz/richa183
                      You are in a maze of twisty little functions, all alike
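
As one possible way of tuning the guess mentioned in Rich's reply, the batch size could be based on the weighted mean of 'old' under the weights 'p' rather than on mean(old). The wrapper below is a sketch under assumptions, not code from the thread: the name sample_to_target and the 'safety' factor are hypothetical, and it requires positive values in 'old' and a positive 'target'. Like the reply, it redraws the whole batch if the total falls short, then trims the last draw.

```r
# Hypothetical wrapper (names are illustrative, not from the thread).
sample_to_target <- function(old, p, target, safety = 2) {
  stopifnot(all(old > 0), target > 0, length(old) == length(p))
  # expected value of a single draw when sampling with weights p
  mean_draw <- sum(old * p) / sum(p)
  # batch size: 'safety' times the number of draws expected to be needed
  n_guess <- ceiling(safety * target / mean_draw)
  repeat {
    idx <- sample.int(length(old), n_guess, replace = TRUE, prob = p)
    new <- as.numeric(old[idx])
    if (sum(new) >= target) break          # batch was large enough
  }
  i <- which(cumsum(new) >= target)[1]     # first draw that reaches the target
  new <- new[seq_len(i)]
  new[i] <- new[i] - (sum(new) - target)   # shorten the last draw
  new
}

set.seed(42)
old <- 1:2000
p <- runif(length(old))
new <- sample_to_target(old, p, target = 4000)
sum(new)   # matches 'target'
```

A smaller 'safety' value draws less per batch but makes a redraw more likely; 2 is an arbitrary starting point.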
