Hi all,

sample() has some well-documented undesirable behaviour.

sample(1:6, 1)
sample(2:6, 1)
...
sample(5:6, 1)

do what you expect, but

sample(6:6, 1)
sample(1:6, 1)

do the same thing: 6:6 is a numeric vector of length 1, so sample()
draws from 1:6 instead of from the single value 6.

This behaviour is documented:

    If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
    'x >= 1', sampling _via_ 'sample' takes place from '1:x'.  _Note_
    that this convenience feature may lead to undesired behaviour when
    'x' is of varying length 'sample(x)'.  See the 'resample()'
    example below.
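
For reference, the workaround that the help page points to can be sketched roughly as follows. The helper is written along the lines of the resample() example in ?sample; the calls underneath it are only illustrative and are not part of the attached patch.

```r
# Helper along the lines of the resample() example in ?sample:
# index into x explicitly, so a length-1 x is never expanded to 1:x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

x <- 6:6
length(sample(x))   # 6: sample() returns a permutation of 1:6
resample(x, 1)      # 6: the only element of x
resample(2:6, 1)    # one of 2, 3, 4, 5, 6
```

This avoids the problem, but only for callers who know to use a helper instead of sample() itself.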

My proposal is to add an extra parameter is.set to sample() to control
this behaviour. If the parameter is unspecified, then we keep the old
behaviour for compatibility. If it is TRUE, then x is always treated as
the set to sample from, even when it has length 1. If it is FALSE, then
x is treated as a set size, i.e. sampling takes place from 1:x.
This means that

sample(6:6, 1, is.set=TRUE)

would return 6 with probability 1.
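
To make the three cases concrete, here is a minimal sketch of the intended semantics written as a wrapper around the existing functions. It is not the attached patch: the name sample_set is hypothetical, and the handling of a missing size is an assumption that simply mirrors what sample() does today.

```r
# Hypothetical wrapper, NOT the attached patch: it mimics the proposed
# is.set argument on top of the existing sample() and sample.int().
sample_set <- function(x, size, replace = FALSE, prob = NULL, is.set) {
  if (missing(is.set)) {
    # Unspecified: keep the current behaviour of sample().
    return(sample(x, size, replace, prob))
  }
  if (is.set) {
    # Treat x as the set itself, even when it has length 1.
    if (missing(size)) size <- length(x)
    return(x[sample.int(length(x), size, replace, prob)])
  }
  # is.set = FALSE: x must be a single number giving the set size.
  stopifnot(is.numeric(x), length(x) == 1, x >= 1)
  if (missing(size)) size <- x
  sample.int(x, size, replace, prob)
}

sample_set(6:6, 1, is.set = TRUE)    # always 6
sample_set(6, 1, is.set = FALSE)     # one of 1, ..., 6
sample_set(6:6, 1)                   # old behaviour: one of 1, ..., 6
```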

I have attached a patch to implement this new option.

Cheers,
Andrew
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sample.diff
Type: text/x-patch
Size: 3636 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20100322/cc18f9ee/attachment.bin>

Hi all,

I forgot to test my patch! I fixed a few bugs.

Cheers,
Andrew

On 22 March 2010 22:53, Andrew Clausen <clausen at econ.upenn.edu> wrote:
> Hi all,
>
> sample() has some well-documented undesirable behaviour.
>
> sample(1:6, 1)
> sample(2:6, 1)
> ...
> sample(5:6, 1)
>
> do what you expect, but
>
> sample(6:6, 1)
> sample(1:6, 1)
>
> do the same thing.
>
> This behaviour is documented:
>
>     If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
>     'x >= 1', sampling _via_ 'sample' takes place from '1:x'.  _Note_
>     that this convenience feature may lead to undesired behaviour when
>     'x' is of varying length 'sample(x)'.  See the 'resample()'
>     example below.
>
> My proposal is to add an extra parameter is.set to sample() to control
> this behaviour. If the parameter is unspecified, then we keep the old
> behaviour for compatibility. If it is TRUE, then we treat the first
> parameter x as a set. If it is FALSE, then we treat it as a set size.
> This means that
>
> sample(6:6, 1, is.set=TRUE)
>
> would return 6 with probability 1.
>
> I have attached a patch to implement this new option.
>
> Cheers,
> Andrew
>