Hi:
On Thu, Feb 10, 2011 at 10:50 AM, Hui Du <Hui.Du@dataventures.com> wrote:
>
> Hi all,
>
> I have a dataset. Each time I want to sample N(i) elements
> from it, and I want to repeat the sampling M times. N(i) varies from time
> to time. For example,
>
> dataset = 1:50;
> a = list();
>
> M = 1000;
>
> I want to do something like
> for(i in 1:M)
>
> {
> a[[i]] = sample(dataset, sample(length(dataset), 1))
> }
>
For this specific example, isn't it the same as
a <- sample(dataset, M, replace = TRUE)
with
tabulate(sample(a, 1000, replace = TRUE))
 [1] 22 22 17 22 20 12 19 23 26 22 22 22 13 16 14 23 15 27 25 21 23 16 15 22 24
[26] 19 23 27 20 19 19 16 14 21 16 23 16 27 15 18 21 26 14 22 15 25 28 14 20 19
representing the corresponding table of counts? This doesn't seem to me to
be the same as sampling N(i) elements from a (where I presume i represents
an iteration number) and then sampling from that M times. Here's an example
of that construct:
# Sample m elements from x and resample from the subvector M times
sfun <- function(x, m, M = 1000) {
  if (m > length(x)) stop('m must not exceed length(x)')
  idx <- sample.int(length(x), m)
  # Resample by position: sample(x[idx], ...) would treat a single
  # number as 1:x when m = 1
  x[idx[sample.int(m, M, replace = TRUE)]]
}
# Vector of the number of subelements to sample (your m)
mvec <- c(10, 20, 15, 30, 18)
# Fix M = 1000 and use lapply() to generate each replicated set of
# subsamples from a
> ll <- lapply(mvec, function(x) sfun(a, x, 1000))
# length is right ( = length(mvec))
> length(ll)
[1] 5
# length of each sample from the subvector is right
> sapply(ll, length)
[1] 1000 1000 1000 1000 1000
# Generate frequency tables for the sets of resampled subvectors
> sapply(ll, table)
[[1]]
5 6 14 19 20 26 28 29 41 50
122 91 93 111 102 99 91 105 92 94
[[2]]
1 2 3 5 7 12 13 15 25 27 31 32 33 35 37 39 41 44 45 47
62 55 65 47 65 44 39 44 49 48 47 42 54 48 42 51 58 48 42 50
[[3]]
2 7 9 10 17 18 20 22 24 32 33 42 44 46 50
71 82 64 63 65 76 54 62 72 57 64 74 63 62 71
[[4]]
3 6 7 8 9 10 11 12 14 15 16 17 20 21 22 24 25 28 33 35 40 41 42 43 44 45
31 35 35 35 37 37 38 33 33 38 43 25 37 39 37 25 42 36 32 26 32 38 21 34 31 36
46 48 49 50
27 29 31 27
[[5]]
2 5 6 7 8 11 12 14 16 19 21 22 26 28 38 40 42 48
53 56 54 57 55 47 45 53 57 62 66 65 51 58 57 57 50 57
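The tables above come from random draws, so they will differ from run to run. As a rough sketch of how to make such a run reproducible and sanity-check it, the snippet below repeats sfun() (here resampling by position, a small variation) so that it runs on its own. The seed value, the name tabs, and the assumption that a holds the 50 distinct values 1:50 are mine, not part of the original session.

```r
# Same construct as sfun(), repeated here so the snippet is self-contained
sfun <- function(x, m, M = 1000) {
  if (m > length(x)) stop("m must not exceed length(x)")
  idx <- sample.int(length(x), m)              # positions of the m-element subvector
  x[idx[sample.int(m, M, replace = TRUE)]]     # resample M times from that subvector
}

set.seed(2011)                  # arbitrary seed, only for reproducibility
a <- 1:50                       # assumption: a holds 50 distinct values
mvec <- c(10, 20, 15, 30, 18)

ll <- lapply(mvec, function(m) sfun(a, m, 1000))
tabs <- lapply(ll, table)       # always a list, one table per element of mvec

sapply(tabs, sum)               # every entry is 1000 (each table's counts add up to M)
sapply(tabs, length) <= mvec    # no more distinct values than were subsampled
```

Using lapply(ll, table) rather than sapply() is a small design choice: the tables have different lengths, so the result is a list either way, and lapply() makes that explicit.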
I have no idea if this is what you had in mind. If not, please try again and
be more careful about explaining what you need.
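If what you want is exactly what your for() loop does, only without writing the loop, a minimal sketch might look like the following. It draws all M sample sizes up front and then uses lapply() over them. The seed and the names n and k are my own choices for illustration, and I am assuming each N(i) is meant to be uniform on 1, ..., length(dataset), as in your loop.

```r
# Loop-free sketch of the for() loop in the original post
set.seed(1)                 # arbitrary seed, only for reproducibility
dataset <- 1:50
M <- 1000

# Draw all M sample sizes N(i) at once, each between 1 and length(dataset)
n <- sample(length(dataset), M, replace = TRUE)

# For each size k, draw k distinct elements of dataset by position
a <- lapply(n, function(k) dataset[sample.int(length(dataset), k)])

length(a)                   # M
range(sapply(a, length))    # stays within 1 and length(dataset)
```

Note that lapply() still iterates internally, so this is mainly tidier rather than faster. Indexing by position with sample.int() also avoids the surprise that sample(x, k) treats a single number x as 1:x.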
HTH,
Dennis
> But my question is whether there is a more elegant solution
> for this. For example, can I do the repeated sampling without
> using a loop?
>
> Many thanks.
>
>
> HXD
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>