Hello, all,

I'm learning to do randomized distributions in my Stats 101 class*. I
thought I could do it with a call to sample() inside a matrix(), like:

> matrix(sample(1:10, replace=TRUE), 5, 10, byrow=TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    8    2    3    1    8    2    8    8    9     8
[2,]    8    2    3    1    8    2    8    8    9     8
[3,]    8    2    3    1    8    2    8    8    9     8
[4,]    8    2    3    1    8    2    8    8    9     8
[5,]    8    2    3    1    8    2    8    8    9     8
>

Imagine my surprise to learn that all the rows were the same
permutation. I thought each time sample() was called inside the matrix,
it would generate a different permutation.

I modeled this after the bootstrap sample techniques in
https://pages.stat.wisc.edu/~larget/stat302/chap3.pdf. I don't
understand why it works in bootstrap samples (with replace=TRUE), but
not in randomized distributions (with replace=FALSE).

Thanks for any insight you can share with me, and any suggestions for
getting rows in a matrix with different permutations.

-Kevin

*No, this isn't a homework problem. We're using Lock5 as the text in
class, along with its StatKey web application. I'm just trying to get
more out of the class by also solving our problems using R, for which
I'm not receiving any class credit.
Your call to `sample` does not specify `size`, the number of values to
return, so it defaults to the length of `x`, in this case 10. The `matrix`
function then recycles that vector of 10 enough times to fill the matrix.
To do what you want, you just need to specify `size` as the total number
of values you want sampled: 50, or 5*10, in your case. So the following
should do what you want:

matrix(sample(1:10, 5*10, replace=TRUE), 5, 10, byrow=TRUE)

In this case `byrow` does not really matter, since you are just filling
in random values.

Hope this helps,
-- 
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
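
A minimal sketch of the difference between the two calls, for anyone who wants to see it directly. The seed and the variable names (x, m_recycled, m_full) are illustrative assumptions, not part of the original posts.

```r
set.seed(1)  # assumption: any seed works; fixed only so reruns match

x <- sample(1:10, replace = TRUE)   # size defaults to length(1:10)
length(x)                           # 10

# matrix() needs 50 values but receives 10, so it recycles x five times.
m_recycled <- matrix(x, nrow = 5, ncol = 10, byrow = TRUE)

# Asking sample() for all 50 values gives every cell its own draw.
m_full <- matrix(sample(1:10, size = 5 * 10, replace = TRUE),
                 nrow = 5, ncol = 10, byrow = TRUE)

# Every row of m_recycled is the same vector x.
all(apply(m_recycled, 1, function(r) identical(r, x)))   # TRUE
```

Printing m_recycled and m_full side by side shows the repeated rows in the first and independent draws in the second.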
The textbook uses an extra argument, 'size'. If you do the same, it
should work:

matrix(sample(1:10, size = 5 * 10, replace=TRUE), 5, 10, byrow=TRUE)
Bravo for your unrequired R efforts.

You misunderstand the nested call. sample() is called only once, producing
one sample of 10 with replacement. Since your matrix() call needs 50
values, ?matrix tells you (in Details):

"If there are too few elements in data to fill the matrix, then the
elements in data are recycled. If data has length zero, NA of an
appropriate type is used for atomic vectors (0 for raw vectors) and NULL
for lists."

This sort of "recycling" is quite standard in R, though not universal.

Cheers,
Bert

"An educated person is one who can entertain new ideas, entertain others,
and entertain herself."
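
A small sketch of recycling in a few places, assuming nothing beyond base R. The data.frame() call is just one example of a function that refuses to recycle uneven lengths; it is not necessarily the case the post above had in mind.

```r
# Arithmetic recycles the shorter vector to the length of the longer one.
1:6 + c(10, 20)          # 11 22 13 24 15 26

# matrix() recycles its data the same way when the lengths divide evenly.
m <- matrix(1:3, nrow = 2, ncol = 3, byrow = TRUE)
m                        # both rows are 1 2 3

# data.frame() is stricter: columns of length 3 and 2 cannot be combined,
# so this call raises an error, which is caught here.
res <- tryCatch(data.frame(a = 1:3, b = 1:2), error = function(e) "error")
res                      # "error"
```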
avi.e.gross at gmail.com
2025-Mar-13 21:37 UTC
[R] What don't I understand about sample()?
Kevin,

It is simple. Your matrix has fifty entries and you supplied just 10. R
quietly recycles the sample as often as needed, as long as it fits a whole
number of times, so you get five copies. If you filled the matrix by column
instead, with byrow=FALSE, then the pattern would repeat every two columns.

Ask for 50!

matrix(sample(1:50, replace=TRUE), 5, 10, byrow=TRUE)

But decide what you want. Your original call draws numbers in the range 1
to 10. Asking for 50 as I showed draws from 1:50 instead, and will get you
something like this:
matrix(sample(1:50, replace=TRUE), 5, 10, byrow=TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]   15   32   20    4   44   20   34    2   30    14
[2,]   20    8   42    8   46   45   10   27   27     9
[3,]   26   12   15   26    8   47   25   31   38    31
[4,]   47    5    2   28   13   33   19    3    3    49
[5,]   12    1   11    3   12   21    1   19   30    31

What you may want is this, with size=5*10:

matrix(sample(x=1:10, size=5*10, replace=TRUE), 5, 10, byrow=TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    2    1    9    9    3    5    7    4    4    10
[2,]    4    1    8    7    1    1    5    1    6    10
[3,]    4    3    6    2    4    4   10   10    8     8
[4,]   10    6    3    2    8   10   10    2    7     9
[5,]    2    4    2    5    5   10   10   10    8     1
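
To see how the fill direction changes the recycling pattern, here is a rough sketch that uses the fixed vector 1:10 in place of a random sample, so the repeats are easy to spot. The names v, by_row and by_col are illustrative.

```r
v <- 1:10   # stand-in for the 10 sampled values

# Filled by row: each of the 5 rows is a full copy of v.
by_row <- matrix(v, nrow = 5, ncol = 10, byrow = TRUE)

# Filled by column: columns alternate between v[1:5] and v[6:10].
by_col <- matrix(v, nrow = 5, ncol = 10, byrow = FALSE)
by_col

identical(by_col[, 1], by_col[, 3])   # TRUE
identical(by_col[, 2], by_col[, 4])   # TRUE
```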
This is fun.

In a stats class you are trying to deal with data. Behind the data there is
an underlying distribution; in R, a random number generator plays that
role. I have a population that follows the underlying distribution. In this
case my population is 10,000 individuals with a true population mean of 40
and a standard deviation of 6.

Population1 <- round(rnorm(10000, 40, 6), 2) ### rounds numbers to 2 decimal places
However, Population1 can be any vector of any size, and the code will work.

Population1 <- 1:10
Population1 <- c("a", "b", "C", "D", "e", "f")
Population1 <- letters[1:13] ### note square brackets here. Round ones do not work.
Population1 <- c(letters[1:13], LETTERS[8:21])
I cannot test every individual in the population of 10,000. My experiment must
sample the population. I hope to get away with a sample size of five, but I want
to understand the variability in my outcomes. I will take ten sets of five
values (just as you have in your example).

### Matrix1 does not exist yet, so give the size as rows * columns
Matrix1 <- matrix(sample(Population1, size = 5 * 10, replace = TRUE),
                  nrow = 5, ncol = 10)
print(Matrix1)
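
One way to look at that variability is to treat each column as one set of five values and compare the column means. This is only a rough sketch: the seed and the names population, samples and set_means are assumptions, and the population is rebuilt here so the block runs on its own.

```r
set.seed(42)                                  # assumption: fixed only for reproducibility
population <- round(rnorm(10000, 40, 6), 2)   # same kind of population as Population1

# Ten sets of five values: each column is one sample of size 5.
samples <- matrix(sample(population, size = 5 * 10, replace = TRUE),
                  nrow = 5, ncol = 10)

set_means <- colMeans(samples)   # one mean per set
set_means
range(set_means)                 # how far apart the sample means fall
```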
In some cases, I find it easier to understand if I use loops instead. This is
just a different way to solve the same problem.

# Fill the matrix using for loops
Matrix1 <- matrix(0, 5, 10) ### create and initialize the matrix
for (i in 1:nrow(Matrix1)) {
  for (j in 1:ncol(Matrix1)) {
    Matrix1[i, j] <- sample(Population1, 1) # Pick a random value from Population1
  }
}
print(Matrix1)
If you want every row to have every value in Population1 (in the case where
Population1 <- 1:10), you need a separate permutation for each row. Simply
changing replace=TRUE to replace=FALSE in

Matrix1 <- matrix(sample(Population1, size = 5 * 10, replace = TRUE),
                  nrow = 5, ncol = 10)

does not work, because sample() cannot draw 50 values without replacement
from a population of 10 and stops with an error. Instead, call sample()
once per row. With no size argument and replace = FALSE (the default), each
call returns a permutation. This is also generic, since the number of
columns is automatically the length of Population1.

Matrix1 <- t(replicate(5, sample(Population1)))
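
A quick check of the row-by-row idea, as a sketch under stated assumptions: the seed and the names population, msg and perms are illustrative. It first shows that asking for 50 values without replacement from a population of 10 fails, then builds one permutation per row with replicate().

```r
set.seed(7)          # assumption: fixed only so the run is repeatable
population <- 1:10   # stand-in for Population1

# Asking for 50 values without replacement from 10 fails; capture the message.
msg <- tryCatch(
  { sample(population, size = 50, replace = FALSE); "no error" },
  error = function(e) conditionMessage(e)
)
msg

# One permutation per row: replicate() calls sample() five separate times.
# replicate() returns one sample per column, so t() turns them into rows.
perms <- t(replicate(5, sample(population)))
perms

# Each row, once sorted, should equal the population.
all(apply(perms, 1, function(r) identical(sort(r), population)))   # TRUE
```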
In your example you told R to take a random sample of ten values (from the
integers 1 to 10) and then R made five copies to fill the matrix. To make that
approach work as planned, you could use a hybrid approach like this, where I
take a fresh random sample of ten values for each row in the matrix.

Matrix1 <- matrix(0, 5, 10)
# Fill the matrix row-wise using a single loop
for (i in 1:nrow(Matrix1)) {
  sample1 <- sample(Population1, 10, replace = TRUE) # Sample 10 values for the row
  Matrix1[i, ] <- sample1 # Directly assign the entire row
}
# Print the filled matrix
print(Matrix1)
You can also make and use your own variables.

matrix_rows <- 5
matrix_columns <- 10
values <- matrix_rows * matrix_columns
pop_min <- 1
pop_max <- 10
Population1 <- pop_min:pop_max
Matrix1 <- matrix(sample(Population1, size = values, replace = TRUE),
                  nrow = matrix_rows, ncol = matrix_columns)
print(Matrix1)

You can look at the effect of sample size by changing matrix_rows at the top of
the program.
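
As a rough sketch of that experiment, the block below compares how much the sample means vary for two sample sizes. The helper spread_of_means() is hypothetical, and the seed, the normal population and the choice of 1000 sets are assumptions made so the comparison is stable.

```r
set.seed(2025)                                # assumption: fixed only for reproducibility
population <- round(rnorm(10000, 40, 6), 2)   # illustrative population, mean 40 and sd 6

# Hypothetical helper: draw `sets` samples of size `rows` (one per column)
# and return the standard deviation of their means.
spread_of_means <- function(rows, sets = 1000) {
  m <- matrix(sample(population, size = rows * sets, replace = TRUE),
              nrow = rows, ncol = sets)
  sd(colMeans(m))
}

spread_small <- spread_of_means(rows = 5)    # sample size 5
spread_large <- spread_of_means(rows = 50)   # sample size 50
c(size_5 = spread_small, size_50 = spread_large)
```

The means of the larger samples should cluster more tightly, with a spread of roughly 6 / sqrt(rows) for this population.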
Tim