I need to use R to model a large number of experiments (say, 1000). Each experiment involves the random selection of 5 numbers (without replacement) from a pool of numbers ranging between 1 and 30. What I need to know is what *proportion* of those experiments contains two or more numbers that are consecutive. So, for instance, an experiment that yielded the numbers 2, 28, 31, 4, 27 would be considered a "consecutive true" experiment since 28 and 27 are two consecutive numbers, even though they are not side-by-side. I am quite new to R, so really am puzzled as to how to go about this. I've tried sorting each experiment, and then subtracting adjacent pairs of numbers to see if the difference is plus or minus 1. I'm also unsure about whether to use an array to store all the data first. Any assistance would be much appreciated. -- View this message in context: http://www.nabble.com/How-do-you-test-for-%22consecutivity%22--tp16959748p16959748.html Sent from the R help mailing list archive at Nabble.com.
Anthony28 wrote:
> What I need to know is what *proportion* of those experiments contains two
> or more numbers that are consecutive.
[...]
> I'm also unsure about whether to use an array to store all the data first.

> Vec <- c(2, 28, 31, 4, 27)

> Vec
[1]  2 28 31  4 27

# Sort the vector
> sort(Vec)
[1]  2  4 27 28 31

# Get differences between sequential elements
> diff(sort(Vec))
[1]  2 23  1  3

# Are any differences == 1?
> any(diff(sort(Vec)) == 1)
[1] TRUE

See ?sort, ?diff and ?any for more information.

On your last question, if the data are all numeric and each experiment contains 30 elements from which you select five, then you can store the data in an N x 30 matrix, where N is the number of source data sets. The result could be stored in an N x 5 matrix.

You can then run your test of sequential members as follows, presuming 'Res' contains the N x 5 result matrix:

prop.table(table(apply(Res, 1, function(x) any(diff(sort(x)) == 1))))

The output will be the proportion TRUE/FALSE of rows that have sequential elements.

HTH,

Marc Schwartz
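For anyone wanting to try the matrix approach end to end, here is a minimal sketch of one way 'Res' might be built. It assumes all 30 numbers are equally likely; the seed, `N`, and the name `flag` are illustrative choices, not part of the reply above.

```r
set.seed(42)   # arbitrary seed, only so the run is reproducible
N <- 1000      # number of simulated experiments (assumed value)

# replicate() returns a 5 x N matrix, so transpose it to get the N x 5
# layout described above: one row per experiment
Res <- t(replicate(N, sample(1:30, 5)))

# One logical per row: TRUE if any two of the five numbers differ by 1
flag <- apply(Res, 1, function(x) any(diff(sort(x)) == 1))

# Proportion of FALSE/TRUE rows
prop.table(table(flag))
```

With only 1000 rows the TRUE proportion will vary by a couple of percentage points from run to run.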
How about this:

result <- numeric(10)
for(i in 1:10){
  x <- sample(1:30, 5, replace = FALSE)  # one experiment: 5 draws from 1..30
  x <- sort(x)
  result[i] <- any(diff(x) == 1)         # TRUE (stored as 1) if any adjacent pair differs by 1
}

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Anthony28
> Sent: Tuesday, April 29, 2008 8:52 AM
> To: r-help at r-project.org
> Subject: [R] How do you test for "consecutivity"?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
This will work:

my.list <- c(2, 28, 31, 4, 27)
sort(my.list)
diff(sort(my.list))
any(diff(sort(my.list)) == 1)

The middle two lines are only to illustrate what's going on.

Best wishes!

Charles Annis, P.E.

Charles.Annis at StatisticalEngineering.com
phone: 561-352-9699
eFax: 614-455-3265
http://www.StatisticalEngineering.com
On Tue, 29 Apr 2008, Anthony28 wrote:

> What I need to know is what *proportion* of those experiments contains two
> or more numbers that are consecutive.

Are the numbers 1:30 equiprobable??

If so, you can find the probability by direct enumeration.

> mat <- combn(30,5) # each column happens to be in order
> tab <- table( mat[2:5,]-mat[1:4,]==1, col(mat[1:4,]) )
> table(tab[2,])

    0     1     2     3     4
65780 59800 15600  1300    26

> prop.table( table(tab[2,] != 0 ) )

    FALSE      TRUE
0.4615946 0.5384054

If the numbers are not equiprobable, you will need to weight the values of tab[2,] according to the probability of each column of mat.

HTH,

Chuck

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
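The same enumeration can be written a little more compactly by letting diff() work on the whole matrix, since diff() on a matrix takes differences down each column. This is only a sketch under the equiprobable assumption; `has.pair` and `p.exact` are names introduced here for illustration.

```r
# All 5-number subsets of 1..30, one per column; combn() returns each
# column already sorted in increasing order
mat <- combn(30, 5)

# diff() on a matrix differences successive rows within each column,
# giving a 4 x ncol(mat) matrix of gaps between neighbouring numbers
gaps <- diff(mat)

# TRUE for every subset that has at least one gap of exactly 1
has.pair <- colSums(gaps == 1) > 0

# Exact probability when every subset is equally likely
p.exact <- mean(has.pair)
p.exact
```

This should reproduce the 0.5384054 reported above, and sum(!has.pair) should match the 65780 subsets with no consecutive pair.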
Hey Anthony,

There must be many ways to do this. This is one of them:

# First, define a function to calculate the proportion of consecutive numbers in a vector.
prop.diff=function(x){
  d=diff(sort(x))
  in.pair=c(FALSE,d==1) | c(d==1,FALSE)  # TRUE for each number with a neighbour differing by 1
  prop=sum(in.pair)/length(x)
  return(prop)}

# Note that I am counting both numbers in a consecutive pair. For example, the vector c(1,2,6,9,10) will contain 4 consecutive numbers. I think this is what you wanted to do, right?

# Next, generate a matrix with 1000 columns (one for each experiment) and 5 rows (the five numbers in each experiment). Note the use of the 'replicate' function to generate multiple sets of random numbers.
selection=replicate(1000,sort(sample(1:30,5)))

# Third, use the apply function to apply the function we defined above to each column of the matrix.
diffs=apply(selection,2,prop.diff)

# This will give you a vector with the 1000 proportions of consecutive numbers.

Julian
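The original question asked for the proportion of experiments that contain at least one consecutive pair, which is a single number rather than one value per experiment. A minimal sketch of that reduction follows, assuming equiprobable draws; `has.consecutive`, `hits`, `p.hat` and the seed are hypothetical names and values chosen for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical helper: TRUE if any two numbers in x differ by exactly 1
has.consecutive <- function(x) any(diff(sort(x)) == 1)

# One logical per experiment; replicate() returns a logical vector here
hits <- replicate(1000, has.consecutive(sample(1:30, 5)))

# Proportion of experiments with at least one consecutive pair
p.hat <- mean(hits)

# Rough Monte Carlo standard error of that estimate
se <- sqrt(p.hat * (1 - p.hat) / length(hits))
c(estimate = p.hat, std.error = se)
```

Taking the mean of a logical vector counts the TRUE values as 1, so no separate counting step is needed.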
I'd just like to thank all you guys for stepping in so promptly with help. I haven't had a chance to implement any of your code yet, but just by looking over what you've suggested, I think I have enough to guide me. So thanks once again!
Charles C. Berry:

> Are the numbers 1:30 equiprobable??
>
> If so, you can find the probability by direct enumeration.

Or by a simple formula:

* Probabilities of Consecutive Integers in Lotto
* Author(s): Stanley P. Gudder and James N. Hagler
* Source: Mathematics Magazine, Vol. 74, No. 3 (Jun., 2001), pp. 216-222
* Publisher: Mathematical Association of America
* Stable URL: http://www.jstor.org/stable/2690723

--
Karl Ove Hufthammer
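As a sketch of what such a formula can look like: a standard counting result says that the number of k-subsets of 1..n with no two consecutive members is choose(n - k + 1, k). It is not confirmed here that this is the exact form the cited article uses, and `p.consecutive` is a name made up for this example. All subsets are assumed equally likely.

```r
# Hypothetical helper: probability that k numbers drawn without replacement
# from 1..n include at least one consecutive pair. The complement counts the
# k-subsets with no two consecutive members, choose(n - k + 1, k).
p.consecutive <- function(n, k) {
  1 - choose(n - k + 1, k) / choose(n, k)
}

# The case from the original question: 5 numbers from 1..30
p.consecutive(30, 5)
```

For n = 30 and k = 5 the no-consecutive count is choose(26, 5) = 65780, the same figure that appears in the enumeration earlier in the thread.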