Shuhua Zhan
2008-Oct-30 20:44 UTC
[R] why does sample(x, n) give the same n items in every separate runs?
Hello R users,

I have gene expression data for two groups of genes (large and small). The expression intensities of those genes are classified into levels 1 to 10. I want to use sample() to draw a random set of genes from the large group that has the same numbers of genes per level as the small group.

I use smallvec to hold the number of genes at each level (1 to 10) for the small group, and largevec for the large group. I ordered the gene expression data frame of the large group (largedf) by level and randomly chose genes at each level to match the small group. With the code below I can get such a random set of genes from the large group. But I get the same set of genes in every separate run of the code on Linux, which makes me doubt my result.

smallvec<-c(8,12,9,6,13,20,16,11,8,5) ## the No. of genes in levels 1 to 10
largevec<-c(400,300,550,600,210,420,380,600,450,500)
generdm<-c() ## a random set of genes
for( i in 1:length(smallvec)){
  generdm<-c(generdm,sample(rownames(largedf)[sum(largevec[0:(i-1)],1):sum(largevec[0:i])],smallvec[i]))
  ## rownames(largedf) gives gene names ordered by levels in large group
}

Could you please help me out?
Thanks a lot!!
Josh
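The per-level sampling described in the question can also be written without manual index arithmetic. The following is only a sketch under stated assumptions: largedf is not shown in the thread, so a stand-in data frame with a hypothetical `level` column and made-up gene names is constructed here, and the split()/Map() approach is an alternative formulation, not the poster's code.

```r
# Counts taken from the question.
smallvec <- c(8, 12, 9, 6, 13, 20, 16, 11, 8, 5)                 # genes wanted per level
largevec <- c(400, 300, 550, 600, 210, 420, 380, 600, 450, 500)  # genes available per level

# ASSUMPTION: stand-in for the real largedf, already ordered by level.
# The "level" column and the gene names are hypothetical.
largedf <- data.frame(level = rep(1:10, times = largevec))
rownames(largedf) <- paste0("gene", seq_len(nrow(largedf)))

# Group the gene names by level, then draw smallvec[i] names from level i.
by_level <- split(rownames(largedf), largedf$level)
picked   <- Map(sample, by_level, smallvec)   # calls sample(x, size) once per level
generdm  <- unlist(picked, use.names = FALSE)

length(generdm)                    # total number of genes drawn
table(largedf[generdm, "level"])   # genes drawn per level, to compare with smallvec
```

Because the grouping is done by split(), the result does not depend on the rows being sorted by level, which removes one possible source of off-by-one errors in the index ranges.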
Dieter Menne
2008-Oct-30 21:12 UTC
[R] why does sample(x, n) give the same n items in every separate runs?
Shuhua Zhan <szhan <at> uoguelph.ca> writes:

> smallvec<-c(8,12,9,6,13,20,16,11,8,5)
> largevec<-c(400,300,550,600,210,420,380,600,450,500)
> generdm<-c() ## a random set of genes
> for( i in 1:length(smallvec)){

# try to chop down this line and add a print to understand
# what is going on
generdm<-c(generdm, sample(rownames(largedf)[sum(largevec[0:(i-1)],1):sum(largevec[0:i])],smallvec[i]))

> ## rownames(largedf) gives gene names ordered by levels in
> }

Since your code does not run, and you do not show how you got your result, it is a bit of guesswork. I believe it has nothing to do with sample(), but rather with the way you "store" (= do not store) the results. Your way of storing looks a bit like PHP. Try a variation of this one:

generdm = list()
for (i in 1:10) {
  generdm[[i]] = rnorm(10) # put your sample construct here
}
generdm

Dieter
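Dieter's two suggestions (split the long line into steps with a print, and store each level's draw in a list) might be combined roughly as follows. This is a sketch, not code from the thread: gene_names is a hypothetical stand-in for rownames(largedf), and the starts/ends helper vectors are introduced here for illustration.

```r
smallvec <- c(8, 12, 9, 6, 13, 20, 16, 11, 8, 5)
largevec <- c(400, 300, 550, 600, 210, 420, 380, 600, 450, 500)

# ASSUMPTION: made-up gene names standing in for rownames(largedf).
gene_names <- paste0("gene", seq_len(sum(largevec)))

# First and last row of each level in the ordered data.
ends   <- cumsum(largevec)
starts <- ends - largevec + 1

generdm <- list()
for (i in seq_along(smallvec)) {
  idx <- starts[i]:ends[i]
  # Print the range so the indexing can be checked by eye.
  cat("level", i, ": rows", starts[i], "to", ends[i], "\n")
  generdm[[i]] <- sample(gene_names[idx], smallvec[i])
}
str(generdm)   # one character vector per level
```

Keeping one list element per level makes it easy to inspect a single level, and unlist(generdm) recovers the flat vector when it is needed.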
Charles C. Berry
2008-Oct-31 01:07 UTC
[R] why does sample(x, n) give the same n items in every separate runs?
Run help.request(). In particular you need to attend to this part:

Have you written example code that is
 - minimal
 - reproducible
 - self-contained
 - commented
using data that is either
 - constructed by the code
 - loaded by data()
 - reproduced using dump("mydata", file = "")
have you checked this code in a fresh R session (invoking R with the --vanilla option if possible) and is this code copied to the clipboard? (y/n)

----

Once you get this far and can honestly type 'y', if you have not found your error, you are ready to post a query.

HTH,
Chuck

Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
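As an illustration of the dump() item in the checklist above, the sketch below writes a small data frame out as R code and reads it back. The data frame contents and the temporary file are invented for the example; with file = "" the same code would be printed to the console instead, ready to paste into a post.

```r
# ASSUMPTION: a tiny made-up data set standing in for the real one.
mydata <- data.frame(gene  = paste0("gene", 1:6),
                     level = c(1, 1, 2, 2, 3, 3))

# Write R code that recreates 'mydata'.
# dump("mydata", file = "") would print that code to the console instead.
tmp <- tempfile(fileext = ".R")
dump("mydata", file = tmp)

# Simulate what a reader of the post would do: start without the object,
# then run the dumped code to rebuild it.
rm(mydata)
source(tmp)
str(mydata)
```

Running the example in a session started with R --vanilla, as the checklist suggests, also rules out effects from a restored workspace or startup files.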
Shuhua Zhan
2008-Oct-31 20:19 UTC
[R] why does sample(x, n) give the same n items in every separate runs?
Hello,

I'd like to thank all respondents for your advice and suggestions. I am sorry for not including the ordered gene expression data frame (largedf), since it is large. The problem is now solved, thanks to Greg Down, by including the command

rm(.Random.seed)

in the code. Perhaps some R packages I used to process the expression data set the seed of the random number generator to a fixed value.

Thanks again,
Joshua
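Why rm(.Random.seed) helps can be seen in a small experiment. R keeps the random number generator state in the variable .Random.seed in the global environment, and that variable is saved and restored together with the workspace. The thread does not confirm where the fixed state came from in this case; a restored workspace is one plausible cause, a package calling set.seed() is another. The sketch below mimics a restored state by hand.

```r
set.seed(42)   # any seed; this creates .Random.seed in the global environment
saved <- get(".Random.seed", envir = globalenv())

a <- sample(100, 5)

# Mimic a session that starts from the same stored generator state,
# for example one restored from a saved workspace.
assign(".Random.seed", saved, envir = globalenv())
b <- sample(100, 5)
identical(a, b)   # TRUE: same state, same draw

# The fix reported in the thread: drop the stored state.
rm(.Random.seed, envir = globalenv())
exists(".Random.seed", envir = globalenv(), inherits = FALSE)   # FALSE

# With no stored state, R creates a fresh seed on the next random draw.
d <- sample(100, 5)
exists(".Random.seed", envir = globalenv(), inherits = FALSE)   # TRUE again
```

Starting R with --vanilla, as suggested earlier in the thread, skips restoring a saved workspace and so is a quick way to test whether a stored .Random.seed is involved.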