Shuhua Zhan
2008-Oct-30 20:44 UTC
[R] why does sample(x, n) give the same n items in every separate runs?
Hello R users,
I have gene expression data for two groups of genes (large and small). The
expression intensities of those genes are classified into levels 1 to 10. What I
want is to draw, using sample(), a random set of genes from the large group that
has the same level distribution as the small group.
I used smallvec to hold the number of genes in each level (1 to 10) for the small
group, and largevec for the large group. I ordered the gene expression data frame
of the large group (largedf) by level and randomly chose genes with the same
levels as the small group. Using the code below I can get a random set of genes
from the large group with the same levels as the small group. But I get the same
set of genes every time I run the code in a separate R session on Linux. This
makes me doubt my result.
smallvec<-c(8,12,9,6,13,20,16,11,8,5) ## the No. of genes in levels 1 to 10
largevec<-c(400,300,550,600,210,420,380,600,450,500)
generdm<-c() ## a random set of genes
for( i in 1:length(smallvec)){
generdm<-c(generdm,sample(rownames(largedf)[sum(largevec[0:(i-1)],1):sum(largevec[0:i])],smallvec[i]))
## rownames(largedf) gives gene names ordered by levels in large group
}
Could you please help me out?
Thanks a lot!!
Josh
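The question above does not include largedf, so the following is only a minimal sketch of the sampling scheme it describes, assuming a toy stand-in for largedf whose rows are already ordered by level. The gene names and the helper vectors first and last are made up for illustration; smallvec and largevec are the values from the post.

```r
# Level sizes taken from the post.
smallvec <- c(8, 12, 9, 6, 13, 20, 16, 11, 8, 5)
largevec <- c(400, 300, 550, 600, 210, 420, 380, 600, 450, 500)

# Hypothetical stand-in for largedf: one row per gene, ordered by level.
largedf <- data.frame(level = rep(1:10, times = largevec),
                      row.names = paste0("gene", seq_len(sum(largevec))))

# First and last row index of each level block, from cumulative sums.
last  <- cumsum(largevec)
first <- last - largevec + 1

generdm <- character(0)
for (i in seq_along(smallvec)) {
  block <- rownames(largedf)[first[i]:last[i]]       # genes in level i
  generdm <- c(generdm, sample(block, smallvec[i]))  # draw without replacement
}

# Each level should contribute exactly smallvec[i] genes.
table(largedf[generdm, "level"])
```

Because the data are built by the code itself, this version can be pasted into a fresh R session, which makes it easier to see whether repeated sessions really return the same genes.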
Dieter Menne
2008-Oct-30 21:12 UTC
[R] why does sample(x, n) give the same n items in every separate runs?
Shuhua Zhan <szhan <at> uoguelph.ca> writes:
>
> smallvec<-c(8,12,9,6,13,20,16,11,8,5)
> largevec<-c(400,300,550,600,210,420,380,600,450,500)
> generdm<-c() ## a random set of genes
> for( i in 1:length(smallvec)){
>
# try to chop down this line and add a print to understand
# what is going on
generdm<-c(generdm, sample(rownames(largedf)[sum(largevec[0:(i-1)],1):sum(largevec[0:i])],smallvec[i]))
> ## rownames(largedf) gives gene names ordered by levels in
> }

Since your code does not run, and you do not show how you got your result, it is
a bit of guesswork. I believe it has nothing to do with sample(), but rather
with the way you "store" (=not store) the results. Your way of storing looks a
bit like PHP. Try a variation of this one

generdm = list()
for (i in 1:10) {
  generdm[[i]] = rnorm(10) # put your sample construct here
}
generdm

Dieter
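The list-based storage suggested in this reply could be combined with sample() roughly as follows. This is a sketch under assumptions, not the poster's actual data: genes, levels_of and want are hypothetical toy inputs, and split() is used here in place of the index arithmetic from the original loop.

```r
# Hypothetical toy data: 120 genes spread over three levels.
levels_of <- rep(1:3, times = c(40, 30, 50))
genes <- paste0("gene", seq_along(levels_of))
want <- c(4, 2, 5)                       # number of genes to draw per level

by_level <- split(genes, levels_of)      # list with one character vector per level
picked <- vector("list", length(want))   # preallocated list for the results
for (i in seq_along(want)) {
  picked[[i]] <- sample(by_level[[i]], want[i])
  print(picked[[i]])                     # inspect each step while debugging
}
generdm <- unlist(picked)                # flatten once, after the loop
```

Keeping one list element per level makes it easy to check each draw separately before flattening the result into a single vector.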
Charles C. Berry
2008-Oct-31 01:07 UTC
[R] why does sample(x, n) give the same n items in every separate runs?
Run help.request().
In particular you need to attend to this part:
Have you written example code that is
- minimal
- reproducible
- self-contained
- commented
using data that is either
- constructed by the code
- loaded by data()
- reproduced using dump("mydata", file = "")
have you checked this code in a fresh R session (invoking R with the
--vanilla option if possible) and is this code copied to the clipboard? (y/n)
----
Once you get this far and can honestly type 'y', if you have not found
your error, you are ready to post a query.
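As an illustration of the "reproduced using dump()" item in the checklist, here is a small sketch, assuming a tiny made-up data frame called mydata in place of the real one. It prints the dumped code and then checks, through a temporary file, that sourcing it rebuilds the same object.

```r
# Hypothetical stand-in for a few rows of the real data frame.
mydata <- data.frame(gene = c("geneA", "geneB", "geneC"),
                     level = c(1L, 1L, 2L),
                     stringsAsFactors = FALSE)

# dump() writes R code that recreates the object; file = "" sends that code
# to the console so it can be pasted into a message.
dump("mydata", file = "")

# Round trip through a temporary file to confirm the dumped code
# rebuilds an equivalent object.
tmp <- tempfile(fileext = ".R")
dump("mydata", file = tmp)
original <- mydata
rm(mydata)
source(tmp)
all.equal(original, mydata)
```

For a large data frame, dumping a small subset such as head(largedf) under a new name is usually enough to make an example self-contained.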
HTH,
Chuck
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
Shuhua Zhan
2008-Oct-31 20:19 UTC
[R] why does sample(x, n) give the same n items in every separate runs?
Hello,
I'd like to thank all respondents for your advice and suggestions. I am
sorry for not including the ordered gene expression data frame (largedf); it is
too large to post.
The problem is now solved, thanks to Greg Down, by including the command
rm(.Random.seed) in the code. Perhaps some R packages I used to process the
expression data set the seed of the random number generator to a fixed
value.
Thanks again,
Joshua
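The effect of .Random.seed described in this message can be sketched as below. This is an illustration under assumptions, not a diagnosis of the original session: the variable names are made up, and the cause suggested in the thread (a package fixing the seed) is only one possibility. The R documentation for set.seed also notes that a seed can be restored from a previously saved workspace, which would give the same symptom.

```r
# A fixed seed makes sample() repeat itself exactly.
set.seed(42)
draw1 <- sample(1:1000, 5)
set.seed(42)
draw2 <- sample(1:1000, 5)
identical(draw1, draw2)   # same seed, same draw

# Removing the stored generator state makes R create a fresh seed
# the next time random numbers are requested.
if (exists(".Random.seed", envir = globalenv())) {
  rm(list = ".Random.seed", envir = globalenv())
}
draw3 <- sample(1:1000, 5)   # drawn from a newly initialised generator

# The state vector is recreated as a side effect of sampling.
exists(".Random.seed", envir = globalenv())
```

Starting R with the --vanilla option, as the help.request() checklist recommends, avoids restoring a saved workspace and is a quick way to test whether a stored seed is involved.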