thr3ads.net - R help - [R] doing 1000 permutations and doing test statistics distribution [Feb 2020]

If this information is useful, please help other people find it:
Share via:

Ana Marija

2020-Feb-04 21:28 UTC

[R] doing 1000 permutations and doing test statistics distribution

I tired your code on this simplified data just for say 10 permutations:

dat <- read.table(text = "   code.1 code.2 code.3 code.4
1     82     93     NA     NA
2     15     85     93     NA
3     93     89     NA     NA
4     81     NA     NA     NA",
                  header = TRUE, stringsAsFactors = FALSE)

dat2=data.matrix(dat)
> result<- lapply(seq_len(10), FUN = function(dat2){+     dat2 <- dat2[, sample.int(4)]
+     print(colnames(dat2))
+ } )
Error in dat2[, sample.int(4)] : incorrect number of dimensions


On Tue, Feb 4, 2020 at 3:10 PM Bert Gunter <bgunter.4567 at gmail.com>
wrote:>
> I am not going to do your programming for you. If the following doesn't
suffice, maybe someone else will provide you something that will.
>
> m = your matrix
>
> code = your code that uses m
>
> your list of results <- lapply(seq_len(1000), FUN = function(m){
>   m <- m[, sample.int(132)]
>  code
> } )
>
> or use an explicit for() loop to populate a list or vector with your
results.
>
> Again, if I have misunderstood what you want to do, then clarify, and
perhaps someone else will help.
>
> -- Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip
)
>
>
> On Tue, Feb 4, 2020 at 12:41 PM Ana Marija <sokovic.anamarija at
gmail.com> wrote:
>>
>> Hi Bert,
>>
>> thanks for getting back to me. I have to permute those 132 columns
>> 1000 times and perform the code given in the previous email.
>>
>> Can you please show me how you would do that in the loop? This is also
>> a huge data set ...
>>
>> Thanks
>> Ana
>>
>> On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 at
gmail.com> wrote:
>> >
>> > If you just want to permute columns of a matrix,
>> >
>> > ?sample
>> > > sample.int(10)
>> >  [1]  9  2 10  8  4  6  3  1  5  7
>> >
>> > and you can just use this as an index into the columns of your
matrix, presumably within a loop of some sort.
>> >
>> > If I have misunderstood, just ignore.
>> >
>> > Cheers,
>> > Bert
>> >
>> >
>> >
>> >
>> > On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <sokovic.anamarija
at gmail.com> wrote:
>> >>
>> >> Hello,
>> >>
>> >> I have a matrix
>> >> > dim(dat)
>> >> [1] 15568   132
>> >>
>> >> It looks like this:
>> >>
>> >>                    NoD_14381_norm.1 NoD_14381_norm.2
NoD_14381_norm.3
>> >> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3
>> >> Ku8QhfS0n_hIOABXuE             4.75             4.25          
4.79
>> >>            4.33             4.63             3.85
>> >> Bx496XsFXiAlj.Eaeo             6.15             6.23          
6.55
>> >>            6.26             6.24             5.99
>> >> W38p0ogk.wIBVRXllY             7.13             7.35          
7.55
>> >>            7.37             7.36             7.55
>> >> QIBkqIS9LR5DfTlTS8             6.27             6.73          
6.45
>> >>            5.39             4.75             4.96
>> >> BZKiEvS0eQ305U0v34             6.35             7.02          
6.76
>> >>            5.45             5.25             5.02
>> >> 6TheVd.HiE1UF3lX6g             5.53             5.02          
5.36
>> >>            5.61             5.66             5.37
>> >>
>> >> So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and
subjects
>> >> named ex. NoD_14381_norm.1
>> >>
>> >>
>> >> How to do 1000 permutations of these 132 columns and on each
created
>> >> new permuted matrix perform this code:
>> >>
>> >> subject="all_replicate"
>> >>
targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt",
sep=''))
>> >> Treat <-
factor(targets$Treatment,levels=c("C","T"))
>> >> Replicates <- factor(targets$rep)
>> >> design <- model.matrix(~Replicates+Treat)
>> >> corfit <- duplicateCorrelation(dat, block =
targets$Subject)
>> >> corfit$consensus.correlation
>> >> fit
<-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation)
>> >> fit<-eBayes(fit)
>> >> qval.cutoff=0.1; FC.cutoff=0.17
>> >> y1=topTable(fit, coef="TreatT",
n=nrow(genes),adjust.method="BH",genelist=genes)
>> >>
>> >> y1 for each iteration of permutation would  have P.Value
column and
>> >> these I would have plotted on the end to find the distribution
of all
>> >> p values generated in those 1000 permutations.
>> >>
>> >> Please advise,
>> >> Ana
>> >>
>> >> ______________________________________________
>> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and
more, see
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible
code.

Duncan Murdoch

2020-Feb-04 21:41 UTC

head link

[R] doing 1000 permutations and doing test statistics distribution

On 04/02/2020 4:28 p.m., Ana Marija wrote:> I tired your code on this simplified data just for say 10 permutations:
> 
> dat <- read.table(text = "   code.1 code.2 code.3 code.4
> 1     82     93     NA     NA
> 2     15     85     93     NA
> 3     93     89     NA     NA
> 4     81     NA     NA     NA",
>                    header = TRUE, stringsAsFactors = FALSE)
> 
> dat2=data.matrix(dat)
> 
>> result<- lapply(seq_len(10), FUN = function(dat2){
> +     dat2 <- dat2[, sample.int(4)]
> +     print(colnames(dat2))
> + } )
> Error in dat2[, sample.int(4)] : incorrect number of dimensions
Yes, Bert did suggest that, but it's obviously wrong.  The argument to 
FUN is an element of seq_len(10), it's not the full dataset.  Try

result<- lapply(seq_len(10), FUN = function(i){
      dat <- dat2[, sample.int(4)]
      print(colnames(dat))
  } )

Duncan Murdoch
> 
> 
> On Tue, Feb 4, 2020 at 3:10 PM Bert Gunter <bgunter.4567 at
gmail.com> wrote:
>>
>> I am not going to do your programming for you. If the following
doesn't suffice, maybe someone else will provide you something that will.
>>
>> m = your matrix
>>
>> code = your code that uses m
>>
>> your list of results <- lapply(seq_len(1000), FUN = function(m){
>>    m <- m[, sample.int(132)]
>>   code
>> } )
>>
>> or use an explicit for() loop to populate a list or vector with your
results.
>>
>> Again, if I have misunderstood what you want to do, then clarify, and
perhaps someone else will help.
>>
>> -- Bert
>>
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming
along and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic
strip )
>>
>>
>> On Tue, Feb 4, 2020 at 12:41 PM Ana Marija <sokovic.anamarija at
gmail.com> wrote:
>>>
>>> Hi Bert,
>>>
>>> thanks for getting back to me. I have to permute those 132 columns
>>> 1000 times and perform the code given in the previous email.
>>>
>>> Can you please show me how you would do that in the loop? This is
also
>>> a huge data set ...
>>>
>>> Thanks
>>> Ana
>>>
>>> On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 at
gmail.com> wrote:
>>>>
>>>> If you just want to permute columns of a matrix,
>>>>
>>>> ?sample
>>>>> sample.int(10)
>>>>   [1]  9  2 10  8  4  6  3  1  5  7
>>>>
>>>> and you can just use this as an index into the columns of your
matrix, presumably within a loop of some sort.
>>>>
>>>> If I have misunderstood, just ignore.
>>>>
>>>> Cheers,
>>>> Bert
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Feb 4, 2020 at 12:23 PM Ana Marija
<sokovic.anamarija at gmail.com> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have a matrix
>>>>>> dim(dat)
>>>>> [1] 15568   132
>>>>>
>>>>> It looks like this:
>>>>>
>>>>>                     NoD_14381_norm.1 NoD_14381_norm.2
NoD_14381_norm.3
>>>>> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3
>>>>> Ku8QhfS0n_hIOABXuE             4.75             4.25       
4.79
>>>>>             4.33             4.63             3.85
>>>>> Bx496XsFXiAlj.Eaeo             6.15             6.23       
6.55
>>>>>             6.26             6.24             5.99
>>>>> W38p0ogk.wIBVRXllY             7.13             7.35       
7.55
>>>>>             7.37             7.36             7.55
>>>>> QIBkqIS9LR5DfTlTS8             6.27             6.73       
6.45
>>>>>             5.39             4.75             4.96
>>>>> BZKiEvS0eQ305U0v34             6.35             7.02       
6.76
>>>>>             5.45             5.25             5.02
>>>>> 6TheVd.HiE1UF3lX6g             5.53             5.02       
5.36
>>>>>             5.61             5.66             5.37
>>>>>
>>>>> So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE,
and subjects
>>>>> named ex. NoD_14381_norm.1
>>>>>
>>>>>
>>>>> How to do 1000 permutations of these 132 columns and on
each created
>>>>> new permuted matrix perform this code:
>>>>>
>>>>> subject="all_replicate"
>>>>>
targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt",
sep=''))
>>>>> Treat <-
factor(targets$Treatment,levels=c("C","T"))
>>>>> Replicates <- factor(targets$rep)
>>>>> design <- model.matrix(~Replicates+Treat)
>>>>> corfit <- duplicateCorrelation(dat, block =
targets$Subject)
>>>>> corfit$consensus.correlation
>>>>> fit
<-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation)
>>>>> fit<-eBayes(fit)
>>>>> qval.cutoff=0.1; FC.cutoff=0.17
>>>>> y1=topTable(fit, coef="TreatT",
n=nrow(genes),adjust.method="BH",genelist=genes)
>>>>>
>>>>> y1 for each iteration of permutation would  have P.Value
column and
>>>>> these I would have plotted on the end to find the
distribution of all
>>>>> p values generated in those 1000 permutations.
>>>>>
>>>>> Please advise,
>>>>> Ana
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and
more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained,
reproducible code.
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Bert Gunter

2020-Feb-04 22:45 UTC

head link

[R] doing 1000 permutations and doing test statistics distribution

Yes, a clear thinko... Thanks for the correction.

-- Bert

On Tue, Feb 4, 2020 at 1:41 PM Duncan Murdoch <murdoch.duncan at
gmail.com>
wrote:
> On 04/02/2020 4:28 p.m., Ana Marija wrote:
> > I tired your code on this simplified data just for say 10
permutations:
> >
> > dat <- read.table(text = "   code.1 code.2 code.3 code.4
> > 1     82     93     NA     NA
> > 2     15     85     93     NA
> > 3     93     89     NA     NA
> > 4     81     NA     NA     NA",
> >                    header = TRUE, stringsAsFactors = FALSE)
> >
> > dat2=data.matrix(dat)
> >
> >> result<- lapply(seq_len(10), FUN = function(dat2){
> > +     dat2 <- dat2[, sample.int(4)]
> > +     print(colnames(dat2))
> > + } )
> > Error in dat2[, sample.int(4)] : incorrect number of dimensions
>
> Yes, Bert did suggest that, but it's obviously wrong.  The argument to
> FUN is an element of seq_len(10), it's not the full dataset.  Try
>
> result<- lapply(seq_len(10), FUN = function(i){
>       dat <- dat2[, sample.int(4)]
>       print(colnames(dat))
>   } )
>
> Duncan Murdoch
>
> >
> >
> > On Tue, Feb 4, 2020 at 3:10 PM Bert Gunter <bgunter.4567 at
gmail.com>
> wrote:
> >>
> >> I am not going to do your programming for you. If the following
doesn't
> suffice, maybe someone else will provide you something that will.
> >>
> >> m = your matrix
> >>
> >> code = your code that uses m
> >>
> >> your list of results <- lapply(seq_len(1000), FUN =
function(m){
> >>    m <- m[, sample.int(132)]
> >>   code
> >> } )
> >>
> >> or use an explicit for() loop to populate a list or vector with
your
> results.
> >>
> >> Again, if I have misunderstood what you want to do, then clarify,
and
> perhaps someone else will help.
> >>
> >> -- Bert
> >>
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep
coming along
> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County"
comic strip )
> >>
> >>
> >> On Tue, Feb 4, 2020 at 12:41 PM Ana Marija <sokovic.anamarija
at gmail.com>
> wrote:
> >>>
> >>> Hi Bert,
> >>>
> >>> thanks for getting back to me. I have to permute those 132
columns
> >>> 1000 times and perform the code given in the previous email.
> >>>
> >>> Can you please show me how you would do that in the loop? This
is also
> >>> a huge data set ...
> >>>
> >>> Thanks
> >>> Ana
> >>>
> >>> On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 at
gmail.com>
> wrote:
> >>>>
> >>>> If you just want to permute columns of a matrix,
> >>>>
> >>>> ?sample
> >>>>> sample.int(10)
> >>>>   [1]  9  2 10  8  4  6  3  1  5  7
> >>>>
> >>>> and you can just use this as an index into the columns of
your
> matrix, presumably within a loop of some sort.
> >>>>
> >>>> If I have misunderstood, just ignore.
> >>>>
> >>>> Cheers,
> >>>> Bert
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <
> sokovic.anamarija at gmail.com> wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> I have a matrix
> >>>>>> dim(dat)
> >>>>> [1] 15568   132
> >>>>>
> >>>>> It looks like this:
> >>>>>
> >>>>>                     NoD_14381_norm.1 NoD_14381_norm.2
> NoD_14381_norm.3
> >>>>> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3
> >>>>> Ku8QhfS0n_hIOABXuE             4.75             4.25  
4.79
> >>>>>             4.33             4.63             3.85
> >>>>> Bx496XsFXiAlj.Eaeo             6.15             6.23  
6.55
> >>>>>             6.26             6.24             5.99
> >>>>> W38p0ogk.wIBVRXllY             7.13             7.35  
7.55
> >>>>>             7.37             7.36             7.55
> >>>>> QIBkqIS9LR5DfTlTS8             6.27             6.73  
6.45
> >>>>>             5.39             4.75             4.96
> >>>>> BZKiEvS0eQ305U0v34             6.35             7.02  
6.76
> >>>>>             5.45             5.25             5.02
> >>>>> 6TheVd.HiE1UF3lX6g             5.53             5.02  
5.36
> >>>>>             5.61             5.66             5.37
> >>>>>
> >>>>> So it is a matrix with gene names ex.
Ku8QhfS0n_hIOABXuE, and
> subjects
> >>>>> named ex. NoD_14381_norm.1
> >>>>>
> >>>>>
> >>>>> How to do 1000 permutations of these 132 columns and
on each created
> >>>>> new permuted matrix perform this code:
> >>>>>
> >>>>> subject="all_replicate"
> >>>>>
>
targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt",
> sep=''))
> >>>>> Treat <-
factor(targets$Treatment,levels=c("C","T"))
> >>>>> Replicates <- factor(targets$rep)
> >>>>> design <- model.matrix(~Replicates+Treat)
> >>>>> corfit <- duplicateCorrelation(dat, block =
targets$Subject)
> >>>>> corfit$consensus.correlation
> >>>>> fit
>
<-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation)
> >>>>> fit<-eBayes(fit)
> >>>>> qval.cutoff=0.1; FC.cutoff=0.17
> >>>>> y1=topTable(fit, coef="TreatT",
> n=nrow(genes),adjust.method="BH",genelist=genes)
> >>>>>
> >>>>> y1 for each iteration of permutation would  have
P.Value column and
> >>>>> these I would have plotted on the end to find the
distribution of all
> >>>>> p values generated in those 1000 permutations.
> >>>>>
> >>>>> Please advise,
> >>>>> Ana
> >>>>>
> >>>>> ______________________________________________
> >>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE
and more, see
> >>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >>>>> and provide commented, minimal, self-contained,
reproducible code.
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
	[[alternative HTML version deleted]]

R help - Feb 2020 - doing 1000 permutations and doing test statistics distribution

[R] doing 1000 permutations and doing test statistics distribution

[R] doing 1000 permutations and doing test statistics distribution

[R] doing 1000 permutations and doing test statistics distribution