thr3ads.net - R help - [R] doing 1000 permutations and doing test statistics distribution [Feb 2020]

If this information is useful, please help other people find it:
Share via:

Ana Marija

2020-Feb-04 20:46 UTC

[R] doing 1000 permutations and doing test statistics distribution

Hi Bert,

thanks for getting back to me. I have to permute those 132 columns
1000 times and perform the code given in the previous email.

Can you please show me how you would do that in the loop? This is also
a huge data set ...

Thanks
Ana

On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 at gmail.com>
wrote:>
> If you just want to permute columns of a matrix,
>
> ?sample
> > sample.int(10)
>  [1]  9  2 10  8  4  6  3  1  5  7
>
> and you can just use this as an index into the columns of your matrix,
presumably within a loop of some sort.
>
> If I have misunderstood, just ignore.
>
> Cheers,
> Bert
>
>
>
>
> On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <sokovic.anamarija at
gmail.com> wrote:
>>
>> Hello,
>>
>> I have a matrix
>> > dim(dat)
>> [1] 15568   132
>>
>> It looks like this:
>>
>>                    NoD_14381_norm.1 NoD_14381_norm.2 NoD_14381_norm.3
>> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3
>> Ku8QhfS0n_hIOABXuE             4.75             4.25             4.79
>>            4.33             4.63             3.85
>> Bx496XsFXiAlj.Eaeo             6.15             6.23             6.55
>>            6.26             6.24             5.99
>> W38p0ogk.wIBVRXllY             7.13             7.35             7.55
>>            7.37             7.36             7.55
>> QIBkqIS9LR5DfTlTS8             6.27             6.73             6.45
>>            5.39             4.75             4.96
>> BZKiEvS0eQ305U0v34             6.35             7.02             6.76
>>            5.45             5.25             5.02
>> 6TheVd.HiE1UF3lX6g             5.53             5.02             5.36
>>            5.61             5.66             5.37
>>
>> So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and subjects
>> named ex. NoD_14381_norm.1
>>
>>
>> How to do 1000 permutations of these 132 columns and on each created
>> new permuted matrix perform this code:
>>
>> subject="all_replicate"
>>
targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt",
sep=''))
>> Treat <-
factor(targets$Treatment,levels=c("C","T"))
>> Replicates <- factor(targets$rep)
>> design <- model.matrix(~Replicates+Treat)
>> corfit <- duplicateCorrelation(dat, block = targets$Subject)
>> corfit$consensus.correlation
>> fit
<-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation)
>> fit<-eBayes(fit)
>> qval.cutoff=0.1; FC.cutoff=0.17
>> y1=topTable(fit, coef="TreatT",
n=nrow(genes),adjust.method="BH",genelist=genes)
>>
>> y1 for each iteration of permutation would  have P.Value column and
>> these I would have plotted on the end to find the distribution of all
>> p values generated in those 1000 permutations.
>>
>> Please advise,
>> Ana
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

Ana Marija

2020-Feb-04 20:46 UTC

head link

[R] doing 1000 permutations and doing test statistics distribution

Basically I would just reshuffle column names in each of 1000 permutations
how to do that and perform everything I described in my initial email

On Tue, 4 Feb 2020 at 14:46, Ana Marija <sokovic.anamarija at gmail.com>
wrote:
> Hi Bert,
>
> thanks for getting back to me. I have to permute those 132 columns
> 1000 times and perform the code given in the previous email.
>
> Can you please show me how you would do that in the loop? This is also
> a huge data set ...
>
> Thanks
> Ana
>
> On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 at
gmail.com> wrote:
> >
> > If you just want to permute columns of a matrix,
> >
> > ?sample
> > > sample.int(10)
> >  [1]  9  2 10  8  4  6  3  1  5  7
> >
> > and you can just use this as an index into the columns of your matrix,
> presumably within a loop of some sort.
> >
> > If I have misunderstood, just ignore.
> >
> > Cheers,
> > Bert
> >
> >
> >
> >
> > On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <sokovic.anamarija at
gmail.com>
> wrote:
> >>
> >> Hello,
> >>
> >> I have a matrix
> >> > dim(dat)
> >> [1] 15568   132
> >>
> >> It looks like this:
> >>
> >>                    NoD_14381_norm.1 NoD_14381_norm.2
NoD_14381_norm.3
> >> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3
> >> Ku8QhfS0n_hIOABXuE             4.75             4.25            
4.79
> >>            4.33             4.63             3.85
> >> Bx496XsFXiAlj.Eaeo             6.15             6.23            
6.55
> >>            6.26             6.24             5.99
> >> W38p0ogk.wIBVRXllY             7.13             7.35            
7.55
> >>            7.37             7.36             7.55
> >> QIBkqIS9LR5DfTlTS8             6.27             6.73            
6.45
> >>            5.39             4.75             4.96
> >> BZKiEvS0eQ305U0v34             6.35             7.02            
6.76
> >>            5.45             5.25             5.02
> >> 6TheVd.HiE1UF3lX6g             5.53             5.02            
5.36
> >>            5.61             5.66             5.37
> >>
> >> So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and
subjects
> >> named ex. NoD_14381_norm.1
> >>
> >>
> >> How to do 1000 permutations of these 132 columns and on each
created
> >> new permuted matrix perform this code:
> >>
> >> subject="all_replicate"
> >>
targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt",
> sep=''))
> >> Treat <-
factor(targets$Treatment,levels=c("C","T"))
> >> Replicates <- factor(targets$rep)
> >> design <- model.matrix(~Replicates+Treat)
> >> corfit <- duplicateCorrelation(dat, block = targets$Subject)
> >> corfit$consensus.correlation
> >> fit
>
<-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation)
> >> fit<-eBayes(fit)
> >> qval.cutoff=0.1; FC.cutoff=0.17
> >> y1=topTable(fit, coef="TreatT",
> n=nrow(genes),adjust.method="BH",genelist=genes)
> >>
> >> y1 for each iteration of permutation would  have P.Value column
and
> >> these I would have plotted on the end to find the distribution of
all
> >> p values generated in those 1000 permutations.
> >>
> >> Please advise,
> >> Ana
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

Bert Gunter

2020-Feb-04 21:10 UTC

head link

[R] doing 1000 permutations and doing test statistics distribution

I am not going to do your programming for you. If the following doesn't
suffice, maybe someone else will provide you something that will.

m = your matrix

code = your code that uses m

your list of results <- lapply(seq_len(1000), FUN = function(m){
  m <- m[, sample.int(132)]
 code
} )

or use an explicit for() loop to populate a list or vector with your
results.

Again, if I have misunderstood what you want to do, then clarify, and
perhaps someone else will help.

-- Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Feb 4, 2020 at 12:41 PM Ana Marija <sokovic.anamarija at
gmail.com>
wrote:
> Hi Bert,
>
> thanks for getting back to me. I have to permute those 132 columns
> 1000 times and perform the code given in the previous email.
>
> Can you please show me how you would do that in the loop? This is also
> a huge data set ...
>
> Thanks
> Ana
>
> On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 at
gmail.com> wrote:
> >
> > If you just want to permute columns of a matrix,
> >
> > ?sample
> > > sample.int(10)
> >  [1]  9  2 10  8  4  6  3  1  5  7
> >
> > and you can just use this as an index into the columns of your matrix,
> presumably within a loop of some sort.
> >
> > If I have misunderstood, just ignore.
> >
> > Cheers,
> > Bert
> >
> >
> >
> >
> > On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <sokovic.anamarija at
gmail.com>
> wrote:
> >>
> >> Hello,
> >>
> >> I have a matrix
> >> > dim(dat)
> >> [1] 15568   132
> >>
> >> It looks like this:
> >>
> >>                    NoD_14381_norm.1 NoD_14381_norm.2
NoD_14381_norm.3
> >> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3
> >> Ku8QhfS0n_hIOABXuE             4.75             4.25            
4.79
> >>            4.33             4.63             3.85
> >> Bx496XsFXiAlj.Eaeo             6.15             6.23            
6.55
> >>            6.26             6.24             5.99
> >> W38p0ogk.wIBVRXllY             7.13             7.35            
7.55
> >>            7.37             7.36             7.55
> >> QIBkqIS9LR5DfTlTS8             6.27             6.73            
6.45
> >>            5.39             4.75             4.96
> >> BZKiEvS0eQ305U0v34             6.35             7.02            
6.76
> >>            5.45             5.25             5.02
> >> 6TheVd.HiE1UF3lX6g             5.53             5.02            
5.36
> >>            5.61             5.66             5.37
> >>
> >> So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and
subjects
> >> named ex. NoD_14381_norm.1
> >>
> >>
> >> How to do 1000 permutations of these 132 columns and on each
created
> >> new permuted matrix perform this code:
> >>
> >> subject="all_replicate"
> >>
targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt",
> sep=''))
> >> Treat <-
factor(targets$Treatment,levels=c("C","T"))
> >> Replicates <- factor(targets$rep)
> >> design <- model.matrix(~Replicates+Treat)
> >> corfit <- duplicateCorrelation(dat, block = targets$Subject)
> >> corfit$consensus.correlation
> >> fit
>
<-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation)
> >> fit<-eBayes(fit)
> >> qval.cutoff=0.1; FC.cutoff=0.17
> >> y1=topTable(fit, coef="TreatT",
> n=nrow(genes),adjust.method="BH",genelist=genes)
> >>
> >> y1 for each iteration of permutation would  have P.Value column
and
> >> these I would have plotted on the end to find the distribution of
all
> >> p values generated in those 1000 permutations.
> >>
> >> Please advise,
> >> Ana
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

Ana Marija

2020-Feb-04 21:28 UTC

head link

[R] doing 1000 permutations and doing test statistics distribution

I tired your code on this simplified data just for say 10 permutations:

dat <- read.table(text = "   code.1 code.2 code.3 code.4
1     82     93     NA     NA
2     15     85     93     NA
3     93     89     NA     NA
4     81     NA     NA     NA",
                  header = TRUE, stringsAsFactors = FALSE)

dat2=data.matrix(dat)
> result<- lapply(seq_len(10), FUN = function(dat2){+     dat2 <- dat2[, sample.int(4)]
+     print(colnames(dat2))
+ } )
Error in dat2[, sample.int(4)] : incorrect number of dimensions


On Tue, Feb 4, 2020 at 3:10 PM Bert Gunter <bgunter.4567 at gmail.com>
wrote:>
> I am not going to do your programming for you. If the following doesn't
suffice, maybe someone else will provide you something that will.
>
> m = your matrix
>
> code = your code that uses m
>
> your list of results <- lapply(seq_len(1000), FUN = function(m){
>   m <- m[, sample.int(132)]
>  code
> } )
>
> or use an explicit for() loop to populate a list or vector with your
results.
>
> Again, if I have misunderstood what you want to do, then clarify, and
perhaps someone else will help.
>
> -- Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip
)
>
>
> On Tue, Feb 4, 2020 at 12:41 PM Ana Marija <sokovic.anamarija at
gmail.com> wrote:
>>
>> Hi Bert,
>>
>> thanks for getting back to me. I have to permute those 132 columns
>> 1000 times and perform the code given in the previous email.
>>
>> Can you please show me how you would do that in the loop? This is also
>> a huge data set ...
>>
>> Thanks
>> Ana
>>
>> On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 at
gmail.com> wrote:
>> >
>> > If you just want to permute columns of a matrix,
>> >
>> > ?sample
>> > > sample.int(10)
>> >  [1]  9  2 10  8  4  6  3  1  5  7
>> >
>> > and you can just use this as an index into the columns of your
matrix, presumably within a loop of some sort.
>> >
>> > If I have misunderstood, just ignore.
>> >
>> > Cheers,
>> > Bert
>> >
>> >
>> >
>> >
>> > On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <sokovic.anamarija
at gmail.com> wrote:
>> >>
>> >> Hello,
>> >>
>> >> I have a matrix
>> >> > dim(dat)
>> >> [1] 15568   132
>> >>
>> >> It looks like this:
>> >>
>> >>                    NoD_14381_norm.1 NoD_14381_norm.2
NoD_14381_norm.3
>> >> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3
>> >> Ku8QhfS0n_hIOABXuE             4.75             4.25          
4.79
>> >>            4.33             4.63             3.85
>> >> Bx496XsFXiAlj.Eaeo             6.15             6.23          
6.55
>> >>            6.26             6.24             5.99
>> >> W38p0ogk.wIBVRXllY             7.13             7.35          
7.55
>> >>            7.37             7.36             7.55
>> >> QIBkqIS9LR5DfTlTS8             6.27             6.73          
6.45
>> >>            5.39             4.75             4.96
>> >> BZKiEvS0eQ305U0v34             6.35             7.02          
6.76
>> >>            5.45             5.25             5.02
>> >> 6TheVd.HiE1UF3lX6g             5.53             5.02          
5.36
>> >>            5.61             5.66             5.37
>> >>
>> >> So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and
subjects
>> >> named ex. NoD_14381_norm.1
>> >>
>> >>
>> >> How to do 1000 permutations of these 132 columns and on each
created
>> >> new permuted matrix perform this code:
>> >>
>> >> subject="all_replicate"
>> >>
targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt",
sep=''))
>> >> Treat <-
factor(targets$Treatment,levels=c("C","T"))
>> >> Replicates <- factor(targets$rep)
>> >> design <- model.matrix(~Replicates+Treat)
>> >> corfit <- duplicateCorrelation(dat, block =
targets$Subject)
>> >> corfit$consensus.correlation
>> >> fit
<-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation)
>> >> fit<-eBayes(fit)
>> >> qval.cutoff=0.1; FC.cutoff=0.17
>> >> y1=topTable(fit, coef="TreatT",
n=nrow(genes),adjust.method="BH",genelist=genes)
>> >>
>> >> y1 for each iteration of permutation would  have P.Value
column and
>> >> these I would have plotted on the end to find the distribution
of all
>> >> p values generated in those 1000 permutations.
>> >>
>> >> Please advise,
>> >> Ana
>> >>
>> >> ______________________________________________
>> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and
more, see
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible
code.

R help - Feb 2020 - doing 1000 permutations and doing test statistics distribution

[R] doing 1000 permutations and doing test statistics distribution

[R] doing 1000 permutations and doing test statistics distribution

[R] doing 1000 permutations and doing test statistics distribution

[R] doing 1000 permutations and doing test statistics distribution