Ana Marija
2020-Feb-04 20:28 UTC
[R] doing 1000 permutations and doing test statistics distribution
Hello, I have a matrix> dim(dat)[1] 15568 132 It looks like this: NoD_14381_norm.1 NoD_14381_norm.2 NoD_14381_norm.3 NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3 Ku8QhfS0n_hIOABXuE 4.75 4.25 4.79 4.33 4.63 3.85 Bx496XsFXiAlj.Eaeo 6.15 6.23 6.55 6.26 6.24 5.99 W38p0ogk.wIBVRXllY 7.13 7.35 7.55 7.37 7.36 7.55 QIBkqIS9LR5DfTlTS8 6.27 6.73 6.45 5.39 4.75 4.96 BZKiEvS0eQ305U0v34 6.35 7.02 6.76 5.45 5.25 5.02 6TheVd.HiE1UF3lX6g 5.53 5.02 5.36 5.61 5.66 5.37 So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and subjects named ex. NoD_14381_norm.1 How to do 1000 permutations of these 132 columns and on each created new permuted matrix perform this code: subject="all_replicate" targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt", sep='')) Treat <- factor(targets$Treatment,levels=c("C","T")) Replicates <- factor(targets$rep) design <- model.matrix(~Replicates+Treat) corfit <- duplicateCorrelation(dat, block = targets$Subject) corfit$consensus.correlation fit <-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation) fit<-eBayes(fit) qval.cutoff=0.1; FC.cutoff=0.17 y1=topTable(fit, coef="TreatT", n=nrow(genes),adjust.method="BH",genelist=genes) y1 for each iteration of permutation would have P.Value column and these I would have plotted on the end to find the distribution of all p values generated in those 1000 permutations. Please advise, Ana
Bert Gunter
2020-Feb-04 20:34 UTC
[R] doing 1000 permutations and doing test statistics distribution
If you just want to permute columns of a matrix, ?sample> sample.int(10)[1] 9 2 10 8 4 6 3 1 5 7 and you can just use this as an index into the columns of your matrix, presumably within a loop of some sort. If I have misunderstood, just ignore. Cheers, Bert On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:> Hello, > > I have a matrix > > dim(dat) > [1] 15568 132 > > It looks like this: > > NoD_14381_norm.1 NoD_14381_norm.2 NoD_14381_norm.3 > NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3 > Ku8QhfS0n_hIOABXuE 4.75 4.25 4.79 > 4.33 4.63 3.85 > Bx496XsFXiAlj.Eaeo 6.15 6.23 6.55 > 6.26 6.24 5.99 > W38p0ogk.wIBVRXllY 7.13 7.35 7.55 > 7.37 7.36 7.55 > QIBkqIS9LR5DfTlTS8 6.27 6.73 6.45 > 5.39 4.75 4.96 > BZKiEvS0eQ305U0v34 6.35 7.02 6.76 > 5.45 5.25 5.02 > 6TheVd.HiE1UF3lX6g 5.53 5.02 5.36 > 5.61 5.66 5.37 > > So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and subjects > named ex. NoD_14381_norm.1 > > > How to do 1000 permutations of these 132 columns and on each created > new permuted matrix perform this code: > > subject="all_replicate" > targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt", > sep='')) > Treat <- factor(targets$Treatment,levels=c("C","T")) > Replicates <- factor(targets$rep) > design <- model.matrix(~Replicates+Treat) > corfit <- duplicateCorrelation(dat, block = targets$Subject) > corfit$consensus.correlation > fit > <-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation) > fit<-eBayes(fit) > qval.cutoff=0.1; FC.cutoff=0.17 > y1=topTable(fit, coef="TreatT", > n=nrow(genes),adjust.method="BH",genelist=genes) > > y1 for each iteration of permutation would have P.Value column and > these I would have plotted on the end to find the distribution of all > p values generated in those 1000 permutations. > > Please advise, > Ana > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
Ana Marija
2020-Feb-04 20:46 UTC
[R] doing 1000 permutations and doing test statistics distribution
Hi Bert, thanks for getting back to me. I have to permute those 132 columns 1000 times and perform the code given in the previous email. Can you please show me how you would do that in the loop? This is also a huge data set ... Thanks Ana On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 at gmail.com> wrote:> > If you just want to permute columns of a matrix, > > ?sample > > sample.int(10) > [1] 9 2 10 8 4 6 3 1 5 7 > > and you can just use this as an index into the columns of your matrix, presumably within a loop of some sort. > > If I have misunderstood, just ignore. > > Cheers, > Bert > > > > > On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <sokovic.anamarija at gmail.com> wrote: >> >> Hello, >> >> I have a matrix >> > dim(dat) >> [1] 15568 132 >> >> It looks like this: >> >> NoD_14381_norm.1 NoD_14381_norm.2 NoD_14381_norm.3 >> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3 >> Ku8QhfS0n_hIOABXuE 4.75 4.25 4.79 >> 4.33 4.63 3.85 >> Bx496XsFXiAlj.Eaeo 6.15 6.23 6.55 >> 6.26 6.24 5.99 >> W38p0ogk.wIBVRXllY 7.13 7.35 7.55 >> 7.37 7.36 7.55 >> QIBkqIS9LR5DfTlTS8 6.27 6.73 6.45 >> 5.39 4.75 4.96 >> BZKiEvS0eQ305U0v34 6.35 7.02 6.76 >> 5.45 5.25 5.02 >> 6TheVd.HiE1UF3lX6g 5.53 5.02 5.36 >> 5.61 5.66 5.37 >> >> So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and subjects >> named ex. NoD_14381_norm.1 >> >> >> How to do 1000 permutations of these 132 columns and on each created >> new permuted matrix perform this code: >> >> subject="all_replicate" >> targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt", sep='')) >> Treat <- factor(targets$Treatment,levels=c("C","T")) >> Replicates <- factor(targets$rep) >> design <- model.matrix(~Replicates+Treat) >> corfit <- duplicateCorrelation(dat, block = targets$Subject) >> corfit$consensus.correlation >> fit <-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation) >> fit<-eBayes(fit) >> qval.cutoff=0.1; FC.cutoff=0.17 >> y1=topTable(fit, coef="TreatT", n=nrow(genes),adjust.method="BH",genelist=genes) >> >> y1 for each iteration of permutation would have P.Value column and >> these I would have plotted on the end to find the distribution of all >> p values generated in those 1000 permutations. >> >> Please advise, >> Ana >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code.