The reason I selected only those with P<0.003 for the QQ plot is that
the original data set contains 5556249 points, and when I extract only
P<0.001 I get 3713 points. Is there a way to plot the whole data set,
or to choose only representative points?

On Tue, Nov 12, 2019 at 3:42 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:
>
> The smallest p-value in my dataset goes down to 9.89e-08. How do I make
> that visible on the new QQ plot, where the values are multiplied by 1000?
>
> On Tue, Nov 12, 2019 at 3:37 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:
> >
> > Do I just need to change the axis when I multiply by 1000, and what
> > should I put on my axis?
> >
> > On Tue, Nov 12, 2019 at 3:07 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:
> > >
> > > Hi Duncan,
> > >
> > > yes, I chose only P<1e-3 for the QQ plot, and multiplying everything
> > > by 1000 works great!
> > > In my understanding this should not influence the interpretation of
> > > the plot; it only changes the scale of the axis.
> > >
> > > Thank you so much,
> > > Ana
> > >
> > > On Tue, Nov 12, 2019 at 2:51 PM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> > > >
> > > > On 12/11/2019 2:56 p.m., Jim Lemon wrote:
> > > > > I thought about this and did a little study of GWAS and the use of
> > > > > p-values to assess significant associations. As Ana's plot begins at
> > > > > values of about 0.001, this seems to imply that almost everything in
> > > > > the genome is associated to some degree. One expects that most SNPs
> > > > > will not be associated with a particular condition (p~1), so perhaps
> > > > > something is going wrong in the calculations that produce the
> > > > > p-values.
> > > >
> > > > I may be misunderstanding your last sentence, but if there is no
> > > > association, the p-value would usually have a uniform distribution
> > > > from 0 to 1; it wouldn't be near 1.
> > > >
> > > > I'd guess we're not seeing the p-values from every test, only those
> > > > that are less than 0.001. If that's true, and there are no effects, it
> > > > makes sense to multiply all of them by 1000 to get U(0,1) values. On
> > > > the plot, that would correspond to subtracting 3 from -log10(p), or
> > > > adding 3 to the reference line, as Ana requested.
> > > >
> > > > Or just multiply them by 1000 and pass them to qq():
> > > >
> > > >     qq(dd$P*1000, main = "Q-Q plot of small GWAS p-values")
> > > >
> > > > As far as I can see, there's no way to tell qqman::qq to move the
> > > > reference line.
> > > >
> > > > Duncan Murdoch
> > > >
> > > > > Jim
> > > > >
> > > > > On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
> > > > > <malone at malonequantitative.com> wrote:
> > > > > >
> > > > > > I agree with Abby. That would defeat the purpose of a QQ plot.
> > > > > >
> > > > > > On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle <spurdle.a at gmail.com> wrote:
> > > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > I'm not familiar with the qqman package, or GWAS studies.
> > > > > > > However, my guess would be that you're *not* supposed to change
> > > > > > > the position of the line.
> > > > > > >
> > > > > > > On Tue, Nov 12, 2019 at 11:48 AM Ana Marija
> > > > > > > <sokovic.anamarija at gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I was using this library, qqman
> > > > > > > > https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
> > > > > > > >
> > > > > > > > to create the attached QQ plot. How would I change this default
> > > > > > > > abline to start from the beginning of my QQ line?
> > > > > > > >
> > > > > > > > This is my code:
> > > > > > > >     qq(dd$P, main = "Q-Q plot of GWAS p-values")
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > > Ana
> > > > > > > > ______________________________________________
> > > > > > > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > > > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > > > > PLEASE do read the posting guide
> > > > > > > > http://www.R-project.org/posting-guide.html
> > > > > > > > and provide commented, minimal, self-contained, reproducible code.
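Duncan's rescaling argument quoted above can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: simulated null p-values stand in for dd$P, the 0.001 cut-off is the one Duncan guessed, and base R plotting is used instead of qqman::qq so that the axis labels can be set by hand.

```r
# Sketch only: simulated null p-values stand in for dd$P (an assumption).
set.seed(1)
threshold <- 1e-3
p_all   <- runif(1e6)                 # no association: p ~ U(0, 1)
p_small <- p_all[p_all < threshold]   # what is left after filtering at 0.001

# Conditional on p < 0.001, p / 0.001 (that is, p * 1000) is again U(0, 1).
p_scaled <- p_small / threshold

# Rescaling moves every point by exactly 3 on the -log10 scale.
shift <- -log10(p_small) - (-log10(p_scaled))

# Base-R Q-Q plot of the rescaled values against uniform quantiles,
# with axis labels that say what was actually plotted.
if (interactive()) {
  n <- length(p_scaled)
  plot(-log10(ppoints(n)), -log10(sort(p_scaled)),
       xlab = "Expected -log10(p * 1000)",
       ylab = "Observed -log10(p * 1000)",
       main = "Q-Q plot of small GWAS p-values (rescaled)")
  abline(0, 1, col = "red")
}
```

On this rescaled axis a p-value of 9.89e-08 would sit at roughly 4.0 rather than roughly 7.0, so labelling the axes as -log10(p * 1000) is one way to keep the plot honest about the shift.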
IMO, this thread has now gone totally off the rails and totally off
topic -- it is clearly *not* about R programming and totally about
statistics.

I believe Ana Marija would do better to get local statistical help or
post on a statistics or genomics list (stats.stackexchange.com is one
such) where she can engage in a fuller discussion of what an
*appropriate* qqplot would tell her. Of course selecting the lowest 3700
p-values from 55.5 million and plotting them against 3700 expected
uniform quantiles will not give a line with 0 intercept and slope 1. The
scale correction is easy to make, but it is not multiplying by 1000!

Bert
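Bert does not say which scale correction he has in mind. One standard option, offered here only as a guess at the kind of correction meant, is to keep the expected quantiles of the *full* set of tests: if n_total independent null p-values are drawn, the i-th smallest has expectation i / (n_total + 1). The helper name expected_tail and the simulated data are hypothetical; n_total plays the role of the 5556249 tests in the thread.

```r
# Sketch only: simulated null p-values, with a smaller n_total than the
# thread's 5556249 so that it runs quickly (an assumption).
set.seed(2)
n_total <- 500000L
p_all   <- runif(n_total)
p_small <- sort(p_all[p_all < 1e-3])   # the retained lower tail, ascending
k       <- length(p_small)

# Hypothetical helper: expected value of the i-th smallest of n_total
# independent U(0, 1) p-values is i / (n_total + 1), for i = 1..k.
expected_tail <- function(k, n_total) seq_len(k) / (n_total + 1)

expected <- expected_tail(k, n_total)

# Tail-only Q-Q plot on the original p-value scale; the reference line
# keeps intercept 0 and slope 1 because the expected values come from
# the full number of tests, not from the k retained points.
if (interactive()) {
  plot(-log10(expected), -log10(p_small),
       xlab = "Expected -log10(p)", ylab = "Observed -log10(p)",
       main = "Q-Q plot of the lower tail only")
  abline(0, 1, col = "red")
}
```

With the numbers from the thread, the largest expected value would be 3713 / 5556250, about 6.7e-4 rather than 1e-3, which is one way to see that a fixed factor of 1000 need not match the data.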
Typo: "... from 5.5 million..."

Bert
Dear Ana

As others have commented this is getting a bit off-topic but here are
some hints.

It is helpful to distinguish two sorts of plot: archival plots and
impact plots. If you want to have an impact plot which gives you a
picture but possibly at the cost of completeness and accuracy then why not:

1 - plot a sample of your 5 million drawn at random
2 - bin the data and plot median p-value against median expected
3 - deal with overlap by choosing a graphical device which supports
    transparency and plot points in very light grey so the overlap is
    more visible.

Michael

--
Michael
http://www.dewey.myzen.co.uk/home.html
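Michael's three hints could be sketched in base R roughly as follows. This is an illustration under assumptions, not his code: the p-values are simulated, and the sample size of 5000 and the 200 bins are arbitrary choices. The expected quantiles are computed on the full vector first, so that thinning the points does not change the scale.

```r
# Sketch only: simulated p-values stand in for the full 5.5 million (assumption).
set.seed(3)
p <- runif(200000)
n <- length(p)

# Q-Q coordinates computed once on the complete data.
obs  <- -log10(sort(p))
expd <- -log10(ppoints(n))

# 1 - a random sample of the points on the full Q-Q curve
idx <- sort(sample.int(n, 5000))

# 2 - bin consecutive points and take medians of observed and expected
bins    <- cut(seq_len(n), breaks = 200, labels = FALSE)
med_obs <- tapply(obs,  bins, median)
med_exp <- tapply(expd, bins, median)

# 3 - semi-transparent light grey points so that overlap shows up as
#     darker regions (needs a graphics device that supports transparency)
if (interactive()) {
  plot(expd[idx], obs[idx], pch = 16,
       col = adjustcolor("grey40", alpha.f = 0.2),
       xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
  points(med_exp, med_obs, pch = 16, col = "blue")
  abline(0, 1, col = "red")
}
```

A uniform random sample and equal-sized bins both thin out the extreme tail, which is usually the interesting part of a GWAS Q-Q plot, so this is an impact plot in Michael's sense rather than an archival one.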
Hi Michael,

Thank you so much for that valuable idea! I will first try to clump or
remove SNPs in LD, and maybe the situation will improve. But this
procedure of yours is definitely something that will come in handy in
the future!

Cheers,
Ana