Eric Christopher via llvm-dev
2017-May-12 18:06 UTC
[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Agreed. That was probably just luck before :) -eric On Fri, May 12, 2017 at 5:22 AM Diana Picus via llvm-dev < llvm-dev at lists.llvm.org> wrote:> It turns out that can be fixed by adding -lm to the link line, so I > will probably convert it into a test-suite bug. > > I don't suppose it's crucial to handle the fabs intrinsic nicely at -O0. > > On 12 May 2017 at 13:44, Diana Picus <diana.picus at linaro.org> wrote: > > Hi, > > > > I ran into a little snag on the test-suite: > > https://bugs.llvm.org/show_bug.cgi?id=33021 > > It boils down to GlobalISel generating calls to fabs instead of using > > FABSDr (so we get undefined references). > > > > Cheers, > > Diana > > > > On 11 May 2017 at 18:40, Quentin Colombet <qcolombet at apple.com> wrote: > >> Hi Diana, > >> > >> Thanks for the summary. > >> > >> On May 11, 2017, at 4:01 AM, Diana Picus <diana.picus at linaro.org> > wrote: > >> > >> Hi all, > >> > >> I'm still running some validation on this, I'll send an email when > >> it's done. If that goes well I don't have anything against making the > >> switch. > >> > >> For the record, here's a summary of issues that were deferred for > >> later on (some of which are optimization-ish and we might decide to > >> never do at -O0 at all): > >> * Crash in RegBankSelect for half fp types: > >> https://bugs.llvm.org/show_bug.cgi?id=32560 > >> > >> > >> I’ll have a look. > >> > >> * Improving constant placement: > http://bugs.llvm.org/show_bug.cgi?id=32561 > >> > >> > >> I’ve commented in the PR to mention the localizer technic I was playing > >> with, if someone wants to give it a try. > >> > >> * Fancy switch lowering > >> * Transforming division-by-constant-power-of-2 into right shift > >> > >> > >> AFAICT all the other issues that were brought up were fixed (yay!). > >> > >> Cheers, > >> Diana > >> > >> > >> Cheers, > >> -Quentin > >> > >> > >> > >> On 11 May 2017 at 08:44, Kristof Beyls via llvm-dev > >> <llvm-dev at lists.llvm.org> wrote: > >> > >> > >> On 10 May 2017, at 17:36, Quentin Colombet <qcolombet at apple.com> wrote: > >> > >> > >> MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode (46%): > Function > >> Reference_IDCT: Probably due to creating all constants in the entry BB + > >> spilling floating point data through an X register: > >> > >> FastISel: > >> fadd d0, d1, d0 > >> str d0, [sp,#528] > >> GlobalISel: > >> fadd d0, d1, d0 > >> fmov x9, d0 > >> stur x9, [x29,#-48] > >> > >> > >> Good finding, I forgot to do stores in my previous fix. I’ll do them > >> shortly. > >> > >> > >> Should be fixed by r302679 > >> > >> > >> Thanks Quentin, > >> > >> That reduces the slow-down when enabling globalisel at -O0 from 13% (on > >> r302453) to 9.5% (on r302679) in my experiments. > >> The code size increase also reduces from just over 3% to 2.8%. > >> > >> Kristof > >> > >> > >> _______________________________________________ > >> LLVM Developers mailing list > >> llvm-dev at lists.llvm.org > >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >> > >> > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170512/def7609e/attachment-0001.html>
Diana Picus via llvm-dev
2017-May-15 07:38 UTC
[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Got another one: https://bugs.llvm.org/show_bug.cgi?id=33036 I'm still investigating whether this is an actual GlobalISel issue or something else (I'll also start a build on a more recent revision to see how that behaves). On 12 May 2017 at 20:06, Eric Christopher <echristo at gmail.com> wrote:> Agreed. That was probably just luck before :) > > -eric > > On Fri, May 12, 2017 at 5:22 AM Diana Picus via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> It turns out that can be fixed by adding -lm to the link line, so I >> will probably convert it into a test-suite bug. >> >> I don't suppose it's crucial to handle the fabs intrinsic nicely at -O0. >> >> On 12 May 2017 at 13:44, Diana Picus <diana.picus at linaro.org> wrote: >> > Hi, >> > >> > I ran into a little snag on the test-suite: >> > https://bugs.llvm.org/show_bug.cgi?id=33021 >> > It boils down to GlobalISel generating calls to fabs instead of using >> > FABSDr (so we get undefined references). >> > >> > Cheers, >> > Diana >> > >> > On 11 May 2017 at 18:40, Quentin Colombet <qcolombet at apple.com> wrote: >> >> Hi Diana, >> >> >> >> Thanks for the summary. >> >> >> >> On May 11, 2017, at 4:01 AM, Diana Picus <diana.picus at linaro.org> >> >> wrote: >> >> >> >> Hi all, >> >> >> >> I'm still running some validation on this, I'll send an email when >> >> it's done. If that goes well I don't have anything against making the >> >> switch. >> >> >> >> For the record, here's a summary of issues that were deferred for >> >> later on (some of which are optimization-ish and we might decide to >> >> never do at -O0 at all): >> >> * Crash in RegBankSelect for half fp types: >> >> https://bugs.llvm.org/show_bug.cgi?id=32560 >> >> >> >> >> >> I’ll have a look. >> >> >> >> * Improving constant placement: >> >> http://bugs.llvm.org/show_bug.cgi?id=32561 >> >> >> >> >> >> I’ve commented in the PR to mention the localizer technic I was playing >> >> with, if someone wants to give it a try. >> >> >> >> * Fancy switch lowering >> >> * Transforming division-by-constant-power-of-2 into right shift >> >> >> >> >> >> AFAICT all the other issues that were brought up were fixed (yay!). >> >> >> >> Cheers, >> >> Diana >> >> >> >> >> >> Cheers, >> >> -Quentin >> >> >> >> >> >> >> >> On 11 May 2017 at 08:44, Kristof Beyls via llvm-dev >> >> <llvm-dev at lists.llvm.org> wrote: >> >> >> >> >> >> On 10 May 2017, at 17:36, Quentin Colombet <qcolombet at apple.com> wrote: >> >> >> >> >> >> MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode (46%): >> >> Function >> >> Reference_IDCT: Probably due to creating all constants in the entry BB >> >> + >> >> spilling floating point data through an X register: >> >> >> >> FastISel: >> >> fadd d0, d1, d0 >> >> str d0, [sp,#528] >> >> GlobalISel: >> >> fadd d0, d1, d0 >> >> fmov x9, d0 >> >> stur x9, [x29,#-48] >> >> >> >> >> >> Good finding, I forgot to do stores in my previous fix. I’ll do them >> >> shortly. >> >> >> >> >> >> Should be fixed by r302679 >> >> >> >> >> >> Thanks Quentin, >> >> >> >> That reduces the slow-down when enabling globalisel at -O0 from 13% (on >> >> r302453) to 9.5% (on r302679) in my experiments. >> >> The code size increase also reduces from just over 3% to 2.8%. >> >> >> >> Kristof >> >> >> >> >> >> _______________________________________________ >> >> LLVM Developers mailing list >> >> llvm-dev at lists.llvm.org >> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> >> >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Diana Picus via llvm-dev
2017-May-16 12:06 UTC
[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Turns out it really is a GlobalISel issue - we eat up too much stack space because all the constants used in the switches are stored on the stack. We need to fix this somehow before changing the default. I'll try to give it a run with Quentin's r299283 on top to see if it helps. Cheers, Diana On 15 May 2017 at 09:38, Diana Picus <diana.picus at linaro.org> wrote:> Got another one: https://bugs.llvm.org/show_bug.cgi?id=33036 > > I'm still investigating whether this is an actual GlobalISel issue or > something else (I'll also start a build on a more recent revision to > see how that behaves). > > On 12 May 2017 at 20:06, Eric Christopher <echristo at gmail.com> wrote: >> Agreed. That was probably just luck before :) >> >> -eric >> >> On Fri, May 12, 2017 at 5:22 AM Diana Picus via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >>> >>> It turns out that can be fixed by adding -lm to the link line, so I >>> will probably convert it into a test-suite bug. >>> >>> I don't suppose it's crucial to handle the fabs intrinsic nicely at -O0. >>> >>> On 12 May 2017 at 13:44, Diana Picus <diana.picus at linaro.org> wrote: >>> > Hi, >>> > >>> > I ran into a little snag on the test-suite: >>> > https://bugs.llvm.org/show_bug.cgi?id=33021 >>> > It boils down to GlobalISel generating calls to fabs instead of using >>> > FABSDr (so we get undefined references). >>> > >>> > Cheers, >>> > Diana >>> > >>> > On 11 May 2017 at 18:40, Quentin Colombet <qcolombet at apple.com> wrote: >>> >> Hi Diana, >>> >> >>> >> Thanks for the summary. >>> >> >>> >> On May 11, 2017, at 4:01 AM, Diana Picus <diana.picus at linaro.org> >>> >> wrote: >>> >> >>> >> Hi all, >>> >> >>> >> I'm still running some validation on this, I'll send an email when >>> >> it's done. If that goes well I don't have anything against making the >>> >> switch. >>> >> >>> >> For the record, here's a summary of issues that were deferred for >>> >> later on (some of which are optimization-ish and we might decide to >>> >> never do at -O0 at all): >>> >> * Crash in RegBankSelect for half fp types: >>> >> https://bugs.llvm.org/show_bug.cgi?id=32560 >>> >> >>> >> >>> >> I’ll have a look. >>> >> >>> >> * Improving constant placement: >>> >> http://bugs.llvm.org/show_bug.cgi?id=32561 >>> >> >>> >> >>> >> I’ve commented in the PR to mention the localizer technic I was playing >>> >> with, if someone wants to give it a try. >>> >> >>> >> * Fancy switch lowering >>> >> * Transforming division-by-constant-power-of-2 into right shift >>> >> >>> >> >>> >> AFAICT all the other issues that were brought up were fixed (yay!). >>> >> >>> >> Cheers, >>> >> Diana >>> >> >>> >> >>> >> Cheers, >>> >> -Quentin >>> >> >>> >> >>> >> >>> >> On 11 May 2017 at 08:44, Kristof Beyls via llvm-dev >>> >> <llvm-dev at lists.llvm.org> wrote: >>> >> >>> >> >>> >> On 10 May 2017, at 17:36, Quentin Colombet <qcolombet at apple.com> wrote: >>> >> >>> >> >>> >> MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode (46%): >>> >> Function >>> >> Reference_IDCT: Probably due to creating all constants in the entry BB >>> >> + >>> >> spilling floating point data through an X register: >>> >> >>> >> FastISel: >>> >> fadd d0, d1, d0 >>> >> str d0, [sp,#528] >>> >> GlobalISel: >>> >> fadd d0, d1, d0 >>> >> fmov x9, d0 >>> >> stur x9, [x29,#-48] >>> >> >>> >> >>> >> Good finding, I forgot to do stores in my previous fix. I’ll do them >>> >> shortly. >>> >> >>> >> >>> >> Should be fixed by r302679 >>> >> >>> >> >>> >> Thanks Quentin, >>> >> >>> >> That reduces the slow-down when enabling globalisel at -O0 from 13% (on >>> >> r302453) to 9.5% (on r302679) in my experiments. >>> >> The code size increase also reduces from just over 3% to 2.8%. >>> >> >>> >> Kristof >>> >> >>> >> >>> >> _______________________________________________ >>> >> LLVM Developers mailing list >>> >> llvm-dev at lists.llvm.org >>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >> >>> >> >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev