Diana Picus via llvm-dev
2017-May-11 11:01 UTC
[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Hi all, I'm still running some validation on this, I'll send an email when it's done. If that goes well I don't have anything against making the switch. For the record, here's a summary of issues that were deferred for later on (some of which are optimization-ish and we might decide to never do at -O0 at all): * Crash in RegBankSelect for half fp types: https://bugs.llvm.org/show_bug.cgi?id=32560 * Improving constant placement: http://bugs.llvm.org/show_bug.cgi?id=32561 * Fancy switch lowering * Transforming division-by-constant-power-of-2 into right shift AFAICT all the other issues that were brought up were fixed (yay!). Cheers, Diana On 11 May 2017 at 08:44, Kristof Beyls via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > On 10 May 2017, at 17:36, Quentin Colombet <qcolombet at apple.com> wrote: > > > MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode (46%): Function > Reference_IDCT: Probably due to creating all constants in the entry BB + > spilling floating point data through an X register: > > FastISel: > fadd d0, d1, d0 > str d0, [sp,#528] > GlobalISel: > fadd d0, d1, d0 > fmov x9, d0 > stur x9, [x29,#-48] > > > Good finding, I forgot to do stores in my previous fix. I’ll do them > shortly. > > > Should be fixed by r302679 > > > Thanks Quentin, > > That reduces the slow-down when enabling globalisel at -O0 from 13% (on > r302453) to 9.5% (on r302679) in my experiments. > The code size increase also reduces from just over 3% to 2.8%. > > Kristof > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Quentin Colombet via llvm-dev
2017-May-11 16:40 UTC
[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Hi Diana, Thanks for the summary.> On May 11, 2017, at 4:01 AM, Diana Picus <diana.picus at linaro.org> wrote: > > Hi all, > > I'm still running some validation on this, I'll send an email when > it's done. If that goes well I don't have anything against making the > switch. > > For the record, here's a summary of issues that were deferred for > later on (some of which are optimization-ish and we might decide to > never do at -O0 at all): > * Crash in RegBankSelect for half fp types: > https://bugs.llvm.org/show_bug.cgi?id=32560 <https://bugs.llvm.org/show_bug.cgi?id=32560>I’ll have a look.> * Improving constant placement: http://bugs.llvm.org/show_bug.cgi?id=32561 <http://bugs.llvm.org/show_bug.cgi?id=32561>I’ve commented in the PR to mention the localizer technic I was playing with, if someone wants to give it a try.> * Fancy switch lowering > * Transforming division-by-constant-power-of-2 into right shift > > AFAICT all the other issues that were brought up were fixed (yay!). > > Cheers, > DianaCheers, -Quentin> > > On 11 May 2017 at 08:44, Kristof Beyls via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> On 10 May 2017, at 17:36, Quentin Colombet <qcolombet at apple.com> wrote: >> >> >> MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode (46%): Function >> Reference_IDCT: Probably due to creating all constants in the entry BB + >> spilling floating point data through an X register: >> >> FastISel: >> fadd d0, d1, d0 >> str d0, [sp,#528] >> GlobalISel: >> fadd d0, d1, d0 >> fmov x9, d0 >> stur x9, [x29,#-48] >> >> >> Good finding, I forgot to do stores in my previous fix. I’ll do them >> shortly. >> >> >> Should be fixed by r302679 >> >> >> Thanks Quentin, >> >> That reduces the slow-down when enabling globalisel at -O0 from 13% (on >> r302453) to 9.5% (on r302679) in my experiments. >> The code size increase also reduces from just over 3% to 2.8%. >> >> Kristof >> >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170511/d4b409be/attachment.html>
Diana Picus via llvm-dev
2017-May-12 11:44 UTC
[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Hi, I ran into a little snag on the test-suite: https://bugs.llvm.org/show_bug.cgi?id=33021 It boils down to GlobalISel generating calls to fabs instead of using FABSDr (so we get undefined references). Cheers, Diana On 11 May 2017 at 18:40, Quentin Colombet <qcolombet at apple.com> wrote:> Hi Diana, > > Thanks for the summary. > > On May 11, 2017, at 4:01 AM, Diana Picus <diana.picus at linaro.org> wrote: > > Hi all, > > I'm still running some validation on this, I'll send an email when > it's done. If that goes well I don't have anything against making the > switch. > > For the record, here's a summary of issues that were deferred for > later on (some of which are optimization-ish and we might decide to > never do at -O0 at all): > * Crash in RegBankSelect for half fp types: > https://bugs.llvm.org/show_bug.cgi?id=32560 > > > I’ll have a look. > > * Improving constant placement: http://bugs.llvm.org/show_bug.cgi?id=32561 > > > I’ve commented in the PR to mention the localizer technic I was playing > with, if someone wants to give it a try. > > * Fancy switch lowering > * Transforming division-by-constant-power-of-2 into right shift > > > AFAICT all the other issues that were brought up were fixed (yay!). > > Cheers, > Diana > > > Cheers, > -Quentin > > > > On 11 May 2017 at 08:44, Kristof Beyls via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > > On 10 May 2017, at 17:36, Quentin Colombet <qcolombet at apple.com> wrote: > > > MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode (46%): Function > Reference_IDCT: Probably due to creating all constants in the entry BB + > spilling floating point data through an X register: > > FastISel: > fadd d0, d1, d0 > str d0, [sp,#528] > GlobalISel: > fadd d0, d1, d0 > fmov x9, d0 > stur x9, [x29,#-48] > > > Good finding, I forgot to do stores in my previous fix. I’ll do them > shortly. > > > Should be fixed by r302679 > > > Thanks Quentin, > > That reduces the slow-down when enabling globalisel at -O0 from 13% (on > r302453) to 9.5% (on r302679) in my experiments. > The code size increase also reduces from just over 3% to 2.8%. > > Kristof > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >