Quentin Colombet via llvm-dev
2017-Nov-27 17:44 UTC
[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Thanks all. Amara, could you take a look?> On Nov 20, 2017, at 3:06 AM, Oliver Stannard <oliver.stannard at arm.com> wrote: > > Hi Quentin, > > I’ve raised: > https://bugs.llvm.org/show_bug.cgi?id=35359 <https://bugs.llvm.org/show_bug.cgi?id=35359> > https://bugs.llvm.org/show_bug.cgi?id=35360 <https://bugs.llvm.org/show_bug.cgi?id=35360> > https://bugs.llvm.org/show_bug.cgi?id=35361 <https://bugs.llvm.org/show_bug.cgi?id=35361> > > I also left the test suite running over the weekend in little-endian mode with the __fp16 type disabled, and it didn’t find any bugs there. > > Oliver > > From: qcolombet at apple.com [mailto:qcolombet at apple.com] > Sent: 17 November 2017 17:28 > To: Oliver Stannard > Cc: llvm-dev at lists.llvm.org; nd; Kristof Beyls > Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! > > Hi Oliver, > > Thanks for trying this. > Could you file a different PR for each of the problem you found and reference the umbrella PR: http://llvm.org/PR35347? <http://llvm.org/PR35347?> > > Thanks, > -Quentin > > > On Nov 17, 2017, at 8:17 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote: > > Hi Quentin, > > One more reproducer, this time with small (<64bit) values being passed on the stack: > > int foo(int x0, int x1, int x2, int x3, int x4, int x5, int x6, int x7, > int stack1) { > return stack1; > } > > int main() { > int ret = foo(0,1,2,3,4,5,6,7,8); > printf("%d\n", ret); > } > > Global isel thinks that the incoming value of stack1 is stored in bytes [0,4) above SP, but for big-endian targets this should be in bytes [4,8): > > // /work/llvm/build/bin/clang --target=aarch64-arm-none-eabi -march=armv8-a -c callees.cpp -O0 -Wall -std=c++11 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S > _Z3fooiiiiiiiii: // @_Z3fooiiiiiiiii > // BB#0: // %entry > sub sp, sp, #48 // =48 > ldr w8, [sp, #48] // <= Should be [sp, #52] > str w0, [sp, #44] > str w1, [sp, #40] > str w2, [sp, #36] > str w3, [sp, #32] > str w4, [sp, #28] > str w5, [sp, #24] > str w6, [sp, #20] > str w7, [sp, #16] > str w8, [sp, #12] > ldr w0, [sp, #12] > add sp, sp, #48 // =48 > ret > > Oliver > > From: Oliver Stannard > Sent: 17 November 2017 14:57 > To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>' > Cc: 'llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>'; nd; Kristof Beyls > Subject: RE: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! > > Hi Quentin, > > It seems that we also get the calling convention wrong for vector types on big-endian: > #include <arm_neon.h> > int32x2_t load_vector(int32x2_t *p) { > return *p; > } > > Global-isel generates this: > // armclang --target=aarch64-arm-none-eabi -march=armv8-a -c callees.cpp -O0 -Wall -std=c++11 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S > _Z11load_vectorP11__Int32x2_t: // @_Z11load_vectorP11__Int32x2_t > // BB#0: // %entry > sub sp, sp, #16 // =16 > str x0, [sp, #8] > ldr x0, [sp, #8] > ld1 { v0.2s }, [x0] > add sp, sp, #16 // =16 > ret > > With global-isel off, there is a rev64 instruction between the ld1 and the add, which fixes up the endianness of the vector. > > Oliver > > From: Oliver Stannard > Sent: 17 November 2017 13:32 > To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>' > Cc: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; nd; Kristof Beyls > Subject: RE: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! > > Hi Quentin, > > At Kristof’s suggestion, I tried running our ABI test suite for a big-endian AArch64 target, and this found an ABI mismatch between global-isel and regular -O0. Here’s a reproducer for the first one I’ve investigated: > > struct foo { > float first; > float second; > }; > float get_first(foo p) { > return p.first; > } > > This is the code that global-isel currently generates: > // /work/llvm/build/bin/clang --target=aarch64--none-eabi -march=armv8-a -c callees.cpp -O0 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S > > _Z9get_first3foo: // @_Z9get_first3foo > // BB#0: // %entry > sub sp, sp, #16 // =16 > // implicit-def: %X8 > fmov w9, s0 > mov w10, w9 > bfxil x8, x10, #0, #32 > fmov w9, s1 > mov w10, w9 > bfi x8, x10, #32, #32 > add x10, sp, #8 // =8 > str x8, [sp, #8] > ldr w9, [x10] > fmov s0, w9 > add sp, sp, #16 // =16 > ret > > When run on a big-endian target, this incorrectly returns the second member of the struct, instead of the first. > > Oliver > > From: qcolombet at apple.com <mailto:qcolombet at apple.com> [mailto:qcolombet at apple.com <mailto:qcolombet at apple.com>] > Sent: 14 November 2017 23:11 > To: Quentin Colombet > Cc: Oliver Stannard; llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; Justin Bogner; Ahmed Bougacha; Aditya Nandakumar; nd > Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! > > To give an update here, we actually are not missing a mapping. The code complains because we are copying around a fp16 into a gpr32 and that shouldn’t be done with a copy (default mapping). > I extended the repairing code to issue G_ANYEXT in those cases instead of asserting. > > However, now, I have to teach instruction select about those ANYEXT otherwise we’ll fallback in that case. But that’s a different story. > > I’ll try to commit today or tomorrow (I have to strengthen the tests). > > On Nov 14, 2017, at 9:29 AM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > Thanks Oliver. > I’ll have a look. This typically means that we miss a mapping for this type/instruction, which is not surprising given how little code we have we fp16. > > > On Nov 14, 2017, at 2:27 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote: > > Hi Quentin, > > I’ve started running an ABI test suite with global isel on AArch64, and while it hasn’t found any ABI issues it has hit an assertion in clang when using the __fp16 type. Here’s a reproducer: > > __fp16 pass_f16(__fp16 p) { > return p; > } > > $ /work/llvm/build/bin/clang --target=aarch64-arm-none-eabi -march=armv8-a -c test.c -O0 -mllvm -global-isel -mllvm -global-isel-abort=0 > clang-6.0: /work/llvm/llvm/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp:446: static void llvm::RegisterBankInfo::applyDefaultMapping(const llvm::RegisterBankInfo::OperandsMapper &): Assertion `OrigTy.getSizeInBits() == NewTy.getSizeInBits() && "Types with difference size cannot be handled by the default " "mapping"' failed. > #0 0x000000000362a764 PrintStackTraceSignalHandler(void*) (/work/llvm/build/bin/clang-6.0+0x362a764) > #1 0x000000000362aac6 SignalHandler(int) (/work/llvm/build/bin/clang-6.0+0x362aac6) > #2 0x00007f9193b78330 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10330) > #3 0x00007f919276bc37 gsignal /build/eglibc-oGUzwX/eglibc-2.19/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:56:0 > #4 0x00007f919276f028 abort /build/eglibc-oGUzwX/eglibc-2.19/stdlib/abort.c:91:0 > #5 0x00007f9192764bf6 __assert_fail_base /build/eglibc-oGUzwX/eglibc-2.19/assert/assert.c:92:0 > #6 0x00007f9192764ca2 (/lib/x86_64-linux-gnu/libc.so.6+0x2fca2) > #7 0x0000000003d70eb9 (/work/llvm/build/bin/clang-6.0+0x3d70eb9) > #8 0x0000000003d6b00c llvm::RegBankSelect::applyMapping(llvm::MachineInstr&, llvm::RegisterBankInfo::InstructionMapping const&, llvm::SmallVectorImpl<llvm::RegBankSelect::RepairingPlacement>&) (/work/llvm/build/bin/clang-6.0+0x3d6b00c) > #9 0x0000000003d6b366 llvm::RegBankSelect::assignInstr(llvm::MachineInstr&) (/work/llvm/build/bin/clang-6.0+0x3d6b366) > #10 0x0000000003d6b7f1 llvm::RegBankSelect::runOnMachineFunction(llvm::MachineFunction&) (/work/llvm/build/bin/clang-6.0+0x3d6b7f1) > #11 0x0000000002d934c8 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/work/llvm/build/bin/clang-6.0+0x2d934c8) > #12 0x00000000030c998f llvm::FPPassManager::runOnFunction(llvm::Function&) (/work/llvm/build/bin/clang-6.0+0x30c998f) > #13 0x00000000030c9c53 llvm::FPPassManager::runOnModule(llvm::Module&) (/work/llvm/build/bin/clang-6.0+0x30c9c53) > #14 0x00000000030ca136 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/work/llvm/build/bin/clang-6.0+0x30ca136) > #15 0x00000000037c3dcf clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/work/llvm/build/bin/clang-6.0+0x37c3dcf) > #16 0x0000000003d421a0 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/work/llvm/build/bin/clang-6.0+0x3d421a0) > #17 0x0000000004457376 clang::ParseAST(clang::Sema&, bool, bool) (/work/llvm/build/bin/clang-6.0+0x4457376) > #18 0x0000000003ca6ea0 clang::FrontendAction::Execute() (/work/llvm/build/bin/clang-6.0+0x3ca6ea0) > #19 0x0000000003c1fa31 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/work/llvm/build/bin/clang-6.0+0x3c1fa31) > #20 0x0000000003d3bf4b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/work/llvm/build/bin/clang-6.0+0x3d3bf4b) > #21 0x0000000001f85629 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/work/llvm/build/bin/clang-6.0+0x1f85629) > #22 0x0000000001f83096 main (/work/llvm/build/bin/clang-6.0+0x1f83096) > #23 0x00007f9192756f45 __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:321:0 > #24 0x0000000001f80029 _start (/work/llvm/build/bin/clang-6.0+0x1f80029) > Stack dump: > 0. Program arguments: /work/llvm/build/bin/clang-6.0 -cc1 -triple aarch64-arm-none-eabi -emit-obj -mrelax-all -disable-free -main-file-name test.c -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu generic -target-feature +neon -target-abi aapcs -fallow-half-arguments-and-returns -dwarf-column-info -debugger-tuning=gdb -coverage-notes-file /work/innovation/cctest/test.gcno -resource-dir /work/llvm/build/lib/clang/6.0.0 -O0 -fdebug-compilation-dir /work/innovation/cctest -ferror-limit 19 -fmessage-length 226 -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -mllvm -global-isel -mllvm -global-isel-abort=0 -o test.o -x c test.c > 1. <eof> parser at end of file > 2. Code generation > 3. Running pass 'Function Pass Manager' on module 'test.c'. > 4. Running pass 'RegBankSelect' on function '@pass_f16' > clang-6.0: error: unable to execute command: Aborted (core dumped) > clang-6.0: error: clang frontend command failed due to signal (use -v to see invocation) > clang version 6.0.0 (ssh://olista01 at ds-gerrit.euhpc.arm.com:29418/armcompiler/clang <ssh://olista01 at ds-gerrit.euhpc.arm.com:29418/armcompiler/clang> aa2b9952ef98a5fe2d47384ef17106855b8bae51) (ssh://olista01 at ds-gerrit.euhpc.arm.com:29418/armcompiler/llvm <ssh://olista01 at ds-gerrit.euhpc.arm.com:29418/armcompiler/llvm> 29f89772107a79b5f2a816d4748ed9c19416c1b6) > Target: aarch64-arm-none-eabi > Thread model: posix > InstalledDir: /work/llvm/build/bin > clang-6.0: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ <http://llvm.org/bugs/> and include the crash backtrace, preprocessed source, and associated run script. > clang-6.0: note: diagnostic msg: > ******************** > > PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT: > Preprocessed source(s) and associated run script(s) are located at: > clang-6.0: note: diagnostic msg: /tmp/test-e06964.c > clang-6.0: note: diagnostic msg: /tmp/test-e06964.sh > clang-6.0: note: diagnostic msg: > > ******************** > > Oliver > > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org <mailto:llvm-dev-bounces at lists.llvm.org>] On Behalf Of Quentin Colombet via llvm-dev > Sent: 13 November 2017 18:27 > To: Kristof Beyls > Cc: llvm-dev; nd; Ahmed Bougacha; Justin Bogner; Aditya Nandakumar > Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! > > Hi Kristof, > > > On Nov 13, 2017, at 9:10 AM, Kristof Beyls <Kristof.Beyls at arm.com <mailto:Kristof.Beyls at arm.com>> wrote: > > Hi Quentin, > > My only remaining concern is around ABI compatibility. > The following commit seems to indicate that in the previous round of evaluation, we didn’t find an existing ABI compatibility issue: > http://llvm.org/viewvc/llvm-project?view=revision&revision=311388 <http://llvm.org/viewvc/llvm-project?view=revision&revision=311388>. > I haven’t looked into the details of this issue - so maybe I’m worried over nothing? > > No, you’re right. The problem with ABI is if you are consistently wrong, then you won’t notice :). > > > > I’m wondering if since then on your side you did any testing around ABI compatibility? > E.g. building software where you semi-randomly build some functions through GlobalISel and some functions through DAGISel? > > Justin will look into that. Clang has utility script for that utils/ABITest. > > Given we will only be able to check iOS ABI, you may want to follow the same kind of validation on your side. > > I let you sync up with Justin for the method. > > Cheers, > -Quentin > > > > Thanks, > > Kristof > > On 8 Nov 2017, at 00:42, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > Hi all, > > I’d like to resurrect this thread and ask if people are on board for enabling this by default for AArch64 O0. > > > *** What Changed Since June? *** > > - We added a way to describe the legalization actions for non-power-of-2 > - We gave a tutorial that covers the best practices to target GlobalISel > - We improved the TableGen backend to reuse existing SDISel patterns > - We built and ran huge internal software with GISel > - We evaluated the performance of GISel and are confident things are in a good shape (with https://reviews.llvm.org/D39034 <https://reviews.llvm.org/D39034>) and moving forward would look even better (see the last LLVM Dev talk: GlobalISel: Present, Past, and Future when it is available) > > > *** So What’s he Plan? *** > > - Switch the default instruction selector to GISel for AArch64 at O0 > - Enable the fallback path by default for AArch64 (with warnings enabled when that path is hit) > - Provide a clang option to turn GISel off > > What do you think? > > Thanks, > -Quentin > > > On Jun 16, 2017, at 4:43 PM, Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at apple.com>> wrote: > > Hi all, > > We had some internal discussions about flipping the default for O0 and we concluded that we wanted to postpone it. > > > *** Why Is That? *** > > We don’t want to send the wrong message that GlobalISel’s design is set in stone and ready for broader adoption. > In particular, > 1. The APIs are still evolving and can still possibly change significantly > 2. The TableGen backend to reuse the existing SD patterns is still at its early stage > 3. We want to investigate closely the performance of global-isel (compile-time, runtime, code size, fallbacks) > > The rationale behind those items is that we want to minimize the pain of moving forward for everybody. We also want the out-of-the-box experience to be pleasant (like all/most of the tablegen patterns just work, we have documentation on how to target a new backend, etc.) Finally, we want to gain confidence we are going to be able to address the performance issues we have with the current design and if not, derive a plan for that. > > We purposely left out of the conversation what will be the right time and requirements to flip the switch. We want to gather more data first. Your help would be appreciated! > > > *** Short-Term Proposal *** > > What we would like to do instead short-term is: > A. Repurpose or create an option “-aarch64-enable-global-isel-at-O” to enable GISel with fallbacks and warnings enables (i.e., equivalent of -global-isel -global-isel-abort=2) > B. Advertise this option in the next open source release to allow compiler enthusiastic to try it and report problems > C. Have GISel always built so we can push thing in the right place, MachineVerifier in mind, and stop doing some weird gymnastic > > What do people think? > > > *** Your Help Is Needed *** > > - Please share your experience in using the GISel APIs and how we can make them better. Moving forward we’ll have those conversations on open source instead of internally/with a narrower audience. > - Report any performance problem you identify > - Propose patches! > > Cheers, > -Quentin > > > > On Jun 16, 2017, at 3:06 PM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > > On Jun 14, 2017, at 7:27 AM, Diana Picus <diana.picus at linaro.org <mailto:diana.picus at linaro.org>> wrote: > > On 12 June 2017 at 18:54, Diana Picus <diana.picus at linaro.org <mailto:diana.picus at linaro.org>> wrote: > Hi all, > > I added a buildbot [1] running the test-suite with -O0 -global-isel. It runs into the same 2 timeouts that I reported previously on this thread (paq8p and scimark2). It would be nice to make it green before flipping the switch. > > > I did some more investigations on a machine similar to the one running the buildbot. For paq8p and scimark2, I get these results for O0: > > PAQ8p: > Fast isel: 666.344 > Global isel: 731.384 > > SciMark2-C: > Fast isel: 463.908 > Global isel: 496.22 > > The current timeout is 500s (so in this particular case we didn't hit it for scimark2, and it ran successfully to completion). I don't think the difference between FastISel and GlobalISel is too atrocious, so I would propose increasing the timeout for these 2 benchmarks. I'm not sure if we can do this on a per-bot basis, but I see some precedent for setting custom timeout thresholds for various benchmarks on different architectures (sometimes with comments that it's done so we can run O0 on that particular benchmark). > > Something along these lines works: > https://reviews.llvm.org/differential/diff/102547/ <https://reviews.llvm.org/differential/diff/102547/> > > What do you guys think about this approach? > > Looks reasonable to me. > > > > Thanks, > Diana > > PS: The buildbot is using the Makefiles because that's what our other AArch64 test-suite bots use. Moving all of them to CMake is a transition for another time. > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171127/6c0de432/attachment-0001.html>
Amara Emerson via llvm-dev
2017-Dec-11 17:08 UTC
[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
As of r320388 we’ve either fixed the blocker bugs or disabled big-endian on GISel, falling back to SDAG. Fixing at least one of the big-endian issues will need use to change the way we handle aggregates, which will take a bit longer (it’s next on my list of things to do). Do we have any other issues preventing us flipping the switch? Amara> On Nov 27, 2017, at 5:44 PM, Quentin Colombet <qcolombet at apple.com> wrote: > > Thanks all. > > Amara, could you take a look? > >> On Nov 20, 2017, at 3:06 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote: >> >> Hi Quentin, >> >> I’ve raised: >> https://bugs.llvm.org/show_bug.cgi?id=35359 <https://bugs.llvm.org/show_bug.cgi?id=35359> >> https://bugs.llvm.org/show_bug.cgi?id=35360 <https://bugs.llvm.org/show_bug.cgi?id=35360> >> https://bugs.llvm.org/show_bug.cgi?id=35361 <https://bugs.llvm.org/show_bug.cgi?id=35361> >> >> I also left the test suite running over the weekend in little-endian mode with the __fp16 type disabled, and it didn’t find any bugs there. >> >> Oliver >> >> From: qcolombet at apple.com <mailto:qcolombet at apple.com> [mailto:qcolombet at apple.com <mailto:qcolombet at apple.com>] >> Sent: 17 November 2017 17:28 >> To: Oliver Stannard >> Cc: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; nd; Kristof Beyls >> Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! >> >> Hi Oliver, >> >> Thanks for trying this. >> Could you file a different PR for each of the problem you found and reference the umbrella PR: http://llvm.org/PR35347? <http://llvm.org/PR35347?> >> >> Thanks, >> -Quentin >> >> >> On Nov 17, 2017, at 8:17 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote: >> >> Hi Quentin, >> >> One more reproducer, this time with small (<64bit) values being passed on the stack: >> >> int foo(int x0, int x1, int x2, int x3, int x4, int x5, int x6, int x7, >> int stack1) { >> return stack1; >> } >> >> int main() { >> int ret = foo(0,1,2,3,4,5,6,7,8); >> printf("%d\n", ret); >> } >> >> Global isel thinks that the incoming value of stack1 is stored in bytes [0,4) above SP, but for big-endian targets this should be in bytes [4,8): >> >> // /work/llvm/build/bin/clang --target=aarch64-arm-none-eabi -march=armv8-a -c callees.cpp -O0 -Wall -std=c++11 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S >> _Z3fooiiiiiiiii: // @_Z3fooiiiiiiiii >> // BB#0: // %entry >> sub sp, sp, #48 // =48 >> ldr w8, [sp, #48] // <= Should be [sp, #52] >> str w0, [sp, #44] >> str w1, [sp, #40] >> str w2, [sp, #36] >> str w3, [sp, #32] >> str w4, [sp, #28] >> str w5, [sp, #24] >> str w6, [sp, #20] >> str w7, [sp, #16] >> str w8, [sp, #12] >> ldr w0, [sp, #12] >> add sp, sp, #48 // =48 >> ret >> >> Oliver >> >> From: Oliver Stannard >> Sent: 17 November 2017 14:57 >> To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>' >> Cc: 'llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>'; nd; Kristof Beyls >> Subject: RE: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! >> >> Hi Quentin, >> >> It seems that we also get the calling convention wrong for vector types on big-endian: >> #include <arm_neon.h> >> int32x2_t load_vector(int32x2_t *p) { >> return *p; >> } >> >> Global-isel generates this: >> // armclang --target=aarch64-arm-none-eabi -march=armv8-a -c callees.cpp -O0 -Wall -std=c++11 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S >> _Z11load_vectorP11__Int32x2_t: // @_Z11load_vectorP11__Int32x2_t >> // BB#0: // %entry >> sub sp, sp, #16 // =16 >> str x0, [sp, #8] >> ldr x0, [sp, #8] >> ld1 { v0.2s }, [x0] >> add sp, sp, #16 // =16 >> ret >> >> With global-isel off, there is a rev64 instruction between the ld1 and the add, which fixes up the endianness of the vector. >> >> Oliver >> >> From: Oliver Stannard >> Sent: 17 November 2017 13:32 >> To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>' >> Cc: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; nd; Kristof Beyls >> Subject: RE: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! >> >> Hi Quentin, >> >> At Kristof’s suggestion, I tried running our ABI test suite for a big-endian AArch64 target, and this found an ABI mismatch between global-isel and regular -O0. Here’s a reproducer for the first one I’ve investigated: >> >> struct foo { >> float first; >> float second; >> }; >> float get_first(foo p) { >> return p.first; >> } >> >> This is the code that global-isel currently generates: >> // /work/llvm/build/bin/clang --target=aarch64--none-eabi -march=armv8-a -c callees.cpp -O0 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S >> >> _Z9get_first3foo: // @_Z9get_first3foo >> // BB#0: // %entry >> sub sp, sp, #16 // =16 >> // implicit-def: %X8 >> fmov w9, s0 >> mov w10, w9 >> bfxil x8, x10, #0, #32 >> fmov w9, s1 >> mov w10, w9 >> bfi x8, x10, #32, #32 >> add x10, sp, #8 // =8 >> str x8, [sp, #8] >> ldr w9, [x10] >> fmov s0, w9 >> add sp, sp, #16 // =16 >> ret >> >> When run on a big-endian target, this incorrectly returns the second member of the struct, instead of the first. >> >> Oliver >> >> From: qcolombet at apple.com <mailto:qcolombet at apple.com> [mailto:qcolombet at apple.com <mailto:qcolombet at apple.com>] >> Sent: 14 November 2017 23:11 >> To: Quentin Colombet >> Cc: Oliver Stannard; llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; Justin Bogner; Ahmed Bougacha; Aditya Nandakumar; nd >> Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! >> >> To give an update here, we actually are not missing a mapping. The code complains because we are copying around a fp16 into a gpr32 and that shouldn’t be done with a copy (default mapping). >> I extended the repairing code to issue G_ANYEXT in those cases instead of asserting. >> >> However, now, I have to teach instruction select about those ANYEXT otherwise we’ll fallback in that case. But that’s a different story. >> >> I’ll try to commit today or tomorrow (I have to strengthen the tests). >> >> On Nov 14, 2017, at 9:29 AM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> Thanks Oliver. >> I’ll have a look. This typically means that we miss a mapping for this type/instruction, which is not surprising given how little code we have we fp16. >> >> >> On Nov 14, 2017, at 2:27 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote: >> >> Hi Quentin, >> >> I’ve started running an ABI test suite with global isel on AArch64, and while it hasn’t found any ABI issues it has hit an assertion in clang when using the __fp16 type. Here’s a reproducer: >> >> __fp16 pass_f16(__fp16 p) { >> return p; >> } >> >> $ /work/llvm/build/bin/clang --target=aarch64-arm-none-eabi -march=armv8-a -c test.c -O0 -mllvm -global-isel -mllvm -global-isel-abort=0 >> clang-6.0: /work/llvm/llvm/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp:446: static void llvm::RegisterBankInfo::applyDefaultMapping(const llvm::RegisterBankInfo::OperandsMapper &): Assertion `OrigTy.getSizeInBits() == NewTy.getSizeInBits() && "Types with difference size cannot be handled by the default " "mapping"' failed. >> #0 0x000000000362a764 PrintStackTraceSignalHandler(void*) (/work/llvm/build/bin/clang-6.0+0x362a764) >> #1 0x000000000362aac6 SignalHandler(int) (/work/llvm/build/bin/clang-6.0+0x362aac6) >> #2 0x00007f9193b78330 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10330) >> #3 0x00007f919276bc37 gsignal /build/eglibc-oGUzwX/eglibc-2.19/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:56:0 >> #4 0x00007f919276f028 abort /build/eglibc-oGUzwX/eglibc-2.19/stdlib/abort.c:91:0 >> #5 0x00007f9192764bf6 __assert_fail_base /build/eglibc-oGUzwX/eglibc-2.19/assert/assert.c:92:0 >> #6 0x00007f9192764ca2 (/lib/x86_64-linux-gnu/libc.so.6+0x2fca2) >> #7 0x0000000003d70eb9 (/work/llvm/build/bin/clang-6.0+0x3d70eb9) >> #8 0x0000000003d6b00c llvm::RegBankSelect::applyMapping(llvm::MachineInstr&, llvm::RegisterBankInfo::InstructionMapping const&, llvm::SmallVectorImpl<llvm::RegBankSelect::RepairingPlacement>&) (/work/llvm/build/bin/clang-6.0+0x3d6b00c) >> #9 0x0000000003d6b366 llvm::RegBankSelect::assignInstr(llvm::MachineInstr&) (/work/llvm/build/bin/clang-6.0+0x3d6b366) >> #10 0x0000000003d6b7f1 llvm::RegBankSelect::runOnMachineFunction(llvm::MachineFunction&) (/work/llvm/build/bin/clang-6.0+0x3d6b7f1) >> #11 0x0000000002d934c8 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/work/llvm/build/bin/clang-6.0+0x2d934c8) >> #12 0x00000000030c998f llvm::FPPassManager::runOnFunction(llvm::Function&) (/work/llvm/build/bin/clang-6.0+0x30c998f) >> #13 0x00000000030c9c53 llvm::FPPassManager::runOnModule(llvm::Module&) (/work/llvm/build/bin/clang-6.0+0x30c9c53) >> #14 0x00000000030ca136 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/work/llvm/build/bin/clang-6.0+0x30ca136) >> #15 0x00000000037c3dcf clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/work/llvm/build/bin/clang-6.0+0x37c3dcf) >> #16 0x0000000003d421a0 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/work/llvm/build/bin/clang-6.0+0x3d421a0) >> #17 0x0000000004457376 clang::ParseAST(clang::Sema&, bool, bool) (/work/llvm/build/bin/clang-6.0+0x4457376) >> #18 0x0000000003ca6ea0 clang::FrontendAction::Execute() (/work/llvm/build/bin/clang-6.0+0x3ca6ea0) >> #19 0x0000000003c1fa31 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/work/llvm/build/bin/clang-6.0+0x3c1fa31) >> #20 0x0000000003d3bf4b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/work/llvm/build/bin/clang-6.0+0x3d3bf4b) >> #21 0x0000000001f85629 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/work/llvm/build/bin/clang-6.0+0x1f85629) >> #22 0x0000000001f83096 main (/work/llvm/build/bin/clang-6.0+0x1f83096) >> #23 0x00007f9192756f45 __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:321:0 >> #24 0x0000000001f80029 _start (/work/llvm/build/bin/clang-6.0+0x1f80029) >> Stack dump: >> 0. Program arguments: /work/llvm/build/bin/clang-6.0 -cc1 -triple aarch64-arm-none-eabi -emit-obj -mrelax-all -disable-free -main-file-name test.c -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu generic -target-feature +neon -target-abi aapcs -fallow-half-arguments-and-returns -dwarf-column-info -debugger-tuning=gdb -coverage-notes-file /work/innovation/cctest/test.gcno -resource-dir /work/llvm/build/lib/clang/6.0.0 -O0 -fdebug-compilation-dir /work/innovation/cctest -ferror-limit 19 -fmessage-length 226 -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -mllvm -global-isel -mllvm -global-isel-abort=0 -o test.o -x c test.c >> 1. <eof> parser at end of file >> 2. Code generation >> 3. Running pass 'Function Pass Manager' on module 'test.c'. >> 4. Running pass 'RegBankSelect' on function '@pass_f16' >> clang-6.0: error: unable to execute command: Aborted (core dumped) >> clang-6.0: error: clang frontend command failed due to signal (use -v to see invocation) >> clang version 6.0.0 (ssh://olista01 at ds-gerrit.euhpc.arm.com:29418/armcompiler/clang <ssh://olista01 at ds-gerrit.euhpc.arm.com:29418/armcompiler/clang> aa2b9952ef98a5fe2d47384ef17106855b8bae51) (ssh://olista01 at ds-gerrit.euhpc.arm.com:29418/armcompiler/llvm <ssh://olista01 at ds-gerrit.euhpc.arm.com:29418/armcompiler/llvm> 29f89772107a79b5f2a816d4748ed9c19416c1b6) >> Target: aarch64-arm-none-eabi >> Thread model: posix >> InstalledDir: /work/llvm/build/bin >> clang-6.0: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ <http://llvm.org/bugs/> and include the crash backtrace, preprocessed source, and associated run script. >> clang-6.0: note: diagnostic msg: >> ******************** >> >> PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT: >> Preprocessed source(s) and associated run script(s) are located at: >> clang-6.0: note: diagnostic msg: /tmp/test-e06964.c >> clang-6.0: note: diagnostic msg: /tmp/test-e06964.sh >> clang-6.0: note: diagnostic msg: >> >> ******************** >> >> Oliver >> >> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org <mailto:llvm-dev-bounces at lists.llvm.org>] On Behalf Of Quentin Colombet via llvm-dev >> Sent: 13 November 2017 18:27 >> To: Kristof Beyls >> Cc: llvm-dev; nd; Ahmed Bougacha; Justin Bogner; Aditya Nandakumar >> Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try! >> >> Hi Kristof, >> >> >> On Nov 13, 2017, at 9:10 AM, Kristof Beyls <Kristof.Beyls at arm.com <mailto:Kristof.Beyls at arm.com>> wrote: >> >> Hi Quentin, >> >> My only remaining concern is around ABI compatibility. >> The following commit seems to indicate that in the previous round of evaluation, we didn’t find an existing ABI compatibility issue: >> http://llvm.org/viewvc/llvm-project?view=revision&revision=311388 <http://llvm.org/viewvc/llvm-project?view=revision&revision=311388>. >> I haven’t looked into the details of this issue - so maybe I’m worried over nothing? >> >> No, you’re right. The problem with ABI is if you are consistently wrong, then you won’t notice :). >> >> >> >> I’m wondering if since then on your side you did any testing around ABI compatibility? >> E.g. building software where you semi-randomly build some functions through GlobalISel and some functions through DAGISel? >> >> Justin will look into that. Clang has utility script for that utils/ABITest. >> >> Given we will only be able to check iOS ABI, you may want to follow the same kind of validation on your side. >> >> I let you sync up with Justin for the method. >> >> Cheers, >> -Quentin >> >> >> >> Thanks, >> >> Kristof >> >> On 8 Nov 2017, at 00:42, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> Hi all, >> >> I’d like to resurrect this thread and ask if people are on board for enabling this by default for AArch64 O0. >> >> >> *** What Changed Since June? *** >> >> - We added a way to describe the legalization actions for non-power-of-2 >> - We gave a tutorial that covers the best practices to target GlobalISel >> - We improved the TableGen backend to reuse existing SDISel patterns >> - We built and ran huge internal software with GISel >> - We evaluated the performance of GISel and are confident things are in a good shape (with https://reviews.llvm.org/D39034 <https://reviews.llvm.org/D39034>) and moving forward would look even better (see the last LLVM Dev talk: GlobalISel: Present, Past, and Future when it is available) >> >> >> *** So What’s he Plan? *** >> >> - Switch the default instruction selector to GISel for AArch64 at O0 >> - Enable the fallback path by default for AArch64 (with warnings enabled when that path is hit) >> - Provide a clang option to turn GISel off >> >> What do you think? >> >> Thanks, >> -Quentin >> >> >> On Jun 16, 2017, at 4:43 PM, Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at apple.com>> wrote: >> >> Hi all, >> >> We had some internal discussions about flipping the default for O0 and we concluded that we wanted to postpone it. >> >> >> *** Why Is That? *** >> >> We don’t want to send the wrong message that GlobalISel’s design is set in stone and ready for broader adoption. >> In particular, >> 1. The APIs are still evolving and can still possibly change significantly >> 2. The TableGen backend to reuse the existing SD patterns is still at its early stage >> 3. We want to investigate closely the performance of global-isel (compile-time, runtime, code size, fallbacks) >> >> The rationale behind those items is that we want to minimize the pain of moving forward for everybody. We also want the out-of-the-box experience to be pleasant (like all/most of the tablegen patterns just work, we have documentation on how to target a new backend, etc.) Finally, we want to gain confidence we are going to be able to address the performance issues we have with the current design and if not, derive a plan for that. >> >> We purposely left out of the conversation what will be the right time and requirements to flip the switch. We want to gather more data first. Your help would be appreciated! >> >> >> *** Short-Term Proposal *** >> >> What we would like to do instead short-term is: >> A. Repurpose or create an option “-aarch64-enable-global-isel-at-O” to enable GISel with fallbacks and warnings enables (i.e., equivalent of -global-isel -global-isel-abort=2) >> B. Advertise this option in the next open source release to allow compiler enthusiastic to try it and report problems >> C. Have GISel always built so we can push thing in the right place, MachineVerifier in mind, and stop doing some weird gymnastic >> >> What do people think? >> >> >> *** Your Help Is Needed *** >> >> - Please share your experience in using the GISel APIs and how we can make them better. Moving forward we’ll have those conversations on open source instead of internally/with a narrower audience. >> - Report any performance problem you identify >> - Propose patches! >> >> Cheers, >> -Quentin >> >> >> >> On Jun 16, 2017, at 3:06 PM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> >> On Jun 14, 2017, at 7:27 AM, Diana Picus <diana.picus at linaro.org <mailto:diana.picus at linaro.org>> wrote: >> >> On 12 June 2017 at 18:54, Diana Picus <diana.picus at linaro.org <mailto:diana.picus at linaro.org>> wrote: >> Hi all, >> >> I added a buildbot [1] running the test-suite with -O0 -global-isel. It runs into the same 2 timeouts that I reported previously on this thread (paq8p and scimark2). It would be nice to make it green before flipping the switch. >> >> >> I did some more investigations on a machine similar to the one running the buildbot. For paq8p and scimark2, I get these results for O0: >> >> PAQ8p: >> Fast isel: 666.344 >> Global isel: 731.384 >> >> SciMark2-C: >> Fast isel: 463.908 >> Global isel: 496.22 >> >> The current timeout is 500s (so in this particular case we didn't hit it for scimark2, and it ran successfully to completion). I don't think the difference between FastISel and GlobalISel is too atrocious, so I would propose increasing the timeout for these 2 benchmarks. I'm not sure if we can do this on a per-bot basis, but I see some precedent for setting custom timeout thresholds for various benchmarks on different architectures (sometimes with comments that it's done so we can run O0 on that particular benchmark). >> >> Something along these lines works: >> https://reviews.llvm.org/differential/diff/102547/ <https://reviews.llvm.org/differential/diff/102547/> >> >> What do you guys think about this approach? >> >> Looks reasonable to me. >> >> >> >> Thanks, >> Diana >> >> PS: The buildbot is using the Makefiles because that's what our other AArch64 test-suite bots use. Moving all of them to CMake is a transition for another time. >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171211/e3235c0f/attachment-0001.html>
Kristof Beyls via llvm-dev
2017-Dec-15 09:55 UTC
[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
I don’t know of any further issues preventing us flipping the switch. At this point, I’d aim to flip the switch shortly after the creation of the 6.0.0 release branch, so that GlobalISel can harden a bit more enabled-by-default on trunk before it goes into an LLVM release (presumably 7.0.0 then). Thanks, Kristof> On 11 Dec 2017, at 17:08, Amara Emerson <aemerson at apple.com> wrote: > > As of r320388 we’ve either fixed the blocker bugs or disabled big-endian on GISel, falling back to SDAG. Fixing at least one of the big-endian issues will need use to change the way we handle aggregates, which will take a bit longer (it’s next on my list of things to do). > > Do we have any other issues preventing us flipping the switch? > > Amara