Hi all,

A bunch of us met at EuroLLVM to discuss the planned merge of the two current AArch64 backends in the tree.

The primary question was which backend should form the basis of the merge (since the core .td files aren't directly mergeable), with code being cherry-picked from the other on a case-by-case basis.

There were factors to consider both ways, but I think the key points of interest were agreed on by everyone:

1. That getting the merge done as quickly as possible was important to avoid duplicated effort and confusion among our users.
2. That neither performance nor correctness were particularly useful discriminators between the backends. Both are good enough to form the basis on those grounds.

Ana Pazos had managed to run some benchmarks on Cortex-A53 (an in-order CPU) which showed that porting a few simple cases across could reduce differences to low single digits, with winners in both directions. Similarly, people from ARM had managed to resolve most known correctness issues since the initial commits last week.

That leaves long-term maintenance and features as the remaining factors to make the decision: we want to spend as little effort as possible (in total) to do things on the backend, both now and in future.

In the short term, ARM64 is the clear choice; it simply has more features now: ELF, FastISel, and the two NEON syntaxes were mentioned. On the other side there was incipient big-endian and CPUs with different sub-features (NEON/FP/...).

Longer term, the question is much more difficult. Maintainability is often a matter of taste and there are issues with both backends (which we should do our best to resolve!): ARM64 has horrific handling of aliases and hacks in the various MC components (AsmParser, Disassembler, ...); AArch64 has similar contortions in the .td files (see loads/stores & instruction proliferation for aliases). ARM64 has a clean implementation of calling conventions; AArch64 has its sysreg lookup.

I don't think either has fundamental barriers to a clean design in future, personally, though AArch64 probably needs a couple more pushes to get there.

The tentative conclusion was that we probably have all the information we need available and should propose using the ARM64 backend as the basis on the list and continue discussion here.

Tim.
Hi again,

In my original message I was attempting to summarise the key arguments as I saw them. Other points came up in the discussion, which Ana kindly recorded and I'll summarise here:

First, extra arguments brought up in favour of each backend (I'll mention duplicates too so that the list is as complete as possible):

+ Register class usage in ARM64 is cleaner.
+ FastISel is on ARM64, but not AArch64. Some TableGen work will be needed to enable it because of how patterns are written there.
+ There is no macro support in AArch64.
+ Both NEON syntax variants (general & iOS) are supported by ARM64 now.
+ ARM64 assumes NEON is enabled by default, and indeed has no notion that a CPU might not have NEON. Instructions will need to be predicated to check that NEON is present, and some corresponding .cpp changes will probably be needed where NEON was also assumed.
+ Inline asm is possibly better in ARM64.
+ Anecdotal evidence suggests it's easier to debug MC layer issues on ARM64 than on AArch64.

Other important points that we discussed:

+ We need to set up a buildbot for performance using some real hardware (volunteers with hardware?) so patches can be validated on the supported targets. We also need one for correctness, using qemu.
+ Google is working on a framework to build and run benchmarks - to be available soon? It should enable the buildbot setup from the item above.
+ We need to sort out differences between the cortex-a53 and Cyclone model descriptions (both use the new approach for MI scheduler, but one requires annotating instructions and the other does not). We should pin down Andy and get him to describe the perfect machine model.

Cheers.

Tim
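As an aside on the NEON point above, the idea of predicating instructions on CPU features can be pictured with a small sketch. This is plain Python rather than TableGen, and it is only an illustration under stated assumptions: the `Instruction` class, the `selectable` helper, the feature labels, and the instruction groupings are all hypothetical and are not taken from either backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    mnemonic: str
    required_features: frozenset  # features a CPU must advertise to use this


# Illustrative entries only; feature labels and groupings are hypothetical.
INSTRUCTIONS = [
    Instruction("add (scalar integer)", frozenset()),
    Instruction("fadd (scalar)", frozenset({"fp"})),
    Instruction("add (vector)", frozenset({"neon"})),
    Instruction("aese", frozenset({"crypto"})),
    Instruction("crc32b", frozenset({"crc"})),
]


def selectable(cpu_features):
    """Return mnemonics whose required features are all present on the CPU."""
    cpu = frozenset(cpu_features)
    return [i.mnemonic for i in INSTRUCTIONS if i.required_features <= cpu]


if __name__ == "__main__":
    # A CPU with no optional extensions only gets the unpredicated entry.
    print(selectable([]))
    # Adding "neon" makes the vector instruction available as well.
    print(selectable(["fp", "neon"]))
```

The point of the sketch is that a backend which "has no notion that a CPU might not have NEON" behaves as if every entry had an empty requirement set; adding predicates means attaching a requirement to each such instruction and checking it against the selected CPU.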
Hi folks,

As Tim pointed out, we recently had the opportunity to collect 64-bit benchmark performance data for the GCC 4.9, AArch64 and ARM64 compilers on real hardware. It is a cortex-a53 device. For proprietary reasons we cannot share the full hardware configuration.

The preliminary results were shared at the hackers lab at EuroLLVM yesterday. For those who could not make it, the summarized performance data is below. A positive number means the ARM64 run is better by that percentage. A negative number means the baseline (GCC 4.9 or AArch64) is better by that percentage.

Tuning of the AArch64 backend on this processor has not been completed yet (some initial work has started on modeling cortex-a53). But we quickly investigated the bad vectorized code in some of the tests (Linpack for example) and identified straightforward fixes that improved AArch64 performance (similar patches are present in ARM64, e.g. loop unroll default limit, unaligned memory accesses, etc.). These patches are going to the AArch64 commits list for review.

This experiment indicates that, from the point of view of correctness and performance, either ARM64 or AArch64 could be the base compiler of choice if the known correctness issues (in ARM64) and lack of performance tuning (in AArch64) are addressed. However, much more work has to be done to catch up with the GCC 4.9 middle-end and backend optimizations.

Benchmark                       ARM64 vs GCC 4.9 (%)   ARM64 vs AArch64 (%)   ARM64 vs AArch64 patched (%)
EEMBC (no consumer) geomean             -17                     1                       -2
EEMBC (consumer only) geomean           -21                    -2                       -5
Linpack Double                          -29                    45                       -1
Linpack Single                          -51                    40                        1
SPEC2000 geomean                         -6                     0                        1

Thanks,
Ana.
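For readers unfamiliar with the sign convention and the "geomean" rows in Ana's data, here is a minimal Python sketch of how such figures are commonly computed. It is an assumption, not a description of how the numbers above were actually produced: it assumes higher scores are better and defines the percentage as the relative difference from the baseline. The function names and the sample scores are hypothetical.

```python
import math


def percent_difference(arm64_score, baseline_score):
    """Signed percentage by which ARM64 beats the baseline.

    Assumes higher scores are better; positive means ARM64 is ahead.
    """
    return (arm64_score / baseline_score - 1.0) * 100.0


def geomean(values):
    """Geometric mean of a sequence of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geomean_percent(arm64_scores, baseline_scores):
    """Geometric mean of per-test score ratios, expressed as a signed percentage."""
    ratios = [a / b for a, b in zip(arm64_scores, baseline_scores)]
    return (geomean(ratios) - 1.0) * 100.0


if __name__ == "__main__":
    # Made-up per-test scores for one suite, purely for illustration.
    arm64 = [120.0, 80.0, 100.0]
    baseline = [100.0, 100.0, 100.0]
    print(f"{geomean_percent(arm64, baseline):+.1f}%")
```

A geometric mean is the usual choice for suite summaries because a test that is twice as fast and one that is twice as slow cancel out, which an arithmetic mean of ratios would not give.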
Hi again,

Having heard no howls of protest, those of us remaining on the Wednesday decided to get down to planning a few more details of the merge.

David Kipping very kindly took notes, and we've produced the summary of the discussion below:

On Wednesday after the EuroLLVM meeting, a group met to continue discussing the ARMv8 backend merge and how to accelerate completion. Attending were James, Bradley, Tim, Jiangning, Kristof, Vinod, Chandler, Pierre, Ana, and David.

EuroLLVM provided a timely and convenient opportunity to meet in person to discuss this topic. But it is important to note that this is only one meeting and some issues have likely been missed, and that not everyone involved in the discussion was at EuroLLVM; everything below is open for further discussion and revision on the community lists.

Later in the mail are details on the work to complete the merge, but there is a lot of it, and participation from the community is warmly welcome. This is an excellent opportunity if you want to learn more about backends, the ARMv8 architecture, or just want to ensure that the community ARMv8 backend is of the highest quality and performance.

Some of the areas that have been identified as needing help are:

- Code reviews (there will be lots of changes, and quality of review and timeliness is critical)
- Merging regression tests from both ARMv8 backends - Tim will lead this effort but is looking for help
- Inline ASM (I think Eric said at the Hackers Lab that he might be willing to do this)
- Fix bugs
- For others who want to help test, compiling and running your codebases on QEMU (no crypto extensions)
- Code coverage analysis of backend
- Clean up the codebase (C++11-ify it, for example) - Jim will lead this effort
- In addition, any of the work items identified later in this mail

-----------------------------------------------------------------------------------------------------------------

Summary of the meeting:

The meeting reaffirmed the conclusion of the discussion at the EuroLLVM Hackerlab of using the ARM64 backend as the merge target.

It's important to merge as quickly as possible to avoid fragmentation of community efforts across two backends. Completing the merge in time for the 3.5 release shall be a stretch goal, but this will be very difficult because of the short time remaining, and may be missed. More important than schedule is to make sure the merge is done right, with a good design that is maintainable.

When the merge is complete, we need to delete AArch64 and rename ARM64 to AArch64 to avoid confusion. Also, alias together the arm64 and aarch64 triples to the merged backend.

We should try to minimize patches to AArch64 during the merge, but it is important to realize that this backend is being used for product releases and there are contributions in flight and more expected. Bugs should be filed for ARM64 when appropriate.

Work that needs to be completed before the merge is considered complete:

- No significant regressions: correctness, features, stability, performance. There will likely be exceptions, particularly in some performance subtests, that need to be addressed on a case by case basis

- Correctness
-- Merged backend passes LLVM test suite
-- Merged backend passes the invested parties' internal tests (Apple, ARM, QuIC) and should not have significant regressions. It should be recognized that this is a special situation as there are commercial releases being made on the two backends, and for adoption of the merge it is critical that there are no regressions. Examples of tests are: SPEC2000, SPEC2006, EEMBC, Geekbench, Coremark, MCHammer, Emperor (NEON)

- Performance - Difficult to have a precise and fixed baseline for measuring performance regressions on the merged backends because of variability in hardware, but all significant performance regressions must be investigated and justified as fix/notfix

- Feature parity - to the level found in the ARM64 and AArch64 backends today
-- big-endian
-- Optionality of ARMv8 architecture extension sets (no fpu, crypto, crc, ...)
-- A53 scheduler
-- Inline assembly
-- ACLE 2.0
--- Neon (chapter 12 of ACLE); probably there already on the ARM64 backend
--- Predefines
-- Proper guarding of platform-specific features (Cyclone, Darwin, ELF, …)
-- Regression tests from both backends merged

The following patches were identified in order to swap in the merged backend once the merge is completed:

- Delete AArch64 backend
- Move and rename ARM64 to AArch64 (changes file names and class names, and replaces all non-comment ARM64 strings with AArch64)
- Retarget ARM64 triples to merged backend
- Clean up any ARM64 references elsewhere in llvm subprojects

The following is the anticipated sequence of work leading to a merge:

- During merge, invested parties will frequently run their internal correctness, stability and performance tests. Report bugs as appropriate (ALL)
- System registers redesign, refactoring to use some more of tablegen resources, and bug fixes (45 patches from ARM were reviewed during the meeting)
- A53 scheduler - (Dave E, Ana, Andy) have already started discussing
- LLVM test suite run and report failures (Jiangning/Kevin/Hao)
- LLVM test suite enabled in the buildbot and testing ARM64 (Gabor)
- CSE of ADRP optimization (Jiangning)
- Making optional ARMv8 architecture extension sets optional in LLVM; no fpu, crypto, crc, ... (Jiangning/Kevin/Hao)
- Proper guarding of platform-specific features (Cyclone, Darwin, ELF, …) (Tim)
- Big-endian (James/Bradley/Kristof)
- Predefines (Bradley)
- Fix bugs (ALL)
- Backend switch patch-sets (Tim)

Communication during the merge

- Primary discussions will take place on llvmdev, llvm-commits, and IRC
- A top-level bug: http://llvm.org/bugs/show_bug.cgi?id=19392
- Depending on how things go, we may want to get together for some kind of telephone call. We'll send a message to the list if that happens.

I think that about covers it. If anyone has any questions, ask away!

Cheers.

Tim.
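The plan to alias the arm64 and aarch64 triples to the merged backend can be pictured with a very small sketch. This is illustrative Python only, not LLVM's triple-handling code: the `ARCH_ALIASES` table and the `backend_for_triple` helper are hypothetical names, and the example triples are just plausible spellings chosen for the demonstration.

```python
# Hypothetical table: architecture spellings that should all reach the
# single merged backend once the rename is done.
ARCH_ALIASES = {
    "arm64": "AArch64",
    "aarch64": "AArch64",
}


def backend_for_triple(triple):
    """Return the backend name for a target triple, or None if unknown.

    Only the first dash-separated component (the architecture) is inspected.
    """
    arch = triple.split("-", 1)[0].lower()
    return ARCH_ALIASES.get(arch)


if __name__ == "__main__":
    for t in ("arm64-apple-ios", "aarch64-linux-gnu", "x86_64-linux-gnu"):
        print(t, "->", backend_for_triple(t))
```

The design point is simply that both spellings resolve to one backend, so users of either triple see no change in behaviour after the old AArch64 directory is deleted and ARM64 is renamed.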
Eric Christopher
2014-Apr-14 06:09 UTC
[LLVMdev] Proposal: AArch64/ARM64 merge from EuroLLVM
> - Inline ASM (I think Eric said at the Hackers Lab that he might be
> willing to do this)

I am, yes.

> - For others who want to help test, compiling and running your
> codebases on QEMU (no crypto extensions)

Some reasonable description of how this works would be awesome.

> - Feature parity - to the level found in the ARM64 and AArch64 backends today

As a note this should definitely be "Today", as in the day you sent the email/had the meeting/etc. No new work should be considered part of the final sign off - basically a gentle chide for people to stop putting new features into the existing AArch64 backend :)

-eric
This sounds reasonable. Thanks, all.

> - CSE of ADRP optimization (Jiangning)

Quentin may have some input here. He’s done quite a lot of optimizations for ADRP sequences.

-Jim
Gabor Ballabas
2014-Apr-15 12:26 UTC
[LLVMdev] Proposal: AArch64/ARM64 merge from EuroLLVM
Hi Tim,

I just read this thread and I see that you mentioned the buildbot and my name.

> - LLVM test suite enabled in the buildbot and testing ARM64 (Gabor)

What exactly can I do to help you with the merge process?

Best regards,
Gabor Ballabas