James Molloy
2014-Mar-31 18:02 UTC
[LLVMdev] Contributing the Apple ARM64 compiler backend
Hi all,

Firstly, thanks so much to Apple for open sourcing this and to Tim for going through the effort of committing it! Along with Bradley, I've been looking at this today from the perspective of working out how best to get this merge completed.

The one-sentence summary is "I think we should use ARM64 as a base".

My view on the backends is that the ARM64 backend is more performant but has correctness issues. What we want in the end is a performant, correct compiler. I think the best solution is to use ARM64 as the base for the merge, because detecting and fixing correctness issues is significantly easier than detecting and fixing performance issues.

The big pain issues I see merging from ARM64 to AArch64 are:

  1. Apple have created a fairly complete scheduling model already for ARM64, and we'd have to merge the partial? model in AArch64 and theirs. We risk regressing performance on Apple's targets here, and we can't determine ourselves whether we have or not. This is not ideal.
  2. Porting over the DAG-to-DAG optimizations and any other optimizations that rely on the tablegen layout will be very tricky.
  3. The conditional compare pass is fairly comprehensive - we'd have to port that over or rewrite it and that would be a lot of work.
  4. A very quick analysis last night indicated that ARM64 has implemented just under half of the optimizations we discovered opportunities for in SPEC and EEMBC. That's a fairly comprehensive number of optimizations, and they won't all be easy to port.

The big pain issues I see going the other way, from AArch64 to ARM64 are:

  1. Functional regressions. These are fairly easy to detect - we have a bunch of test suites and codegen faults are easy to spot (incorrect results). I've spent the day looking at the MC Hammer failures, and there aren't many very bad ones. Certainly none that are horrendous to fix.
  2. Performance on A53. But isn't it really just the scheduling model that needs updating? There are no A53-specific optimizations in Target/AArch64 that I know of.

Cheers,

James

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Bradley Smith
> Sent: 31 March 2014 18:19
> To: LLVM Developers Mailing List
> Subject: Re: [LLVMdev] Contributing the Apple ARM64 compiler backend
>
> > 1. Import the ARM64 backend into the public tree so it's easily
> > accessible for everyone to investigate.
> > 2. Test it. Benchmark it. Explore it. Get data for the community to
> > work with about the state of the back end. ARM has some excellent data
> > that will help guide us here.
> > 3. Identify the core backend to build on and to merge features and
> > tuning from the other into. The data I have seen so far leads me to
> > believe ARM64 is the correct choice here, but that's a decision
> > primarily for the contributors above.
> > 4. Clean up the codebase (C++11-ify it, for example), fix any
> > regressions and test failures identified in benchmarking.
> >
> > This will give us a backend that is a superset of both ARM64 and
> > AArch64 in terms both of performance and functionality. We can then
> > consolidate to a single backend, named AArch64 for consistency with the
> > current public tree.
>
> Our MC Hammer[0] testing on this shows that the ARM64 backend has around a
> 4% failure rate overall. One criterion for a successful merge would certainly
> be to retain the level of architectural correctness that is currently present
> in the AArch64 backend.
>
> Looking at the failures that are present in the ARM64 backend, it doesn't
> look like it would be too much work to fix up the MC layer to get this
> testsuite passing.
>
> [0] http://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf
>
> Regards,
> Bradley Smith
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Eric Christopher
2014-Mar-31 19:43 UTC
[LLVMdev] Contributing the Apple ARM64 compiler backend
> The big pain issues I see merging from ARM64 to AArch64 are:
>   1. Apple have created a fairly complete scheduling model already for
> ARM64, and we'd have to merge the partial? model in AArch64 and theirs. We
> risk regressing performance on Apple's targets here, and we can't determine
> ourselves whether we have or not. This is not ideal.
>   2. Porting over the DAG-to-DAG optimizations and any other
> optimizations that rely on the tablegen layout will be very tricky.
>   3. The conditional compare pass is fairly comprehensive - we'd have to
> port that over or rewrite it and that would be a lot of work.
>   4. A very quick analysis last night indicated that ARM64 has
> implemented just under half of the optimizations we discovered opportunities
> for in SPEC and EEMBC. That's a fairly comprehensive number of
> optimizations, and they won't all be easy to port.

There's already a working fast isel port as well, though I'm not sure how well tested that's been on Linux.

Tim?

> The big pain issues I see going the other way, from AArch64 to ARM64 are:
>   1. Functional regressions. These are fairly easy to detect - we have a
> bunch of test suites and codegen faults are easy to spot (incorrect
> results). I've spent the day looking at the MC Hammer failures, and there
> aren't many very bad ones. Certainly none that are horrendous to fix.

Nice.

>   2. Performance on A53. But isn't it really just the scheduling model
> that needs updating? There are no A53-specific optimizations in
> Target/AArch64 that I know of.

Agreed. There could be some small codegen changes; I'm not familiar with the A53.

-eric
Tim Northover
2014-Mar-31 19:48 UTC
[LLVMdev] Contributing the Apple ARM64 compiler backend
> There's already a working fast isel port as well. Though I'm not sure
> how well tested that's been on linux.
>
> Tim?

I've not tested FastISel on Linux at all, I'm afraid. In theory I'd expect only minor modifications to be necessary (around global variable materialisation, if anywhere).

Cheers.

Tim.
Renato Golin
2014-Mar-31 19:54 UTC
[LLVMdev] Contributing the Apple ARM64 compiler backend
On 31 March 2014 20:43, Eric Christopher <echristo at gmail.com> wrote:
>> 2. Performance on A53. But isn't it really just the scheduling model
>> that needs updating? There are no A53-specific optimizations in
>> Target/AArch64 that I know of.
>
> Agreed. There could be some small codegen changes, I'm not familiar
> with the A53.

I don't think there are any A53-specific optimizations in the AArch64 back-end. GCC used the A15 tuning as a starting point for A53-specific tuning, and it's already *a lot* better than what LLVM produces.

Not to mention that auto-vectorization is not turned on by default on AArch64, which is another thing that would massively improve the quality of the generated code. Does the ARM64 backend have it on?

cheers,
--renato
Eric Christopher <echristo <at> gmail.com> writes:
> > The big pain issues I see merging from ARM64 to AArch64 are:
> >   1. Apple have created a fairly complete scheduling model already for
> > ARM64, and we'd have to merge the partial? model in AArch64 and theirs. We
> > risk regressing performance on Apple's targets here, and we can't determine
> > ourselves whether we have or not. This is not ideal.
> >   2. Porting over the DAG-to-DAG optimizations and any other
> > optimizations that rely on the tablegen layout will be very tricky.
> >   3. The conditional compare pass is fairly comprehensive - we'd have to
> > port that over or rewrite it and that would be a lot of work.
> >   4. A very quick analysis last night indicated that ARM64 has
> > implemented just under half of the optimizations we discovered opportunities
> > for in SPEC and EEMBC. That's a fairly comprehensive number of
> > optimizations, and they won't all be easy to port.

Eric,

You mention that there are quite a few optimization opportunities in SPEC 2000/EEMBC. I am looking to optimize the AArch64 backend. Could you please let me know which of the big optimizations are possible?