Renato Golin via llvm-dev
2016-Nov-11 11:04 UTC
[llvm-dev] Is the correct behavior of getelementptr i192* for opt + llc -march=aarch64?
On 10 November 2016 at 04:29, MITSUNARI Shigeo via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Is your default target aarch64? Otherwise opt may be assuming a different
>> target which might explain the difference.
>
> No, My target is x86-64, x86, arm, aarch64, ..., then I'll avoid using i192* and datalayout.

I believe Tom's point was about the line:

% opt-3.8 -O3 a.ll -o - | llc-3.8 -O3 -o - -march=aarch64

If your host is x86_64, then the first call to opt will assume x86_64 unless you have a triple in the IR (which I believe you didn't). You can override with:

% opt-3.8 -march=aarch64 -O3 a.ll -o - | llc-3.8 -O3 -o - -march=aarch64

Or make sure your IR always has a triple and a data layout.

I'm not sure it would have made any difference in the i192* case, but it will have a noticeable impact on more complicated (and more target-specific) IR, so you should be careful.

Also, don't assume that OPT+LLC == LLC, as you'll be running more of the same passes in the first case, which can, in rare cases, have an impact (for better or worse) on the code generated.

I recommend you keep the passes to a minimum. Opt is a debug tool, not an optimiser. To generate target code, use llc directly, which will (should) have the same effect without command line flag duplication.

Better still, use Clang, or make sure your own front-end uses the middle and back ends in a consistent way, and use it instead of llc.

cheers,
--renato
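As a rough illustration of the "triple + layout" advice above, here is a minimal Python sketch that checks whether a textual IR file carries both module-level target lines before it is piped through opt or llc. The helper name `missing_target_info`, the sample function `@f`, and the triple and data layout strings are illustrative assumptions, not taken from the thread; take the real strings from your own toolchain.

```python
import re
from typing import List


def missing_target_info(ir_text: str) -> List[str]:
    """Return which of 'triple' / 'datalayout' are absent from textual IR."""
    missing = []
    for key in ("triple", "datalayout"):
        # Module-level lines look like: target triple = "..."
        pattern = rf'^\s*target\s+{key}\s*=\s*"[^"]*"'
        if not re.search(pattern, ir_text, re.MULTILINE):
            missing.append(key)
    return missing


# Illustrative module; the strings below are placeholders for this sketch.
example = '''
target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-gnu"

define i64 @f(i64* %p) {
  %v = load i64, i64* %p
  ret i64 %v
}
'''

if __name__ == "__main__":
    print(missing_target_info(example))
```

This is only a textual check. It does not verify that the triple and the layout agree with each other or with the -march value passed on the command line.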
MITSUNARI Shigeo via llvm-dev
2016-Nov-12 03:42 UTC
[llvm-dev] Is the correct behavior of getelementptr i192* for opt + llc -march=aarch64?
Hi Renato,

> I believe Tom's point was about the line:
>
> % opt-3.8 -O3 a.ll -o - | llc-3.8 -O3 -o - -march=aarch64
>
> If your host is x86_64, then the first call to opt will assume x86_64
> unless you have a triple in the IR (which I believe you didn't).

Thank you, I missed that, but I always run opt and llc natively on each host and do not mix them.
I'm sorry, I should have written
% opt -O3 a.ll -o - | llc -O3 -o - (on x86-64 / aarch64).

> % opt-3.8 -O3 a.ll -o - | llc-3.8 -O3 -o - -march=aarch64

> Also, don't assume that OPT+LLC == LLC, as you'll be running more of
> the same passes on the first case, which can, in rare cases, have an
> impact (for better or worse) on the code generated.
>
> I recommend you keep the passes to a minimum. Opt is a debug tool, not
> an optimiser.

I see, but I want load192() from my previous mail to be inlined, and llc alone does not inline it even if the alwaysinline attribute is added.

Yours,
Shigeo
Renato Golin via llvm-dev
2016-Nov-12 15:28 UTC
[llvm-dev] Is the correct behavior of getelementptr i192* for opt + llc -march=aarch64?
On 12 November 2016 at 03:42, MITSUNARI Shigeo via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Thank you, I missed that, but I always run opt and llc natively on each host and do not mix them.
> I'm sorry, I should have written % opt -O3 a.ll -o - | llc -O3 -o - (on x86-64 / aarch64).

Yes, that works if it's all native.

> I see, but I want load192() from my previous mail to be inlined, and
> llc alone does not inline it even if the alwaysinline attribute is added.

Right, and that shows my argument very well: opt+llc != llc alone.

This time, it worked for you and you got the result you wanted. Next time, it may work against you, and you'll be in a situation where you don't know whether to run opt beforehand, or how many times to run it, depending on the case.

I recommend you identify why opt is making a difference and submit a bug report.

In theory, opt will run the same passes in the same order as llc (give or take a few things), so this falls into two scenarios:

1. Some pass *after* inlining is reducing the inlining cost of the function you want to inline, so it only gets inlined on the second run. To see if this is the case, try to run opt twice on the IR (opt | opt) and see if the function is inlined the second time. If it is, then the "fix" would be working around the heuristics, or fiddling with your function to understand and correct the problem. Using --print-after-all will give you an idea of which pass is responsible for the simplification (hint: check the state of the IR just before the inlining phase on both runs, then trace back "who did it" when it worked).

2. Opt and llc are not running the same passes in the same order. With the --print-after-all results from the investigation above, you can try to re-order the passes via opt and see whether, if you run them in the same order as llc, you get only the out-of-line function. If this is the case, then changing llc to re-order the passes (and be like opt) could be an easy fix.

But always remember: both opt and llc are *debug* tools.
If you build a compiler with LLVM, you should use the middle and back end classes inside your front-end driver. Emitting IR and using opt/llc is only a way to bootstrap your front-end, and is not meant to be embedded in a final product.

As an example, if we change llc to be like opt here, all the other tools that rely on llc's behaviour will change unexpectedly, and this may break or generate worse code for them. You don't want to be in that situation.

But it would be good if both opt and llc behaved in similar ways, so that you can bootstrap products and run more reliable tests with them.

cheers,
--renato
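The --print-after-all investigation suggested in this message can be partly automated. The Python sketch below splits a saved dump into per-pass chunks and reports the first pass after which a caller no longer contains a call to a given function. It assumes the dump uses banners of the form "*** IR Dump After <pass name> ***", and it does a purely textual match instead of parsing the IR. The names `caller` and `callee` and the sample log are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Assumed banner format, e.g. "*** IR Dump After Simplify the CFG ***"
BANNER = re.compile(r'^.*\*\*\* IR Dump After (.+?) \*\*\*.*$', re.MULTILINE)


def split_dumps(log: str) -> List[Tuple[str, str]]:
    """Split a --print-after-all log into (pass name, IR text) pairs."""
    matches = list(BANNER.finditer(log))
    dumps = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(log)
        dumps.append((m.group(1), log[m.end():end]))
    return dumps


def defines(ir_text: str, func: str) -> bool:
    """Textual check: does this chunk contain the definition of @func?"""
    pattern = rf'^define\b[^\n]*@{re.escape(func)}\s*\('
    return re.search(pattern, ir_text, re.MULTILINE) is not None


def has_call(ir_text: str, callee: str) -> bool:
    """Textual check: does any call instruction still target @callee?"""
    pattern = rf'\bcall\b[^\n]*@{re.escape(callee)}\s*\('
    return re.search(pattern, ir_text) is not None


def first_pass_without_call(log: str, caller: str, callee: str) -> Optional[str]:
    """First pass after which @caller no longer calls @callee, else None."""
    seen_call = False
    for name, ir in split_dumps(log):
        if not defines(ir, caller):
            continue  # dump of some other function; ignore it
        if has_call(ir, callee):
            seen_call = True
        elif seen_call:
            return name
    return None


# Hypothetical, hand-written log used only to exercise the helpers.
sample_log = '''*** IR Dump After Simplify the CFG ***
define i64 @caller(i64* %p) {
  %v = call i64 @callee(i64* %p)
  ret i64 %v
}
*** IR Dump After Simplify the CFG ***
define i64 @callee(i64* %p) {
  %v = load i64, i64* %p
  ret i64 %v
}
*** IR Dump After Function Integration/Inlining ***
define i64 @caller(i64* %p) {
  %v = load i64, i64* %p
  ret i64 %v
}
'''

if __name__ == "__main__":
    print(first_pass_without_call(sample_log, "caller", "callee"))
```

Running the same helper on the logs from a single opt run and from the second run of opt | opt shows whether, and at which pass, the call disappears in each case. Working out which earlier pass made the inlining possible still means comparing the IR by hand.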