Hi Sjoerd,
I'm already using RDA in the pass I mentioned and it works great. Thanks
Sam!
Regarding the root cause, I didn't see anything obviously suboptimal in either
the copy coalescing or the register allocation, at least in my previous
example. Alternatively, we might want to improve what we pass on to RA, i.e.
remove the redundant copy earlier. At this point, however, it doesn't
(obviously) look like one (it is still using different vregs), which suggests
it might require a bit more work to discover something that will
ultimately lead to a redundant copy. I will investigate this option as well.
I'll take a look at the hardware-loop pass DCE code. Thanks for the pointer!
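
For reference, here is a rough sketch of the check from my original mail
(quoted below), written as a toy model rather than as real LLVM code. It is
not the actual pass: Inst and findRedundantCopies are hypothetical names, each
instruction defines at most one register, and the input is a single path of
blocks in which every block has exactly one predecessor. A real pass would
work on MachineInstrs and ask RDA for the reaching definitions instead of
doing this linear scan.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy instruction (hypothetical, not an LLVM type): writes at most one
// register. An instruction with several defs or clobbers would have to be
// modelled as several entries.
struct Inst {
  bool IsCopy;     // true for "Def = Src"
  std::string Def; // register written, empty if none
  std::string Src; // source register, only meaningful for copies
};

// Returns the indices of copies that repeat an earlier copy whose dest and
// source have not been redefined in between.
std::vector<std::size_t> findRedundantCopies(const std::vector<Inst> &Path) {
  std::map<std::string, std::string> Avail; // dest -> source of a valid copy
  std::vector<std::size_t> Redundant;

  for (std::size_t I = 0; I < Path.size(); ++I) {
    const Inst &MI = Path[I];
    if (MI.Def.empty())
      continue; // e.g. a branch: nothing is redefined

    if (MI.IsCopy) {
      auto It = Avail.find(MI.Def);
      if (It != Avail.end() && It->second == MI.Src) {
        Redundant.push_back(I); // dest already holds the source value
        continue;               // removing it leaves the state unchanged
      }
    }

    // MI redefines MI.Def: every recorded copy that reads or writes that
    // register is no longer valid.
    for (auto It = Avail.begin(); It != Avail.end();) {
      if (It->first == MI.Def || It->second == MI.Def)
        It = Avail.erase(It);
      else
        ++It;
    }

    if (MI.IsCopy && MI.Def != MI.Src)
      Avail[MI.Def] = MI.Src;
  }
  return Redundant;
}
```

Feeding it the path from the RISC-V snippet (the copy into s0, the beqz, then
the same copy in the preheader) flags only the second copy. If anything wrote
s0 or s2 in between, nothing would be flagged.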
Kind regards,
On Thu, 12 Mar 2020 at 20:50, Sjoerd Meijer <Sjoerd.Meijer at arm.com> wrote:
> + Sam
>
> Hi Roger,
>
> FWIW: we have observed redundant copies/moves; they have been annoying us for
> some time now, but we haven't got round to looking at it. Not sure if we
> are looking at exactly the same problem, but I guess so.
>
> Treating symptoms with post RA dead code elimination might be very
> effective, but it might also be worth just having a look at the source of
> the problem (regalloc?) to see if we are not missing something obvious.
>
> Regarding a post RA pass: you may want to have a look at the ARM
> hardware-loop pass. In order to make that beneficial, we have to do quite
> some dead code elimination post RA, both inside loops and in preheaders,
> see e.g. ARMLowOverheadLoops::IterationCountDCE. This is using
> ReachingDefAnalysis (RDA), which has been extended by Sam and made more
> generic to support this, which was also going to be his eurollvm talk:
> http://llvm.org/devmtg/2020-04/talks.html#LightningTalk_26. End of
> advertisement. ;-) Basically what I want to say is that this should provide
> most of the things you'll need.
>
> Cheers,
> Sjoerd.
>
>
>
> ------------------------------
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Roger
> Ferrer Ibáñez via llvm-dev <llvm-dev at lists.llvm.org>
> *Sent:* 12 March 2020 18:06
> *To:* LLVM-Dev <llvm-dev at lists.llvm.org>
> *Subject:* [llvm-dev] Redundant copies
>
> Hi all,
>
> we have encountered a case of redundant copies still left in the final
> code and we would like to, at least, mitigate it. The original motivating
> case comes from a context where we have large vector registers. In that
> context, copies are expensive and we would like to avoid them as much as
> possible.
>
> This small testcase in C, similar to the original vector case, exposes the
> issue but using scalars.
>
> long a, b;
> long fn1();
> long fn2() {
>   long c = a, d = c;
>   for (; b;) {
>     long e = fn1();
>     d = d + e;
>   }
>   long f = d - c;
>   return f;
> }
>
> For instance, in RISC-V we emit something like this, but other backends like
> ARM or X86 show the same behaviour.
>
> add s0, zero, s2 # ← copy
> beqz a0, .LBB0_3
> # %bb.1: # %for.body.preheader
> add s0, zero, s2 # ← not needed
> .LBB0_2: # %for.body
>
> Has anyone encountered an issue like this in the past?
>
> We are looking into removing these copies with a post RA pass to address
> the most obvious case: if we see a copy with the same physregs in dest and
> source as an earlier one and the reaching definition of the dest and source
> registers is one and the same, then that copy should be redundant.
>
> This might be too specific though, so perhaps there are better approaches?
>
> Thanks!
>
> --
> Roger Ferrer Ibáñez
>
--
Roger Ferrer Ibáñez