On 28 Sep 2016, at 16:50, Philip Reames via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> At this point, my general view is that widening transformations of any kind should be done very late. Ideally, this is something the backend would do, but doing it as a CGP like fixup pass over the IR is also reasonable.

I’m really glad to see that this is gone in GVN - it will reduce our diffs a lot when we do the next import. The GVN load widening is not sound in the presence of hardware-enforced spatial memory safety, so we ended up with the compiler inserting things that caused hardware bounds checks to fail, and we had to disable it in a few places.

If you’re reintroducing it, please can we have a backend option to specify whether it’s valid to widen loads beyond the extents (for example, for us it is not valid to widen an i16 load followed by an i8 load to an i32 load). Ideally, we’d also want the back end to supply some cost function: on MIPS, even when this is valid, the i32 load followed by the masks and shifts is a lot more expensive than the two loads (and increases register pressure, so has knock-on effects that mean that turning off this bit of GVN gave us *much* better code, even without any of the later optimisation opportunities being exploited).

David
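For concreteness, the invalid widening being described might look like the following hypothetical IR fragment (names, offsets, and alignments invented for illustration): the two original loads cover bytes 0-2 of %p, but the widened load also reads byte 3, which the program never accesses - exactly the access a byte-granularity bounds check will reject.

    ; original code: an i16 load and an i8 load covering bytes 0-2 of %p
    %p16 = bitcast i8* %p to i16*
    %lo  = load i16, i16* %p16, align 2
    %hip = getelementptr i8, i8* %p, i64 2
    %hi  = load i8, i8* %hip, align 1

    ; widened form: a single i32 load covering bytes 0-3 - byte 3 was
    ; never part of the original program's accesses
    %p32  = bitcast i8* %p to i32*
    %wide = load i32, i32* %p32, align 4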
Hi David,

David Chisnall via llvm-dev wrote:
> On 28 Sep 2016, at 16:50, Philip Reames via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> At this point, my general view is that widening transformations of any kind should be done very late. Ideally, this is something the backend would do, but doing it as a CGP like fixup pass over the IR is also reasonable.
>
> I’m really glad to see that this is gone in GVN - it will reduce our
> diffs a lot when we do the next import. The GVN load widening is not
> sound in the presence of hardware-enforced spatial memory safety, so
> we ended up with the compiler inserting things that caused hardware
> bounds checks to fail and had to disable it in a few places.
>
> If you’re reintroducing it, please can we have a backend option to
> specify whether it’s valid to widen loads beyond the extents (for
> example, for us it is not valid to widen an i16 load followed by an i8
> load to an i32 load). Ideally, we’d also want the back end to supply

Don't you have to mark the functions you generate as
"Attribute::SanitizeAddress"? We should definitely make the
speculative form of this transform (i.e. "load i32, load i16" -->
"load i64") predicated on Attribute::SanitizeAddress.

-- Sanjoy
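In IR terms, the speculative case might look like the hypothetical function below (names invented for illustration): the i32 and i16 loads cover bytes 0-5, so a widened i64 load would also read bytes 6-7, which the function never touches. The suggestion above is that when the function carries the sanitize_address attribute, that extra speculative read must not be introduced.

    define i64 @pair(i8* %p) sanitize_address {
      %p32  = bitcast i8* %p to i32*
      %lo   = load i32, i32* %p32, align 4        ; bytes 0-3
      %hip  = getelementptr i8, i8* %p, i64 4
      %p16  = bitcast i8* %hip to i16*
      %hi   = load i16, i16* %p16, align 2        ; bytes 4-5
      ; widening these to a single "load i64" would also read bytes 6-7
      %lo64 = zext i32 %lo to i64
      %hi64 = zext i16 %hi to i64
      %hish = shl i64 %hi64, 32
      %r    = or i64 %lo64, %hish
      ret i64 %r
    }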
Just to chime in that we have been seeing a bunch of instances on aarch64 as well where the load widening prevented LICM and, probably worse, inserted additional shift instructions into the critical path. This was showing up especially after r246985 (see for example http://llvm.org/PR25321), which leads to additional widening possibilities.

So I'd also strongly suggest optimizing only the obvious cases and staying away from variations that lead to additional shift instructions being inserted. (I missed D24096, but will check whether that fixes our performance problems as well.)

- Matthias

> On Sep 28, 2016, at 5:25 PM, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> [...]
> Don't you have to mark the functions you generate as
> "Attribute::SanitizeAddress"? We should definitely make the
> speculative form of this transform (i.e. "load i32, load i16" -->
> "load i64") predicated on Attribute::SanitizeAddress.
>
> -- Sanjoy
On 29 Sep 2016, at 01:25, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> [...]
> Don't you have to mark the functions you generate as
> "Attribute::SanitizeAddress"? We should definitely make the
> speculative form of this transform (i.e. "load i32, load i16" -->
> "load i64") predicated on Attribute::SanitizeAddress.

Nope, we’re not using the address sanitiser. Our architecture supports byte-granularity bounds checking in hardware. Note that, even without this, for pure MIPS code without our extensions, load widening generates significantly worse code than when it doesn’t happen.

I’m actually finding it difficult to come up with a microarchitecture where a 16-bit load followed by an 8-bit load from the same cache line would give worse performance than a 32-bit load, a mask, and a shift. In an in-order design, it’s more instructions to do the same work, and therefore slower. In an out-of-order design, the two loads within the cache line will likely be dispatched simultaneously and you’ll have less pressure on the register rename engine.

It seems like something that will only give wins if it exposes optimisation opportunities to other transforms and, as such, should probably be exposed as an analysis so that the later transforms can do the combine if it actually makes sense to do so.

David
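To put a rough shape on that cost comparison (hypothetical IR, little-endian layout assumed; at the IR level the extraction shows up as shifts and truncates, with the mask appearing during lowering): the widened form replaces two independent loads with one load feeding a dependent chain, and the wide value stays live until both pieces have been extracted.

    ; widened replacement for the i16 + i8 pair: one memory op, but a
    ; dependent chain of extra ALU instructions, and %wide stays live
    ; until both original values have been re-extracted
    %wide = load i32, i32* %p32, align 4
    %lo.w = trunc i32 %wide to i16      ; low 16 bits = the original i16
    %sh   = lshr i32 %wide, 16
    %hi.w = trunc i32 %sh to i8         ; bits 16-23  = the original i8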