thr3ads.net - llvm dev - [LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together? [Oct 2013]


David Nadlinger

2013-Oct-29 23:25 UTC

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

On Mon, Oct 28, 2013 at 10:09 AM, James Courtier-Dutton
<james.dutton at gmail.com> wrote:
> My guess is that this is a missed optimization, but in real life, all
> projects I have worked on fix this in the C or C++ code using macros that
> change what instructions are used based on target platform and its
> endianness.
One reason for writing code like this, i.e. explicitly spelling out
the accesses to the individual bytes, would be to allow compile-time
evaluation of the fragment in the D programming language, where
arbitrarily reinterpreting memory is not supported (although
integer->integer pointer casts might be supported at some point).
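
A minimal sketch of the kind of code in question, written in C++ rather than D. The analogy is an assumption: C++ `constexpr` evaluation stands in for D's compile-time evaluation here, since it also rejects `reinterpret_cast`. The helper name `load_le32` and the test bytes are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: assemble a 32-bit value from four bytes, treating
// p[0] as the least significant byte (little-endian order). Each byte is
// read individually, so no pointer reinterpretation is needed.
constexpr std::uint32_t load_le32(const std::uint8_t* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Illustrative input only.
constexpr std::uint8_t kBytes[4] = {0x78, 0x56, 0x34, 0x12};

// The whole expression is usable in a constant expression.
static_assert(load_le32(kBytes) == 0x12345678u, "evaluated at compile time");
```

On a little-endian target the body of `load_le32` is equivalent to a single 32-bit load, which is the rewrite being asked about.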

Would a patch adding this lowering to InstCombine or a similar pass
have a chance of being accepted, or would it be considered too rare
a special case to be worth the added complexity?

David

Hal Finkel

2013-Oct-29 23:53 UTC


----- Original Message -----
> On Mon, Oct 28, 2013 at 10:09 AM, James Courtier-Dutton
> <james.dutton at gmail.com> wrote:
> > My guess is that this is a missed optimization, but in real life, all
> > projects I have worked on fix this in the C or C++ code using macros
> > that change what instructions are used based on target platform and
> > its endianness.
> 
> One reason for writing code like this, i.e. explicitly spelling out
> the accesses to the individual bytes, would be to allow compile-time
> evaluation of the fragment in the D programming language, where
> arbitrarily reinterpreting memory is not supported (although
> integer->integer pointer casts might be supported at some point).
> 
> Would a patch adding this lowering to InstCombine or a similar pass
> have a chance of being accepted, or would it be considered too rare
> a special case to be worth the added complexity?

I think that a patch for this would be great; I've seen plenty of real-life
deserialization code that looks like this.

FWIW, some patterns like this (byte swapping, for example) are matched during
CodeGen (see DAGCombiner::visitOR in lib/CodeGen/SelectionDAG/DAGCombiner.cpp),
but there is no reason that this pattern cannot be recognized and canonicalized
early in the IR (and I think that it should be).
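
For reference, a byte swap spelled out with shifts and masks looks roughly like the sketch below. The function name `bswap32` is hypothetical, and nothing here guarantees that a particular compiler version reduces it to a single byte-swap instruction; it only illustrates the shape of an or-of-shifts idiom.

```cpp
#include <cstdint>

// Hypothetical helper: reverse the four bytes of x using only shifts,
// masks and `or`, i.e. the sort of pattern a combiner has to recognize.
constexpr std::uint32_t bswap32(std::uint32_t x) {
    return  (x << 24)                    // byte 0 -> byte 3
         | ((x << 8) & 0x00FF0000u)      // byte 1 -> byte 2
         | ((x >> 8) & 0x0000FF00u)      // byte 2 -> byte 1
         |  (x >> 24);                   // byte 3 -> byte 0
}

static_assert(bswap32(0x11223344u) == 0x44332211u, "bytes are reversed");
static_assert(bswap32(bswap32(0xDEADBEEFu)) == 0xDEADBEEFu,
              "swapping twice restores the value");
```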

 -Hal
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Ben Karel

2013-Oct-30 20:04 UTC


I wrote up this optimization as an LLVM IR pass last month, actually:
https://code.google.com/p/foster/source/browse/compiler/llvm/passes/BitcastLoadRecognizer.cpp

It recognizes trees of `or' operations where the leaves are
(buf[v+c] << c * sizeof(buf[0])).
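
The matching idea can be sketched on a toy expression tree. This is not the linked pass and does not use LLVM's API: the `Expr` type, `collectLeaves`, `matchWideLoad` and `demoMatch` are all hypothetical, and the sketch assumes `buf` is a byte array, so the leaf at offset c is shifted left by 8 * c bits.

```cpp
#include <cassert>

// Toy expression node (hypothetical, NOT LLVM IR). An `or` node has two
// children; a leaf stands for (buf[v + offset] << shift).
struct Expr {
    const Expr* lhs;   // non-null only for `or` nodes
    const Expr* rhs;   // non-null only for `or` nodes
    unsigned offset;   // leaf: constant byte offset c
    unsigned shift;    // leaf: left shift in bits
};

// Walk the tree and record which byte offsets appear. Fails on a leaf whose
// shift is not 8 * offset (little-endian layout) or on a repeated offset.
bool collectLeaves(const Expr& e, unsigned& seen) {
    if (e.lhs && e.rhs)
        return collectLeaves(*e.lhs, seen) && collectLeaves(*e.rhs, seen);
    if (e.offset >= 8 || e.shift != 8 * e.offset)
        return false;
    const unsigned bit = 1u << e.offset;
    if (seen & bit)
        return false;
    seen |= bit;
    return true;
}

// Returns the width in bytes (2, 4 or 8) of the single little-endian load
// that could replace the tree, or 0 if the tree does not match.
unsigned matchWideLoad(const Expr& root) {
    unsigned seen = 0;
    if (!collectLeaves(root, seen))
        return 0;
    const unsigned widths[] = {2, 4, 8};
    for (unsigned n : widths)
        if (seen == (1u << n) - 1u)   // exactly bytes 0 .. n-1
            return n;
    return 0;
}

// Usage: ((b0 | b1 << 8) | (b2 << 16 | b3 << 24)) matches a 4-byte load.
inline void demoMatch() {
    const Expr b0{nullptr, nullptr, 0, 0},  b1{nullptr, nullptr, 1, 8};
    const Expr b2{nullptr, nullptr, 2, 16}, b3{nullptr, nullptr, 3, 24};
    const Expr lo{&b0, &b1, 0, 0}, hi{&b2, &b3, 0, 0};
    const Expr root{&lo, &hi, 0, 0};
    assert(matchWideLoad(root) == 4);
    assert(matchWideLoad(lo) == 2);
    assert(matchWideLoad(hi) == 0);   // bytes 2..3 alone do not start at 0
}
```

A real pass would also have to check that every leaf loads from the same base pointer and that no intervening store can alias the buffer; the toy tree leaves both out.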

There are a few improvements needed to make it fit for general consumption;
it assumes (without checking) that it's targeting a little-endian
architecture, and it doesn't propagate alignment or inbounds information
from loads. It also does not recognize/generate byte swaps, though it would
be a pretty trivial extension to do so. It might also be nice to have the
dual optimization to coalesce stores rather than loads.
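
To make these points concrete, here is a small sketch of the two byte orders and of the store-side dual. All helper names (`load_le32`, `load_be32`, `store_le32`, `checkRoundTrip`) are hypothetical. The same four bytes yield byte-swapped results depending on the order in which they are assembled, which is why a recognizer that assumes little-endian needs either a target check or a byte swap.

```cpp
#include <cassert>
#include <cstdint>

// Little-endian assembly: p[0] is the least significant byte.
constexpr std::uint32_t load_le32(const std::uint8_t* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Big-endian assembly: p[0] is the most significant byte.
constexpr std::uint32_t load_be32(const std::uint8_t* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[2]) << 8)
         |  static_cast<std::uint32_t>(p[3]);
}

// Illustrative input only: the two results are byte swaps of each other.
constexpr std::uint8_t kBytes[4] = {0x78, 0x56, 0x34, 0x12};
static_assert(load_le32(kBytes) == 0x12345678u, "little-endian view");
static_assert(load_be32(kBytes) == 0x78563412u, "big-endian view");

// The dual pattern: four byte stores that a store-coalescing transform
// could merge into one 32-bit store on a little-endian target.
inline void store_le32(std::uint8_t* p, std::uint32_t v) {
    p[0] = static_cast<std::uint8_t>(v);
    p[1] = static_cast<std::uint8_t>(v >> 8);
    p[2] = static_cast<std::uint8_t>(v >> 16);
    p[3] = static_cast<std::uint8_t>(v >> 24);
}

// Storing piecewise and loading piecewise must round-trip.
inline void checkRoundTrip(std::uint32_t v) {
    std::uint8_t buf[4];
    store_le32(buf, v);
    assert(load_le32(buf) == v);
}
```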

If anyone wants to use this as a starting point for a patch, feel free!


