thr3ads.net - llvm dev - [LLVMdev] scalarrepl tuning [Feb 2010]

If this information is useful, please help other people find it:
Share via:

Bob Wilson

2010-Feb-03 18:05 UTC

[LLVMdev] scalarrepl tuning

In svn r95224 I modified the scalar replacement (SROA) pass to use different
criteria to decide when it is likely to be profitable to split up an aggregate
into its separate elements. The commit message has a pretty decent explanation,
but I wanted to give some further detail here. I am hoping that the llvm-dev
list allows messages with attachments. If the graphs get stripped off, this
won't be very interesting.

I was inspired by Jakob's approach to using SCIENCE to adjust our inlining
heuristics, so I did a few experiments for SROA. For these experiments, I ran
the llvm nightly test benchmarks, including 3 different versions of SPEC and
various other applications. The first thing measured was the total size and
number of elements for each aggregate encountered by SROA.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sroa-stats.png
Type: image/png
Size: 6939 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20100203/23fb1e8a/attachment.png>
-------------- next part --------------

The old SROA code would only consider the points in the lower left quadrant:
size <= 128 bytes and elements <= 32. The obvious diagonal lines made me
realize that most of the large aggregates are arrays (duh!) and that it would be
a good idea to look at structs and arrays separately. So, I repeated the
experiment, looking only at struct types:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sroa-structs.png
Type: image/png
Size: 5427 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20100203/23fb1e8a/attachment-0001.png>
-------------- next part --------------

This confirmed my intuition that there are a reasonable number of large (in
bytes) structs with a relatively small number of elements. Those structs should
be candidates for SROA. I have not done the experiment to see how many of them
we actually succeed in transforming. The current limit of 32 elements looks to
me like a reasonable cutoff point. There are a few points above that green
line, and chances are that structs with so many elements will not be safe to
SROA anyway.

The last experiment was to test my intuition that SROA will rarely apply to
arrays, since it requires that they be accessed with only constant indices. I
was expecting that we could get away with setting a very low limit on the number
of array elements for SROA candidates. To test this, I compiled all the
benchmarks and accumulated counts of array sizes in cases where we succeeded
with SROA. I don't have a graph for this, but I was surprised in several
ways. First there were 8377 arrays transformed by SROA. That's a lot more
than I expected. Second, I was guessing that we would almost never split up an
array with more than 4 elements, but there were 350 cases where that happened.
I decided to go with a limit of 8 elements, since that caught all but 16 cases.

Chris Lattner

2010-Feb-04 06:15 UTC

head link

[LLVMdev] scalarrepl tuning

On Feb 3, 2010, at 10:05 AM, Bob Wilson wrote:
> The last experiment was to test my intuition that SROA will rarely apply to
arrays, since it requires that they be accessed with only constant indices.  I
was expecting that we could get away with setting a very low limit on the number
of array elements for SROA candidates.  To test this, I compiled all the
benchmarks and accumulated counts of array sizes in cases where we succeeded
with SROA.  I don't have a graph for this, but I was surprised in several
ways.  First there were 8377 arrays transformed by SROA.  That's a lot more
than I expected.  Second, I was guessing that we would almost never split up an
array with more than 4 elements, but there were 350 cases where that happened. 
I decided to go with a limit of 8 elements, since that caught all but 16 cases.
FWIW, a common reason for this is that loop unrolling can turn all accesses to
an array into constants.

-Chris

Reasonably Related Threads

Search for more maybe matching threads

llvm dev - Feb 2010 - [LLVMdev] scalarrepl tuning

[LLVMdev] scalarrepl tuning

[LLVMdev] scalarrepl tuning

Reasonably Related Threads