Stepan Dyatkovskiy
2014-Feb-03 19:11 UTC
[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Hi all,

The previous patch has been split into a series of small changes. At each stage (after each patch) the MergeFunctions pass is compilable and stable.

Please find the patches attached for review.

To apply all patches at once, use the "apply-patches.sh" script as follows:
0. Place "apply-patches.sh" in the same directory as the patches.
1. cd <llvm-sources-dir>
2. "bash <path-to-patches-dir>/apply-patches.sh"

-Stepan

Stepan Dyatkovskiy wrote:
> Hi Nick,
> About partition refinement.
>
> It looks like the patch proposed here allows applying your idea in a
> painless way (as a series of small changes, actually).
>
> If I got it right, you would like to introduce the following
> partition refinement.
>
> The initial partition P consists of only one function class X0 (the whole
> module, a single bucket). Suppose S is a subset of X0, so S represents
> functions with some trait that is absent in the rest of X0.
>
> Then refinement looks like below:
>
> P` = refine(P, S):
>   X`[0] = intersection(X[0], S)
>   X`[1] = difference(X[0], S)  // X0 without S
>
> so P` = {X`[0], X`[1]}
>
> X`[0] and X`[1] are then classes of functions that differ by the
> trait defined by S.
>
> Then (suppose we need to introduce a new trait, call it S`):
>
> P`` = refine(P`, S`), so each X`` is defined like below:
> for each X`[i] in P`:
>   X``[i*2]   = intersection(X`[i], S`)
>   X``[i*2+1] = difference(X`[i], S`)
>
> and so on, until all the traits are exhausted.
>
> Finally we get classes (the set of Xs) that consist of equivalent
> functions, so we can merge functions within class bounds.
>
> Am I right?
>
> The patch proposed here can then be represented as such a partition
> refinement:
>
> At each refinement stage you need to introduce S somehow, i.e. an
> algorithm that answers the question: "what sort of difference do we want
> to check in this bucket?"
>
> In the existing implementation we check these traits already, but pair-wise
> (attributes, calling convention, number of arguments, number of basic
> blocks, etc.).
> How to use the same traits in partition refinement?
>
> Consider the next example. Suppose we want to split some bucket X:
>
> 0. Take some trait from the *pair-wise* comparison. For example: number
>    of arguments.
> 1. Take some function from bucket X, let it be F1. Let F1 have N1
>    arguments.
> 2. Then S is the set of functions with number of arguments *less* than N1.
> 3. Do refinement by S. As a result we get X`[0] and X`[1].
> 4. If both X`[0] and X`[1] are non-empty, then repeat #1-#3 for them.
> 5. If either X`[0] or X`[1] is an empty set, then all functions in X are
>    the same in the context of trait S. We need another trait: go to #0,
>    and repeat #0-#5 for all X` buckets we have.
>
> Finally we get groups of equivalent functions, and these groups are
> in fact organized in a binary tree.
> Most of this is implemented by std::map. I showed it as #0-#5 just to
> put it into the context of your idea. The only problem was to introduce
> a way to define the order relation (i.e. how to define S). But this
> problem has been solved.
> And on output we get groups of equal functions.
>
> Fix me if I'm wrong. But it looks like every advantage given by
> partition refinement is present here.
>
> As I said in the beginning, this is a way to commit the things you
> proposed with a minimum of changes. We will use the old comparison
> routines; the only thing we need to change is the return type (from bool
> to -1,0,1).
> In this way, partition refinement can be committed as a series of small
> changes: one comparison routine per commit.
>
> Every change in MergeFunctions from now on can be checked well by this
> builder:
> We now have a special builder for MergeFunctions. It runs the test-suite
> with the MergeFunctions pass forced on:
> http://lab.llvm.org:8011/builders/clang-mergefunc-x86_64-freeBSD9.2
> Currently the MergeFunctions pass is stable: the builder is green and
> passes all test-suite programs.
>
> -Stepan.
>
> Stepan Dyatkovskiy wrote:
>> Hi all,
>> Please find the updated patch in attachment:
>> * Added some comments.
>> * Fixed some typos.
>>
>> -Stepan
>>
>> Nick Lewycky wrote:
>>> On 30 January 2014 01:27, Stepan Dyatkovskiy <stpworld at narod.ru> wrote:
>>>
>>> Hello Sean and Tobias,
>>>
>>> Sean,
>>> Thank you. Could you describe Nick's ideas in a few words or give me
>>> links to your discussion, so I could adapt my ideas to them.
>>>
>>>
>>> Sorry I haven't had time to write it up. The essential idea is that we
>>> use partition refinement to determine the differences between a group of
>>> functions instead of using pair-wise comparison.
>>>
>>> Imagine you want to compare long sequences of numbers. You can grab a
>>> pair of sequences and run down them until they differ or you reach the
>>> end, and move on to another pair of sequences. With partition refinement
>>> you start with the first number in all the sequences and bucket them.
>>> Say, three 1's, two 2's, and an 8. You've just formed 3 buckets. For
>>> each bucket (with more than 1 member), do it again with the next number
>>> in the sequence. This gets you an equivalence comparison at less than
>>> the complexity of pairwise comparisons. (The buckets are partitions, and
>>> you're refining them as you read more data. Partition refinement is the
>>> dual of union-find, which we use quite heavily in the compiler.)
>>>
>>> I haven't figured out how this stacks up against Stepan's sorting with
>>> <=> comparison. I can probably do that, I just haven't spent the time
>>> thinking about it.
>>>
>>> I also haven't thought through how to integrate it with "near miss"
>>> comparison. My current leading idea is that you do the same thing, but
>>> make a record of what makes two buckets similar. (Note that the wrong
>>> solution is to store a per-function-pair list of
>>> similarities/differences.)
>>>
>>> Nick
>>>
>>> Tobias,
>>> Your patch fails on several modules in my benchmark (73 of ~1800
>>> tests). I have sent one as an attachment.
>>>
>>> See the statistics files for more details; you can find all the .ll
>>> files in the test-suite object directory (after "make TEST=nightly
>>> report").
>>> I have attached the script that launches the benchmark. Example of
>>> usage (you need a Linux OS):
>>> bash test-scipt.sh report.txt opt-no-patch opt-with-patch
>>>
>>> Some numbers.
>>>
>>> Total number of functions: 52715
>>>
>>> Below, by default, I have excluded modules where your patch fails.
>>>
>>> Functions merged
>>> Original version: 1113
>>> Order relation patch: 1112
>>> Similar merging patch: 541
>>>
>>> Functions merged (including failed tests)
>>> Original version: 8398
>>> Order relation patch: 8395
>>>
>>> Summary files size
>>> Initial: 163595634 bytes.
>>> After merging with order relation patch: 163147731 bytes.
>>> After similar merging patch: 162491905 bytes.
>>>
>>> Summary files size (including failed tests)
>>> Initial: 250242266 bytes.
>>> Original version: 247168198 bytes.
>>> Order relation: 247175650 bytes.
>>>
>>> Time. I measured with the "time" utility, and used the "real (wall
>>> clock) time used by the process".
>>>
>>> Summary time spent
>>> With existing version: 28.05 secs.
>>> With order relation patch: 28.19 secs.
>>> Similar merging patch: 28.61 secs.
>>>
>>> Summary time spent (including failed tests)
>>> With existing version: 41.74 secs.
>>> With order relation patch: 36.35 secs.
>>>
>>> -Stepan
>>>
>>> Sean Silva wrote:
>>>
>>> On Tue, Jan 28, 2014 at 2:47 PM, Tobias von Koch
>>> <tobias.von.koch at gmail.com> wrote:
>>>
>>> Hi Stepan,
>>>
>>> Sorry for the delay.
>>> It's great that you are working on MergeFunctions as well, and I
>>> agree: we should definitely try to combine our efforts to improve
>>> MergeFunctions.
>>>
>>> Just to give you some context, the pass (with the similar function
>>> merging patch) is already being used in a production setting. From
>>> my point of view, it would be better if we focus on improving its
>>> capability of merging functions at this stage rather than on
>>> reduction of compilation time.
>>>
>>> I'll give a brief overview of how our similar function merging
>>> algorithm works (you can also watch the presentation from the US
>>> LLVM conference to hear me explain it using an animation).
>>>
>>> 1. Hash all functions into buckets.
>>>
>>> In each bucket, separately:
>>>
>>> 2. Compare functions pair-wise and determine a similarity metric for
>>> each pair (%age of equivalent instructions).
>>>
>>> 3. Merge identical functions (with similarity = 100%), update call
>>> sites for those functions.
>>>
>>> 4. If the updates of call sites have touched other functions,
>>> go back to step 2 and re-compare *only* those functions to all
>>> others in their buckets.
>>>
>>> Finally,
>>>
>>> 5. Form groups of similar functions for merging:
>>>    a) Pick the most similar pair (A,B).
>>>    b) Find all functions C that are also similar to B but are not
>>>       more similar to some other function D.
>>>    c) Merge A with B and all the C's.
>>> Repeat until there are no more functions to merge.
>>>
>>> As you can see, we need to compute a similarity measure for each
>>> pair of functions in a bucket. If I understand correctly, your patch
>>> reduces compile time by avoiding comparisons of functions that
>>> cannot be identical. That's exactly what you want to do if you only
>>> want to merge identical functions.
>>> However, because we also need to
>>> determine whether functions are merely *similar*, I can't see how
>>> your idea could be applied in that case.
>>>
>>>
>>> Yeah, the existing pass only tries to merge identical functions
>>> (identical in machine code). I'm wondering if the "similar" function
>>> merging would be better as a separate pass (I'm genuinely wondering;
>>> I have no clue). I would wait for Nick's feedback.
>>>
>>> -- Sean Silva
>>>
>>>
>>> Looking at your statistics, I see that you are considering a very
>>> large number of functions. I don't know which benchmarks you are
>>> using, but I doubt that many of these are worth merging. For
>>> instance, you rarely gain a code size benefit from merging functions
>>> with just a few instructions. Our patch takes care of this using
>>> thresholds. It's worth looking at actual code size reductions,
>>> rather than just numbers of functions merged / compilation time
>>> spent comparing functions. It turns out that the number of functions
>>> in each bucket is really quite small once you apply some heuristics
>>> (and could still be further reduced!).
>>>
>>> Given your experience with MergeFunctions, it would be really great
>>> if you could review our patch and also try it out on your
>>> benchmarks.
>>>
>>> Tobias
>>>
>>>
>>> On 24/01/2014 19:11, Stepan Dyatkovskiy wrote:
>>>
>>> Hi Tobias.
>>>
>>> So, what do you think?
>>>
>>> If it means too much extra work for your team, maybe I can help you
>>> somehow? I really would like to.
>>>
>>> -Stepan
>>>
>>> Stepan Dyatkovskiy wrote:
>>>
>>> Hi Tobias,
>>>
>>> I can't really see a way to combine our approach with your patch.
>>> What are your thoughts?
>>>
>>>
>>> I think it is possible. Even more: we need to combine our efforts in
>>> order to bring this pass into real life.
>>> I have read your presentation file, but unfortunately have read your
>>> patch only a little.
>>> How exactly do you scan functions at the 2nd stage? Could you
>>> explain the algorithm in a few words - how do you compare workflows?
>>> Is it possible to scan a binary tree instead of a hash table? ...OK.
>>>
>>> That's how I see the modification. For now it's important to explain
>>> the idea, so consider the simplest case: you need to catch two
>>> functions that differ by a single instruction somewhere.
>>>
>>> 1. Imagine that the IR *module* contents are represented as a binary
>>> tree:
>>> Each line (trace) from the root to a terminal node is a function.
>>> Each node is a function's primitive (instruction opcode, for
>>> example).
>>> Down is the direction towards terminal nodes; up is the direction
>>> towards the root.
>>>
>>> 2. Now you are standing on some node, and you have two directions:
>>> down-left and down-right.
>>>
>>> 3. If you are able to find two equal sub-traces down (at left and at
>>> right), then the only difference lies in this node. Then we catch
>>> that case.
>>>
>>> 4. Even more: if the two subtrees at left and at right are equal,
>>> then you catch at once all the cases that are represented by these
>>> subtrees.
>>>
>>> I still didn't look at your patch carefully. Sorry...
>>> But I hope that helps, and I'll try to look at it in the nearest
>>> time; perhaps what I gave in this post is not the best solution.
>>>
>>> -Stepan
>>>
>>> 22.01.2014, 20:53, "Tobias von Koch" <tobias.von.koch at gmail.com>:
>>>
>>>
>>> Hi Stepan,
>>>
>>> As you've seen, we have recently implemented a significant
>>> enhancement to the MergeFunctions pass that also allows merging of
>>> functions that are only similar but not identical
>>> (http://llvm-reviews.chandlerc.com/D2591).
>>>
>>> Our patch also changes the way that functions are compared quite
>>> significantly. This is necessary because we need to compare all
>>> functions in a bucket in order to get a similarity measure for each
>>> pair, so we can then decide which 'groups' of functions to merge.
>>>
>>> I can't really see a way to combine our approach with your patch.
>>> What are your thoughts?
>>>
>>> Another way to improve the performance of MergeFunctions might be to
>>> make the hash function better. I've already added the size of the
>>> first BB to it, and perhaps there are other factors we could try...
>>> if we don't have to compare functions in the first place (because
>>> they're in different buckets) then that's obviously a much bigger
>>> win.
>>>
>>> Thanks,
>>> Tobias
>>>
>>> On 17/01/2014 20:25, Stepan Dyatkovskiy wrote:
>>>
>>> Hi all,
>>>
>>> I propose a simple improvement for the MergeFunctions pass that
>>> reduces its complexity from O(N^2) to O(log(N)), where N is the
>>> number of functions in the module.
>>>
>>> The idea is to change the result of comparison from "bool" to
>>> "-1,0,1". In other words: define an order relation on the set of
>>> functions.
>>> To be sure that functions can be compared that way, we have to
>>> prove that an order relation is possible at every comparison stage.
>>>
>>> The latter is possible if each comparison stage we implement has the
>>> following properties:
>>> * reflexivity (a <= a, a == a, a >= a),
>>> * antisymmetry (if a <= b and b <= a, then a == b),
>>> * transitivity (if a <= b and b <= c, then a <= c),
>>> * totality (if not a < b, then a > b or a == b).
>>>
>>> Once we have defined the order relation, we can store all the
>>> functions in a binary tree and perform lookup in O(log(N)) time.
>>>
>>> This post has two attachments:
>>> 1. The patch that implements this idea.
>>> 2. A detailed description of the MergeFunctions pass, with an
>>> explanation of how the order relation is possible.
>>>
>>> Hope it helps to make things better!
>>>
>>> -Stepan.
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu
>>> http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
Non-text attachments were scrubbed by the list archive:

0001-types-comparison.patch (text/x-diff, 4893 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment.patch>
0002-check-for-lossless-bitcast-adaptation.patch (text/x-diff, 3797 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0001.patch>
0003-constants-comparison.patch (text/x-diff, 6479 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0002.patch>
0004-constant-comparison-additional-checks.patch (text/x-diff, 2066 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0003.patch>
0005-values-enumeration-routine.patch (text/x-diff, 5296 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0004.patch>
0006-attributes-comparison.patch (text/x-diff, 2507 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0005.patch>
0007-operations-comparison.patch (text/x-diff, 9654 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0006.patch>
0008-GEP-comparison.patch (text/x-diff, 3566 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0007.patch>
0009-BB-comparison.patch (text/x-diff, 3821 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0008.patch>
0010-top-level-compare-method.patch (text/x-diff, 4161 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0009.patch>
0011-sanity-check.patch (text/x-diff, 3987 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0010.patch>
0012-removed-unused-methods.patch (text/x-diff, 2067 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0011.patch>
0013-FnSet-replaced-with-FnTree.patch (text/x-diff, 6662 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0012.patch>
0014-updated-comments.patch (text/x-diff, 3633 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0013.patch>
0015-removed-dense-map-helpers.patch (text/x-diff, 4530 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment-0014.patch>
apply-patches.sh (application/x-sh, 199 bytes)
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140203/fcba0937/attachment.sh>
Stepan Dyatkovskiy
2014-Feb-10 17:40 UTC
[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
ping Stepan Dyatkovskiy wrote:> Hi all, > > Previous patch has been split onto series of small changes. > On each stage (after each patch) MergeFunctions pass is compilable and > stable. > > Please find patches in attachment for review. > > To apply all patches at once, use "apply-patches.sh" script as follows: > 0. Place "apply-patches.sh" in same directory with patches. > 1. cd <llvm-sources-dir> > 2. "bash <path-to-patches-dir>/apply-patches.sh" > > -Stepan > > Stepan Dyatkovskiy wrote: >> Hi Nick, >> About partition refinement. >> >> It looks like patch proposed here allows to apply your idea with >> painless way (as series of small changes actually). >> >> If I got it right, you would like to introduce the next >> partition-refinement. >> >> Initial partition P consists of only one functions class X0 (whole >> module, single bucket). Suppose S is subset of X0, so S represents >> functions with some trait that is absent in rest part of X0. >> >> Then refinement looks like below: >> >> P`=refine(P, S): >> X`[0] = intersection(X[0], S) >> X`[1] = difference(X[0], S) // X0 without S0 >> >> so P`={X`[0], X`[1]} >> >> And actually X`[0] and X`[1] are classes of functions that differs with >> trait defined by S. >> >> Then (suppose, we need to introduce new trait, so it would be S`): >> >> P``=refine(P`, S`), so each X`` could be defined like below: >> for each X`[i] in P`: >> X``[i*2] = intersection(X`[i], S`) >> X``[i*2+1] = difference(X`[i], S`) >> >> and so on, until all the traits are over. >> >> Finally we got the classes (set of Xs) that consists of equivalent >> functions. So, we can merge functions inside the class bounds. >> >> Am I right? >> >> Patch, that was proposed here could be represented as such partition >> refinement then: >> >> On each refinement stage, you need to introduce S somehow, the algorithm >> that answers the question: "what sort of difference we want to check in >> this bucket?" 
>> >> In existing implementation we check traits already, but pair-wise >> (attributes, calling convention, number of arguments, number of basic >> blocks, etc.). How to to use the same traits in partition refinement? >> >> Consider next example. Suppose we want to split some bucket X: >> >> 0. Take some trait from *pair-wise* comparison. For example: number of >> arguments. >> 1. Take some function from bucket X, let it be F1. Let F1 has N1 number >> of arguments. >> 2. Then S are functions with number of arguments *less* than N1. >> 3. Do refinement by S. As result we got X`[0] and X`[1]. >> 4. If both of X`[0] and X`[1] are not empty, then repeat #1-#3 for them. >> 5. If some of X`[0] of X`[1] is empty set, then all functions in X are >> the same in context of trait S. We need another trait, go to #0. And >> repeat #0-#5 for all X` buckets we have. >> >> Finally we got the groups of equivalent functions. And actually these >> groups would be organized in binary tree. >> Most of stuff is implemented in std::map. I showed it as #0-#5 just to >> put it into context of your idea. The only problem was to introduce way >> of how to define order-relation (or how to define S). But this problem >> has been solved. >> And on output we got groups of equal functions. >> >> Fix me if I'm wrong. But it looks like every advantage given by >> partition refinement is present here. >> >> As I said in beginning, it is way of how to commit things you proposed >> with minimum of changes. We will use old comparison routines, only thing >> we need to change is return type (from bool to -1,0,1). >> In this way, partition refinement could be committed as series of small >> changes: one comparison routine per commit. >> >> Every change in MergeFunctions from now could be well checked by this >> builder: >> Now we have special builder for MergeFunctions. 
It runs test-suite with >> forced MergeFunctions pass: >> http://lab.llvm.org:8011/builders/clang-mergefunc-x86_64-freeBSD9.2 >> Currently, MergeFunctions pass is stable, builder is green and passes >> all test-suite programs. >> >> -Stepan. >> >> Stepan Dyatkovskiy wrote: >>> Hi all, >>> Please find the updated patch in attachment: >>> * Added some comments. >>> * Fixed some typos. >>> >>> -Stepan >>> >>> Nick Lewycky wrote: >>>> On 30 January 2014 01:27, Stepan Dyatkovskiy <stpworld at narod.ru >>>> <mailto:stpworld at narod.ru>> wrote: >>>> >>>> Hello Sean and Tobias, >>>> >>>> Sean, >>>> Thank you. Could you describe Nick's ideas in few words or give me >>>> links to your discussion, so I could adapt my ideas to it. >>>> >>>> >>>> Sorry I haven't had time to write it up. The essential idea is that we >>>> use partition-refinement to determine the differences between a >>>> group of >>>> functions instead of using pair-wise comparison. >>>> >>>> Imagine you want to compare long sequences of numbers. You can grab a >>>> pair of sequences and run down them until they differ or you reach the >>>> end, and move on to another pair of sequences. With partition >>>> refinement >>>> you start with the first number in all the sequences and bucket them. >>>> Say, three 1's, two 2's, and an 8. You've just formed 3 buckets. For >>>> each bucket (with more than 1 member), do it again with the next number >>>> in the sequence. This gets you an equivalence comparison at less than >>>> the complexity of pairwise comparisons. (The buckets are partitions, >>>> and >>>> you're refining them as you read more data. Partition refinement is the >>>> dual of union-find, which we use quite heavily in the compiler.) >>>> >>>> I haven't figured out how this stacks up against Stepan's sorting with >>>> <=> comparison. I can probably do that, I just haven't spent the time >>>> thinking about it. >>>> >>>> I also haven't thought through how to integrate it with "near miss" >>>> comparison. 
My current leading idea is that you do the same thing, but >>>> make a record of what makes two buckets similar. (Note that the wrong >>>> solution is to store a per-function-pair list of >>>> similarities/differences.) >>>> >>>> Nick >>>> >>>> Tobias, >>>> Your patch fails on several modules in my benchmark (73 of ~1800 >>>> tests). I have sent one as attachment. >>>> >>>> See statistics files for more details, all the .ll files you could >>>> simply find in test-suite object directory (after "make >>>> TEST=nightly >>>> report"). >>>> I have attached script that launches benchmark. Example of usage >>>> (you need Linux OS): >>>> bash test-scipt.sh report.txt opt-no-patch opt-with-patch >>>> >>>> Some numbers. >>>> >>>> Total number of functions: 52715 >>>> >>>> Below, by default, I have excluded modules where your patch fails. >>>> >>>> Functions merged >>>> Original version: 1113 >>>> Order relation patch: 1112 >>>> Similar merging patch: 541 >>>> >>>> Functions merged (including failed tests) >>>> Original version: 8398 >>>> Order relation patch: 8395 >>>> >>>> Summary files size >>>> Initial: 163595634 bytes. >>>> After merging with order relation patch: 163147731 bytes. >>>> After similar merging patch: 162491905 bytes. >>>> >>>> Summary files size (including failed tests) >>>> Initial: 250242266 bytes. >>>> Original version: 247168198 bytes. >>>> Order relation: 247175650 bytes. >>>> >>>> Time. I measured with "time" utility, and used "real (wall clock) >>>> time used by the process". >>>> >>>> Summary time spent >>>> With existing version: 28.05 secs. >>>> With order relation patch: 28.19 secs. >>>> Similar merging patch: 28.61 secs. >>>> >>>> Summary time spent (including failed tests) >>>> With existing version: 41.74 secs. >>>> With order relation patch: 36.35 secs. 
>>>> >>>> -Stepan >>>> >>>> Sean Silva wrote: >>>> >>>> >>>> >>>> >>>> On Tue, Jan 28, 2014 at 2:47 PM, Tobias von Koch >>>> <tobias.von.koch at gmail.com <mailto:tobias.von.koch at gmail.com> >>>> <mailto:tobias.von.koch at gmail.__com >>>> <mailto:tobias.von.koch at gmail.com>>> wrote: >>>> >>>> Hi Stepan, >>>> >>>> Sorry for the delay. It's great that you are working on >>>> MergeFunctions as well and I agree, we should definitely >>>> try to >>>> combine our efforts to improve MergeFunctions. >>>> >>>> Just to give you some context, the pass (with the similar >>>> function >>>> merging patch) is already being used in a production >>>> setting. From >>>> my point of view, it would be better if we focus on >>>> improving its >>>> capability of merging functions at this stage rather >>>> than on >>>> reduction of compilation time. >>>> >>>> I'll give a brief overview of how our similar function >>>> merging >>>> algorithm works (you can also watch the presentation from >>>> the US >>>> LLVM conference to hear me explain it using an animation). >>>> >>>> 1. Hash all functions into buckets >>>> >>>> In each bucket, separately: >>>> >>>> 2. Compare functions pair-wise and determine a >>>> similarity metric for each pair (%age of equivalent >>>> instructions) >>>> >>>> 3. Merge identical functions (with similarity = 100%), >>>> update call >>>> sites for those functions. >>>> >>>> 4. If the updates of call sites have touched other >>>> functions, >>>> go back to step 2 and re-compare *only* those >>>> functions to all >>>> others in their buckets. >>>> >>>> Finally, >>>> >>>> 5. Form groups of similar functions for merging: >>>> a) Pick most similar pair (A,B) >>>> b) Find all functions C that are also similar to B >>>> but >>>> are not >>>> more similar to some other function D. >>>> c) Merge A with B and all the C's. >>>> Repeat until there are no more functions to merge. 
>>>> >>>> As you can see, we need to compute a similarity measure >>>> for >>>> each >>>> pair of functions in a bucket. If I understand correctly, >>>> your patch >>>> reduces compile-time by avoiding comparisons of functions >>>> that >>>> cannot be identical. That's exactly what you want to do if >>>> you only >>>> want to merge identical functions. However, because we >>>> also >>>> need to >>>> determine whether functions are merely *similar*, I can't >>>> see how >>>> your idea could be applied in that case. >>>> >>>> >>>> Yeah, the existing pass only tries to merge identical functions >>>> (identical in machine code). I'm wondering if the "similar" >>>> function >>>> merging would be better as a separate pass (I'm genuinely >>>> wondering; I >>>> have no clue). I would wait for Nick's feedback. >>>> >>>> -- Sean Silva >>>> >>>> >>>> Looking at your statistics, I see that you are considering >>>> a very >>>> large number of functions. I don't know which benchmarks >>>> you are >>>> using, but I doubt that many of these are worth merging. >>>> For >>>> instance, you rarely gain a code size benefit from merging >>>> functions >>>> with just a few instructions. Our patch takes care of this >>>> using >>>> thresholds. It's worth looking at actual code size >>>> reductions, >>>> rather than just numbers of functions merged/ compilation >>>> time spent >>>> comparing functions. It turns out that the number of >>>> functions in >>>> each bucket is really quite small once you apply some >>>> heuristics >>>> (and could still be further reduced!). >>>> >>>> Given your experience with MergeFunctions, it would be >>>> really great >>>> if you could review our patch and also try it out on your >>>> benchmarks. >>>> >>>> Tobias >>>> >>>> >>>> On 24/01/2014 19:11, Stepan Dyatkovskiy wrote: >>>> >>>> Hi Tobias. >>>> >>>> So, what do you think? >>>> >>>> If it means to much extra-job for your team, may be I >>>> can help you >>>> somehow? I really would like to. 
>>>> -Stepan
>>>>
>>>> Stepan Dyatkovskiy wrote:
>>>>
>>>> Hi Tobias,
>>>>
>>>> > I can't really see a way to combine our approach with your patch.
>>>> > What are your thoughts?
>>>>
>>>> I think it is possible. Even more - we need to combine our efforts
>>>> in order to bring this pass into real life.
>>>> I have read your presentation file, and unfortunately read your
>>>> patch only a little.
>>>> How exactly do you scan functions at the 2nd stage? Could you
>>>> explain the algorithm in a few words, how you compare workflow? Is
>>>> it possible to scan a binary tree instead of a hash table? ...OK.
>>>>
>>>> That's how I see the modification. Now it's important to explain
>>>> the idea, so consider the simplest case: you need to catch two
>>>> functions that differ by a single instruction somewhere.
>>>>
>>>> 1. Imagine that the IR *module* contents are represented as a
>>>>    binary tree: each line (trace) from root to terminal node is a
>>>>    function. Each node is a function's primitive (instruction
>>>>    opcode, for example). Down is the direction towards terminal
>>>>    nodes, up is the direction towards the root.
>>>>
>>>> 2. Now you are standing on some node. And you have two directions:
>>>>    down-left and down-right.
>>>>
>>>> 3. If you are able to find two equal sub-traces down (at left and
>>>>    at right), then the only difference lies in this node. Then we
>>>>    catch that case.
>>>>
>>>> 4. Even more, if two subtrees at left and at right are equal, then
>>>>    you catch at once all the cases that are represented by these
>>>>    subtrees.
>>>>
>>>> I still didn't look at your patch carefully. Sorry.. But I hope
>>>> that helps, and I'll try to look at it in the nearest time, and
>>>> perhaps it's not the best solution I gave in this post.
>>>> -Stepan
>>>>
>>>> 22.01.2014, 20:53, "Tobias von Koch" <tobias.von.koch at gmail.com>:
>>>>
>>>> Hi Stepan,
>>>>
>>>> As you've seen, we have recently implemented a significant
>>>> enhancement to the MergeFunctions pass that also allows merging of
>>>> functions that are only similar but not identical
>>>> (http://llvm-reviews.chandlerc.com/D2591).
>>>>
>>>> Our patch also changes the way that functions are compared quite
>>>> significantly. This is necessary because we need to compare all
>>>> functions in a bucket in order to get a similarity measure for each
>>>> pair, so we can then decide which 'groups' of functions to merge.
>>>>
>>>> I can't really see a way to combine our approach with your patch.
>>>> What are your thoughts?
>>>>
>>>> Another way to improve the performance of MergeFunctions might be
>>>> to make the hash function better. I've already added the size of
>>>> the first BB to it, and perhaps there are other factors we could
>>>> try... if we don't have to compare functions in the first place
>>>> (because they're in different buckets) then that's obviously a much
>>>> bigger win.
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> On 17/01/2014 20:25, Stepan Dyatkovskiy wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I propose a simple improvement for the MergeFunctions pass that
>>>> reduces its complexity from O(N^2) to O(log(N)), where N is the
>>>> number of functions in the module.
>>>>
>>>> The idea is to change the result of comparison from "bool" to
>>>> "-1, 0, 1". In other words: define an order relation on the set of
>>>> functions.
>>>> To be sure that functions can be compared that way, we have to
>>>> prove that an order relation is possible at every comparison stage.
>>>>
>>>> The latter is possible if each comparison stage we implement has
>>>> the following properties:
>>>> * reflexivity (a <= a, a == a, a >= a),
>>>> * antisymmetry (if a <= b and b <= a then a == b),
>>>> * transitivity (if a <= b and b <= c, then a <= c),
>>>> * totality (for any a and b: a < b, a == b, or a > b).
>>>>
>>>> Once we have defined the order relation we can store all the
>>>> functions in a binary tree and perform lookup in O(log(N)) time.
>>>>
>>>> This post has two attachments:
>>>> 1. The patch that has the implementation of this idea.
>>>> 2. The MergeFunctions pass detailed description, with an
>>>>    explanation of how the order relation is possible.
>>>>
>>>> Hope it helps to make things better!
>>>>
>>>> -Stepan.
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Stepan Dyatkovskiy
2014-Feb-17 14:34 UTC
[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
ping

Stepan Dyatkovskiy wrote:
> ping
> Stepan Dyatkovskiy wrote:
>> [earlier message, quoted above in full; snipped]
>>>
>>> How do we use the same traits in partition refinement?
>>>
>>> Consider the next example. Suppose we want to split some bucket X:
>>>
>>> 0. Take some trait from the *pair-wise* comparison. For example: the
>>>    number of arguments.
>>> 1. Take some function from bucket X, let it be F1. Let F1 have N1
>>>    arguments.
>>> 2. Then S is the set of functions with number of arguments *less*
>>>    than N1.
>>> 3. Do refinement by S. As a result we get X`[0] and X`[1].
>>> 4. If both X`[0] and X`[1] are non-empty, then repeat #1-#3 for
>>>    them.
>>> 5. If either X`[0] or X`[1] is empty, then all functions in X are
>>>    the same in the context of trait S. We need another trait: go to
>>>    #0, and repeat #0-#5 for all the X` buckets we have.
>>>
>>> Finally we get the groups of equivalent functions. And actually
>>> these groups would be organized in a binary tree.
>>> Most of this is implemented in std::map. I showed it as #0-#5 just
>>> to put it into the context of your idea. The only problem was to
>>> introduce a way to define the order relation (that is, how to define
>>> S). But this problem has been solved.
>>> And on output we get groups of equal functions.
>>>
>>> Fix me if I'm wrong. But it looks like every advantage given by
>>> partition refinement is present here.
>>>
>>> As I said in the beginning, this is a way to commit the things you
>>> proposed with a minimum of changes. We will use the old comparison
>>> routines; the only thing we need to change is the return type (from
>>> bool to -1, 0, 1).
>>> In this way, partition refinement could be committed as a series of
>>> small changes: one comparison routine per commit.
>>>
>>> Every change in MergeFunctions from now on can be well checked by
>>> this builder. We now have a special builder for MergeFunctions; it
>>> runs the test-suite with the MergeFunctions pass forced on:
>>> http://lab.llvm.org:8011/builders/clang-mergefunc-x86_64-freeBSD9.2
>>> Currently the MergeFunctions pass is stable, the builder is green,
>>> and it passes all test-suite programs.
>>>
>>> -Stepan.
>>>
>>> Stepan Dyatkovskiy wrote:
>>>> Hi all,
>>>> Please find the updated patch in attachment:
>>>> * Added some comments.
>>>> * Fixed some typos.
>>>>
>>>> -Stepan
>>>>
>>>> Nick Lewycky wrote:
>>>>> On 30 January 2014 01:27, Stepan Dyatkovskiy <stpworld at narod.ru>
>>>>> wrote:
>>>>>
>>>>>     Hello Sean and Tobias,
>>>>>
>>>>>     Sean,
>>>>>     Thank you. Could you describe Nick's ideas in a few words or
>>>>>     give me links to your discussion, so I could adapt my ideas to
>>>>>     them.
>>>>>
>>>>> Sorry I haven't had time to write it up. The essential idea is
>>>>> that we use partition refinement to determine the differences
>>>>> between a group of functions instead of using pair-wise
>>>>> comparison.
>>>>>
>>>>> Imagine you want to compare long sequences of numbers. You can
>>>>> grab a pair of sequences and run down them until they differ or
>>>>> you reach the end, and move on to another pair of sequences. With
>>>>> partition refinement you start with the first number in all the
>>>>> sequences and bucket them. Say, three 1's, two 2's, and an 8.
>>>>> You've just formed 3 buckets. For each bucket (with more than 1
>>>>> member), do it again with the next number in the sequence. This
>>>>> gets you an equivalence comparison at less than the complexity of
>>>>> pairwise comparisons. (The buckets are partitions, and you're
>>>>> refining them as you read more data. Partition refinement is the
>>>>> dual of union-find, which we use quite heavily in the compiler.)
>>>>>
>>>>> I haven't figured out how this stacks up against Stepan's sorting
>>>>> with <=> comparison. I can probably do that, I just haven't spent
>>>>> the time thinking about it.
>>>>>
>>>>> I also haven't thought through how to integrate it with "near
>>>>> miss" comparison. My current leading idea is that you do the same
>>>>> thing, but make a record of what makes two buckets similar. (Note
>>>>> that the wrong solution is to store a per-function-pair list of
>>>>> similarities/differences.)
>>>>>
>>>>> Nick
>>>>>
>>>>>     Tobias,
>>>>>     Your patch fails on several modules in my benchmark (73 of
>>>>>     ~1800 tests). I have sent one as an attachment.
>>>>>
>>>>>     See the statistics files for more details; all the .ll files
>>>>>     you can simply find in the test-suite object directory (after
>>>>>     "make TEST=nightly report").
>>>>>     I have attached the script that launches the benchmark.
>>>>>     Example of usage (you need a Linux OS):
>>>>>     bash test-scipt.sh report.txt opt-no-patch opt-with-patch
>>>>>
>>>>>     Some numbers.
>>>>>
>>>>>     Total number of functions: 52715
>>>>>
>>>>>     Below, by default, I have excluded modules where your patch
>>>>>     fails.
>>>>>
>>>>>     Functions merged
>>>>>     Original version: 1113
>>>>>     Order relation patch: 1112
>>>>>     Similar merging patch: 541
>>>>>
>>>>>     Functions merged (including failed tests)
>>>>>     Original version: 8398
>>>>>     Order relation patch: 8395
>>>>>
>>>>>     Summary files size
>>>>>     Initial: 163595634 bytes.
>>>>>     After merging with order relation patch: 163147731 bytes.
>>>>>     After similar merging patch: 162491905 bytes.
>>>>>
>>>>>     Summary files size (including failed tests)
>>>>>     Initial: 250242266 bytes.
>>>>>     Original version: 247168198 bytes.
>>>>>     Order relation: 247175650 bytes.
>>>>>
>>>>>     Time. I measured with the "time" utility, and used the "real
>>>>>     (wall clock) time used by the process".
>>>>>     Summary time spent
>>>>>     With existing version: 28.05 secs.
>>>>>     With order relation patch: 28.19 secs.
>>>>>     Similar merging patch: 28.61 secs.
>>>>>
>>>>>     Summary time spent (including failed tests)
>>>>>     With existing version: 41.74 secs.
>>>>>     With order relation patch: 36.35 secs.
>>>>>
>>>>>     -Stepan
>>>>>
>>>>> [remainder of the quoted thread, already shown above, snipped]
Nick Lewycky
2014-Feb-25 04:49 UTC
[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
 /// Compare two Types, treating all pointer types as equal.
-  bool isEquivalentType(Type *Ty1, Type *Ty2) const;
+  int cmpType(Type *Ty1, Type *Ty2) const;
+
+  /// Compare two Types, treating all pointer types as equal.
+  bool isEquivalentType(Type *Ty1, Type *Ty2) const {
+    return cmpType(Ty1, Ty2) == 0;
+  }

Why do we still need isEquivalentType? Can we nuke this?

+static int cmpNumbers(uint64_t L, uint64_t R) {
+  if (L < R) return -1;
+  if (L > R) return 1;
+  return 0;
+}

At a high level, you don't actually need a <=> operator to use a sort. A
strict ordering (< operator) is sufficient. (Note that for mergefunc, a
strict weak ordering is not sufficient, it must be a total ordering.)

Consider hoisting this inside the FunctionComparator class? That class
should have a bunch of implementations of comparisons between various
different things, which can pass down to other methods in the same
class.

+  // Replacement for type::canLosslesslyBitCastTo, that
+  // establish order relation on this kind of properties.
+  int checkForLosslessBitcast(const Type *L, const Type *R);

Type:: not type:: . Please make this comment more descriptive.

+  /// Replacement for C1 == getBitCast(C2, C1Ty)
+  /// Its more controllable, and supposed to be simpler and more
+  /// predictionable.
+  /// As very important advantage: it allows to introduce order
+  /// relation on constants set, and thus use it as trait in refinement
+  /// routines.

"Its" --> "It's". "predictionable" --> "predictable". And how is it more
predictable? I think this comment would be better if it described the
function instead of making comparisons between it and other functions.
Something like, "Compare constants under a system where pointer to X and
pointer to Y are considered equal" or whatever is actually true here.

+enum ConstantType {

I'm not sure that this buys you much. All the "isa" tests can be broken
down into a switch on getValueID() with the one exception of
isNullValue().
+  // Check whether C2 and C1 has equal bit patterns.
+  if (checkForLosslessBitcast(C1->getType(), C2->getType()) != 0)
+    return false;
+
+  assert(
+      C1->getType()->canLosslesslyBitCastTo(C2->getType()) &&
+      "Pass is healthless. checkForLosslessBitcast should be twin of "
+      "canLosslesslyBitCastTo method, except the case when the last one "
+      "returns false, the first one should return -1 or 1");
+
+  // Compare constant contents
+  if (cmpConstants(C1, C2) != 0)
+    return false;
+
+  assert(
+      C1 == ConstantExpr::getBitCast(const_cast<Constant*>(C2),
+                                     C1->getType()) &&
+      "Pass is healthless. cmpConstants should detect *subset* of "
+      "equivalent constants that could be detected by "
+      "getBitCast method. So whereas cmpConstants returns 0 "
+      "getBitCast also must return true. And wheres getBitCast "
+      "returns false, cmpConstants also must return -1 or 1.");

I think we can skip the asserts here. They aren't detecting a specific
bug, they're checking whether the new code does a certain task relative
to the old code. Drop the old code, your new code is the new sheriff in
town.

+  DenseMap<const Value*, int> sn_map1, sn_map2;

What is sn short for ("seen")? Why are there two of these?

+  std::pair<DenseMap<const Value *, int>::iterator, bool>
+      LeftSN = sn_map1.insert(std::make_pair(V1, sn_map1.size())),
+      RightSN = sn_map2.insert(std::make_pair(V2, sn_map2.size()));

So I think I get it, this is easy to reason about. You number each value
going down both functions. But I think you can eliminate one of these
maps because you know that if the left and right were ever different
then we terminate the comparison immediately. You can at least share one
map with both V1 and V2, but I think you can reduce the number of map
insertions.

+// Mostly cloned from BitcodeWriter, but simplified a bit.

Missing indent.

+static int cmpAttrs(const AttributeSet L, const AttributeSet R) {

High-level question, are attributes ordered? Your comparison is ordered.
So if I have function F with string attributes "sse2", "gc=hemisphere"
and another function G with string attributes "gc=hemisphere", "sse2"
then they would be considered different. I don't think that's right.

+  int cmpOperation(const Instruction *I1, const Instruction *I2) const;
+
   bool isEquivalentOperation(const Instruction *I1,
-                             const Instruction *I2) const;
+                             const Instruction *I2) const {
+    return cmpOperation(I1, I2) == 0;
+  }

Hopefully this patch series ultimately eliminates calls of
isEquivalentOperation?

+  if (int Res = cmpNumbers(I1->getRawSubclassOptionalData(),
+                           I2->getRawSubclassOptionalData()))

By the way, this is an interesting one. Should "add x, y" and
"add nsw x, y" be equal or not? Depends whether the backend uses it, and
whether we're going to do more optimization afterwards (i.e., this is
the compile step of an LTO build). Nothing for you to change here, just
want you to be aware that this is a point where we might want to have a
flag and make mergefunc behave differently in different compile
strategies.

(I stopped before getting to 0008.)

Nick

On 3 February 2014 11:11, Stepan Dyatkovskiy <stpworld at narod.ru> wrote:
> [the Feb-03 message and earlier thread, quoted in full above, snipped]
>>>> >>>> Functions merged >>>> Original version: 1113 >>>> Order relation patch: 1112 >>>> Similar merging patch: 541 >>>> >>>> Functions merged (including failed tests) >>>> Original version: 8398 >>>> Order relation patch: 8395 >>>> >>>> Summary files size >>>> Initial: 163595634 bytes. >>>> After merging with order relation patch: 163147731 bytes. >>>> After similar merging patch: 162491905 bytes. >>>> >>>> Summary files size (including failed tests) >>>> Initial: 250242266 bytes. >>>> Original version: 247168198 bytes. >>>> Order relation: 247175650 bytes. >>>> >>>> Time. I measured with "time" utility, and used "real (wall clock) >>>> time used by the process". >>>> >>>> Summary time spent >>>> With existing version: 28.05 secs. >>>> With order relation patch: 28.19 secs. >>>> Similar merging patch: 28.61 secs. >>>> >>>> Summary time spent (including failed tests) >>>> With existing version: 41.74 secs. >>>> With order relation patch: 36.35 secs. >>>> >>>> -Stepan >>>> >>>> Sean Silva wrote: >>>> >>>> >>>> >>>> >>>> On Tue, Jan 28, 2014 at 2:47 PM, Tobias von Koch >>>> <tobias.von.koch at gmail.com <mailto:tobias.von.koch at gmail.com> >>>> <mailto:tobias.von.koch at gmail.__com >>>> <mailto:tobias.von.koch at gmail.com>>> wrote: >>>> >>>> Hi Stepan, >>>> >>>> Sorry for the delay. It's great that you are working on >>>> MergeFunctions as well and I agree, we should definitely >>>> try to >>>> combine our efforts to improve MergeFunctions. >>>> >>>> Just to give you some context, the pass (with the similar >>>> function >>>> merging patch) is already being used in a production >>>> setting. From >>>> my point of view, it would be better if we focus on >>>> improving its >>>> capability of merging functions at this stage rather >>>> than on >>>> reduction of compilation time. 
>>>> >>>> I'll give a brief overview of how our similar function >>>> merging >>>> algorithm works (you can also watch the presentation from >>>> the US >>>> LLVM conference to hear me explain it using an animation). >>>> >>>> 1. Hash all functions into buckets >>>> >>>> In each bucket, separately: >>>> >>>> 2. Compare functions pair-wise and determine a >>>> similarity metric for each pair (%age of equivalent >>>> instructions) >>>> >>>> 3. Merge identical functions (with similarity = 100%), >>>> update call >>>> sites for those functions. >>>> >>>> 4. If the updates of call sites have touched other >>>> functions, >>>> go back to step 2 and re-compare *only* those >>>> functions to all >>>> others in their buckets. >>>> >>>> Finally, >>>> >>>> 5. Form groups of similar functions for merging: >>>> a) Pick most similar pair (A,B) >>>> b) Find all functions C that are also similar to B but >>>> are not >>>> more similar to some other function D. >>>> c) Merge A with B and all the C's. >>>> Repeat until there are no more functions to merge. >>>> >>>> As you can see, we need to compute a similarity measure for >>>> each >>>> pair of functions in a bucket. If I understand correctly, >>>> your patch >>>> reduces compile-time by avoiding comparisons of functions >>>> that >>>> cannot be identical. That's exactly what you want to do if >>>> you only >>>> want to merge identical functions. However, because we also >>>> need to >>>> determine whether functions are merely *similar*, I can't >>>> see how >>>> your idea could be applied in that case. >>>> >>>> >>>> Yeah, the existing pass only tries to merge identical functions >>>> (identical in machine code). I'm wondering if the "similar" >>>> function >>>> merging would be better as a separate pass (I'm genuinely >>>> wondering; I >>>> have no clue). I would wait for Nick's feedback. >>>> >>>> -- Sean Silva >>>> >>>> >>>> Looking at your statistics, I see that you are considering >>>> a very >>>> large number of functions. 
I don't know which benchmarks >>>> you are >>>> using, but I doubt that many of these are worth merging. >>>> For >>>> instance, you rarely gain a code size benefit from merging >>>> functions >>>> with just a few instructions. Our patch takes care of this >>>> using >>>> thresholds. It's worth looking at actual code size >>>> reductions, >>>> rather than just numbers of functions merged/ compilation >>>> time spent >>>> comparing functions. It turns out that the number of >>>> functions in >>>> each bucket is really quite small once you apply some >>>> heuristics >>>> (and could still be further reduced!). >>>> >>>> Given your experience with MergeFunctions, it would be >>>> really great >>>> if you could review our patch and also try it out on your >>>> benchmarks. >>>> >>>> Tobias >>>> >>>> >>>> On 24/01/2014 19:11, Stepan Dyatkovskiy wrote: >>>> >>>> Hi Tobias. >>>> >>>> So, what do you think? >>>> >>>> If it means to much extra-job for your team, may be I >>>> can help you >>>> somehow? I really would like to. >>>> >>>> -Stepan >>>> >>>> Stepan Dyatkovskiy wrote: >>>> >>>> Hi Tobias, >>>> >>>> I can't really see a way to combine our >>>> approach with >>>> your patch. What >>>> are your thoughts? >>>> >>>> >>>> I think it is possible. Even more - we need to >>>> combine our >>>> efforts, in >>>> order to bring this pass into real live. >>>> I'have read your presentation file, and >>>> unfortunately read >>>> your patch >>>> only a little. >>>> How exactly you scan functions on 2nd stage? >>>> Could you >>>> explain the >>>> algorithm in few words, how you compare workflow? >>>> Is it >>>> possible to >>>> scan binary tree instead of hash table? ...OK. >>>> >>>> That's how I see the modification. Now its >>>> important to >>>> explain idea, >>>> so consider the simplest case: you need to catch >>>> two >>>> functions that >>>> are differs with single instruction somewhere. >>>> >>>> 1. 
Imagine, that IR *module* contents is >>>> represented as >>>> binary tree: >>>> Each line (trace) from root to terminal node is a >>>> function. >>>> Each node - is function's primitive (instruction >>>> opcode, for >>>> example). >>>> Down - is direction to terminal nodes, up - is >>>> direction to >>>> the root. >>>> >>>> 2. Now you are standing on some node. And you >>>> have two >>>> directions >>>> down-left and down-right. >>>> >>>> 3. If you are able to find two equal sub-traces >>>> down (at >>>> left and at >>>> right), then the only difference lies in this node. >>>> Then we >>>> catch that >>>> case. >>>> >>>> 4. Even more, if two subtrees at left and at >>>> right are >>>> equal, than you >>>> catch at once all the cases that are represented by >>>> these >>>> subtrees. >>>> >>>> I still didn't look at you patch carefully. Sorry.. >>>> But I >>>> hope that >>>> helps, and I'll try look at it in nearest time and >>>> perhaps >>>> its not the >>>> best solution I gave in this post. >>>> >>>> -Stepan >>>> >>>> 22.01.2014, 20:53, "Tobias von Koch" >>>> <tobias.von.koch at gmail.com >>>> <mailto:tobias.von.koch at gmail.com> >>>> <mailto:tobias.von.koch at gmail.__com >>>> <mailto:tobias.von.koch at gmail.com>>>: >>>> >>>> >>>> Hi Stepan, >>>> >>>> As you've seen we have recently implemented a >>>> significant enhancement to >>>> the MergeFunctions pass that also allows >>>> merging of >>>> functions that are >>>> only similar but not identical >>>> (http://llvm-reviews.__chandle__rc.com/D2591 >>>> <http://chandlerc.com/D2591> >>>> <http://llvm-reviews.__chandlerc.com/D2591 >>>> <http://llvm-reviews.chandlerc.com/D2591>>). >>>> >>>> >>>> Our patch also changes the way that functions >>>> are >>>> compared quite >>>> significantly. This is necessary because we >>>> need to >>>> compare all >>>> functions in a bucket in order to get a >>>> similarity >>>> measure for each >>>> pair, so we can then decide which 'groups' of >>>> functions >>>> to merge. 
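A per-pair similarity measure of the kind described above (percentage of equivalent instructions) can be sketched as follows. This is a hypothetical illustration — the name similarityPercent and the string-based instruction streams are invented for the example; the real patch compares IR instructions, not strings:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Percentage of positions at which two flattened instruction streams
// agree, measured against the longer of the two streams.
double similarityPercent(const std::vector<std::string> &A,
                         const std::vector<std::string> &B) {
  const std::size_t N = std::max(A.size(), B.size());
  if (N == 0)
    return 100.0; // two empty functions are trivially identical
  std::size_t Same = 0;
  for (std::size_t I = 0, E = std::min(A.size(), B.size()); I != E; ++I)
    if (A[I] == B[I])
      ++Same;
  return 100.0 * static_cast<double>(Same) / static_cast<double>(N);
}
```

A pair with similarity == 100.0 corresponds to step 3 (merge identical functions); lower values feed the grouping in step 5.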
>>>>
>>>> I can't really see a way to combine our approach with
>>>> your patch. What are your thoughts?
>>>>
>>>> Another way to improve the performance of MergeFunctions
>>>> might be to make the hash function better. I've already
>>>> added the size of the first BB to it, and perhaps there
>>>> are other factors we could try... if we don't have to
>>>> compare functions in the first place (because they're in
>>>> different buckets) then that's obviously a much bigger win.
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> On 17/01/2014 20:25, Stepan Dyatkovskiy wrote:
>>>>
>>>>     Hi all,
>>>>
>>>>     I propose a simple improvement for the MergeFunctions
>>>>     pass that reduces its complexity from O(N^2) to
>>>>     O(log(N)), where N is the number of functions in the
>>>>     module.
>>>>
>>>>     The idea is to change the result of comparison from
>>>>     "bool" to "-1,0,1". In other words: define an order
>>>>     relation on the set of functions. To be sure that
>>>>     functions are comparable that way, we have to prove
>>>>     that an order relation is possible at every comparison
>>>>     stage.
>>>>
>>>>     The latter is possible if the relation we implement for
>>>>     each comparison stage has the following properties:
>>>>     * reflexivity (a <= a, a == a, a >= a),
>>>>     * antisymmetry (if a <= b and b <= a then a == b),
>>>>     * transitivity (if a <= b and b <= c, then a <= c),
>>>>     * totality (for any a and b: a < b, a == b, or a > b).
>>>>
>>>>     Once we have defined the order relation, we can store
>>>>     all the functions in a binary tree and perform lookup
>>>>     in O(log(N)) time.
>>>>
>>>>     This post has two attachments:
>>>>     1. The patch that implements this idea.
>>>>     2. A detailed description of the MergeFunctions pass,
>>>>     with an explanation of how the order relation is
>>>>     possible.
>>>>
>>>>     Hope it helps to make things better!
>>>>
>>>>     -Stepan.
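The order-relation idea quoted above can be sketched like this. FnKey, compare and FnTree are invented stand-ins, under the assumption that compare() is a total ordering returning -1/0/1 (in the real pass this is the FunctionComparator walking types, operands, basic blocks, and so on):

```cpp
#include <map>

struct FnKey { int Tag; }; // hypothetical stand-in for a Function*

// Assumed total ordering: -1, 0 or 1, never anything else.
int compare(const FnKey &L, const FnKey &R) {
  if (L.Tag < R.Tag) return -1;
  if (L.Tag > R.Tag) return 1;
  return 0;
}

// Adapter so the three-way result can drive a std::map comparator.
struct FnLess {
  bool operator()(const FnKey &L, const FnKey &R) const {
    return compare(L, R) < 0;
  }
};

// std::map is a balanced binary tree; keys with !less(a,b) && !less(b,a)
// are considered equal, so inserting an "equal" function finds the
// existing node in O(log N) -- that node is the merge candidate.
using FnTree = std::map<FnKey, unsigned, FnLess>;
```

With this, finding a merge candidate for each new function is one tree lookup instead of a pair-wise scan of the whole bucket.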
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu
>>>> http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Stepan Dyatkovskiy
2014-Feb-27 16:23 UTC
[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
Hi Nick,

I have reworked the changes as you requested. One of the patches (0004, with extra assertions) has been removed.

> + bool isEquivalentType(Type *Ty1, Type *Ty2) const {
> + return cmpType(Ty1, Ty2) == 0;
> + }
>
> Why do we still need isEquivalentType? Can we nuke this?

Yup. After applying all the patches, isEquivalentType will be totally replaced with cmpType. All the isEqXXXX methods and friends will be removed in 0011 (old 0012). No traces left. The old function wasn't removed in 0001 simply to keep the patches free of extra noise like:
- something that uses isEquivalentType
+ something that uses cmpType
The point is that the "something" that uses isEquivalentType will itself be replaced by one of the next patches in this series.

>
> +static int cmpNumbers(uint64_t L, uint64_t R) {
> + if (L < R) return -1;
> + if (L > R) return 1;
> + return 0;
> +}
>
> At a high level, you don't actually need a <=> operator to use a sort. A
> strict ordering ( < operator) is sufficient. (Note that for mergefunc, a
> strict weak ordering is not sufficient, it must be a total ordering.)

That could be done with int FunctionComparator::compare(). We could replace it with bool FunctionComparator::less(). However, all the other cmp methods need at least two bits of information in their result:
1. Whether the things are equal.
2. Whether left is less than right.
As for FunctionComparator::compare(), converting it into less() would increase the time of the sanity check (patch #0010). The sanity check is just a sweet bonus: it checks that the ordering is implemented properly (it checks the order-relation properties). Turning compare() into less() means that we would have to run the comparison twice: L.less(R) and R.less(L). But maybe the sanity check is not a thing to be published at all.

>
> Consider hoisting this inside the FunctionComparator class? That class
> should have a bunch of implementations of comparisons between various
> different things, which can pass down to other methods in the same class.
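The compare-vs-less trade-off above can be sketched in a few lines. cmpNumbers is quoted from the patch; Sig and cmpSignatures are hypothetical stand-ins for the traits the real FunctionComparator walks. The point of the chaining idiom is that one walk yields both bits of information (equal? smaller?), whereas a bool less() would need two walks, L.less(R) and R.less(L), to learn the same thing:

```cpp
#include <cstdint>

// Three-way comparison of two numbers, as in the patch.
int cmpNumbers(std::uint64_t L, std::uint64_t R) {
  if (L < R) return -1;
  if (L > R) return 1;
  return 0;
}

struct Sig { std::uint64_t NumArgs, NumBBs; }; // hypothetical traits

// Chaining idiom used by the cmp* methods: the first differing trait
// decides the order; only fully equal inputs reach the final return.
int cmpSignatures(const Sig &L, const Sig &R) {
  if (int Res = cmpNumbers(L.NumArgs, R.NumArgs))
    return Res; // first difference wins
  return cmpNumbers(L.NumBBs, R.NumBBs);
}
```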
In the new patch series attached to this post, I have moved all the static methods into the FunctionComparator class.

> + // Replacement for type::canLosslesslyBitCastTo, that
> + // establish order relation on this kind of properties.
> + int checkForLosslessBitcast(const Type *L, const Type *R);
>
> Type:: not type:: . Please make this comment more descriptive.

Done.

[new comment]
Replacement for Type::canLosslesslyBitCastTo, which establishes an order relation on this kind of property.
Returns 0 if the L and R types could be converted to each other without reinterpretation of bits. Otherwise the method returns -1 or 1, defining a total ordering between types in the context of the lossless-bitcastability trait. E.g.: if L is less than R (result is -1), then every type that could be losslessly bitcasted to L is less than R.
[/new comment]

>
> + /// Replacement for C1 == getBitCast(C2, C1Ty)
> + /// Its more controllable, and supposed to be simpler and more
> predictionable.
> + /// As very important advantage: it allows to introduce order relation on
> + /// constants set, and thus use it as trait in refinement routines.
>
> "Its" --> "It's". "predictionable" --> "predictable". And how is it more
> predictable? I think this comment would be better if it described the
> function instead of making comparisons between it and other functions.
> Something like, "Compare constants under a system where pointer to X and
> pointer to Y are considered equal" or whatever is actually true here.

Done.

[new comment]
Replacement for C1 == getBitCast(C2, C1Ty).
Parses the constants' contents, assuming that their types are losslessly bitcastable between each other. So it actually ignores the types and only compares the bits of L and R.
Returns 0 if L and R have equivalent content, and -1 or 1 if the values differ. Maintaining a total ordering requires two values that indicate non-equivalence (L less than R, L greater than R).
[/new comment]

>
> +enum ConstantType {
> I'm not sure that this buys you much.
All the "isa" tests can be broken
> down into a switch on getValueID() with the one exception of isNullValue().

Done.

> + assert(
> + C1->getType()->canLosslesslyBitCastTo(C2->getType()) &&
> + "Pass is healthless. checkForLosslessBitcast should be twin of "
> + "canLosslesslyBitCastTo method, except the case when the last one "
> + "returns false, the first one should return -1 or 1");
...
> I think we can skip the asserts here. They aren't detecting a specific
> bug, they're checking whether the new code does a certain task relative
> to the old code. Drop the old code, your new code is the new sheriff in
> town.

The asserts have been removed.

>
> + DenseMap<const Value*, int> sn_map1, sn_map2;
>
> What is sn short for ("seen")? Why are there two of these?

Serial number :-)

>
> + std::pair<DenseMap<const Value *, int>::iterator, bool>
> + LeftSN = sn_map1.insert(std::make_pair(V1, sn_map1.size())),
> + RightSN = sn_map2.insert(std::make_pair(V2, sn_map2.size()));
>
> So I think I get it, this is easy to reason about. You number each value
> going down both functions. But I think you can eliminate one of these
> maps because you know that if the left and right were ever different
> then we terminate the comparison immediately. You can at least share one
> map with both V1 and V2, but I think you can reduce the number of map
> insertions.

I'm not sure. Consider that on the left you have:

%A = alloca i32
%B = alloca i32
store i32 1, i32* %A

And on the right:

%A = alloca i32
%B = alloca i32
store i32 1, i32* %B

When we meet the 'store' instruction, we need to check in which order %A was allocated on the left and in which order %B was allocated on the right. So we have to assign a serial number to each value in a function. Then, when comparing local values from different functions, we can just check their serial numbers. Though maybe I missed something? Maybe there is another way to determine "who was first".

> High-level question, are attributes ordered? Your comparison is ordered.
> So if I have function F with string attributes "sse2", "gc=hemisphere"
> and another function G with string attributes "gc=hemisphere", "sse2"
> then they would be considered different. I don't think that's right.

I didn't check it. But if they are not ordered, we could order them lexicographically.

>
> + int cmpOperation(const Instruction *I1, const Instruction *I2) const;
> +
> bool isEquivalentOperation(const Instruction *I1,
> - const Instruction *I2) const;
> + const Instruction *I2) const {
> + return cmpOperation(I1, I2) == 0;
> + }
>
> Hopefully this patch series ultimately eliminates calls of
> isEquivalentOperation?

Of course. Just like isEquivalentType.

> By the way, this is an interesting one. Should "add x, y" and "add nsw
> x, y" be equal or not?

Yeah. Those are interesting questions. We could even treat 'mul a, 2' and 'shl a, 1' as equal in some cases. I think it's a matter of time and further optimization.

Please find the first reworked patches in the attachment.

-Stepan

Nick Lewycky wrote:
> /// Compare two Types, treating all pointer types as equal.
> - bool isEquivalentType(Type *Ty1, Type *Ty2) const;
> + int cmpType(Type *Ty1, Type *Ty2) const;
> +
> + /// Compare two Types, treating all pointer types as equal.
> + bool isEquivalentType(Type *Ty1, Type *Ty2) const {
> + return cmpType(Ty1, Ty2) == 0;
> + }
>
> Why do we still need isEquivalentType? Can we nuke this?
>
> +static int cmpNumbers(uint64_t L, uint64_t R) {
> + if (L < R) return -1;
> + if (L > R) return 1;
> + return 0;
> +}
>
> At a high level, you don't actually need a <=> operator to use a sort. A
> strict ordering ( < operator) is sufficient. (Note that for mergefunc, a
> strict weak ordering is not sufficient, it must be a total ordering.)
>
> Consider hoisting this inside the FunctionComparator class? That class
> should have a bunch of implementations of comparisons between various
> different things, which can pass down to other methods in the same class.
> > + // Replacement for type::canLosslesslyBitCastTo, that > + // establish order relation on this kind of properties. > + int checkForLosslessBitcast(const Type *L, const Type *R); > > Type:: not type:: . Please make this comment more descriptive. > > + /// Replacement for C1 == getBitCast(C2, C1Ty) > + /// Its more controllable, and supposed to be simpler and more > predictionable. > + /// As very important advantage: it allows to introduce order relation on > + /// constants set, and thus use it as trait in refinement routines. > > "Its" --> "It's". "predictionable" --> "predictable". And how is it more > predictable? I think this comment would be better if it described the > function instead of making comparisons between it and other functions. > Something like, "Compare constants under a system where pointer to X and > pointer to Y are considered equal" or whatever is actually true here. > > +enum ConstantType { > > I'm not sure that this buys you much. All the "isa" tests can be broken > down into a switch on getValueID() with the one exception of isNullValue(). > > + // Check whether C2 and C1 has equal bit patterns. > + if (checkForLosslessBitcast(C1->getType(), C2->getType()) != 0) > + return false; > + > + assert( > + C1->getType()->canLosslesslyBitCastTo(C2->getType()) && > + "Pass is healthless. checkForLosslessBitcast should be twin of " > + "canLosslesslyBitCastTo method, except the case when the last one " > + "returns false, the first one should return -1 or 1"); > + > + // Compare constant contents > + if (cmpConstants(C1, C2) != 0) > + return false; > + > + assert( > + C1 => + ConstantExpr::getBitCast(const_cast<Constant*>(C2), C1->getType()) && > + "Pass is healthless. cmpConstants should detect *subset* of " > + "equivalent constants that could be detected by " > + "getBitCast method. So whereas cmpConstants returns 0 " > + "getBitCast also must return true. 
And wheres getBitCast " > + "returns false, cmpConstants also must return -1 or 1."); > > I think we can skip the asserts here. They aren't detecting a specific > bug, they're checking whether the new code does a certain task relative > to the old code. Drop the old code, your new code is the new sheriff in > town. > > + DenseMap<const Value*, int> sn_map1, sn_map2; > > What is sn short for ("seen")? Why are there two of these? > > + std::pair<DenseMap<const Value *, int>::iterator, bool> > + LeftSN = sn_map1.insert(std::make_pair(V1, sn_map1.size())), > + RightSN = sn_map2.insert(std::make_pair(V2, sn_map2.size())); > > So I think I get it, this is easy to reason about. You number each value > going down both functions. But I think you can eliminate one of these > maps because you know that if the left and right were ever different > then we terminate the comparison immediately. You can at least share one > map with both V1 and V2, but I think you can reduce the number of map > insertions. > > +// Mostly cloned from BitcodeWriter, but simplified a bit. > > Missing indent. > > +static int cmpAttrs(const AttributeSet L, const AttributeSet R) { > > High-level question, are attributes ordered? Your comparison is ordered. > So if I have function F with string attributes "sse2", "gc=hemisphere" > and another function G with string attributes "gc=hemisphere", "sse2" > then they would be considered different. I don't think that's right. > > + int cmpOperation(const Instruction *I1, const Instruction *I2) const; > + > bool isEquivalentOperation(const Instruction *I1, > - const Instruction *I2) const; > + const Instruction *I2) const { > + return cmpOperation(I1, I2) == 0; > + } > > Hopefully this patch series ultimately eliminates calls of > isEquivalentOperation? > > + if (int Res = cmpNumbers(I1->getRawSubclassOptionalData(), > + I2->getRawSubclassOptionalData())) > > By the way, this is an interesting one. Should "add x, y" and "add nsw > x, y" be equal or not? 
Depends whether the backend uses it, and whether > we're going to do more optimization afterwards (ie., this is the compile > step of an LTO build). Nothing for you to change here, just want you to > be aware that this is a point where we might want to have a flag and > make mergefunc behave differently in different compile strategies. > > (I stopped before getting to 0008.) > > Nick > > > On 3 February 2014 11:11, Stepan Dyatkovskiy <stpworld at narod.ru > <mailto:stpworld at narod.ru>> wrote: > > Hi all, > > Previous patch has been split onto series of small changes. > On each stage (after each patch) MergeFunctions pass is compilable > and stable. > > Please find patches in attachment for review. > > To apply all patches at once, use "apply-patches.sh" script as follows: > 0. Place "apply-patches.sh" in same directory with patches. > 1. cd <llvm-sources-dir> > 2. "bash <path-to-patches-dir>/apply-__patches.sh" > > -Stepan > > Stepan Dyatkovskiy wrote: > > Hi Nick, > About partition refinement. > > > It looks like patch proposed here allows to apply your idea with > painless way (as series of small changes actually). > > If I got it right, you would like to introduce the next > partition-refinement. > > Initial partition P consists of only one functions class X0 (whole > module, single bucket). Suppose S is subset of X0, so S represents > functions with some trait that is absent in rest part of X0. > > Then refinement looks like below: > > P`=refine(P, S): > X`[0] = intersection(X[0], S) > X`[1] = difference(X[0], S) // X0 without S0 > > so P`={X`[0], X`[1]} > > And actually X`[0] and X`[1] are classes of functions that > differs with > trait defined by S. > > Then (suppose, we need to introduce new trait, so it would be S`): > > P``=refine(P`, S`), so each X`` could be defined like below: > for each X`[i] in P`: > X``[i*2] = intersection(X`[i], S`) > X``[i*2+1] = difference(X`[i], S`) > > and so on, until all the traits are over. 
> > Finally we got the classes (set of Xs) that consists of equivalent > functions. So, we can merge functions inside the class bounds. > > Am I right? > > Patch, that was proposed here could be represented as such partition > refinement then: > > On each refinement stage, you need to introduce S somehow, the > algorithm > that answers the question: "what sort of difference we want to > check in > this bucket?" > > In existing implementation we check traits already, but pair-wise > (attributes, calling convention, number of arguments, number of > basic > blocks, etc.). How to to use the same traits in partition > refinement? > > Consider next example. Suppose we want to split some bucket X: > > 0. Take some trait from *pair-wise* comparison. For example: > number of > arguments. > 1. Take some function from bucket X, let it be F1. Let F1 has N1 > number > of arguments. > 2. Then S are functions with number of arguments *less* than N1. > 3. Do refinement by S. As result we got X`[0] and X`[1]. > 4. If both of X`[0] and X`[1] are not empty, then repeat #1-#3 > for them. > 5. If some of X`[0] of X`[1] is empty set, then all functions in > X are > the same in context of trait S. We need another trait, go to #0. And > repeat #0-#5 for all X` buckets we have. > > Finally we got the groups of equivalent functions. And actually > these > groups would be organized in binary tree. > Most of stuff is implemented in std::map. I showed it as #0-#5 > just to > put it into context of your idea. The only problem was to > introduce way > of how to define order-relation (or how to define S). But this > problem > has been solved. > And on output we got groups of equal functions. > > Fix me if I'm wrong. But it looks like every advantage given by > partition refinement is present here. > > As I said in beginning, it is way of how to commit things you > proposed > with minimum of changes. 
We will use old comparison routines, > only thing > we need to change is return type (from bool to -1,0,1). > In this way, partition refinement could be committed as series > of small > changes: one comparison routine per commit. > > Every change in MergeFunctions from now could be well checked by > this > builder: > Now we have special builder for MergeFunctions. It runs > test-suite with > forced MergeFunctions pass: > http://lab.llvm.org:8011/__builders/clang-mergefunc-x86___64-freeBSD9.2 > <http://lab.llvm.org:8011/builders/clang-mergefunc-x86_64-freeBSD9.2> > Currently, MergeFunctions pass is stable, builder is green and > passes > all test-suite programs. > > -Stepan. > > Stepan Dyatkovskiy wrote: > > Hi all, > Please find the updated patch in attachment: > * Added some comments. > * Fixed some typos. > > -Stepan > > Nick Lewycky wrote: > > On 30 January 2014 01:27, Stepan Dyatkovskiy > <stpworld at narod.ru <mailto:stpworld at narod.ru> > <mailto:stpworld at narod.ru <mailto:stpworld at narod.ru>>> > wrote: > > Hello Sean and Tobias, > > Sean, > Thank you. Could you describe Nick's ideas in few > words or give me > links to your discussion, so I could adapt my ideas > to it. > > > Sorry I haven't had time to write it up. The essential > idea is that we > use partition-refinement to determine the differences > between a group of > functions instead of using pair-wise comparison. > > Imagine you want to compare long sequences of numbers. > You can grab a > pair of sequences and run down them until they differ or > you reach the > end, and move on to another pair of sequences. With > partition refinement > you start with the first number in all the sequences and > bucket them. > Say, three 1's, two 2's, and an 8. You've just formed 3 > buckets. For > each bucket (with more than 1 member), do it again with > the next number > in the sequence. This gets you an equivalence comparison > at less than > the complexity of pairwise comparisons. 
(The buckets are > partitions, and > you're refining them as you read more data. Partition > refinement is the > dual of union-find, which we use quite heavily in the > compiler.) > > I haven't figured out how this stacks up against > Stepan's sorting with > <=> comparison. I can probably do that, I just haven't > spent the time > thinking about it. > > I also haven't thought through how to integrate it with > "near miss" > comparison. My current leading idea is that you do the > same thing, but > make a record of what makes two buckets similar. (Note > that the wrong > solution is to store a per-function-pair list of > similarities/differences.) > > Nick > > Tobias, > Your patch fails on several modules in my benchmark > (73 of ~1800 > tests). I have sent one as attachment. > > See statistics files for more details, all the .ll > files you could > simply find in test-suite object directory (after > "make TEST=nightly > report"). > I have attached script that launches benchmark. > Example of usage > (you need Linux OS): > bash test-scipt.sh report.txt opt-no-patch > opt-with-patch > > Some numbers. > > Total number of functions: 52715 > > Below, by default, I have excluded modules where > your patch fails. > > Functions merged > Original version: 1113 > Order relation patch: 1112 > Similar merging patch: 541 > > Functions merged (including failed tests) > Original version: 8398 > Order relation patch: 8395 > > Summary files size > Initial: 163595634 > bytes. > After merging with order relation patch: 163147731 > bytes. > After similar merging patch: 162491905 > bytes. > > Summary files size (including failed tests) > Initial: 250242266 bytes. > Original version: 247168198 bytes. > Order relation: 247175650 bytes. > > Time. I measured with "time" utility, and used > "real (wall clock) > time used by the process". > > Summary time spent > With existing version: 28.05 secs. > With order relation patch: 28.19 secs. > Similar merging patch: 28.61 secs. 
> > Summary time spent (including failed tests) > With existing version: 41.74 secs. > With order relation patch: 36.35 secs. > > -Stepan > > Sean Silva wrote: > > > > > On Tue, Jan 28, 2014 at 2:47 PM, Tobias von Koch > <tobias.von.koch at gmail.com > <mailto:tobias.von.koch at gmail.com> > <mailto:tobias.von.koch at gmail.__com > <mailto:tobias.von.koch at gmail.com>> > <mailto:tobias.von.koch at gmail. > <mailto:tobias.von.koch at gmail.>____com > <mailto:tobias.von.koch at gmail.__com > <mailto:tobias.von.koch at gmail.com>>>> wrote: > > Hi Stepan, > > Sorry for the delay. It's great that you > are working on > MergeFunctions as well and I agree, we > should definitely > try to > combine our efforts to improve MergeFunctions. > > Just to give you some context, the pass > (with the similar > function > merging patch) is already being used in a > production > setting. From > my point of view, it would be better if we > focus on > improving its > capability of merging functions at this > stage rather > than on > reduction of compilation time. > > I'll give a brief overview of how our > similar function > merging > algorithm works (you can also watch the > presentation from > the US > LLVM conference to hear me explain it > using an animation). > > 1. Hash all functions into buckets > > In each bucket, separately: > > 2. Compare functions pair-wise and > determine a > similarity metric for each pair (%age > of equivalent > instructions) > > 3. Merge identical functions (with > similarity = 100%), > update call > sites for those functions. > > 4. If the updates of call sites have > touched other > functions, > go back to step 2 and re-compare > *only* those > functions to all > others in their buckets. > > Finally, > > 5. Form groups of similar functions for > merging: > a) Pick most similar pair (A,B) > b) Find all functions C that are also > similar to B but > are not > more similar to some other function D. > c) Merge A with B and all the C's. 
> Repeat until there are no more > functions to merge. > > As you can see, we need to compute a > similarity measure for > each > pair of functions in a bucket. If I > understand correctly, > your patch > reduces compile-time by avoiding > comparisons of functions > that > cannot be identical. That's exactly what > you want to do if > you only > want to merge identical functions. > However, because we also > need to > determine whether functions are merely > *similar*, I can't > see how > your idea could be applied in that case. > > > Yeah, the existing pass only tries to merge > identical functions > (identical in machine code). I'm wondering if > the "similar" > function > merging would be better as a separate pass (I'm > genuinely > wondering; I > have no clue). I would wait for Nick's feedback. > > -- Sean Silva > > > Looking at your statistics, I see that you > are considering > a very > large number of functions. I don't know > which benchmarks > you are > using, but I doubt that many of these are > worth merging. > For > instance, you rarely gain a code size > benefit from merging > functions > with just a few instructions. Our patch > takes care of this > using > thresholds. It's worth looking at actual > code size > reductions, > rather than just numbers of functions > merged/ compilation > time spent > comparing functions. It turns out that the > number of > functions in > each bucket is really quite small once you > apply some > heuristics > (and could still be further reduced!). > > Given your experience with MergeFunctions, > it would be > really great > if you could review our patch and also try > it out on your > benchmarks. > > Tobias > > > On 24/01/2014 19:11, Stepan Dyatkovskiy wrote: > > Hi Tobias. > > So, what do you think? > > If it means to much extra-job for your > team, may be I > can help you > somehow? I really would like to. 
> > -Stepan > > Stepan Dyatkovskiy wrote: > > Hi Tobias, > > I can't really see a way to > combine our > approach with > your patch. What > are your thoughts? > > > I think it is possible. Even more > - we need to > combine our > efforts, in > order to bring this pass into real > live. > I'have read your presentation > file, and > unfortunately read > your patch > only a little. > How exactly you scan functions on > 2nd stage? > Could you > explain the > algorithm in few words, how you > compare workflow? > Is it > possible to > scan binary tree instead of hash > table? ...OK. > > That's how I see the modification. > Now its > important to > explain idea, > so consider the simplest case: you > need to catch > two > functions that > are differs with single > instruction somewhere. > > 1. Imagine, that IR *module* > contents is > represented as > binary tree: > Each line (trace) from root to > terminal node is a > function. > Each node - is function's > primitive (instruction > opcode, for > example). > Down - is direction to terminal > nodes, up - is > direction to > the root. > > 2. Now you are standing on some > node. And you > have two > directions > down-left and down-right. > > 3. If you are able to find two > equal sub-traces > down (at > left and at > right), then the only difference > lies in this node. > Then we > catch that > case. > > 4. Even more, if two subtrees at > left and at > right are > equal, than you > catch at once all the cases that > are represented by > these > subtrees. > > I still didn't look at you patch > carefully. Sorry.. > But I > hope that > helps, and I'll try look at it in > nearest time and > perhaps > its not the > best solution I gave in this post. > > -Stepan > > 22.01.2014, 20:53, "Tobias von Koch" > <tobias.von.koch at gmail.com > <mailto:tobias.von.koch at gmail.com> > <mailto:tobias.von.koch at gmail.__com > <mailto:tobias.von.koch at gmail.com>> > <mailto:tobias.von.koch at gmail. 
> <mailto:tobias.von.koch at gmail.>____com > <mailto:tobias.von.koch at gmail.__com > <mailto:tobias.von.koch at gmail.com>>>>: > > > Hi Stepan, > > As you've seen we have > recently implemented a > significant enhancement to > the MergeFunctions pass that > also allows > merging of > functions that are > only similar but not identical > > (http://llvm-reviews.__chandle____rc.com/D2591 > <http://chandle__rc.com/D2591> > <http://chandlerc.com/D2591> > > <http://llvm-reviews.__chandle__rc.com/D2591 > <http://chandlerc.com/D2591> > <http://llvm-reviews.__chandlerc.com/D2591 > <http://llvm-reviews.chandlerc.com/D2591>>>). > > > Our patch also changes the way > that functions > are > compared quite > significantly. This is > necessary because we > need to > compare all > functions in a bucket in order > to get a > similarity > measure for each > pair, so we can then decide > which 'groups' of > functions > to merge. > > I can't really see a way to > combine our > approach with > your patch. What > are your thoughts? > > Another way to improve the > performance of > MergeFunctions > might be to > make the hash function better. > I've already > added the > size of the first > BB to it, and perhaps there > are other factors > we could > try... if we > don't have to compare > functions in the first > place > (because they're in > different buckets) then that's > obviously a much > bigger win. > > Thanks, > Tobias > > On 17/01/2014 20:25, Stepan > Dyatkovskiy wrote: > > Hi all, > > I propose simple > improvement for > MergeFunctions > pass, that reduced > its > complexity from O(N^2) > to O(log(N)), > where N is > number of > functions in > module. > > The idea, is to replace > the result of > comparison > from "bool" to > "-1,0,1". In another > words: define order > relation > on functions set. > To be sure, that > functions could be > comparable > that way, we have to > prove that order > relation is possible > on all > comparison stage. 
> > The last one is > possible, if for each > comparison > stage we > implement has > next properties: > * reflexivity (a <= a, > a == a, a >= a), > * antisymmetry (if a <> b and b <= a > then a == b), > * transitivity (a <= b > and b <= c, then > a <= c) > * asymmetry (if a < b, > then a > b or a > == b). > > Once we have defined > order relation we > can store > all the functions in > binary tree and perform > lookup in > O(log(N)) time. > > This post has two > attachments: > 1. The patch, that has > implementation of > this idea. > 2. The MergeFunctions > pass detailed > description, > with explanation how > order relation could be > possible. > > Hope it helps to make > things better! > > -Stepan. > > > > _____________________________________________________ > LLVM Developers mailing > list > LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu> > <mailto:LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>> > <mailto:LLVMdev at cs.uiuc.edu > <mailto:LLVMdev at cs.uiuc.edu> <mailto:LLVMdev at cs.uiuc.edu > <mailto:LLVMdev at cs.uiuc.edu>>> > http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/______mailman/listinfo/llvmdev > <http://lists.cs.uiuc.edu/____mailman/listinfo/llvmdev> > > <http://lists.cs.uiuc.edu/____mailman/listinfo/llvmdev > <http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev>> > > > <http://lists.cs.uiuc.edu/____mailman/listinfo/llvmdev > <http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev> > > <http://lists.cs.uiuc.edu/__mailman/listinfo/llvmdev > <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>>> > > > > > _____________________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu> > <mailto:LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>> > <mailto:LLVMdev at cs.uiuc.edu > <mailto:LLVMdev at cs.uiuc.edu> <mailto:LLVMdev at cs.uiuc.edu > <mailto:LLVMdev at cs.uiuc.edu>>> > http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/______mailman/listinfo/llvmdev > 
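The proposal at the heart of this thread can be condensed into a stand-alone illustration. This is hypothetical code: `Fn` and `cmp` are stand-ins for the pass's real comparison routines. Once comparison returns -1/0/1 and satisfies the order-relation properties listed above, an ordered container can replace the pairwise scan, making "is an equivalent function already known?" a single O(log N) lookup.

```cpp
#include <set>

// Stand-in for a function: compare a few traits in a fixed order,
// returning -1/0/1 the way the patched routines do.
struct Fn {
  unsigned NumArgs;
  unsigned NumBlocks;
};

static int cmp(const Fn &L, const Fn &R) {
  if (L.NumArgs != R.NumArgs)
    return L.NumArgs < R.NumArgs ? -1 : 1;
  if (L.NumBlocks != R.NumBlocks)
    return L.NumBlocks < R.NumBlocks ? -1 : 1;
  return 0;   // equal under every modelled trait -> mergeable
}

struct FnLess {
  bool operator()(const Fn &L, const Fn &R) const { return cmp(L, R) < 0; }
};

// The FnTree idea: unique functions live in an ordered set, so each
// new function costs one O(log N) lookup instead of a bucket scan.
// Returns true if F was new, false if an equivalent function exists.
static bool insertOrMerge(std::set<Fn, FnLess> &FnTree, const Fn &F) {
  return FnTree.insert(F).second;
}
```

Note that `std::set` requires a strict weak ordering, which is exactly what the reflexivity, antisymmetry and transitivity requirements above guarantee.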
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-types-comparison.patch
Type: text/x-diff
Size: 4967 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-check-for-lossless-bitcast-adaptation.patch
Type: text/x-diff
Size: 4163 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0001.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-constants-comparison.patch
Type: text/x-diff
Size: 6388 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0002.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-values-enumeration-routine.patch
Type: text/x-diff
Size: 5357 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0003.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-attributes-comparison.patch
Type: text/x-diff
Size: 3084 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0004.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0006-operations-comparison.patch
Type: text/x-diff
Size: 9648 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0005.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0007-GEP-comparison.patch
Type: text/x-diff
Size: 3561 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0006.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0008-BB-comparison.patch
Type: text/x-diff
Size: 3816 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0007.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0009-top-level-compare-method.patch
Type: text/x-diff
Size: 4106 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0008.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0010-sanity-check.patch
Type: text/x-diff
Size: 3981 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0009.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0011-removed-unused-methods.patch
Type: text/x-diff
Size: 2087 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0010.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0012-FnSet-replaced-with-FnTree.patch
Type: text/x-diff
Size: 6674 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0011.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0013-updated-comments.patch
Type: text/x-diff
Size: 3275 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0012.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0014-removed-DenseMap-helpsers.patch
Type: text/x-diff
Size: 4590 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment-0013.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: apply-patches.sh
Type: application/x-sh
Size: 197 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140227/733af4a9/attachment.sh>