thr3ads.net - llvm dev - [llvm-dev] Significant code difference with a split call to opt [Jun 2019]

Sébastien Michelland via llvm-dev

2019-Jun-26 22:35 UTC

[llvm-dev] Significant code difference with a split call to opt

Hi,

This answer is a bit slow; I tried to look into the sequence details but 
250 passes plus the complex bitcode of test suite examples makes this 
pretty hard.

In the meantime I stumbled upon llvm-diff which abstracts away the most 
significant difference, namely instruction renaming. It also ignores 
function attributes so calling conventions are silently unified; but at 
least it gives empty diffs when comparing the two methods. This means 
that my performance differences are mostly measurement errors...
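As a rough illustration of the comparison described above, here is a minimal Python sketch that builds the command lines for the two methods and for llvm-diff. The file names and the pass flags in the example are placeholders, not the sequences from the attached script, and chaining the split method through a temporary file instead of a pipe is an assumption made for simplicity.

```python
import subprocess
from typing import List


def single_call(s1: List[str], s2: List[str], src: str, out: str) -> List[List[str]]:
    """Method 2 in the thread: one opt invocation running S1 then S2."""
    return [["opt", *s1, *s2, src, "-o", out]]


def split_call(s1: List[str], s2: List[str], src: str, tmp: str, out: str) -> List[List[str]]:
    """Method 1 in the thread: two opt invocations, chained through a file
    (an assumption; the original message uses a pipe)."""
    return [
        ["opt", *s1, src, "-o", tmp],
        ["opt", *s2, tmp, "-o", out],
    ]


def llvm_diff(a: str, b: str) -> List[str]:
    """Command line comparing two modules with llvm-diff."""
    return ["llvm-diff", a, b]


def run_all(cmds: List[List[str]]) -> None:
    """Run each command in order; raises if one of them fails."""
    for argv in cmds:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder pass flags; substitute the real S1 and S2 sequences.
    s1, s2 = ["-placeholder-s1"], ["-placeholder-s2"]
    plan = (
        split_call(s1, s2, "bilateral_grid.bc", "tmp.bc", "split.bc")
        + single_call(s1, s2, "bilateral_grid.bc", "single.bc")
        + [llvm_diff("split.bc", "single.bc")]
    )
    # Only print the plan; call run_all(plan) once the flags are real.
    for argv in plan:
        print(" ".join(argv))
```

Passing the flag lists through verbatim keeps the sketch independent of how a given LLVM version spells its pass options.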

Some of the differences might be "normal", eg. caused by randomized data 
structures. I don't have that much experience with LLVM code so I'm not 
sure how probable this is.

I'll stick to llvm-diff for now and maybe come back to this when I have 
a clearer understanding of the pass management process. ^^

Thanks for your time and help!
Sébastien Michelland

On 6/19/19 11:42 AM, Hiroshi Yamauchi wrote:
> Passing -print-after-all to opt should print the IR after each pass. 
> That may help figure out what's going on.
> 
> On Mon, Jun 17, 2019 at 1:30 PM Sébastien Michelland via llvm-dev 
> <llvm-dev at lists.llvm.org> wrote:
> 
>     Hi,
> 
>     I reproduced the test on many individual files and got very variable
>     results... it seems the computer's workload when running the test suite
>     influenced the execution speed a lot more than standard deviation
>     shows.
>     I'll withdraw the performance claim until I can get consistent results
>     (changed subject line), apologies for the confusion.
> 
>     What I can still show easily is that the code generated by these two
>     methods is different (which is already weird). For a simple example,
>     grab a copy of bilateral_grid.bc:
> 
>     <https://github.com/llvm/llvm-test-suite/blob/master/Bitcode/Benchmarks/Halide/bilateral_grid/bilateral_grid.bc>
> 
>     Then you can generate my sequences with [opt -O3 -debug-pass=Arguments]
>     and diff the outputs. Please see the attached script.
> 
>     The differences seem to be mainly on variable indices (are they
>     randomized?); on some test (namely jacobi-2d-imper) I have seen calling
>     convention differences.
> 
>     I'd like to optimize programs by greedily selecting optimizations,
>     making a call to opt at each step. If I don't have equality between the
>     two methods, I can't be sure that the sequence I'm building will make
>     much sense.
> 
>     Sébastien Michelland
> 
>     On 6/14/19 4:49 PM, David Greene wrote:
>      > Do you have more information?  What were the exact command lines you
>      > used?  Do you have an example program that demonstrates the difference
>      > that you can share?
>      >
>      >                        -David
>      >
>      > Sébastien Michelland via llvm-dev <llvm-dev at lists.llvm.org> writes:
>      >
>      >> Hello list,
>      >>
>      >> This is a follow-up from a question I asked last month. I'm evaluating
>      >> the performance of two pass sequences that resemble (but are not) -O3.
>      >>
>      >> With -O3, -debug-pass=Structure prints several independent blocks that
>      >> seem to represent several calls to opt. I focused on two of these
>      >> blocks, say S1 and S2, and compared the following optimization
>      >> methods:
>      >>
>      >> 1. Executing them separately, ie. opt -S1 | opt -S2
>      >> 2. Executing them in a single call, ie. opt -S1 -S2
>      >>
>      >> I built the test suite with each of these configurations, then
>      >> measured the performance of the compiled programs with perf, over 10
>      >> runs.
>      >>
>      >> I'm attaching a plot of the speedup of method 1 over method 2. The
>      >> intervals represent the standard deviation of the performance
>      >> measures.
>      >>
>      >> As you can see, programs compiled with method 1 are significantly
>      >> slower than their counterparts compiled with method 2. However, if
>      >> passes were applied in order using function composition, their
>      >> performance should be the same.
>      >>
>      >> I'd like to know if there is a way to recover this property in the
>      >> pass manager, or at least explain the difference. If needed, I can
>      >> provide scripts to reproduce the measurements.
>      >>
>      >> Thanks,
>      >> Sébastien Michelland
>      >>
>

David Greene via llvm-dev

2019-Jun-27 17:47 UTC

We want LLVM to be deterministic and there have been efforts to fix
problems related to data structures causing different generated code
sequences.  It's certainly possible something like that is going on, but
it shouldn't just be dismissed.  It would be best if we could get to the
bottom of it and see what needs fixing.
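One possible way to narrow this down, building on the -print-after-all suggestion quoted earlier in the thread, is to split each dump into per-pass sections and look for the first section where the two methods disagree. The Python sketch below assumes banners of the form "*** IR Dump After <pass> ***" (optionally preceded by "; "); the exact banner text varies between LLVM versions, so the regular expression may need adjusting. It compares raw text, so a pure renumbering of values also counts as a difference.

```python
import re
from typing import List, Optional, Tuple

# Assumed banner shape; check it against the dump of the LLVM build in use.
BANNER = re.compile(r"^(?:;\s*)?\*\*\* IR Dump After (.+?) \*\*\*")


def split_dumps(log: str) -> List[Tuple[str, str]]:
    """Split a -print-after-all log into (pass name, IR text) sections."""
    sections: List[Tuple[str, str]] = []
    name: Optional[str] = None
    body: List[str] = []
    for line in log.splitlines():
        match = BANNER.match(line)
        if match:
            if name is not None:
                sections.append((name, "\n".join(body)))
            name, body = match.group(1), []
        elif name is not None:
            body.append(line)
    if name is not None:
        sections.append((name, "\n".join(body)))
    return sections


def first_divergence(log_a: str, log_b: str) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, pass in A, pass in B) for the first differing section,
    or None when both logs contain identical sections."""
    a, b = split_dumps(log_a), split_dumps(log_b)
    for i, (sec_a, sec_b) in enumerate(zip(a, b)):
        if sec_a != sec_b:
            return i, sec_a[0], sec_b[0]
    if len(a) != len(b):
        i = min(len(a), len(b))
        return (i,
                a[i][0] if i < len(a) else None,
                b[i][0] if i < len(b) else None)
    return None
```

For the split method, the logs of the two opt invocations would be concatenated before the comparison.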

                      -David

Sébastien Michelland <sebastien.michelland at ens-lyon.fr> writes:
> Hi,
>
> This answer is a bit slow; I tried to look into the sequence details
> but 250 passes plus the complex bitcode of test suite examples makes
> this pretty hard.
>
> In the meantime I stumbled upon llvm-diff which abstracts away the
> most significant difference, namely instruction renaming. It also
> ignores function attributes so calling conventions are silently
> unified; but at least it gives empty diffs when comparing the two
> methods. This means that my performance differences are mostly
> measurement errors...
>
> Some of the differences might be "normal", eg. caused by randomized
> data structures. I don't have that much experience with LLVM code so
> I'm not sure how probable this is.
>
> I'll stick to llvm-diff for now and maybe come back to this when I
> have a clearer understanding of the pass management process. ^^
>
> Thanks for your time and help!
> Sébastien Michelland

Sébastien Michelland via llvm-dev

2019-Jul-03 13:55 UTC

Alright, since it's deemed important I'll do my best to help. I've built 
an up-to-date LLVM from Git, and I'll run more tests once I set up the test 
suite instrumentation I need to replace the optimization sequences.

So far I can't say it looks like a determinism issue to me. The two 
methods are deterministic on their own as far as I can test; they just 
don't have the same output (which might have an explanation in terms of 
how the pass manager works?).
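The "deterministic on their own" check can be sketched as follows: run the same command several times and compare hashes of the output. This is a minimal Python sketch, not the test actually used here; the helper names are hypothetical, and the example command in the comment assumes opt is asked for textual IR on standard output with "-S -o -".

```python
import hashlib
import subprocess
from typing import Callable, List, Set


def output_digests(produce: Callable[[], bytes], runs: int = 5) -> Set[str]:
    """Collect the distinct SHA-256 digests seen over several runs."""
    return {hashlib.sha256(produce()).hexdigest() for _ in range(runs)}


def is_deterministic(produce: Callable[[], bytes], runs: int = 5) -> bool:
    """True when every run produced byte-identical output."""
    return len(output_digests(produce, runs)) == 1


def command_output(argv: List[str]) -> Callable[[], bytes]:
    """Wrap a command line as a producer returning its standard output."""
    def produce() -> bytes:
        return subprocess.run(argv, check=True, capture_output=True).stdout
    return produce


# Hypothetical usage, with placeholder pass flags:
#   is_deterministic(command_output(
#       ["opt", "-placeholder-s1", "bilateral_grid.bc", "-S", "-o", "-"]))
```

A passing check only shows that repeated runs agree on one machine; it says nothing about whether the split and single-call methods should agree with each other.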

Are you able to reproduce the test case from earlier (the one with the 
attached shell script)?

Sébastien Michelland

On 6/27/19 1:47 PM, David Greene wrote:
> We want LLVM to be deterministic and there have been efforts to fix
> problems related to data structures causing different generated code
> sequences.  It's certainly possible something like that is going on, but
> it shouldn't just be dismissed.  It would be best if we could get to the
> bottom of it and see what needs fixing.
> 
>                        -David
