thr3ads.net - llvm dev - [llvm-dev] Significant code difference with a split call to opt [Jun 2019]


Sébastien Michelland via llvm-dev

2019-Jun-14 19:11 UTC

[llvm-dev] Significant performance difference with a split call to opt

Hello list,

This is a follow-up from a question I asked last month. I'm evaluating 
the performance of two pass sequences that resemble (but are not) -O3.

With -O3, -debug-pass=Structure prints several independent blocks that 
seem to represent several calls to opt. I focused on two of these 
blocks, say S1 and S2, and compared the following optimization methods:

1. Executing them separately, i.e. opt -S1 | opt -S2
2. Executing them in a single call, i.e. opt -S1 -S2
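The two methods can be sketched as command lines built from the same flag lists. This is only an illustration: the helper names are hypothetical, the flags stand in for whatever S1 and S2 actually contain, and it assumes opt reads the module from stdin and writes it to stdout when no file names are given.

```python
import subprocess

def split_commands(s1, s2, opt="opt"):
    """Method 1: one opt process per block, chained like `opt -S1 | opt -S2`."""
    return [[opt, *s1], [opt, *s2]]

def single_command(s1, s2, opt="opt"):
    """Method 2: a single opt process that receives both blocks."""
    return [[opt, *s1, *s2]]

def run_pipeline(commands, bitcode):
    """Feed the module through each command in turn, as a shell pipe would."""
    data = bitcode
    for cmd in commands:
        # Assumption: opt reads stdin and writes stdout when no files are named.
        result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
        data = result.stdout
    return data
```

Calling run_pipeline(split_commands(s1, s2), module) and run_pipeline(single_command(s1, s2), module) on the same input gives the two outputs being compared.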

I built the test suite with each of these configurations, then measured 
the performance of the compiled programs with perf, over 10 runs.

I'm attaching a plot of the speedup of method 1 over method 2. The 
intervals represent the standard deviation of the performance measures.
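For reference, the quantity plotted could be computed roughly as follows. This is a sketch under my own assumptions: timings are wall-clock seconds from repeated runs, "speedup of method 1 over method 2" means mean time of method 2 divided by mean time of method 1, and the function names are illustrative.

```python
from statistics import mean, stdev

def speedup(times_method1, times_method2):
    """Speedup of method 1 over method 2; a value below 1 means method 1 is slower."""
    return mean(times_method2) / mean(times_method1)

def relative_spread(times):
    """Sample standard deviation as a fraction of the mean run time."""
    return stdev(times) / mean(times)
```

Note that the standard deviation over back-to-back runs only captures short-term noise; load that changes between building one configuration and the next will not show up in it.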

As you can see, programs compiled with method 1 are significantly slower 
than their counterparts compiled with method 2. However, if the pass 
manager simply applied the passes in order, as a composition of 
functions, both methods should produce the same code and therefore the 
same performance.

I'd like to know if there is a way to recover this property in the pass 
manager, or at least explain the difference. If needed, I can provide 
scripts to reproduce the measurements.

Thanks,
Sébastien Michelland
-------------- next part --------------
A non-text attachment was scrubbed...
Name: speedup-plot.png
Type: image/png
Size: 53411 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190614/0c7096f9/attachment.png>

David Greene via llvm-dev

2019-Jun-14 20:49 UTC

[llvm-dev] Significant performance difference with a split call to opt

Do you have more information?  What were the exact command lines you
used?  Do you have an example program that demonstrates the difference
that you can share?

                      -David


Sébastien Michelland via llvm-dev

2019-Jun-17 20:31 UTC

[llvm-dev] Significant code difference with a split call to opt

Hi,

I reproduced the test on many individual files and got highly variable 
results. It seems the computer's workload while running the test suite 
influenced execution speed far more than the standard deviation shows. 
I'm withdrawing the performance claim until I can get consistent results 
(hence the changed subject line); apologies for the confusion.

What I can still show easily is that the code generated by these two 
methods is different (which is already weird). For a simple example, 
grab a copy of bilateral_grid.bc:

<https://github.com/llvm/llvm-test-suite/blob/master/Bitcode/Benchmarks/Halide/bilateral_grid/bilateral_grid.bc>

Then you can generate my sequences with [opt -O3 -debug-pass=Arguments] 
and diff the outputs. Please see the attached script.
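The comparison itself can be sketched like this, assuming both outputs have been written as textual IR (for example with opt -S) and read into strings. The function name and the "split"/"single" labels are illustrative and are not taken from the attached script.

```python
import difflib

def ir_diff(split_ir, single_ir):
    """Return unified-diff lines between the two textual IR outputs."""
    return list(difflib.unified_diff(
        split_ir.splitlines(),
        single_ir.splitlines(),
        fromfile="split",    # opt -S1 | opt -S2
        tofile="single",     # opt -S1 -S2
        lineterm="",
    ))
```

An empty list means the two methods produced identical text.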

The differences seem to be mainly in variable indices (are they 
randomized?); in at least one test (namely jacobi-2d-imper) I have also 
seen calling convention differences.
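One way to check whether the index differences are only renumbering is to mask the numbers before diffing. This is a rough sketch that assumes "variable indices" refers to numbered temporaries such as %12 in the textual IR; it does not handle numbered block labels or metadata, and the function name is my own.

```python
import re

# Matches unnamed, numbered values such as %0 or %12 (named values like %x1 are left alone).
_NUMBERED_VALUE = re.compile(r"%\d+\b")

def mask_numbered_values(ir_text):
    """Replace every numbered temporary with the same placeholder."""
    return _NUMBERED_VALUE.sub("%N", ir_text)
```

If the two outputs are identical after masking, the remaining differences are numbering only; anything still differing (such as the calling conventions) is a real change in the code.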

I'd like to optimize programs by greedily selecting optimizations, 
making a call to opt at each step. If I don't have equality between the 
two methods, I can't be sure that the sequence I'm building will make 
much sense.
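To make the intended use concrete, here is a minimal sketch of such a greedy loop. Everything in it is hypothetical: apply_pass stands for "run one opt call with this candidate on the current module", and cost stands for whatever metric is being minimized.

```python
def greedy_sequence(module, candidates, apply_pass, cost, max_steps=10):
    """Greedily build a sequence, keeping the candidate that lowers cost most at each step."""
    sequence = []
    current_cost = cost(module)
    for _ in range(max_steps):
        best = None
        for candidate in candidates:
            trial = apply_pass(module, candidate)   # one separate call per step
            trial_cost = cost(trial)
            if trial_cost < current_cost and (best is None or trial_cost < best[0]):
                best = (trial_cost, candidate, trial)
        if best is None:
            break  # no candidate improves the module any further
        current_cost, chosen, module = best
        sequence.append(chosen)
    return sequence, module
```

The returned sequence is only meaningful if replaying it in a single opt call yields the same module as the step-by-step calls, which is exactly the property in question.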

Sébastien Michelland

On 6/14/19 4:49 PM, David Greene wrote:
> Do you have more information?  What were the exact command lines you
> used?  Do you have an example program that demonstrates the difference
> that you can share?
> 
>                        -David

-------------- next part --------------
A non-text attachment was scrubbed...
Name: build-bileteral-grid.sh
Type: application/x-shellscript
Size: 409 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190617/d1843710/attachment.bin>
