thr3ads.net - llvm dev - [llvm-dev] llvm is getting slower, January edition [Jan 2017]

Mikhail Zolotukhin via llvm-dev

2017-Jan-18 02:02 UTC

[llvm-dev] llvm is getting slower, January edition

Hi,

Continuing recent efforts in understanding compile time slowdowns, I looked at
some historical data: I picked one test and tried to pin-point commits that
affected its compile-time. The data I have is not 100% accurate, but hopefully
it helps to provide an overview of what's going on with compile time in LLVM
and give a better understanding of what changes usually impact compile time.

Configuration:
The test I used is tramp3d-v4 from the LLVM test-suite. It consists of a single
source file, but still takes a noticeable time to compile, which makes it very
convenient for this kind of experiment. The file was compiled with -Os for arm64
on an x86 host.
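
A minimal sketch of how a single-file compile time like this could be measured. It is not the setup used for the graph: the compiler name, the `--target=aarch64-linux-gnu` triple (a guess at "arm64"), and the helper names `clang_command` and `best_wall_time` are all assumptions for illustration, and a real cross-compile would also need suitable headers or a sysroot.

```python
import subprocess
import time


def clang_command(source, compiler="clang++"):
    """Build an assumed compile command: -Os, AArch64 target, object discarded."""
    return [compiler, "-Os", "--target=aarch64-linux-gnu",
            "-c", source, "-o", "/dev/null"]


def best_wall_time(command, repeats=3):
    """Run the command several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(command, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


# Example use (requires a clang++ that can target AArch64):
#   print(best_wall_time(clang_command("tramp3d-v4.cpp")))
```

Taking the minimum of several runs is one common way to reduce noise from other activity on the host; a median would serve just as well.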

Results:
The attached PDF has a compile time graph on which I marked the points where
compile time changed, along with a list of the corresponding commits. A textual
version of the list is below, but it may be much harder to follow without the
graph. The number at the end of each entry is the compile time change after
that commit:

1. r239821: [InstSimplify] Allow folding of fdiv X, X with just NaNs ignored. +1%
2. r241886: [InstCombine] Employ AliasAnalysis in FindAvailableLoadedValue. +1%
3. r245118: [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl. +2%
4. r246694: [RemoveDuplicatePHINodes] Start over after removing a PHI. -1%
5. r247269: [ADT] Rewrite the StringRef::find implementation to be simpler... +1%
   r247240: [LPM] Use a map from analysis ID to immutable passes in the legacy pass manager... +3%
   r247264: Enable GlobalsAA by default. +1%
6. r247674: [GlobalsAA] Disable globals-aa by default. -1%
7. r248638: [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to exploit trip counts'. +2%
8. r249802: [SCEV] Call `StrengthenNoWrapFlags` after `GroupByComplexity`; NFCI. +4%
9. r250157: [GlobalsAA] Turn GlobalsAA on again by default. +1%
10. r251049: [SCEV] Mark AddExprs as nsw or nuw if legal. +23%
11. No data
12. r259252: AttributeSetImpl: Summarize existing function attributes in a bitset. -1%
    r259256: Add LoopSimplifyCFG pass. -2%
13. r262250: Enable LoopLoadElimination by default. +3%
14. r262839: Revert "Enable LoopLoadElimination by default". -3%
15. r263393: Remove PreserveNames template parameter from IRBuilder. -3%
16. r263595: Turn LoopLoadElimination on again. +3%
17. r267672: [LoopDist] Add llvm.loop.distribute.enable loop metadata. +4%
18. r268509: Do not disable completely loop unroll when optimizing for size. -34%
19. r269124: Loop unroller: set thresholds for optsize and minsize functions to zero. +50%
20. r269392: [LoopDist] Only run LAA for loops with the pragma. -4%
21. r270630: Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time. -28%
22. r270881: Don't allocate in APInt::slt. NFC. -2%
    r270959: Don't allocate unnecessarily in APInt::operator[+-]. NFC. -1%
    r271020: Don't generate unnecessary signed ConstantRange during multiply. NFC. -3%
23. r271615: [LoopUnroll] Set correct thresholds for new recently enabled unrolling heuristic. +22%
24. r276942: Don't invoke getName() from Function::isIntrinsic(). -1%
    r277087: Revert "Don't invoke getName() from Function::isIntrinsic().", rL276942. +1%
25. r279585: [LoopUnroll] By default disable unrolling when optimizing for size.
26. r286814: [InlineCost] Remove skew when calculating call costs. +3%
27. r289755: Make processing @llvm.assume more efficient by using operand bundles. +6%
28. r290086: Revert @llvm.assume with operator bundles (r289755-r289757). -6%
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CompileTime.pdf
Type: application/pdf
Size: 526526 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170117/4b4a2391/attachment-0001.pdf>
-------------- next part --------------

Disclaimer:
 - The data is specific to this particular test, so I could have missed some
   commits affecting compile time on other workloads/configurations.
 - The data I have is not perfect, so I could have missed some commits even if
   they impacted compile time on this test case.
 - The same commits might have a different impact on a different
   test/configuration, even the opposite of the one listed.
 - I didn't mean to label any commits as 'good' or 'bad' by posting these
   numbers. It's expected that some commits increase compile time; we just
   need to be aware of it and avoid unnecessary slowdowns.

Conclusions:
 - Changes in optimization thresholds/cost-models usually have the biggest
   impact on compile time. However, they are usually well-assessed, and the
   trade-offs are discussed and agreed on.
 - Introducing a pass doesn't necessarily mean a compile time slowdown.
   Sometimes the total compile time might decrease because we're saving some
   work for later passes.
 - There are many commits that individually have a low compile time impact but
   together add up to a noticeable slowdown.
 - Conscious efforts to reduce compile time definitely help - thanks to
   everyone who's been working on this!
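
To make the point about many small commits concrete, here is a minimal sketch that compounds the per-commit figures from the list in this message. It assumes each percentage is relative to the compile time just before that commit (the post does not state this explicitly) and skips entries 11 and 25, which carry no figure. The names `changes` and `compound` are illustrative only.

```python
# Percent changes copied from the list, in order (32 figures in total;
# entries 11 and 25 have no number and are left out).
changes = [+1, +1, +2, -1, +1, +3, +1, -1, +2, +4, +1, +23,
           -1, -2, +3, -3, -3, +3, +4, -34, +50, -4, -28,
           -2, -1, -3, +22, -1, +1, +3, +6, -6]


def compound(percentages):
    """Multiply relative changes together; 1.0 means no net change."""
    factor = 1.0
    for p in percentages:
        factor *= 1.0 + p / 100.0
    return factor


# Assumption: "low impact" is taken here to mean at most 4% in either direction.
small_increases = [p for p in changes if 0 < p <= 4]
small_all = [p for p in changes if abs(p) <= 4]

print(f"small slowdowns only: x{compound(small_increases):.3f}")
print(f"all small changes:    x{compound(small_all):.3f}")
print(f"every listed change:  x{compound(changes):.3f}")
```

Because the changes multiply rather than add, a run of small slowdowns ends up slightly worse than the plain sum of their percentages would suggest.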

Thanks for reading, any comments or suggestions on how to make LLVM faster are
welcome! I hope we'll see this graph going down this year :-)

Michael

Sanjoy Das via llvm-dev

2017-Jan-18 05:41 UTC

Hi Mikhail,

Thank you for doing this!

On a quick scan of just the SCEV commits, this already highlights some things:

 - https://reviews.llvm.org/rL249802 has a 4% regression, when it was
"NFCI".
 - The 23% regression on https://reviews.llvm.org/rL251049 is also
very suspicious.

I'll take a closer look at both.

Is it possible to run some of this as a Jenkins job?  Running just
what you ran, and getting a graph that we can view at some URL, would be
great.

-- Sanjoy


Michael Zolotukhin via llvm-dev

2017-Jan-18 07:40 UTC

Hi,

> On Jan 17, 2017, at 9:41 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> Hi Mikhail,
>
> Thank you for doing this!
>
> On a quick scan of just the SCEV commits, this already highlights some things:
>
> - https://reviews.llvm.org/rL249802 has a 4% regression, when it was "NFCI".

Yep, this surprised me as well. Please let me know if you can reproduce it
too.

> - The 23% regression on https://reviews.llvm.org/rL251049 is also
> very suspicious.

Hmm, this should be r251048 (or at least go with it), which we've discussed
some time ago.

> I'll take a closer look at both.

Thank you!

> Is it possible to run some of this as a Jenkin's job?  Running just
> what you ran, and getting a graph that we can view at some URL will be
> great.

We have it on Green Dragon now:
http://lab.llvm.org:8080/green/view/Compile%20Time/
See e.g. http://104.154.54.203/db_default/v4/nts/machine/1336 for Os ARM64
compile time data. Did you mean just that or something else? LNT should also
automagically detect regressions and allow tracking them (the links to the
regressions list should be on the Green Dragon page; I CCed ChrisM, who can
provide more details about this). The main problem is that it was set up
relatively recently, and thus it doesn't have much history. For the graph that
I sent I used our internal data, and in some cases I needed to manually
apply/revert commits because we didn't have enough data points.
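
For readers unfamiliar with this kind of tracking, a minimal sketch of flagging compile-time steps between consecutive measurements follows. It is not LNT's actual detection logic: the function name `find_steps`, the 1% threshold, and the sample data (with placeholder revision numbers 1 to 4) are all made up for illustration.

```python
def find_steps(samples, threshold=0.01):
    """Return (revision, relative_change) for each jump of at least `threshold`.

    `samples` is a list of (revision, seconds) pairs sorted by revision.
    """
    steps = []
    for (_, previous), (revision, current) in zip(samples, samples[1:]):
        if previous <= 0:
            continue  # cannot compute a relative change from a non-positive time
        change = (current - previous) / previous
        if abs(change) >= threshold:
            steps.append((revision, change))
    return steps


# Synthetic measurements: only the jump at revision 3 exceeds the threshold.
samples = [(1, 10.0), (2, 10.05), (3, 12.0), (4, 11.9)]
for revision, change in find_steps(samples):
    print(f"revision {revision}: {change:+.1%}")
```

With sparse history, a flagged revision only bounds the range that contains the responsible commit, which is why intermediate commits may still need to be applied or reverted by hand.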

Michael

Mehdi Amini via llvm-dev

2017-Jan-18 07:44 UTC

> On Jan 17, 2017, at 9:41 PM, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Mikhail,
>
> Thank you for doing this!
>
> On a quick scan of just the SCEV commits, this already highlights some things:
>
> - https://reviews.llvm.org/rL249802 has a 4% regression, when it was "NFCI".
> - The 23% regression on https://reviews.llvm.org/rL251049 is also
> very suspicious.
>
> I'll take a closer look at both.
>
> Is it possible to run some of this as a Jenkin's job?  Running just
> what you ran, and getting a graph that we can view at some URL will be
> great.

This is a good opportunity to remind everyone about compile time tracking:

CTMark announcement: http://lists.llvm.org/pipermail/llvm-dev/2016-November/107087.html
Jenkins job: http://lab.llvm.org:8080/green/view/Compile%20Time/
LNT: http://104.154.54.203/db_default/v4/nts/recent_activity
tramp3d-v4 Os: http://104.154.54.203/db_default/v4/nts/graph?plot.0=1336.1604487.2&highlight_run=25082

-- 
Mehdi


Mehdi Amini via llvm-dev

2017-Jan-18 07:49 UTC

Hi Mikhail,

> On Jan 17, 2017, at 6:02 PM, Mikhail Zolotukhin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> <CompileTime.pdf>

This is an amazing set of data!

> Disclaimer:
> The data is specific for this particular test, so I could have skipped some
> commits affecting compile time on other workloads/configurations.
> The data I have is not perfect, so I could have skipped some commits, even
> if they impacted compile-time on this test case.
> Same commits might have a different impact on a different
> test/configuration, up to the opposite to the one listed.
> I didn't mean to label any commits as 'good' or 'bad'
> by posting these numbers. It's expected that some commits increase compile
> time, we just need to be aware of it and avoid unnecessary slowdowns.
>
> Conclusions:
> Changes in optimization thresholds/cost-models usually have the biggest
> impact on compile time. However, usually they are well-assessed and trade-offs
> are discussed and agreed on.

My impression is that most of the time they are well-assessed, discussed, and
agreed on based solely on the "performance" expectation, without necessarily
looking at the compile time impact.
For example, a change in a threshold in the loop unroller may trigger a pattern
that makes SCEV blow up later. Looking at this only from the "performance of
the generated code" point of view is a mistake in my opinion, and hopefully
closer tracking like you've been doing will help prevent these situations. So
thanks a lot for this!

> Introducing a pass doesn't necessarily mean a compile time slowdown.
> Sometimes the total compile time might decrease because we're saving some
> work for later passes.
> There are many commits, which individually have a low compile time impact,
> but together sum up to a noticeable slowdown.
> Conscious efforts on reducing compile time definitely help - thanks
> everyone who's been working on this!
>
> Thanks for reading, any comments or suggestions on how to make LLVM faster
> are welcome! I hope we'll see this graph going down this year :-)

Looking forward to this!
Do you plan to generate a report like this regularly (weekly, or whenever you
notice a regression)?

Thanks,

— 
Mehdi

Michael Zolotukhin via llvm-dev

2017-Jan-18 22:51 UTC

Hi Mehdi,

> On Jan 17, 2017, at 11:49 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
> 
> Hi Mikhail,
> 
> 
>> On Jan 17, 2017, at 6:02 PM, Mikhail Zolotukhin via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>> 
>> Hi,
>> 
>> Continuing recent efforts in understanding compile time slowdowns, I
looked at some historical data: I picked one test and tried to pin-point commits
that affected its compile-time. The data I have is not 100% accurate, but
hopefully it helps to provide an overview of what's going on with compile
time in LLVM and give a better understanding of what changes usually impact
compile time.
>> 
>> Configuration:
>> The test I used is tramp3d-v4 from LLVM testsuite. It consists of a
single source file, but still takes a noticeable time to compile, which makes it
very convenient for this kind of experiments. The file was compiled with Os for
arm64 on x86 host.
>> 
>> Results:
>> The attached PDF has a compile time graph, on which I marked points
where compile time changed with a list of corresponding commits. A textual
version of the list is available below, but I think it might be much harder to
comprehend the data without the graph. A number in the end shows compile time
change after the given commit:
>> 
>> 1. r239821: [InstSimplify] Allow folding of fdiv X, X with just NaNs
ignored. +1%
>> 2. r241886: [InstCombine] Employ AliasAnalysis in
FindAvailableLoadedValue. +1%
>> 3. r245118: [SCEV] Apply NSW and NUW flags via poison value analysis
for sub, mul and shl. +2%
>> 4. r246694: [RemoveDuplicatePHINodes] Start over after removing a PHI.
-1%
>> 5. r247269: [ADT] Rewrite the StringRef::find implementation to be
simpler... +1%
>>  r247240: [LPM] Use a map from analysis ID to immutable passes in the
legacy pass manager... +3%
>>  r247264: Enable GlobalsAA by default. +1%
>> 6. r247674: [GlobalsAA] Disable globals-aa by default. -1%
>> 7. r248638: [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to
exploit trip counts'. +2%
>> 8. r249802: [SCEV] Call `StrengthenNoWrapFlags` after
`GroupByComplexity`; NFCI. +4%
>> 9. r250157: [GlobalsAA] Turn GlobalsAA on again by default. +1%
>> 10. r251049: [SCEV] Mark AddExprs as nsw or nuw if legal. +23%
>> 11. No data
>> 12. r259252: AttributeSetImpl: Summarize existing function attributes
in a bitset. -1%
>>   r259256: Add LoopSimplifyCFG pass. -2%
>> 13. r262250: Enable LoopLoadElimination by default. +3%
>> 14. r262839: Revert "Enable LoopLoadElimination by default".
-3%
>> 15. r263393: Remove PreserveNames template parameter from IRBuilder.
-3%
>> 16. r263595: Turn LoopLoadElimination on again. +3%
>> 17. r267672: [LoopDist] Add llvm.loop.distribute.enable loop metadata.
+4%
>> 18. r268509: Do not disable completely loop unroll when optimizing for
size. -34%
>> 19. r269124: Loop unroller: set thresholds for optsize and minsize
functions to zero. +50%
>> 20. r269392: [LoopDist] Only run LAA for loops with the pragma. -4%
>> 21. r270630: Re-enable "[LoopUnroll] Enable advanced unrolling
analysis by default" one more time. -28%
>> 22. r270881: Don't allocate in APInt::slt.  NFC. -2%
>>   r270959: Don't allocate unnecessarily in APInt::operator[+-]. 
NFC. -1%
>>   r271020: Don't generate unnecessary signed ConstantRange during
multiply.  NFC. -3%
>> 23. r271615: [LoopUnroll] Set correct thresholds for new recently
enabled unrolling heuristic. +22%
>> 24. r276942: Don't invoke getName() from Function::isIntrinsic().
-1%
>>   r277087: Revert "Don't invoke getName() from
Function::isIntrinsic().", rL276942. +1%
>> 25. r279585: [LoopUnroll] By default disable unrolling when optimizing
for size.
>> 26. r286814: [InlineCost] Remove skew when calculating call costs. +3%
>> 27. r289755: Make processing @llvm.assume more efficient by using
operand bundles. +6%
>> 28. r290086: Revert @llvm.assume with operator bundles
(r289755-r289757). -6%
>> <CompileTime.pdf>
> 
> This is an amazing set of data!
Thanks for the interest in this!
>
>> Disclaimer:
>> The data is specific for this particular test, so I could have skipped
some commits affecting compile time on other workloads/configurations.
>> The data I have is not perfect, so I could have skipped some commits,
even if they impacted compile-time on this test case.
>> Same commits might have a different impact on a different
test/configuration, up to the opposite to the one listed.
>> I didn't mean to label any commits as 'good' or
'bad' by posting these numbers. It's expected that some commits
increase compile time, we just need to be aware of it and avoid unnecessary
slowdowns.
>> 
>> Conclusions:
>> Changes in optimization thresholds/cost-models usually have the biggest
impact on compile time. However, usually they are well-assessed and trade-offs
are discussed and agreed on.
> 
> My impression is that most of the time, they are well-assessed, discussed,
> and agreed on, based solely on the “performance” expectation, without
> necessarily looking at the compile time impact.

Runtime performance has definitely been getting more attention, but I think
people who changed heuristics usually looked at compile time too. In fact, I
think changes that are not expected to affect performance much are more likely
to go in without thorough compile time testing. Hopefully, improved regular
tracking will help to detect such undesired side effects in the future.

> For example, a change in a threshold in the loop unroller may trigger a
> pattern that makes SCEV blow up later. Looking at this only from the
> “performance of the generated code” point of view is a mistake in my
> opinion, and hopefully a closer tracking like you’ve been doing will help
> prevent these situations.

That’s true, but I think providing compile-time testing results is already an
accepted requirement for this sort of change.
> So thanks a lot for this!
> 
> 
>> Introducing a pass doesn't necessarily mean a compile time
slowdown. Sometimes the total compile time might decrease because we're
saving some work for later passes.
>> There are many commits, which individually have a low compile time
impact, but together sum up to a noticeable slowdown.
>> Conscious efforts on reducing compile time definitely help - thanks
everyone who's been working on this!
>> 
>> Thanks for reading, any comments or suggestions on how to make LLVM
faster are welcome! I hope we'll see this graph going down this year :-)
> 
> Looking forward to this!
> Do you plan to generate a report like that frequently (weekly? Whenever you
> notice a regression?)

I didn’t plan to send such a report regularly, but if I find something
interesting, I’ll definitely share it. Also, it will make sense to compare
releases, so that’s what I’ll probably do as well.

Michael
> 
> Thanks,
> 
> — 
> Mehdi

Davide Italiano via llvm-dev

2017-Jan-18 22:55 UTC

[llvm-dev] llvm is getting slower, January edition

On Tue, Jan 17, 2017 at 6:02 PM, Mikhail Zolotukhin
<mzolotukhin at apple.com> wrote:
> Hi,
>
> Continuing recent efforts in understanding compile time slowdowns, I looked
at some historical data: I picked one test and tried to pin-point commits that
affected its compile-time. The data I have is not 100% accurate, but hopefully
it helps to provide an overview of what's going on with compile time in LLVM
and give a better understanding of what changes usually impact compile time.
>
> Configuration:
> The test I used is tramp3d-v4 from LLVM testsuite. It consists of a single
source file, but still takes a noticeable time to compile, which makes it very
convenient for this kind of experiments. The file was compiled with Os for arm64
on x86 host.
>
> Results:
> The attached PDF has a compile time graph, on which I marked points where
compile time changed with a list of corresponding commits. A textual version of
the list is available below, but I think it might be much harder to comprehend
the data without the graph. A number in the end shows compile time change after
the given commit:
>
> 1. r239821: [InstSimplify] Allow folding of fdiv X, X with just NaNs
ignored. +1%
> 2. r241886: [InstCombine] Employ AliasAnalysis in FindAvailableLoadedValue.
+1%
> 3. r245118: [SCEV] Apply NSW and NUW flags via poison value analysis for
sub, mul and shl. +2%
> 4. r246694: [RemoveDuplicatePHINodes] Start over after removing a PHI. -1%
> 5. r247269: [ADT] Rewrite the StringRef::find implementation to be
simpler... +1%
>    r247240: [LPM] Use a map from analysis ID to immutable passes in the
legacy pass manager... +3%
>    r247264: Enable GlobalsAA by default. +1%
> 6. r247674: [GlobalsAA] Disable globals-aa by default. -1%
> 7. r248638: [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to
exploit trip counts'. +2%
> 8. r249802: [SCEV] Call `StrengthenNoWrapFlags` after `GroupByComplexity`;
NFCI. +4%
> 9. r250157: [GlobalsAA] Turn GlobalsAA on again by default. +1%
> 10. r251049: [SCEV] Mark AddExprs as nsw or nuw if legal. +23%
> 11. No data
> 12. r259252: AttributeSetImpl: Summarize existing function attributes in a
bitset. -1%
>     r259256: Add LoopSimplifyCFG pass. -2%
> 13. r262250: Enable LoopLoadElimination by default. +3%
> 14. r262839: Revert "Enable LoopLoadElimination by default". -3%
> 15. r263393: Remove PreserveNames template parameter from IRBuilder. -3%
> 16. r263595: Turn LoopLoadElimination on again. +3%
> 17. r267672: [LoopDist] Add llvm.loop.distribute.enable loop metadata. +4%
> 18. r268509: Do not disable completely loop unroll when optimizing for
size. -34%
> 19. r269124: Loop unroller: set thresholds for optsize and minsize
functions to zero. +50%
> 20. r269392: [LoopDist] Only run LAA for loops with the pragma. -4%
> 21. r270630: Re-enable "[LoopUnroll] Enable advanced unrolling
analysis by default" one more time. -28%
> 22. r270881: Don't allocate in APInt::slt.  NFC. -2%
>     r270959: Don't allocate unnecessarily in APInt::operator[+-].  NFC.
-1%
>     r271020: Don't generate unnecessary signed ConstantRange during
multiply.  NFC. -3%
> 23. r271615: [LoopUnroll] Set correct thresholds for new recently enabled
unrolling heuristic. +22%
> 24. r276942: Don't invoke getName() from Function::isIntrinsic(). -1%
>     r277087: Revert "Don't invoke getName() from
Function::isIntrinsic().", rL276942. +1%
> 25. r279585: [LoopUnroll] By default disable unrolling when optimizing for
size.
> 26. r286814: [InlineCost] Remove skew when calculating call costs. +3%
> 27. r289755: Make processing @llvm.assume more efficient by using operand
bundles. +6%
> 28. r290086: Revert @llvm.assume with operator bundles (r289755-r289757).
-6%
>
>
> Disclaimer:
> The data is specific for this particular test, so I could have skipped some
commits affecting compile time on other workloads/configurations.
> The data I have is not perfect, so I could have skipped some commits, even
if they impacted compile-time on this test case.
> Same commits might have a different impact on a different
test/configuration, up to the opposite to the one listed.
> I didn't mean to label any commits as 'good' or 'bad'
by posting these numbers. It's expected that some commits increase compile
time, we just need to be aware of it and avoid unnecessary slowdowns.
>
> Conclusions:
> Changes in optimization thresholds/cost-models usually have the biggest
impact on compile time. However, usually they are well-assessed and trade-offs
are discussed and agreed on.
> Introducing a pass doesn't necessarily mean a compile time slowdown.
Sometimes the total compile time might decrease because we're saving some
work for later passes.
> There are many commits, which individually have a low compile time impact,
but together sum up to a noticeable slowdown.
> Conscious efforts on reducing compile time definitely help - thanks
everyone who's been working on this!
>
> Thanks for reading, any comments or suggestions on how to make LLVM faster
are welcome! I hope we'll see this graph going down this year :-)
>
> Michael
>
This is great, thanks for the January update :)
Would you mind sharing how you collected the numbers (scripts etc.) and
how you plotted the graph, so I can try repeating this at home with my
test cases?

Thanks,

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare

Jonathan Roelofs via llvm-dev

2017-Jan-18 23:21 UTC

[llvm-dev] llvm is getting slower, January edition

On 1/18/17 3:55 PM, Davide Italiano via llvm-dev wrote:
> On Tue, Jan 17, 2017 at 6:02 PM, Mikhail Zolotukhin
> <mzolotukhin at apple.com> wrote:
>> Hi,
>>
>> Continuing recent efforts in understanding compile time slowdowns, I
looked at some historical data: I picked one test and tried to pin-point commits
that affected its compile-time. The data I have is not 100% accurate, but
hopefully it helps to provide an overview of what's going on with compile
time in LLVM and give a better understanding of what changes usually impact
compile time.
>>
>> Configuration:
>> The test I used is tramp3d-v4 from LLVM testsuite. It consists of a
single source file, but still takes a noticeable time to compile, which makes it
very convenient for this kind of experiments. The file was compiled with Os for
arm64 on x86 host.
>>
>> Results:
>> The attached PDF has a compile time graph, on which I marked points
where compile time changed with a list of corresponding commits. A textual
version of the list is available below, but I think it might be much harder to
comprehend the data without the graph. A number in the end shows compile time
change after the given commit:
>>
>> 1. r239821: [InstSimplify] Allow folding of fdiv X, X with just NaNs
ignored. +1%
>> 2. r241886: [InstCombine] Employ AliasAnalysis in
FindAvailableLoadedValue. +1%
>> 3. r245118: [SCEV] Apply NSW and NUW flags via poison value analysis
for sub, mul and shl. +2%
>> 4. r246694: [RemoveDuplicatePHINodes] Start over after removing a PHI.
-1%
>> 5. r247269: [ADT] Rewrite the StringRef::find implementation to be
simpler... +1%
>>    r247240: [LPM] Use a map from analysis ID to immutable passes in the
legacy pass manager... +3%
>>    r247264: Enable GlobalsAA by default. +1%
>> 6. r247674: [GlobalsAA] Disable globals-aa by default. -1%
>> 7. r248638: [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to
exploit trip counts'. +2%
>> 8. r249802: [SCEV] Call `StrengthenNoWrapFlags` after
`GroupByComplexity`; NFCI. +4%
>> 9. r250157: [GlobalsAA] Turn GlobalsAA on again by default. +1%
>> 10. r251049: [SCEV] Mark AddExprs as nsw or nuw if legal. +23%
>> 11. No data
>> 12. r259252: AttributeSetImpl: Summarize existing function attributes
in a bitset. -1%
>>     r259256: Add LoopSimplifyCFG pass. -2%
>> 13. r262250: Enable LoopLoadElimination by default. +3%
>> 14. r262839: Revert "Enable LoopLoadElimination by default".
-3%
>> 15. r263393: Remove PreserveNames template parameter from IRBuilder.
-3%
>> 16. r263595: Turn LoopLoadElimination on again. +3%
>> 17. r267672: [LoopDist] Add llvm.loop.distribute.enable loop metadata.
+4%
>> 18. r268509: Do not disable completely loop unroll when optimizing for
size. -34%
>> 19. r269124: Loop unroller: set thresholds for optsize and minsize
functions to zero. +50%
>> 20. r269392: [LoopDist] Only run LAA for loops with the pragma. -4%
>> 21. r270630: Re-enable "[LoopUnroll] Enable advanced unrolling
analysis by default" one more time. -28%
>> 22. r270881: Don't allocate in APInt::slt.  NFC. -2%
>>     r270959: Don't allocate unnecessarily in APInt::operator[+-]. 
NFC. -1%
>>     r271020: Don't generate unnecessary signed ConstantRange during
multiply.  NFC. -3%
>> 23. r271615: [LoopUnroll] Set correct thresholds for new recently
enabled unrolling heuristic. +22%
>> 24. r276942: Don't invoke getName() from Function::isIntrinsic().
-1%
>>     r277087: Revert "Don't invoke getName() from
Function::isIntrinsic().", rL276942. +1%
>> 25. r279585: [LoopUnroll] By default disable unrolling when optimizing
for size.
>> 26. r286814: [InlineCost] Remove skew when calculating call costs. +3%
>> 27. r289755: Make processing @llvm.assume more efficient by using
operand bundles. +6%
>> 28. r290086: Revert @llvm.assume with operator bundles
(r289755-r289757). -6%
>>
>>
>> Disclaimer:
>> The data is specific for this particular test, so I could have skipped
some commits affecting compile time on other workloads/configurations.
>> The data I have is not perfect, so I could have skipped some commits,
even if they impacted compile-time on this test case.
>> Same commits might have a different impact on a different
test/configuration, up to the opposite to the one listed.
>> I didn't mean to label any commits as 'good' or
'bad' by posting these numbers. It's expected that some commits
increase compile time, we just need to be aware of it and avoid unnecessary
slowdowns.
>>
>> Conclusions:
>> Changes in optimization thresholds/cost-models usually have the biggest
impact on compile time. However, usually they are well-assessed and trade-offs
are discussed and agreed on.
>> Introducing a pass doesn't necessarily mean a compile time
slowdown. Sometimes the total compile time might decrease because we're
saving some work for later passes.
>> There are many commits, which individually have a low compile time
impact, but together sum up to a noticeable slowdown.
>> Conscious efforts on reducing compile time definitely help - thanks
everyone who's been working on this!
>>
>> Thanks for reading, any comments or suggestions on how to make LLVM
faster are welcome! I hope we'll see this graph going down this year :-)
>>
>> Michael
>>
>
> This is great, thanks for the January update :)
> Do you mind to share how you collected the numbers (script etc.. and
> how you plotted the graph so I can try repeating at home with my
> testcases?)
Out of pure curiosity, I would love to see the performance of the 
resulting binary co-plotted with the same horizontal axis as this 
compile duration data.


Jon
>
> Thanks,
>
-- 
Jon Roelofs
jonathan at codesourcery.com
CodeSourcery / Mentor Embedded

Mikhail Zolotukhin via llvm-dev

2017-Jan-18 23:48 UTC

[llvm-dev] llvm is getting slower, January edition

> On Jan 18, 2017, at 2:55 PM, Davide Italiano <davide at freebsd.org>
wrote:
> 
> On Tue, Jan 17, 2017 at 6:02 PM, Mikhail Zolotukhin
> <mzolotukhin at apple.com> wrote:
>> Hi,
>> 
>> Continuing recent efforts in understanding compile time slowdowns, I
looked at some historical data: I picked one test and tried to pin-point commits
that affected its compile-time. The data I have is not 100% accurate, but
hopefully it helps to provide an overview of what's going on with compile
time in LLVM and give a better understanding of what changes usually impact
compile time.
>> 
>> Configuration:
>> The test I used is tramp3d-v4 from LLVM testsuite. It consists of a
single source file, but still takes a noticeable time to compile, which makes it
very convenient for this kind of experiments. The file was compiled with Os for
arm64 on x86 host.
>> 
>> Results:
>> The attached PDF has a compile time graph, on which I marked points
where compile time changed with a list of corresponding commits. A textual
version of the list is available below, but I think it might be much harder to
comprehend the data without the graph. A number in the end shows compile time
change after the given commit:
>> 
>> 1. r239821: [InstSimplify] Allow folding of fdiv X, X with just NaNs
ignored. +1%
>> 2. r241886: [InstCombine] Employ AliasAnalysis in
FindAvailableLoadedValue. +1%
>> 3. r245118: [SCEV] Apply NSW and NUW flags via poison value analysis
for sub, mul and shl. +2%
>> 4. r246694: [RemoveDuplicatePHINodes] Start over after removing a PHI.
-1%
>> 5. r247269: [ADT] Rewrite the StringRef::find implementation to be
simpler... +1%
>>   r247240: [LPM] Use a map from analysis ID to immutable passes in the
legacy pass manager... +3%
>>   r247264: Enable GlobalsAA by default. +1%
>> 6. r247674: [GlobalsAA] Disable globals-aa by default. -1%
>> 7. r248638: [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to
exploit trip counts'. +2%
>> 8. r249802: [SCEV] Call `StrengthenNoWrapFlags` after
`GroupByComplexity`; NFCI. +4%
>> 9. r250157: [GlobalsAA] Turn GlobalsAA on again by default. +1%
>> 10. r251049: [SCEV] Mark AddExprs as nsw or nuw if legal. +23%
>> 11. No data
>> 12. r259252: AttributeSetImpl: Summarize existing function attributes
in a bitset. -1%
>>    r259256: Add LoopSimplifyCFG pass. -2%
>> 13. r262250: Enable LoopLoadElimination by default. +3%
>> 14. r262839: Revert "Enable LoopLoadElimination by default".
-3%
>> 15. r263393: Remove PreserveNames template parameter from IRBuilder.
-3%
>> 16. r263595: Turn LoopLoadElimination on again. +3%
>> 17. r267672: [LoopDist] Add llvm.loop.distribute.enable loop metadata.
+4%
>> 18. r268509: Do not disable completely loop unroll when optimizing for
size. -34%
>> 19. r269124: Loop unroller: set thresholds for optsize and minsize
functions to zero. +50%
>> 20. r269392: [LoopDist] Only run LAA for loops with the pragma. -4%
>> 21. r270630: Re-enable "[LoopUnroll] Enable advanced unrolling
analysis by default" one more time. -28%
>> 22. r270881: Don't allocate in APInt::slt.  NFC. -2%
>>    r270959: Don't allocate unnecessarily in APInt::operator[+-]. 
NFC. -1%
>>    r271020: Don't generate unnecessary signed ConstantRange during
multiply.  NFC. -3%
>> 23. r271615: [LoopUnroll] Set correct thresholds for new recently
enabled unrolling heuristic. +22%
>> 24. r276942: Don't invoke getName() from Function::isIntrinsic().
-1%
>>    r277087: Revert "Don't invoke getName() from
Function::isIntrinsic().", rL276942. +1%
>> 25. r279585: [LoopUnroll] By default disable unrolling when optimizing
for size.
>> 26. r286814: [InlineCost] Remove skew when calculating call costs. +3%
>> 27. r289755: Make processing @llvm.assume more efficient by using
operand bundles. +6%
>> 28. r290086: Revert @llvm.assume with operator bundles
(r289755-r289757). -6%
>> 
>> 
>> Disclaimer:
>> The data is specific for this particular test, so I could have skipped
some commits affecting compile time on other workloads/configurations.
>> The data I have is not perfect, so I could have skipped some commits,
even if they impacted compile-time on this test case.
>> Same commits might have a different impact on a different
test/configuration, up to the opposite to the one listed.
>> I didn't mean to label any commits as 'good' or
'bad' by posting these numbers. It's expected that some commits
increase compile time, we just need to be aware of it and avoid unnecessary
slowdowns.
>> 
>> Conclusions:
>> Changes in optimization thresholds/cost-models usually have the biggest
impact on compile time. However, usually they are well-assessed and trade-offs
are discussed and agreed on.
>> Introducing a pass doesn't necessarily mean a compile time
slowdown. Sometimes the total compile time might decrease because we're
saving some work for later passes.
>> There are many commits, which individually have a low compile time
impact, but together sum up to a noticeable slowdown.
>> Conscious efforts on reducing compile time definitely help - thanks
everyone who's been working on this!
>> 
>> Thanks for reading, any comments or suggestions on how to make LLVM
faster are welcome! I hope we'll see this graph going down this year :-)
>> 
>> Michael
>> 
> 
> This is great, thanks for the January update :)
> Do you mind to share how you collected the numbers (script etc.. and
> how you plotted the graph so I can try repeating at home with my
> testcases?)

It involved a lot of manual work, so I'm not sure there is anything to
share.
For the graph I just used LNT and some mad skills to mark the points of interest.
Then I checked out LLVM from the date I wanted to check (near jumps in the
graph), built it, and ran the test 20 times to verify the change and find
the commit responsible for it. As I said, it was a lot of manual work, but
we're working on some infrastructure to automate some of this.

Michael

PS: If you haven't used LNT before, then definitely try using it - at least
it'll take care of plotting graphs. If you need any guidance on this part, I
can try to help.
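
For anyone wanting to repeat the "run the test 20 times" step on their own test cases, here is a minimal sketch in Python. It is not the procedure or script used for the report above: the function names, the choice of the median as the summary statistic, and the example command are all assumptions made for illustration.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=20):
    """Run `cmd` (a list of arguments) `runs` times; return wall-clock seconds per run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def relative_change(before, after):
    """Percent change of median(after) relative to median(before).

    Using the median is an assumption of this sketch; it simply makes the
    comparison less sensitive to a few noisy runs.
    """
    base = statistics.median(before)
    return (statistics.median(after) - base) / base * 100.0


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. In practice this would be
    # the compiler invocation from each build, for example a hypothetical
    # ["/path/to/build/bin/clang++", "-Os", "-c", "tramp3d-v4.cpp"].
    noop = [sys.executable, "-c", "pass"]
    old = time_command(noop, runs=3)
    new = time_command(noop, runs=3)
    print(f"change: {relative_change(old, new):+.1f}%")
```

Running the same command list against the compiler built from the revision just before and just after a jump in the graph gives two sample sets to pass to `relative_change`.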

> 
> Thanks,
> 
> -- 
> Davide
> 
> "There are no solved problems; there are only problems that are more
> or less solved" -- Henri Poincare

Mehdi Amini via llvm-dev

2017-Jan-19 17:39 UTC

[llvm-dev] llvm is getting slower, January edition

Hi,

On this topic, I just tried to build ToT with clang-3.9.1 and clang-4.0 and the
total time to complete `ninja clang` on this machine went from 12m54s to 13m44s
for RelWithDebInfo (6.5% slower!) and 11m18s to 12m06s for Release (7% slower!).
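
The percentages above follow directly from the wall-clock times. As a small worked check (the helper names are made up for this sketch), 12m54s is 774 seconds and 13m44s is 824 seconds, so the slowdown is 50/774, about 6.5%; 11m18s to 12m06s is 48/678, about 7.1%.

```python
import re


def to_seconds(stamp):
    """Convert a duration written like '12m54s' into whole seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized duration: {stamp!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def slowdown_percent(before, after):
    """Percent increase in build time going from `before` to `after`."""
    base = to_seconds(before)
    return (to_seconds(after) - base) / base * 100.0


if __name__ == "__main__":
    for label, before, after in [
        ("RelWithDebInfo", "12m54s", "13m44s"),
        ("Release", "11m18s", "12m06s"),
    ]:
        print(f"{label}: {slowdown_percent(before, after):.1f}% slower")
```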

— 
Mehdi
> On Jan 17, 2017, at 6:02 PM, Mikhail Zolotukhin via llvm-dev <llvm-dev
at lists.llvm.org> wrote:
> 
> Hi,
> 
> Continuing recent efforts in understanding compile time slowdowns, I looked
at some historical data: I picked one test and tried to pin-point commits that
affected its compile-time. The data I have is not 100% accurate, but hopefully
it helps to provide an overview of what's going on with compile time in LLVM
and give a better understanding of what changes usually impact compile time.
> 
> Configuration:
> The test I used is tramp3d-v4 from LLVM testsuite. It consists of a single
source file, but still takes a noticeable time to compile, which makes it very
convenient for this kind of experiments. The file was compiled with Os for arm64
on x86 host.
> 
> Results:
> The attached PDF has a compile time graph, on which I marked points where
compile time changed with a list of corresponding commits. A textual version of
the list is available below, but I think it might be much harder to comprehend
the data without the graph. A number in the end shows compile time change after
the given commit:
> 
> 1. r239821: [InstSimplify] Allow folding of fdiv X, X with just NaNs
ignored. +1%
> 2. r241886: [InstCombine] Employ AliasAnalysis in FindAvailableLoadedValue.
+1%
> 3. r245118: [SCEV] Apply NSW and NUW flags via poison value analysis for
sub, mul and shl. +2%
> 4. r246694: [RemoveDuplicatePHINodes] Start over after removing a PHI. -1%
> 5. r247269: [ADT] Rewrite the StringRef::find implementation to be
simpler... +1%
>   r247240: [LPM] Use a map from analysis ID to immutable passes in the
legacy pass manager... +3%
>   r247264: Enable GlobalsAA by default. +1%
> 6. r247674: [GlobalsAA] Disable globals-aa by default. -1%
> 7. r248638: [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to
exploit trip counts'. +2%
> 8. r249802: [SCEV] Call `StrengthenNoWrapFlags` after `GroupByComplexity`;
NFCI. +4%
> 9. r250157: [GlobalsAA] Turn GlobalsAA on again by default. +1%
> 10. r251049: [SCEV] Mark AddExprs as nsw or nuw if legal. +23%
> 11. No data
> 12. r259252: AttributeSetImpl: Summarize existing function attributes in a
bitset. -1%
>    r259256: Add LoopSimplifyCFG pass. -2%
> 13. r262250: Enable LoopLoadElimination by default. +3%
> 14. r262839: Revert "Enable LoopLoadElimination by default". -3%
> 15. r263393: Remove PreserveNames template parameter from IRBuilder. -3%
> 16. r263595: Turn LoopLoadElimination on again. +3%
> 17. r267672: [LoopDist] Add llvm.loop.distribute.enable loop metadata. +4%
> 18. r268509: Do not disable completely loop unroll when optimizing for
size. -34%
> 19. r269124: Loop unroller: set thresholds for optsize and minsize
functions to zero. +50%
> 20. r269392: [LoopDist] Only run LAA for loops with the pragma. -4%
> 21. r270630: Re-enable "[LoopUnroll] Enable advanced unrolling
analysis by default" one more time. -28%
> 22. r270881: Don't allocate in APInt::slt.  NFC. -2%
>    r270959: Don't allocate unnecessarily in APInt::operator[+-].  NFC.
-1%
>    r271020: Don't generate unnecessary signed ConstantRange during
multiply.  NFC. -3%
> 23. r271615: [LoopUnroll] Set correct thresholds for new recently enabled
unrolling heuristic. +22%
> 24. r276942: Don't invoke getName() from Function::isIntrinsic(). -1%
>    r277087: Revert "Don't invoke getName() from
Function::isIntrinsic().", rL276942. +1%
> 25. r279585: [LoopUnroll] By default disable unrolling when optimizing for
size.
> 26. r286814: [InlineCost] Remove skew when calculating call costs. +3%
> 27. r289755: Make processing @llvm.assume more efficient by using operand
bundles. +6%
> 28. r290086: Revert @llvm.assume with operator bundles (r289755-r289757).
-6%
> <CompileTime.pdf>
> Disclaimer:
> The data is specific for this particular test, so I could have skipped some
commits affecting compile time on other workloads/configurations.
> The data I have is not perfect, so I could have skipped some commits, even
if they impacted compile-time on this test case.
> Same commits might have a different impact on a different
test/configuration, up to the opposite to the one listed.
> I didn't mean to label any commits as 'good' or 'bad'
by posting these numbers. It's expected that some commits increase compile
time, we just need to be aware of it and avoid unnecessary slowdowns.
> 
> Conclusions:
> Changes in optimization thresholds/cost-models usually have the biggest
impact on compile time. However, usually they are well-assessed and trade-offs
are discussed and agreed on.
> Introducing a pass doesn't necessarily mean a compile time slowdown.
Sometimes the total compile time might decrease because we're saving some
work for later passes.
> There are many commits, which individually have a low compile time impact,
but together sum up to a noticeable slowdown.
> Conscious efforts on reducing compile time definitely help - thanks
everyone who's been working on this!
> 
> Thanks for reading, any comments or suggestions on how to make LLVM faster
are welcome! I hope we'll see this graph going down this year :-)
> 
> Michael

Jeremy Lakeman via llvm-dev

2017-Jan-20 00:22 UTC

[llvm-dev] llvm is getting slower, January edition

Ah but how did you compile the clang-4.0 you were using? Does it run faster
if you compile it with clang-4.0? :)

On Fri, Jan 20, 2017 at 4:09 AM, Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> On this topic, I just tried to build ToT with clang-3.9.1 and clang-4.0
> and the total time to complete `ninja clang` on this machine went from
> 12m54s to 13m44s for RelWithDebInfo (6.5% slower!) and 11m18s to 12m06s for
> Release (7% slower!).
>
> —
> Mehdi
>
> > On Jan 17, 2017, at 6:02 PM, Mikhail Zolotukhin via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > Hi,
> >
> > Continuing recent efforts in understanding compile time slowdowns, I
> looked at some historical data: I picked one test and tried to pin-point
> commits that affected its compile-time. The data I have is not 100%
> accurate, but hopefully it helps to provide an overview of what's going
on
> with compile time in LLVM and give a better understanding of what changes
> usually impact compile time.
> >
> > Configuration:
> > The test I used is tramp3d-v4 from LLVM testsuite. It consists of a
> single source file, but still takes a noticeable time to compile, which
> makes it very convenient for this kind of experiments. The file was
> compiled with Os for arm64 on x86 host.
> >
> > Results:
> > The attached PDF has a compile time graph, on which I marked points
> where compile time changed with a list of corresponding commits. A textual
> version of the list is available below, but I think it might be much harder
> to comprehend the data without the graph. A number in the end shows compile
> time change after the given commit:
> >
> > 1. r239821: [InstSimplify] Allow folding of fdiv X, X with just NaNs ignored. +1%
> > 2. r241886: [InstCombine] Employ AliasAnalysis in FindAvailableLoadedValue. +1%
> > 3. r245118: [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl. +2%
> > 4. r246694: [RemoveDuplicatePHINodes] Start over after removing a PHI. -1%
> > 5. r247269: [ADT] Rewrite the StringRef::find implementation to be simpler... +1%
> >    r247240: [LPM] Use a map from analysis ID to immutable passes in the legacy pass manager... +3%
> >    r247264: Enable GlobalsAA by default. +1%
> > 6. r247674: [GlobalsAA] Disable globals-aa by default. -1%
> > 7. r248638: [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to exploit trip counts'. +2%
> > 8. r249802: [SCEV] Call `StrengthenNoWrapFlags` after `GroupByComplexity`; NFCI. +4%
> > 9. r250157: [GlobalsAA] Turn GlobalsAA on again by default. +1%
> > 10. r251049: [SCEV] Mark AddExprs as nsw or nuw if legal. +23%
> > 11. No data
> > 12. r259252: AttributeSetImpl: Summarize existing function attributes in a bitset. -1%
> >     r259256: Add LoopSimplifyCFG pass. -2%
> > 13. r262250: Enable LoopLoadElimination by default. +3%
> > 14. r262839: Revert "Enable LoopLoadElimination by default". -3%
> > 15. r263393: Remove PreserveNames template parameter from IRBuilder. -3%
> > 16. r263595: Turn LoopLoadElimination on again. +3%
> > 17. r267672: [LoopDist] Add llvm.loop.distribute.enable loop metadata. +4%
> > 18. r268509: Do not disable completely loop unroll when optimizing for size. -34%
> > 19. r269124: Loop unroller: set thresholds for optsize and minsize functions to zero. +50%
> > 20. r269392: [LoopDist] Only run LAA for loops with the pragma. -4%
> > 21. r270630: Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time. -28%
> > 22. r270881: Don't allocate in APInt::slt.  NFC. -2%
> >     r270959: Don't allocate unnecessarily in APInt::operator[+-].  NFC. -1%
> >     r271020: Don't generate unnecessary signed ConstantRange during multiply.  NFC. -3%
> > 23. r271615: [LoopUnroll] Set correct thresholds for new recently enabled unrolling heuristic. +22%
> > 24. r276942: Don't invoke getName() from Function::isIntrinsic(). -1%
> >     r277087: Revert "Don't invoke getName() from Function::isIntrinsic().", rL276942. +1%
> > 25. r279585: [LoopUnroll] By default disable unrolling when optimizing for size.
> > 26. r286814: [InlineCost] Remove skew when calculating call costs. +3%
> > 27. r289755: Make processing @llvm.assume more efficient by using operand bundles. +6%
> > 28. r290086: Revert @llvm.assume with operator bundles (r289755-r289757). -6%
> > <CompileTime.pdf>
> > Disclaimer:
> > - The data is specific to this particular test, so I could have skipped
> >   some commits affecting compile time on other workloads/configurations.
> > - The data I have is not perfect, so I could have skipped some commits,
> >   even if they impacted compile-time on this test case.
> > - The same commits might have a different impact on a different
> >   test/configuration, even the opposite of the one listed.
> > - I didn't mean to label any commits as 'good' or 'bad' by posting these
> >   numbers. It's expected that some commits increase compile time, we just
> >   need to be aware of it and avoid unnecessary slowdowns.
> >
> > Conclusions:
> > - Changes in optimization thresholds/cost-models usually have the biggest
> >   impact on compile time. However, usually they are well-assessed and
> >   trade-offs are discussed and agreed on.
> > - Introducing a pass doesn't necessarily mean a compile time slowdown.
> >   Sometimes the total compile time might decrease because we're saving
> >   some work for later passes.
> > - There are many commits that individually have a low compile time
> >   impact, but together add up to a noticeable slowdown.
> > - Conscious efforts on reducing compile time definitely help - thanks to
> >   everyone who's been working on this!
> >
> > Thanks for reading, any comments or suggestions on how to make LLVM
> > faster are welcome! I hope we'll see this graph going down this year :-)
> >
> > Michael
>
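A note on reading the per-commit percentages quoted above: they compound multiplicatively rather than adding up. The sketch below assumes each figure is relative to the compile time just before that commit, which the message does not state explicitly. The helper name `net_change` and the run of ten +1% commits are hypothetical and only for illustration; the -34% and +50% pair is taken from entries 18 and 19.

```python
from math import prod


def net_change(percent_changes):
    """Compound a sequence of percentage changes into one net percentage.

    Assumes each change is relative to the value just before it
    (an assumption, not something the quoted message confirms).
    """
    factor = prod(1 + p / 100 for p in percent_changes)
    return (factor - 1) * 100


# Entries 18 and 19 from the list: -34% followed by +50%.
unroll_pair = net_change([-34, 50])

# Ten hypothetical commits of +1% each (illustrative, not measured data).
small_steps = net_change([1] * 10)

print(f"-34% then +50%: {unroll_pair:+.1f}%")
print(f"ten +1% commits: {small_steps:+.1f}%")
```

Under that assumption, the -34%/+50% pair nets out to roughly -1% rather than +16%, and ten small +1% steps come to about +10.5%, which is one way to see how individually minor commits add up to a noticeable slowdown.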
