Displaying 20 results from an estimated 1600 matches similar to: "Showing hotness in LLVM optimization remarks using AutoFDO sampling profile data?"
2017 Jul 14
2
Next steps for optimization remarks?
> On Jul 14, 2017, at 10:22 AM, Davide Italiano <davide at freebsd.org> wrote:
>
> On Fri, Jul 14, 2017 at 10:10 AM, Adam Nemet <anemet at apple.com> wrote:
>>
>>
>> On Jul 14, 2017, at 8:21 AM, Davide Italiano via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> On Mon, Jun 19, 2017 at 4:13 PM, Brian
2017 Jun 19
8
Next steps for optimization remarks?
Hello all,
In https://www.youtube.com/watch?v=qq0q1hfzidg, Adam Nemet (cc'ed)
describes optimization remarks and some future plans for the project. I had
a few follow-up questions:
1. As an example of future work to be done, the talk mentions expanding the
set of optimization passes that emit remarks. However, the Clang User
Manual mentions that "optimization remarks do not really make
2017 Jul 14
3
Next steps for optimization remarks?
> On Jul 14, 2017, at 8:21 AM, Davide Italiano via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Mon, Jun 19, 2017 at 4:13 PM, Brian Gesiak via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Hello all,
>>
>> In https://www.youtube.com/watch?v=qq0q1hfzidg, Adam Nemet (cc'ed) describes
>>
2017 Jun 28
3
Next steps for optimization remarks?
> On Wed, Jun 28, 2017 at 8:13 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> I don't object to adding some kind of filtering option, but in general it won't help. An important goal here is to provide analysis (and other) tools to users that present this information at a higher level. The users won't, and shouldn't, know exactly what kinds of messages the tools use.
2017 Jun 27
2
Next steps for optimization remarks?
Adam, thanks for all the suggestions!
One nice aspect of the `-Rpass` family of options is that I can filter
based on what I want. If I only want to see which inlines I missed, I could
use `clang -Rpass-missed="inline"`, for example. On the other hand,
optimization remark YAML always includes remarks from all passes (as far as
I can tell), which increases the amount of time it takes
2016 May 04
4
Filter optimization remarks by the hotness of the code region
This idea came up a few times recently [1][2] so I’d like to start prototyping it. To summarize, we can emit optimization remarks using the -Rpass* options. These are currently emitted by optimizations like vectorization[3], unrolling, inlining and, since last week, loop distribution.
For large programs, however, this can amount to a lot of diagnostics output to sift through. Filtering this by the
2016 May 11
2
Filter optimization remarks by the hotness of the code region
Hi Hal,
> On May 10, 2016, at 5:39 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> Hi Adam,
>
> I think this would be a really useful feature to have. I don't think that the backend should be responsible for filtering, but should pass the relative hotness information to the frontend. Given that these diagnostics are not just going to be used for -Rpass and friends, but also
2016 May 11
4
Filter optimization remarks by the hotness of the code region
> On May 11, 2016, at 3:37 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ----- Original Message -----
>> From: "Adam Nemet" <anemet at apple.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "llvm-dev (llvm-dev at lists.llvm.org)" <llvm-dev at lists.llvm.org>
>> Sent: Wednesday, May 11, 2016 1:15:42 AM
2016 Jun 27
0
Filter optimization remarks by the hotness of the code region
> On May 11, 2016, at 10:45 AM, Adam Nemet via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>>
>> On May 11, 2016, at 3:37 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>> ----- Original Message -----
>>> From: "Adam Nemet" <anemet at apple.com>
>>>
2015 Oct 09
3
LLVM AutoFDO status
With recent bug fixes and performance tunings, AutoFDO at llvm has reached a
usable state. To evaluate performance, we used
O3/-fprofile-use/-fprofile-sample-use respectively to optimize clang
itself, and measure its speed.
clang built with -fprofile-use is ~20% faster than clang built with O3
clang built with -fprofile-sample-use is ~10% faster than clang built with
O3
AutoFDO can deliver 50%
2014 May 12
3
[LLVMdev] Questions about LLVM PGO and autoFDO
Hi, all
Recently I've been trying to use LLVM PGO and autoFDO. However, I have run into some problems in the process.
The LLVM source code was updated on April 9th. The operating system is SUSE x86_64.
1. Problems in instrumentation based PGO:
clang -O2 -fprofile-instr-generate test.c -o a.out
./a.out (then default.profraw is generated)
clang -O2 -fprofile-instr-use=default.profraw test.c -o a.out
2016 Aug 12
3
AutoFDO sample profiles v. SelectInst,
I am looking for advice on a problem observed with
-fprofile-sample-use for samples built with the AutoFDO tool
I took the "hmmer" benchmark out of SPEC2006
It is initially compiled
clang++ -o hmmer -O3 -std=gnu89 -DSPEC_CPU -DNDEBUG -fno-strict-aliasing -w -g *.c
This baseline binary runs in about 164.2 seconds as reported by "perf stat"
We build a sample file from this
2016 Aug 17
5
AutoFDO sample profiles v. SelectInst,
On Fri, Aug 12, 2016 at 12:15 PM, Xinliang David Li via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> +dehao.
>
> There are two potential problems:
>
> 1) the branch gets eliminated in the binary that is being profiled, so
> there is no profile data
>
This seems like a fundamental problem for PGO. Maybe it is also responsible
for this bug:
2019 Jul 30
2
ICE in release/9.x when using LLVM_ENABLE_MODULES
Thank you for the link and the suggestion to try master! I did so and
discovered that it reproduces on master for me as well. The repro
script I used (unchanged from before) and the output can be found
here: https://gist.github.com/modocache/d9700166067f4a155820bc57d9bee1f3
(Note that the output looks nearly identical, but it's using clang-10
from the master branch of llvm-project.)
I wonder
2020 Nov 19
0
[RFC] Control Flow Sensitive AutoFDO (FS-AFDO)
Hi Rong,
This is a very interesting proposal. We've also observed profile quality degradation from CFG-destructive passes like loop rotate, and I can see how this proposal would help improve the quality of the profile that drives later optimization passes in the pipeline. I have a few questions.
* How does this affect today's AutoFDO? Specifically, can users upgrade compiler with FS-AutoFDO
2018 Dec 15
4
Disabling LLVM_ATTRIBUTE_ALWAYS_INLINE for development?
Hello all!
I find that using lldb to debug LLVM libraries can be super
frustrating, because a lot of LLVM classes, like the constructor for
StringRef, are marked LLVM_ATTRIBUTE_ALWAYS_INLINE. So when I attempt
to have lldb evaluate an expression that implicitly instantiates a
StringRef, I get 'error: Couldn't lookup symbols:
__ZN4llvm9StringRefC1EPKc'.
As an example, most recently
2018 Jun 04
4
Mach-O support in lld: what are the known issues?
Hello all,
I'm trying to better understand the state of Mach-O support in lld.
The lld docs state that "the linker supports ELF (Unix), PE/COFF (Windows),
Mach-O (macOS) and WebAssembly in descending order of completeness." [1]
True to that statement, I found an email on this list from Jan 2018 stating
that "MachO support in lld is not really ready for real world usage. It was
2019 Dec 26
2
[RFC] Coroutines passes in the new pass manager
Hello all,
It's been a month since my previous email on the topic, and since then
I've done some initial work on porting the coroutines passes to the
new pass manager. In total there are 6 patches -- that's a lot to
review, so allow me to introduce the changes being made in each of
them.
# What's finished
In these first 6 patches, I focused on lowering coroutine intrinsics
2020 Feb 26
2
Why is lldb telling me "variable not available"?
Vedant, Jeremy,
Thanks a ton! I copied ASan's use of 'replaceDbgDeclare', and I think that worked!
https://github.com/modocache/llvm-project/commit/afbc04e1dcba has some
extremely quick and dirty changes I made (with no tests!), and a link
to a Gist with the LLVM IR and DWARF produced,
https://gist.github.com/modocache/6f29093ba2827946011b422ed3bd2903.
There's only one kink: the spot
2020 Jan 07
2
Let CallGraphSCCPass Use Function-Level Analysis
Hi Mikhail,
As Brian noted, stuff like this works better in the new pass manager.
Even in the old pass manager I thought it should work though.
Did you initialize the pass, via
`INITIALIZE_PASS_DEPENDENCY(PostDominatorTreeWrapperPass)`?
Did you require it, via
`AU.addRequired<PostDominatorTreeWrapperPass>();`?
Btw, may I ask what you are planning to do?
Cheers,
Johannes
On 01/07,