Displaying 20 results from an estimated 4000 matches similar to: "Profile-based inlining status"
2016 Mar 09
3
PGO question
Hi,
I have a question regarding PGO.
I collected profile data with the instrumentation build
(-fprofile-instr-generate) and provided it for PGO optimization in the second
build (with -fprofile-instr-use=xxx.profdata). This works fine.
Then I tried to provide the profile data to opt using the option
-pgo-instr-use, but this causes an error with the message: "Not an IR level
instrumentation
2016 Mar 22
3
Instrumented BB in PGO
Hello,
I have a question regarding PGO instrumented BBs (I use IR-level
instrumentation).
It seems that instrumented BBs do not match between the two compilations
for profile-gen and profile-use for some cases. Here is an example from
SPECcpu 2006 lbm (a simple case consisting of just two modules).
In the first compilation, we have 5 instrumentation points for the main
function as follows:
$
2019 Sep 12
6
PGO is ineffective for Rust - but why?
Hi everyone,
As part of my work for Mozilla's Low Level Tools team I've
implemented PGO in the Rust compiler. The feature is
available since Rust 1.37 [1]. However, so far we have not
seen any actual performance gains from enabling PGO for
Rust code. Performance even seems to drop 1-3% with PGO
enabled. I wonder why that is and I'm hoping that someone
here might have experience
2016 May 11
4
Filter optimization remarks by the hotness of the code region
> On May 11, 2016, at 3:37 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ----- Original Message -----
>> From: "Adam Nemet" <anemet at apple.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "llvm-dev (llvm-dev at lists.llvm.org)" <llvm-dev at lists.llvm.org>
>> Sent: Wednesday, May 11, 2016 1:15:42 AM
2016 Jun 03
5
The state of IRPGO (3 remaining work items)
On Thu, Jun 2, 2016 at 5:30 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
> On Thu, Jun 2, 2016 at 2:51 PM, Frédéric Riss <friss at apple.com> wrote:
>
>>
>> On Jun 2, 2016, at 12:10 AM, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>
>>
>> On Wed, Jun 1, 2016 at 5:46 PM, Frédéric Riss <friss at apple.com> wrote:
2016 May 24
6
The state of IRPGO (3 remaining work items)
> On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com> wrote:
>
> On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com> wrote:
> Jake and I have been integrating IRPGO on PS4, and we've identified 3 remaining work items.
>
> Sean, thanks for the write up. It matches very well with what we think as well.
+ 1
> - Driver
2016 May 11
2
Filter optimization remarks by the hotness of the code region
Hi Hal,
> On May 10, 2016, at 5:39 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> Hi Adam,
>
> I think this would be a really useful feature to have. I don't think that the backend should be responsible for filtering, but should pass the relative hotness information to the frontend. Given that these diagnostics are not just going to be used for -Rpass and friends, but also
2016 Jun 02
4
The state of IRPGO (3 remaining work items)
> On Jun 2, 2016, at 12:10 AM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
>
> On Wed, Jun 1, 2016 at 5:46 PM, Frédéric Riss <friss at apple.com> wrote:
>
>> On Jun 1, 2016, at 1:46 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>
>>
>> On
2016 Jun 13
2
The state of IRPGO (3 remaining work items)
Quick update. I've gotten derailed from posting a patch for this due to
focusing on higher priority PGO inlining work. No ETA.
-- Sean Silva
On Fri, Jun 3, 2016 at 6:06 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
> On Thu, Jun 2, 2016 at 6:41 PM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>>
>>
>> On Thu, Jun 2, 2016 at 5:30
2016 May 25
0
The state of IRPGO (3 remaining work items)
On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com> wrote:
>
> > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com>
> wrote:
> >
> > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com>
> wrote:
> > Jake and I have been integrating IRPGO on PS4, and we've identified 3
> remaining work
2016 Jun 01
4
The state of IRPGO (3 remaining work items)
> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
>
> On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com> wrote:
>
> > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com> wrote:
> >
> > On Mon, May 23, 2016 at
2016 Jun 03
2
The state of IRPGO (3 remaining work items)
> On Jun 2, 2016, at 5:30 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
> This also means that if the consensus is that -fprofile-instr-generate should really change its meaning to mean IRPGO, I’m open to having this internal patch on our side.
>
> Yeah, it sounds like someone is going to have to keep a "private patch" to change the default. At that point
2016 Jun 02
2
The state of IRPGO (3 remaining work items)
> On Jun 1, 2016, at 1:46 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
>
> On Tue, May 31, 2016 at 6:02 PM, Frédéric Riss <friss at apple.com> wrote:
>
>> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>
>>
>> On
2014 May 12
3
[LLVMdev] Questions about LLVM PGO and autoFDO
Hi, all
Recently I've been trying to use LLVM PGO and autoFDO. However, I have run into some problems in the process.
The LLVM source code was updated on April 9th. The operating system is SUSE x86_64.
1. Problems in instrumentation based PGO:
clang -O2 -fprofile-instr-generate test.c -o a.out
./a.out (then default.profraw is generated)
clang -O2 -fprofile-instr-use=default.profraw test.c -o a.out
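For comparison, here is a minimal sketch of the instrumentation-based flow as the current Clang documentation describes it, which adds an `llvm-profdata merge` step between the training run and the second build. The excerpt above is truncated, so whether the missing merge step is the problem being reported is not known. The helper name `pgo_commands` and the merged file name `default.profdata` are assumptions made for illustration; the script only prints the commands and does not run them.

```python
import shlex


def pgo_commands(source="test.c", output="a.out",
                 raw="default.profraw", merged="default.profdata"):
    """Return the four commands of an instrumentation-based PGO cycle.

    Hypothetical helper: the file names default to those in the excerpt,
    except `merged`, which is an assumed name for the indexed profile.
    """
    return [
        # 1. Build with instrumentation.
        ["clang", "-O2", "-fprofile-instr-generate", source, "-o", output],
        # 2. Run the instrumented binary to produce the raw profile.
        [f"./{output}"],
        # 3. Convert the raw profile into the indexed format.
        ["llvm-profdata", "merge", f"-output={merged}", raw],
        # 4. Rebuild using the indexed profile.
        ["clang", "-O2", f"-fprofile-instr-use={merged}", source, "-o", output],
    ]


if __name__ == "__main__":
    for cmd in pgo_commands():
        print(shlex.join(cmd))
```

Keeping each command as an argument list rather than a single string makes it straightforward to pass to `subprocess.run` later without shell quoting issues.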
2016 May 25
2
The state of IRPGO (3 remaining work items)
It sounds to me we are likely to converge on the following:
1) Making IR/llvm based PGO the default;
2) Enhance -fcoverage-mapping such that it automatically turns on FE based
instrumentation
3) if -fcoverage-mapping is used together with -fprofile-instr-generate,
-fcoverage-mapping serves as a switch to turn on FE based instrumentation
All the above are transparent to users.
The following are
2018 Feb 06
2
Current PGO status
Hello David, thanks for the detailed response!
Do you have any tests that you use to measure the PGO effectiveness? I
have tested clang version 6.0 with the same sample that Jie Chen used in
2016 and actually both frontend-based PGO and IR-based make code run
slower, see the average time:
clang++ -O3: 3.15 sec
clang++ -O3 and -fprofile-instr-use: 3.160 sec
clang++ -O3 and -fprofile-use: 3.180 sec
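To put the three averages above in proportion, the relative slowdown against the plain -O3 build can be computed directly. This is only a worked-arithmetic sketch: the function name `slowdown_percent` is made up for illustration, and the timings are the ones quoted in the message. The differences come out at roughly 0.3% and 1.0%, which may well be within run-to-run noise (the excerpt does not give variance).

```python
def slowdown_percent(baseline, measured):
    """Relative slowdown of `measured` against `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# Average times in seconds, as quoted in the message.
timings = {
    "-O3": 3.15,
    "-O3 -fprofile-instr-use": 3.160,
    "-O3 -fprofile-use": 3.180,
}

if __name__ == "__main__":
    baseline = timings["-O3"]
    for flags, seconds in timings.items():
        print(f"{flags}: {slowdown_percent(baseline, seconds):+.2f}%")
```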
2016 Jun 27
2
The state of IRPGO (3 remaining work items)
On Mon, Jun 27, 2016 at 2:53 PM Xinliang David Li <davidxl at google.com>
wrote:
> There is some misunderstanding about the intention of this flag. The
> purpose of the flag is not to turn on profile instrumentation (which
> already has -fprofile-instr-generate or -fprofile-generate for it), but to
> select which instrumentors to use for PGO (IR or FE). I prefer fewer flags
2018 Feb 07
2
Current PGO status
David, could you please clarify on which code you gained the 10%
improvement? I have run numerous tests with and w/o this option and it
looks like it has no effect on performance (I am talking about the old 2016
sample, to be concrete). Maybe we could investigate it together? Just
tell me where to start.
On 02/07/2018 02:11 AM, Xinliang David Li wrote:
> Victor, thanks for the experiment.
>
>
2016 May 24
0
The state of IRPGO (3 remaining work items)
On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com> wrote:
> Jake and I have been integrating IRPGO on PS4, and we've identified 3
> remaining work items.
>
Sean, thanks for the write up. It matches very well with what we think as
well.
>
>
> - Driver changes
>
> We'd like to make IRPGO the default on PS4. We also think that it would be
2018 Feb 06
0
Current PGO status
Victor, thanks for the experiment.
My suspicion is it is due to the remaining issues with block layout --
especially with loop rotation (with PGO). Another problem is that tail dup
is not happening after loop rotation which can limit the effectiveness of
loop rotation.
I tried the internal option -mllvm -force-precise-rotation-cost and there
is about 10% speedup with -fprofile-use. This option