similar to: Profile-based inlining status


2016 Mar 09
3
PGO question
Hi, I have a question regarding PGO. I collected profile data with the instrumentation build (-fprofile-instr-generate) and provided for PGO optimization in the second build (with -fprofile-instr-use=xxx.profdata). This works fine. Then I tried to provide the profile data to opt using the option -pgo-instr-use, but this causes an error with the message: "Not an IR level instrumentation
2016 Mar 22
3
Instrumented BB in PGO
Hello, I have a question regarding PGO instrumented BBs (I use IR-level instrumentation). It seems that instrumented BBs do not match between the two compilations for profile-gen and profile-use for some cases. Here is an example from SPECcpu 2006 lbm (a simple case consisting of just two modules). In the first compilation, we have 5 instrumentation points for the main function as follows: $
2019 Sep 12
6
PGO is ineffective for Rust - but why?
Hi everyone, As part of my work for Mozilla's Low Level Tools team I've implemented PGO in the Rust compiler. The feature has been available since Rust 1.37 [1]. However, so far we have not seen any actual performance gains from enabling PGO for Rust code. Performance even seems to drop 1-3% with PGO enabled. I wonder why that is and I'm hoping that someone here might have experience
2016 May 11
4
Filter optimization remarks by the hotness of the code region
> On May 11, 2016, at 3:37 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > ----- Original Message ----- >> From: "Adam Nemet" <anemet at apple.com> >> To: "Hal Finkel" <hfinkel at anl.gov> >> Cc: "llvm-dev (llvm-dev at lists.llvm.org)" <llvm-dev at lists.llvm.org> >> Sent: Wednesday, May 11, 2016 1:15:42 AM
2016 Jun 03
5
The state of IRPGO (3 remaining work items)
On Thu, Jun 2, 2016 at 5:30 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > On Thu, Jun 2, 2016 at 2:51 PM, Frédéric Riss <friss at apple.com> wrote: > >> >> On Jun 2, 2016, at 12:10 AM, Sean Silva <chisophugis at gmail.com> wrote: >> >> >> >> On Wed, Jun 1, 2016 at 5:46 PM, Frédéric Riss <friss at apple.com> wrote:
2016 May 24
6
The state of IRPGO (3 remaining work items)
> On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com> wrote: > > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com> wrote: > Jake and I have been integrating IRPGO on PS4, and we've identified 3 remaining work items. > > Sean, thanks for the write up. It matches very well with what we think as well. + 1 > - Driver
2016 May 11
2
Filter optimization remarks by the hotness of the code region
Hi Hal, > On May 10, 2016, at 5:39 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > Hi Adam, > > I think this would be a really useful feature to have. I don't think that the backend should be responsible for filtering, but should pass the relative hotness information to the frontend. Given that these diagnostics are not just going to be used for -Rpass and friends, but also
2016 Jun 02
4
The state of IRPGO (3 remaining work items)
> On Jun 2, 2016, at 12:10 AM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Wed, Jun 1, 2016 at 5:46 PM, Frédéric Riss <friss at apple.com> wrote: > >> On Jun 1, 2016, at 1:46 PM, Sean Silva <chisophugis at gmail.com> wrote: >> >> >> >> On
2016 Jun 13
2
The state of IRPGO (3 remaining work items)
Quick update. I've gotten derailed from posting a patch for this due to focusing on higher priority PGO inlining work. No ETA. -- Sean Silva On Fri, Jun 3, 2016 at 6:06 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > On Thu, Jun 2, 2016 at 6:41 PM, Xinliang David Li <davidxl at google.com> > wrote: > >> >> >> On Thu, Jun 2, 2016 at 5:30
2016 May 25
0
The state of IRPGO (3 remaining work items)
On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com> wrote: > > > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com> > wrote: > > > > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com> > wrote: > > Jake and I have been integrating IRPGO on PS4, and we've identified 3 > remaining work
2016 Jun 01
4
The state of IRPGO (3 remaining work items)
> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com> wrote: > > > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com> wrote: > > > > On Mon, May 23, 2016 at
2016 Jun 03
2
The state of IRPGO (3 remaining work items)
> On Jun 2, 2016, at 5:30 PM, Sean Silva <chisophugis at gmail.com> wrote: > > This also means that if the consensus is that -fprofile-instr-generate should really change its meaning to mean IRPGO, I’m open to having this internal patch on our side. > > Yeah, it sounds like someone is going to have to keep a "private patch" to change the default. At that point
2016 Jun 02
2
The state of IRPGO (3 remaining work items)
> On Jun 1, 2016, at 1:46 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Tue, May 31, 2016 at 6:02 PM, Frédéric Riss <friss at apple.com> wrote: > >> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote: >> >> >> >> On
2014 May 12
3
[LLVMdev] Questions about LLVM PGO and autoFDO
Hi, all Recently I'm trying to use LLVM PGO and autoFDO. However I have some problems in the process. LLVM source code is updated on April 9th. Operating system is SUSE x86_64 1. Problems in instrumentation based PGO: clang -O2 -fprofile-instr-generate test.c -o a.out ./a.out (then default.profraw is generated) clang -O2 -fprofile-instr-use=default.profraw test.c -o a.out
2016 May 25
2
The state of IRPGO (3 remaining work items)
It sounds to me we are likely to converge on the following: 1) Making IR/llvm based PGO the default; 2) Enhance -fcoverage-mapping such that it automatically turns on FE based instrumentation 3) if -fcoverage-mapping is used together with -fprofile-instr-generate, -fcoverage-mapping serves as a switch to turn on FE based instrumentation All the above are transparent to users. The following are
2018 Feb 06
2
Current PGO status
Hello David, thanks for the detailed response! Do you have any tests that you use to measure the PGO effectiveness? I have tested clang version 6.0 with the same sample that Jie Chen used in 2016 and actually both frontend-based PGO and IR-based make code run slower, see the average time: clang++ -O3: 3.15 sec clang++ -O3 and -fprofile-instr-use: 3.160 sec clang++ -O3 and -fprofile-use: 3.180 sec
2016 Jun 27
2
The state of IRPGO (3 remaining work items)
On Mon, Jun 27, 2016 at 2:53 PM Xinliang David Li <davidxl at google.com> wrote: > There is some misunderstanding about the intention of this flag. The > purpose of the flag is not to turn on profile instrumentation (which > already has -fprofile-instr-generate or -fprofile-generate for it), but to > select which instrumentors to use for PGO (IR or FE). I prefer fewer flags
2018 Feb 07
2
Current PGO status
David, could you please clarify on which code did you gain 10% improvement? I have run numerous tests with and w/o this option and it looks like it has no effect on performance (I am talking of the old 2016 sample to be concrete). Maybe we could investigate it together? Just tell me where to start? On 02/07/2018 02:11 AM, Xinliang David Li wrote: > Victor, thanks for the experiment. > >
2016 May 24
0
The state of IRPGO (3 remaining work items)
On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com> wrote: > Jake and I have been integrating IRPGO on PS4, and we've identified 3 > remaining work items. > Sean, thanks for the write up. It matches very well with what we think as well. > > > - Driver changes > > We'd like to make IRPGO the default on PS4. We also think that it would be
2018 Feb 06
0
Current PGO status
Victor, thanks for the experiment. My suspicion is it is due to the remaining issues with block layout -- especially with loop rotation (with PGO). Another problem is that tail dup is not happening after loop rotation which can limit the effectiveness of loop rotation. I tried the internal option -mllvm -force-precise-rotation-cost and there is about 10% speedup with -fprofile-use. This option