similar to: Using PGO and -O3

Displaying 20 results from an estimated 4000 matches similar to: "Using PGO and -O3"

2018 Jan 29
0
Using PGO and -O3
It means using PGO with -O2 and above (including -O3). David On Sun, Jan 28, 2018 at 6:48 PM, Victor Leschuk via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hello all, > > clang-related PGO documentation recommends using PGO with -O2 (for > example: > https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization). > The question is: is there any reason why
2018 Jan 31
1
Using PGO and -O3
Maybe we should update the documentation to state this directly? Currently it's a little bit confusing. On 01/29/2018 05:51 AM, Xinliang David Li wrote: > It means using PGO with -O2 and above (including -O3). > > David > > On Sun, Jan 28, 2018 at 6:48 PM, Victor Leschuk via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > >
2018 Feb 07
2
Current PGO status
David, could you please clarify on which code you gained the 10% improvement? I have run numerous tests with and without this option, and it looks like it has no effect on performance (to be concrete, I am talking about the old 2016 sample). Maybe we could investigate it together? Just tell me where to start. On 02/07/2018 02:11 AM, Xinliang David Li wrote: > Victor, thanks for the experiment. > >
2018 Feb 07
0
Current PGO status
Victor, please file a bug tracking the issue. We can put relevant information there, including the test cases used in the experiment. Thanks, David On Wed, Feb 7, 2018 at 2:15 PM, Victor Leschuk <vleschuk at accesssoftek.com> wrote: > David, could you please clarify on which code you gained the 10% > improvement? I have run numerous tests with and without this option, and it > looks
2018 Feb 06
2
Current PGO status
Hello David, thanks for the detailed response! Do you have any tests that you use to measure PGO effectiveness? I have tested clang version 6.0 with the same sample that Jie Chen used in 2016, and both frontend-based and IR-based PGO actually make the code run slower. See the average times: clang++ -O3: 3.15 sec; clang++ -O3 and -fprofile-instr-use: 3.160 sec; clang++ -O3 and -fprofile-use: 3.180 sec
2018 Feb 06
0
Current PGO status
Victor, thanks for the experiment. My suspicion is it is due to the remaining issues with block layout -- especially with loop rotation (with PGO). Another problem is that tail dup is not happening after loop rotation which can limit the effectiveness of loop rotation. I tried the internal option -mllvm -force-precise-rotation-cost and there is about 10% speedup with -fprofile-use. This option
2018 Feb 05
0
Current PGO status
On Sun, Feb 4, 2018 at 9:59 PM, Victor Leschuk <vleschuk at accesssoftek.com> wrote: > Hello David! > > I have recently started getting acquainted with PGO in LLVM/clang and found > your e-mail thread: > http://lists.llvm.org/pipermail/llvm-dev/2016-May/099395.html . Here you > posted a nice list of optimizations that use profiling and of those > which could be using but
2018 Feb 05
3
Current PGO status
Hello David! I have recently started getting acquainted with PGO in LLVM/clang and found your e-mail thread: http://lists.llvm.org/pipermail/llvm-dev/2016-May/099395.html . Here you posted a nice list of optimizations that use profiling and of those which could be using it but don't. However, that thread is about 2 years old. Could you please kindly let me know if there were any significant changes in
2018 Feb 26
1
Current PGO status
Hello David and all involved =) On 02/05/2018 09:38 AM, Xinliang David Li wrote: > ThinLTO also works well with PGO. Could you please let me know if there are any problems which prevent using PGO with FullLTO? Thanks in advance! -- Best Regards, Victor Leschuk | Software Engineer | Access Softek
2015 May 27
4
[LLVMdev] Capabilities of Clang's PGO (e.g. improving code density)
Hello - I'm an engineer in Microsoft Office looking into possible advantages of using PGO for our Android applications. We at Microsoft have deep experience with Visual C++'s Profile Guided Optimization <https://msdn.microsoft.com/en-us/library/e7k32f4k.aspx> and often see a 10% or more reduction in the size of application code loaded after using PGO for key scenarios (e.g.
2015 May 27
3
[LLVMdev] Capabilities of Clang's PGO (e.g. improving code density)
Thanks! CIL [LeeHu] for a few comments… From: Xinliang David Li [mailto:xinliangli at gmail.com] Sent: Wednesday, May 27, 2015 9:29 AM To: Lee Hunt Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Capabilities of Clang's PGO (e.g. improving code density) On Tue, May 26, 2015 at 8:47 PM, Lee Hunt <leehu at exchange.microsoft.com> wrote:
2017 Aug 20
3
Buildmaster restart 08.20.2017
Hello everyone, LLVM buildmasters (both main and staging) will be restarted in 2 hours (~3:00 AM PDT). -- Best Regards, Victor Leschuk | Software Engineer | Access Softek
2018 Jul 12
2
debug_rnglists status
Hi Victor, The work Wolfgang is doing should get us to the "minimum syntactically correct DWARF v5" stage, which we really wanted to have for LLVM 7.0. That is, once we have .debug_rnglists and .debug_loclists done, you can ask for DWARF 5 and get something that conforms to the spec. However, it won't conform if you ask for type units (I'm working on that) or split DWARF. If
2015 May 27
1
[LLVMdev] FW: Capabilities of Clang's PGO (e.g. improving code density)
Hi David! Thanks again for your help! I was wondering if you could clarify one thing for me? I find mention of “hot arc” optimization (-fprofile-arcs), but I’m unclear if this is the same thing. Does Clang PGO do block reordering? It does reordering, but does not do splitting/partitioning. I take this to mean that PGO does block reordering within the function? I don’t see that the clang
2015 May 27
2
[LLVMdev] FW: Capabilities of Clang's PGO (e.g. improving code density)
David, Yes, that is very helpful. Thanks! --randy From: Xinliang David Li [mailto:xinliangli at gmail.com] Sent: Wednesday, May 27, 2015 12:53 PM To: Randy Chapman Cc: Lee Hunt; llvmdev at cs.uiuc.edu Subject: Re: FW: [LLVMdev] Capabilities of Clang's PGO (e.g. improving code density) On Wed, May 27, 2015 at 12:40 PM, Randy Chapman <randyc at microsoft.com<mailto:randyc at
2015 Jul 17
2
[LLVMdev] LLVM instrumentation
PGO was my first guess, but I can get a lot of information. At first, I followed the explanation at http://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation but instead of llvm-profdata merge, I used llvm-profdata show *.profraw. Sadly, the information I get is the total number of functions, the maximum function count and the maximum internal block count. Do you know if you
2015 May 27
0
[LLVMdev] FW: Capabilities of Clang's PGO (e.g. improving code density)
Yes, thanks David! For the intra-procedural basic block reordering, do you have any data on how much speed improvement it gives in any perf tests you’ve measured? I’m thinking this may speed up things like application launch by a couple of percent. For perf-intensive code (e.g. spreadsheet recalc), I would expect more. From: Randy Chapman Sent: Wednesday, May 27, 2015
2018 Jul 12
2
debug_rnglists status
Hello Wolfgang and team, I see that you are working on support for .debug_rnglists. I am interested in the feature too. Could you please point out what else is left to be done so that I could help you? -- Best Regards, Victor Leschuk | Software Engineer | Access Softek
2015 May 22
0
[LLVMdev] RFC - Improvements to PGO profile support
On Fri, May 22, 2015 at 11:16 AM, Dario Domizioli <dario.domizioli at gmail.com> wrote: > Hi all, > > I am a bit confused about the documentation of the format of the profile > data file. > > The Clang user guide here describes it as an ASCII text file: > http://clang.llvm.org/docs/UsersManual.html#sample-profile-format > > Whereas the posts above and the
2015 May 22
2
[LLVMdev] RFC - Improvements to PGO profile support
Hi all, I am a bit confused about the documentation of the format of the profile data file. The Clang user guide here describes it as an ASCII text file: http://clang.llvm.org/docs/UsersManual.html#sample-profile-format Whereas the posts above and the referenced link describe it as a stream of bytes containing LEB128s: http://www.llvm.org/docs/CoverageMappingFormat.html From experimenting