similar to: [LLVMdev] FDO (profile guided) and llvm

Displaying 20 results from an estimated 40000 matches similar to: "[LLVMdev] FDO (profile guided) and llvm"

2013 Jun 12
2
[LLVMdev] RFC - Profile Guided Optimization in LLVM
> > After the basic profile-based transformations are working, I would like to > add new sources of profile. Mainly, I am thinking of implementing Auto > FDO. > For those who are not familiar with what AutoFDO is -- AutoFDO was originally called Sample Based FDO. Its main author is Dehao Chen @google, and Robert Hundt is one of the main pushers of the technology at Google. The
2017 Jun 15
2
[RFC] Profile guided section layout
On Thu, Jun 15, 2017 at 2:33 PM, Xinliang David Li <xinliangli at gmail.com> wrote: > > > On Thu, Jun 15, 2017 at 2:30 PM, Sean Silva <chisophugis at gmail.com> wrote: > >> >> >> On Thu, Jun 15, 2017 at 11:09 AM, Xinliang David Li via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >>> >>> >>> On Thu, Jun
2017 Jun 15
2
[RFC] Profile guided section layout
On Thu, Jun 15, 2017 at 11:09 AM, Xinliang David Li via llvm-dev < llvm-dev at lists.llvm.org> wrote: > > > On Thu, Jun 15, 2017 at 10:55 AM, Michael Spencer via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> On Thu, Jun 15, 2017 at 10:08 AM, Tobias Edler von Koch < >> tobias at codeaurora.org> wrote: >> >>> Hi Michael,
2013 Jun 15
0
[LLVMdev] RFC - Profile Guided Optimization in LLVM
Apple folks are also gearing up to push on the PGO front. We are primarily interested in using instrumentation, rather than sampling, to collect profile info. However, I suspect the way the profile ends up being used in the various optimization and codegen passes will be largely similar. There is also some interest in pursuing profile-directed specialization. But that can wait. I think it makes
2016 Aug 17
5
AutoFDO sample profiles v. SelectInst,
On Fri, Aug 12, 2016 at 12:15 PM, Xinliang David Li via llvm-dev < llvm-dev at lists.llvm.org> wrote: > +dehao. > > There are two potential problems: > > 1) the branch gets eliminated in the binary that is being profiled, so > there is no profile data > This seems like a fundamental problem for PGO. Maybe it is also responsible for this bug:
2015 Dec 11
5
[LLVMdev] Path forward on profile guided inlining?
On Thu, Dec 10, 2015 at 4:51 PM, Philip Reames <listmail at philipreames.com> wrote: > > > On 12/10/2015 04:29 PM, Xinliang David Li wrote: >> >> On Thu, Dec 10, 2015 at 4:00 PM, Philip Reames >> <listmail at philipreames.com> wrote: >>> >>> Given I didn't get any response to my original query, I chose not to >>> invest
2015 Dec 11
2
[LLVMdev] Path forward on profile guided inlining?
On Thu, Dec 10, 2015 at 4:00 PM, Philip Reames <listmail at philipreames.com> wrote: > Given I didn't get any response to my original query, I chose not to invest > time in this at the time. I am unlikely to get time for this in the near > future. > > On 12/07/2015 03:13 PM, Easwaran Raman wrote: > > (Resending after removing llvmdev at cs.uiuc.edu and using >
2014 Apr 17
3
[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)
On Apr 17, 2014, at 11:09 AM, Xinliang David Li <xinliangli at gmail.com> wrote: > > On Thu, Apr 17, 2014 at 10:58 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: > > On 2014-Apr-17, at 10:38, Xinliang David Li <xinliangli at gmail.com> wrote: > > > > > Another idea is to use stack local counters per function -- synced up with global
2014 Apr 17
2
[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)
On 2014-Apr-17, at 10:38, Xinliang David Li <xinliangli at gmail.com> wrote: > > Another idea is to use stack-local counters per function -- synced up with global counters on entry and exit. The problem with it is that for deeply recursive calls, stack pressure can be too high. I think they'd need to be synced with global counters before function calls as well, since any function
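The stack-local counter idea discussed in this thread can be sketched roughly as follows. This is a hand-written illustration, not what `-fprofile-instr-generate` emits: the names `global_counters`, `flush_counters`, and `count_evens` are hypothetical, and the sketch flushes only on function exit. It does not handle the sync-before-call case raised in the reply.

```c
#include <stdatomic.h>
#include <stdint.h>

enum { NUM_COUNTERS = 2 };

/* Hypothetical shared profile counters, one slot per instrumented region. */
static _Atomic uint64_t global_counters[NUM_COUNTERS];

/* Add the local (per-invocation) counts to the shared counters, then reset. */
void flush_counters(uint64_t *local)
{
    for (int i = 0; i < NUM_COUNTERS; i++) {
        if (local[i] != 0) {
            atomic_fetch_add_explicit(&global_counters[i], local[i],
                                      memory_order_relaxed);
            local[i] = 0;
        }
    }
}

/* Example "instrumented" function: counts how often each branch is taken. */
int count_evens(const int *values, int n)
{
    uint64_t local[NUM_COUNTERS] = {0}; /* lives on this frame's stack */
    int evens = 0;

    for (int i = 0; i < n; i++) {
        if (values[i] % 2 == 0) {
            local[0]++; /* taken branch: no shared-memory traffic here */
            evens++;
        } else {
            local[1]++; /* not-taken branch */
        }
    }

    flush_counters(local); /* one shared update per counter, on exit */
    return evens;
}
```

The hot loop touches only stack memory, so threads do not contend on the shared counters until the flush. The cost is the extra stack space per frame, which is the stack-pressure concern mentioned for deep recursion.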
2015 Mar 24
3
[LLVMdev] RFC - Improvements to PGO profile support
On Tue, Mar 24, 2015 at 11:46 AM, Xinliang David Li <xinliangli at gmail.com> wrote: > On Tue, Mar 24, 2015 at 11:29 AM, Chandler Carruth <chandlerc at google.com> > wrote: > >> Sorry I haven't responded earlier, but one point here still doesn't make >> sense to me: >> >> On Tue, Mar 24, 2015 at 10:27 AM, Xinliang David Li <davidxl at
2015 Mar 24
2
[LLVMdev] RFC - Improvements to PGO profile support
On Tue, Mar 24, 2015 at 12:50 PM, Xinliang David Li <xinliangli at gmail.com> wrote: > On Tue, Mar 24, 2015 at 12:45 PM, Chandler Carruth <chandlerc at google.com> > wrote: > >> >> On Tue, Mar 24, 2015 at 11:46 AM, Xinliang David Li <xinliangli at gmail.com >> > wrote: >> >>> On Tue, Mar 24, 2015 at 11:29 AM, Chandler Carruth
2015 Dec 12
2
Memory utilization problems in profile reader
On Wed, Dec 9, 2015 at 12:14 PM, Xinliang David Li via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Can you extract the relevant part of the heap profile data? How large is > the sample profile data fed to the compiler? > > The indexed format profile size for clang is <100MB. The InstrProfRecord > for each function is read, used and discarded one at a time, so there
2015 Mar 24
2
[LLVMdev] RFC - Improvements to PGO profile support
Capping also leads to other kinds of problems -- e.g., sum of incoming edge count (callgraph) does not match the callee entry count etc. David On Tue, Mar 24, 2015 at 12:50 PM, Xinliang David Li <xinliangli at gmail.com> wrote: > > > On Tue, Mar 24, 2015 at 12:45 PM, Chandler Carruth <chandlerc at google.com> > wrote: > >> >> On Tue, Mar 24, 2015 at 11:46
2015 Mar 24
2
[LLVMdev] RFC - Improvements to PGO profile support
Example. Assuming the cap is 'C'

    void bar()
    {
      // ENTRY count is 4*C, after capping it becomes 'C'
      ...
    }

    void test()
    {
      // BB1: count(BB1) = C
      bar();

      // BB2: count(BB2) = C
      bar();
    }

    void test2()
    {
      // BB3: count(BB3) = C
      bar();

      // BB4: count(BB4) = C
      bar();
    }

What would the inliner see here? When it sees callsite1 -- it might mistaken that is the
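The arithmetic behind the capping example above can be checked with a small sketch. The helpers `cap_count` and `sum_capped` are hypothetical names introduced only for illustration, and the numeric cap in the usage note is an arbitrary assumption. Nothing here is taken from LLVM's profile code.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a raw profile count to the cap. */
uint64_t cap_count(uint64_t raw, uint64_t cap)
{
    return raw > cap ? cap : raw;
}

/* Sum the capped counts of all call sites that target one callee. */
uint64_t sum_capped(const uint64_t *raw_site_counts, int n, uint64_t cap)
{
    uint64_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += cap_count(raw_site_counts[i], cap);
    return sum;
}
```

With a cap of C = 1000, four call sites that each execute 1000 times give `sum_capped` = 4000, while the callee entry count of 4000 is capped to 1000. The incoming call-site counts no longer add up to the entry count, which is the inconsistency the message describes.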
2015 Dec 10
3
Memory utilization problems in profile reader
On Wed, Dec 9, 2015 at 3:14 PM, Xinliang David Li <xinliangli at gmail.com> wrote: > Can you extract the relevant part of the heap profile data? > It's all profile data, actually. The heap utilization is massively dominated by the profile reader. > How large is the sample profile data fed to the compiler? > > For this run, the input file was 21Mb. > The
2015 Dec 09
3
Memory utilization problems in profile reader
I've been experimenting with profiled bootstraps using sample profiles. Initially, I made stage2 build stage3 while running under Perf. This produced a 20Gb profile which took too long to convert to LLVM, and used ~30Gb of RAM. So, I decided that this was not going to be very useful for general usage. I then changed the bootstrap to instead run each individual compile under Perf. This
2014 Oct 27
2
[LLVMdev] Recent changes in -gmlt break sample profiling
On Fri, Oct 24, 2014 at 4:06 PM, Xinliang David Li <xinliangli at gmail.com> wrote: > Diego, > > I think sampleFDO needs to be designed in a way which can protect itself > from future breakage like this. The roots in the unnecessary dependency of > sample FDO on gmlt setting. It is totally reasonable to tune debug binary > size by changes like this. > > The right way
2015 Mar 24
2
[LLVMdev] RFC - Improvements to PGO profile support
Sorry I haven't responded earlier, but one point here still doesn't make sense to me: On Tue, Mar 24, 2015 at 10:27 AM, Xinliang David Li <davidxl at google.com> wrote: > Diego and I have discussed this according to the feedback received. We > have revised plan for this (see Diego's last reply). Here is a more > detailed re-cap: > > 1) keep MD_prof definition as
2015 Dec 17
2
RFC: Hotness thresholds in profile header
On Thu, Dec 17, 2015 at 9:21 AM, Andy Ayers <andya at microsoft.com> wrote: > While your bb count distribution is extremely likely to be some kind of power-law like distribution, it's not guaranteed. > > Also you might think about operations that can amplify (rerolling) or appear to amplify (TRE) or diminish BB counts, and how you'd go about reclassifying block hotness. yes
2015 May 27
2
[LLVMdev] FW: Capabilities of Clang's PGO (e.g. improving code density)
David, Yes, that is very helpful. Thanks! --randy From: Xinliang David Li [mailto:xinliangli at gmail.com] Sent: Wednesday, May 27, 2015 12:53 PM To: Randy Chapman Cc: Lee Hunt; llvmdev at cs.uiuc.edu Subject: Re: FW: [LLVMdev] Capabilities of Clang's PGO (e.g. improving code density) On Wed, May 27, 2015 at 12:40 PM, Randy Chapman <randyc at microsoft.com<mailto:randyc at