thr3ads.net - llvm dev - [LLVMdev] RFC - Profile Guided Optimization in LLVM [Jun 2013]


Eric Christopher

2013-Jun-18 21:48 UTC

[LLVMdev] RFC - Profile Guided Optimization in LLVM

On Tue, Jun 18, 2013 at 11:19 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>
> On Jun 17, 2013, at 6:54 AM, Diego Novillo <dnovillo at google.com> wrote:
>
>> On 2013-06-15 14:18 , Evan Cheng wrote:
>>> Apple folks are also gearing up to push on the PGO front. We are
>>> primarily interested in using instrumentation, rather than sampling, to
>>> collect profile info. However, I suspect the way the profile ends up being
>>> used in the various optimization and codegen passes will be largely similar.
>>>
>>
>> Excellent!  We are initially interested in instrumentation as well.
>> This is where we draw most of our performance with GCC. Sampling is showing
>> a lot of promise, however, and it really is not much different from
>> instrumentation.  Most of what changes is the source of the profile data.
>>
>>> There is also some interest in pursuing profile-directed specialization,
>>> but that can wait. I think it makes sense for us to get together and
>>> discuss our plans to make sure there won't be duplication of effort.
>>
>> Sure. My initial plan is fairly simple: triage the existing
>> instrumentation code and see what needs fixing.  I'm starting this in the
>> next week or so.  What are your plans?
>
> I've been working on prototyping a new approach to instrumentation for
> both PGO and code coverage testing.  I want to use the same data for both
> uses, and for code coverage we really need accurate source location info,
> including column positions and the starting and ending locations for every
> block of code.  Working backward from the debug info to find source
> locations hasn't worked very well.  The debug info just doesn't have
> enough detailed information.

I'm curious what problems you've had. You've surely not mentioned them
before. Note that I'm not saying that I think this is the best method,
but I'm curious what problems you've had and what you're running into.

> Instead, I am planning to insert the instrumentation code in the front-end.
> The raw profile data in this scheme is tied to source locations.  One
> disadvantage is that it conflicts with the goal of having graceful
> degradation.  Simply adding a blank line invalidates all the following
> source locations.  My plan is to ignore any profile data whenever the source
> changes in any way.  We had some discussions about this, and it is easy to
> come up with cases where even a trivial change to the source, e.g., enabling
> a debugging flag, causes massive changes in the profile data.  I don't think
> graceful degradation should even be a goal.  It is too hard to sort out
> insignificant changes from those that should invalidate all the profile data.
>
Even a comment?

-eric

Bob Wilson

2013-Jun-18 22:34 UTC

On Jun 18, 2013, at 2:48 PM, Eric Christopher <echristo at gmail.com> wrote:
> On Tue, Jun 18, 2013 at 11:19 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>>
>> On Jun 17, 2013, at 6:54 AM, Diego Novillo <dnovillo at google.com> wrote:
>>
>>> On 2013-06-15 14:18 , Evan Cheng wrote:
>>>> Apple folks are also gearing up to push on the PGO front. We are
>>>> primarily interested in using instrumentation, rather than sampling, to
>>>> collect profile info. However, I suspect the way the profile ends up
>>>> being used in the various optimization and codegen passes will be
>>>> largely similar.
>>>>
>>>
>>> Excellent!  We are initially interested in instrumentation as well.
>>> This is where we draw most of our performance with GCC. Sampling is
>>> showing a lot of promise, however, and it really is not much different
>>> from instrumentation.  Most of what changes is the source of the profile
>>> data.
>>>
>>>> There is also some interest in pursuing profile-directed
>>>> specialization, but that can wait. I think it makes sense for us to get
>>>> together and discuss our plans to make sure there won't be duplication
>>>> of effort.
>>>
>>> Sure. My initial plan is fairly simple: triage the existing
>>> instrumentation code and see what needs fixing.  I'm starting this in the
>>> next week or so.  What are your plans?
>> 
>> I've been working on prototyping a new approach to instrumentation for
>> both PGO and code coverage testing.  I want to use the same data for both
>> uses, and for code coverage we really need accurate source location info,
>> including column positions and the starting and ending locations for every
>> block of code.  Working backward from the debug info to find source
>> locations hasn't worked very well.  The debug info just doesn't have
>> enough detailed information.
> 
> I'm curious what problems you've had. You've surely not mentioned them
> before. Note that I'm not saying that I think this is the best method,
> but I'm curious what problems you've had and what you're running into.

The main issue is that I want precise begin/end source locations for each
statement.  Debug info doesn't do that.  Even if we enable column
information, debug info just gives you one source location for each statement. 
That's not a problem with debug info -- it's just not what it was
intended to be used for.

There's also the issue that I want PGO instrumentation to work the same
regardless of the debug info setting.
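
To make the begin/end requirement concrete, here is a minimal Python sketch of what per-statement counters keyed by a full source range might look like, and how one set of counts could serve both coverage and PGO. This is only an illustration under stated assumptions, not the prototype described in this message; the names `SourceRange`, `RegionCounter`, `uncovered`, and `hottest` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRange:
    """Hypothetical begin/end location of one statement (1-based, inclusive)."""
    start_line: int
    start_col: int
    end_line: int
    end_col: int

    def contains(self, line: int, col: int) -> bool:
        # Tuple comparison orders positions by line first, then column.
        begin = (self.start_line, self.start_col)
        end = (self.end_line, self.end_col)
        return begin <= (line, col) <= end


@dataclass
class RegionCounter:
    """One execution counter attached to one source range."""
    region: SourceRange
    count: int = 0


def uncovered(counters):
    """Coverage view: regions that never executed."""
    return [c.region for c in counters if c.count == 0]


def hottest(counters):
    """PGO view: the region with the highest execution count."""
    return max(counters, key=lambda c: c.count).region
```

A single line/column pair, which is what debug info records per statement, would be enough for `hottest` but not for `contains` or for highlighting the exact uncovered text.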
> 
>> Instead, I am planning to insert the instrumentation code in the
>> front-end.  The raw profile data in this scheme is tied to source
>> locations.  One disadvantage is that it conflicts with the goal of having
>> graceful degradation.  Simply adding a blank line invalidates all the
>> following source locations.  My plan is to ignore any profile data whenever
>> the source changes in any way.  We had some discussions about this, and it
>> is easy to come up with cases where even a trivial change to the source,
>> e.g., enabling a debugging flag, causes massive changes in the profile
>> data.  I don't think graceful degradation should even be a goal.  It is too
>> hard to sort out insignificant changes from those that should invalidate
>> all the profile data.
>> 
> 
> Even a comment?

Yes.  Obviously you could do that, but it would add significant complexity and
little real value.
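
The strict policy above (ignore the profile if the source changed in any way) could be sketched as a fingerprint check. The thread does not say how changes would be detected, so the content hash here is an assumption, and `source_fingerprint`, `make_profile`, and `usable_counts` are hypothetical names used only for illustration.

```python
import hashlib


def source_fingerprint(source_text: str) -> str:
    """Hash the exact source bytes; any edit at all changes the result."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()


def make_profile(source_text: str, counts: dict) -> dict:
    """Bundle raw counts with the fingerprint of the source they came from."""
    return {"fingerprint": source_fingerprint(source_text),
            "counts": dict(counts)}


def usable_counts(profile: dict, current_source: str):
    """Return the counts only if the source is byte-for-byte unchanged."""
    if profile["fingerprint"] != source_fingerprint(current_source):
        return None  # stale profile: ignore it entirely
    return profile["counts"]
```

Under this sketch a blank line or an added comment discards the profile just as a real code change does. Tolerating comment-only edits would mean hashing something other than the raw text, for example a token stream, which is the extra complexity being weighed here.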

Eric Christopher

2013-Jun-18 22:41 UTC

> The main issue is that I want precise begin/end source locations for each
> statement.  Debug info doesn't do that.  Even if we enable column
> information, debug info just gives you one source location for each
> statement.  That's not a problem with debug info -- it's just not what it
> was intended to be used for.
>
True... somewhat. Precise begin/end source locations exist and the
debug info will map them back to regions of code that are associated
with a particular line and column - this should also change when the
statement changes. Anything else should probably be a bug? The only
difference I can see is that there's no concrete "done with this
statement" in the code. It might take some processing work to get "all
statements that executed between these source ranges" because of code
motion effects, but you should be able to get a concrete range.
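
As a rough illustration of the processing mentioned above, the following Python sketch takes a simplified line table (rows of address, line, column) and collects the address ranges attributed to a given source range. The row layout and the name `address_ranges_for` are assumptions made for this example; a real DWARF line table carries more state than this.

```python
def address_ranges_for(rows, end_address, begin, end):
    """Collect [start, stop) address ranges whose line-table row falls
    inside the source range begin..end (both (line, col), inclusive).

    rows: list of (address, line, col) tuples sorted by address.
    end_address: address just past the last instruction covered by rows.
    """
    ranges = []
    for i, (addr, line, col) in enumerate(rows):
        # Each row covers the addresses up to the next row's address.
        next_addr = rows[i + 1][0] if i + 1 < len(rows) else end_address
        if begin <= (line, col) <= end:
            if ranges and ranges[-1][1] == addr:
                # Extend the previous range when the rows are adjacent.
                ranges[-1] = (ranges[-1][0], next_addr)
            else:
                ranges.append((addr, next_addr))
    return ranges
```

Because of code motion, one statement can come back as several disjoint address ranges, which is why a lookup in this direction needs a pass over the whole table rather than a single entry.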

> There's also the issue that I want PGO instrumentation to work the same
> regardless of the debug info setting.

Basically it's going to be "line-tables-only" :)

Of course, I'm also not signing up to do the work, just curious what
the limitations are in the infrastructure here.
>
>>
>> Even a comment?
>
> Yes.  Obviously you could do that, but it would add significant complexity
> and little real value.

Mmm.. I think you're underestimating the amount of usefulness here,
but we can revisit when there's some code. :)

-eric
