Eric Christopher
2013-Jun-18 21:48 UTC
[LLVMdev] RFC - Profile Guided Optimization in LLVM
On Tue, Jun 18, 2013 at 11:19 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>
> On Jun 17, 2013, at 6:54 AM, Diego Novillo <dnovillo at google.com> wrote:
>
>> On 2013-06-15 14:18 , Evan Cheng wrote:
>>> Apple folks are also gearing up to push on the PGO front. We are primarily interested in using instrumentation, rather than sampling, to collect profile info. However, I suspect the way profile ended up being used in the various optimization and codegen passes would be largely similar.
>>
>> Excellent! We are initially interested in instrumentation, as well. This is where we draw most of our performance with GCC. Sampling is showing a lot of promise, however. And it really is not much different than instrumentation. Most of what changes is the source of profile data.
>>
>>> There is also some interests in pursuing profile directed specialization. But that can wait. I think it makes sense for us to get together and discuss our plans to make sure there won't be duplication of efforts.
>>
>> Sure. My initial plan is fairly simple. Triage the existing instrumentation code and see what needs fixing. I'm starting this in the next week or so. What are your plans?
>
> I've been working on prototyping a new approach to instrumentation for both PGO and code coverage testing. I want to use the same data for both of those uses, and for code coverage we really need to have accurate source location info, including column positions and the starting and ending locations for every block of code. Working backward from the debug info to find source locations hasn't worked very well. The debug info just doesn't have enough detailed information.

I'm curious what problems you've had. You've surely not mentioned them before. Note that I'm not saying that I think this is the best method, but I'm curious what problems you've had and what you're running into.

> Instead, I am planning to insert the instrumentation code in the front-end. The raw profile data in this scheme is tied to source locations. One disadvantage is that it conflicts with the goal of having graceful degradation. Simply adding a blank line invalidates all the following source locations. My plan is to ignore any profile data whenever the source changes in any way. We had some discussions about this, and it is easy to come up with cases where even a trivial change to the source, e.g., enabling a debugging flag, causes massive changes in the profile data. I don't think graceful degradation should even be a goal. It is too hard to sort out insignificant changes from those that should invalidate all the profile data.

Even a comment?

-eric
On Jun 18, 2013, at 2:48 PM, Eric Christopher <echristo at gmail.com> wrote:
> On Tue, Jun 18, 2013 at 11:19 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>>
>> On Jun 17, 2013, at 6:54 AM, Diego Novillo <dnovillo at google.com> wrote:
>>
>>> On 2013-06-15 14:18 , Evan Cheng wrote:
>>>> Apple folks are also gearing up to push on the PGO front. We are primarily interested in using instrumentation, rather than sampling, to collect profile info. However, I suspect the way profile ended up being used in the various optimization and codegen passes would be largely similar.
>>>
>>> Excellent! We are initially interested in instrumentation, as well. This is where we draw most of our performance with GCC. Sampling is showing a lot of promise, however. And it really is not much different than instrumentation. Most of what changes is the source of profile data.
>>>
>>>> There is also some interests in pursuing profile directed specialization. But that can wait. I think it makes sense for us to get together and discuss our plans to make sure there won't be duplication of efforts.
>>>
>>> Sure. My initial plan is fairly simple. Triage the existing instrumentation code and see what needs fixing. I'm starting this in the next week or so. What are your plans?
>>
>> I've been working on prototyping a new approach to instrumentation for both PGO and code coverage testing. I want to use the same data for both of those uses, and for code coverage we really need to have accurate source location info, including column positions and the starting and ending locations for every block of code. Working backward from the debug info to find source locations hasn't worked very well. The debug info just doesn't have enough detailed information.
>
> I'm curious what problems you've had. You've surely not mentioned them before. Note that I'm not saying that I think this is the best method, but I'm curious what problems you've had and what you're running into.

The main issue is that I want precise begin/end source locations for each statement. Debug info doesn't do that. Even if we enable column information, debug info just gives you one source location for each statement. That's not a problem with debug info -- it's just not what it was intended to be used for. There's also the issue that I want PGO instrumentation to work the same regardless of the debug info setting.

>> Instead, I am planning to insert the instrumentation code in the front-end. The raw profile data in this scheme is tied to source locations. One disadvantage is that it conflicts with the goal of having graceful degradation. Simply adding a blank line invalidates all the following source locations. My plan is to ignore any profile data whenever the source changes in any way. We had some discussions about this, and it is easy to come up with cases where even a trivial change to the source, e.g., enabling a debugging flag, causes massive changes in the profile data. I don't think graceful degradation should even be a goal. It is too hard to sort out insignificant changes from those that should invalidate all the profile data.
>
> Even a comment?

Yes. Obviously you could do that, but it would add significant complexity and little real value.
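
For readers following the thread, the "ignore any profile data whenever the source changes in any way" policy can be sketched roughly as below. This is only an illustration under stated assumptions: the `Profile` record, the `(line, column)` counter keys, and the use of a SHA-256 digest are hypothetical and are not taken from any LLVM or Clang implementation.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical profile record: execution counts keyed by the
    (line, column) where an instrumented block starts."""
    source_digest: str
    counts: dict = field(default_factory=dict)


def digest(source: str) -> str:
    """Fingerprint the exact source text the profile was collected from."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def usable_counts(profile: Profile, current_source: str) -> dict:
    """Return the counters only if the source is byte-identical.

    Any change at all (a blank line, a comment) discards the whole
    profile, i.e. there is no graceful degradation.
    """
    if profile.source_digest != digest(current_source):
        return {}
    return profile.counts
```

Under this sketch, even adding a comment changes the digest and drops the profile, which is exactly the behavior being questioned in the "Even a comment?" exchange. Tolerating such edits would require comparing something coarser than the raw text.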
Eric Christopher
2013-Jun-18 22:41 UTC
[LLVMdev] RFC - Profile Guided Optimization in LLVM
> The main issue is that I want precise begin/end source locations for each statement. Debug info doesn't do that. Even if we enable column information, debug info just gives you one source location for each statement. That's not a problem with debug info -- it's just not what it was intended to be used for.

True... somewhat. Precise begin/end source locations exist and the debug info will map them back to regions of code that are associated with a particular line and column - this should also change when the statement changes. Anything else should probably be a bug? The only difference I can see is that there's no concrete "done with this statement" in the code. It might take some processing work to get "all statements that executed between these source ranges" because of code motion effects, but you should be able to get a concrete range.

> There's also the issue that I want PGO instrumentation to work the same regardless of the debug info setting.

Basically it's going to be "line-tables-only" :) Of course, I'm also not signing up to do the work, just curious what the limitations are in the infrastructure here.

>> Even a comment?
>
> Yes. Obviously you could do that, but it would add significant complexity and little real value.

Mmm.. I think you're underestimating the amount of usefulness here, but we can revisit when there's some code. :)

-eric
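
To make the begin/end discussion concrete, here is a rough sketch of counters attached to explicit source ranges and of the "all statements that executed between these source ranges" query. The `Region` type, its half-open `(line, column)` bounds, and both helper functions are hypothetical names chosen for illustration; they do not describe how debug info or any front-end instrumentation actually stores this data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical instrumented region with explicit source bounds."""
    start: tuple  # (line, column), inclusive
    end: tuple    # (line, column), exclusive
    count: int    # how many times the region executed


def executed_within(regions, start, end):
    """Regions that executed at least once and lie entirely in [start, end)."""
    return [r for r in regions
            if r.count > 0 and start <= r.start and r.end <= end]


def uncovered(regions):
    """Coverage view of the same data: regions that never executed."""
    return [r for r in regions if r.count == 0]
```

With explicit begin/end pairs the range query is a simple containment check. With a single location per statement, the end of each range would have to be inferred, which is the "processing work" mentioned above.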