Scott Egerton via llvm-dev
2016-Apr-16 17:43 UTC
[llvm-dev] [GSoc 2016] Proposal - Capture Tracking Improvements
Hello, Attached is the proposal that I have submitted. I would be grateful for any and all feedback provided. Many Thanks, Scott -------------- next part -------------- A non-text attachment was scrubbed... Name: GSoc Proposal 2016 - Scott Egerton.pdf Type: application/pdf Size: 391088 bytes Desc: GSoc Proposal 2016 - Scott Egerton.pdf URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160416/8d94977d/attachment-0001.pdf>
Scott Egerton via llvm-dev
2016-Apr-18 07:59 UTC
[llvm-dev] [GSoc 2016] Proposal - Capture Tracking Improvements
Adding Mehdi Amini and Phillip Reames to CC. -----Original Message----- From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Scott Egerton via llvm-dev Sent: 16 April 2016 18:44 To: llvm-dev at lists.llvm.org Subject: [llvm-dev] [GSoc 2016] Proposal - Capture Tracking Improvements Hello, Attached is the proposal that I have submitted. I would be grateful for any and all feedback provided. Many Thanks, Scott
Philip Reames via llvm-dev
2016-Apr-22 16:32 UTC
[llvm-dev] [GSoc 2016] Proposal - Capture Tracking Improvements
(I've copied the body of the text from the PDF into the email and am replying inline to particularly parts.) On 04/16/2016 10:43 AM, Scott Egerton wrote:> Hello, > > Attached is the proposal that I have submitted. I would be grateful for any and all feedback provided. > > Many Thanks, > Scott > > C APTURE T RACKING I MPROVEMENTS > ABSTRACT > Capture tracking analysis is an analysis pass used to determine which pointers are > “captured”. This means that a function has made a copy of a pointer that has outlived > the function that called it.Subtle but important clarification: "has outlived" to "may potentially outlive". As with most compiler analysis, we have to be conservative when we can't fully analyze.> This information is useful during the optimisation process. I > believe that this is used to improve memory management.Via aliasing primarily. Yes.> The capture tracking analysis is currently inefficient and inaccurate in cases due to > the fact that it returns false positives and expensive functions are unnecessarily > repeated.We'll always have false positives. The trick is to get a reasonably small number with a reasonable impact on compile time. The universal problem balance in a compiler.> This is where I would make improvements. It could be improved in a > number of ways, as mentioned by Philip Reames on the mailing list. I would like to > use this opportunity to take my previous experience within LLVM and apply it to other > areas of LLVM. > > BENEFITS > > Improvements to pointer capture tracking would be greatly beneficial. Changes such > as the removal of false positives as well as the reduction to the cost through the > introduction of caching will be made. Caching the results of certain functions, storing > the result, will cause performance increases, should the same data be requested > again as the result will already be stored. As a result of this, it could become a more > valuable tool to have within the LLVM suite. This would also cause improvements to > the code optimisation process and potentially increase the quality of code compiled > with LLVM. It should also use up fewer resources during the analysis pass. > > DELIVERABLES > > There are cases where “potentially captured” is returned incorrectly. As a > starter task I would remove the known and unknown cases that cause this to > occur. This would serve as a good introduction to LLVM compiler analysis and > would be achievable in a short time span. > Approximately 2 weeks.It would be good to gather a concrete list of known false positives. I can help with this, but I'd like you to do a first pass over the code and bug tracking system first. Another approach is to instrument the analysis, print each potentially captured result, and scan through the output when compiling a largish piece of code. Do you see a common pattern which looks worth analyzing further? FYI, two weeks may be very optimistic. I suspect you'll iterate on this a couple of times finding more and more cases each time. Getting each change implemented and reviewed will take a couple of days. Thankfully, the analysis and review time should easily overlap. We can also overlap this with the design work for the next part.> By making changes to the current design, this could be made to be more cost > effective than it currently is. I would do this by caching the results as > previously suggested. This will be used to invalidate results when required. > Approximately 6 weeks.You will definitely need a more concrete proposal for what caching and invalidation scheme you're using. I'd suggest reviewing callers of PointerMayBeCaptured to get a sense for what's going on. I can also help with this. (We should setup a skype call or something if your proposal gets approved.) Once we have a proposal, we'll share this before you start implementation. That will give us a chance to find design problems early. :)> The analysis could be made more accurate in order to recognise object sub- > graphs which do not escape. > Approximately 5 weeks.I honestly doubt you'll get to this in the summer, but that's okay. Once we get closer to actually working on this, we'll need to drill in and make a more concrete proposal. There's a bunch of algorithms which could be used here and picking one will take some thought.> 1 | S COTT E GERTONThis work will link in with other ongoing work within LLVM and will assist other > developers working on compiler analysis. I believe that I may be able to complete > this work within 11 weeks; however I have allocated the extra time to account for any > issues I may encounter.I'm going to push strongly to have you contributing incrementally through the normal LLVM development process. This will mean that if something runs long and you don't get to finish, most of the supporting work will already have landed. (e.g. If we don't get to the object graph part, at least we'll have the caching improvements.)> BIOGRAPHICAL INFORMATION > I have been working on LLVM for the past year as an industrial placement student > within the MIPS compiler team and am keen to do more LLVM work. I am a BSc > Computer Science student at the University of Hull and will begin my final year of > study in September. In the past I have worked on a University project which required > me to be inventive with data structures in order to efficiently solve the given problem. > I thoroughly enjoyed this and would be grateful for another opportunity to exercise > my design skills. > Due to other commitments with my current industrial placement, unfortunately I will > be unable to work on this project during working hours (9-5 GMT) until the first week > of June. However, I am more than happy to make up this time as the summer > progresses.June is fine by me. If anything, that works better for my schedule anyways. Philip
Scott Egerton via llvm-dev
2016-Apr-29 08:49 UTC
[llvm-dev] [GSoc 2016] Proposal - Capture Tracking Improvements
Hi Phillip, Thank you for the feedback. I will be updating this as I learn more. Many thanks, Scott> -----Original Message----- > From: Philip Reames [mailto:listmail at philipreames.com] > Sent: 22 April 2016 17:32 > To: Scott Egerton; llvm-dev at lists.llvm.org > Cc: mehdi.amini at apple.com > Subject: Re: [GSoc 2016] Proposal - Capture Tracking Improvements > > (I've copied the body of the text from the PDF into the email and am replying > inline to particularly parts.) > > On 04/16/2016 10:43 AM, Scott Egerton wrote: > > Hello, > > > > Attached is the proposal that I have submitted. I would be grateful for any > and all feedback provided. > > > > Many Thanks, > > Scott > > > > C APTURE T RACKING I MPROVEMENTS > > ABSTRACT > > Capture tracking analysis is an analysis pass used to determine which > > pointers are "captured". This means that a function has made a copy of > > a pointer that has outlived the function that called it. > Subtle but important clarification: "has outlived" to "may potentially outlive". > As with most compiler analysis, we have to be conservative when we can't > fully analyze. > > This information is useful during the optimisation process. I believe > > that this is used to improve memory management. > Via aliasing primarily. Yes. > > The capture tracking analysis is currently inefficient and inaccurate > > in cases due to the fact that it returns false positives and expensive > > functions are unnecessarily repeated. > We'll always have false positives. The trick is to get a reasonably small > number with a reasonable impact on compile time. The universal problem > balance in a compiler. > > This is where I would make improvements. It could be improved in a > > number of ways, as mentioned by Philip Reames on the mailing list. I > > would like to use this opportunity to take my previous experience > > within LLVM and apply it to other areas of LLVM. > > > > BENEFITS > > > > Improvements to pointer capture tracking would be greatly beneficial. > > Changes such as the removal of false positives as well as the > > reduction to the cost through the introduction of caching will be > > made. Caching the results of certain functions, storing the result, > > will cause performance increases, should the same data be requested > > again as the result will already be stored. As a result of this, it > > could become a more valuable tool to have within the LLVM suite. This > > would also cause improvements to the code optimisation process and > potentially increase the quality of code compiled with LLVM. It should also > use up fewer resources during the analysis pass. > > > > DELIVERABLES > > > > There are cases where "potentially captured" is returned incorrectly. > > As a starter task I would remove the known and unknown cases that > > cause this to occur. This would serve as a good introduction to LLVM > > compiler analysis and would be achievable in a short time span. > > Approximately 2 weeks. > It would be good to gather a concrete list of known false positives. I can help > with this, but I'd like you to do a first pass over the code and bug tracking > system first. Another approach is to instrument the analysis, print each > potentially captured result, and scan through the output when compiling a > largish piece of code. Do you see a common pattern which looks worth > analyzing further? > > FYI, two weeks may be very optimistic. I suspect you'll iterate on this a > couple of times finding more and more cases each time. Getting each change > implemented and reviewed will take a couple of days. Thankfully, the > analysis and review time should easily overlap. We can also overlap this with > the design work for the next part. > > By making changes to the current design, this could be made to be more > > cost effective than it currently is. I would do this by caching the > > results as previously suggested. This will be used to invalidate results when > required. > > Approximately 6 weeks. > You will definitely need a more concrete proposal for what caching and > invalidation scheme you're using. I'd suggest reviewing callers of > PointerMayBeCaptured to get a sense for what's going on. I can also help > with this. (We should setup a skype call or something if your proposal gets > approved.) > > Once we have a proposal, we'll share this before you start implementation. > That will give us a chance to find design problems early. :) > > The analysis could be made more accurate in order to recognise object > > sub- graphs which do not escape. > > Approximately 5 weeks. > I honestly doubt you'll get to this in the summer, but that's okay. Once we > get closer to actually working on this, we'll need to drill in and make a more > concrete proposal. There's a bunch of algorithms which could be used here > and picking one will take some thought. > > 1 | S COTT E GERTONThis work will link in with other ongoing work > > within LLVM and will assist other developers working on compiler > > analysis. I believe that I may be able to complete this work within 11 > > weeks; however I have allocated the extra time to account for any issues I > may encounter. > I'm going to push strongly to have you contributing incrementally through the > normal LLVM development process. This will mean that if something runs > long and you don't get to finish, most of the supporting work will already > have landed. (e.g. If we don't get to the object graph part, at least we'll have > the caching improvements.) > > BIOGRAPHICAL INFORMATION > > I have been working on LLVM for the past year as an industrial > > placement student within the MIPS compiler team and am keen to do > more > > LLVM work. I am a BSc Computer Science student at the University of > > Hull and will begin my final year of study in September. In the past I > > have worked on a University project which required me to be inventive > with data structures in order to efficiently solve the given problem. > > I thoroughly enjoyed this and would be grateful for another > > opportunity to exercise my design skills. > > Due to other commitments with my current industrial placement, > > unfortunately I will be unable to work on this project during working > > hours (9-5 GMT) until the first week of June. However, I am more than > > happy to make up this time as the summer progresses. > June is fine by me. If anything, that works better for my schedule anyways. > > Philip >