Ruijie Fang via llvm-dev
2020-Jun-02 16:16 UTC
[llvm-dev] Improve hot cold splitting to aggressively outline small blocks
Hi Teresa,

Thank you for your reply! I discussed this with Aditya and Rodrigo today. We will always have PGO turned on for our benchmark (i.e., we assume that profiling information is always available). As for the workload we supply to PGO: for PostgreSQL, I suggested we use "pgbench", a TPC-B-based SQL benchmark for Postgres, to generate the profiling information. We can use other workloads or benchmarks if you have other suggestions.

Thank you,
Ruijie

Ruijie Fang
Email: ruijief at princeton.edu

On Mon, Jun 1, 2020 at 11:28 AM Teresa Johnson <tejohnson at google.com> wrote:

> On Sun, May 31, 2020 at 11:37 PM Ruijie Fang <ruijief at princeton.edu> wrote:
>
>> Hello,
>>
>> I am Ruijie Fang, a GSoC student working on "Improve hot cold
>> splitting to aggressively outline small blocks." Over the course of
>> last week, I met with my mentor and co-mentor, Aditya Kumar and
>> Rodrigo Rocha, and we made a preliminary plan for improving the
>> existing hot/cold splitting pass in LLVM by identifying patterns
>> of cold blocks in real-world workloads via block frequency information.
>> (We have settled on using the PostgreSQL codebase as our first workload,
>> although if time permits, we will also target other large codebases.)
>>
>> Our project will involve identifying new cold block patterns via
>> static analysis in our workload, implementing detection of these
>> patterns in the existing hot/cold splitting pass, and then
>> benchmarking hot/cold splitting in our workload to see if there are
>> improvements. Our eventual goal is to improve the ability of hot/cold
>> analysis to detect cold blocks in these real-world workloads.
>
> Hi Ruijie,
>
> Thanks for the info!
>
> I skimmed the doc (suggest including it inline in the thread). It wasn't
> clear to me if the main goal is to improve PGO-based HCS or non-PGO-based
> HCS. It sounds like you are going to be focusing on non-PGO-based HCS given
> the comments about static analysis and detection of throws, asserts, etc.
>
> A couple of suggestions. First, I'd focus on ensuring the best performance
> possible given PGO information (the last time I tried HCS with PGO it
> wasn't improving performance for one of our large apps). Second, for the
> non-PGO case, rather than building the detection of likely cold blocks
> into HCS itself, it would be better to drive static generation of some kind
> of profile metadata for likely cold blocks (a la __builtin_expect). This
> will be more general and allow passes other than HCS to benefit.
>
> Teresa
>
>> Our plan is attached at
>>
>> https://docs.google.com/document/d/1rGLcFpfVXnF7aS31dWnowd2y_BjJnRA-hj3cUt6MqZ8/edit?usp=sharing
>>
>> Any feedback, input, or suggestion is welcome and highly appreciated!
>>
>> Best regards,
>> Ruijie
>>
>> Ruijie Fang
>> Email: ruijief at princeton.edu
>
> --
> Teresa Johnson | Software Engineer | tejohnson at google.com |
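For readers unfamiliar with the "a la __builtin_expect" remark in Teresa's reply, here is a minimal C sketch of what a source-level cold-path hint looks like. It is only an illustration and not code from the project plan: the names `UNLIKELY`, `report_error`, and `sum_checked` are hypothetical, and `__builtin_expect` and `__attribute__((cold))` are GCC/Clang extensions rather than standard C. A static analysis like the one suggested in the thread would presumably produce similar hints automatically instead of relying on hand annotation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical macro: tells the compiler the condition is expected to be false. */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Hypothetical error handler. The cold attribute marks it as rarely executed,
   so calls to it (and the blocks leading there) can be treated as cold. */
__attribute__((cold, noinline))
static int report_error(const char *msg) {
    fprintf(stderr, "error: %s\n", msg);
    return -1;
}

/* Sums n non-negative values into *out. A negative value is the unlikely
   error path; returns 0 on success and -1 on error. */
int sum_checked(const int *values, int n, long *out) {
    assert(values != NULL && out != NULL);
    long total = 0;
    for (int i = 0; i < n; ++i) {
        if (UNLIKELY(values[i] < 0))
            return report_error("negative value");
        total += values[i];
    }
    *out = total;
    return 0;
}
```

Calling `sum_checked` on `{1, 2, 3}` stores 6 in `*out` and returns 0, while an array containing a negative value takes the cold path and returns -1. The point of expressing coldness as a hint on the branch or callee, rather than as a pattern matched inside one pass, is that any optimization that consults profile information can use it.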
Tobias Hieta via llvm-dev
2020-Jun-02 16:48 UTC
[llvm-dev] Improve hot cold splitting to aggressively outline small blocks
Hello Ruijie,

One other workload that would be interesting to test might be clang itself. Building clang with PGO information is a common trick for improving compiler performance, and it's well supported in the build system.

Thanks for working on this.

Tobias.
Ruijie Fang via llvm-dev
2020-Jun-02 19:30 UTC
[llvm-dev] Improve hot cold splitting to aggressively outline small blocks
Hello Tobias,

Thank you for the suggestion! Aditya also mentioned this. I will look into it.

Best regards,
Ruijie

Ruijie Fang
Email: ruijief at princeton.edu