Johannes Doerfert via llvm-dev
2021-Feb-06 19:28 UTC
[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations
Hi Konstantin,

I haven't found the time to write new GSoC projects for 2021 yet, but if you are interested we could probably set one up in this area. I also CC'ed Mircea, who might be interested in this too, maybe as (co-)mentor.

We could look at loop transformations such as unrolling and fusion, similar to the inliner work. Best case, we can distill a heuristic out of a model we learned. We could also look at pass selection and ordering. We started last year and I was hoping to continue. You might want to watch https://youtu.be/TSMputNvHlk?t=617 and https://youtu.be/nxfew3hsMFM?t=1435 .

In case you're interested in a runtime topic, I really would love to have a predictor for grid/block/thread block size for (OpenMP) GPU kernels. We are having real trouble on that end.

I also would like to look at ML use in testing and CI.

Let me know what area sounds most interesting to you and we can take it from there.

~ Johannes

On 2/6/21 4:35 AM, Сидоров , Константин Сергеевич via llvm-dev wrote:
> Dear all,
>
> I would like to continue the discussion of the GSoC project I mentioned in the previous email. Now that I know my way around the LLVM codebase, I would like to propose the first draft of the plan:
>
> * Improving heuristics for existing passes – to start the discussion, I propose to begin the project by working on `MLInlineAdvisor` (as far as I understand, the ML infrastructure is already developed in this class, so it seems to be a good place to start) and after that to switch to other passes (e.g., `LoopVectorizationPlanner` seems to be a good candidate for such an approach, and the `LoopRotate` class contains a profitability heuristic which could also be studied more deeply).
> * Machine learning models to select the optimizations – to the best of my understanding, the key concept here is the pass manager, but I don't quite understand the technical details of deciding which optimization to select. For this reason, I would like to discuss this part more thoroughly.
>
> If the project mentors are reading this mailing list and are interested in the discussion, can we start it here?
>
> By the way – I would like to thank Stefanos for the comprehensive response to my previous questions, which helped me get started :)
>
> Looking forward to further discussion,
> Konstantin Sidorov
>
> On Tue, Jan 19, 2021 at 07:04, Сидоров , Константин Сергеевич <sidorov.ks at phystech.edu> wrote:
>
>> Dear all,
>>
>> My name is Konstantin Sidorov, and I am a graduate student in Mathematics at Moscow Institute of Physics and Technology.
>>
>> I would like to work on the project "Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations" during Google Summer of Code 2021.
>>
>> I have an extensive background relevant to this project - in particular:
>>
>> * I have already participated in GSoC, in 2017, with the mlpack organization on the project "Augmented RNNs": https://summerofcode.withgoogle.com/archive/2017/projects/4583913502539776/
>> * In 2019 I graduated from the Yandex School of Data Analysis, a two-year program in Data Analysis by Yandex (the leading Russian search engine); more info on the curriculum can be found at https://yandexdataschool.com/.
>> * I have also been working as a software engineer at Adeptik from July 2018 to date, where I have predominantly worked on applied combinatorial optimization problems, such as vehicle-routing problems or supply chain modeling. In particular, I have experience with both metaheuristic algorithms (e.g., local search or genetic algorithms) and more "traditional" mathematical modeling (e.g., linear programming or constraint programming).
>>
>> I would like to discuss this project in more detail. While it is hard to discuss any kind of exact plan at this stage, I already have two questions concerning this project:
>>
>> (1) I have set up an LLVM dev environment, but I am unsure what to do next. Could you advise me on any simple (and, preferably, relevant) tasks to work on?
>> (2) Could you suggest any learning materials to improve my understanding of "low-level" concepts? (E.g., CPU concepts such as caching and SIMD)
>>
>> Best regards,
>> Konstantin Sidorov
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
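A minimal sketch of one possible reading of "distill a heuristic out of a model we learned", as mentioned in the message above: query a model on sample inputs and search for the simple threshold rule that agrees with it most often. Everything here is an assumption made for illustration: `black_box` is a made-up stand-in for a trained model, the two features (`trip_count`, `body_size`) are hypothetical, and none of this is LLVM code.

```python
from itertools import product

def black_box(trip_count, body_size):
    # Hypothetical stand-in for a learned model's "unroll or not" decision.
    return 3 * trip_count - body_size > 10

def distill_stump(points, labels):
    """Search all one-feature threshold rules and return the most faithful one.

    A rule (feature, threshold, polarity) predicts True exactly when
    (x[feature] > threshold) == polarity.
    Returns (agreement, feature, threshold, polarity).
    """
    best = None
    for feature in range(len(points[0])):
        for threshold in sorted({p[feature] for p in points}):
            for polarity in (True, False):
                hits = sum(((p[feature] > threshold) == polarity) == y
                           for p, y in zip(points, labels))
                candidate = (hits / len(points), feature, threshold, polarity)
                if best is None or candidate[0] > best[0]:
                    best = candidate
    return best

# Query the model on a grid of (trip_count, body_size) pairs.
points = list(product(range(1, 17), range(1, 33)))
labels = [black_box(t, b) for t, b in points]
agreement, feature, threshold, polarity = distill_stump(points, labels)
print(f"rule: (x[{feature}] > {threshold}) == {polarity}, "
      f"agrees with the model on {agreement:.0%} of samples")
```

A single threshold is deliberately crude; the point is only that the resulting rule is small enough to be read, reviewed, and hard-coded, at the cost of some agreement with the original model.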
Abid Malik via llvm-dev
2021-Feb-06 19:36 UTC
[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations
On Sat, Feb 6, 2021 at 2:29 PM Johannes Doerfert via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> In case you're interested in a runtime topic, I really would love to
> have a predictor for grid/block/thread block size for (OpenMP) GPU
> kernels. We are having real trouble on that end.

Hi,

We are working on an ML model that can predict the profitability of offloading a kernel to a GPU. I feel that this problem is very much related: it will have the same challenges in terms of feature engineering and data preparation as the ones we are handling in our work.

Abid

--
Abid M. Malik
Сидоров , Константин Сергеевич via llvm-dev
2021-Feb-07 14:31 UTC
[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations
Hello Johannes,

I guess working on the loop transformations is a good starting point: firstly, it is similar to the already existing code, and secondly, this problem doesn't look too hard (especially compared with the other ideas). To the best of my understanding, it corresponds to refactoring `LoopUnrollAnalyzer` and, if I understood you correctly, `MacroFusion`, in a way similar to what has been done with the `InlineAdvisor`.

As for the next step, I think that knowledge distillation is a promising idea – in fact, we can experiment with the approaches from [1], which can yield a nice inference speed-up in those models.

I think working on some kind of unified pipeline for pass selection and ordering is also an interesting idea. Off the top of my head, a viable approach here is to treat pass scheduling as a single-player game and run a Monte-Carlo tree search to maximize some objective function. For example, in [2] this kind of approach is used for learning to solve vertex cover and max-cut, while [3] employs it to search for molecule designs with specified properties. See also [4] for a survey of RL methods (including MCTS) for combinatorial problems.

> In case you're interested in a runtime topic, I really would love to
> have a predictor for grid/block/thread block size for (OpenMP) GPU
> kernels. We are having real trouble on that end.

I'm afraid I didn't quite understand this one – could you elaborate a bit more on this topic?

Best regards,
Konstantin Sidorov

[1] https://github.com/lhyfst/knowledge-distillation-papers#recommended-papers
[2] Abe, Kenshin et al. Solving NP-Hard Problems on Graphs with Extended AlphaGo Zero. https://arxiv.org/abs/1905.11623
[3] Kajita, S., Kinjo, T. & Nishi, T. Autonomous molecular design by Monte-Carlo tree search and rapid evaluations using molecular dynamics simulations. Commun Phys 3, 77 (2020). https://doi.org/10.1038/s42005-020-0338-y
[4] Mazyavkina, Nina et al. Reinforcement Learning for Combinatorial Optimization: A Survey. https://arxiv.org/abs/2003.03600
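A minimal sketch of the "pass scheduling as a single-player game" idea from the message above, using a plain UCT-style Monte-Carlo tree search. This is an illustration under loud assumptions, not the approach of [2]-[4] and not LLVM code: the pass names in `PASSES` are placeholders, and `toy_score` is an invented objective standing in for a real measurement such as code size or runtime after compiling with a given pass sequence.

```python
import math
import random

PASSES = ("inline", "unroll", "simplify", "vectorize")  # placeholder pass names
SEQ_LEN = 4                                             # length of a complete schedule

def toy_score(seq):
    """Invented objective: rewards two specific orderings, penalizes repeats."""
    score = 0.0
    for prev, cur in zip(seq, seq[1:]):
        if (prev, cur) in {("inline", "simplify"), ("unroll", "vectorize")}:
            score += 1.0
    return score - 0.5 * (len(seq) - len(set(seq)))

class Node:
    def __init__(self, seq, parent=None):
        self.seq = seq          # passes chosen so far (a tuple)
        self.parent = parent
        self.children = {}      # pass name -> child Node
        self.visits = 0
        self.total = 0.0        # sum of rewards seen through this node

    def is_terminal(self):
        return len(self.seq) == SEQ_LEN

    def fully_expanded(self):
        return len(self.children) == len(PASSES)

def uct_child(node, c=1.4):
    # Upper-confidence bound: mean reward plus an exploration bonus.
    return max(node.children.values(),
               key=lambda ch: ch.total / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts(iterations, rng):
    root = Node(())
    best_seq, best_score = None, -math.inf
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while every move has already been tried.
        while not node.is_terminal() and node.fully_expanded():
            node = uct_child(node)
        # 2. Expansion: try one untried pass at this point in the schedule.
        if not node.is_terminal():
            choice = rng.choice([p for p in PASSES if p not in node.children])
            node.children[choice] = Node(node.seq + (choice,), node)
            node = node.children[choice]
        # 3. Rollout: finish the schedule with random passes and score it.
        seq = node.seq
        while len(seq) < SEQ_LEN:
            seq += (rng.choice(PASSES),)
        reward = toy_score(seq)
        if reward > best_score:
            best_seq, best_score = seq, reward
        # 4. Backpropagation: credit every node on the path to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best_seq, best_score

best_seq, best_score = mcts(2000, random.Random(0))
print(best_seq, best_score)
```

In a real setting the expensive part would be the objective (compiling and measuring), so the number of rollouts, not the tree bookkeeping, would dominate the cost.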
Mircea Trofin via llvm-dev
2021-Feb-08 19:16 UTC
[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations
On Sat, Feb 6, 2021 at 11:28 AM Johannes Doerfert <johannesdoerfert at gmail.com> wrote:
> Hi Konstantin,
>
> I didn't find the time to write new GSoC projects for 2021 yet but
> if you are interested we could probably set one up in this area. I
> also CC'ed Mircea who might be interested in this too, maybe as
> (co-)mentor.

Sorry for being slow to answer this - happy to help if there's a need!

A suggestion for a potential project: bringing this work <https://dl.acm.org/doi/10.1145/3368826.3377928> to LLVM in a manner similar to what we did for the inliner, i.e. transparently integrating a general-purpose learned policy in the compiler.
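A rough sketch of what "transparently integrating a learned policy" could mean structurally: the transformation asks an advisor for a decision and does not care whether a hand-written heuristic or a model answers. All names below (`Advisor`, `heuristic_policy`, `run_pass`, the two features) are hypothetical and chosen for illustration; this does not mirror the actual LLVM inliner interfaces.

```python
from typing import Callable, Optional, Sequence

Features = Sequence[float]  # here: (trip_count, body_size), both hypothetical

def heuristic_policy(features: Features) -> bool:
    """Hand-written fallback: transform only loops with a small total size."""
    trip_count, body_size = features
    return trip_count * body_size <= 64

class Advisor:
    """Decision point that uses a learned policy when one is supplied."""

    def __init__(self, learned_policy: Optional[Callable[[Features], bool]] = None):
        self._policy = learned_policy or heuristic_policy

    def advise(self, features: Features) -> bool:
        return self._policy(features)

def run_pass(loops, advisor):
    """The pass itself is unchanged regardless of which policy backs the advisor."""
    return [name for name, features in loops if advisor.advise(features)]

loops = [("small", (4, 8)), ("large", (100, 50))]
default_choice = run_pass(loops, Advisor())
# A lambda stands in for a trained model here.
learned_choice = run_pass(loops, Advisor(lambda f: f[0] >= 50))
print(default_choice, learned_choice)
```

Keeping the heuristic as the default means a build without any model behaves exactly as before, which is one way to read "transparently".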