Сидоров , Константин Сергеевич via llvm-dev
2021-Jan-19 03:04 UTC
[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations
Dear all,

My name is Konstantin Sidorov, and I am a graduate student in Mathematics at Moscow Institute of Physics and Technology.

I would like to work on the project "Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations" during the Google Summer of Code 2021.

I have an extensive background relevant to this project. In particular:

* I participated in GSoC in 2017 with the mlpack organization on the project "Augmented RNNs": https://summerofcode.withgoogle.com/archive/2017/projects/4583913502539776/
* In 2019 I graduated from the Yandex School of Data Analysis, a two-year program in Data Analysis by Yandex (the leading Russian search engine); more information on the curriculum can be found at https://yandexdataschool.com/.
* I have been working as a software engineer at Adeptik since July 2018, where I have predominantly worked on applied combinatorial optimization problems, such as vehicle-routing problems or supply chain modeling. In particular, I have experience with both metaheuristic algorithms (e.g., local search or genetic algorithms) and more "traditional" mathematical modeling (e.g., linear programming or constraint programming).

I would like to discuss this project in more detail. While it is hard to discuss any kind of exact plan at this stage, I already have two questions concerning this project:

(1) I have set up an LLVM dev environment, but I am unsure what to do next. Could you advise me on any simple (and, preferably, relevant) tasks to work on?
(2) Could you suggest any learning materials to improve my understanding of "low-level" concepts (e.g., CPU concepts such as caching and SIMD)?

Best regards,
Konstantin Sidorov
Stefanos Baziotis via llvm-dev
2021-Jan-19 14:55 UTC
[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations
Hi Konstantin,

Let me try to help with the last two questions:

1) From what I understand, you don't have much experience with LLVM (or with compilers in general? In that case, I can recommend more material). Given that, I'd start with these two videos:

a) https://www.youtube.com/watch?v=m8G_S5LwlTo
b) https://www.youtube.com/watch?v=3QQuhL-dSys

These cover fundamental concepts, and you'll have to know most of them no matter what you do in LLVM.

The next step depends on what you like. I'd say you could either play with an already existing pass or create your own. Personally, I would go with the second, although it seems more difficult (but it's not). First, because already existing passes are production quality, which means that they contain a lot of stuff that is not exactly educational and would not be good for a beginner IMO (so, for example, if I were to start with an already existing pass, I'd start with one implemented for a course or something similar). Second, because by creating your own pass, you will see how everything fits together much faster.

Now, the problem with creating your own pass is that most relevant tutorials out there either use the old pass manager or make an out-of-tree pass (or both). The result is that you have to deal with a monstrosity of irrelevant (at least for now) things. So, I'll go ahead and recommend something else: just delete the code of an already existing pass and start over :) In particular, find the ::run function, delete what it does, and call your own functions. For example, take this [4] and replace it with e.g. this: https://pastebin.com/PhMLvMH7

You can now call "your" pass by adding -passes="loop-distribute" in the `opt` tool. This should give you enough knowledge of the infrastructure to e.g. program a dominators pass [5] (if you don't know about dominators, it's important that you learn about them; however, Wikipedia is not a good source. I'd pick a book like "Engineering a Compiler").
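To make the dominators suggestion a bit more concrete, here is a minimal sketch of the classic iterative data-flow formulation in plain C++, with no LLVM dependencies. The names `Cfg`, `computeDominators` and `diamondCfg` are hypothetical and only for illustration, and the sketch assumes every block is reachable from the entry. It is not how LLVM computes dominators (LLVM has its own `DominatorTree` analysis); it is only meant to show the idea before porting it into a pass.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical CFG representation: preds[b] lists the blocks that jump to b.
using Cfg = std::vector<std::vector<int>>;

// Iterative data-flow computation of dominator sets:
//   Dom(entry) = {entry}
//   Dom(b)     = {b} united with the intersection of Dom(p)
//                over all predecessors p of b
// Assumption: every block is reachable from `entry`.
std::vector<std::set<int>> computeDominators(const Cfg &preds, int entry) {
  const int n = static_cast<int>(preds.size());
  assert(entry >= 0 && entry < n && "entry must be a valid block index");

  std::set<int> all;
  for (int b = 0; b < n; ++b)
    all.insert(b);

  // Start from "every block dominates every block" and shrink the sets.
  std::vector<std::set<int>> dom(n, all);
  dom[entry] = {entry};

  bool changed = true;
  while (changed) {
    changed = false;
    for (int b = 0; b < n; ++b) {
      if (b == entry)
        continue;
      // Intersect the dominator sets of all predecessors of b.
      std::set<int> next = all;
      for (int p : preds[b]) {
        std::set<int> kept;
        for (int d : next)
          if (dom[p].count(d))
            kept.insert(d);
        next = std::move(kept);
      }
      next.insert(b); // a block always dominates itself
      if (next != dom[b]) {
        dom[b] = std::move(next);
        changed = true;
      }
    }
  }
  return dom;
}

// Diamond-shaped example CFG: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
Cfg diamondCfg() { return {{}, {0}, {0}, {1, 2}}; }
```

Calling `computeDominators(diamondCfg(), 0)` returns one set per block. For the join block 3, only the entry block 0 and block 3 itself remain, because neither branch block lies on every path from the entry to the join.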
2) I would start with those two:

a) CppCon 2016: Timur Doumler "Want fast C++? Know your hardware!" [1]
b) Writing cache friendly C++ - Jonathan Müller - Meeting C++ 2018 [2]

If you want to go more in depth, you can watch this course [3] from CMU on Comp. Arch., I think it's amazing. You can cherry pick topics if you want.

Hope this helps, let me know if there are other questions.

- Stefanos

[1] https://www.youtube.com/watch?v=BP6NxVxDQIs
[2] https://www.youtube.com/watch?v=Nz9SiF0QVKY
[3] https://www.youtube.com/playlist?list=PL5PHm2jkkXmi5CxxI7b3JCL1TWybTDtKq
[4] https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Scalar/LoopDistribute.cpp#L1043
[5] https://en.wikipedia.org/wiki/Dominator_(graph_theory)#Algorithms
Сидоров , Константин Сергеевич via llvm-dev
2021-Feb-06 10:35 UTC
[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations
Dear all,

I would like to continue the discussion of the GSoC project I mentioned in the previous email. Now that I know my way around the LLVM codebase, I would like to propose the first draft of the plan:

* Improving heuristics for existing passes – to start the discussion, I propose to begin the project by working on `MLInlineAdvisor` (as far as I understand, the ML infrastructure is already developed in this class, so it seems to be a good place to start) and after that to switch to other passes (e.g., `LoopVectorizationPlanner` seems to be a good candidate for such an approach, and the `LoopRotate` class contains a profitability heuristic which could also be studied more deeply).
* Machine learning models to select the optimizations – to the best of my understanding, the key concept here is the pass manager, but I don't quite understand the technical details of deciding which optimization to select. For this reason, I would like to discuss this part more thoroughly.

If the project mentors are reading this mailing list and are interested in the discussion, can we start the discussion here?

By the way – I would like to thank Stefanos for the comprehensive response to my previous questions, which helped me to get started :)

Looking forward to a further discussion,
Konstantin Sidorov