thr3ads.net - llvm dev - [llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations [Feb 2021]

Johannes Doerfert via llvm-dev

2021-Feb-06 19:28 UTC

[llvm-dev] [GSoC] Machine learning and compiler optimizations: using inter-procedural analysis to select optimizations

Hi Konstantin,

I didn't find the time to write new GSoC projects for 2021 yet but
if you are interested we could probably set one up in this area. I
also CC'ed Mircea who might be interested in this too, maybe as
(co-)mentor.

We could look at loop transformations such as unrolling and fusion,
similar to the inliner work. Best case, we can distill a heuristic
out of a model we learned. We could also look at pass selection and
ordering. We started last year and I was hoping to continue. You
might want to watch https://youtu.be/TSMputNvHlk?t=617 and
https://youtu.be/nxfew3hsMFM?t=1435.
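To make "distilling a heuristic out of a model" concrete, here is a minimal sketch under stated assumptions. `black_box_model` is a hypothetical stand-in for a learned unrolling policy (it is not an LLVM API), and the features (trip count, loop body size) are chosen only for illustration. The sketch queries the model on sample loops and searches for the single threshold rule that agrees with it most often:

```python
def black_box_model(trip_count, body_size):
    """Hypothetical stand-in for a learned unrolling policy (1 = unroll)."""
    return 1 if trip_count * body_size <= 64 and trip_count <= 16 else 0


def distill_threshold(samples, labels):
    """Find the threshold t on trip_count * body_size such that the rule
    'unroll iff trip_count * body_size <= t' agrees with the labels most often."""
    best_t, best_agree = None, -1
    for t in sorted({tc * bs for tc, bs in samples}):
        agree = sum(
            (1 if tc * bs <= t else 0) == y
            for (tc, bs), y in zip(samples, labels)
        )
        if agree > best_agree:
            best_t, best_agree = t, agree
    return best_t, best_agree / len(samples)


# Query the (hypothetical) model on a grid of synthetic loops.
samples = [(tc, bs) for tc in range(1, 33) for bs in range(1, 33)]
labels = [black_box_model(tc, bs) for tc, bs in samples]

threshold, agreement = distill_threshold(samples, labels)
print(f"unroll iff trip_count * body_size <= {threshold} "
      f"(agrees with the model on {agreement:.1%} of samples)")
```

The resulting one-line rule is cheap to evaluate and easy to review, at the cost of disagreeing with the model on some inputs.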

In case you're interested in a runtime topic, I really would love to
have a predictor for grid/block/thread block size for (OpenMP) GPU
kernels. We are having real trouble on that end.

I also would like to look at ML use in testing and CI.

Let me know what area sounds most interesting to you and we can
take it from there.

~ Johannes


On 2/6/21 4:35 AM, Сидоров, Константин Сергеевич via llvm-dev wrote:
> Dear all,
>
> I would like to continue the discussion of the GSoC project I mentioned in
> the previous email. Now that I know my way around the LLVM codebase, I
> would like to propose the first draft of the plan:
>
> * Improving heuristics for existing passes – to start the discussion, I
> propose to start the project by working on `MLInlineAdvisor` (as far as I
> understand, the ML infrastructure is already developed in this class, so
> it seems to be a good place to start) and after that switching to
> the other passes (e.g., `LoopVectorizationPlanner` seems to be a good
> candidate for such an approach, and the `LoopRotate` class contains a
> profitability heuristic which could also be studied more deeply).
> * Machine learning models to select the optimizations – to the best of my
> understanding, the key concept here is the pass manager, but I don't
> quite understand the technical details of deciding which optimization to
> select. For this reason, I would like to discuss this part more thoroughly.
>
> If the project mentors are reading this mailing list and are interested in
> the discussion, can we start the discussion here?
>
> By the way – I would like to thank Stefanos for the comprehensive response
> to my previous questions that helped me to get started :)
>
> Looking forward to further discussion,
> Konstantin Sidorov
>
> On Tue, 19 Jan 2021 at 07:04, Сидоров, Константин Сергеевич
> <sidorov.ks at phystech.edu> wrote:
>
>> Dear all,
>>
>> My name is Konstantin Sidorov, and I am a graduate student in Mathematics
>> at Moscow Institute of Physics and Technology.
>>
>> I would like to work on the project "Machine learning and compiler
>> optimizations: using inter-procedural analysis to select optimizations"
>> during the Google Summer of Code 2021.
>>
>> I have an extensive background relevant to this project. In particular:
>>
>> * I participated in GSoC in 2017 with the mlpack organization on the
>> project "Augmented RNNs":
>> https://summerofcode.withgoogle.com/archive/2017/projects/4583913502539776/
>> * In 2019 I graduated from the Yandex School of Data Analysis, a
>> two-year program in Data Analysis by Yandex (the leading Russian search
>> engine); more info on the curriculum can be found at
>> https://yandexdataschool.com/.
>> * I have been working as a software engineer at Adeptik from July
>> 2018 to date, where I have predominantly worked on applied
>> combinatorial optimization problems, such as vehicle routing or
>> supply chain modeling. In particular, I have experience with both
>> metaheuristic algorithms (e.g., local search or genetic algorithms) and
>> more "traditional" mathematical modeling (e.g., linear programming or
>> constraint programming).
>>
>> I would like to discuss this project in more detail. While it is hard to
>> discuss any kind of exact plan at this stage, I already have two questions
>> concerning this project:
>>
>> (1) I have set up an LLVM dev environment, but I am unsure what to do
>> next. Could you advise me on any simple (and, preferably, relevant) tasks
>> to work on?
>> (2) Could you suggest any learning materials to improve my understanding
>> of "low-level" concepts (e.g., CPU concepts such as caching and SIMD)?
>>
>> Best regards,
>> Konstantin Sidorov
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Abid Malik via llvm-dev

2021-Feb-06 19:36 UTC


On Sat, Feb 6, 2021 at 2:29 PM Johannes Doerfert via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> In case you're interested in a runtime topic, I really would love to
> have a predictor for grid/block/thread block size for (OpenMP) GPU
> kernels. We are having real trouble on that end.

Hi,

We are working on an ML model that can predict the profitability of
offloading a kernel to a GPU. I feel that this problem is very much related:
it will have the same feature-engineering and data-preparation challenges
as the ones we are handling in our work.

Abid




-- 
Abid M. Malik
******************************************************
"I have learned silence from the talkative, toleration from the intolerant,
and kindness from the unkind"---Gibran
"Success is not for the chosen few, but for the few who choose" --- John Maxwell
"Being a good person does not depend on your religion or status in life,
your race or skin color, political views or culture. IT DEPENDS ON HOW GOOD
YOU TREAT OTHERS"--- Abid
"The Universe is talking to us, and the language of the Universe is
mathematics."----Abid

Сидоров , Константин Сергеевич via llvm-dev

2021-Feb-07 14:31 UTC


Hello Johannes,

I guess working on the loop transformations is a good starting point –
firstly, it is similar to the already existing code, and secondly, this
problem doesn't look too hard (especially compared with the other ideas).
To the best of my understanding, it corresponds to refactoring
`LoopUnrollAnalyzer` and, if I understood you correctly, `MacroFusion`, in
a way similar to what has been done with the `InlineAdvisor`.

As for the next step, I think that knowledge distillation is a promising
idea – in fact, we can experiment with the approaches from [1], which can
yield a nice inference speed-up in those models.
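
As a minimal sketch of the kind of objective commonly used in knowledge distillation (an illustration only, not taken from the papers in [1] or from LLVM), a small student model can be trained to match the teacher's temperature-softened output distribution. The function names and the temperature value are assumptions chosen for this example:

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl


# Hypothetical logits over three decisions (e.g., unroll factors).
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]
print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge.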

I think working on some kind of unified pipeline for pass selection and
ordering is also an interesting idea – off the top of my head, a viable
approach here is to treat pass scheduling as a single-player game and
run a Monte Carlo tree search to maximize some objective function. For
example, in [2] this kind of approach is used for learning to solve vertex
cover and max-cut, while [3] employs it to search for molecule designs
with specified properties. See also [4] for a survey of
RL methods (including MCTS) for combinatorial problems.
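
To illustrate the single-player-game framing, here is a minimal sketch of Monte Carlo tree search with UCT over fixed-length pass sequences. Everything in it is an assumption made for the example: the pass names are placeholders, and `toy_reward` is a made-up objective standing in for a real measurement such as code size or run time after compiling with the chosen sequence.

```python
import math
import random

PASSES = ["inline", "unroll", "vectorize", "dce"]  # placeholder pass names
MAX_LEN = 4  # length of the pass sequence to build


def toy_reward(seq):
    """Made-up objective in [0, 1]; rewards two ordering constraints."""
    score = 0.0
    if "inline" in seq and "dce" in seq and seq.index("inline") < seq.index("dce"):
        score += 0.5
    if ("unroll" in seq and "vectorize" in seq
            and seq.index("unroll") < seq.index("vectorize")):
        score += 0.5
    return score


class Node:
    def __init__(self, seq, parent=None):
        self.seq = seq        # tuple of passes chosen so far
        self.parent = parent
        self.children = {}    # pass name -> Node
        self.visits = 0
        self.total = 0.0      # sum of rewards seen through this node

    def is_terminal(self):
        return len(self.seq) == MAX_LEN

    def untried(self):
        return [p for p in PASSES if p not in self.children]


def uct_child(node, c=1.4):
    """Pick the child maximizing mean reward plus an exploration bonus."""
    return max(
        node.children.values(),
        key=lambda ch: ch.total / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def mcts(iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(())
    best_seq, best_reward = None, -1.0
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.is_terminal() and not node.untried():
            node = uct_child(node)
        # Expansion: add one untried pass as a new child.
        if not node.is_terminal():
            p = rng.choice(node.untried())
            child = Node(node.seq + (p,), parent=node)
            node.children[p] = child
            node = child
        # Simulation: complete the sequence with random passes.
        seq = list(node.seq)
        while len(seq) < MAX_LEN:
            seq.append(rng.choice(PASSES))
        reward = toy_reward(seq)
        if reward > best_reward:
            best_seq, best_reward = tuple(seq), reward
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best_seq, best_reward


best_seq, best_reward = mcts()
print(best_seq, best_reward)
```

In a real setting each reward evaluation would involve compiling and measuring, so the number of iterations (and hence a learned value or policy network to guide the search, as in [2]) matters far more than in this toy.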

> In case you're interested in a runtime topic, I really would love to
> have a predictor for grid/block/thread block size for (OpenMP) GPU
> kernels. We are having real trouble on that end.

I'm afraid I didn't quite understand this one – could you elaborate a bit
more on this topic?

Best regards,
Konstantin Sidorov

[1] https://github.com/lhyfst/knowledge-distillation-papers#recommended-papers
[2] Abe, Kenshin et al. Solving NP-Hard Problems on Graphs with Extended
AlphaGo Zero. https://arxiv.org/abs/1905.11623
[3] Kajita, S., Kinjo, T. & Nishi, T. Autonomous molecular design by
Monte-Carlo tree search and rapid evaluations using molecular dynamics
simulations. Commun Phys 3, 77 (2020).
https://doi.org/10.1038/s42005-020-0338-y
[4] Mazyavkina, Nina et al. Reinforcement Learning for Combinatorial
Optimization: A Survey. https://arxiv.org/abs/2003.03600


Mircea Trofin via llvm-dev

2021-Feb-08 19:16 UTC


On Sat, Feb 6, 2021 at 11:28 AM Johannes Doerfert <
johannesdoerfert at gmail.com> wrote:
> Hi Konstantin,
>
> I didn't find the time to write new GSoC projects for 2021 yet but
> if you are interested we could probably set one up in this area. I
> also CC'ed Mircea who might be interested in this too, maybe as
> (co-)mentor.
>
Sorry for the slow reply - happy to help if there's a need!

A suggestion for a potential project: bringing this work
<https://dl.acm.org/doi/10.1145/3368826.3377928> to LLVM in a manner
similar to what we did for the inliner - i.e. transparently integrating a
general-purpose learned policy in the compiler.


