Bardia Mahjour via llvm-dev
2019-Sep-13 15:36 UTC
[llvm-dev] Loop Opt WG Meeting Minutes for Sep 11, 2019
Thanks Florian.

Tim you said:
> Some cases can be undone by rematerialization, but not all, and it can
> involve a lot of effort which increases compile time.

Do you have examples of cases where rematerialization is not possible? We are interested in learning about any previous attempts to address the issue in RA. Have you tried it?

Bardia Mahjour
Compiler Optimizations
IBM Toronto Software Lab
bmahjour at ca.ibm.com (905) 413-2336

From: Florian Hahn <florian_hahn at apple.com>
To: Bardia Mahjour <bmahjour at ca.ibm.com>
Cc: via llvm-dev <llvm-dev at lists.llvm.org>, tcorring at amd.com
Date: 2019/09/13 11:16 AM
Subject: [EXTERNAL] Re: [llvm-dev] Loop Opt WG Meeting Minutes for Sep 11, 2019
Sent by: florian_hahn at apple.com

Hi,

On Sep 11, 2019, at 17:51, Bardia Mahjour via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> ---------------------------
> Wed, Sep 11, 2019:
> ---------------------------
>
> - LICM vs Loop Sink Strategy (Whitney)
>   - LICM and SCEV expander hoist code with no regard for increased
>     live-ranges. This is a long-standing issue where historically
>     preference has been to keep LICM more aggressive.

This issue also motivated adding metadata to disable LICM (llvm.loop.licm.disable) recently. https://reviews.llvm.org/D64557

>   - Two questions from IBM side:
>     a. This problem is not specific to the POWER platform, so we are
>        wondering if other people are interested?
>     b. Where would be the best place to address this issue?
>        - Since it's hard to come up with an accurate register pressure
>          estimator in opt, it's probably better to be done fairly late,
>          maybe after instruction scheduling.
>        - A good place to start would be instruction re-materialization in
>          the register allocator.
>          - Problem is the logic in the register allocator can deal with a
>            single instruction (instead of groups of instructions) at a time.
>          - Start by handling a single instruction at a time and apply the
>            same logic to groups of instructions iteratively to see the
>            impact on performance and compile-time.
>        - live-range editor may have utilities to help with code motion.
>        - lazy-code-motion may be a good long term solution, but no one seems
>          to be actively working on it.
>
> - Announcements:
>   - flang call moved so we are no longer in conflict!
>   - Philip is working on making loop vectorizer robust in the face of
>     multiple exits. There are two subproblems:
>     1. vectorizer currently gives up because scev is not giving exit
>        counts (due to a bug?). This is relatively easy to fix and
>        Philip will have a patch for it soon.
>     2. loop exit cannot be analyzed due to data dependent exit, which
>        is currently handled via predication. There is a lot of room
>        for improvement, especially for read-only loops.
>     Please let him know if you are interested.
>
> - Status Updates
>   - Data Dependence Graph (https://reviews.llvm.org/D65350) (Bardia)
>     - All review comments are addressed. Waiting for approval.
>   - Bugzilla bugs update (Vivek)
>     - Florian has a patch fixing loop bugs related to max trip count.
>
> ----------------------------
> Tentative Agenda for Sept 25
> ----------------------------
>
> Presentation from Marc Moreno Maza about his work on delinearization.
>
> - Status Updates
>   - Follow up on multi-dimensional array indexing RFC (Siddharth)
>   - Impact of Loop Rotation on existing passes (Min-Yih)
>   - Data Dependence Graph (https://reviews.llvm.org/D65350) (Bardia)
>   - Bugzilla bugs update (Vivek)
>   - Others?
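To make the register pressure concern in the minutes concrete, here is a minimal sketch of a naive pressure estimate over a straight-line instruction numbering. It is only an illustration under stated assumptions: `ToyInterval` and `maxPressure` are hypothetical names invented for this sketch, not LLVM classes or APIs, and the model ignores loop back edges, register classes and spilling.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical type for this sketch; unrelated to LLVM's LiveInterval class.
// The value is live over the half-open range [start, end) of instruction slots.
struct ToyInterval {
    int start;
    int end;
};

// Largest number of intervals that overlap at any single slot.
int maxPressure(const std::vector<ToyInterval>& intervals) {
    std::vector<std::pair<int, int>> events;  // (slot, +1 at def / -1 after last use)
    for (const ToyInterval& iv : intervals) {
        if (iv.start >= iv.end) continue;     // empty range, never live
        events.emplace_back(iv.start, +1);
        events.emplace_back(iv.end, -1);
    }
    // Pairs sort by slot first, then by delta, so a range ending at a slot is
    // retired before another range starting at the same slot is counted.
    std::sort(events.begin(), events.end());
    int live = 0;
    int worst = 0;
    for (const auto& e : events) {
        live += e.second;
        worst = std::max(worst, live);
    }
    return worst;
}
```

With three values defined at slot 0 and last used at slots 12, 14 and 16, the intervals [0,13), [0,15), [0,17) overlap three deep. If each value were instead computed right before its use, the intervals [11,13), [13,15), [15,17) never overlap, so the estimate drops to one.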
Evgenii Stepanov via llvm-dev
2020-Jan-07 19:15 UTC
[llvm-dev] Loop Opt WG Meeting Minutes for Sep 11, 2019
Sorry for reviving this old thread.
Is this the case that you are talking about?

void use(int *);
void f(int *p) {
  for (int i = 0; i < 1000; ++i) {
    use(p);
    use(p + 1);
    use(p + 2);
    use(p + 3);
  }
}

LICM hoists all the (p + N) computations out of the loop, and there is
nothing that could sink them back.

entry:
  %add.ptr = getelementptr inbounds i32, i32* %p, i64 1
  %add.ptr1 = getelementptr inbounds i32, i32* %p, i64 2
  %add.ptr2 = getelementptr inbounds i32, i32* %p, i64 3
...
for.body:
...
  tail call void @_Z3usePi(i32* %p)
  tail call void @_Z3usePi(i32* nonnull %add.ptr)
  tail call void @_Z3usePi(i32* nonnull %add.ptr1)
  tail call void @_Z3usePi(i32* nonnull %add.ptr2)

With more calls to use(), these common expressions will be
pre-computed, spilled and then reloaded inside the loop. Each
individual instruction is not profitable to sink or rematerialize in
the loop, because that would simply reduce the live range of (p+N) at
the cost of extending the live range of (p).

I see this problem in ARM MTE stack instrumentation. We use a virtual
frame pointer there which makes all local variable access look like
(p+N) in the above example.
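The arithmetic behind "not profitable individually" can be sketched as a toy count of loop-invariant values that have to stay in registers across the loop body. This is an illustration only, not LLVM code: `valuesLiveAcrossLoop` is a hypothetical helper, and it assumes the base pointer p has no other use inside the loop, which is the situation where sinking a single (p+N) is a wash.

```cpp
#include <cassert>

// Toy model, not LLVM code: number of loop-invariant values that must stay
// in registers across the whole loop body.
//   derived: how many (p + k) values LICM hoisted out of the loop
//   sunk:    how many of those are recomputed next to their use instead
// Assumption: the base pointer p has no other use inside the loop.
int valuesLiveAcrossLoop(int derived, int sunk) {
    assert(0 <= sunk && sunk <= derived);
    int stillHoisted = derived - sunk;   // each keeps its own register
    int baseLive = (sunk > 0) ? 1 : 0;   // p must now reach every recomputation
    return stillHoisted + baseLive;
}
```

For three hoisted values, valuesLiveAcrossLoop(3, 0) is 3 and valuesLiveAcrossLoop(3, 1) is still 3, so sinking one instruction buys nothing. Only valuesLiveAcrossLoop(3, 3), which is 1, shows the gain, and that gain is visible only when the group is considered as a whole.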
Bardia Mahjour via llvm-dev
2020-Jan-09 16:13 UTC
[llvm-dev] Loop Opt WG Meeting Minutes for Sep 11, 2019
Hi Evgenii,

The specific issue that we ran into turned out to be related to the expansion of a remainder instruction, which caused it not to be considered by RA rematerialization. However, the example you provided falls into the general category of problems with LICM and live range extension, which is where we started from. I don't know the details, but it looks like when determining the cost of a sink or rematerialization we need to take a more holistic view than doing it on an instruction-by-instruction basis. Is that possible?

Adding Hussain to the discussion as well.

Bardia Mahjour
Compiler Optimizations
IBM Toronto Software Lab
Hussain Kadhem via llvm-dev
2020-Jan-10 19:00 UTC
[llvm-dev] Loop Opt WG Meeting Minutes for Sep 11, 2019
Hi Evgenii,

As Bardia mentioned, I had started work in the direction of extending LiveRangeEdit to allow grouped remat decisions, in response to an srem/urem remat issue which turned out to be more easily solved by modifying the SDAG combiner.

This is now done for the most part, so I am going back to complete the LRE work, going off of the discussion here: http://lists.llvm.org/pipermail/llvm-dev/2016-December/107718.html and followed up here: http://llvm.1065342.n5.nabble.com/llvm-dev-Register-Rematerialization-td119906.html.

The goal is to allow LRE to make decisions that (i) take into account the live ranges of multiple independent values in the same basic block, or (ii) remat a group of dependent instructions in one basic block.

The former should solve the case you are dealing with, and I would be happy to work with you to test the solution with ARM MTE once I'm done implementing it.

Cheers,
Hussain
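Goal (ii) above, rematerializing a group of dependent instructions, can be pictured with a small sketch that gathers every definition that would need to be recomputed to recreate one value. This is only a guess at the shape of the problem, not the LiveRangeEdit design: `ToyDef`, its `cheap` flag and `collectRematGroup` are hypothetical names, values are plain integer ids, and the dependence graph is assumed to be acyclic within one basic block.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <vector>

// Hypothetical description of one defining instruction in this sketch.
struct ToyDef {
    std::vector<int> operands;  // ids of the values it reads
    bool cheap;                 // assumed flag: cheap enough to recompute
};

// Collects, in dependency order, the values that must be recomputed to
// recreate `value` at a point where only `available` values are in registers.
// Returns false if some required value is neither available nor cheaply
// recomputable; `group` may then hold a partial result and should be discarded.
bool collectRematGroup(int value,
                       const std::map<int, ToyDef>& defs,
                       const std::set<int>& available,
                       std::vector<int>& group) {
    if (available.count(value)) return true;  // already in a register
    if (std::find(group.begin(), group.end(), value) != group.end())
        return true;                          // already scheduled for remat
    auto it = defs.find(value);
    if (it == defs.end() || !it->second.cheap) return false;
    for (int operand : it->second.operands) {
        if (!collectRematGroup(operand, defs, available, group)) return false;
    }
    group.push_back(value);                   // operands first, then the user
    return true;
}
```

As a usage example, take an expanded remainder where value 2 is a divide of values 0 and 1, value 3 multiplies 2 by 1, and value 4 subtracts 3 from 0. With 0 and 1 available, asking for value 4 yields the group 2, 3, 4 in that order. A per-instruction check would see only that value 4 depends on the unavailable value 3 and give up.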