On Feb 7, 2013, at 10:55 AM, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> wrote:> Hi Nadav, > > On 02/07/2013 07:46 PM, Nadav Rotem wrote: >> Pekka suggested that we add two kind of metadata: llvm.loop.parallel >> (attached to each loop latch) and llvm.mem.parallel (attached to each memory >> instruction!). I think that the motivation for the first metadata is clear - >> it says that the loop is data-parallel. I can also see us adding additional >> metadata such as llvm.loop.unrollcnt to allow the users to control the unroll >> count of loops using pragmas. That's fine. Pekka, can you think of >> transformations that may need invalidate or take this metadata into >> consideration ? > > Any pass that introduces new non-parallel memory instructions to the loop, > because they think the loop is sequential and it's ok to do so. I do not know > any other such pass than the one pointed out earlier, reg2mem (if the > variables inside the loop body reuse stack slots). E.g., inlining > should be safe. So should be unrolling an inner loop inside a parallel loop. >I suggest that we only add the 'llvm.loop.parallel' metadata and not llvm.mem.parallem. I believe that it should be the job of the consumer pass (e.g loop-vectorizer) to scan the loop and detect parallelism violations. This is also the approach that we use when we optimize stack slots using lifetime markers. I understand that the consumer passes will have to be more conservative and miss some optimizations. But I still think that this is better than forcing different passes in the compiler to know about parallel metadata.
On 02/07/2013 11:49 PM, Nadav Rotem wrote:> > On Feb 7, 2013, at 10:55 AM, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> wrote: > >> Hi Nadav, >> >> On 02/07/2013 07:46 PM, Nadav Rotem wrote: >>> Pekka suggested that we add two kind of metadata: llvm.loop.parallel >>> (attached to each loop latch) and llvm.mem.parallel (attached to each memory >>> instruction!). I think that the motivation for the first metadata is clear - >>> it says that the loop is data-parallel. I can also see us adding additional >>> metadata such as llvm.loop.unrollcnt to allow the users to control the unroll >>> count of loops using pragmas. That's fine. Pekka, can you think of >>> transformations that may need invalidate or take this metadata into >>> consideration ? >> >> Any pass that introduces new non-parallel memory instructions to the loop, >> because they think the loop is sequential and it's ok to do so. I do not know >> any other such pass than the one pointed out earlier, reg2mem (if the >> variables inside the loop body reuse stack slots). E.g., inlining >> should be safe. So should be unrolling an inner loop inside a parallel loop. >> > > I suggest that we only add the 'llvm.loop.parallel' metadata and not llvm.mem.parallem. I believe that it should be the job of the consumer pass (e.g loop-vectorizer) to scan the loop and detect parallelism violations. This is also the approach that we use when we optimize stack slots using lifetime markers. I understand that the consumer passes will have to be more conservative and miss some optimizations. But I still think that this is better than forcing different passes in the compiler to know about parallel metadata.Hi Nadav, I am not sure if I am able to follow your reasoning. How could the -loop-vectorizer detect parallelism violations? I had the feeling that we introduce the llvm.loop meta-data for the case where we want to inform the loop vectorizer that it can assume the absence of dependences even though it can not prove their absence statically. Do you possibly mean that the -loop-vectorizer should in some way detect if the llvm.loop.parallel metadata is still correct? Given we go without llvm.mem meta-data. How would the -loop-vectorizer detect that the test case in parallel-loops-after-reg2mem.ll (see Pekkas patch) is not parallel any more even though the llvm.loop.parallel metadata is present? The llvm.mem meta-data would give the necessary information that there was a new memory access introduced that was not covered by the llvm.loop.parallel meta-data. However, without this additional information, I have a hard time to see how you can verify if the llvm.loop.parallel metadata is still up to date. All the best, Tobi
Hi Tobi, Thanks for reviewing the proposal. I imagine that it may also affects your parallelization work in Polly.> > I am not sure if I am able to follow your reasoning. How could the -loop-vectorizer detect parallelism violations? I had the feeling that we introduce the llvm.loop meta-data for the case where we want to inform the loop vectorizer that it can assume the absence of dependences even though it can not prove their absence statically. Do you possibly mean that the -loop-vectorizer should in some way detect if the llvm.loop.parallel metadata is still correct?Yes, the loop vectorizer can detect the kind of violations of the "llvm.loop.parallel" metadata that we are worried about.> Given we go without llvm.mem meta-data. How would the -loop-vectorizer > detect that the test case in parallel-loops-after-reg2mem.ll (see Pekkas patch) is not parallel any more even though the llvm.loop.parallel metadata is present?We already detect this case right now. Its really easy to do. The llvm.loop.parallel should only provide information that is not easily available within this compilation unit. For example, assumption that the input pointers don't overlap, or that dynamic indices are within a certain range that allow us to vectorize.> The llvm.mem meta-data would give the necessary information that there was a new memory access introduced that was not covered by the llvm.loop.parallel meta-data. However, without this additional information, I have a hard time to see how you can verify if the llvm.loop.parallel metadata is still up to date.Thanks, Nadav
Nadav Rotem wrote:> > On Feb 7, 2013, at 10:55 AM, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> wrote: > > > Hi Nadav, > > > > On 02/07/2013 07:46 PM, Nadav Rotem wrote: > >> Pekka suggested that we add two kind of metadata: llvm.loop.parallel > >> (attached to each loop latch) and llvm.mem.parallel (attached to each memory > >> instruction!). I think that the motivation for the first metadata is clear - > >> it says that the loop is data-parallel. I can also see us adding additional > >> metadata such as llvm.loop.unrollcnt to allow the users to control the unroll > >> count of loops using pragmas. That's fine. Pekka, can you think of > >> transformations that may need invalidate or take this metadata into > >> consideration ? > > > > Any pass that introduces new non-parallel memory instructions to the loop, > > because they think the loop is sequential and it's ok to do so. I do not know > > any other such pass than the one pointed out earlier, reg2mem (if the > > variables inside the loop body reuse stack slots). E.g., inlining > > should be safe. So should be unrolling an inner loop inside a parallel loop. > > > > I suggest that we only add the 'llvm.loop.parallel' metadata and not llvm.mem.parallem.Ok. A solution similar to Pekka's, using fewer annotations as you suggested, is to reference in 'llvm.loop.parallel' all the datarefs to which the parallel access semantics applies. When the list of datarefs in 'llvm.loop.parallel' differs from the datarefs in the loop, the annotation is deemed invalid. Sebastian -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
On 02/08/2013 07:02 PM, Sebastian Pop wrote:> Nadav Rotem wrote: >> >> On Feb 7, 2013, at 10:55 AM, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> wrote: >> >>> Hi Nadav, >>> >>> On 02/07/2013 07:46 PM, Nadav Rotem wrote: >>>> Pekka suggested that we add two kind of metadata: llvm.loop.parallel >>>> (attached to each loop latch) and llvm.mem.parallel (attached to each memory >>>> instruction!). I think that the motivation for the first metadata is clear - >>>> it says that the loop is data-parallel. I can also see us adding additional >>>> metadata such as llvm.loop.unrollcnt to allow the users to control the unroll >>>> count of loops using pragmas. That's fine. Pekka, can you think of >>>> transformations that may need invalidate or take this metadata into >>>> consideration ? >>> >>> Any pass that introduces new non-parallel memory instructions to the loop, >>> because they think the loop is sequential and it's ok to do so. I do not know >>> any other such pass than the one pointed out earlier, reg2mem (if the >>> variables inside the loop body reuse stack slots). E.g., inlining >>> should be safe. So should be unrolling an inner loop inside a parallel loop. >>> >> >> I suggest that we only add the 'llvm.loop.parallel' metadata and not llvm.mem.parallem. > > > Ok. A solution similar to Pekka's, using fewer annotations as you suggested, is > to reference in 'llvm.loop.parallel' all the datarefs to which the parallel > access semantics applies. When the list of datarefs in 'llvm.loop.parallel' > differs from the datarefs in the loop, the annotation is deemed invalid.That sounds elegant and seems to solve the correctness problems. Cheers, Tobias