OK,

I updated the text to LangRef in r209507 after some editing.

On 05/11/2014 12:36 PM, Pekka Jääskeläinen wrote:
> Hi,
>
> This looks good to me except that the first sentence
> could already include "that refer to the same loop" or
> similar.
>
> I could imagine that e.g. loop invariant code motion,
> if applied to a parallel loop, could hoist code out of
> inner loops to outer (parallel) loops. Then the outer
> loop contains parallel_loop_access instructions referring to
> the inner loop, making the outer loop non-trivially parallel.
>
> But these are probably rare cases as, at least in pocl, basic
> optimizations have already been executed before the work-group
> function generation where the parallel work-item loops are created.
>
> On 05/10/2014 12:08 AM, Humphreys, Jonathan wrote:
>> I propose that we change the first paragraph of
>> http://llvm.org/docs/LangRef.html#llvm-mem-parallel-loop-access-metadata:
>>
>> ---
>> For a loop to be parallel, in addition to using the llvm.loop metadata to
>> mark the loop latch branch instruction, also all of the memory accessing
>> instructions in the loop body need to be marked with the
>> llvm.mem.parallel_loop_access metadata. If there is at least one memory
>> accessing instruction not marked with the metadata, the loop must be
>> considered a sequential loop. This causes parallel loops to be converted to
>> sequential loops due to optimization passes that are unaware of the parallel
>> semantics and that insert new memory instructions to the loop body.
>> ---
>>
>> To be:
>>
>> ---
>> The llvm.mem.parallel_loop_access metadata attaches to instructions and
>> denotes that no loop carried memory dependence exists between it and other
>> such denoted instructions. The llvm.mem.parallel_loop_access metadata
>> refers to a loop identifier, or metadata containing a list of loop
>> identifiers for nested loops - these are the loops to which the metadata
>> applies. Precisely, given two instructions m1 and m2 that both have
>> llvm.mem.parallel_loop_access metadata, with L1 and L2 being the set of
>> loops associated with that metadata, respectively, then there is no loop
>> carried dependence between m1 and m2 for loops in both L1 and L2.
>>
>> Trivially, if all memory accessing instructions in a loop have
>> llvm.mem.parallel_loop_access metadata that refers to that loop, then the
>> loop has no loop carried memory dependence and is considered to be a
>> parallel loop. Note that if not all memory access instructions have this
>> metadata referring to this loop, then the loop is not trivially parallel -
>> additional memory dependence analysis is required to make that
>> determination. As a fail safe mechanism, this causes loops that were
>> originally parallel to be considered sequential if optimization passes that
>> are unaware of the parallel semantics insert new memory instructions into
>> the loop body.
>> ---
>>
>> Please let me know your feedback.
>>
>> As far as taking advantage of the more precise semantics, I've dropped the
>> priority of this work because I'm not seeing cases where we insert memory
>> instructions. I'm wondering if others have any anecdotal evidence of how
>> often we 'lose' the fact that a loop is parallel because of inserting
>> memory instructions during optimizations.
>>
>> Thanks
>> Jon
>>
>>
>> -----Original Message-----
>> From: Hal Finkel [mailto:hfinkel at anl.gov]
>> Sent: Monday, May 05, 2014 5:14 PM
>> To: Humphreys, Jonathan
>> Cc: Pekka Jääskeläinen; llvmdev at cs.uiuc.edu; Tobias Grosser
>> Subject: Re: [LLVMdev] parallel loop metadata question
>>
>> ----- Original Message -----
>>> From: "Jonathan Humphreys" <j-humphreys at ti.com>
>>> To: "Hal Finkel" <hfinkel at anl.gov>, "Tobias Grosser"
>>> <tobias at grosser.es>
>>> Cc: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi>,
>>> llvmdev at cs.uiuc.edu
>>> Sent: Monday, May 5, 2014 5:09:42 PM
>>> Subject: RE: [LLVMdev] parallel loop metadata question
>>>
>>> Will do. I will write something up.
>>>
>>> Hal, your concern below isn't so much with the proposed semantics but
>>> rather with the use - that optimizations must respect the loop for
>>> which the metadata applies, correct?
>>
>> Yes, sounds right. Nevertheless, I would recommend putting such a cautionary
>> note into the documentation itself just to make explicit an issue which
>> might otherwise be overlooked.
>>
>> -Hal
>>
>>>
>>> Thanks
>>> Jon
>>>
>>> -----Original Message-----
>>> From: Hal Finkel [mailto:hfinkel at anl.gov]
>>> Sent: Monday, May 05, 2014 4:00 AM
>>> To: Tobias Grosser
>>> Cc: Pekka Jääskeläinen; Humphreys, Jonathan; llvmdev at cs.uiuc.edu
>>> Subject: Re: [LLVMdev] parallel loop metadata question
>>>
>>> ----- Original Message -----
>>>> From: "Tobias Grosser" <tobias at grosser.es>
>>>> To: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi>, "Jonathan
>>>> Humphreys" <j-humphreys at ti.com>, llvmdev at cs.uiuc.edu
>>>> Sent: Monday, May 5, 2014 3:36:07 AM
>>>> Subject: Re: [LLVMdev] parallel loop metadata question
>>>>
>>>> On 05/05/2014 10:14, Pekka Jääskeläinen wrote:
>>>>> On 05/02/2014 07:22 PM, Humphreys, Jonathan wrote:
>>>>>> Thanks for the link. I understand your concern of caution with
>>>>>> metadata.
>>>>>> I cannot, though, imagine how the dependence relation
>>>>>> (independence) of two memory references can be affected by a
>>>>>> third memory reference. If two references are independent across
>>>>>> loop iterations, then they are independent, and any other load
>>>>>> or store cannot change that. Right?
>>>>>
>>>>> Yes, it makes sense. I'm mostly concerned about accesses to stack,
>>>>> but even those at this point should remain independent. Otherwise
>>>>> even the current semantics might produce broken code with parallel
>>>>> stack accesses.
>>>>>
>>>>> However, as this is such a major semantics change to the original
>>>>> one, I'd like to hear more opinions on it. I suggest you create a
>>>>> (documentation) patch where the new semantics is articulated and
>>>>> request comments for it at the LLVM-commits list.
>>>>
>>>> I agree with both. I think the extension is very reasonable and I
>>>> also do not see a reason why this interpretation should cause
>>>> troubles.
>>>> However, to get it right it would be good to get this thoroughly
>>>> reviewed.
>>>
>>> I agree, I think this sounds reasonable. You'll certainly need to be
>>> careful, however, that the associated instruction has not been
>>> hoisted/sunk out of the associated loops. Even if the load is one that
>>> can be speculated, that does not mean that there was not a control
>>> dependence on the independence information itself.
>>>
>>> -Hal
>>>
>>>>
>>>> Tobias
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>>
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>

--
Pekka
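As an illustration only (not part of the LangRef text), the "trivially parallel" rule from the proposed wording can be sketched in Python. The MemAccess class, the loop names "inner" and "outer", and the helper is_trivially_parallel are all hypothetical; they model the wording quoted in the thread and do not correspond to any LLVM API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class MemAccess:
    """Hypothetical stand-in for a memory-accessing instruction."""
    name: str
    # Loop identifiers named by the instruction's
    # llvm.mem.parallel_loop_access metadata (empty if it has none).
    parallel_loops: FrozenSet[str] = frozenset()


def is_trivially_parallel(loop_id: str, body: Iterable[MemAccess]) -> bool:
    """True if every memory access in the loop body refers to loop_id."""
    return all(loop_id in access.parallel_loops for access in body)


# Both accesses refer to the inner loop, so it is trivially parallel.
inner_body = [
    MemAccess("load A[i]", frozenset({"inner"})),
    MemAccess("store B[i]", frozenset({"inner"})),
]

# A pass unaware of the metadata adds an unmarked store: the fail-safe
# applies and the loop is no longer trivially parallel.
spilled_body = inner_body + [MemAccess("store tmp")]

# An access hoisted from the inner loop into the outer loop still refers
# only to the inner loop, so the outer loop is not trivially parallel.
outer_body = [
    MemAccess("load C[j]", frozenset({"outer"})),
    MemAccess("load A[0] (hoisted)", frozenset({"inner"})),
]

print(is_trivially_parallel("inner", inner_body))
print(is_trivially_parallel("inner", spilled_body))
print(is_trivially_parallel("outer", outer_body))
```

In this model a loop that fails the check is not necessarily sequential; as the proposed text says, additional memory dependence analysis would be needed to decide.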
Pekka, thanks for updating this.

A small edit - the sentence ending with:

"with L1 and L2 being the set of loops associated with that metadata,
respectively, then there is no loop carried dependence between m1 and m2 for
loops L1 or L2."

Should read:

"with L1 and L2 being the set of loops associated with that metadata,
respectively, then there is no loop carried dependence between m1 and m2 for
loops in both L1 and L2."

Jon

-----Original Message-----
From: Pekka Jääskeläinen [mailto:pekka.jaaskelainen at tut.fi]
Sent: Friday, May 23, 2014 6:45 AM
To: Humphreys, Jonathan; Hal Finkel; Tobias Grosser
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] parallel loop metadata question

OK,

I updated the text to LangRef in r209507 after some editing.
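The difference between the two wordings can be made concrete with a small Python sketch. The function names and the example loop sets below are hypothetical and serve only to contrast the "in both L1 and L2" reading (set intersection) with the "L1 or L2" reading (set union).

```python
from typing import Iterable, Set


def loops_in_both(l1: Iterable[str], l2: Iterable[str]) -> Set[str]:
    """Corrected wording: the loops for which the metadata on m1 and m2
    rules out a loop-carried dependence are those in both sets."""
    return set(l1) & set(l2)


def loops_in_either(l1: Iterable[str], l2: Iterable[str]) -> Set[str]:
    """The 'L1 or L2' reading, shown only for comparison."""
    return set(l1) | set(l2)


# m1 refers to both loops of a nest; m2 refers only to the inner loop.
m1_loops = {"inner", "outer"}
m2_loops = {"inner"}

print(sorted(loops_in_both(m1_loops, m2_loops)))
print(sorted(loops_in_either(m1_loops, m2_loops)))
```

Under the union reading, the pair would also be treated as independent across iterations of the outer loop, although m2 makes no claim about that loop; the intersection avoids that overstatement.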
Thanks, updated in r210327.

On 05/27/2014 07:23 PM, Humphreys, Jonathan wrote:
> Pekka, thanks for updating this.
>
> A small edit - the sentence ending with:
>
> "with L1 and L2 being the set of loops associated with that metadata,
> respectively, then there is no loop carried dependence between m1 and m2 for
> loops L1 or L2."
>
> Should read:
>
> "with L1 and L2 being the set of loops associated with that metadata,
> respectively, then there is no loop carried dependence between m1 and m2 for
> loops /in both L1 and L2/."
>
> Jon

--
Pekka