Here's an attempt to nail down the annotation semantics with support for respecting forward lexical dependences. Each load, store, call, or invoke instruction can be labeled with !llvm.mem.vector_loop_access, which has two operands: * The first operand is an integer denoting lexical position. The positions need not be consecutive, and may contain duplicates. * The second operand is the same as the first operand to llvm.mem.parallel_loop_access. It's second so that it can be omitted - see mention of inlining further below. The LoopID can have "llvm.loop.safelen" metadata. Here is an example with three accesses with positions {10, 15 17} and a safelen of 42. define void @foo(float* %a, float* %b) { entry: br label %for.body for.body: ; preds = %for.body, %entry ... %0 = load float* %arrayidx, !llvm.mem.vector_loop_access !{metadata i32 10, !0} ... %1 = load float* %arrayidx2, !llvm.mem.vector_loop_access !{metadata i32 15, !0} ... store float %add3, float* %arrayidx5, !llvm.mem.vector_loop_access !{metadata i32 17, !0} ... br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0 for.end: ; preds = %for.body ret void } !0 = metadata !{metadata !0, metadata !1} !1 = metadata !{metadata !"llvm.loop.safelen", i32 42} Let lex(x) denote the lexical position metadata for access x. If two accesses A and B: * Are marked with llvm.mem.vector_loop_access that reference a loop L AND * lex(A)<lex(B) AND * L has llvm.loop.safelen with value K THEN for loop L, the dependence distance from B to A is at least K iterations. When llvm.mem.vector_loop_access is used on a call/invoke instruction, any accesses therein inherit that lexical position. Open issue: when inlining a callee with more than one memory access, the accesses will end up with the same lexical position, and thus lose the dependence distance clue. I don't know how often this drawback would show up in practice. A possibility is to allow the callee instructions to be annotated with a !llvm.mem.vector_loop_access that omits the LoopId operand, i.e. just has lexical position information. Then inlining could be more clever. - Arch Robison Intel Corporation
Humphreys, Jonathan
2014-Aug-27 22:37 UTC
[LLVMdev] Proposal for ""llvm.mem.vectorize.safelen"
Sorry for coming to the discussion so late. I have a couple of questions/comments: - I'm no openMP expert but I don't understand the relevance of lexically forward or backward dependence with respect to the safelen() metadata. I would be surprised if the openMP spec based safelen() upon lexical ordering for the reasons that you are running up against. - I agree with Hal's request that we don't use the name 'safelen'. I can think of many other 'safety' requirements for SIMD (such as alignment), and the proposed metadata is specifically addressing memory dependence. - what are the semantics for outer loop dependences in nested loops? I believe that openMP restricts the usage of this clause to nested loops that are perfectly nested and collapsible. We will be introducing this metadata in LLVM without such restrictions. - Assuming that lexical ordering is irrelevant to this discussion, it would be nice if we could build upon the llvm.mem.parallel_loop_access metadata rather than inventing more metadata. I was thinking that you could add an additional parameter that specifies the minimum dependence distance. It would be optional and if absent, would be infinite (which preserves existing semantics). - And having said that, I'm also wondering (but haven't thought any of it through) if instead of inventing metadata to address particular language features, we should instead invent a more general mechanism for expressing memory independence through metadata. Then the expression of a language feature could be decomposed into basic dependence metadata. There are many advantages to this. It would be more precise (operation based vs loop based). Being basic/lowlevel, passes that invalidate it (or enhance it) can do that at a fundamental level and not need to be aware of a specific language feature that might be implied in the name of the metadata. And being more precise, it would be more robust because any pass that introduces memory operations would not have the metadata and so wouldn't inherit assertions that might not be true for it. Jon -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Robison, Arch Sent: Thursday, August 21, 2014 2:17 PM To: Hal Finkel; Renato Golin; Alexander Musman (alexander.musman at gmail.com) Cc: LLVM Dev Subject: Re: [LLVMdev] Proposal for ""llvm.mem.vectorize.safelen" Here's an attempt to nail down the annotation semantics with support for respecting forward lexical dependences. Each load, store, call, or invoke instruction can be labeled with !llvm.mem.vector_loop_access, which has two operands: * The first operand is an integer denoting lexical position. The positions need not be consecutive, and may contain duplicates. * The second operand is the same as the first operand to llvm.mem.parallel_loop_access. It's second so that it can be omitted - see mention of inlining further below. The LoopID can have "llvm.loop.safelen" metadata. Here is an example with three accesses with positions {10, 15 17} and a safelen of 42. define void @foo(float* %a, float* %b) { entry: br label %for.body for.body: ; preds = %for.body, %entry ... %0 = load float* %arrayidx, !llvm.mem.vector_loop_access !{metadata i32 10, !0} ... %1 = load float* %arrayidx2, !llvm.mem.vector_loop_access !{metadata i32 15, !0} ... store float %add3, float* %arrayidx5, !llvm.mem.vector_loop_access !{metadata i32 17, !0} ... br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0 for.end: ; preds = %for.body ret void } !0 = metadata !{metadata !0, metadata !1} !1 = metadata !{metadata !"llvm.loop.safelen", i32 42} Let lex(x) denote the lexical position metadata for access x. If two accesses A and B: * Are marked with llvm.mem.vector_loop_access that reference a loop L AND * lex(A)<lex(B) AND * L has llvm.loop.safelen with value K THEN for loop L, the dependence distance from B to A is at least K iterations. When llvm.mem.vector_loop_access is used on a call/invoke instruction, any accesses therein inherit that lexical position. Open issue: when inlining a callee with more than one memory access, the accesses will end up with the same lexical position, and thus lose the dependence distance clue. I don't know how often this drawback would show up in practice. A possibility is to allow the callee instructions to be annotated with a !llvm.mem.vector_loop_access that omits the LoopId operand, i.e. just has lexical position information. Then inlining could be more clever. - Arch Robison Intel Corporation _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> Sorry for coming to the discussion so late. I have a couple of questions/comments:Actually, you're note is timely, because I'm planning to send out a patch (as soon as I update LangRef and rerun tests) that supports safelen but *not* lexical dependences. I don't need the safelen for Julia, but having done the work and seeing that OpenMP needs it, feel that I should finish the work to contribute it. Yes, safelen and lexical forward dependences are separable. I was initially under the impression that to implement OpenMP 4.0 simd loops correctly, support for both safelen and lexical forward dependences were necessary. Indeed, the Intel compiler supports forward lexical dependences in OpenMP 4.0 simd loops. (Sophisticated vectorizers have to support them anyway for auto-vectorization.) But upon closer reading, and discussion with the experts here, it became apparent that the OpenMP 4.0 spec talks about execution of SIMD instructions, but does not say much about how source code is mapped onto those instructions, so support for lexical dependences is not necessary for 4.0. Likewise, I'm dropping my experiment with supporting lexical forward dependences in Julia, though I'm mulling an alternative language-level solution.> Assuming that lexical ordering is irrelevant to this discussion, it would be nice if we could build upon the llvm.mem.parallel_loop_access metadata rather than inventing more metadata.Right.> I was thinking that you could add an additional parameter that specifies the minimum dependence distance. It would be optional and if absent, would be infinite (which preserves existing semantics).The parameter is best put on the loop instead of each access. That way it can have different values for different loop levels. I.e., each loop is marked with its relevant component of the minimum distance vector. With respect to a more general mechanism, and perhaps more complete distance information, that seems worthwhile if there's real benefit to be obtained. I'll leave that to others, though I'd be curious to see proposed designs. I haven't explored the limits of LLVM's metadata format. For instance, what's the best way to store a distance matrix as metadata? - I agree with Hal's request that we don't use the name 'safelen' I'm fine with the "minimum_dependence_distance" if everyone else is. - Arch -----Original Message----- From: Humphreys, Jonathan [mailto:j-humphreys at ti.com] Sent: Wednesday, August 27, 2014 5:37 PM To: Robison, Arch; Hal Finkel; Renato Golin; Alexander Musman (alexander.musman at gmail.com) Cc: LLVM Dev Subject: RE: [LLVMdev] Proposal for ""llvm.mem.vectorize.safelen" Sorry for coming to the discussion so late. I have a couple of questions/comments: - I'm no openMP expert but I don't understand the relevance of lexically forward or backward dependence with respect to the safelen() metadata. I would be surprised if the openMP spec based safelen() upon lexical ordering for the reasons that you are running up against. - I agree with Hal's request that we don't use the name 'safelen'. I can think of many other 'safety' requirements for SIMD (such as alignment), and the proposed metadata is specifically addressing memory dependence. - what are the semantics for outer loop dependences in nested loops? I believe that openMP restricts the usage of this clause to nested loops that are perfectly nested and collapsible. We will be introducing this metadata in LLVM without such restrictions. - Assuming that lexical ordering is irrelevant to this discussion, it would be nice if we could build upon the llvm.mem.parallel_loop_access metadata rather than inventing more metadata. I was thinking that you could add an additional parameter that specifies the minimum dependence distance. It would be optional and if absent, would be infinite (which preserves existing semantics). - And having said that, I'm also wondering (but haven't thought any of it through) if instead of inventing metadata to address particular language features, we should instead invent a more general mechanism for expressing memory independence through metadata. Then the expression of a language feature could be decomposed into basic dependence metadata. There are many advantages to this. It would be more precise (operation based vs loop based). Being basic/lowlevel, passes that invalidate it (or enhance it) can do that at a fundamental level and not need to be aware of a specific language feature that might be implied in the name of the metadata. And being more precise, it would be more robust because any pass that introduces memory operations would not have the metadata and so wouldn't inherit assertions that might not be true for it. Jon
Maybe Matching Threads
- [LLVMdev] Proposal for ""llvm.mem.vectorize.safelen"
- [LLVMdev] Proposal for ""llvm.mem.vectorize.safelen"
- [LLVMdev] Proposal for ""llvm.mem.vectorize.safelen"
- [LLVMdev] Proposal for ""llvm.mem.vectorize.safelen"
- [LLVMdev] Proposal for ""llvm.mem.vectorize.safelen"