
Displaying 20 results from an estimated 6000 matches similar to: "LLVM's loop unroller & llvm.loop.parallel_accesses"

2020 May 14
3
LLVM's loop unroller & llvm.loop.parallel_accesses
This is interesting! So are you saying that loop.parallel_accesses is strictly about loop parallelism, and says nothing about aliasing? I see, I guess we may have been "abusing" the hint and re-purposing it. But isn't LLVM's vectorizer using loop.parallel_accesses to vectorize loops, including vectorizing memory accesses, which, if you ignore loop-carried dependencies, usually means effectively
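For reference, a minimal sketch of how this hint looks in IR (the value names and metadata numbers are illustrative, not taken from the thread): memory accesses are tagged with an llvm.access.group, and the loop's latch branch carries llvm.loop metadata listing those groups under llvm.loop.parallel_accesses:

    for.body:
      %i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
      %p = getelementptr inbounds float, float* %a, i64 %i
      %v = load float, float* %p, align 4, !llvm.access.group !5
      %v2 = fadd float %v, 1.0
      store float %v2, float* %p, align 4, !llvm.access.group !5
      %i.next = add nuw nsw i64 %i, 1
      %cond = icmp ult i64 %i.next, %n
      br i1 %cond, label %for.body, label %exit, !llvm.loop !3

    !3 = distinct !{!3, !4}                     ; LoopID
    !4 = !{!"llvm.loop.parallel_accesses", !5}  ; these access groups are loop-parallel
    !5 = distinct !{}                           ; the access group itself

As the thread discusses, this states only that the tagged accesses have no loop-carried dependences with respect to this loop; it makes no claim that the underlying pointers are noalias within a single iteration.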
2020 May 18
2
LLVM's loop unroller & llvm.loop.parallel_accesses
Would you guys be open to supporting a new hint with the right semantics, e.g. llvm.loop.noalias_accesses? I would need to find support in clang, however, and the main point of support would be the loop unroller behaving as stated in the OP.

On Thu, May 14, 2020 at 3:04 PM Michael Kruse <llvmdev at meinersbur.de> wrote:
> Trivial example:
>
> #pragma clang loop
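Such a hint does not exist in LLVM today; purely as a sketch of what the proposal might look like, by analogy with llvm.loop.parallel_accesses (metadata names and numbers are hypothetical):

    ; Hypothetical metadata -- llvm.loop.noalias_accesses is only proposed in this
    ; thread, not an existing LLVM hint; sketched by analogy with parallel_accesses.
    br i1 %cond, label %for.body, label %exit, !llvm.loop !0

    !0 = distinct !{!0, !1}
    !1 = !{!"llvm.loop.noalias_accesses", !2}   ; proposed: tagged accesses do not alias
    !2 = distinct !{}                           ; access group of the tagged accesses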
2020 May 19
2
LLVM's loop unroller & llvm.loop.parallel_accesses
Skipping the clang question for now; this would have to be a loop pragma of some kind. One step back: what we really need is a way to express that memory accesses between iterations can be re-ordered. The code that's being compiled _is_ noalias, but we don't _have_ to use noalias semantics; e.g. loop-parallel semantics are sufficient. What's missing is a way to express that past the llvm
2020 Jul 01
6
[RFC] Compiled regression tests.
On 7/1/20 12:40 AM, Michael Kruse via llvm-dev wrote:
> To illustrate some difficulties with FileCheck, let's make a
> non-semantic change in LLVM:
>
>     --- a/llvm/lib/Analysis/VectorUtils.cpp
>     +++ b/llvm/lib/Analysis/VectorUtils.cpp
>     @@ -642,8 +642,8 @@ MDNode *llvm::uniteAccessGroups(MDNode *AccGroups1, MDNode *AccGroups2) {
>          return AccGroups1;
2018 Dec 05
2
RFC: LoopIDs are not identifiers (and better loop-parallel metadata)
Dear LLVM community, LLVM IR has a concept of a 'LoopID' [1], which is a misnomer: (a) LoopIDs are not unique: any pass that duplicates IR will also duplicate its metadata (e.g. LoopVersioning), such that thereafter multiple loops are linked with the same LoopID. There is even a test case (Transforms/LoopUnroll/unroll-pragmas-disabled.ll) for multiple loops with the same LoopID. (b)
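To make (a) concrete, a small illustrative sketch (not taken from the RFC): the LoopID is a distinct, self-referential metadata node attached to a loop's latch branch, and nothing prevents two loops from referencing the same node once a pass clones the IR together with its metadata:

    ; After a pass such as LoopVersioning duplicates the loop, both latches
    ; can end up referencing the same LoopID !0.
    br i1 %c1, label %loop1.body, label %exit1, !llvm.loop !0
    ...
    br i1 %c2, label %loop2.body, label %exit2, !llvm.loop !0

    !0 = distinct !{!0, !1}               ; self-referential "LoopID"
    !1 = !{!"llvm.loop.unroll.disable"}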
2020 Feb 19
2
i1 true ^= -1 in DAG matcher?
A constant i1 is stored as a one-bit APInt wrapped in a ConstantInt, which is then wrapped in a ConstantSDNode for SelectionDAG. The BUILD_VECTOR will just point to the same ConstantSDNode for each element. There is no concept of a sign in the storage; it's just a bit. Whether it's treated as 1 or negative 1 depends on the code looking at the value, including printing code. And
2020 Jun 24
6
[RFC] Compiled regression tests.
On Wed, Jun 24, 2020 at 10:12 AM, David Blaikie <dblaikie at gmail.com> wrote:
> > As mentioned in the Differential, generating the tests automatically
> > will lose information about what actually is intended to be tested,
>
> Agreed - and I didn't mean to suggest tests should be automatically
> generated. I work pretty hard in code reviews to encourage tests to
2019 Dec 18
2
Spilling to register for a given register class
Ok, thanks. Except the question was meant slightly differently: less w.r.t. organizing the register classes, and more w.r.t. implementation. I've noticed, for instance, that when trying to model this straightforwardly by writing a vreg from spills and reading it from fills (not further elaborated here), the spiller can't handle vreg def-use pairs: there are assertions making sure a
2020 Feb 19
2
i1 true ^= -1 in DAG matcher?
The vnot PatFrag uses ImmAllOnesV, which should put an OPC_CheckImmAllOnesV in the matcher table. And the matcher table should call ISD::isBuildVectorAllOnes. I believe we use vnot with vXi1 vectors on X86 and I haven't seen any issues. The FIXME you pointed to seems related to a scalar pattern, not a vector pattern. In that case the issue is that the immediate matcher for scalars calls
2020 Jul 16
2
Selection DAG chain question
Yea. I think AMD chains the node they're expanding into, but they don't chain it into an _existing_ chain. e.g. adding A->B to the DAG is ok. But adding A->B and next C->D with B->C is the problem. I appreciate the input On Thu, Jul 16, 2020 at 2:04 PM Matt Arsenault <arsenm2 at gmail.com> wrote: > > > > On Jul 16, 2020, at 17:00, Hendrik Greving
2020 Jul 16
3
Selection DAG chain question
Re: "Do they really need to be chained with each other or anything else" Yes, for two reasons. Our architecture lowers udivrem into something with one producer and two consumers. Reason 1) neither the producer nor the consumers may be reordered. Reason 2) one of the consumers might be missing (either the div or the mod consumer might not be present). Yet we need to keep the consuming instruction with side
2019 Jun 26
3
LAA behavior on Incorrect #pragma omp simd.
Hi All, I have a question regarding the behavior of LoopAccessAnalysis on an incorrect #pragma omp simd with the -fopenmp-simd flag. How should the compiler behave if the #pragma omp simd on a loop is incorrect and this can be proved by LoopAccessAnalysis? Here is the sample code:

    #pragma omp simd
    for (dim_t p = 0; p < m; ++p)
      #pragma unroll
      for (dim_t i = 0; i < 6; ++i) {
        {
2020 Jul 20
2
Selection DAG chain question
I did it by converting it, during code preparation, into an intrinsic that has side effects. A pseudo instruction would work as well. I'm not sure if glue would help, since the nodes A->B, C->D from the example above are not necessarily adjacent. More hooks into the SelectionDAG builder may be an idea for an LLVM extension. For example, in this case, a custom hook allowing a node to be built with an existing chain would
2020 Jul 17
2
Selection DAG chain question
Newbie here. What's the difference between glue and chain? Why can't we add chains to any node we want?

On Fri, Jul 17, 2020, 10:25 PM Björn Pettersson A via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Still sounds to me as if Glue might help (as already proposed by Craig), but
> maybe I've misunderstood something.
>
> Another option is to do a simple
2020 Jul 16
2
Selection DAG chain question
> Chain doesn't guarantee that operations on parallel chains don't get interleaved
This would be a sequential chain...
> This is the case for all operations expanded as library calls
I think their originating node already has a chain (i.e. a mem operand or a side effect in LLVM IR). My case is an arithmetic node without ordering constraints (divrem) getting lowered into something that _does_
2020 Sep 09
2
[EXTERNAL] RE: Machinepipeliner interface. shouldIgnoreForPipelining, actually not ignoring.
Hi James, One last thing - is your target upstream? or are you working on a downstream target? Cheers, James On Tue, 8 Sep 2020 at 23:02, Nagurne, James <j-nagurne at ti.com> wrote: > I greatly appreciate you going back to gather that intel, James. It > actually helps my understanding of the whole pipeliner puzzle quite a bit! > > > > I did identify, like you, that the
2020 Feb 19
2
i1 true ^= -1 in DAG matcher?
Hello, it looks like in the DAG matcher, the DAG has an xor with '-1' for checking an all-true vector. For instance,

    %cmp4.i = icmp ne <8 x i32> %6, %5
    %7 = xor <8 x i1> %cmp4.i, <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>
    [use of %7]

results in a vector of '-1' in the DAG. This also seems to be the reason why LLVM's vnot PatFrag
2020 Jul 16
2
Selection DAG chain question
> No, non-side-effecting operations can be legalized as compiler-rt calls
Right, but not as "regular" nodes with side effects? I guess you could search and analyze the DAG manually, but that seems hacky. Maybe something that LLVM could one day support natively.

On Thu, Jul 16, 2020 at 11:55 AM Matt Arsenault <arsenm2 at gmail.com> wrote:
>
> On Jul 16, 2020, at
2020 Jul 16
2
Selection DAG chain question
I need to lower a node into something in the machine that has side effects, i.e. needs a chain. Specifically, it's UDIVREM, which does not have a chain. I can custom-lower UDIVREM into the nodes I want, with a chain; I can even chain the new nodes and connect them to the entry and root with token factors. But then the new nodes are not chained with respect to other nodes, or not chained
2020 Sep 07
2
[EXTERNAL] RE: Machinepipeliner interface. shouldIgnoreForPipelining, actually not ignoring.
Hi James, Having not worked on this for circa one year I've gone and refreshed my memory. We have a pretty capable implementation of swing modulo scheduling downstream, distinct from the MachinePipeliner implementation. Historically, MachinePipeliner had very tight coupling between the finding of a suitable schedule and emitting the code that adheres to that schedule. I spent quite a bit of