
Displaying 20 results from an estimated 6000 matches similar to: "LLVM's loop unroller & llvm.loop.parallel_accesses"

2020 May 14
3
LLVM's loop unroller & llvm.loop.parallel_accesses
This is interesting! So are you saying that loop.parallel_accesses is strictly about loop parallelism, and says nothing about aliasing? I see, I guess we may have been "abusing" the hint and re-purposing it. But isn't LLVM's vectorizer using loop.parallel_accesses to vectorize loops, including vectorizing memory accesses, which, if you ignore loop-carried dependencies, usually means effectively
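For reference, a minimal sketch of how this hint looks in IR (the value names and metadata numbers are illustrative, not taken from the thread): memory accesses are tagged with an llvm.access.group, and the loop's latch branch carries llvm.loop metadata listing those groups under llvm.loop.parallel_accesses:

    for.body:
      %i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
      %p = getelementptr inbounds float, float* %a, i64 %i
      %v = load float, float* %p, align 4, !llvm.access.group !5
      %v2 = fadd float %v, 1.0
      store float %v2, float* %p, align 4, !llvm.access.group !5
      %i.next = add nuw nsw i64 %i, 1
      %cond = icmp ult i64 %i.next, %n
      br i1 %cond, label %for.body, label %exit, !llvm.loop !3

    !3 = distinct !{!3, !4}                     ; LoopID
    !4 = !{!"llvm.loop.parallel_accesses", !5}  ; these access groups are loop-parallel
    !5 = distinct !{}                           ; the access group itself

As the thread discusses, this states only that the tagged accesses have no loop-carried dependences with respect to this loop; it makes no claim that the underlying pointers are noalias within a single iteration.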
2020 May 18
2
LLVM's loop unroller & llvm.loop.parallel_accesses
Would you guys be open to supporting a new hint with the right semantics, e.g. llvm.loop.noalias_accesses? I would need to find support in clang, however, and the main point of support would be the loop unroller behaving as stated in the OP.

On Thu, May 14, 2020 at 3:04 PM Michael Kruse <llvmdev at meinersbur.de> wrote:
> Trivial example:
>
> #pragma clang loop
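Such a hint does not exist in LLVM today; purely as a sketch of what the proposal might look like, by analogy with llvm.loop.parallel_accesses (metadata names and numbers are hypothetical):

    ; Hypothetical metadata -- llvm.loop.noalias_accesses is only proposed in this
    ; thread, not an existing LLVM hint; sketched by analogy with parallel_accesses.
    br i1 %cond, label %for.body, label %exit, !llvm.loop !0

    !0 = distinct !{!0, !1}
    !1 = !{!"llvm.loop.noalias_accesses", !2}   ; proposed: tagged accesses do not alias
    !2 = distinct !{}                           ; access group of the tagged accesses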
2020 May 19
2
LLVM's loop unroller & llvm.loop.parallel_accesses
Skipping the clang question for now; this would have to be a loop pragma of some kind. One step back: what we really need is a way to express that memory accesses between iterations can be re-ordered. The code that's being compiled _is_ noalias, but we don't _have_ to use noalias semantics; e.g. loop-parallel semantics are sufficient. What's missing is a way to express that past the llvm
2020 Jul 01
6
[RFC] Compiled regression tests.
On 7/1/20 12:40 AM, Michael Kruse via llvm-dev wrote:
> To illustrate some difficulties with FileCheck, let's make a
> non-semantic change in LLVM:
>
>     --- a/llvm/lib/Analysis/VectorUtils.cpp
>     +++ b/llvm/lib/Analysis/VectorUtils.cpp
>     @@ -642,8 +642,8 @@ MDNode *llvm::uniteAccessGroups(MDNode *AccGroups1, MDNode *AccGroups2) {
>          return AccGroups1;
2018 Dec 05
2
RFC: LoopIDs are not identifiers (and better loop-parallel metadata)
Dear LLVM community, LLVM IR has a concept of a 'LoopID' [1], which is a misnomer: (a) LoopIDs are not unique: any pass that duplicates IR will also duplicate its metadata (e.g. LoopVersioning), such that thereafter multiple loops are linked with the same LoopID. There is even a test case (Transforms/LoopUnroll/unroll-pragmas-disabled.ll) for multiple loops with the same LoopID. (b)
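To make (a) concrete, a small illustrative sketch (not taken from the RFC): the LoopID is a distinct, self-referential metadata node attached to a loop's latch branch, and nothing prevents two loops from referencing the same node once a pass clones the IR together with its metadata:

    ; After a pass such as LoopVersioning duplicates the loop, both latches
    ; can end up referencing the same LoopID !0.
    br i1 %c1, label %loop1.body, label %exit1, !llvm.loop !0
    ...
    br i1 %c2, label %loop2.body, label %exit2, !llvm.loop !0

    !0 = distinct !{!0, !1}               ; self-referential "LoopID"
    !1 = !{!"llvm.loop.unroll.disable"}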
2020 Feb 19
2
i1 true ^= -1 in DAG matcher?
A constant i1 is stored as a one-bit APInt wrapped in a ConstantInt, which is then wrapped in a ConstantSDNode for SelectionDAG. The BUILD_VECTOR will just point to the same ConstantSDNode for each element. There is no concept of a sign in the storage; it's just a bit. Whether it's treated as 1 or negative 1 depends on the code looking at the value, including printing code. And
2020 Jun 24
6
[RFC] Compiled regression tests.
On Wed, Jun 24, 2020 at 10:12 AM, David Blaikie <dblaikie at gmail.com> wrote:
> > As mentioned in the Differential, generating the tests automatically
> > will lose information about what actually is intended to be tested,
>
> Agreed - and I didn't mean to suggest tests should be automatically
> generated. I work pretty hard in code reviews to encourage tests to
2019 Dec 18
2
Spilling to register for a given register class
Ok, thanks. Except the question was meant slightly differently: less w.r.t. organizing the register classes, and more w.r.t. implementation. I've noticed, for instance, that when trying to model this straightforwardly by writing a vreg from spills and reading it from fills (not further elaborated here), the spiller can't handle vreg def-use pairs: there are assertions making sure a
2020 Feb 19
2
i1 true ^= -1 in DAG matcher?
The vnot PatFrag uses ImmAllOnesV, which should put an OPC_CheckImmAllOnesV in the matcher table. And the matcher table should call ISD::isBuildVectorAllOnes. I believe we use vnot with vXi1 vectors on X86 and I haven't seen any issues. The FIXME you pointed to seems related to a scalar pattern, not a vector pattern. In that case the issue is that the immediate matcher for scalars calls
2020 Jul 16
2
Selection DAG chain question
Yea. I think AMD chains the node they're expanding into, but they don't chain it into an _existing_ chain. e.g. adding A->B to the DAG is ok. But adding A->B and next C->D with B->C is the problem. I appreciate the input On Thu, Jul 16, 2020 at 2:04 PM Matt Arsenault <arsenm2 at gmail.com> wrote: > > > > On Jul 16, 2020, at 17:00, Hendrik Greving
2020 Jul 16
3
Selection DAG chain question
Re: "Do they really need to be chained with each other or anything else" Yes, for two reasons. Our architecture lowers udivrem into something with one producer and two consumers. Reason 1) neither the producer nor the consumers may be reordered. Reason 2) one of the consumers might be missing (either the div or the mod consumer might not be present). Yet we need to keep the consuming instruction with side
2019 Jun 26
3
LAA behavior on Incorrect #pragma omp simd.
Hi All, I have a question regarding the behavior of LoopAccessAnalysis on an incorrect #pragma omp simd with the -fopenmp-simd flag. How should the compiler behave if the #pragma omp simd on a loop is incorrect and this can be proved by LoopAccessAnalysis? Here is the sample code:

    #pragma omp simd
    for (dim_t p = 0; p < m; ++p)
      #pragma unroll
      for (dim_t i = 0; i < 6; ++i) {
        {
2020 Jul 20
2
Selection DAG chain question
I did it by converting it, during code preparation, into an intrinsic that has side effects. A pseudo instruction would work as well. I'm not sure if glue would help, since the nodes A->B, C->D from the example above are not necessarily adjacent. More hooks into the SelectionDAG builder may be an idea for an LLVM extension. For example, in this case, a custom hook allowing a node to be built with an existing chain would
2020 Jul 17
2
Selection DAG chain question
Newbie here. What's the difference between glue and chain? Why can't we add chains to any node we want?

On Fri, Jul 17, 2020, 10:25 PM Björn Pettersson A via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Still sounds to me as if Glue might help (as already proposed by Craig), but
> maybe I've misunderstood something.
>
> Another option is to do a simple
2020 Jul 16
2
Selection DAG chain question
> Chain doesn't guarantee that operations on parallel chains don't get interleaved
This would be a sequential chain...
> This is the case for all operations expanded as library calls
I think their originating node already has a chain (i.e. a mem operand or a side effect in LLVM IR). My case is an arithmetic node without ordering constraints (divrem) getting lowered into something that _does_
2020 Sep 09
2
[EXTERNAL] RE: Machinepipeliner interface. shouldIgnoreForPipelining, actually not ignoring.
Hi James, One last thing - is your target upstream? or are you working on a downstream target? Cheers, James On Tue, 8 Sep 2020 at 23:02, Nagurne, James <j-nagurne at ti.com> wrote: > I greatly appreciate you going back to gather that intel, James. It > actually helps my understanding of the whole pipeliner puzzle quite a bit! > > > > I did identify, like you, that the
2020 Feb 19
2
i1 true ^= -1 in DAG matcher?
Hello, it looks like in the DAG matcher, the DAG has an xor with '-1' for checking an all-true vector. For instance,

    %cmp4.i = icmp ne <8 x i32> %6, %5
    %7 = xor <8 x i1> %cmp4.i, <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>
    [use of %7]

results in a vector of '-1' in the DAG. This also seems to be the reason why LLVM's vnot PatFrag
2020 Jul 16
2
Selection DAG chain question
> No, non-side-effecting operations can be legalized as compiler-rt calls
Right, but not as "regular" nodes with side effects? I guess you could search and analyze the DAG manually, but that seems hacky. Maybe something that LLVM could one day support natively.

On Thu, Jul 16, 2020 at 11:55 AM Matt Arsenault <arsenm2 at gmail.com> wrote:
>
> On Jul 16, 2020, at
2020 Jul 16
2
Selection DAG chain question
I need to lower a node into something in the machine that has side effects, i.e. needs a chain. Specifically, it's UDIVREM, which does not have a chain. I can custom-lower UDIVREM into the nodes I want, with a chain; I can even chain the new nodes and connect them to the entry and root with token factors. But then the new nodes are not chained with respect to other nodes, or not chained
2020 Sep 07
2
[EXTERNAL] RE: Machinepipeliner interface. shouldIgnoreForPipelining, actually not ignoring.
Hi James, Having not worked on this for circa one year I've gone and refreshed my memory. We have a pretty capable implementation of swing modulo scheduling downstream, distinct from the MachinePipeliner implementation. Historically, MachinePipeliner had very tight coupling between the finding of a suitable schedule and emitting the code that adheres to that schedule. I spent quite a bit of