search for: hwloop

Displaying 18 results from an estimated 18 matches for "hwloop".

2020 May 19
2
LV: predication
...t think this change needs to be predicated on a big change landing first like the LV switching to VP intrinsics. > The difference is that in the VP version there is an explicit dependence of every vector operation in the loop to the set.num.elements intrinsic. This dependence is obscured in the hwloop proposals (more on that below). This discussion is getting complicated, because I think we are discussing 3 topics at the same time now: predication, hardware loops, and a new set of intrinsics, the VP intrinsics. For the change that kicked off this thread, i.e. 1 new intrinsic to get the active l...
2020 May 19
3
LV: predication
...'t have these disadvantages. Also, the vectoriser isn't using the VP intrinsics yet, so using them is a bridge too far for me at this point. But we should definitely re-evaluate at some point if we should use or transition to them in our backend passes. > Are all vector instructions in the hwloop implicitly predicated or only the masked load/store ops? In a nutshell, when a vector loop with (explicitly) predicated masked loads/stores hits the backend, we translate the generic intrinsic get.active.mask to a target-specific one. All predication remains explicit, and this remains the case. Onl...
2020 May 18
2
LV: predication
...eneric hardware loop codegen pass inserts hardware loop intrinsics. Very late in the pipeline, e.g. in the PPC and ARM backends, this is picked up and turned into an actual hardware loop, in our case possibly predicated, or it is reverted. > What will you do if there are no masked intrinsics in the hwloop body? Nothing. I.e., it can become a hardware loop, but not one with implicit predication. > And I am curious why couldn't you use the %evl parameter of VP intrinsics to get the tail predication you are interested in? In D79100 <https://reviews.llvm.org/D79100>, intrinsic get.active....
2020 May 18
2
LV: predication
...ruppe at gmail.com <hanna.kruppe at gmail.com> Subject: Re: [llvm-dev] LV: predication On 5/5/20 12:07 AM, Sjoerd Meijer via llvm-dev wrote: what we would like to generate is a vector loop with implicit predication, which works by setting up the number of elements processed by the loop: hwloop 10 [i:4] = b[i:4] + c[i:4] Why couldn't you use VP intrinsics and scalable types for this? %bval = <4 x vscale x double> call @vp.load(..., /* %evl */ 10) %cval = <4 x vscale x double> call @vp.load(..., /* %evl */ 10) %sum = <4 x vscale x double> fadd %bval, %cva...
2012 Nov 22
2
[LLVMdev] Disable loop unroll pass
...nt dec/inc and compare. 1) is irrelevant to HW loop. Any scalar optimizer should handle 1). It is not difficult at all to handle 2) in CodeGen and it is unnecessary to introduce an Operator just for that purpose. Shuxin On 11/22/2012 06:03 AM, Gang Yu wrote: > I am the designer for open64 hwloop structure, but I am not a student. > > Hope the following helps: > > To transform a loop into hwloop, we need the help from optimizer. For > example, > > while(k3>=10){ > sum+=k1; > k3 --; > } > > into the form: > >...
2020 May 04
3
LV: predication
...t is rounded up to 12, the next multiple of 4, and lanes are predicated on i < 10: for i= 0 to 12 a[i:4] = b[i:4] + c[i:4], if i < 10; what we would like to generate is a vector loop with implicit predication, which works by setting up the number of elements processed by the loop: hwloop 10 [i:4] = b[i:4] + c[i:4] This is implicit since instructions don't produce/consume a mask, but it is generated and used under the hood by the "hwloop" construct. Your observation that the information in the IR is mostly there is correct, but rather than pattern matching and recon...
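The two loop shapes described in the snippet above can be modelled in a few lines of Python. This is only an illustrative sketch: the function names and the scalar per-lane simulation are assumptions made here, not anything from the thread or from LLVM. The first function mirrors the explicitly predicated form (trip count rounded up to a multiple of the vector width, each lane guarded by i < n); the second mirrors the "hwloop n" form, where the loop is set up with the element count and no mask appears in the body.

```python
VF = 4  # vector width: lanes processed per vector operation (assumed value)


def add_explicit_mask(b, c, n):
    """Tail-folded loop with explicit predication on each lane."""
    a = [0] * n
    rounded = (n + VF - 1) // VF * VF  # e.g. n = 10 rounds up to 12
    for i in range(0, rounded, VF):
        # The mask is a visible value that every operation consumes.
        mask = [i + lane < n for lane in range(VF)]
        for lane in range(VF):
            if mask[lane]:
                a[i + lane] = b[i + lane] + c[i + lane]
    return a


def add_implicit_count(b, c, n):
    """Models 'hwloop n': the loop itself tracks how many elements remain."""
    a = [0] * n
    remaining = n
    i = 0
    while remaining > 0:
        active = min(VF, remaining)  # lanes enabled under the hood
        for lane in range(active):
            a[i + lane] = b[i + lane] + c[i + lane]
        i += VF
        remaining -= active
    return a
```

Both functions compute the same result for any n; the difference is only where the predicate lives, which is the point the thread is debating.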
2020 Aug 07
2
Branches which return values in SelectionDAG
Hi all, I am working on modeling an instruction similar to SystemZ's 'BRCT', which takes a register, decrements it, and branches if the register is nonzero. I saw that the LLVM backend for SystemZ generates the instruction in a MachineFunctionPass as part of a pass intended to eliminate or combine compares. I then looked at ARM, where it uses the HardwareLoops pass first, and then a
2012 Nov 22
0
[LLVMdev] Disable loop unroll pass
Hi shuxin, Promote while-loop to do-loop is the job of loop induction recognized, not this transformation. The scalar transform for hwloop in optimizer is for that it is a trouble to discriminate trip counting code with the real production code stuff and do the elimination in cg, we have to write customized code to handle this general stuff in every targets. So, we take the help from optimizer DCE, make the trip count code hidden in...
2012 Nov 22
0
[LLVMdev] Disable loop unroll pass
I am the designer for open64 hwloop structure, but I am not a student. Hope the following helps: To transform a loop into hwloop, we need the help from optimizer. For example, while(k3>=10){ sum+=k1; k3 --; } into the form: zdl_loop(k3-9) { sum+=k1; } So, we introduce a new ZDLBR whirl(open64 optimiz...
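The trip-count rewrite in the snippet above (a while loop on k3 >= 10 becoming a counted loop of k3-9 iterations) can be checked with a small Python model. The function names are hypothetical, and the clamp to zero for k3 < 10 is an assumption added here; the snippet only shows the expression k3-9.

```python
def while_form(k1, k3, total=0):
    """Original loop: the trip-counting code is mixed with the real work."""
    while k3 >= 10:
        total += k1
        k3 -= 1
    return total


def counted_form(k1, k3, total=0):
    """Counted form: the trip count is computed once, up front."""
    # Assumption: clamp to zero so the loop is skipped when k3 < 10.
    trip_count = max(k3 - 9, 0)
    for _ in range(trip_count):
        total += k1
    return total
```

With the count isolated like this, the decrement and compare on k3 inside the body become dead code, which is the property the poster wants the optimizer's DCE to exploit.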
2019 Jul 11
4
llvm.set.loop.iterations
...er of branches-to-make / backedges-taken then this is slightly awkward as we need to subtract the constant 1. Of course if the iteration count was constant this is trivial but if it is passed in register then it is not so nice to have to insert these subtract instructions from a MIR pass (where the hwloop finalization is being done). I wonder what would be the best way to deal with this. One way would be to add a TTI hook gating the original addition but then the intrinsic will have two meanings depending on what this hook returns which is not good. Another way would be to introduce a second intr...
2012 Nov 22
3
[LLVMdev] Disable loop unroll pass
Hi Shuxin, Eli, On 22/11/2012 03:19, Shuxin Yang wrote: > Hi, Ivan: > > My $0.02. hasZeroCostLooping() disabling unrolling does not seem > to be > appropriate for other architectures, at least the one I worked before. I appreciate your feed-back. Could you give an example where building a hw loop is not appropriate for your target? > > You mentioned: >
2012 Nov 23
0
[LLVMdev] Disable loop unroll pass
Ok, let's stop the open64 "pollution". Whether the design is as you stated doesn't simpler, the code before and after the change already tells us. We take detailed investigation on gcc support for hwloop, then we come to the conclusion we are essentially the same. So I think the idea can be shared among different compilers, general abstract tripcount, make pseudo operators for identification and special handling, that's what I think might help. Sent from Huawei Mobile Shuxin Yang <shuxin.llvm...
2012 Nov 22
2
[LLVMdev] Disable loop unroll pass
...ses need to be fully aware of this new operator, which doesn't make things any simpler. Thanks Shuxin On 11/22/2012 02:56 PM, Gang Yu wrote: > Hi shuxin, > > Promote while-loop to do-loop is the job of loop induction recognized, > not this transformation. The scalar transform for hwloop in optimizer > is for that it is a trouble to discriminate trip counting code with > the real production code stuff and do the elimination in cg, we have > to write customized code to handle this general stuff in every > targets. So, we take the help from optimizer DCE, make the t...
2016 Nov 21
2
Conditional jump or move depends on uninitialised value(s)
...ized values in LLVM r287520 built using itself. (One of quite a few such reports that comes up during a "make check".) I could use another set of eyes on the issue if someone has time. This command gives me an error: valgrind -q ./bin/llc < /home/regehr/llvm/test/CodeGen/Hexagon/hwloop-dbg.ll -march=hexagon -mcpu=hexagonv4 The error is at this line: https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/DeadMachineInstructionElim.cpp#L142 Here I've refactored the code into a minimal (noinline) function that still triggers the problem. xfunc2() and xfunc3() are also...
2012 Nov 23
0
[LLVMdev] Disable loop unroll pass
...nsigned TripCount) has been proposed so far. Ivan > > Thanks > Shuxin > > On 11/22/2012 02:56 PM, Gang Yu wrote: >> Hi shuxin, >> >> Promote while-loop to do-loop is the job of loop induction >> recognized, not this transformation. The scalar transform for hwloop >> in optimizer is for that it is a trouble to discriminate trip >> counting code with the real production code stuff and do the >> elimination in cg, we have to write customized code to handle this >> general stuff in every targets. So, we take the help from optimizer...
2012 Nov 23
1
[LLVMdev] Disable loop unroll pass
... > Ivan > >> >> Thanks >> Shuxin >> >> On 11/22/2012 02:56 PM, Gang Yu wrote: >>> Hi shuxin, >>> >>> Promote while-loop to do-loop is the job of loop induction >>> recognized, not this transformation. The scalar transform for hwloop >>> in optimizer is for that it is a trouble to discriminate trip >>> counting code with the real production code stuff and do the >>> elimination in cg, we have to write customized code to handle this >>> general stuff in every targets. So, we take the help...
2016 Nov 22
2
Conditional jump or move depends on uninitialised value(s)
...ew such reports >> that comes up during a "make check".) >> >> I could use another set of eyes on the issue if someone has time. >> >> This command gives me an error: >> >> valgrind -q ./bin/llc < >> /home/regehr/llvm/test/CodeGen/Hexagon/hwloop-dbg.ll -march=hexagon >> -mcpu=hexagonv4 >> >> The error is at this line: >> >> https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/DeadMachineInstructionElim.cpp#L142 >> >> >> >> Here I've refactored the code into a minimal (noinlin...
2020 May 04
3
LV: predication
> The harm comes if the intrinsic ends up with the wrong value, or attached to the wrong loop. The intrinsic is marked as IntrNoDuplicate, so I wasn't worried about it ending up somewhere else. Also, it is a property of a specific loop, a tail-folded vector loop, that holds even after it is transformed I think. I.e. unrolling a vector loop is probably not what you want, but even if you do