similar to: LoopStrengthReduce.cpp

2016 Mar 29
0
LoopStrengthReduce.cpp
Hi Jonas, Are you talking specifically about the induction variable? You might look at what I did for PowerPC's counter-based loops (lib/Target/PowerPC/PPCCTRLoops.cpp, etc.). -Hal ----- Original Message ----- > From: "Jonas Paulsson via llvm-dev" <llvm-dev at lists.llvm.org> > To: "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Monday, March 28,
2016 Mar 29
2
LoopStrengthReduce.cpp
Hi Hal, yes, it's all about the induction variable. SystemZ has a late pass (pre-emit) that looks for MI sequences that can be rewritten to 'branch on count'. Currently only about half as many BRCTs are emitted as with gcc on the same benchmarks. One reason for this is that when a loop gets unrolled, its increment / decrement becomes greater than 1, which makes the
2016 Mar 29
0
LoopStrengthReduce.cpp
On 3/29/2016 3:05 AM, Jonas Paulsson via llvm-dev wrote: > Could this be done somehow, or is it really so that all targets have to > have their own passes to do this? In the Hexagon backend we also have a separate pass that converts compare+branch loops into hardware loops. We recognize several different patterns of the controlling induction variable, including cases where the increment
2016 Mar 31
1
LoopStrengthReduce.cpp
> On that note, I think that in general it would be useful to have some > target-independent (CodeGen) pass that would do the majority of the > work for hardware loop generation. I have thought about it, but I > won't be able to do anything in the short term. > > -Krzysztof > I think a first and useful step would be to let targets optionally have the loop induction
2015 Aug 17
2
RFC for a design change in LoopStrengthReduce / ScalarEvolution
This is related to an issue in loop strength reduction [1] that I've been trying to fix on and off for a while. [1] has a more detailed description of the issue and an example, but briefly put, I want LSR to consider formulae that have "Zext T" as base and/or scale registers, and to appropriately rate such formulae. My first attempt[2] at fixing this was buggy and had to be
2015 Aug 17
4
RFC for a design change in LoopStrengthReduce / ScalarEvolution
> I don't understand why you want to factor out the information, > exactly. It seems like what you need is a function like: > > unsigned getMinLeadingZeros(const SCEV *); > > then, if you want to get the non-extended expression, you can just > apply an appropriate truncation. I assume, however, that I'm missing > something. The problem is not about how to codegen
2015 Aug 17
2
RFC for a design change in LoopStrengthReduce / ScalarEvolution
> To back up for a second, how much of this is self-inflicted damage? > IndVarSimplify likes to preemptively widen induction variables. Is > that why you have the extensions here in the first place? In the specific example I was talking about the zext came from our frontend (our FE used to insert these extensions for reasons that are no longer relevant). But you can easily get the same
2015 Aug 18
2
RFC for a design change in LoopStrengthReduce / ScalarEvolution
> Of course, and the point is that, for example, on x86_64, the zext here is free. I'm still trying to understand the problem... > > In the example you provided in your previous e-mail, we choose the solution: > > `GEP @Global, zext(V)` -> `GEP (@Global + zext VStart), {i64 0,+,1}` > `V` -> `trunc({i64 0,+,1}) + VStart` > > instead of the actually-better
2018 Sep 13
4
Loop Distribution pass
Hi, with the help of the optimization remarks I found a loop that could not be vectorized, but that might be vectorized if loop distribution was enabled, which it in fact was, with a very significant benchmark improvement (~25%). I tried (on SystemZ) to enable this pass, and found that it only affected a handful of files on SPEC. This means I could enable this without worrying about any regressions on
2016 Oct 06
2
LoopVectorizer -- generating bad and unhandled shufflevector sequence
Hi, I have experimented with enabling the LoopVectorizer for SystemZ. I have come across a loop which, when vectorized, seems to have been poorly generated. In short, there seems to be a completely unnecessary sequence of shufflevector instructions that doesn't get optimized away anywhere. In other words, there is a shuffling that leads back to the original vector: [0 1 2 3
2019 Aug 09
4
How to best deal with undesirable Induction Variable Simplification?
Hi Hal, I see. So LSR could theoretically counteract undesirable Ind Var transformations but it's not implemented at the moment? I think I've managed to come up with a small reproducer that can also exhibit a similar problem on x86, here it is: https://godbolt.org/z/_wxzut As you can see, when rewriteLoopExitValues is not disabled Clang generates worse code due to additional spills,
2016 Apr 27
2
phys reg liveness during foldMemoryOperandImpl()
I would expect that it shouldn't be too hard to pass around a reference to LiveIntervalAnalysis*. Patches welcome :) - Matthias > On Apr 27, 2016, at 11:38 AM, Jonas Paulsson via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > ping. > > Either this can be implemented easily, or the current SystemZ optimization LAY -> AGSI in foldMemoryOperandImpl() should be
2016 Apr 15
2
phys reg liveness during foldMemoryOperandImpl()
Hi, I wonder if it would be possible to extend foldMemoryOperandImpl() so that targets can check for liveness of a particular phys reg? The case I am thinking of is when the new instruction clobbers the CC reg, while the old one did not. In this case the new instruction can only become a replacement if the CC reg is known to be dead. The idea is that liveness of phys regs should be available
2017 Aug 17
3
callee saved regs list
Hi, It was recently discovered that the SystemZ backend needs to add super-regs to the callee saved regs list, like: def CSR_SystemZ : CalleeSavedRegs<(add (sequence "R%dD", 6, 15), - (sequence "F%dD", 8, 15))>; + [R6Q, R8Q, R10Q, R12Q, R14Q], +
2019 Jun 19
2
live-in lists during register allocation
Hi, I wonder if live-in lists can be trusted to be accurate during register allocation / foldMemoryOperandImpl(). On SystemZ, a register-register compare that has one of its registers spilled can fold that reload into a register-memory compare instruction. In order to do this also with the first (LHS) register, the operands must be swapped. This can only reasonably be done when all the CC
2020 Aug 07
2
Branches which return values in SelectionDAG
Hi all, I am working on modeling an instruction similar to SystemZ's 'BRCT', which takes a register, decrements it, and branches if the register is nonzero. I saw that the LLVM backend for SystemZ generates the instruction in a MachineFunctionPass as part of a pass intended to eliminate or combine compares. I then looked at ARM, where it uses the HardwareLoops pass first, and then a
2018 Apr 03
4
SCEV and LoopStrengthReduction Formulae
I am attempting to implement a minor loop strength reduction optimization for targets that support compare and jump fusion, specifically TTI::canMacroFuseCmp(). My approach might be wrong, so I am soliciting feedback on the idea in order to implement this correctly. My plan is to add a Supplemental LSR formula to LoopStrengthReduce.cpp that optimizes the following case, but perhaps
2017 Nov 23
3
mischeduler (pre-RA) experiments
Hi, I have been experimenting for a while with the tryCandidate() method of the pre-RA mischeduler. I have by chance found some parameters that give quite good results on benchmarks on SystemZ (on average 1% improvement, some improvements of several percent and very few regressions). Basically, I add a "latency heuristic boost" just above processor resources checking:
2019 Aug 13
2
How to best deal with undesirable Induction Variable Simplification?
I've noticed that there was an attempt to mitigate the ExitValues problem in https://reviews.llvm.org/D12494 that went nowhere. Were there particular issues with that approach? -- Danila From: Philip Reames [mailto:listmail at philipreames.com] Sent: Saturday, August 10, 2019 02:05 To: Danila Malyutin <Danila.Malyutin at synopsys.com>; Finkel, Hal J. <hfinkel at anl.gov> Cc: llvm-dev
2018 May 15
1
[MachineScheduler] Question about IssueWidth / NumMicroOps
Hi Andy, >> Right now it seems that BeginGroup/EndGroup is only used by SystemZ, >> or? I see they are used in checkHazard(), which I actually don't see >> as helpful during pre-RA scheduling for SystemZ. Could this be made >> optional, or perhaps only done post-RA if target does post-RA >> scheduling? SystemZ does post-RA scheduling to manage decoder