thr3ads.net - similar to: "Loop branching inefficiencies in Backend output"

Displaying 20 results from an estimated 3000 matches similar to: "Loop branching inefficiencies in Backend output"

X86 Backend SelectionDAG - Source Scheduling

2017 Jul 31

X86 Backend SelectionDAG - Source Scheduling

Thanks that clears things up. So if I want to mess around with how schedules are generated, looking at the MachineScheduler pass is the best place now? -Dilan On Mon, Jul 31, 2017 at 3:24 PM Matthias Braun <mbraun at apple.com> wrote: > > > On Jul 31, 2017, at 2:51 PM, Dilan Manatunga via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > Hi, > >

Issue with DAG legalization of brcond, setcc, xor

2017 Jul 21

Issue with DAG legalization of brcond, setcc, xor

But isn't kinda silly that we transform to xor and then we transform it back. What is the advantage in doing so? Also, since we do that method, I now have to introduce setcc patterns for i1 values, instead of being able to just use logical pattern operators like not. -Dilan On Fri, Jul 21, 2017 at 11:00 AM Dilan Manatunga <manatunga at gmail.com> wrote: > For some reason I

Lowering Select to Two Predicated Movs

2017 Jul 07

Lowering Select to Two Predicated Movs

Ohh, that makes sense. And is the reason the first instruction doesn't get deleted because the ExpandPseudoInstructions pass occurs after Register Allocation and machine dead code elimination? -Dilan On Fri, Jul 7, 2017 at 12:37 PM Friedman, Eli <efriedma at codeaurora.org> wrote: > On 7/7/2017 12:10 PM, Dilan Manatunga wrote: > > My bad for not looking further. I'm still

Lowering Select to Two Predicated Movs

2017 Jul 07

Lowering Select to Two Predicated Movs

My bad for not looking further. I'm still somewhat confused though. MOVCCr gets expanded in the ARMExpandPseudoInsts pass, and it still seems only a case of one instruction replacing the other. My worry of emitting two instructions, is that a dead code pass will eliminate the first instruction cause it thinks the second instruction is defining the same register. -Dilan On Fri, Jul 7, 2017

Signed Division and InstCombine

2016 May 31

Signed Division and InstCombine

I was looking through the InstCombine pass, and I was wondering why signed division is not considered a valid operation to combine in the canEvaluateTruncated function. This means, given the following code: %conv = sext i16 %0 to i32 %conv1 = sext i16 %1 to i32 %div = sdiv i32 %conv, %conv1 %conv2 = trunc i32 %div to i16 * Assume %0 and %1 are registers created from simple 16-bit loads. We

Lowering For Loops to use architecture "loop" instruction

2016 Jun 02

Lowering For Loops to use architecture "loop" instruction

Hi, I'm working on project which involves writing a backend for a hypothetical architecture. I am currently trying to figure out the best way to translate for loops to use a specialized "loop" instruction the architecture supports. The instruction is similar X86's loop instruction, where a register is automatically decremented and the condition is automatically checked to see if

Issue with DAG legalization of brcond, setcc, xor

2017 Jul 20

Issue with DAG legalization of brcond, setcc, xor

Hi, I am having some issues with how some of the instructions are being legalized. So this is my intial basic block. The area of concern is the last three instructions. I will pick and choose debug output to keep this small. SelectionDAG has 36 nodes: t0: ch = EntryToken t6: i32,ch = CopyFromReg t0, Register:i32 %vreg507 t2: i32,ch = CopyFromReg t0, Register:i32 %vreg17

Lowering Select to Two Predicated Movs

2017 Jul 07

Lowering Select to Two Predicated Movs

Hi, I was wondering what would be the best way to lower a select operation two predicated movs. I looked through the ARM, MIPS, and NVPTX backends and they all seem to lower a select to some sort of conditional move or native select operation. Ex. select t3, cond, t2, t1 Becomes cond mov t3, t2 !cond mov t3, t1 -Dilan -------------- next part -------------- An HTML attachment was scrubbed...

Signed Division and InstCombine

2016 May 31

Signed Division and InstCombine

On 31 May 2016 at 16:02, Dilan Manatunga <manatunga at gmail.com> wrote: > Just to verify, a 16-bit divion of INT16_MIN by -1 results in INT16_MIN > again? No, "sdiv i16 -32768, -1" is undefined behaviour. The version with an "sext" and "trunc" avoids the undefined behaviour and does return -32768. > If the issue only occurs in this case, why

X86 Backend SelectionDAG - Source Scheduling

2017 Jul 31

X86 Backend SelectionDAG - Source Scheduling

Hi, I was looking into how SelectionDAG scheduling is done in LLVM for different backends, and I noticed that for the X86 backend, even though it sets scheduling preferences of ILP or RegisterPressure depending on architecture, in the end, it ends up using source scheduling. I realized this is because it overrides enableMachineScheduler to return true. Is there any specific reasons why it was

Signed Division and InstCombine

2016 May 31

Signed Division and InstCombine

Just to verify, a 16-bit divion of INT16_MIN by -1 results in INT16_MIN again? If the issue only occurs in this case, why aren't there checks to see if we can simplify sdiv in cases where we know that numerator is not INT16_MIN or the denominator is not -1. For example, we could simplify divides involving one operand constants. Is it because this case is most likely rare? -Dilan On Tue,

Signed Division and InstCombine

2016 May 31

Signed Division and InstCombine

On 31 May 2016 at 15:42, Tim Northover <t.p.northover at gmail.com> wrote: > A 16-bit division of INT16_MIN by -1 is undefined behaviour but the > original ext/trunc version is well-defined as 0. Sorry, INT16_MIN again actually. The main point still stands though, I think. Tim.

[LLVMdev] : Predication on SIMD architectures and LLVM

2012 Oct 31

[LLVMdev] : Predication on SIMD architectures and LLVM

Hi all, I am working on a CGRA backend (something like a 2D VLIW), and we also absolutely need predication. I extended the IfConversion pass to allow it to be executed multiple times and to predicate already predicated code. This is necessary to predicate code with nested conditional statements. At this point, we support or, and, and conditional predicates (see Scott Mahlke's papers on this

Signed Division and InstCombine

2016 May 31

Signed Division and InstCombine

Hi Dilan, On 31 May 2016 at 15:34, Dilan Manatunga via llvm-dev <llvm-dev at lists.llvm.org> wrote: > What is the reason for the exclusion of sdiv from the operations considered > valid for execution in a truncated format. A 16-bit division of INT16_MIN by -1 is undefined behaviour but the original ext/trunc version is well-defined as 0. Cheers. Tim.

[LLVMdev] : Predication on SIMD architectures and LLVM

2012 Nov 01

[LLVMdev] : Predication on SIMD architectures and LLVM

On Wed, Oct 31, 2012 at 09:13:43PM +0100, Bjorn De Sutter wrote: > Hi all, > > I am working on a CGRA backend (something like a 2D VLIW), and we also absolutely need predication. I extended the IfConversion pass to allow it to be executed multiple times and to predicate already predicated code. This is necessary to predicate code with nested conditional statements. At this point, we

[CodeGen] CodeSize - TailMerging and BlockPlacement

2016 Mar 29

[CodeGen] CodeSize - TailMerging and BlockPlacement

Hi everyone, The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. I did an experiment of adding additional BranchFolding and BlockPlacement after the existing BlockPlacement (i.e., -block-placement -branch-folder -block-placement) targeting

[LLVMdev] Is LLVM expressive enough to represent asynchronous exceptions?

2011 Jun 14

[LLVMdev] Is LLVM expressive enough to represent asynchronous exceptions?

Hi Andrew, > No. Duncan suggested that he could hitch a ride with us through France. The problem is, we're not driving to Spain at all and there doesn't appear to be any place to transfer. > > The point is, you're not going to be able to leverage most of a CFG-based optimizing compiler if don't use the CFG to express control flow. when Chris first came up with his

[LLVMdev] HazardRecognizer and RegisterAllocation

2009 Jan 20

[LLVMdev] HazardRecognizer and RegisterAllocation

Dan: CellSPU could clearly benefit from the post-RA scheduler. In fact, we were thinking about writing a machine pass of our own. One thing that does "disturb" me is that both HazardRecognizer and post-RA sched assume there's only one kind of NOP. For Cell, there are two, depending upon the pipeline being filled. Pipe 0 takes "ENOP" whereas Pipe 1 take

[LLVMdev] Tail Duplication Questions

2012 Nov 01

[LLVMdev] Tail Duplication Questions

Eli Friedman <eli.friedman at gmail.com> writes: >> Ah. So is the MachineFunction version expected to work correctly? > > It's part of the default set of CodeGen passes. It is? Was that true in 3.1? I can't see where it is initialized in llc. I probably missed something important. :) Thanks! -David

What about multiple MachineMemOperands in one MI (BranchFolding/MachineInstr::mayAlias)?

2019 Sep 27

What about multiple MachineMemOperands in one MI (BranchFolding/MachineInstr::mayAlias)?

On 9/27/19 7:33 AM, Matt Arsenault via llvm-dev wrote: > > >> On Sep 27, 2019, at 09:07, Björn Pettersson A via llvm-dev >> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> Obviously we do not store into two locations (it is still a single >> two byte store). >> So is it (always) correct to interpret the list of

similar to: Loop branching inefficiencies in Backend output