thr3ads.net - search: "peepholer"

Displaying 20 results from an estimated 283 matches for "peepholer".

2010 Oct 07

[LLVMdev] [Q] x86 peephole deficiency

Hi all, I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125) and now I am running into a deficiency of the x86 peephole optimizer (or jump-threader?). Here is what I get: andl $3, %edi je .LBB0_4 # BB#2: # %nz # in Loop: Header=BB0_1 Depth=1 cmpl $2, %edi

2010 Oct 13

[LLVMdev] [Q] x86 peephole deficiency

On 07.10.2010 at 19:50, Chris Lattner wrote: > > On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote: > >> Hi all, >> >> I am slowly working on a SwitchInst optimizer (http://llvm.org/ >> PR8125) >> and now I am running into a deficiency of the x86 >> peephole optimizer (or jump-threader?). Here is what I get: >> >> >> andl $3,

2019 Nov 22

[ARM] Peephole optimization ( instructions tst + add )

Ok, thank you, I will implement it then. As far as I see this optimization should be done in AArch64LoadStoreOptimizer, is it right? From: Eli Friedman [mailto:efriedma at quicinc.com] Sent: Thursday, November 21, 2019 11:55 PM To: Kosov Pavel <kosov.pavel at huawei.com>; LLVM Dev <llvm-dev at lists.llvm.org> Subject: RE: [llvm-dev] [ARM] Peephole optimization ( instructions tst +

2010 Oct 07

[LLVMdev] [Q] x86 peephole deficiency

On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote: > Hi all, > > I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125) > and now I am running into a deficiency of the x86 > peephole optimizer (or jump-threader?). Here is what I get: > > > andl $3, %edi > je .LBB0_4 > # BB#2: # %nz >

2015 Feb 11

[LLVMdev] deleting or replacing a MachineInst

I'm writing a peephole pass and I'm done with the X86_64 instruction level detail work. But I'm having difficulty with the basic block surgery of replacing the old MachineInst. The peephole pass gets called per MachineFunction and then iterates over each MachineBasicBlock and in turn over each MachineInst. When it finds an instruction which should be replaced, it builds a new
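The traversal described in this snippet (function, then basic blocks, then instructions, replacing matches in place) can be sketched with a toy model. This is only an illustration under stated assumptions: instructions are plain strings, a function is a list of blocks, and `run_peephole` and `mov_zero_to_xor` are hypothetical names, not LLVM APIs. The example rule also ignores that `xor` clobbers the flags, which a real pass would have to prove dead first.

```python
from typing import Callable, List, Optional

# Toy stand-ins (assumptions, not LLVM types): a block is a list of
# instruction strings, a function is a list of blocks.
Block = List[str]
Function = List[Block]


def run_peephole(function: Function,
                 rewrite: Callable[[str], Optional[str]]) -> int:
    """Visit every instruction of every block; replace those `rewrite` matches.

    Returns the number of instructions replaced.
    """
    replaced = 0
    for block in function:
        for index, inst in enumerate(block):
            new_inst = rewrite(inst)
            if new_inst is not None:
                # One-for-one replacement by index keeps the block length
                # unchanged, so the ongoing iteration stays valid.
                block[index] = new_inst
                replaced += 1
    return replaced


def mov_zero_to_xor(inst: str) -> Optional[str]:
    """Hypothetical rule: turn 'mov $0, %reg' into 'xor %reg, %reg'."""
    prefix = "mov $0, "
    if inst.startswith(prefix):
        reg = inst[len(prefix):]
        return f"xor {reg}, {reg}"
    return None


function = [["mov $0, %eax", "ret"], ["mov $0, %ecx"]]
count = run_peephole(function, mov_zero_to_xor)
```

Replacing by index rather than deleting and re-inserting avoids disturbing the position of the traversal, which is the part of the "basic block surgery" that tends to go wrong.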

2010 Oct 13

[LLVMdev] [Q] x86 peephole deficiency

On Oct 13, 2010, at 11:22 AM, Gabor Greif wrote: > Hi Chris, > > I had a look into MachineCSE, but it looks like MBB-oriented. > The above problem is an inter-block one. Also MCSE seems > to perform value numbering on virtual/physical registers, which > does not map very well to status register bits that are implicitly > defined. > Any chance to recast this issue as a

2004 Feb 20

[LLVMdev] Changes in MachineInstruction/Peephole Optimizer?

Hi all, The register allocator that I implemented is failing in the LLVM cvs version, but not in LLVM 1.1. The generated code fails a check in the x86 peephole optimizer: llc: PeepholeOptimizer.cpp:128: bool <unnamed>::PH::PeepholeOptimize(llvm::MachineBasicBlock&, llvm::ilist_iterator<llvm::MachineInstr>&): Assertion `MI->getNumOperands() == 2 && "These

2019 Nov 21

[ARM] Peephole optimization ( instructions tst + add )

Hello! I noticed that in some cases clang generates sequence of AND+TST instructions: For example: AND x3, x2, x1 TST x2, x1 I think these instructions should be merged to one: ANDS x3, x2, x1 ( because TST <Xn>, <Xm> is alias for ANDS XZR, <Xn>, <Xm> -
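The merge proposed in this snippet can be sketched as a tiny peephole over a toy instruction list. This is a minimal sketch, not the AArch64 backend: `Inst` and `merge_and_tst` are hypothetical names, only directly adjacent pairs are considered, and the extra check that the AND destination is not one of its sources is an assumption added here so that the TST would have read the same values.

```python
from typing import List, NamedTuple, Tuple


class Inst(NamedTuple):
    """Toy instruction (an assumption for illustration): opcode plus register names."""
    op: str
    operands: Tuple[str, ...]


def merge_and_tst(insts: List[Inst]) -> List[Inst]:
    """Fold 'AND d, a, b' directly followed by 'TST a, b' into 'ANDS d, a, b'."""
    out: List[Inst] = []
    i = 0
    while i < len(insts):
        cur = insts[i]
        nxt = insts[i + 1] if i + 1 < len(insts) else None
        if (nxt is not None
                and cur.op == "AND" and len(cur.operands) == 3
                and nxt.op == "TST" and len(nxt.operands) == 2):
            dst, a, b = cur.operands
            # AND is commutative, so compare the source registers as a set-like pair.
            same_sources = sorted((a, b)) == sorted(nxt.operands)
            # If dst overwrote a source, TST would test a different value.
            if same_sources and dst not in (a, b):
                out.append(Inst("ANDS", cur.operands))
                i += 2  # consume both the AND and the TST
                continue
        out.append(cur)
        i += 1
    return out


merged = merge_and_tst([Inst("AND", ("x3", "x2", "x1")),
                        Inst("TST", ("x2", "x1"))])
unchanged = merge_and_tst([Inst("AND", ("x2", "x2", "x1")),
                           Inst("TST", ("x2", "x1"))])
```

Here `merged` holds a single `ANDS x3, x2, x1`, while `unchanged` keeps both instructions because the AND overwrites `x2` before the TST reads it.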

2019 Aug 23

Using [GlobalISel] to provide peephole optimizations

Hi, GlobalISel is fantastic, but obviously lacks a lot of the transforms that makes SelectionDAG so good. Whilst it's plenty usable, you'll find yourself wanting/needing to add a lot of manual little transforms to clean things up. I know of the RFC for a new Combiner with its own syntax (https://reviews.llvm.org/D54286 is the latest I can find of it), but after manually adding my Nth

2015 Mar 25

[PATCH] nv50/ir: take postFactor into account when doing peephole optimizations

Multiply operations can have a post-factor on them, which other ops don't support. Only perform the peephole optimizations when there is no post-factor involved. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89758 Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 12 ++++++++---- 1 file changed, 8 insertions(+),

2017 May 05

Machine instruction verifier pass

Hello Devs, Machine Instruction verifier pass always validates Live variable info associated with MachineInstr along with other checks. Please consider following scenario (w.r.t bugZilla 32583) 1/ MachineCSE pass may prohibit optimising out a common sub-expression for instruction using physical registers by looking at the LiveIn info of successor basic blocks. 2/ Which means we need Live

2016 Aug 29

Publication

Hi, Can you add the following two publications from our group to the LLVM publications page. - *Alive-FP: Automated Verification of Floating Point Based Peephole Optimizations in LLVM [pdf] <http://www.cs.rutgers.edu/~santosh.nagarakatte/papers/alive-fp-sas16.pdf> *David Menendez, Santosh Nagarakatte, and Aarti Gupta *To Appear in the Proceedings of the 23rd Static Analysis

2015 Jan 19

[LLVMdev] X86TargetLowering::LowerToBT

I'm tracking down an X86 code generation malfeasance regarding BT (bit test) and I have some questions. This IR *matches* and then *X86TargetLowering::LowerToBT **is called:* %and = and i64 %shl, %val * ; (val & (1 << index)) != 0 ; *bit test with a *register* index This IR *does not match* and so *X86TargetLowering::LowerToBT **is not called:* %and = lshr i64 %val, 25

2018 May 18

Constants propagation into printf format string

On 5/17/2018 4:04 PM, Davide Italiano via llvm-dev wrote: > On Thu, May 17, 2018 at 3:53 PM, Dávid Bolvanský via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> Hello, >> >> I was thinking about a new possible simplification in InstCombine which >> would transform code like this: >> >> const char * NAME = "Prog"; >> >> void

2019 May 09

Making llvm-xyz -help useful

...- Emit Apple-style NEON assembly -amdgpu-dump-hsa-metadata - Dump AMDGPU HSA Metadata -amdgpu-enable-merge-m0 - Merge and hoist M0 initializations -amdgpu-sdwa-peephole - Enable SDWA peepholer [...] Surely, the style of NEON code to emit from AArch64 backend is not the information I was looking for... I've implemented a straight-forward patch for llvm-cat here https://reviews.llvm.org/D61740, and the result becomes: OVERVIEW: Module concatenation USAGE: llvm-cat [optio...

2011 May 24

[LLVMdev] LLVM evaluation

Hi, My organization is doing a technology comparison between GCC optimization/backend and LLVM infrastructures to select the best environment for our next product. Would you please help me to find answers on the following features’ list support availability inside current (or near future) LLVM implementation: 1. Intrinsic function support 2. Semi-standard embedded C types (complex, fractional)

2018 Jul 25

Question about target instruction optimization

This is a question about optimizing the code generation in a (new) Z80 backend: The CPU has a couple of 8 bit physical registers, e.g. H, L, D and E, which are overlaid in 16 bit register pairs named HL and DE. It has also a native instruction to load a 16 bit immediate value into a 16 bit register pair (HL or DE), e.g.: LD HL,<imm16> Now when having a sequence of loading two 16

2015 Jul 17

[LLVMdev] 2-address and 3-address instructions

I am writing a backend for an experimental machine that has both 2-address and 3-address versions of some instructions. The 2-address versions are more compact and thus preferred when applicable. How does one go about generating the most compact version? 1. At instruction selection, is there a predicate that can test whether one of the input sources is dead, thus allowing the selection of the

2016 Mar 10

[CodeGen] PeepholeOptimizer: optimizing condition dependent instrunctions

Hi Quentin, Yes, the code allows processing connected instructions. However, the instruction immediately after the one currently being processed must never be erased, because erasing it invalidates the iterator. I've been fixing a bug in AArch64InstrInfo::optimizeCompareInstr: instructions are converted into S form but it's not checked that they produce the same flags as

2011 Feb 18

[LLVMdev] Adding "S" suffixed ARM/Thumb2 instructions

Hello everyone, I've added the "S" suffixed versions of ARM and Thumb2 instructions to tablegen. Those are, for example, "movs" or "muls". Of course, some instructions already had their twins, such as add/adds, and I left them untouched. Besides, I propose the codegen optimization based on them, which removes the redundant comparison in patterns like orr