Displaying 20 results from an estimated 283 matches for "peepholer".
Did you mean:
peephole
2010 Oct 07
2
[LLVMdev] [Q] x86 peephole deficiency
Hi all,
I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125)
and now I am running into a deficiency of the x86
peephole optimizer (or jump-threader?). Here is what I get:
andl $3, %edi
je .LBB0_4
# BB#2: # %nz
# in Loop: Header=BB0_1
Depth=1
cmpl $2, %edi
2010 Oct 13
2
[LLVMdev] [Q] x86 peephole deficiency
Am 07.10.2010 um 19:50 schrieb Chris Lattner:
>
> On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote:
>
>> Hi all,
>>
>> I am slowly working on a SwitchInst optimizer (http://llvm.org/
>> PR8125)
>> and now I am running into a deficiency of the x86
>> peephole optimizer (or jump-threader?). Here is what I get:
>>
>>
>> andl $3,
2019 Nov 22
2
[ARM] Peephole optimization ( instructions tst + add )
Ok, thank you, I will implement it then.
As far as I see this optimization should be done in AArch64LoadStoreOptimizer, is it right?
From: Eli Friedman [mailto:efriedma at quicinc.com]
Sent: Thursday, November 21, 2019 11:55 PM
To: Kosov Pavel <kosov.pavel at huawei.com>; LLVM Dev <llvm-dev at lists.llvm.org>
Subject: RE: [llvm-dev] [ARM] Peephole optimization ( instructions tst +
2010 Oct 07
0
[LLVMdev] [Q] x86 peephole deficiency
On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote:
> Hi all,
>
> I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125)
> and now I am running into a deficiency of the x86
> peephole optimizer (or jump-threader?). Here is what I get:
>
>
> andl $3, %edi
> je .LBB0_4
> # BB#2: # %nz
>
2015 Feb 11
2
[LLVMdev] deleting or replacing a MachineInst
I'm writing a peephole pass and I'm done with the X86_64 instruction level
detail work. But I'm having difficulty with the basic block surgery
of replacing the old MachineInst.
The peephole pass gets called per MachineFunction and then iterates over
each MachineBasicBlock and in turn over each MachineInst. When it finds an
instruction which should be replaced, it builds a new
2010 Oct 13
0
[LLVMdev] [Q] x86 peephole deficiency
On Oct 13, 2010, at 11:22 AM, Gabor Greif wrote:
> Hi Chris,
>
> I had a look into MachineCSE, but it looks like MBB-oriented.
> The above problem is an inter-block one. Also MCSE seems
> to perform value numbering on virtual/physical registers, which
> does not map very well to status register bits that are implicitly
> defined.
> Any chance to recast this issue as a
2004 Feb 20
1
[LLVMdev] Changes in MachineInstruction/Peephole Optimizer?
Hi all,
The register allocator that I implemented is failing in the LLVM cvs
version, but not in LLVM 1.1. The generated code fails a check in the
x86 peephole optimizer:
llc: PeepholeOptimizer.cpp:128: bool
<unnamed>::PH::PeepholeOptimize(llvm::Machi
neBasicBlock&, llvm::ilist_iterator<llvm::MachineInstr>&): Assertion
`MI->getNum
Operands() == 2 && "These
2019 Nov 21
2
[ARM] Peephole optimization ( instructions tst + add )
Hello!
I noticed that in some cases clang generates sequence of AND+TST instructions:
For example:
AND x3, x2, x1
TST x2, x1
I think these instructions should be merged to one:
ANDS x3, x2, x1
( because TST <Xn>, <Xm> is alias for ANDS XZR, <Xn>, <Xm> -
2019 Aug 23
2
Using [GlobalISel] to provide peephole optimizations
Hi,
GlobalISel is fantastic, but obviously lacks a lot of the transforms that
makes SelectionDAG so good. Whilst it's plenty usable, you'll find yourself
wanting/needing to add a lot of manual little transforms to clean things up.
I know of the RFC for a new Combiner with its own syntax
(https://reviews.llvm.org/D54286 is the latest I can find of it), but after
manually adding my Nth
2015 Mar 25
0
[PATCH] nv50/ir: take postFactor into account when doing peephole optimizations
Multiply operations can have a post-factor on them, which other ops
don't support. Only perform the peephole optimizations when there is no
post-factor involved.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89758
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 12 ++++++++----
1 file changed, 8 insertions(+),
2017 May 05
2
Machine instruction verifier pass
Hello Devs,
Machine Instruction verifier pass always validates Live variable info
associated with MachineInstr along with other checks.
Please consider following scenario (w.r.t bugZilla 32583)
1/ MachineCSE pass may prohibit optimising out a common sub-expression for
instruction using physical registers
by looking at the LiveIn info of successor basic blocks.
2/ Which means we need Live
2016 Aug 29
2
Publication
Hi,
Can you add the following two publications from our group to the LLVM
publications page.
-
*Alive-FP: Automated Verification of Floating Point Based Peephole
Optimizations in LLVM [pdf]
<http://www.cs.rutgers.edu/~santosh.nagarakatte/papers/alive-fp-sas16.pdf>
*David
Menendez, Santosh Nagarakatte, and Aarti Gupta
*To Appear in the Proceedings of the 23rd Static Analysis
2015 Jan 19
6
[LLVMdev] X86TargetLowering::LowerToBT
I'm tracking down an X86 code generation malfeasance regarding BT (bit
test) and I have some questions.
This IR *matches* and then *X86TargetLowering::LowerToBT **is called:*
%and = and i64 %shl, %val * ; (val & (1 << index)) != 0 ; *bit test
with a *register* index
This IR *does not match* and so *X86TargetLowering::LowerToBT **is not
called:*
%and = lshr i64 %val, 25
2018 May 18
1
Constants propagation into printf format string
On 5/17/2018 4:04 PM, Davide Italiano via llvm-dev wrote:
> On Thu, May 17, 2018 at 3:53 PM, Dávid Bolvanský via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Hello,
>>
>> I was thinking about a new possible simplification in InstCombine which
>> would transform code like this:
>>
>> const char * NAME = "Prog";
>>
>> void
2019 May 09
4
Making llvm-xyz -help useful
...- Emit Apple-style NEON assembly
-amdgpu-dump-hsa-metadata - Dump AMDGPU HSA Metadata
-amdgpu-enable-merge-m0 - Merge and hoist M0 initializations
-amdgpu-sdwa-peephole - Enable SDWA peepholer
[...]
Surely, the style of NEON code to emit from AArch64 backend is not the information I was looking for...
I've implemented a straight-forward patch for llvm-cat here https://reviews.llvm.org/D61740, and the result becomes:
OVERVIEW: Module concatenation
USAGE: llvm-cat [optio...
2011 May 24
1
[LLVMdev] LLVM evaluation
Hi,
My organization is doing a technology comparison between GCC
optimization/backend and LLVM infrastructures to select the best
environment for our next product. Would you please help me to find
answers on the following features’ list support availability inside
current (or near future) LLVM implementation:
1. Intrinsic function support
2. Semi-standard embedded C types (complex, fractional)
2018 Jul 25
2
Question about target instruction optimization
This is a question about optimizing the code generation in a (new) Z80
backend:
The CPU has a couple of 8 bit physical registers, e.g. H, L, D and E,
which are overlaid in 16 bit register pairs named HL and DE.
It has also a native instruction to load a 16 bit immediate value into a
16 bit register pair (HL or DE), e.g.:
LD HL,<imm16>
Now when having a sequence of loading two 16
2015 Jul 17
3
[LLVMdev] 2-address and 3-address instructions
I am writing a backend for an experimental machine that has both 2-address and
3-address versions of some instructions. The 2-address versions are more
compact and thus preferred when applicable. How does one go about generating
the most compact version?
1. At instruction selection, is there a predicate that can test whether one of
the input sources is dead, thus allowing the selection of the
2016 Mar 10
2
[CodeGen] PeepholeOptimizer: optimizing condition dependent instrunctions
Hi Quentin,
Yes, the code allows to process connected instructions. Although it should be taken into account that the instruction next to the current processed instruction must never be erased because this invalidates iterator.
I've been fixing a bug in AArch64InstrInfo::optimizeCompareInstr: instructions are converted into S form but it's not checked that they produce the same flags as
2011 Feb 18
2
[LLVMdev] Adding "S" suffixed ARM/Thumb2 instructions
Hello everyone,
I've added the "S" suffixed versions of ARM and Thumb2 instructions to
tablegen. Those are, for example, "movs" or "muls".
Of course, some instructions have already had their twins, such as add/adds,
and I leaved them untouched.
Besides, I propose the codegen optimization based on them, which removes the
redundant comparison in patterns like
orr