similar to: [LLVMdev] missing blocks

Displaying 20 results from an estimated 700 matches similar to: "[LLVMdev] missing blocks"

2019 Jan 15
4
Aggressive optimization opportunity
Hi, There are some compilers with a aggressive optimization which restricts function pointer parameters. Let's say opt restrict_args. When restrict_args is turned on, compiler will treat all function pointer parameters as restrict one. int foo(int * a) + restrict_args opt equals to: int foo(int * restrict a) Here is a complete example: source code: extern int num; int foo(int * a) {
2019 Jan 15
3
Aggressive optimization opportunity
Restrict is supported by Clang for C++ via __restrict__, so it seems strange to block using this proposed option for C++. That said, this kind of option can be dangerous and should come with a suitable warning. We’ve had a similar option and in practice it’s been used to hunt for performance gains (i.e., turn it on and see what happens), but just because the code runs faster and produces the
2008 Sep 23
0
[LLVMdev] Multi-Instruction Patterns
On Sep 23, 2008, at 11:26 AM, David Greene wrote: > Are there any examples of using tablegen to generate multiple machine > instructions from a single pattern? Or do these cases always have > to be > manually expanded? PPC has a bunch of examples, for example: // Arbitrary immediate support. Implement in terms of LIS/ORI. def : Pat<(i32 imm:$imm), (ORI (LIS (HI16
2004 Sep 10
1
altivec lpc_restore_signal
I've had this a long time but haven't submitted it yet. I've tried to mirror the ia32 setup, so there should be a new subdirectory src/libFLAC/ppc . The first two attachments go there. The third is a context diff for src/libFLAC/Makefile.am . I have some more modified files, which I figured I'd submit after the above are checked in and working for somebody other than me. If you
2015 Oct 27
4
How can I tell llvm, that a branch is preferred ?
If I read the llvm language correctly, it doesn't have a way to specify the preferred branch, correct ? I see nothing in the specs for "branch" or "switch". And __buildin_expect does nothing, that I am sure of. Unfortunately llvm has this knack for ordering my one most crucial part of code exactly the opposite I want to, it does: (x86_64) cmpq %r15, (%rax,%rdx) jne
2015 Mar 03
2
[LLVMdev] Need a clue to improve the optimization of some C code
Am 03.03.2015 um 19:49 schrieb Philip Reames <listmail at philipreames.com>: Hi Philip first thanks for your response, > You'll need to prove a bit more information to get any useful response. Questions: > 1) What's you're use case? Are you using clang to compile C code? Are you manually generating LLVM IR? yes the "inline function C code" will be compiled
2010 Oct 07
2
[LLVMdev] [Q] x86 peephole deficiency
Hi all, I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125) and now I am running into a deficiency of the x86 peephole optimizer (or jump-threader?). Here is what I get: andl $3, %edi je .LBB0_4 # BB#2: # %nz # in Loop: Header=BB0_1 Depth=1 cmpl $2, %edi
2010 Oct 07
0
[LLVMdev] [Q] x86 peephole deficiency
On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote: > Hi all, > > I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125) > and now I am running into a deficiency of the x86 > peephole optimizer (or jump-threader?). Here is what I get: > > > andl $3, %edi > je .LBB0_4 > # BB#2: # %nz >
2015 Mar 03
2
[LLVMdev] Need a clue to improve the optimization of some C code
Hi I have some inline function C code, that llvm could be optimizing better. Since I am new to this, I wonder if someone could give me a few pointers, how to approach this in LLVM. Should I try to change the IR code -somehow- to get the code generator to generate better code, or should I rather go to the code generator and try to add an optimization pass ? Thanks for any feedback. Ciao
2017 Nov 20
2
Nowaday Scalar Evolution's Problem.
The Problem? Nowaday, SCEV called "Scalar Evolution" does only evolate instructions that has predictable operand, Constant-Based operand. such as that can evolute as a constant. otherwise we couldn't evolate it as SCEV node, evolated as SCEVUnknown. important thing that we remember is, we do not use SCEV only for Loop Deletion, which that doesn't really needed on nature loops
2017 May 30
3
[atomics][AArch64] Possible bug in cmpxchg lowering
Currently the AtomicExpandPass will lower the following IR: define i1 @foo(i32* %obj, i32 %old, i32 %new) { entry: %v0 = cmpxchg weak volatile i32* %obj, i32 %old, i32 %new _*release acquire*_ %v1 = extractvalue { i32, i1 } %v0, 1 ret i1 %v1 } to the equivalent of the following on AArch64: _*ldxr w8, [x0]*_ cmp w8, w1 b.ne .LBB0_3 // BB#1:
2016 Nov 06
3
MC PowerPC 32 bit vs. 64 bit
Hi, over the past days I have been proofing a concept involving LLVM MC on the PowerPC target. The 32 bit part went quite ok, but i am puzzled with the results I get using the 64 bit target. When disassembling in 64bit some instructions refer to GPRs in PPC::R0 to PPC::R31, some refer to PPC::X0 to PPC::X31. I understand that the registers are modeled with Rx referring to the 32bit parts and Xx
2016 Oct 18
2
A use of RDF to extend register Remat
Dear Community, I would like to discuss few points to use RDF to extend register remat scope. Mr. Krzysztof and I have started discussion this on private mail. But I think now it would be better to include community. Interested community member kindly previous discussion (at the end of mail) before starting here. After analyzing if RDF can be used for solving Remat, we think that problem with
2013 Dec 13
0
[LLVMdev] GVNPRE /PRE is not effective
Hi All, The PRE or GVNPRE is not effective for the below use case. int sum; int phi =30; void f (int i, int *a) { if ((a[i] << (1)) > -15) sum =(phi+ 0x7fffffffL )/ a[i]; if ((a[i] << (2)) > -15) sum =(phi + 0x7fffffffL) /a[i]; } respective asm (clang on trunk ) #clang -O3 -S test.c BB#0: # %entry pushl %edi pushl %esi
2008 Jun 12
4
[LLVMdev] Possible miscompilation?
Gordon Henriksen wrote: > On 2008-06-11, at 13:16, Gary Benson wrote: > > Duncan Sands wrote: > > > Can you please attach IR which can be compiled to an executable > > > (and shows the problem). > > > > I've been generating functions using a builder and then compiling > > them with ExecutionEngine::getPointerToFunction(). Is there some > >
2012 May 12
0
[LLVMdev] [cfe-dev] Odd PPC inline asm constraint
On Tue, 01 May 2012 21:25:29 -0500 Peter Bergner <bergner at vnet.ibm.com> wrote: > On Tue, 2012-05-01 at 19:58 -0500, Peter Bergner wrote: > > On Tue, 2012-05-01 at 17:47 -0500, Hal Finkel wrote: > > > By default it should build for > > > whatever the current host is (no special flags required). To > > > specifically build for something else, use:
2010 Oct 13
2
[LLVMdev] [Q] x86 peephole deficiency
Am 07.10.2010 um 19:50 schrieb Chris Lattner: > > On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote: > >> Hi all, >> >> I am slowly working on a SwitchInst optimizer (http://llvm.org/ >> PR8125) >> and now I am running into a deficiency of the x86 >> peephole optimizer (or jump-threader?). Here is what I get: >> >> >> andl $3,
2017 Dec 19
4
A code layout related side-effect introduced by rL318299
Hi, Recently 10% performance regression on an important benchmark showed up after we integrated https://reviews.llvm.org/rL318299. The analysis showed that rL318299 triggered loop rotation on an multi exits loop, and the loop rotation introduced code layout issue. The performance regression is a side-effect of rL318299. I got two testcases a.ll and b.ll attached to illustrate the problem. a.ll
2013 Aug 06
1
[LLVMdev] Patching jump tables at run-time
I am looking for guidance on how to: 1.
2018 Dec 20
0
4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect
Folks, I got this warning today. I cant tell when and why this happened, so I do not know yet how to reproduce. Maybe someone has a quick idea. [85109.572032] WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318 [85109.572036] Modules linked in: vhost_net vhost macvtap macvlan tap vfio_ap vfio_mdev mdev vfio_iommu_type1 vfio kvm xt_CHECKSUM ipt_MASQUERADE