thr3ads.net - similar to: "[LLVMdev] missing blocks"

Displaying 20 results from an estimated 700 matches similar to: "[LLVMdev] missing blocks"

2019 Jan 15

Aggressive optimization opportunity

Hi, There are some compilers with an aggressive optimization that restricts the pointer parameters of functions. Let's call the option restrict_args. When restrict_args is turned on, the compiler treats all pointer parameters of functions as restrict-qualified. int foo(int * a) + restrict_args opt is equivalent to: int foo(int * restrict a) Here is a complete example: source code: extern int num; int foo(int * a) {

Aggressive optimization opportunity

2019 Jan 15

Restrict is supported by Clang for C++ via __restrict__, so it seems strange to block using this proposed option for C++. That said, this kind of option can be dangerous and should come with a suitable warning. We’ve had a similar option and in practice it’s been used to hunt for performance gains (i.e., turn it on and see what happens), but just because the code runs faster and produces the

[LLVMdev] Multi-Instruction Patterns

2008 Sep 23

On Sep 23, 2008, at 11:26 AM, David Greene wrote: > Are there any examples of using tablegen to generate multiple machine > instructions from a single pattern? Or do these cases always have > to be > manually expanded? PPC has a bunch of examples, for example: // Arbitrary immediate support. Implement in terms of LIS/ORI. def : Pat<(i32 imm:$imm), (ORI (LIS (HI16

altivec lpc_restore_signal

2004 Sep 10

I've had this a long time but haven't submitted it yet. I've tried to mirror the ia32 setup, so there should be a new subdirectory src/libFLAC/ppc . The first two attachments go there. The third is a context diff for src/libFLAC/Makefile.am . I have some more modified files, which I figured I'd submit after the above are checked in and working for somebody other than me. If you

How can I tell llvm, that a branch is preferred ?

2015 Oct 27

If I read the llvm language correctly, it doesn't have a way to specify the preferred branch, correct? I see nothing in the specs for "branch" or "switch". And __builtin_expect does nothing, that I am sure of. Unfortunately llvm has this knack for ordering my one most crucial part of code exactly the opposite of what I want, it does: (x86_64) cmpq %r15, (%rax,%rdx) jne

[LLVMdev] Need a clue to improve the optimization of some C code

2015 Mar 03

On 03.03.2015 at 19:49, Philip Reames <listmail at philipreames.com> wrote: Hi Philip, first, thanks for your response. > You'll need to provide a bit more information to get any useful response. Questions: > 1) What's your use case? Are you using clang to compile C code? Are you manually generating LLVM IR? Yes, the "inline function C code" will be compiled

[LLVMdev] [Q] x86 peephole deficiency

2010 Oct 07

Hi all, I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125) and now I am running into a deficiency of the x86 peephole optimizer (or jump-threader?). Here is what I get: andl $3, %edi je .LBB0_4 # BB#2: # %nz # in Loop: Header=BB0_1 Depth=1 cmpl $2, %edi

[LLVMdev] [Q] x86 peephole deficiency

2010 Oct 07

On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote: > Hi all, > > I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125) > and now I am running into a deficiency of the x86 > peephole optimizer (or jump-threader?). Here is what I get: > > > andl $3, %edi > je .LBB0_4 > # BB#2: # %nz >

[LLVMdev] Need a clue to improve the optimization of some C code

2015 Mar 03

Hi I have some inline function C code, that llvm could be optimizing better. Since I am new to this, I wonder if someone could give me a few pointers, how to approach this in LLVM. Should I try to change the IR code -somehow- to get the code generator to generate better code, or should I rather go to the code generator and try to add an optimization pass ? Thanks for any feedback. Ciao

Nowaday Scalar Evolution's Problem.

2017 Nov 20

The problem? Nowadays, SCEV ("Scalar Evolution") only analyzes instructions that have predictable, constant-based operands, that is, ones that can be evaluated as a constant. Otherwise the instruction cannot be represented as a proper SCEV node and is treated as SCEVUnknown. An important thing to remember is that we do not use SCEV only for Loop Deletion, which isn't really needed on natural loops

[atomics][AArch64] Possible bug in cmpxchg lowering

2017 May 30

Currently the AtomicExpandPass will lower the following IR: define i1 @foo(i32* %obj, i32 %old, i32 %new) { entry: %v0 = cmpxchg weak volatile i32* %obj, i32 %old, i32 %new release acquire %v1 = extractvalue { i32, i1 } %v0, 1 ret i1 %v1 } to the equivalent of the following on AArch64: ldxr w8, [x0] cmp w8, w1 b.ne .LBB0_3 // BB#1:

MC PowerPC 32 bit vs. 64 bit

2016 Nov 06

Hi, over the past days I have been working on a proof of concept involving LLVM MC on the PowerPC target. The 32 bit part went quite ok, but I am puzzled by the results I get using the 64 bit target. When disassembling in 64bit some instructions refer to GPRs in PPC::R0 to PPC::R31, some refer to PPC::X0 to PPC::X31. I understand that the registers are modeled with Rx referring to the 32bit parts and Xx

A use of RDF to extend register Remat

2016 Oct 18

Dear Community, I would like to discuss a few points on using RDF to extend the scope of register remat. Mr. Krzysztof and I started discussing this by private mail, but I think it would now be better to include the community. Interested community members, kindly read the previous discussion (at the end of this mail) before starting here. After analyzing if RDF can be used for solving Remat, we think that problem with

[LLVMdev] GVNPRE /PRE is not effective

2013 Dec 13

Hi All, The PRE or GVNPRE is not effective for the below use case. int sum; int phi =30; void f (int i, int *a) { if ((a[i] << (1)) > -15) sum =(phi+ 0x7fffffffL )/ a[i]; if ((a[i] << (2)) > -15) sum =(phi + 0x7fffffffL) /a[i]; } respective asm (clang on trunk ) #clang -O3 -S test.c BB#0: # %entry pushl %edi pushl %esi

[LLVMdev] Possible miscompilation?

2008 Jun 12

Gordon Henriksen wrote: > On 2008-06-11, at 13:16, Gary Benson wrote: > > Duncan Sands wrote: > > > Can you please attach IR which can be compiled to an executable > > > (and shows the problem). > > > > I've been generating functions using a builder and then compiling > > them with ExecutionEngine::getPointerToFunction(). Is there some > >

[LLVMdev] [cfe-dev] Odd PPC inline asm constraint

2012 May 12

On Tue, 01 May 2012 21:25:29 -0500 Peter Bergner <bergner at vnet.ibm.com> wrote: > On Tue, 2012-05-01 at 19:58 -0500, Peter Bergner wrote: > > On Tue, 2012-05-01 at 17:47 -0500, Hal Finkel wrote: > > > By default it should build for > > > whatever the current host is (no special flags required). To > > > specifically build for something else, use:

[LLVMdev] [Q] x86 peephole deficiency

2010 Oct 13

On 07.10.2010 at 19:50, Chris Lattner wrote: > > On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote: > >> Hi all, >> >> I am slowly working on a SwitchInst optimizer (http://llvm.org/ >> PR8125) >> and now I am running into a deficiency of the x86 >> peephole optimizer (or jump-threader?). Here is what I get: >> >> >> andl $3,

A code layout related side-effect introduced by rL318299

2017 Dec 19

Hi, Recently a 10% performance regression on an important benchmark showed up after we integrated https://reviews.llvm.org/rL318299. The analysis showed that rL318299 triggered loop rotation on a multi-exit loop, and the loop rotation introduced a code layout issue. The performance regression is a side-effect of rL318299. I have attached two testcases, a.ll and b.ll, to illustrate the problem. a.ll

[LLVMdev] Patching jump tables at run-time

2013 Aug 06

I am looking for guidance on how to: 1.

4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018 Dec 20

Folks, I got this warning today. I can't tell when and why this happened, so I do not know yet how to reproduce it. Maybe someone has a quick idea. [85109.572032] WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318 [85109.572036] Modules linked in: vhost_net vhost macvtap macvlan tap vfio_ap vfio_mdev mdev vfio_iommu_type1 vfio kvm xt_CHECKSUM ipt_MASQUERADE
