similar to: [RFC] __builtin_constant_p() Improvements

Displaying 20 results from an estimated 5000 matches similar to: "[RFC] __builtin_constant_p() Improvements"

2018 Apr 13
0
[RFC] __builtin_constant_p() Improvements
I was actually working on an updated patch for the LLVM side of this, also. :) I was just working on some test cases; I'll post it soon. It's somewhat different from yours. I haven't touched the clang side yet, but I think it needs to be more complex than what you have there. I think it actually needs to be able to evaluate the intrinsic as a constant _false_ in the front-end in some
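
For context, a minimal sketch of how __builtin_constant_p() is typically used (an illustration under general GCC/clang semantics, not code from either patch): the intrinsic folds to 1 only when the compiler can prove its argument is a compile-time constant, and to 0 otherwise, which is why front-end versus back-end folding matters.

  #include <stdio.h>

  /* portable fallback: position of the highest set bit */
  static int log2_u32(unsigned v)
  {
      int r = -1;
      while (v) { v >>= 1; r++; }
      return r;
  }

  /* pick the compile-time path only when n is provably constant */
  #define LOG2(n) (__builtin_constant_p(n) ? (31 - __builtin_clz(n)) : log2_u32(n))

  int main(void)
  {
      printf("%d\n", LOG2(4096));  /* constant argument: folds to 12 at compile time */
      return 0;
  }

Folding the intrinsic to false too early in the front-end would force the fallback path even for arguments the optimizer could later prove constant, which appears to be the tension the thread is discussing.
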
2016 May 27
2
Handling post-inc users in LSR
Hello,

For a very simple loop where all IV users are post-inc users, I observed redundant add instructions on AArch64. From the LSR debug output, I can see that the initial formula for the icmp is the one transformed into post-inc form in OptimizeLoopTermCond() and later expanded in post-inc mode. Based on the observation that the icmp is already a post-inc user, I hacked LSR to prevent the icmp from being
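
A hedged reduction of the scenario as I read it (my example, not the original reproducer): in a loop like the one below, every user of the induction variable, including the exit compare, can be expressed in terms of the incremented value, so expanding the icmp in pre-inc form would just add a redundant increment.

  /* all IV users can be post-inc users: the body indexes with i, and the
   * exit test can be rewritten against the incremented value by
   * OptimizeLoopTermCond(), so no separate pre-inc copy of i is needed */
  void scale(int *a, int n)
  {
      for (int i = 0; i < n; ++i)
          a[i] *= 2;
  }
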
2016 May 27
0
Handling post-inc users in LSR
> On May 27, 2016, at 2:50 PM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hello,
>
> For a very simple loop where all IV users are post-inc users, I observed redundant add instructions in AArch64.
>
> From LSR debug, I can see initial formula for icmp is the one that transformed to a post-inc form in OptimizeLoopTermCond() and later expanded in post-inc
2015 Aug 31
2
[RFC] New pass: LoopExitValues
Hello LLVM,

This is a proposal for a new pass that improves performance and code size in some nested loop situations. The pass is target independent. From the description in the file header:

  This optimization finds loop exit values reevaluated after the loop execution and replaces them by the corresponding exit values if they are available. Such sequences can arise after the
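
A sketch of the kind of sequence the description suggests (my illustration, not an example from the patch): the expression recomputed after the loop is exactly an induction variable's exit value, so the pass can reuse it.

  extern void consume(int *end);

  int sum(int *a, int n)
  {
      int s = 0;
      int *p = a;
      for (int i = 0; i < n; ++i)
          s += *p++;
      consume(a + n);   /* re-evaluates p's exit value; the pass can reuse p */
      return s;
  }
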
2015 Sep 01
2
[RFC] New pass: LoopExitValues
On Mon, Aug 31, 2015 at 5:52 PM, Jake VanAdrighem <jvanadrighem at gmail.com> wrote:
> Do you have some specific performance measurements?

Averaging 4 runs of 10000 iterations each of Coremark on my X86_64 desktop showed:

  -O2 performance: +2.9% faster with the L.E.V. pass
  -Os size: 1.5% smaller with the L.E.V. pass

In the case of Coremark, the benefit comes mainly from the matrix
2019 Feb 05
2
clang emits calls to constexpr function.
Hi Devs,

Consider the test case below:

$ cat test.cpp
constexpr int product() { return 10*20; }

int main() {
  const int x = product();
  return 0;
}

$ ./clang test.cpp -std=c++11 -S
$ ./clang -v
clang version 9.0.0
Target: x86_64-unknown-linux-gnu

$ cat test.s
main:
  .cfi_startproc
# %bb.0:
  pushq %rbp
  .cfi_def_cfa_offset 16
  .cfi_offset %rbp, -16
  movq
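
For what it's worth, this looks like expected C++ semantics rather than a bug: a constexpr function must be evaluated at compile time only in contexts that require a constant expression, and a plain const int initializer is not such a context, so an unoptimized build may legitimately emit a real call. Forcing compile-time evaluation is a one-line change (my suggestion, not from the thread):

  constexpr int x = product();  // must be evaluated at compile time

and in practice enabling optimization (-O1 and above) folds the call away in the original form too.
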
2009 Mar 15
2
[LLVMdev] [Bug 3756] __attribute__((always_inline)) and __builtin_constant_p
[ please CC: me as I'm not subscribed ]

On Wed, Mar 11, 2009 at 04:13:34AM +0000, bugzilla-daemon at cs.uiuc.edu wrote:
> http://llvm.org/bugs/show_bug.cgi?id=3756
>
> Chris Lattner <clattner at apple.com> changed:
>
>            What    |Removed    |Added
> ----------------------------------------------------------------------------
>
2009 Mar 16
0
[LLVMdev] [Bug 3756] __attribute__((always_inline)) and __builtin_constant_p
Pierre Habouzit wrote:
> [ please CC: me as I'm not subscribed ]
>
> On Wed, Mar 11, 2009 at 04:13:34AM +0000, bugzilla-daemon at cs.uiuc.edu wrote:
>> http://llvm.org/bugs/show_bug.cgi?id=3756
>>
>> Chris Lattner <clattner at apple.com> changed:
>>
>>            What    |Removed    |Added
>>
2009 Mar 16
3
[LLVMdev] [Bug 3756] __attribute__((always_inline)) and __builtin_constant_p
On Mar 15, 2009, at 6:16 PM, Nick Lewycky wrote:

> Pierre Habouzit wrote:
>> [ please CC: me as I'm not subscribed ]
>>
>> On Wed, Mar 11, 2009 at 04:13:34AM +0000, bugzilla-daemon at cs.uiuc.edu wrote:
>>> http://llvm.org/bugs/show_bug.cgi?id=3756
>>>
>>> Chris Lattner <clattner at apple.com> changed:
>>>
>>>
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
Thank you. It means that

  vmovdqa64 zmm22, zmmword ptr [rip + .LCPI0_0] # zmm22 = [8,9,10,11,12,13,14,15]

loads zmm22 with the 64-bit constant values [8,9,10,11,12,13,14,15], which are indexes here, not values loaded from those locations. zmm2 contains the constant 4000, so

  vpmuludq zmm14, zmm10, zmm2 ; multiplies the index values by 4000, since the stride for array b is 4000

zmm14=
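
A scalar sketch of the computation being described, under my reading (the register roles and the 4000-byte stride come from the post; the C rendering is mine):

  /* each vector lane holds a row index; multiplying by the byte stride
   * (4000, one row of array b) turns indexes into gather offsets,
   * which is what the vpmuludq above computes per lane */
  enum { LANES = 8 };

  void gather_offsets(unsigned long long offsets[LANES])
  {
      const unsigned long long stride = 4000;
      unsigned long long index[LANES] = {8, 9, 10, 11, 12, 13, 14, 15};
      for (int lane = 0; lane < LANES; ++lane)
          offsets[lane] = index[lane] * stride;
  }
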
2018 Sep 14
2
Function calls keep increasing the stack usage
Sorry I missed that important detail. The relevant part of the command line is:

  -cc1 -S -triple i386-pc-win32

I don't expect it matters if it's for Windows or Linux in this case.

On Fri, Sep 14, 2018 at 9:16 PM David Blaikie <dblaikie at gmail.com> wrote:
> Can't say I've observed that behavior (though I'm just building from
> top-of-tree rather than 6.0,
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
Hi @ll,

while clang/LLVM recognizes common bit-twiddling idioms/expressions like

  unsigned int rotate(unsigned int x, unsigned int n)
  {
      return (x << n) | (x >> (32 - n));
  }

and typically generates "rotate" machine instructions for this expression, it fails to recognize other also common bit-twiddling idioms/expressions. The standard IEEE CRC-32 for "big
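
The message is cut off, but the idiom it is presumably heading toward is the classic bit-at-a-time CRC-32 loop; a sketch using the reflected IEEE polynomial 0xEDB88320 (my reconstruction, not the code from the mail):

  #include <stddef.h>
  #include <stdint.h>

  uint32_t crc32(const uint8_t *data, size_t len)
  {
      uint32_t crc = 0xFFFFFFFFu;
      while (len--) {
          crc ^= *data++;
          for (int bit = 0; bit < 8; ++bit)
              /* branch-free variant: crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1)); */
              crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
      }
      return ~crc;
  }

The complaint, as I understand it, is that conditional-XOR-and-shift patterns like this are not pattern-matched as readily as the rotate above.
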
2011 Oct 19
3
[LLVMdev] Question regarding basic-block placement optimization
On Tue, Oct 18, 2011 at 6:58 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> On Oct 18, 2011, at 5:22 PM, Chandler Carruth wrote:
>
>> As for why it should be an IR pass, mostly because once the selection dag
>> runs through the code, we can never recover all of the freedom we have at
>> the IR level. To start with, splicing MBBs around requires knowing about
2020 Jul 20
2
[ARM] Should Use Load and Store with Register Offset
Hello LLVM Community (specifically anyone working with ARM Cortex-M), While trying to compile the Newlib C library I found that Clang10 was generating slightly larger binaries than the libc from the prebuilt gcc-arm-none-eabi toolchain. I looked at a few specific functions (memcpy, strcpy, etc.) and noticed that LLVM does not tend to generate load/store instructions with a register offset (e.g.
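
To make the observation concrete, a hedged illustration (the assembly in the comments is Thumb-2 style written from memory; exact mnemonics and encodings vary by core):

  /* word-copy loop whose memory accesses could use register offsets */
  void copy_words(int *dst, const int *src, int n)
  {
      for (int i = 0; i < n; ++i)
          dst[i] = src[i];
  }

  /* with a register offset the loop body can be roughly:
   *     ldr r3, [r1, r2, lsl #2]
   *     str r3, [r0, r2, lsl #2]
   * without it, each address is materialized first:
   *     add r3, r1, r2, lsl #2
   *     ldr r3, [r3]
   */
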
2012 Jul 29
3
[LLVMdev] rotate
Nice! Clever compiler..

On 07/28/2012 08:55 PM, Michael Gottesman wrote:
> I can get clang/llvm to emit a rotate instruction on x86-64 when compiling C by just using -Os and the rotate from Hacker's Delight i.e.,
>
> ======
> #include <stdlib.h>
> #include <stdint.h>
>
> uint32_t ror(uint32_t input, size_t rot_bits)
> {
>     return (input >>
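
The quoted code is truncated; for reference, the usual Hacker's Delight rotate that compilers pattern-match into a single rotate instruction looks like this (my reconstruction, not necessarily the exact body from the mail):

  #include <stdint.h>
  #include <stdlib.h>

  uint32_t ror(uint32_t input, size_t rot_bits)
  {
      /* masking both shift counts with 31 keeps them in range and avoids
       * undefined behaviour when rot_bits is 0 or a multiple of 32 */
      return (input >> (rot_bits & 31)) | (input << ((32 - rot_bits) & 31));
  }
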
2017 Aug 21
3
DragonEgg for GCC v8.x and LLVM v6.x is just able to work
Hi LLVM and GCC developers,

My sincere thanks go to:

* Duncan, the core developer of llvm-gcc and dragonegg
  http://llvm.org/devmtg/2009-10/Sands_LLVMGCCPlugin.pdf
* David, the innovator and developer of GCC
  https://dmalcolm.fedorapeople.org/gcc/global-state/requirements.html

and others who responded kindly, teaching me patiently and carefully how to migrate GCC v4.8.x to
2013 Aug 19
2
[LLVMdev] Duplicate loading of double constants
Hi,

I found that in some cases llvm generates duplicate loads of double constants, e.g.

$ cat t.c
double f(double* p, int n)
{
  double s = 0;
  if (n)
    s += *p;
  return s;
}

$ clang -S -O3 t.c -o -
...
f:                      # @f
  .cfi_startproc
# BB#0:
  xorps %xmm0, %xmm0
  testl %esi, %esi
  je .LBB0_2
# BB#1:
  xorps
2009 Mar 20
0
[LLVMdev] [Bug 3756] __attribute__((always_inline)) and __builtin_constant_p
Dan Gohman wrote:
> On Mar 15, 2009, at 6:16 PM, Nick Lewycky wrote:
>
>> Pierre Habouzit wrote:
>>> [ please CC: me as I'm not subscribed ]
>>>
>>> On Wed, Mar 11, 2009 at 04:13:34AM +0000, bugzilla-daemon at cs.uiuc.edu wrote:
>>>> http://llvm.org/bugs/show_bug.cgi?id=3756
>>>>
>>>> Chris Lattner
2012 May 24
4
[LLVMdev] use AVX automatically if present
I wonder why AVX is not used automatically if available on the host machine. In contrast, SSE41 instructions (like pmulld) are automatically used if the host machine supports SSE41. E.g.

$ cat avx.ll
define void @_fun1(<8 x float>*, <8 x float>*) {
_L1:
  %x = load <8 x float>* %0
  %y = load <8 x float>* %1
  %z = fadd <8 x float> %x, %y
  store
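
For what it's worth, the usual explanation is that code generation targets a generic CPU unless the host's features are requested explicitly; with clang that is done as follows (a general usage note, not an answer from this thread):

  $ clang -O3 -march=native -S test.c   # enables AVX when the host supports it
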
2017 Nov 20
2
Nowadays Scalar Evolution's Problem.
The problem? Nowadays, SCEV ("Scalar Evolution") only evaluates instructions whose operands are predictable, i.e. constant-based operands that can themselves be evaluated as constants; otherwise the instruction cannot be modeled as a SCEV node and is treated as SCEVUnknown. An important thing to remember is that SCEV is not used only for Loop Deletion, which isn't really needed for natural loops
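
A small illustration of the distinction, under my reading of the post (my example, not the author's): SCEV can model an induction variable as an add recurrence even when the loop bounds are not constants, while a value it cannot analyze is classified as SCEVUnknown.

  /* i is the add recurrence {0,+,1}<loop>, and i*4 + 7 is {7,+,4}<loop>,
   * even though n is not a constant; the result of the opaque call below
   * has no usable SCEV and becomes SCEVUnknown */
  extern int opaque(void);

  int f(int n)
  {
      int acc = 0;
      for (int i = 0; i < n; ++i)
          acc += i * 4 + 7;   /* analyzable */
      return acc + opaque();  /* SCEVUnknown */
  }
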