similar to: Suboptimal code generated by clang+llc in quite a common scenario (?)

Displaying 20 results from an estimated 2000 matches similar to: "Suboptimal code generated by clang+llc in quite a common scenario (?)"

2019 Aug 08
3
Suboptimal code generated by clang+llc in quite a common scenario (?)
This might not be the workaround you want because it is only available in C, but you can use restrict to allow such optimizations. https://godbolt.org/z/2gQ26f Alex On Thu, Aug 8, 2019 at 11:50 AM Michael Kruse via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > char* scscx is an universal pointer and may point to anything, > including itself. That is, scscx might
2013 Nov 11
2
[LLVMdev] What's the Alias Analysis does clang use ?
Hi, LLVM community: I found basicaa seems not to tell must-not-alias for __restrict__ arguments in c/c++. It only compares two pointers and the underlying objects they point to. I wonder how clang does alias analysis for c/c++ keyword restrict. let assume we compile the following code: $cat myalias.cc float foo(float * __restrict__ v0, float * __restrict__ v1, float * __restrict__ v2, float *
2013 Nov 12
0
[LLVMdev] What's the Alias Analysis does clang use ?
Hi, Your problem is that the function arguments, which are makes as noalias, are not being directly used as the base objects of the array accesses: > %v0.addr = alloca float*, align 8 > %v1.addr = alloca float*, align 8 > %v2.addr = alloca float*, align 8 > %t.addr = alloca float*, align 8 ... > store float* %v0, float** %v0.addr, align 8 > store float* %v1, float** %v1.addr,
2020 Jun 19
4
LLVM-IR store-load propagation
Hello everyone, This week I was looking into the following example ( https://godbolt.org/z/uhgQcq) where two constants are written to a local array and an input argument, masked and shifted, is used to select between them. The possible values for the CC variable are 0 and 1, so I'm expecting that at the maximum level of optimizations the two constants are actually propagated, resulting in the
2014 Sep 18
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi Nadav, Thanks for the quick reply !! Ok, so as of now we are lacking capability to handle flat large reductions. I did go through function vectorizeChainsInBlock() (line number 2862). In this function, we try to vectorize if we have phi nodes in the IR (several if's check for phi nodes) i.e we try to construct tree that starts at chains. Any pointers on how to join multiple trees? I
2014 Sep 19
3
[LLVMdev] [Vectorization] Mis match in code generated
Hi Arnold, Thanks for your reply. I tried test case as suggested by you. *void foo(int *a, int *sum) {*sum = a[0]+a[1]+a[2]+a[3]+a[4]+a[5]+a[6]+a[7]+a[8]+a[9]+a[10]+a[11]+a[12]+a[13]+a[14]+a[15];}* so that it has a 'store' in its IR. *IR before vectorization :*target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128" target triple =
2013 Feb 07
1
[LLVMdev] alloca scalarization with dynamic indexing into vectors
Hi all, I have a question regarding dynamic indexing into a vector with GEP. I see that in the ScalarReplAggregates pass in the LLVM 3.2 release the call SROA::isSafeGEP() will now allow alloca scalarization in the case where a GEP index into a vector isn’t a constant. My question is: what is the expected behavior when the index is out of bounds of the vector? Is it undefined? I have an
2014 Sep 18
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi, I am trying to understand LLVM vectorization implementation and was looking into both loop and SLP vectorization. test case 1: *int foo(int *a) {int sum = 0,i;for(i=0; i<16; i++) sum += a[i];return sum;}* This code is vectorized by loop vectorizer where we calculate scalar loop cost as 4 and vector loop cost as 2. Since vector loop cost is less and above reduction is legal to
2012 Mar 28
2
[LLVMdev] Suboptimal code due to excessive spilling
Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after
2012 Apr 05
0
[LLVMdev] Suboptimal code due to excessive spilling
I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround? /Patrik Hägglund -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker Sent: den 28 mars 2012 03:18 To: llvmdev Subject: [LLVMdev] Suboptimal code due to excessive spilling Hi, I have run into the following strange behavior
2019 Aug 08
2
Suboptimal code generated by clang+llc in quite a common scenario (?)
Hi Tim and Alex Thanks for your replies. So just to make it clear for me: does this imply that there’s indeed no way on the current version to tell the compiler or Clang to optimize this? Thanks, Joan > On 8 Aug 2019, at 18:30, Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On Thu, 8 Aug 2019 at 17:08, Alex Brachet-Mialot via llvm-dev > <llvm-dev at
2014 Nov 10
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi Suyog, Thanks for looking at this. This has recently got itself onto my TODO list too. > I am not sure how much all this will improve the code quality for horizontal reduction > (donno how frequently such pattern of horizontal reduction from same array occurs in real world/SPECS). Actually the main loop of 470.lbm can be SLP vectorized like this. We have three parts to it: A fully
2012 Jul 29
2
[LLVMdev] rotate
in C or C++, how can I get clang/llvm to try and do a "rotate". (want to test this code in the mips16 port) i.e. emit rotr node. tia. reed
2012 Jul 29
3
[LLVMdev] rotate
Nice! Clever compiler.. On 07/28/2012 08:55 PM, Michael Gottesman wrote: > I can get clang/llvm to emit a rotate instruction on x86-64 when compiling C by just using -Os and the rotate from Hacker's Delight i.e., > > ====== > #include<stdlib.h> > #include<stdint.h> > > uint32_t ror(uint32_t input, size_t rot_bits) > { > return (input>>
2012 Jul 29
0
[LLVMdev] rotate
I can get clang/llvm to emit a rotate instruction on x86-64 when compiling C by just using -Os and the rotate from Hacker's Delight i.e., ====== #include <stdlib.h> #include <stdint.h> uint32_t ror(uint32_t input, size_t rot_bits) { return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits)); } ====== Then compile with (assuming you are on OS
2012 Jun 20
3
[LLVMdev] Is Loop Dependence Analysis Printing Correct Stats?
Hi; I was playing with the -lda pass of LLVM on the following program- #include <stdio.h> void main() { int a[10]; int i; for(i = 0; i < 4; i ++) { a[i] = a[i-1]+1; } } I run the following commands - clang a.c -emit-llvm -S opt -analyze -stats -lda a.s The output is - Printing analysis 'Loop Dependence Analysis': Loop at depth 1, header block: %for.cond Load/store
2015 Apr 16
3
[LLVMdev] double* to <2 x double>*
Does anyone know how to instrument *double* to <2 x doulbe>**, e.g., 2.2 --> <2.2, 2.2>? For example, I want to change the following IR code %arrayidx1 = getelementptr inbounds [100 x double]* @main.B, i32 0, i32 %i.021 %1 = load double* %arrayidx1, align 4, !tbaa !0 to: %arrayidx1 = getelementptr inbounds [100 x double]* @main.B, i32 0, i32 %i.021 %1 = bitcast double* %arrayidx1
2019 Nov 10
2
Reassociation is blocking a vectorization
Hi Devs, I am looking at the bug https://bugs.llvm.org/show_bug.cgi?id=43953 and found that following piece of ir %arrayidx = getelementptr inbounds float, float* %Vec0, i64 %idxprom %0 = load float, float* %arrayidx, align 4, !tbaa !2 %arrayidx2 = getelementptr inbounds float, float* %Vec1, i64 %idxprom %1 = load float, float* %arrayidx2, align 4, !tbaa !2 %sub = fsub fast float %0, %1
2014 Oct 17
2
[LLVMdev] opt -O2 leads to incorrect operation (possibly a bug in the DSE)
Hi all, Consider the following example: define void @fn(i8* %buf) #0 { entry: %arrayidx = getelementptr i8* %buf, i64 18 tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arrayidx, i8* %buf, i64 18, i32 1, i1 false) %arrayidx1 = getelementptr i8* %buf, i64 18 store i8 1, i8* %arrayidx1, align 1 tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %buf, i8* %arrayidx, i64 18, i32 1, i1 false)
2013 Sep 05
2
[LLVMdev] CFI Directives
Hi Rafael, I've been staring at the CFI directives and have a question. Some background: I want to generate the compact unwind information using just the CFI directives. I *think* that this should be doable. The issue I'm facing right now is that I need to know how much the stack pointer was adjusted. So when I have something like this: .cfi_startproc Lfunc_begin175: